Skip to content

[DPL Analysis] Simple workflow suffix solution#5558

Merged
ktf merged 7 commits intoAliceO2Group:devfrom
saganatt:object-suffix
Mar 4, 2021
Merged

[DPL Analysis] Simple workflow suffix solution#5558
ktf merged 7 commits intoAliceO2Group:devfrom
saganatt:object-suffix

Conversation

@saganatt
Copy link
Collaborator

Hi @jgrosseo @ktf ,

turns out the workflow suffix needs to be appended early enough. The task names and hashes based on them are propagated to objects already inside the adaptAnalysisTask(). Any following task name changes prevent proper matching of output objects to tasks in sink.
I tried to do some workaround on the Framework side but we don't have access to the objects after tasks creation and before the sink. We could e.g. record task names in object headers, then in the sink function append the suffix and re-generate the hashes but this actually defies the purpose of having task hashes...

@saganatt saganatt requested review from a team, iarsene and jgrosseo as code owners February 25, 2021 14:44
@jgrosseo jgrosseo marked this pull request as draft February 25, 2021 15:00
@jgrosseo
Copy link
Collaborator

Thanks! This means at present every task would need to add the suffix by hand?
(I changed this PR to draft as it is just an illustration.)

@saganatt
Copy link
Collaborator Author

Yes :/

Comment on lines 299 to 307
Copy link
Member

@ktf ktf Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not simply passing the ctx to the adaptAnalysisTask?

One possible solution we could have is to make the adaptAnalysisTask() function be a member of (an object returned by) the ConfigContext. Something like:

ctx.taskMaker().makeAnalysisTask<ETask>();

or alternatively have an helper method for the workflow, rather than for the simple task:

return adaptAnalysisWorkflow(ctx, TaskSpec<ETask>{"output-obj-test"}, TaskSpec<ATask>{"eta-and-phi-histograms"}});

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another solution would be that defineDataProcessing does not return a std::vector<DataProcessorSpec>, but a ConfigureDataProcessorAction which then can be used to do the lazy configuration of the DataProcessor. I think we can even guarantee backward compatibility by having a constructor for ConfigureDataProcessorAction(DataProcessorSpec).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
true, we can just pass the ctx and append the suffix inside the adaptAnalysisTask().

ConfigureDataProcessorAction - hm, still we need to pass the ctx somewhere, so it'll still mean a change in analysis workflows. I don't see how it could be much different from just passing the ctx to adaptAnalysisTask().

adaptAnalysisWorkflow looks also good to me since ctx would be used just once. Though it is a bigger change on the user side ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding ConfigureDataProcessorAction: my idea would be that the context is passed outside the user code.

Basically we would go from std::vector<DataProcessorSpec> runDataProcessing(ConfigContext...) to std::vector<ConfigureDataProcessorAction> runDataProcessing(ConfigContext...), where ConfigureDataProcessorAction would be something like:

enum struct DataProcessorAction : char {
  CopyFullFromProtoToTarget,         // The default action, copy the whole DataProcessorSpec to the one labelled as target
  AppendInputsFromProtoToTarget, // Take the inputs in proto and replace them in the one labelled.
  AppendOutputsFromProtoToTarget
};

struct ConfigureDataProcessorAction {
  std::string target;
  DataProcessorSpec *proto;
  DataProcessorAction action; 
};

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me it'd be then the same solution as your primary one, with appending the suffix after the tasks creation in overrideSuffix().

We have direct access to output object only inside adaptAnalysisTask(). So, later it is not possible to change the task hashes inside objects and copying the spec doesn't help.

Maybe I am wrong and changing something in DataProcessorSpec will be enough to match the hashes properly in sinks...? Though I haven't suceeded with it previously. In that case, wouldn't it be simplier to stay with overrideSuffix()?

@saganatt
Copy link
Collaborator Author

saganatt commented Mar 2, 2021

Ciao @ktf , what do you think about the current solution?

@ktf
Copy link
Member

ktf commented Mar 3, 2021

So we discussed this a bit with @jgrosseo. In the end I think the easiest is to pass the context to the adaptAnalysisTask, since there are places were tasks are added programmatically and adapting to adaptAnalysisWorkflow would require extra manipulations.

The more complex solution I was trying to propose is to use the adaptAnalysisTask to capture outputs and manipulate them afterwards, but after having a look at the current adaptAnalysisTask that is indeed quite of a development so I would just go for the modified adaptAnalysisTask, at least for now.

@saganatt
Copy link
Collaborator Author

saganatt commented Mar 3, 2021

Ok, I adjusted the code.

@jgrosseo
Copy link
Collaborator

jgrosseo commented Mar 3, 2021

Looks very good to me. I guess now we have to adjust all existing tasks, right? @saganatt do you feel like grepping for them..? ;-)

@saganatt
Copy link
Collaborator Author

saganatt commented Mar 3, 2021

ok, I will do ;-)

adaptAnalysisTask<BTask>("etaphi-histogram"),
adaptAnalysisTask<CTask>("pt-histogram"),
adaptAnalysisTask<DTask>("output-wrapper"),
adaptAnalysisTask<ATask>("eta-and-phi-histograms", ctx),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it first argument please.

@saganatt saganatt marked this pull request as ready for review March 3, 2021 16:25
@saganatt saganatt requested a review from ginnocen as a code owner March 3, 2021 16:25
@ktf ktf self-requested a review March 3, 2021 17:01
ktf
ktf previously approved these changes Mar 3, 2021
@jgrosseo
Copy link
Collaborator

jgrosseo commented Mar 3, 2021

Thanks a lot for this tedious work!

@jgrosseo
Copy link
Collaborator

jgrosseo commented Mar 4, 2021

@ktf the errors seem only in the tests and unrelated. Is this due to the not running tests earlier? Can we merge?

@ktf ktf merged commit 4e73f99 into AliceO2Group:dev Mar 4, 2021
EmilGorm pushed a commit to EmilGorm/AliceO2 that referenced this pull request Nov 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants