DM-22162: Add metadata writing to PipelineTask execution logic (pipe_base) #110

Merged
andy-slac merged 2 commits into master from tickets/DM-22162 on Dec 10, 2019

andy-slac
Contributor

Add support for a special dataset type that stores task metadata. The dataset type is added to the quantum outputs like a regular dataset type, so other tasks can specify it as an input. It does not appear in the standard connections config; instead it is added dynamically based on the boolean task config parameter saveMetadata.
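To illustrate the idea in the description, here is a minimal sketch of adding a metadata output dynamically when saveMetadata is enabled. It is not the pipe_base implementation: the class names, the collect_dataset_types helper, and the "<label>_metadata" naming scheme are all assumptions made for this example. Only the saveMetadata parameter and its default of True (implied by the test reviewed below) come from the PR.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class TaskConfigSketch:
    """Hypothetical stand-in for a task config; only saveMetadata is from the PR."""
    saveMetadata: bool = True


@dataclass
class TaskDatasetTypesSketch:
    """Hypothetical container for the dataset type names a task reads and writes."""
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


def metadata_dataset_name(label: str) -> str:
    # Assumed naming scheme, chosen only for this illustration.
    return f"{label}_metadata"


def collect_dataset_types(label: str, config: TaskConfigSketch,
                          declared_inputs: Iterable[str],
                          declared_outputs: Iterable[str]) -> TaskDatasetTypesSketch:
    """Return the declared dataset types, plus the metadata output when enabled."""
    types = TaskDatasetTypesSketch(list(declared_inputs), list(declared_outputs))
    if config.saveMetadata:
        # Added dynamically: it never appears among the declared connections.
        types.outputs.append(metadata_dataset_name(label))
    return types
```

Because the metadata name ends up in the ordinary list of outputs, a downstream task in this sketch could list it among its declared inputs without any special handling.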

Contributor

natelust left a comment

Overall straightforward and well implemented.

self.assertTrue(config.saveMetadata)
config.saveMetadata = False
self.assertFalse(config.saveMetadata)

Contributor

While I am all for unit tests, this test does not seem to test any code in pipe_base; it tests that pex_config works the way pex_config should, which the pex_config tests presumably already cover. If you want to test something about saveMetadata in a unit test, maybe just check that the descriptor is present in the PipelineTaskConfig class, and possibly its default value if you think changing it could cause problems in the future.

Contributor Author

I think of this test as checking the default value of a parameter and that it can be changed to a different value. I can probably remove it altogether. The test was originally written for a different parameter (metadataDataset) with a more complicated set of allowed values; it is not very useful for a simple boolean option.
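For reference, the narrower test suggested in this thread (check that the field is declared on the config class and that its default is as expected) might look roughly like the sketch below. So that the example runs on its own, it uses a hypothetical BoolFieldSketch descriptor and PipelineTaskConfigSketch class rather than pex_config; only the saveMetadata name and its True default are taken from the PR.

```python
import unittest


class BoolFieldSketch:
    """Hypothetical, minimal stand-in for a config field descriptor."""

    def __init__(self, doc, default):
        self.doc = doc
        self.default = default

    def __set_name__(self, owner, name):
        # Store per-instance values under a private attribute name.
        self._name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access returns the descriptor itself
        return getattr(instance, self._name, self.default)

    def __set__(self, instance, value):
        setattr(instance, self._name, bool(value))


class PipelineTaskConfigSketch:
    """Stand-in for the real config class; only the field name is from the PR."""
    saveMetadata = BoolFieldSketch(doc="Write task metadata as an output dataset",
                                   default=True)


class SaveMetadataFieldTestCase(unittest.TestCase):
    def testFieldIsDeclared(self):
        # The descriptor must be defined on the class itself.
        declared = vars(PipelineTaskConfigSketch)["saveMetadata"]
        self.assertIsInstance(declared, BoolFieldSketch)

    def testDefaultValue(self):
        # Guards against an accidental change of the default.
        self.assertTrue(PipelineTaskConfigSketch().saveMetadata)
```

Unlike the set-then-read test under review, these two checks would fail only if the config class itself changed, which is the behavior the reviewer wanted pinned down.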

@@ -212,6 +222,13 @@ def addTask(self, task: Union[PipelineTask, str], label: str):
        else:
            raise ValueError("task must be either a child class of PipelineTask or a string containing"
                             " a fully qualified name to one")
        if not label:

Contributor

Actually, thinking about this now, I don't think it is possible with the new pipeline object to have a task that does not have a label specified.

Contributor

Hmm, looking at the code in ctrl_mpexec for a task with no label: I didn't remove the ? in the regex, and/or change make pipeline to use the task name if there is no label. This will lead to weird or broken behavior. If you don't want to change it on this ticket, that's fine; I can make a ticket and fix that behavior.

Contributor Author

I think we don't want to force users to always provide a label on the command line, so if the label is missing it should come from _DefaultName. To know _DefaultName I need to import the task class, and this seemed the most natural place for it. I do not want to import anything while the command line is parsed. Another potential place is the CmdLineFwk class, but doing it here is more generic. Of course, if you say that the Pipeline.addTask method has to receive a non-empty label, then I'd simply add a check here and move doImport to CmdLineFwk instead.
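The fallback described here can be sketched as follows. This is an illustration only, not the code in the PR: import_by_name is a hypothetical stand-in for doImport, resolve_task_label is a hypothetical free function rather than the Pipeline.addTask method, and the task class is assumed to expose a _DefaultName attribute.

```python
import importlib
from typing import Tuple, Union


def import_by_name(qualified_name: str) -> type:
    """Import a class from its fully qualified name ("package.module.Class")."""
    module_name, _, class_name = qualified_name.rpartition(".")
    if not module_name:
        raise ValueError(f"{qualified_name!r} is not a fully qualified name")
    return getattr(importlib.import_module(module_name), class_name)


def resolve_task_label(task: Union[type, str], label: str = "") -> Tuple[str, str]:
    """Return (fully qualified task name, label), defaulting label to _DefaultName."""
    if isinstance(task, str):
        task_name = task
    elif isinstance(task, type):
        task_name = f"{task.__module__}.{task.__qualname__}"
    else:
        raise ValueError("task must be a class or a fully qualified class name")
    if not label:
        # The import only happens when the caller supplied no label.
        task_class = import_by_name(task_name) if isinstance(task, str) else task
        label = task_class._DefaultName
    return task_name, label
```

In this sketch a task given as a string with an explicit label is never imported, which keeps command-line parsing free of imports; the cost of the import is paid only for the missing-label case discussed in this thread.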

Contributor Author

@natelust, let me know if you want me to move that doImport to CmdLineFwk before I merge both branches. It should be easy for me to do, and certainly faster than opening another ticket.

Contributor

I guess this is fine here. I really don't like the doImport in either place, as it is done again later. One thing we talked about doing in the future is not having a _DefaultName at all, and using the name of the task in places where _DefaultName would have been used. How would you feel about just using the string name of the class for the label here? I think @TallJimbo might have had an opinion as well.

Contributor Author

I'm OK with using the class name instead of _DefaultName, but I know that _DefaultName has a long history and this should probably be discussed with a wider audience. Just tell me what to do and I'll do it.

Member

Let's stick with _DefaultName for now. I'd like to eventually replace that with the unqualified Task name or something derived from it (and then take advantage of that to e.g. avoid the doImport here), but until we've done that more globally, using the unqualified Task name here just exacerbates the problem of having too many names for a Task.

Contributor Author

OK, I'll merge it as it is now.

andy-slac merged commit 204a9c6 into master on Dec 10, 2019
andy-slac deleted the tickets/DM-22162 branch on July 10, 2022 02:46