Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-17038: Refactoring of CmdLineFwk class #8

Merged
merged 6 commits into from Jan 18, 2019
Merged

Conversation

andy-slac
Copy link
Collaborator

@andy-slac andy-slac commented Jan 15, 2019

CmdLineFwk class was split into small set of (mostly) reusable components. QuantumGraph execution is now handled by new QuantumGraphExecutor class which can be customized for particular execution model. multiprocessing-based execution model is implemented on top of that in MPGraphExecutor class.

Remaining code in CmdLineFwk class was slightly refactored to to make structure less confusing.

@andy-slac andy-slac force-pushed the tickets/DM-17038 branch 3 times, most recently from 073ce59 to 3a2d871 Compare January 16, 2019 21:37
Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Mostly small comments, though I think we should consider moving some operations out of QuantumExecutor into a separate component.

self.taskFactory = taskFactory

def execute(self, taskClass, config, quantum):
"""Execute super-task on a single Quantum.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now PipelineTask, not super-task.

# if hasattr(dataRef, "dataId"):
# lsst.log.MDC("LABEL", str(dataRef.dataId))
# elif isinstance(dataRef, (list, tuple)):
# lsst.log.MDC("LABEL", str([ref.dataId for ref in dataRef if hasattr(ref, "dataId")]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intend to comment all of this out?

Copy link
Collaborator Author

@andy-slac andy-slac Jan 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was intentionally delaying it 🙂. I'll comment it out now and make sure it works (that was just a copy from CmdLineTask, I guess we have to match that).

Instance of ``taskClass`` type.
"""
# call task factory for that
return self.taskFactory.makeTask(taskClass, config, None, self.butler)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want any custom logging, I imagine you'd need to pass that information here through the TaskFactory into the Task constructor (maybe just via **kwarg forwarding?).

Side note: I assume the TaskFactory is here just to handle the complexity of getting InitInput datasets from Butler, since taskClass is defined here to really be a PipelineTask class, not the string name of one (which is good - the fewer places that need to worry about tasks-as-strings, the better). Does that mean we can factor that logic out of the TaskFactory ABC so it's just a function call that classes like this can use?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what do you mean by custom logging and what info do you want to pass to a task constructor, do you have some examples?

We can certainly factor out implementation of the makeTask from TaskFactory - it does not need that dependency on a TaskLoader so it may be beneficial to be able to use that code without instantiating TaskLoader.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to say that Task.__init__ takes a lsst.log.Log to be used for all of the logging it and its subtasks do, so if you want to customize that you'd need to do it here.

But looking closer, it seems it just uses that to set the Log instance's name (which we probably don't want to change here), and I presume that means customization (e.g. making all log messages include the data ID) would happen via some other API that doesn't work directly on the Log instance.

return self.taskFactory.makeTask(taskClass, config, None, self.butler)

def updateQuantumInputs(self, quantum):
"""Update qunatum with extra information.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: qunatum

Some methods may require input DataRefs to have non-None
``dataset_id``, but in case of intermediate dataset it may not be
filled during QuantumGraph construction. This method will retrieve
missing info from registry.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataRefs -> DatasetRefs

Is the concern about needing dataset_id about executors or PipelineTasks? I don't think PipelineTasks will care, as butler.get will do a find itself if dataset_id is None.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that registry.addQuantum() needs it, so it is Executors concern mostly. Not sure if anything else would ever need it, I just think that it's safer to add it here for consistency with pre-existing datasets that have dataset_id.

# TODO: see if Pipeline and software versions are already written
# to butler and associated with Run, check for consistency if they
# are, and if so skip writing TaskInitOutputs (because those should
# also only be done once). If not, write them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This TODO represents one way we may want to approach this one (and probably the most likely one), but I think we should think about possible alternatives and high-level use cases before treating it as a straightforward TODO item. Comparing configs and versions and disabling that with --clobber-config has been very messy in Gen2, and I'd like to find a better way.

# copy all collected refs to output collection
collection = butler.run.collection
registry = butler.registry
registry.associate(collection, list(id2ref.values()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least after DM-16227, list here should be unnecessary; any iterable would be fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Will change it once that branch is merged.


Implementation of this method depends on particular execution model
and it has to be provided by a subclass. Execution model determines
what happens here, it can be either acual running of the task or,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: acual. Also, semicolon is better after "here" instead of comma.

for example, generation of the scripts for delayed batch execution.

Any exception raised in this method will be propagated to the caller
of `execute` method.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the exception propagation behavior should either be left to subclasses or controllable some other way; I think it's quite likely an activator might want to just warn and/or remember failures and proceed with executing any Quanta that did not depend on them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is consistent with what is written in docstring - sub-classes are free to raise exceptions or do anything else according their own policy, same applies to activators. Error handling strategy is not very well defined at this point, we may will to think about it more.

args : `argparse.Namespace`
Parsed command line
"""
# If output collections are given then use them to override
# butler-configured ones.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can only be one output collection, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know 🤔 There is one default output collection, but I think we also support per-dataset type collections too. I'll change the comment to make it more explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I shouldn't have phrased this as a question - I don't think Butler can handle multiple output collections in the same run.

This new class implements small piece of cmdLineFwk funcionality which
is responsible for execution of single Quantum. Adds a bunch of
customization points so that new activators can extend default behavior.

No unit test yet for this class, it needs butler and registry mocks,
wouls be easier to make that when we have some simple actual butler for
unit tests.
One new class QuantumGraphExecutor which implements most common
pre-execution operations but defers actual execution to a sub-class.
MPGraphExecutor implements multiprocess-based execution on top of that.
@andy-slac
Copy link
Collaborator Author

@TallJimbo, one more commit to look at: 29e87d4. I hope this is last refactoring on this ticket 🙂. It splits initialization steps of QuantumGraphExecutor into a separate non-abstract class.

Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! One more (smaller) design thought in an in-line comment, and I wonder if we should move the new class to pipe_base (I don't feel super strongly about that; I imagine most other activators will depend on ctrl_mpexec, too, anyway).

"""Initialization of registry for QuantumGraph execution.

This class encapsulates all necessary operations that have to be performed
on butler and registry to prepare them for QuanrumGraph execution.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: QuanrumGraph.


Parameters
----------
graph : `~lsst.pip.base.QuantumGraph`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: pip.

Data butler instance.
"""
def __init__(self, butler):
self.butler = butler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that it would be nice to run at least some of these operations on a Pipeline in the future as well as a QuantumGraph, what do you think about passing the QuantumGraph (either instead of or in addition to the Butler) at construction instead of to each of the methods? That would let us have a different class in the future with (mostly) the same method interfaces to handle Pipelines.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, some of that interface cannot even work with Pipeline, so I'm not sure how we can make it work. We could probably define some polymorphic interface for that but if it's going to work differently for graph and pipeline it will not be truly polymorphic. I'd say that having separate interface for Pipeline (maybe even in this same class) is probably better and less confusing.

@andy-slac andy-slac merged commit 1c85ce7 into master Jan 18, 2019
@timj timj deleted the tickets/DM-17038 branch April 23, 2020 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants