
Conversation

kfindeisen
Member

This PR adds support for multiple pipelines that may be attempted for the same visit, in sequence. This required a significant reorganization of MiddlewareInterface, moving decisions that depend on the choice of pipeline to as late in the workflow as possible. However, the new code has fewer implicit dependencies, making it more robust to future changes.

The TODO proposed enforcing that there be only one coadd type in the
central repo. Since the repo will be /repo/embargo in practice, such
enforcement will not be possible.
This change removes the requirement that the output run be part of a
long-lived chain in the local repo, allowing more flexibility in how
the output run is determined in the future.
This change makes all use of the output run collection explicit.
This change does not affect MiddlewareInterface, which was only using
the run explicitly, but it does break several tests that assumed the
Butler always had a run.
The class now stores the datestamp on construction, and generates the
run names on demand. This change gives the same behavior, but will be
much easier to adapt to a design where potentially multiple runs
are attempted.
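The construction-time datestamp plus on-demand run names described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual prompt-processing code; the class name, run-name convention, and method signature are all made up.

```python
from datetime import datetime, timezone


class RunNamer:
    """Hypothetical sketch: fix the datestamp at construction,
    then build run names on demand, one per candidate pipeline."""

    def __init__(self, instrument: str):
        self.instrument = instrument
        # Datestamp stored once, so every run generated later shares it.
        self.date = datetime.now(timezone.utc).strftime("%Y-%m-%d")

    def get_output_run(self, pipeline_file: str, deployment: str) -> str:
        # Deriving the run name from the pipeline file means different
        # pipelines naturally map to different runs.
        pipeline_name = pipeline_file.rsplit("/", 1)[-1].removesuffix(".yaml")
        return f"{self.instrument}/prompt/output-{self.date}/{pipeline_name}/{deployment}"
```

Because the date is captured once, repeated calls for different pipelines agree on the datestamp, which is what makes a multiple-runs-per-visit design easy to layer on top.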
The output chain, which combined an output run, raw inputs, and a
defaults collection that excluded raws, was originally based on the ap_verify
dataset organization, which in turn imitated the collection style
created by pipetask. However, a distinction between raw and default
collections will not be needed in production, and the pipetask layout
is difficult to reconcile with a workflow that may use a different run
for every visit.

The output runs were removed from the "output" collection in e38bda3.
This commit removes the output collection itself, replacing it with
direct use of the defaults collection. make_local_repo and prep_butler
now guarantee that the run collection is part of the default chain in
the dev environment.

The removal of the <instrument>/prompt chain is a behavioral change,
though one that should only affect the local repo and not the
central repo.
Moving the choice of pipeline to the caller makes it possible to
prepare a choice of pipelines depending on procedural rather than
object state.
Moving the choice of pipeline to the caller makes it possible to get
the run(s) corresponding to any or all pipelines that it may need.
The test actually triggers (and tests for) a failure to define visits;
quantum graph generation is not even attempted before the looked-for
exception is raised.
Pipeline execution now loops over the list of available pipelines,
until it finds one that can be run. All other operations can simply
be applied to all pipelines.
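The loop described above might look roughly like the following sketch. All names here (`run_pipeline`, `build_graph`, `execute`) are illustrative stand-ins, not the real MiddlewareInterface API.

```python
def run_pipeline(pipeline_files, build_graph, execute):
    """Try each candidate pipeline in order; fall back to the next one
    only while quantum graph generation fails or yields an empty graph.
    `build_graph` and `execute` are placeholders for middleware calls.
    """
    for pipeline_file in pipeline_files:
        try:
            graph = build_graph(pipeline_file)
        except Exception:
            # Graph generation failed; try the next (simpler) pipeline.
            continue
        if not graph:
            # Empty graph: this pipeline can't run on this visit either.
            continue
        # Past this point, partial execution creates datasets.
        # Don't retry -- either fail (raise) or finish with this pipeline.
        execute(graph)
        return pipeline_file
    raise RuntimeError("No pipeline could be run for this visit.")
```

The key design point, matching the comment snippet later in this thread, is that fallback stops once execution begins: a partially executed pipeline has already written datasets, so switching pipelines at that point would mix outputs.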
This involves renaming the get_pipeline_file method to
get_pipeline_files, and replacing None return values with an empty
list. I've not tried to support more generic sequences in the spec,
since it's much harder to test and unlikely to be needed in an
internal class.
This is a breaking change to the pipeline config format, which now
always maps visit parameters to lists of files rather than single
files.
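Under the new format, a config entry might look like the lines below. The survey names and paths are invented for illustration; per the spec, an empty list (or the keyword "None") means no pipeline is run for that survey.

```
(survey="Survey1")=[${AP_PIPE_DIR}/pipelines/ApPipe.yaml, ${AP_PIPE_DIR}/pipelines/SingleFrame.yaml]
(survey="Survey2")=[]
```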
Since runs and configs are generated for *all* pipelines before
attempting graph generation, and runs are named based on the pipeline
file, pipelines with the same name would interfere with each other.

Allowing different subsets of a pipeline (using # notation) should be
safe, but I'm disallowing it to avoid confusion over exactly what
got run.
These pipelines can be used for visits that aren't compatible with the
full AP pipeline, or when faster processing is desired.
in_collections=output_run,
)
if exports:
exported = True
Collaborator


If one pipeline yields outputs and the outputs are exported, is there a scenario that another pipeline has anything to be exported?
I'm wondering whether we still need to go to the next item of the for loop after exported is True.

Member Author

@kfindeisen kfindeisen Jul 25, 2023


I don't expect more than one collection to contain outputs, but I'm not confident that that will always be true in the presence of failures or future code changes (e.g., doing pre-execution as a separate step, or expanding fallback to more situations). This way seems more robust.
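The robustness argument in this reply can be sketched as a loop that checks every candidate output run instead of stopping at the first one that exports anything. Function and parameter names below are illustrative, not the actual service code.

```python
def export_outputs(output_runs, find_datasets, export):
    """Export datasets from every candidate run, not just the first
    non-empty one; after failures or fallback, more than one run
    could in principle contain outputs."""
    exported = False
    for output_run in output_runs:
        exports = find_datasets(in_collections=output_run)
        if exports:
            export(exports)
            exported = True
        # Deliberately keep iterating even after exported is True.
    return exported
```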

``(survey="<survey>")=[<pipelines>]``. The zero or more pipelines
are comma-delimited, and each pipeline path may contain environment
variables. The list may be replaced by the keyword "None" to mean
no pipeline should be run. No key or value may contain the "="
Collaborator


In slaclab/rubin-usdf-prompt-processing@537d212 I see you replaced all the None's with [] for those surveys with no pipelines to run. From this PR it looks like either way should work. Am I misunderstanding?

Member Author


No, that's correct -- the replacement of None with [] was not strictly necessary. See the commit message for 537d212. 🙂

"see previous logs for details.")
continue
# Past this point, partial execution creates datasets.
# Don't retry -- either fail (raise) or break.
Collaborator


Do I understand correctly that for now, the only scenario to fallback is a failure in making the quantum graph?

Member Author


Yes, at least for now. Do you know of any other scenarios where this is a good strategy?

Collaborator

@hsinfang hsinfang Nov 4, 2023


Got another one that might be relevant here: quantum graph builds fine, but the needed datasets aren't really available so it fails. For example, the needed refcats aren't in the central repo in the searched collections. It can show as something like
FileNotFoundError: Not enough datasets (0) found for non-optional connection...

Member Author

@kfindeisen kfindeisen Nov 6, 2023


I assume this happens only with PrerequisiteInput? Or does it also affect templates?

Collaborator

@hsinfang hsinfang Nov 6, 2023


Looks like only with PrerequisiteInput
https://github.com/lsst/pipe_base/blob/main/python/lsst/pipe/base/connections.py#L852-L855

I haven't traced down the middleware enough to understand whether the qgraph would be built successfully in this case (I don't know when adjustQuantum happens). When I encountered this error, I don't think the PP fallback pipeline ran, so it seems the qgraph failed to build. But it's not generating an empty qgraph, which is the case the PP fallback pipeline is designed for. [I'm not sure of anything..]

Member Author


It looks like, despite the docstring, adjustQuantum is actually run right before the quantum is executed.

Member Author

@kfindeisen kfindeisen Nov 6, 2023


Sorry for the confusion on Slack! I've filed DM-41596 to address this case.

@kfindeisen kfindeisen merged commit 5c47b77 into main Jul 26, 2023
@kfindeisen kfindeisen deleted the tickets/DM-39655 branch July 26, 2023 17:13