Add FAQ guidance Re: writing injected outputs
leeskelvin committed Apr 26, 2024
1 parent d26fd9d commit 62856f3
Showing 1 changed file with 61 additions and 0 deletions: doc/lsst.source.injection/faqs.rst

Please also note that it is not currently possible to perform source injection on the command line without first ingesting an input catalog into the data butler.
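
A catalog can be ingested ahead of time from within Python using the ``ingest_injection_catalog`` function. The minimal sketch below assumes a writeable butler repository ``REPO`` and an in-memory Astropy table ``my_injection_catalog``:

.. code-block:: python

    import os

    from lsst.daf.butler import Butler
    from lsst.source.injection import ingest_injection_catalog

    user = os.getenv("USER")

    # A writeable butler is required to ingest datasets into the repository.
    writeable_butler = Butler(REPO, writeable=True)

    # Ingest the input catalog into a dedicated input collection.
    my_injected_datasetRefs = ingest_injection_catalog(
        writeable_butler=writeable_butler,
        table=my_injection_catalog,
        band="i",
        output_collection=f"u/{user}/my_injection_inputs",
    )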

.. _lsst.source.injection-faqs-write-into-butler:

Is it possible to write injected exposures back into a data butler?
===================================================================

When working in a command line environment, running ``pipetask run`` automatically writes injected outputs back into the data butler.
However, when working in a Python/Jupyter notebook environment, running injection tasks directly will not write any outputs back into the data butler.
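
For example, the coadd-level injection task may be called directly, in which case its outputs are only returned in memory. A minimal sketch, assuming an ``input_exposure`` and an ``injection_catalog`` have already been retrieved from the butler:

.. code-block:: python

    from lsst.source.injection import CoaddInjectConfig, CoaddInjectTask

    # Instantiate the coadd-level injection task.
    inject_config = CoaddInjectConfig()
    inject_task = CoaddInjectTask(config=inject_config)

    # Outputs are returned in memory only; nothing is written to the butler.
    injected_output = inject_task.run(
        injection_catalogs=[injection_catalog],
        input_exposure=input_exposure.clone(),
        psf=input_exposure.getPsf(),
        photo_calib=input_exposure.getPhotoCalib(),
        wcs=input_exposure.getWcs(),
    )
    injected_exposure = injected_output.output_exposure
    injected_catalog = injected_output.output_catalog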

Whilst it is *possible* to write bespoke injected outputs into the data butler, e.g., to facilitate onward processing downstream, this is not normally recommended.
Rather, injecting sources into data within a notebook environment is intended for quick-look analyses only.
Datasets ingested into the butler must satisfy strict format requirements; without these, it would be impossible to guarantee the integrity of a given pipeline.

If you do wish to perform source injection inside a Python/Jupyter notebook environment and store the outputs in the data butler for subsequent processing, it is strongly recommended to make use of the `SimplePipelineExecutor`.
The `SimplePipelineExecutor` is a lightweight high-level executor for running pipelines in Python.
The example below demonstrates how to use the `SimplePipelineExecutor` to run a source injection task from a previously constructed pipeline (see :ref:`lsst.source.injection-ref-make`) and write the outputs back into the data butler:

.. code-block:: python

    import os

    from lsst.ctrl.mpexec import SimplePipelineExecutor
    from lsst.pipe.base import Pipeline

    user = os.getenv("USER")

    # Create a Butler instance with collections appropriate for processing.
    butler = SimplePipelineExecutor.prep_butler(
        REPO,
        inputs=["HSC/runs/RC2/w_2024_10/DM-43178", "injection/defaults"],
        output=f"u/{user}/my_injected_outputs",
    )

    # Load the pipeline from a YAML file.
    pipeline = Pipeline.fromFile(
        os.path.join("PATH", "TO", "DRP-RC2+injection.yaml#inject_coadd")
    )

    # Optionally configure a task.
    pipeline.addConfigOverride(
        label="inject_coadd",
        key="process_all_data_ids",
        value=True,
    )

    # Check that the config has been applied.
    inject_coadd_task = pipeline.to_graph().tasks.get("inject_coadd")
    print(inject_coadd_task.config.process_all_data_ids)  # returns: True

    # Create an executor; build a QuantumGraph from an in-memory pipeline.
    executor = SimplePipelineExecutor.from_pipeline(
        pipeline=pipeline,
        where="instrument='HSC' AND skymap='hsc_rings_v1' AND tract=9813 AND patch=42 AND band='i'",
        butler=butler,
    )

    # Run all quanta in the QuantumGraph.
    quanta = executor.run(register_dataset_types=True)
    print(f"number of quanta executed: {len(quanta)}")

    # Get the run outputs.
    dataset_refs = {dtype.name: ref[0] for dtype, ref in quanta[0].outputs.items()}
    print(f"available dataset types: {list(dataset_refs.keys())}")
    injected_deepCoadd = butler.get(dataset_refs["injected_deepCoadd"])
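
Because the run outputs are now persisted in the butler, they can also be retrieved in a later session directly from the output collection. A minimal sketch, assuming the repository and collection names used above:

.. code-block:: python

    from lsst.daf.butler import Butler

    # Instantiate a fresh butler pointing at the output collection.
    butler = Butler(REPO, collections=f"u/{user}/my_injected_outputs")

    # Retrieve the injected coadd by dataset type and data ID.
    injected_deepCoadd = butler.get(
        "injected_deepCoadd",
        dataId={"skymap": "hsc_rings_v1", "tract": 9813, "patch": 42, "band": "i"},
    )
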
.. _lsst.source.injection-faqs-restrictions-guidelines-conventions:

Are there restrictions or guidelines for ingesting source injection images and/or catalogs into the butler?
============================================================================================================