DM-15189: Initial implementation of Gen3 raw-data ingest #107

Merged
merged 3 commits into from Aug 24, 2018

TallJimbo
Member

No description provided.

config : `RawIngestConfig`
    Configuration for whether/how to transfer files and how to handle
    conflicts and errors.
butler : `daf.butler.Butler`
Member

All class references in the docstrings need the lsst. prefix. Consider using ~lsst.daf.butler.Butler, etc.
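
As an illustration of the suggestion above, here is a minimal sketch of a numpydoc-style docstring that uses the fully qualified, tilde-prefixed reference. The function `describe_butler` is a hypothetical helper invented for this example and is not part of the pull request; only the `~lsst.daf.butler.Butler` reference comes from the review comment.

```python
def describe_butler(butler):
    """Return the type name of a butler (hypothetical helper).

    Parameters
    ----------
    butler : `~lsst.daf.butler.Butler`
        Fully qualified class reference. The leading ``~`` asks Sphinx
        to display only the final name, ``Butler``, while still linking
        to the full path.

    Returns
    -------
    description : `str`
        Name of the object's type.
    """
    # The body is deliberately trivial; the docstring is the point.
    return type(butler).__name__
```

The reference resolves only when the full dotted path is given, which is why the bare `daf.butler.Butler` form draws the comment.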

Member

@timj timj left a comment


Looks ok. A couple of log messages need deferred string formatting.

try:
    self.processFile(os.path.abspath(file))
except Exception as err:
    self.log.warn("Error processing '{}': {}".format(file, err))
Member

Use:

self.log.warnf("Error processing '{}': {}", file, err)

to defer the formatting until you know the message will appear.
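
To show why deferral matters, here is a minimal sketch that uses Python's standard `logging` module as a stand-in. That is an assumption made for illustration: the `warnf` method named in the review takes brace-style templates, whereas the standard library defers formatting through %-style arguments. `CountingError`, `eager_warning`, and `deferred_warning` are hypothetical names introduced for this example.

```python
import logging


class CountingError(Exception):
    """Exception that counts how often it is rendered as a string."""

    def __init__(self, message):
        super().__init__(message)
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return super().__str__()


def eager_warning(log, file, err):
    # The message is built before the logger checks its level,
    # so the formatting cost is paid even if nothing is emitted.
    log.warning("Error processing '{}': {}".format(file, err))


def deferred_warning(log, file, err):
    # The template and arguments are passed separately; the logger
    # formats them only if a WARNING record is actually handled.
    log.warning("Error processing '%s': %s", file, err)
```

With the logger level set above WARNING, the eager version still renders the exception once, while the deferred version never renders it.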

raise IngestConflictError("Ingest conflict on {} {}".format(file, dataId))

# Add component Dataset entries, assuming this is a concrete
# composite. We may need to revisit that assumption in the future.
Member

If raw always means "entity written by the telescope for this observation" then isn't raw by definition always a concrete composite?

Member Author

@TallJimbo TallJimbo Aug 24, 2018

I hope so. I'm worried about TransmissionCurves and Detectors, which are Exposure components that we keep in calibration repositories; if we can defer attaching those to Exposures until ISR or some pre-ISR PipelineTask, we're fine. But that makes it very inconvenient for people doing ad-hoc things with raw data, as lots of people will be doing in commissioning.

# clause above; if we get a conflict at this point the Registry is
# corrupted somehow, and so we want that more serious error to
# propagate up to the *onError* handling in run().
for component in ref.datasetType.storageClass.components:
Member

The code duplication here from Butler.put() is a little annoying.

Member Author

Agreed. What do you think about a recursive boolean keyword argument to Registry.addDataset that would run this logic there?
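
One way to picture the proposal is the toy sketch below. It is not the real `Registry` API: `ToyStorageClass`, `ToyRegistry`, the `parent.component` naming, and the `ValueError` on conflict are all assumptions made for illustration. The point is only that a `recursive` flag lets the registry walk the storage class's components itself, so callers such as ingest and `Butler.put()` would not each repeat the loop.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStorageClass:
    """Hypothetical stand-in for a storage class with named components."""

    name: str
    components: dict = field(default_factory=dict)  # component name -> ToyStorageClass


class ToyRegistry:
    """Hypothetical in-memory registry; not the real daf_butler Registry."""

    def __init__(self):
        self.datasets = []  # (dataset type name, sorted data ID items)

    def addDataset(self, name, storageClass, dataId, recursive=False):
        key = (name, tuple(sorted(dataId.items())))
        if key in self.datasets:
            raise ValueError(f"Conflict on {name} {dataId}")
        self.datasets.append(key)
        if recursive:
            # Register each component under an assumed "parent.component" name.
            for compName, compClass in storageClass.components.items():
                self.addDataset(f"{name}.{compName}", compClass, dataId,
                                recursive=True)
        return key
```

For example, adding a `raw` dataset whose storage class has `image` and `mask` components with `recursive=True` would register `raw`, `raw.image`, and `raw.mask` in a single call.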

@RobertLuptonTheGood
Member

Jim and I have been discussing access to raw data. It is very convenient to be able to say data = butler.get('raw', dataId) and get a "proper" exposure with all 16 amps assembled into a mosaic (preserving the pre/overscan regions) and with a Detector attached. For example, I can display this image with overscan subtracted, run AssembleCcdTask or IsrTask, calculate properties of the serial vs. parallel overscans, and so forth.

@TallJimbo TallJimbo merged commit fbb8ea7 into master Aug 24, 2018
@TallJimbo TallJimbo deleted the tickets/DM-15189 branch August 24, 2018 18:53
3 participants