Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-15189: core butler support for raw data ingest #59

Merged
merged 14 commits into from Aug 24, 2018
Merged

Conversation

TallJimbo
Copy link
Member

@timj, this is by no means done yet, but it occurred to me that we might be doing conflicting things with Datastores, so I wanted to get this out there in case you'd like me to spin it off to a new issue and merge sooner. I also need to write some tests for it, but since you've just refactored Datastore tests I definitely didn't want to do that before rebasing on that work.

@TallJimbo TallJimbo force-pushed the tickets/DM-15189 branch 2 times, most recently from 1e6c30e to d6b4931 Compare July 25, 2018 17:31
@TallJimbo TallJimbo closed this Aug 2, 2018
@TallJimbo TallJimbo reopened this Aug 2, 2018
@TallJimbo
Copy link
Member Author

Most of the code originally here was moved to DM-15212, so some earlier comments may no longer make sense.

@TallJimbo TallJimbo force-pushed the tickets/DM-15189 branch 5 times, most recently from 93dd568 to 82378d4 Compare August 11, 2018 14:51
@TallJimbo TallJimbo force-pushed the tickets/DM-15189 branch 5 times, most recently from f2aae5e to 7aa301e Compare August 22, 2018 14:42
@TallJimbo TallJimbo requested a review from timj August 22, 2018 14:56
@TallJimbo
Copy link
Member Author

The full ticket is finally ready for review. In this package (as well as obs_subaru), I recommend looking at each commit separately; this package in particular just contains a lot of miscellaneous improvements necessary for making higher-level code in obs_base and obs_subaru work.

Copy link
Member

@timj timj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general looks good. I have quite a few comments.

ImageF:
<<: *Image
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Image only has one field and that field is being over-ridden, I'm not sure this line does anything (and the other similar changes). Are you worried that Image might get extra fields?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it doesn't actually do anything; I just felt that using <<: to indicate inheritance even when it was a no-op was a useful bit of code-as-documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not too late to add a parent: keyword and get proper pythonic inheritance for StorageClasses (I think I can sort out print through of those class properties).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see what happens with DM-15238 first.

@@ -5,7 +5,7 @@ datastore:
table: PosixDatastoreRecords
create: true
templates:
default: "{datasetType}/{tract:?}/{patch:?}/{filter:?}/{camera:?}_{visit:?}"
default: "{collection}/{datasetType}/{tract:?}/{patch:?}/{filter:?}/{camera:?}_{visit:?}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment on defaulting filenames: I'm a bit worried that with this in the base definition it's not possible for people to remove defaulting in their own datastores. Maybe I should allow default: None to disable defaulting.

apCorrMap: TablePersistableApCorr
coaddInputs: TablePersistableCoaddInputs
transmissionCurve: TablePersistableTransmissionCurve
metadata: PropertyList
ExposureComposite: &ExposureComposite
<<: *Exposure
assembler: lsst.daf.butler.assemblers.exposureAssembler.ExposureAssembler
ExposureCompositeF:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change won't be needed once #76 is merged because ExposureCompositeF becomes a simple label that allows you to control concrete/virtual at the butler level.

apCorrMap: TablePersistableApCorr
coaddInputs: TablePersistableCoaddInputs
transmissionCurve: TablePersistableTransmissionCurve
metadata: PropertyList
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getMetadata is documented to return a PropertySet though?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we discussed this on the other ticket. I'll convert to PropertySet here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually left it alone in #76 since it worked regardless.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I'll stick with PropertyList here, too, then.

@@ -266,6 +266,9 @@ def getDirect(self, ref):
----------
ref : `DatasetRef`
Reference to an already stored dataset.
parameters : `dict`
Additional StorageClass-defined options to control reading,
typically used to efficiently only a subset of the dataset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing word between "efficiently" and "only". Maybe I should work on parameters next (inMemoryDatastore will mess up with any tests where you use parameters).

if field_name in ("run", "collection"):
usedRunOrCollection = True
if optional:
raise KeyError("'run' and 'collection' may not be optional.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you allowed to specify an optional run if collection is mandatory? (not that it's possible for run to be optional of course). Maybe the error here should be "can not be optional" ("may" sounds like you might be able to be optional). Do you have a test for optional collection below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on "can be optional". Will add a test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, actually, yeah, I suppose you could make one optional if the other is mandatory. Probably best to just remove this particular check, as there is another one to verify that at least one of those is present.

tiny = self.readFull(fileDescriptor, parameters)
return fileDescriptor.storageClass.assembler().getComponent(tiny, component)

def readFull(self, fileDescriptor, parameters=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are defining the parameters for get to be any parameters understood by the Exposure constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at the moment - just bbox and origin. I suppose we need a place to document this at the StorageClass level. I'm open to ideas.

__all__ = ("FitsRawFormatterBase",)


class FitsRawFormatterBase(FitsExposureFormatter, metaclass=ABCMeta):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the methods here need parameters and returns sections to the numpydocs.

@@ -33,6 +33,53 @@ class FitsExposureFormatter(Formatter):

parameters = frozenset(("bbox", "origin"))

def readImageComponent(self, fileDescriptor, component):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add parameters/returns sections to docstrings.

A dictionary containing all fields in the Exposure table.
"""
avg = visitInfo.getDate()
begin = DateTime(int(avg.nsecs(DateTime.TAI) - 0.5E9*visitInfo.getExposureTime()), DateTime.TAI)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean we can't get the value of DATE-OBS from the object but need to guess it by subtracting half the exposure time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, yes. I don't know much about VisitInfo. I stole this recipe from obs.base.makeRawVisitInfo.offsetDate, and I'd like to replace both that and VisitInfo itself, but this seemed like an easy way to get the correct (or at least consistent with Gen2) answer.

@TallJimbo
Copy link
Member Author

I think I've addressed all comments (and rebased on latest master).

One question from that rebase (which I should have asked on a previous review, probably): do we still need the ExposureComposite StorageClasses?

@timj
Copy link
Member

timj commented Aug 24, 2018

Re ExposureComposite: I think we should move it to the test configs so that the tests can try a virtual and concrete composite.

@TallJimbo
Copy link
Member Author

TallJimbo commented Aug 24, 2018

I think we should move it to the test configs so that the tests can try a virtual and concrete composite.

I gave this a quick try on this ticket, but it's not easy because aliases apparently can't be used outside the file where they're defined (so you can't easily make ExposureComposite inherit attributes from Exposure when they're in different files).

@timj
Copy link
Member

timj commented Aug 24, 2018

I was worrying about that cross-yaml file inheritance this morning. Would trivially be fixed if we had true python inheritance as well. Leave ExposureComposite in there for now -- I really do need to make a test for Exposure that uses virtual and one that uses concrete.

These are useful for raw data ingest, too, so move them out of
the convert package.  New home still probably isn't their
permanent one.
This change didn't turn out to be necessary for this ticket,
but it seemed worth preserving.
The duplication of all Exposure components in order to refine
the types of some is unfortunate, but should go away after
DM-15238.
In retrospect, this needs to be modified to produce an
Exposure, not a DecoratedImage, and we can safely
remove the write method.
@TallJimbo TallJimbo merged commit a6f5c36 into master Aug 24, 2018
@TallJimbo TallJimbo deleted the tickets/DM-15189 branch August 24, 2018 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants