New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-15256: Include calibration repositories in Gen2->Gen3 conversion #92
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach looks good, but I have some style/structure comments.
I'm actually amazed it was this straightforward to add (and that it didn't require any changes to the ConversionWriter
class as well).
"valid_last": datetime.datetime.strptime(row["validEnd"], "%Y-%m-%d"), | ||
} | ||
|
||
calibConn.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to write this as:
with sqlite3.connect(calibPath) as calibConn:
...
to avoid the explicit close
(and make sure the close
happens in the presence of exceptions). I was a bit surprised to find that that doesn't seem to be in the Python sqlite3 module documentation, but it definitely works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mapperClassPath = f.read().strip() | ||
MapperClass = doImport(mapperClassPath) | ||
print("In calib: %s %s %d" % (calibPath, mapperFilePath, len(self._calibDict))) | ||
isCalib = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to factor out all of this extraction of calibration registry content into a separate function or method, just for clarity (i.e. even if it's only called here, it will make the rest of this method easier to read).
I should add that I also don't know whether the schema of the HSC calibration registries is the same one used in the calibration registries for other cameras. I don't think we should try to support all other cameras on this ticket, but it might be worthwhile to take a look at this from the perspective of maybe having to inject some camera-customization hooks here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the additional method to handle fetching information from the gen2 calibration registry.
Checking the registries that I can find for other cameras (DECam and comCam), it appears that the set ("dark","flat","fringe","sky","bias","defect") contains all tables and calibrations currently available. I have modified this section to generate the calibDict query based on the tables that actually exist in the registry.
@@ -118,7 +159,7 @@ def tryRoot(self, root): | |||
|
|||
# This directory has no _mapper, no _parent, and no repositoryCfg.yaml. | |||
# It probably just isn't a repository root. | |||
if not (parentPaths or MapperClass): | |||
if not ((parentPaths or MapperClass) or isCalib): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inner parentheses here do not need to and probably should not exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I have a parentheses problem.
dataset.dataId["valid_first"] = calibRow["valid_first"] | ||
dataset.dataId["valid_last"] = calibRow["valid_last"] | ||
print(calibRow) | ||
except KeyError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using KeyError
to do control flow for the common case where the dataset is not a calibration product, I'd recommend using something like:
calibRow = self._calibDict.get(..., None)
if calibRow is not None:
...
That's just a slight preference, and possibly one driven by experience with languages that approach exceptions differently from Python. Feel free to push back if you disagree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this. Using the exception was something I wasn't totally happy with, so a default value works nicely.
if calibRow["filter"] is None or calibRow["filter"] == dataset.dataId["filter"]: | ||
dataset.dataId["valid_first"] = calibRow["valid_first"] | ||
dataset.dataId["valid_last"] = calibRow["valid_last"] | ||
print(calibRow) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This print
looks like a debugging relic. It'd be nice to keep it, but via log.debug
and a more specific message so we only get it when we turn the verbosity up.
Also, I think we might want to have an else
clause here that does a log.warn
, because it seems like it might be a problem if we find the dataset in the calib registry but don't understand its filter. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is designed to allow datasets to either depend on the filter (flat/sky) or not (dark/bias) as needed. This was a debug statement, so I've made it a log.debug() call.
6f64b9c
to
d3ae7b3
Compare
The Gen2->Gen3 conversion code ignored calibration repositories entirely. This fixes the issue, by having the gen2convert/walker.py script directly query calibration repos as they are discovered. As calibRegistry.sqlite3 contains no mapper information, this is derived by searching parent directories to find a valid mapper to use. Matching DM-15760, this uses datetime objects for the valid_first/valid_last entries in the Dataset table. Information about the MaskedImageF was added to the storage class/datastore yaml files to allow the FLAT to be handled properly. Additional changes to support this in obs_subaru.
d3ae7b3
to
1c3380f
Compare
The Gen2->Gen3 conversion code ignored calibration repositories
entirely. This fixes the issue, by having the gen2convert/walker.py
script directly query calibration repos as they are discovered. As
calibRegistry.sqlite3 contains no mapper information, this is derived
by searching parent directories to find a valid mapper to use.
Matching DM-15760, this uses datetime objects for the
valid_first/valid_last entries in the Dataset table.
Information about the MaskedImageF was added to the storage
class/datastore yaml files to allow the FLAT to be handled properly.
Additional changes to support this in obs_subaru.