New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-21501: Implement internal aperture corrections for fgcmcal tract mode #19
Conversation
This is supported in the latest version of fgcm to use externally calibrated aperture correction slope parameters.
Also require that checkAllCcds is False for tract-based mode, because it should be False (since visits and ccds will be specified explicitly in the dataRefs in this mode).
This feature should not be generally needed, but is very useful for testing and debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me...but I did have a few comment/questions that may be worth considering.
# Aperture correction input slope parameters. There should be one slope ber band. | ||
# This is used when there is insufficient data to fit the parameters from the data | ||
# itself (e.g. tract mode or RC2). | ||
config.aperCorrInputParameters = (-1.0150, -0.9694, -1.7229, -1.4549, -1.1998) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"should" or must be one slope per band? How does the algorithm know if there is insufficient data to fit the parameters? It seems that setting this list (side note: I've never seen ListFields
set a s tuple
s before, but I guess there's no harm?!) is also serving as the config controlling the decision (i.e. if aperCorrInputParameters
is set to a non-emply list, then use them, else, perform a fit?) If this is the case, this should be made VERY clear in the documentation. Another (but not necessarily preferable) pattern I've seen is to have a doFitAperCorrSlopes
that controls this decision...it is an extra config parameter and can lead to confusion when parsing a config file and seeing aperCorrInputParameters
, but not realizing that they aren't necessarily being used (although your doctstring can indicate they are not used when doFitAperCorrSlopes=True)
, and it's a pattern common throughout the stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things on this! First: "must" is correct, I will change that. The number of elements is checked by the fgcm
code, I'll make that clear. (This is also the case with the other config vars which are per band).
Second: the controlling variable on whether or not to perform a fit is aperCorrFitNBins
but this is not actually described in the fgcmcal
documentation.
So the idea is that these are the input parameters (optional!) and it will use these at the start of the run, and if aperCorrFitNBins == 0
then the output parameters will equal the input parameters. Otherwise they will be different.
This is not explicitly spelled out in the documentation, and I will improve that. But hopefully this logic does make sense, or do I need another config parameter or to rename something as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's convoluted, but as long as it is well-documented, I'm ok with whatever you feel is clear-enough.
@@ -105,7 +105,7 @@ def test_fgcmcalTasks(self): | |||
nGoodZp = 26 | |||
nOkZp = 26 | |||
nBadZp = 1206 | |||
nStdStars = 389 | |||
nStdStars = 390 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I updated the tests to use the aperture corrections as the input, the fits got just a bit better and one more star passed the cuts.
@@ -226,6 +226,8 @@ def runDataRef(self, butler, dataRefs): | |||
|
|||
if not self.config.fgcmBuildStars.doReferenceMatches: | |||
raise RuntimeError("Must run FgcmCalibrateTract with fgcmBuildStars.doReferenceMatches") | |||
if self.config.fgcmBuildStars.checkAllCcds: | |||
raise RuntimeError("Cannot run FgcmCalibrateTract with fgcmBuildStars.checkAllCcds set to True") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get that the visit need to be explicitly provided, but why does ccd list need to be too (this comment is really based on the commit message)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not that the ccd list needs to be given, it's that this flag should only be used in testing (in particular, when you have only a few ccds per visit.) This is in the docs for checkAllCcds
. It's also the case that in tract mode you will have specified the visit list and the tract on the command line and the PerTractCcdDataIdContainer
takes care of a lot of this.
My expectation is that the change to PipelineTask
will make most/all of this obsolete.
"slope per band. This is used when there is insufficient data to " | ||
"fit the parameters from the data itself (e.g. tract mode)."), | ||
dtype=float, | ||
default=[], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like it would be all-too-easy to get the order of these wrong. I assume the filter ordering is dictated by how they are set in bands
, but I had to scroll up off the page (as I'm currently looking at it in GitHub) to find that config parameter, and your docstring is not specific about the ordering. Might a mapping be less error-prone?
I'm also puzzled by the name of this parameter. Using the term "input" implies (to me, at least) that there will be some accompanied "output". And, if these are "slopes", why call them "parameters"? This boils down to me wondering why the name isn't just aperCorrSlopes
(with maybe a "fallback"/"default" or some such descriptor as, if I'm reading all this correctly, these are only meant to be used as the slopes if a live fitting was not possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason they aren't called aperCorrSlopes
is that originally I had an slope/offset/pivot tuple here, and then I realized only the slope was useful. And didn't think to change the name. Which I will now!
As for the ordering, there are bunch of these that should be matched to bands
. I will make sure that at least the documentation makes this clear. If you think a mapping would be useful, I can do that but there are a bunch of these and I'd probably do that on a separate ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I leave that decision up to you (just so long as whatever is there is clearly documented!)
tests/test_fgcmcal_hsc.py
Outdated
@@ -172,10 +172,14 @@ def test_fgcmcalTract(self): | |||
|
|||
self.config = fgcmcal.FgcmCalibrateTractConfig() | |||
self.fillDefaultBuildStarsConfig(self.config.fgcmBuildStars, visitDataRefName, ccdDataRefName) | |||
self.config.fgcmBuildStars.checkAllCcds = False | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra space needed?
# itself (e.g. tract mode or RC2). | ||
config.aperCorrInputParameters = (-1.0150, -0.9694, -1.7229, -1.4549, -1.1998) | ||
# "Fudge factors" for computing SED slope | ||
config.sedFudgeFactors = (0.25, 1.0, 1.0, 0.25, 0.25) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this fit in with allowing fixed calibration slopes? Should this maybe be a separate commit with an explanation where the numbers are coming from/used for?? Also, same issue with the error-prone ordering issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So these numbers came from eyeballing some stellar loci that I did ages ago, but never checked in for this very reason. I could say these are what I use for DES...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Future you might appreciate the comment!
@@ -347,7 +347,7 @@ def runDataRef(self, butler): | |||
|
|||
# Output the standard stars in stack format | |||
if self.config.doRefcatOutput: | |||
self._outputStandardStars(butler, stdCat, offsets) | |||
self._outputStandardStars(butler, stdCat, offsets, self.config.datasetConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't check the code flow super closely, but does this call happen when in "tract mode"? I assume not because I think it should not work properly if the special datasetConfig
with the tract added is not passed in...but maybe it does and not only repeats the call already made in generateOutputProducts()
, but also gets an erroneous persistence path, so I thought I'd point it out and let you do the tracing :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code does not get run in tract mode. Then the runDataRef
isn't used, but generateOutputProducts
is instead. It's one or the other, and I can add a code comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I will rename generateOutputProducts
to generateTractOutputProducts
.
Add the ability to use externally calibrated aperture corrections for use in tract mode when there isn't sufficient data to fit them empirically. Also fix a dataRef logic error when running globally (e.g. PDR1) and the ability to output stars in tract mode, used for qa and debugging.