
DM-21221: PTC task should produce a linearity model #24

Merged

merged 1 commit into master from tickets/DM-21221 on Jan 26, 2020

Conversation

plazas (Contributor) commented on Jan 16, 2020

No description provided.

@plazas plazas requested a review from czwa January 16, 2020 21:14
python/lsst/cp/pipe/ptc.py (outdated)
# dataset is modified in place but also returned for external code
dataset = self.fitPtcAndNl(dataset, ptcFitType=self.config.ptcFitType)
numberAmps = len(detector.getAmplifiers())
numberAduValues = 2**18 # up to 18 bits
Contributor:

Does it make sense to make lookup tables that are this large?

plazas (Contributor, author) commented Jan 22, 2020:

I also wondered about this. I wanted to cover, at least a priori, the possible ADU range for LSSTCam (which goes to 18 bits, as opposed to other cameras, such as DECam, which go to 16 bits). In practice, I would expect some cutoff below 2**18 in the maximum ADU (for example, the file "obs_lsst/policy/lsstCam.yaml" has "linearityMax : 142857"). Should this value ("numberAduValues") perhaps be set as a config parameter instead?

Contributor:

I think it needs to be configurable at least.
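A minimal sketch of what making the table size configurable could look like. This is illustrative only: the real cp_pipe config uses lsst.pex.config fields, and the parameter name here is a hypothetical placeholder.

```python
from dataclasses import dataclass

# Hypothetical sketch: expose the lookup-table length as a configurable
# value instead of hard-coding 2**18.  The field name is illustrative,
# not the actual cp_pipe config parameter.
@dataclass
class PtcConfigSketch:
    # Maximum ADU value covered by the lookup-table linearizer.
    # LSSTCam ADCs are 18-bit; a DECam-like camera would use 2**16.
    maxAduForLookupTableLinearizer: int = 2**18

config = PtcConfigSketch()
numberAduValues = config.maxAduForLookupTableLinearizer
```

A camera with a smaller ADC range could then simply override the default, e.g. `PtcConfigSketch(maxAduForLookupTableLinearizer=2**16)`.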

python/lsst/cp/pipe/ptc.py (outdated)
@@ -588,7 +604,80 @@ def _makeZeroSafe(self, array, warn=True, substituteValue=1e-9):
array[array == 0] = substituteValue
return array

-def fitPtcAndNl(self, dataset, ptcFitType):
+def calculateLinearityResidualAndLinearizers(self, exposureTimeVector, meanSignalVector):
"""Fit a quadratic polynomial to the mean vs time curve to get c0 for LinearizeSquared,
Contributor:

I have concerns:

  • Fitting two polynomials seems like a waste. I get that we want a squared term for a LinearizeSquared, but if the linearization doesn't match that functional form, that's important to know. I would think that fitting a higher order polynomial (cubic or whatever), and then confirming that all terms k[3:] are ~zero would be better. This would ensure that both of the forms returned yield the same science results as well (if the linearization is fit well by LinearizeSquared, then the image difference in using that rather than a lookup table form that contains a full n-th order polynomial fit with small higher order terms will also be small).
  • I understand that we don't yet have a generic polynomial linearizer, but I think being ready to support that would be better than converting the polynomial into a lookup table.
  • Can we also have a way to put an empirical non-parametric model into the lookup table? I thought that was what that was for originally (I think only obs_decam uses that code, and if I remember the curves correctly, that has linearity models with discontinuous linear segments).

plazas (Contributor, author):

  1. I agree and was debating this. In the end, I decided to fit the quadratic polynomial too because I thought that "LinearizeSquared" assumed exactly that case (i.e., the c0 coefficient comes from exactly a quadratic fit, not a polynomial of another order). However, as you point out, I can just stick to the general n-order polynomial, still approximate c0 as -k2/(k1^2), and check that the k[3:] are ~zero (issuing a warning if for some reason they are not).
  2. Something for another ticket?
  3. As in 2), perhaps this could be done in another ticket. I think you are correct, in the sense that, for example, we would like to use information from a calibration diode to correct for signal-chain non-linearity. For this ticket, we agreed (in Jira) that, for the moment, we would do something simple like taking the linear part of an n-degree polynomial as the linearity correction, and that we would put that information in a lookup table.

Contributor:

Ok, I think I wasn't sure how much had been delegated to the future/alternate linearity functions/etc.
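The approach agreed above (fit one higher-order polynomial, take c0 = -k2/k1^2 for LinearizeSquared, and warn if the k[3:] terms are not ~zero) can be sketched as follows. Function name and tolerance are illustrative, not the cp_pipe implementation.

```python
import numpy as np

def linearizerFromPolynomialFit(expTimes, meanSignal, degree=3, tol=1e-6):
    """Illustrative sketch: fit mean signal vs exposure time with one
    n-order polynomial and derive the LinearizeSquared coefficient."""
    # np.polyfit returns coefficients highest order first; reverse so
    # that k[i] multiplies t**i.
    k = np.polyfit(expTimes, meanSignal, degree)[::-1]
    c0 = -k[2] / (k[1] ** 2)  # LinearizeSquared coefficient approximation
    if np.any(np.abs(k[3:]) > tol):
        print("Warning: higher-order terms are not negligible:", k[3:])
    return k, c0

# Fake data with a mild quadratic nonlinearity: signal = 100*t + 0.5*t**2.
t = np.linspace(1.0, 10.0, 20)
signal = 100.0 * t + 0.5 * t * t
k, c0 = linearizerFromPolynomialFit(t, signal)  # c0 ~ -0.5/100**2
```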

dataset.nonLinearity[ampName] = parsFitNl
dataset.nonLinearityError[ampName] = parsFitErrNl
dataset.nonLinearityResiduals[ampName] = linResidualNl
dataset.coefficientLinearizeSquared[ampName] = c0
Contributor:

Since this is a "human curated" parameter, would it make sense to persist the linearity as something human readable? I'm not sure what use cases you have for the PTC data, so I understand if these get read back in a lot that a pickle would be faster than yaml, but if this is just a set of values that are calculated and written and then used to construct the camera configuration, I would think yaml would be easier to use.

plazas (Contributor, author):

I still don't know how the c0 will be read in case someone wants to use LinearizeSquared, and hence I don't know where exactly it should be stored and in which format. We discussed this in one of our meetings, I think, and I believe we agreed that, for the moment, we would just save it somewhere. Since the dataset structure was already in place, I decided to append the coefficient to it. I have used the PTC data, such as the gains and the noise, to do comparisons with other codes (e.g., the BOT analyses), and pickles have worked fine so far.

Contributor:

Since writing this, I've discovered that yaml files aren't fully supported by butler yet (particularly in gen3). I am going to try to get that working soon, because I like having human readable things easily readable by humans. Pickle is fine for now.
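For reference, the human-readable persistence discussed here could be as simple as a YAML round trip of the per-amplifier coefficients. A minimal sketch, assuming PyYAML is available; the key names and values are illustrative, not a butler dataset:

```python
import yaml  # PyYAML, assumed available

# Illustrative per-amplifier LinearizeSquared coefficients.
coefficients = {"C00": -5.2e-5, "C01": -4.9e-5}

# Dump to a human-readable YAML string and read it back.
text = yaml.safe_dump({"coefficientLinearizeSquared": coefficients})
restored = yaml.safe_load(text)
```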

timj (Member) commented Jan 23, 2020:

Can you let me know what you mean by "not fully supported". I use yaml in gen3 all the time.

Contributor:

I've had to add storageClass/formatter definitions to get yaml dict output to be understood (and I thought I'd rebased to ensure I had everything up to date).

Member:

Ah, well, we may want to move this definition out of the daf_butler tests then:

storageClasses:
  StructuredDataDictYaml:
    pytype: dict

and add

StructuredDataDictYaml: lsst.daf.butler.formatters.yamlFormatter.YamlFormatter

to the config/datastores/formatter.yaml. We hadn't enabled simple Python types by default, and expected more expressive types to be read from YAML files. The default YamlFormatter isn't great at casting the returned dict to a new type (it does try, and could be improved).

czwa (Contributor) left a review:

Pending remaining "Nl" name changes to be less confusing (even full cap NL would work).

@plazas plazas force-pushed the tickets/DM-21221 branch 2 times, most recently from e5068df to f786379 Compare January 26, 2020 14:56
Commit: Add lines that persist lookup table linearizer

- Populate lookup table using polynomial fit
- Add code to persist lookup table linearizer
- Added code to calculate linearizers
- Rename parameter to polynomialFitDegreeNonLinearity
- Use only one polynomial for meanSignal vs expTime fit
- Change 'Nl' to 'NonLinearity'
- Deleted 'min' in config parameter
- Change 'Nl' to 'NonLinearity' in tests
- Add positional argument to fitting function in tests
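The "Populate lookup table using polynomial fit" step above can be sketched as follows: each table entry stores the difference between the ideal linear signal and the fitted polynomial signal, which a lookup-table linearizer can add back as a correction. The function name and sampling scheme are illustrative, not the cp_pipe implementation.

```python
import numpy as np

def makeLookupTableRow(k, numberAduValues):
    """Illustrative sketch of one lookup-table linearizer row.

    k: polynomial coefficients, lowest order first (k[0] + k[1]*t + ...).
    """
    # Exposure time at which the linear term alone reaches the top ADU.
    tMax = numberAduValues / k[1]
    timeRange = np.linspace(0.0, tMax, numberAduValues)
    signalIdeal = k[0] + k[1] * timeRange               # linear part only
    signalUncorrected = np.polyval(k[::-1], timeRange)  # full polynomial
    return signalIdeal - signalUncorrected              # correction per entry

# Mildly nonlinear example: signal = 100*t - 2.5e-3*t**2, 1024 entries.
row = makeLookupTableRow(np.array([0.0, 100.0, -2.5e-3]), 1024)
```

For a detector with a saturating (negative) quadratic term, as here, every correction entry is non-negative; a perfectly linear fit yields an all-zero row.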
@plazas plazas merged commit 76202e3 into master Jan 26, 2020
3 participants