DM-24704: Make brighter-fatter correction a subclass of lsst.ip.isr.IsrCalib #75
Conversation
class BrighterFatterKernelSolveConnections(pipeBase.PipelineTaskConnections,
                                           dimensions=("instrument", "exposure", "detector")):
Gruen et al 2015 and Coulton et al 2017 report a weak chromaticity of the BFE (however, Antilogus' model implies that the electric field distortions happen in the last few microns (<< 100 microns thickness of the CCD), so, according to that, the effect should be achromatic...maybe I should ask Craig too and see what he has measured). In any case, if we wanted to study this at some point, would we be able to easily derive a BF kernel per filter? I put this comment here because I'm hoping that this is just a matter of adding another "filter" (or however it is called now) dimension at this point in the code, but I might be wrong.
We'd need to add `filter` here, and change the `dataset_type_name` used in `connections.outputBFK` (to `chromaticBrighterFatterKernel` or `chromaticBfk`, depending on how verbose). This constraint is there because a given `dataset_type` can only have one defined set of dimensions.
default="AMP",
allowed={
    "AMP": "Every amplifier treated separately",
    "DETECTOR": "One kernel per detector",
Does "DETECTOR" mean that you calculate individual kernels per amp, and then take the average of those to produce one kernel for the detector? Or am I misunderstanding?
If we are generating a `DETECTOR`-level kernel, we run the same per-amplifier loop as in the `AMP` case, but then append all of the scaled cross-correlations into a separate list. After the loop over amplifiers is done, this set of cross-correlations is averaged and run through successive over-relaxation to produce a final kernel. This treats the measurements from each amplifier as another realization of the "true" underlying cross-correlation. Although this is different from what the calibration method does (there we average the individual final kernels), it seems mathematically better to do it this way when all the data is available.
The prior behavior was to measure the cross-correlation on the full detector, without regard to the amplifier boundaries. This was likely not optimal either, as there should be zero charge spread between adjacent pixels on the full detector image that come from different amplifiers. There's also an issue that using proper `AMP` kernels, applied at amplifier boundaries, would probably require masking 8-pixel-wide strips along all edges, as the kernel falls off the real pixels.
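The `DETECTOR`-level pooling described above can be sketched in a few lines of numpy. This is a toy illustration with made-up 3x3 arrays, not the task's real data structures or kernel sizes:

```python
import numpy as np

# Hypothetical per-amplifier scaled cross-correlations (3x3 for
# brevity; the real arrays are larger and come from flat pairs).
ampXcorrs = [
    np.array([[0.00, 0.01, 0.00],
              [0.01, 1.00, 0.01],
              [0.00, 0.01, 0.00]]),
    np.array([[0.00, 0.02, 0.00],
              [0.02, 1.00, 0.02],
              [0.00, 0.02, 0.00]]),
]

# DETECTOR level: treat every amp's measurement as an independent
# realization of the same underlying cross-correlation, average once,
# and only then solve for the kernel (e.g. via successive
# over-relaxation).
detectorXcorr = np.mean(np.stack(ampXcorrs), axis=0)

# AMP level, by contrast, would solve each entry of ampXcorrs
# separately, yielding one kernel per amplifier.
```

The design point is that averaging the measurements before the solve uses all the data in a single fit, rather than averaging several independently solved kernels afterward.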
doc="isr operations that it is advisable to perform, but are not mission-critical."
    " WARNs are logged for any of these found to be False.",
default=['doBias', 'doDark', 'doCrosstalk', 'doDefect', 'doLinearize']
doc="List of amp names to ignore when averaging the amplifier kernels into the detector"
If I remember properly, you mentioned that you got better results when averaging over all the amps compared to when you calculate individual kernels per amp. Was this due to the quality of the input data or was there something more fundamental to it? (that should be documented in a comment in the docstring)
That was based on imSim data, for which there should be a single known truth. Comparing `AMP` kernels against this truth resulted in offsets, such that AmpX would have a kernel with a central value -10% compared to the known value, and AmpY would have a central value +10%. Switching to a detector kernel averages those differences away, yielding a result that seemed accurate at the few-percent level. I'm not sure of the exact levels ("10%" being my best recollection).
That said, I think there will need to be some significant testing and regeneration of brighter-fatter kernels before they should be trusted. I know that enabling the `ci_cpp_gen3` brighter-fatter kernel produces obviously worse results. As there isn't a real PTC ramp in that data, I think this is just a result of bad inputs making a bad product.
I'm also seeing that this option is unused in the code, which needs to be fixed. There are also the `badAmps` and `valid` parameters of the kernel that should be used more consistently. It looks like there's at least one bug in how they're handled in the `ip_isr` ticket (`fromTable` seems to construct a `badAmp` list from the wrong selection of `valid`).
dtype=bool,
doc="Use a quadratic fit to find the correlations instead of simple averaging?",
doc="Use the PTC 'a' matrix instead of the average of measured covariances?",
Add a reference to Eq. 20 of Astier_19, for context.
)
fixPtcThroughOrigin = pexConfig.Field(
# These are unused. Are they worth implementing?
Do we anticipate using this? What do we mean exactly by averaging vs. a quadratic fit? I'm trying to figure out if we will use this or not, or if we should eliminate this option, as the comment says.
I see the options below; it seems that these are options to produce a `preKernel` before using the SOR (successive over-relaxation) method to find the final kernel.
I've removed the comment, as I did do those implementations. The quadratic fit assumes that the measured correlation increases with flux. As far as I can tell, this was added to test if the simple scale and average method was incomplete. In the end, I updated the implementation so that if those tests were repeated, there'd be code to handle them.
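As a toy illustration of the quadratic-fit alternative mentioned above (hypothetical flux values and a noiseless synthetic correlation; this is not the task's actual `quadraticCorrelations` implementation):

```python
import numpy as np

# Hypothetical flat-pair fluxes (e-) and one lag of the measured
# correlation, generated here as a pure quadratic for illustration.
fluxes = np.array([1e3, 2e3, 4e3, 8e3])
corr01 = 2.0e-9 * fluxes**2  # noiseless toy data

# Fit corr = c2*flux^2 + c1*flux (no constant term) by least squares;
# the quadratic coefficient c2 plays the role of the averaged, scaled
# correlation in the simple-average branch.
A = np.vstack([fluxes**2, fluxes]).T
c2, c1 = np.linalg.lstsq(A, corr01, rcond=None)[0]
```

With real, noisy data a least-squares quadratic like this is pulled strongly by outliers at the high-flux end, which is consistent with the sensitivity noted above.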
bfk = BrighterFatterKernel(camera=camera, detectorId=detector.getId(), level=self.config.level)
bfk.means = inputPtc.finalMeans  # ADU
bfk.variances = inputPtc.finalVars  # ADU^2
bfk.rawXcorrs = inputPtc.covariances  # ADU^2
PTC calls it "covariances" and the old BF code calls them "correlations", and I guess that you figured out that in practice these objects are the same. However, should we try to unify the terms? There's a difference between "covariance" and "correlation", and this might be confusing.
Or at least make a comment here.
I've added a comment. I believe the scaling done in the loop changes the covariance to a correlation, and was done in the previous code as part of the PTC-like step.
It also removes datapoints that were thrown out in the PTC algorithm.
fluxes = np.array([flux*gain for flux in fluxes])  # Now in e^-
variances = np.array([variance*gain*gain for variance in variances])  # Now in e^2-
These are the `afwMath.VARIANCECLIP` variances. Do you actually use them in calculations in the code that follows? I only see that you report them in the log info at L239.
I only use them in the log, I believe. It was useful to have both to ensure everything was being converted consistently (checking that `mean ~ variance ~ cov(0, 0)`).
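That consistency check is easy to see with made-up numbers: for Poisson-dominated flats, the ADU-to-electron conversions above should make the mean and variance agree. A toy sketch (the gain and mean are assumed values, not real data):

```python
import numpy as np

gain = 1.5                 # e-/ADU (assumed value for illustration)
meanAdu = 10000.0          # flat-field mean signal, ADU
varAdu = meanAdu / gain    # Poisson variance expressed in ADU^2

# Convert to electrons, mirroring the flux*gain and
# variance*gain*gain conversions in the snippet above.
flux = meanAdu * gain              # e-
variance = varAdu * gain * gain    # e-^2

# For pure Poisson statistics, mean ~ variance (~ cov(0, 0)) once
# everything is in electrons; a mismatch flags a bad gain or a
# missed conversion somewhere.
```

This is only a units check, but it is a cheap way to catch a gain applied twice or not at all.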
fluxes = np.array([flux*gain for flux in fluxes])  # Now in e^-
variances = np.array([variance*gain*gain for variance in variances])  # Now in e^2-
# This should duplicate the else block in generateKernel@L1358,
This comment refers to a line in the version of the code that this new code replaces. If you are going to keep it, do we need to be more specific about which version of the code you are actually referring to, in case the code evolves again in the future?
Replaced with just the paper reference (also updated to include the year).
variances = np.array([variance*gain*gain for variance in variances])  # Now in e^2-
# This should duplicate the else block in generateKernel@L1358,
# which in turn is based on Coulton et al Equation 22.
Add Journal reference or arXiv number (1711.06273).
And the arxiv number.
# Normalize by the flux, which removes the (0,0)
# component attributable to Poisson noise.
q[0][0] -= 2.0*(flux)
What's the origin of the factor of two here? Is it because of the pair of flats at the same exposure time? Is this subtracting the first two terms in Equation 24 of Coulton+17?
This is nominally the sum of the flux levels of the two flats. From `cpExtractPtcTask.py` L373, the `rawMeans` is set to `0.5*(mu1 + mu2)`, so this factor of two removes the 0.5 there. Unless I'm misunderstanding, this removes the two `t I \delta(x - x')` components from Coulton Eq. 29.
means : `dict` [`str`, `list` of `tuple`]
    Dictionary, keyed by ampName, containing a list of the means for
    each visit pair.
q /= -2.0*(flux**2)
What's the origin of this normalization by `-2*flux**2`? Is this the factor that accompanies the kernel in the third term in Eq. 29 of Coulton+17? Perhaps a comment in the code would help (same for my previous comment).
Following on in Eq. 29, this is the factor of `-t^2 (I_A^2 + I_B^2)`. There's a cheat here, because if `F_A = F_B + \epsilon`, then these factors differ by something like `1 + 2(\epsilon/F) + (\epsilon/F)^2`, but I believe this should always be negligible.
    Dictionary, keyed by ampName, containing a list of the
    cross-correlations for each visit pair.
xcorrCheck = np.abs(np.sum(scaled))/np.sum(np.abs(scaled))
if (xcorrCheck > self.config.xcorrCheckRejectLevel) or not (np.isfinite(xcorrCheck)):
Why is `2` the default value for `self.config.xcorrCheckRejectLevel`? If this is strictly checking the triangle inequality, as it says in the comment, shouldn't it be `1` (`|A + B| <= |A| + |B|`)? Or what am I missing? Why is it even an option?
I added the triangle inequality message when trying to make sense of what this check is doing (previously it was called the "sum of the xcorr", which isn't right either). Digging through the commit history, it appears this is designed to only catch impossibly bad data. I'm not sure that it's actually possible to fail this test, which suggests it can probably be dropped.
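The reviewer's point is easy to verify numerically: the check computes `|sum(scaled)| / sum(|scaled|)`, which the triangle inequality bounds by 1 for any finite data, so a threshold of 2 can never trigger. A minimal sketch with an invented array:

```python
import numpy as np

# Toy "scaled" cross-correlation with mixed signs (made-up values).
scaled = np.array([[0.5, -0.2],
                   [-0.1, 0.3]])

# The check from the code: |sum| / sum(|.|).  By the triangle
# inequality this ratio is <= 1 whenever the denominator is nonzero,
# so only NaN/inf inputs (e.g. an all-zero or corrupted array) could
# ever fail a threshold of 2.
xcorrCheck = np.abs(np.sum(scaled)) / np.sum(np.abs(scaled))
```

This supports the conclusion above that the test can only catch pathological inputs and could probably be dropped.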
if self.config.useAmatrix:
    # This is mildly wasteful
    preKernel = np.pad(self._tileArray(np.array(inputPtc.aMatrix[ampName])), ((1, 1)))
elif self.config.correlationQuadraticFit:
    # Use a quadratic fit to the correlations as a function of flux.
    preKernel = self.quadraticCorrelations(scaledCorrList, fluxes, f"Amp: {ampName}")
else:
    # Use a simple average of the measured correlations.
    preKernel = self.averageCorrelations(scaledCorrList, f"Amp: {ampName}")
I assume that these are options to initialize the kernel and then solve for the final kernel later. Perhaps you could add a comment? Also, in the comments, you seem to imply that the first (`a` matrices) and the second (`quadratic`) options are redundant. Does this imply that, in future work, we would need to determine whether we need them or not? For example, if they provide better convergence when calculating the kernel via SOR? I guess my question is, why do we have these two extra options if you say that they are mildly wasteful?
I'll clarify that comment. The `useAmatrix` branch still does all of the amp-wise analysis, but then doesn't use it in the final fit (although it's still persisted in the calibration). The quadratic solver seemed very sensitive to outliers when I tested it, but I didn't want to remove existing functionality in a gen2-to-gen3 conversion.
border = self.config.nPixBorderXCorr
sigma = self.config.nSigmaClipXCorr
if self.config.forceZeroSum:
Strictly, the sum should always be zero, due to charge/area conservation, correct? But do we still allow the user to choose whether to enforce this or not?
This is another thing kept to ensure the conversion didn't drop anything. Looking over execution logs from imSim indicates that the sum is in the `10^-7 - 10^6` range.
Why such a vast range? 13 orders of magnitude? Or did you mean 10^-6
? Probably the latter...
tests/test_brighterFatterKernel.py (Outdated)
task = cpPipe.BrighterFatterKernelSolveTask()
results = task.run(self.ptc, ['this is a dummy exposure'], self.camera, {'detector': 1})
expectation = np.array([[4.88348887e-08, 1.01136877e-07, 1.51784114e-07,
How is the expectation obtained (here and in the other test)? Do you have independent code?
This is an expectation generated by running the code on the fake inputs. It will almost certainly break if there are any algorithmic changes, but this unit test does exercise all of the code (and was useful in tracking down typos and unexpected problems in branches not used on the imsim or LATISS tests).
3.12500e-06, -6.25000e-06, -3.12500e-06]])
self.assertFloatsAlmostEqual(results.outputBFK.ampKernels['amp 1'], expectation, atol=1e-5)
You test 2 cases here, the "quadratic" one and the "average". Do you need to test the case where the "a" matrix is used? (These are the 3 cases to produce `preKernel` in makeBrighterFatterKernel.py.)
Yes, that should be a test case.
My comments are mainly about adding more clarification and comments to the code.
Thanks for addressing the comments!
Add first pass gen3 implementation. This is now functional, and the results are plausible.
Add support for correlation models.
Fix related issues that writing those tests brought to light.
- Ensure the mask is always cast to integer type.
- Accept that the input data may be bad, and deal accordingly.
- Update what is actually stored in the calibration to match what is likely to be useful in debugging.
- Add badAmps and other fields that should be carried over from PTC.
- Brighter-fatter kernels should be calibrations.
- Clarify confusing comments and docstrings.
- Update unit tests to test useAmatrix=True case.
Convert makeBrighterFatterKernel to gen3. As we now generate PTC from its own task, we no longer need to duplicate that effort as part of the kernel generation, and can simply use the covariance matrices measured there.