Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-26545: Add spline linearizer #56

Merged
merged 3 commits into from Oct 23, 2020
Merged

DM-26545: Add spline linearizer #56

merged 3 commits into from Oct 23, 2020

Conversation

czwa
Copy link
Contributor

@czwa czwa commented Oct 8, 2020

This adds the code to fit a spline linearizer to the data contained within the PTC curve.

Comment on lines 79 to 81
fitOrder = pexConfig.Field(
dtype=int,
doc="Degree of polynomial to fit.",
doc="Degree of polynomial to fit or number of spline knots to use",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the reasoning here on the rename (have both the number of spline knots and the poly order controlled by a single variable), but it does have the downside that 3 is a reasonable one for the poly order, but 3 spline knots is likely not a good default. I think there's certainly precedent elsewhere for having two different params here, so that one could change the type and still get a reasonable behaviour. I think it's your call, but just wanted to point that out as a choice so that it's conscious.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can, of course, recombine them around L213 to use them in the same way, so the code there doesn't change at all, but just have 2 variables here so that you can set different defaults.

Comment on lines 72 to 76
"LookupTable": "Create a lookup table solution.",
"Polynomial": "Create an arbitrary polynomial solution.",
"Squared": "Create a single order squared solution.",
"Spline": "Create a spline based solution.",
"None": "Create a dummy solution.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the departure from ALLCAPS for options like this a Gen3 thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type needs to match the value listed in the LinearizeBase subclasses like at
https://github.com/lsst/ip_isr/blob/master/python/lsst/ip/isr/linearize.py#L539

Comment on lines 89 to 97
maxLinearAdu = pexConfig.Field(
dtype=float,
doc="Maximum DN value to use to estimate linear term.",
default=20000.0,
)
minLinearAdu = pexConfig.Field(
dtype=float,
doc="Minimum DN value to use to estimate linear term.",
default=10000.0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little surprised that these values would be so high for bias-subtracted images, but you and @plazas are likely more up to date on that than I am, so again, whatever you say, just making sure that's a decent default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dropped the value of minLinearAdu to 2000, as that's more representative of the data I've been testing on.

@@ -31,12 +33,11 @@
from .utils import (fitLeastSq, funcPolynomial)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not part of this ticket necessarily, but the docstring on funcPolynomial is pretty weird, claims to return a covariance. Fancy cleaning it up while you're in here? (Also random comment on the return line too saying # C_00)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll file a ticket on this, because I do think it's weird and a quick fix.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DM-27267


# Exclude low end outliers
threshold = self.config.nSigmaClipLinear * np.sqrt(linearOrdinate)
fluxMask = np.abs(inputAbscissa - linearOrdinate) < threshold
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was totally following up to here, but wasn't quite sure about inputAbscissa - linearOrdinate but I'll trust that it's right, just writing this to save getting stuck here for any longer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linearOrdinate is now a "flux-like" array, so it compares directly with inputAbscissa, the input flux array. The additional confusing point is that this fluxMask is masking values to retain, not exclude.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was incorrect, and now uses np.abs(inputOrdinate - linearOrdinate). I confused myself (in many ways, but also) by converting to using the linear form of the flux (linearOrdinate) as the new X value in the fits, but without being clear about that.


self.debugFit('splineFit', binCenters, np.abs(values), values, None, ampName)
interp = afwMath.makeInterpolate(binCenters.tolist(), values.tolist(),
afwMath.stringToInterpStyle("AKIMA_SPLINE"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want AKIMA_SPLINE hard-coded here? This isn't the same as the not-a-knot, right? Don't be also want/need to be able to do that? I'd have thought that smooth handling of the boundaries was more important than dealing with fast-changing 2nd derivatives, but I could well be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The boundaries should work, I think. The low end will switch to a first order extrapolation (in the difference of input and ideal) based on the padding I put in below. The high end of the model should work up until nearly saturation, at which the linearity is never going to be accurate.

# If we exclude a lot of points, we may end up with
# less than fitOrder points. Pad out the low-flux end
# to ensure equal lengths.
if len(binCenters) != self.config.fitOrder:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A < would be more defensive, in that, if it's more, you've got binning/other problems, and they won't be handled by this block. Then again, having it fail on the pad line is probably no bad thing, so maybe ignore this.

Comment on lines +262 to +289
binCenters = np.pad(binCenters, (padN, 0), 'linear_ramp',
end_values=(binCenters.min() - 1.0, ))
# This stores the correction, which is zero at low values.
values = np.pad(values, (padN, 0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have quite enough of this in my head right now to be confident that this padding is legit, but I certainly trust you, but just noting that I didn't think this through hard.

values = np.pad(values, (padN, 0))

# Pack the spline into a single array.
linearityFit = np.concatenate((binCenters.tolist(), values.tolist())).tolist()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting data format, but it certainly at least explains the split I saw elsewhere!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the problem of packing things into the same single vector coefficients instead of adding a bunch of new attributes that are only optionally filled. I went with this because we pack and unpack crosstalk coefficients in an analogous way.


image = afwImage.ImageF(len(inputOrdinate), 1)
image.getArray()[:, :] = inputAbscissa
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that they're the same length, I just found it odd to declare it with the length of inputOrdinate but then assign it to inputAbscissa instead (unless this is a very-late-in-the-day form of cross checking)

- Add step to transform the input abscissa (exp time or MONDIODE
  value) into a linear flux measurement, based on the config options.

- Filter the data based on the linear fit to exclude significant
  outliers (needed to exclude BOT neutral density data).

- Add iteratively reweighted least squares solver to reduce the impact
  of remaining outlier points.

- Update debugFit method to be more informative, and to provide
  information during the fitting stage.

- Add gen2 implementation Task, and associated bin.src script.

- Ensure spline data is padded to fitOrder length.
- Properly write out output.
- Set hasLinearity.
@czwa czwa force-pushed the tickets/DM-26545 branch 3 times, most recently from 0e6221a to 06b948f Compare October 20, 2020 23:23
Be pedantic on updateMetadata to avoid issues with ingest.
Add spline algorithm comments to make the code less confusing.
@czwa czwa merged commit c512666 into master Oct 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants