Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial bones of resample code, just have flux conserving and interpolation right now. #461

Merged
merged 15 commits into from
Jul 24, 2019

Conversation

SaOgaz
Copy link
Contributor

@SaOgaz SaOgaz commented Apr 10, 2019

See related issues: #359, #254, #255, #266

@SaOgaz
Copy link
Contributor Author

SaOgaz commented Apr 11, 2019

Questions:

I discussed this a little bit with @eteq, the initial suggestion for weighting (and or uncertainity) was to have a weight keyword that could have one of three values, None (meaning no weighting), provided array (use provided array as weights) or 'uncer' (use the uncertainty defined in the input Spectrum1D as the weights). If the user choses 'uncer' but no uncertainty is in the Spectrum1D, we throw an error. Right now this is unimplemented.

How do we want to handle propagating the uncertainties.... the paper I pulled the algorithm from has a prescription for calculating the new associated uncertainties (it's just using normal uncertainty propogation). I haven't looked into how easy it would be to adapt the current code to use any built in uncertainty handling, buy my initial guess is that it won't be too difficult, since the flux doesn't get multiplied through until the very end. The thing I'm not sure about is, if someone is using the uncertainties for weights, how does that effect the output uncertainty?

Another uncertainty related question. if they are not defined in the input spectrum, do we leave them undefined? I'm leaning towards yes.

@hcferguson
Copy link
Contributor

Anyone familiar with the uncertainties package?

@eteq
Copy link
Member

eteq commented Apr 17, 2019

@SaOgaz

I would say for this PR we just follow the propagation the paper suggests. The key point is that we need to be sure it's the right kind of uncertainty, though. That is, I think the paper is built around gaussian statistics, so you do that propagation if the uncertainty in the spectrum is either SD, variance, or ivar uncertainties (which is probably like 99% of the use cases anyway). One thing about this, though: if the keyword argument is weights, the "use the unceratinty" option should describing the weighting. I.e., instead of "uncer" I'd say it should be ivar_weighted, which means "convert the uncertainty to an inverse variance and use that as the weights" (which is also how you'd implement it).

As for the question of what to do if the weights aren't uncertainties - aren't the weights just the c_ij from https://arxiv.org/abs/1705.05165 ? In that case you should just be able to use the expression exactly as given in section 5 (the covariance matrix), regardless of whether the weights come from the uncertainties or somewhere else.

Related thought: since we are following that algorithm, perhaps the uncertainty should record the whole covariance matrix not just the diagonals (that is, the ). That would require making a new uncertainty class, but it would be easy to make one that can be converted to VarianceUncertainty or StdDevUncertainty by just taking the diagonal of the matrix? This may well be better done as a follow-on PR, though.

@hcferguson

Anyone familiar with the uncertainties package?

Yes, several years back I had a GSoC student work on that to see how well it can be used with some of the Astropy machinery. At the time the answer was "it's very hard because they've done some numpy stuff that conflicts deeply with a lot of other machinery". It may be useful as an eventual thing to complement the (fairly new) existing machinery we built in astropy to do uncertainty propogation (http://docs.astropy.org/en/stable/uncertainty/index.html). But it's hard to just use as it stands.

@eteq
Copy link
Member

eteq commented Apr 17, 2019

One other very minor comment to just make sure I don't forget: the resample functions should be pulled out into specutils.manipulation via from .resample import * in the __init__.py in manipulation. We agreed we generally want only 2-level imports at the most for the user API.

@eteq
Copy link
Member

eteq commented Apr 17, 2019

@SaOgaz - as we discussed out-of-band, I've added a test here to check for flux conservation. Basically the test I added is checking that flux is conserved between an input and output spectrum in the way I understand us to be aiming for. When I ran locally this didn't work... But we shall see what travis says.

@SaOgaz
Copy link
Contributor Author

SaOgaz commented Apr 17, 2019

@eteq, huh, I did that a bit locally (not sure why I didn't think to add a test for it), and it came out correctly, but maybe my calculations were off?

@SaOgaz
Copy link
Contributor Author

SaOgaz commented Apr 18, 2019

@eteq ahh, I think I might know why those tests are failing. For your input data, the bin edges aren't aligned, just input center wavelengths. i.e. the left edge of the non linespace wavelength axis is 3,500, the linespace left edge is 3666.66. The way I implemented this is that the flux conservation will only truly apply when the bin edges line up. (although i'm trying to edit your test example to follow this case and running into problems, so cannot yet confirm that results come out right aside from the testing i did earlier).

@SaOgaz
Copy link
Contributor Author

SaOgaz commented Apr 18, 2019

side note, meant to add to init, just hadn't gotten around to it yet.

@SaOgaz
Copy link
Contributor Author

SaOgaz commented Apr 24, 2019

As for the question of what to do if the weights aren't uncertainties - aren't the weights just the c_ij from https://arxiv.org/abs/1705.05165 ? In that case you should just be able to use the expression exactly as given in section 5 (the covariance matrix), regardless of whether the weights come from the uncertainties or somewhere else.

edit: sorry I'm getting a bit confused, I thought we were just using quadrature in the case of no, or separate weights (where separate weights would be including in the quadrature calculation) as described in the "calculation of associated uncertainties" section...

@SaOgaz
Copy link
Contributor Author

SaOgaz commented Apr 24, 2019

I would say for this PR we just follow the propagation the paper suggests. The key point is that we need to be sure it's the right kind of uncertainty, though. That is, I think the paper is built around gaussian statistics, so you do that propagation if the uncertainty in the spectrum is either SD, variance, or ivar uncertainties (which is probably like 99% of the use cases anyway). One thing about this, though: if the keyword argument is weights, the "use the unceratinty" option should describing the weighting. I.e., instead of "uncer" I'd say it should be ivar_weighted, which means "convert the uncertainty to an inverse variance and use that as the weights" (which is also how you'd implement it).

Is this in the situation where someone isn't using one of the standard NDUncertainty subclasses? I would think that just checking the type and assuming we want to return the same type would be the most straightforward approach, in which case using a bland "uncer" to mean, use whatever uncertainty is there as a weight would be the best way to go.

@eteq
Copy link
Member

eteq commented Jun 4, 2019

@SaOgaz

edit: sorry I'm getting a bit confused, I thought we were just using quadrature in the case of no, or separate weights (where separate weights would be including in the quadrature calculation) as described in the "calculation of associated uncertainties" section...

Ah right, I see your point here, and I missed some of the subtlety. I think for the purposes of this PR, the "weights" are basically 1/(sigma^2) in the notation of the "Calculation of associated uncertainties" section. That doesn't make a ton of sense really in the single-spectrum case, but it's relevant for the "list of spectra" case. And in fact this is making me realize maybe we don't want to call this weights but rather pixel_weights. We might later add a spectrum_weights or similar, but given the fact that I think usually the user will want to use the uncertainty rather than the weights array, this makes sense as a way to keep those concepts separate.

Is this in the situation where someone isn't using one of the standard NDUncertainty subclasses? I would think that just checking the type and assuming we want to return the same type would be the most straightforward approach, in which case using a bland "uncer" to mean, use whatever uncertainty is there as a weight would be the best way to go.

Yes, if there's an NDUncertainty class definitely you use that information. (maybe write a simple to_ivar function that converts them all? really we might want to pass that upstream to astropy too but for now it's probably fine to have it be a private utility in specutils) My comment here was about the possibility of the user providing a "weights" array, in which case you'd need to say what kind of weight it was. But given the above realization I had, I think that's no longer relevant. Instead we just say "if weights are arrays they are interpreted as inverse variance" and then it's the user's business to know how to convert their weights into that. So I think you can ignore this concern now.

@SaOgaz SaOgaz changed the title WIP: initial bones of resample code, just have flux conserving right now. initial bones of resample code, just have flux conserving right now. Jun 17, 2019
@eteq
Copy link
Member

eteq commented Jun 20, 2019

Just to note here - after some out-of-band discussion with @SaOgaz (included in the commits above) we decided the best way to push forward is to not have the weights at all for this PR. Rather instead this implements the flux-conserving algorithm (which doesn't really have meaningful per-pixel weights as described above if you want it to be flux-conserving...), but does propogate the uncertainties self consistently.

The whole "should there be a separate pixel_weight argument" is still relevant, especially in the context of other resampling algorithms (e.g. interpolation), but we decided that should be dealt with after this is implemented so that we can get something in to build off of.

@eteq
Copy link
Member

eteq commented Jun 20, 2019

(also the test failure appears to be unrealted to this PR, as it's also in master)

@SaOgaz SaOgaz changed the title initial bones of resample code, just have flux conserving right now. initial bones of resample code, just have flux conserving and interpolation right now. Jun 24, 2019
@SaOgaz
Copy link
Contributor Author

SaOgaz commented Jun 24, 2019

I've added two interpolation versions as well, linear and spline.

Copy link
Member

@eteq eteq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments, and one larger one (about splitting the InterpolatedResample) although all are up for debate if others disagree.

Also, it would be good to add at least a brief example in the docs about this. The most logical place is probably the "Manipulating Spectra" section? (manipulation.rst)

specutils/manipulation/resample.py Outdated Show resolved Hide resolved
Create a re-sampling matrix to be used in re-sampling spectra in a way
that conserves flux. This is adapted from
https://ui.adsabs.harvard.edu/abs/2017arXiv170505165C/references,
eprint arXiv:1705.05165. This code was heavily influenced by Nick Earl's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reference either a PR # or a commit on github to cite this just so that there's history?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, there's no PR though. I was referencing his branch off his fork: nmearl@0ff6ef1. What's the best way to reference that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's find just to give the URL directly to that SHA (e.g. nmearl@0ff6ef1 ) - this is a private method so no need to be too picky about it

specutils/manipulation/resample.py Outdated Show resolved Hide resolved
specutils/manipulation/resample.py Outdated Show resolved Hide resolved
specutils/manipulation/resample.py Outdated Show resolved Hide resolved
@SaOgaz
Copy link
Contributor Author

SaOgaz commented Jun 25, 2019

Oh shoot, @eteq I accidentally over-wrote the change of yours I merged in, what did you change "This resample algorithim conserves overall flux (as opposed to flux density)" to?

@stscicrawford
Copy link
Contributor

Tagging @philhodge, @jdavies-st, and @stscirij to take a look at how useful this functionality would be for the calibration software or if there is something that could be added in the future

Copy link
Member

@eteq eteq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I few fairly straightforward suggestions in some of the stuff I haven't yet reviewed.

One "find-and-replace" suggestion I'd like to make additionally, though. Right now the class names are all XXXResample. I think these would be better as XXXResampler - that is, class names should be nouns, because it's an "object", whereas "resample" is a verb. So the class should be "a thing that does resampling" - i.e. a "resampler". (Same comment for the variable names in the examples, but that's probably still just a global find-and-replace of "resample" -> "resampler"?)

specutils/manipulation/resample.py Outdated Show resolved Hide resolved
Create a re-sampling matrix to be used in re-sampling spectra in a way
that conserves flux. This is adapted from
https://ui.adsabs.harvard.edu/abs/2017arXiv170505165C/references,
eprint arXiv:1705.05165. This code was heavily influenced by Nick Earl's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's find just to give the URL directly to that SHA (e.g. nmearl@0ff6ef1 ) - this is a private method so no need to be too picky about it

specutils/manipulation/resample.py Outdated Show resolved Hide resolved
specutils/manipulation/resample.py Outdated Show resolved Hide resolved
specutils/manipulation/resample.py Show resolved Hide resolved
specutils/manipulation/resample.py Show resolved Hide resolved
docs/manipulation.rst Outdated Show resolved Hide resolved
SaOgaz and others added 3 commits July 19, 2019 16:44
Co-Authored-By: Erik Tollerud <erik.tollerud@gmail.com>
Add warning about uncertainties not being used for linear interpolation

Co-Authored-By: Erik Tollerud <erik.tollerud@gmail.com>
@eteq
Copy link
Member

eteq commented Jul 24, 2019

Alright I think we're all good here now. Thanks for your work on this @SaOgaz !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants