
Equivalent width uncertainties are significantly lower in v2 #127

Closed
ashodkh opened this issue May 18, 2023 · 22 comments
@ashodkh
Contributor

ashodkh commented May 18, 2023

I noticed that the EW error estimates between v1 and v2 have changed drastically. v2 has significantly lower EW uncertainties.
[screenshot]

This leads to 4-5 sigma detections of very low equivalent widths, even though the flux seems consistent with noise. Here are some examples:
https://fastspecfit.desi.lbl.gov/target/sv1-bright-27348-39627883852857837?index=1
https://fastspecfit.desi.lbl.gov/target/sv1-bright-27345-39627883852857767?index=1

@janewman-pitt-edu

Copying from Slack: if you're computing the nominal error on the profile-weighted flux from propagation of errors, that estimate is equivalent to assuming that the profile used is perfectly correct. If things are noisy then, for example, the line width will be imperfect, and that won't be included in the error propagation.

@moustakas
Member

@ashodkh would it be possible for you to make your nice figure (comparing the v1 and v2 catalogs) for some of the other quantities measured for each line: (1) amplitude; (2) flux; and (3) continuum flux?

The EW is derived simply as the ratio of the integrated line-flux and the continuum flux (with Gaussian propagation of errors on each of those quantities), so it would be helpful to figure out where the original issue arises.
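For concreteness, Gaussian error propagation on that ratio looks like the following sketch (function and variable names here are illustrative, not the FastSpecFit API):

```python
import numpy as np

def ew_with_error(line_flux, line_flux_err, cont_flux, cont_flux_err):
    """EW as the ratio of integrated line flux to continuum flux, with
    first-order (Gaussian) propagation of errors on the ratio."""
    ew = line_flux / cont_flux
    # for a ratio, the relative errors add in quadrature
    ew_err = abs(ew) * np.sqrt((line_flux_err / line_flux)**2 +
                               (cont_flux_err / cont_flux)**2)
    return ew, ew_err

# toy numbers: 10% errors on both the line and continuum fluxes
ew, ew_err = ew_with_error(10.0, 1.0, 5.0, 0.5)
```

Note this propagation assumes the line- and continuum-flux errors are uncorrelated (and, as noted in the previous comment, that the line profile is perfectly known).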

@ashodkh
Contributor Author

ashodkh commented Jul 25, 2023

Here are the requested plots.

Amplitude:
[figure: v1-v2-amp]

Flux:
[figure: v1-v2-flux]

Continuum flux:
[figure: v1-v2-cont]

In addition, I made a plot of the uncertainty in the integrated flux:
[figure: v1-v2-flux-unc]

@moustakas
Member

I am continuing to do tests using repeat observations, but I've tentatively reverted the change between v1 and v2 in #137 (a3b5f42). Specifically, I am no longer weighting by the line-profile to estimate the inverse variance in the flux.

Note that this means that {linename}_boxflux_ivar and {linename}_flux_ivar are identical.

@moustakas
Member

@ashodkh @janewman-pitt-edu I reverted the flux inverse variance estimate to what was used in v1 in #137, so hopefully this addresses the issue you noted.

@janewman-pitt-edu

Hi @moustakas,

The link to the diff when you closed the issue let me find the relevant code easily. I think there is indeed a bug.

I think you wanted code equivalent to

$\sigma^2 = \sum_i w_i^2 \sigma_i^2 \, / \, (\sum_i w_i)^2,$

but your formula looks to me like it is effectively

$\sigma^2 = \sum_i w_i \sigma_i^2 \, / \, \sum_i w_i.$

An easy test of the formula is whether it gives the right result for an unweighted average with equal errors, i.e., $w_i = 1$, $\sum_i w_i = N$, $\sigma_i = \sigma_x$.

The first expression gives

$\sigma^2 = N \sigma_x^2 / N^2 = \sigma_x^2 / N,$

which is the usual formula for the standard error of the mean. The formula used in the code instead gives

$\sigma^2 = N \sigma_x^2 / N = \sigma_x^2.$
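A quick Monte Carlo check bears this out; the sketch below (illustrative, not the repository code) compares both expressions against the empirical scatter of a weighted mean with equal weights and errors:

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma_x, ntrial = 50, 2.0, 20000
w = np.ones(N)  # w_i = 1: the weighted mean reduces to a plain mean

x = rng.normal(0.0, sigma_x, size=(ntrial, N))
mean = (x * w).sum(axis=1) / w.sum()

# correct propagation: sigma^2 = sum(w_i^2 sigma_i^2) / (sum w_i)^2
sigma_correct = np.sqrt(np.sum(w**2 * sigma_x**2)) / np.sum(w)  # = sigma_x / sqrt(N)
# the buggy form: sigma^2 = sum(w_i sigma_i^2) / sum(w_i)
sigma_buggy = np.sqrt(np.sum(w * sigma_x**2) / np.sum(w))       # = sigma_x

empirical = mean.std()  # agrees with sigma_correct, not sigma_buggy
```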

@moustakas moustakas reopened this Aug 5, 2023
@moustakas
Member

@ashodkh @janewman-pitt-edu

First, @janewman-pitt-edu, thanks for working out that I did indeed have a bug in my code which made the {LINE}_FLUX_IVAR estimates unrealistically small.

I've made progress on this ticket by simulating some data and I think I'm pretty satisfied with the results, although I welcome your input. In case you're interested, the code I wrote for these sims can be found here--
https://github.com/moustakas/fastspecfit-projects/blob/main/fastpaper1/fluxivar-sims

I created two types of simulations:

  • A single (hypothetical) isolated emission line;
  • And two blended (hypothetical) emission lines.

In both cases I:

  • used a fixed per-pixel signal-to-noise ratio of one;
  • assumed independent pixels and Gaussian noise;
  • assumed a fixed line-width of 125 km/s;
  • ignored all potential sources of systematic uncertainty like imperfect stellar continuum subtraction, etc.

For example, here's one realization from the first simulation class:

[screenshot]

And here's one realization from the second simulation class:

[screenshot]

In each panel, the vertical dashed lines indicate the $\pm3\sigma_{\mathrm{line}}$ interval used to determine $F(\mathrm{Box})\pm\sigma_{F(\mathrm{Box})}$, and the horizontal dashed lines indicate the $\pm1\sigma$ scatter in the emission-line-subtracted spectrum used to determine the uncertainty in the amplitude, $\sigma_{\mathrm{Amp}}$. (This definition is practical; it is chosen so that $\mathrm{Amp}/\sigma_{\mathrm{Amp}}$ can be used to identify "significant" lines.)

In addition, $F(\mathrm{Gauss}) = \sqrt{2\pi} (\mathrm{Amp}) \lambda_{\mathrm{line}} (\sigma_{\mathrm{line}} / c)$ is the Gaussian-integrated flux and $\sigma_{F(\mathrm{Gauss})}= \sqrt{ \sum( w_i^2 \sigma_i^2) / (\sum w_i)^2 }$, where $\sigma_i$ are the per-pixel uncertainties and $w_i$ is the maximum-likelihood Gaussian line-profile at each pixel (the solid black curve in the first figure, above).
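Spelled out in code, those two expressions look like this (a sketch with assumed variable names and a uniform-error setup, not the FastSpecFit implementation):

```python
import numpy as np

C_KMS = 2.99792458e5  # speed of light [km/s]

def gauss_flux_and_err(amp, lam_line, sigma_kms, wave, pix_err):
    """Gaussian-integrated line flux and its profile-weighted uncertainty,
    per the expressions quoted above."""
    # F(Gauss) = sqrt(2 pi) * Amp * lambda_line * (sigma_line / c)
    flux = np.sqrt(2.0 * np.pi) * amp * lam_line * sigma_kms / C_KMS
    # w_i: the maximum-likelihood Gaussian line profile at each pixel
    sigma_wave = lam_line * sigma_kms / C_KMS
    w = amp * np.exp(-0.5 * ((wave - lam_line) / sigma_wave)**2)
    # sigma_F^2 = sum(w_i^2 sigma_i^2) / (sum w_i)^2
    flux_err = np.sqrt(np.sum(w**2 * pix_err**2)) / np.sum(w)
    return flux, flux_err

# e.g. an H-alpha-like line with sigma_line = 125 km/s
wave = np.linspace(6500.0, 6630.0, 200)
flux, flux_err = gauss_flux_and_err(1.0, 6563.0, 125.0, wave,
                                    np.full(wave.size, 0.1))
```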

Note:

  1. When an emission line is isolated, F(Gauss) and F(Box) are consistent with one another (good!), but $\sigma_{F(\mathrm{Box})}$ is more than a factor of 10 larger than $\sigma_{F(\mathrm{Gauss})}$.
  2. When emission lines are blended, F(Box) is (obviously) biased.

I then ran a suite of 5000 Monte Carlo realizations for each case and generated the following summary figure:

[screenshot]

From this figure I conclude:

  • Amp is not expected to show a Gaussian pull, so the resulting distributions look fine;
  • The pull for F(Box) is well-behaved for the isolated line but biased for blended lines;
  • The pull for F(Gauss) is unbiased in all three cases but exhibits Lorentzian tails, where the uncertainty is presumably being underestimated. This is presumably a "feature" of the method, but I'm not sure whether anything can be done about it.

Finally, I compared the signal-to-noise ratio, $F/\sigma_{F}$, of the Gaussian and boxcar fluxes for the single isolated line. As implied by the previous figures, the signal-to-noise ratio of the Gaussian flux is a factor of 7-20 larger than that of the boxcar flux.

[screenshot]

@janewman-pitt-edu

janewman-pitt-edu commented Aug 7, 2023

@moustakas are you fixing the Gaussian sigma, etc., in the fitting? It sounds like you are not (or maybe you meant that you just used a fixed value to generate the simulations)?

I think we really need to see how things scale at low S/N and when the parameters are left free. @ashodkh has shown pretty convincingly that the right error propagation formula (but not marginalizing over profile uncertainties) is way over-optimistic in that limit.

@moustakas
Member

With input from @dstndstn, we realized that determining the Gaussian-integrated emission-flux and uncertainty is analogous to the matched-filtering problem in photometry. Here's a quick write-up that has gone into the FastSpecFit paper and the code changes have been implemented in the 2.3.1-candidate branch. More results shortly.

[screenshot]

@janewman-pitt-edu

janewman-pitt-edu commented Aug 14, 2023

As it happens, last week I pointed Ashod to the Horne algorithm as another analogous (matched-filter-esque) case to look at :)

Note that this is still making the assumption that the correct profile is perfectly known. At low S/N, that fails badly. The right thing to do is to marginalize over the profile uncertainty; do you get errors on the velocity dispersion from the procedures you are using (e.g. from chi-squared as a function of sigma)?

If so, and if it remains too expensive to marginalize, it's likely not too bad to calculate the estimated flux at the +/- 1 sigma limits on the velocity dispersion, and then add the resulting error estimate in quadrature to the fixed-profile uncertainties you're calculating above.
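A minimal sketch of that quadrature approach, assuming a hypothetical `flux_at` callable that re-evaluates the estimated flux at a given velocity dispersion:

```python
import numpy as np

def total_flux_err(flux_at, sigma_v, sigma_v_err, fixed_profile_err):
    """Inflate the fixed-profile flux uncertainty with the spread in flux
    seen when the velocity dispersion is moved to its +/-1 sigma limits."""
    f_lo = flux_at(sigma_v - sigma_v_err)
    f_hi = flux_at(sigma_v + sigma_v_err)
    profile_term = 0.5 * abs(f_hi - f_lo)
    # add the profile-uncertainty term in quadrature
    return np.sqrt(fixed_profile_err**2 + profile_term**2)

# toy example: flux depends linearly on the dispersion
err = total_flux_err(lambda s: 2.0 * s, 100.0, 10.0, 5.0)
```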

@janewman-pitt-edu

(If the PDFs are not Gaussian it's better to evaluate at +/- 1 sigma on dispersion than to calculate derivatives and use propagation of errors)

@moustakas
Member

Note that this is still making the assumption that the correct profile is perfectly known. At low S/N, that fails badly. The right thing to do is to marginalize over the profile uncertainty; do you get errors on the velocity dispersion from the procedures you are using (e.g. from chi-squared as a function of sigma)?

Unfortunately my code does not return an estimate of the uncertainty on the line-width. The trf method of scipy.optimize.least_squares, which I use, returns the Jacobian. I know there are heuristics for turning that into a covariance matrix, but the matrix is often singular and therefore cannot be inverted. Bootstrapping is an interesting suggestion, but it will need to await a speed-up of the code (e.g., moving the optimization to GPUs/JAX).
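For reference, the heuristic in question is the Gauss-Newton approximation cov ≈ (JᵀJ)⁻¹ on the weighted residuals; using a pseudo-inverse avoids the hard failure when JᵀJ is singular, at the cost of a possibly optimistic covariance. A self-contained toy example (not FastSpecFit code):

```python
import numpy as np
from scipy.optimize import least_squares

def model(p, x):
    amp, cen, sig = p
    return amp * np.exp(-0.5 * ((x - cen) / sig)**2)

x = np.linspace(-5.0, 5.0, 101)
rng = np.random.default_rng(1)
noise = 0.1
y = model([2.0, 0.0, 1.0], x) + rng.normal(0.0, noise, x.size)

# fit weighted residuals so that J^T J approximates the inverse covariance
res = least_squares(lambda p: (model(p, x) - y) / noise,
                    x0=[1.0, 0.1, 1.5], method='trf')

J = res.jac
cov = np.linalg.pinv(J.T @ J)       # pseudo-inverse tolerates a singular J^T J
param_err = np.sqrt(np.diag(cov))   # heuristic 1-sigma parameter errors
```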

Also recall that I fit all the lines simultaneously using physically motivated constraints (e.g., the line-widths of the Balmer lines are all tied together), so it's not trivial to decouple the lines in order to get a per-line estimate of the variance in the best-fitting parameters.

So in the spirit of gotta make a final decision and move on, I'm planning to run a few more tests and then simply document the various caveats. For the record, the data model includes both the Gaussian- and boxcar-integrated flux and uncertainty. End-users will simply need to make decisions about how they use the uncertainties for their specific science case (e.g., they can use my VAC to select specific samples of objects and then remeasure the fluxes and uncertainties using bootstrapping, another optimization algorithm, etc.)

BTW, here are the updated sims, where (1) I corrected a bug in my input inverse variance spectrum; and (2) I now allow both the velocity center and the line-width to be optimized/fitted, not just the amplitude. I ran a range of sims with per-pixel signal-to-noise ratios of 3, 5, 10, and 15. All the sim outputs can be retrieved here (although I reserve the right to move or overwrite these files)--
https://data.desi.lbl.gov/desi/users/ioannis/fastspecfit/fastpaper1/data/fluxivar-sims

Below, I highlight the S/N=5 and S/N=10 results.

From these results I conclude that:

  • For the isolated emission line, the boxcar fluxes and uncertainties are the least biased. The Gaussian-integrated fluxes are unbiased, but the uncertainties appear to be underestimated by ~20% (vs. a factor of ~10 previously).
  • For the blended lines, the boxcar fluxes are biased and the uncertainties are underestimated. Moreover, the uncertainty underestimate is larger for the red (lower-amplitude, in this sim) line than for the blue line. The Gaussian fluxes, meanwhile, are unbiased, but the uncertainties are underestimated by 60%-70%.

S/N = 10

[screenshots]



S/N = 5

[screenshots]

@janewman-pitt-edu

I think it is certainly out of scope for the current version, but you could perhaps apply some regularization before inversion to get a somewhat conservative estimate of the error on linewidth.

Another option, which would be fast and stable though less accurate than marginalizing, would be to estimate the second derivative of chi-squared with respect to the velocity dispersion with your favorite estimator (cf. https://en.wikipedia.org/wiki/Finite_difference_coefficient#Central_finite_difference), fixing all other parameters, to determine what delta-linewidth corresponds to delta-chi-squared = 1.
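Sketched in code, with a hypothetical `chisq` callable (all other parameters held fixed inside it):

```python
import numpy as np

def linewidth_error(chisq, sigma_v, h=1.0):
    """1-sigma error on the velocity dispersion from the local curvature of
    chi-squared, via a central finite difference for the second derivative."""
    d2 = (chisq(sigma_v + h) - 2.0 * chisq(sigma_v) + chisq(sigma_v - h)) / h**2
    # near the minimum, delta-chisq ~ 0.5 * d2 * delta^2, so
    # delta-chisq = 1 corresponds to delta = sqrt(2 / d2)
    return np.sqrt(2.0 / d2)

# toy parabola: chisq = (s - 100)^2 / 25 has a known 1-sigma error of 5
err = linewidth_error(lambda s: (s - 100.0)**2 / 25.0, 100.0)
```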

@janewman-pitt-edu

(This follows from the properties of maximum likelihood estimators, of which least squares is a special case.)

@janewman-pitt-edu

(Thinking about this a little more, it's likely better to use the 68% cutoff of the chi-squared distribution with DOF equal to the number of parameters you were fitting; delta-chi-squared = 1 corresponds to the case where the linewidth is the only thing you fit.)
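Those 68% cutoffs are easy to evaluate with scipy; for a single fitted parameter the threshold is delta-chi-squared = 1, and it grows with the number of jointly fitted parameters:

```python
from scipy.stats import chi2

# 68.27% delta-chi-squared thresholds for 1, 2, and 3 jointly fitted parameters
thresholds = [chi2.ppf(0.6827, df=k) for k in (1, 2, 3)]
# roughly 1.00, 2.30, and 3.53
```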

@ashodkh
Contributor Author

ashodkh commented Aug 15, 2023

@moustakas Can you please clarify how you're making the histogram plots for the isolated line?

I have been running similar simulations and finding that the profile-weighted uncertainty is indeed optimal, in that it agrees with bootstrapping. The boxcar uncertainty is larger by a factor of ~1.3, as was shown to be the case for a Gaussian profile in Horne (1986): https://ui.adsabs.harvard.edu/abs/1986PASP...98..609H/abstract.

[screenshot]
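The ~1.3 factor can be checked directly for a finely sampled Gaussian profile with uniform, independent per-pixel noise (a numerical sketch; the absolute uncertainties depend on the sampling, but the ratio does not):

```python
import numpy as np

# unit-flux Gaussian line profile on a fine grid (x in units of the line sigma)
dx = 1e-3
x = np.arange(-10.0, 10.0, dx)
p = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi) * dx  # per-pixel profile

# profile-weighted flux F = sum(p d) / sum(p^2): Var(F) = sigma^2 / sum(p^2)
std_weighted = 1.0 / np.sqrt(np.sum(p**2))           # per-pixel sigma = 1

# boxcar flux over +/-3 sigma: Var(F) = N_pix * sigma^2
std_boxcar = np.sqrt(np.count_nonzero(np.abs(x) <= 3.0))

ratio = std_boxcar / std_weighted  # ~1.30 for a Gaussian profile
```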

@moustakas
Member

@ashodkh thanks for the nice confirmation.

The histogram is just the (noisy) measured flux minus the true flux (input to the sim), divided by the inferred uncertainty--
https://github.com/moustakas/fastspecfit-projects/blob/main/fastpaper1/fluxivar-sims#L322-L323

@ashodkh
Contributor Author

ashodkh commented Aug 16, 2023

@moustakas I remade the histograms with my sims, and I am not finding that the profile-weighted uncertainty is underestimated (the 1.23 stddev in your isolated line plots). Both profile-weighted and boxcar fluxes have correct uncertainties and no bias. I looked into your code and I couldn't find what caused the difference.

[screenshot]

@moustakas
Member

@ashodkh I'm not sure why your sims are different. Are you refitting the emission line, including allowing the line-width and velocity center to vary? My code is there for you to inspect!

I did confirm that the numbers reported in my plots are correct.

```python
import numpy as np
import fitsio
from astropy.table import Table

tt = Table(fitsio.read('sim01-snr10.fits'))

# pull = (measured - true) / estimated uncertainty
for col in ['FLUX', 'BOXFLUX']:
    pull = (tt[col] - tt['TRUEFLUX']) / tt[f'{col}_ERR']
    print(np.mean(pull), np.std(pull))
```

which prints

```
-0.017168581 1.2477057
-0.05973556 1.0198334
```

@ashodkh
Contributor Author

ashodkh commented Aug 16, 2023

@moustakas My apologies, I did not notice that your updated sims also fit for the line-width and velocity center. Mine fits only the amplitude. That could explain the difference.

@moustakas
Member

@ashodkh @janewman-pitt-edu I'm going to close this ticket as having resolved the primary issue, which was ultimately a bug in the v2 flux inverse variance(s). Let's revisit after the new VACs are in place and continue to discuss ways of improving the uncertainties in the catalog.

@janewman-pitt-edu

Yes, I agree -- next steps are for future versions :)
