Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

For tabular_fits output, force uncertainty to StdDev #1027

Merged
merged 5 commits into from
Mar 17, 2023

Conversation

tbowers7
Copy link
Contributor

@tbowers7 tbowers7 commented Mar 6, 2023

In the tabular_fits_writer, the uncertainty is written out as the standard deviation. Some readers (e.g., SDSS), however, load 1D spectra into Spectrum1D with the uncertainty specified as the inverse variance. Without explicitly converting the uncertainty to standard deviation, the code raises a astropy.units.core.UnitConversionError for these cases when trying to conform the units of the uncertainty to those of the spectral flux array using the standard tabular_fits_writer with the Spectrum1D.write() method.

This PR utilizes the represent_as functionality of astropy.nddata.NDUncertainty arrays to force the representation of the uncertainty in standard deviation form before checking units and writing to FITS.

@codecov
Copy link

codecov bot commented Mar 6, 2023

Codecov Report

Merging #1027 (1202abc) into main (8179313) will decrease coverage by 0.02%.
The diff coverage is 71.42%.

@@            Coverage Diff             @@
##             main    #1027      +/-   ##
==========================================
- Coverage   70.01%   70.00%   -0.02%     
==========================================
  Files          64       64              
  Lines        4346     4350       +4     
==========================================
+ Hits         3043     3045       +2     
- Misses       1303     1305       +2     
Impacted Files Coverage Δ
specutils/io/default_loaders/tabular_fits.py 96.72% <71.42%> (-3.28%) ⬇️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@rosteen
Copy link
Contributor

rosteen commented Mar 14, 2023

Thanks for this! Unfortunately I tried to write a test, and the very first SDSS file I tried this on threw a RuntimeWarning: divide by zero encountered in divide error on trying to write it back out. Perhaps that's not the worst thing, considering it would always fail before and now only fails if it hits that, but it does make writing a test more annoying (I was using our default SDSS testing file).

I think there are two ways we can resolve this:

  1. Add a try/except at the uncertainty conversion that raises a more useful error message to the user.
  2. Allow the tabular_fits writer to write out any of the NDUncertainty types (perhaps with a different column header name based on the class).

Any opinions on those two options?

Comment on lines 152 to 160
unc = (
spectrum
.uncertainty
.represent_as(StdDevUncertainty)
.quantity
.to(funit, equivalencies=u.spectral_density(disp))
)
columns.append(unc.astype(ftype))
colnames.append("uncertainty")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unc = (
spectrum
.uncertainty
.represent_as(StdDevUncertainty)
.quantity
.to(funit, equivalencies=u.spectral_density(disp))
)
columns.append(unc.astype(ftype))
colnames.append("uncertainty")
if spectrum.uncertainty.uncertainty_type == 'var':
uunit = funit**2
elif spectrum.uncertainty.uncertainty_type == 'ivar':
uunit = funit**-2
else:
uunit = funit
unc = spectrum.uncertainty.quantity.to(uunit, equivalencies=u.spectral_density(disp))
columns.append(unc.astype(ftype))
colnames.append(f"{spectrum.uncertainty.uncertainty_type}")

The original behaviour definitely needs fixing, but I'd strongly recommend to allow preserving the original type. This could be one approach to set consistent units.

@dhomeier
Copy link
Contributor

dhomeier commented Mar 15, 2023

2. Allow the tabular_fits writer to write out any of the NDUncertainty types (perhaps with a different column header name based on the class).

I'd very much vote for this option – there may be use cases for choosing any of the 3 uncertainty types, and the writer should not needlessly force them to a certain type. Zeros ending up as Inf in the transformation as in your test case are just one example. This would also get in the way of better round-tripping of standard spectral formats @kelle has brought on the agenda.
Clearly identifying the different types brings up some issues I have also run into in #1009; in my suggestion here I have just used the most minimalist option, but only realised now that tabular_fits_loader (rather, generic_spectrum_from_table) does not rely on column names at all to identify an uncertainty column, but instead only on position and units. This means the generic loader currently cannot load 'var' or 'ivar', which should be fixed in an accompanying update. Could extend

for c in colnames:
if table[c].unit == table[flux_column].unit:
err_column = c

to the search for matching unit powers for Variance and InverseVariance, but that would need some agreement on the order of precedence, and perhaps colnames should be considered in addition.

@rosteen
Copy link
Contributor

rosteen commented Mar 17, 2023

Thanks for the thoughts @dhomeier - I think I agree that keeping the original type is the way to go, but that's a big enough change (I think) that I'm leaning toward adding a slightly better warning to this and merging this, and then opening a separate follow-up PR for your suggestion to merge into the v2.0-dev branch with other breaking/significant changes.

@dhomeier
Copy link
Contributor

Agreed, from experience with the wcs1d loader that would become a rather more complex endeavour.
Although this PR might also be considered breaking; though only for cases where results were already broken or wrong before.

tbowers7 and others added 3 commits March 17, 2023 12:40
In the ``tabular_fits_writer``, the uncertainty is written out as the standard
deviation.  Some readers, however, load 1D spectra into ``Spectrum1D`` with the
uncertainty specified as the inverse variance.  Without explicitly converting
the uncertainty to standard deviation, the code raises a
``astropy.units.core.UnitConversionError`` for these cases when trying to
conform the units of the uncertainty to those of the spectral flux array.

This commit utilizes the ``represent_as`` functionality of
``astropy.nddata.NDUncertainty`` arrays to force the representation of the
uncertainty in standard deviation form before checking units and writing to
FITS.

	modified:   specutils/io/default_loaders/tabular_fits.py
@rosteen rosteen merged commit b5deddc into astropy:main Mar 17, 2023
@tbowers7 tbowers7 deleted the tabular_fits_uncertainty branch March 21, 2023 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants