For `tabular_fits` output, force `uncertainty` to StdDev #1027

tbowers7 · 2023-03-06T19:32:08Z

In the tabular_fits_writer, the uncertainty is written out as the standard deviation. Some readers (e.g., SDSS), however, load 1D spectra into Spectrum1D with the uncertainty specified as the inverse variance. Without an explicit conversion of the uncertainty to standard deviation, the code raises an astropy.units.core.UnitConversionError in these cases when it tries to conform the units of the uncertainty to those of the spectral flux array, using the standard tabular_fits_writer with the Spectrum1D.write() method.

This PR utilizes the represent_as functionality of astropy.nddata.NDUncertainty arrays to force the representation of the uncertainty in standard deviation form before checking units and writing to FITS.

codecov · 2023-03-06T19:41:31Z

Codecov Report

Merging #1027 (1202abc) into main (8179313) will decrease coverage by 0.02%.
The diff coverage is 71.42%.

@@            Coverage Diff             @@
##             main    #1027      +/-   ##
==========================================
- Coverage   70.01%   70.00%   -0.02%     
==========================================
  Files          64       64              
  Lines        4346     4350       +4     
==========================================
+ Hits         3043     3045       +2     
- Misses       1303     1305       +2

Impacted Files	Coverage Δ
specutils/io/default_loaders/tabular_fits.py	`96.72% <71.42%> (-3.28%)`	⬇️


rosteen · 2023-03-14T20:14:51Z

Thanks for this! Unfortunately, I tried to write a test, and the very first SDSS file I tried this on threw a "RuntimeWarning: divide by zero encountered in divide" when I tried to write it back out. Perhaps that's not the worst thing, considering it would always fail before and now only fails if it hits that, but it does make writing a test more annoying (I was using our default SDSS testing file).

I think there are two ways we can resolve this:

1. Add a try/except at the uncertainty conversion that raises a more useful error message to the user.
2. Allow the tabular_fits writer to write out any of the NDUncertainty types (perhaps with a different column header name based on the class).

Any opinions on those two options?
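To make the first option concrete, here is a minimal plain-Python sketch of wrapping the conversion in a try/except. The helper name `ivar_to_stddev` and the message text are hypothetical. Note the assumption involved: plain Python floats raise `ZeroDivisionError` on division by zero, whereas NumPy arrays emit a `RuntimeWarning` and produce `inf`, so a real writer would need a different trigger (for example an explicit check for zeros).

```python
import math


def ivar_to_stddev(ivar_values):
    """Hypothetical helper: convert inverse variances to standard deviations.

    Raises a ValueError with a descriptive message when a zero inverse
    variance (an infinitely uncertain pixel) cannot be converted.
    """
    stddev = []
    for index, ivar in enumerate(ivar_values):
        try:
            # std = 1 / sqrt(ivar); a zero ivar makes this a division by zero.
            stddev.append(1.0 / math.sqrt(ivar))
        except ZeroDivisionError as err:
            raise ValueError(
                f"Cannot express inverse variance {ivar!r} at index {index} "
                "as a standard deviation; handle zero-weight pixels before writing."
            ) from err
    return stddev
```

Calling `ivar_to_stddev([4.0, 0.0])` then fails with the descriptive message instead of a bare arithmetic error.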

dhomeier · 2023-03-15T20:15:18Z

specutils/io/default_loaders/tabular_fits.py

+        unc = (
+            spectrum
+            .uncertainty
+            .represent_as(StdDevUncertainty)
+            .quantity
+            .to(funit, equivalencies=u.spectral_density(disp))
+        )
        columns.append(unc.astype(ftype))
        colnames.append("uncertainty")


Suggested change

-        unc = (
-            spectrum
-            .uncertainty
-            .represent_as(StdDevUncertainty)
-            .quantity
-            .to(funit, equivalencies=u.spectral_density(disp))
-        )
-        columns.append(unc.astype(ftype))
-        colnames.append("uncertainty")
+        if spectrum.uncertainty.uncertainty_type == 'var':
+            uunit = funit**2
+        elif spectrum.uncertainty.uncertainty_type == 'ivar':
+            uunit = funit**-2
+        else:
+            uunit = funit
+        unc = spectrum.uncertainty.quantity.to(uunit, equivalencies=u.spectral_density(disp))
+        columns.append(unc.astype(ftype))
+        colnames.append(f"{spectrum.uncertainty.uncertainty_type}")

The original behaviour definitely needs fixing, but I'd strongly recommend preserving the original uncertainty type. This could be one approach to setting consistent units.
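The unit rule behind this suggestion can be summarised in a few lines of plain Python. This is only a sketch: `UNIT_POWER` and `uncertainty_column` are hypothetical names, while the keys `"std"`, `"var"` and `"ivar"` are the `uncertainty_type` strings used by astropy's uncertainty classes.

```python
# Power of the flux unit carried by each uncertainty type.
UNIT_POWER = {"std": 1, "var": 2, "ivar": -2}


def uncertainty_column(uncertainty_type):
    """Hypothetical helper: return (column name, flux-unit power) for a type.

    Unknown types fall back to plain flux units, mirroring the final
    `else` branch of the suggested change.
    """
    power = UNIT_POWER.get(uncertainty_type, 1)
    return uncertainty_type, power
```

For an inverse-variance uncertainty this gives the column name `ivar` and a unit of flux to the power -2, so the values are written unchanged rather than inverted.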

dhomeier · 2023-03-15T20:31:54Z

2. Allow the tabular_fits writer to write out any of the NDUncertainty types (perhaps with a different column header name based on the class).

I'd very much vote for this option – there may be use cases for choosing any of the 3 uncertainty types, and the writer should not needlessly force them to a certain type. Zeros ending up as Inf in the transformation as in your test case are just one example. This would also get in the way of better round-tripping of standard spectral formats @kelle has brought on the agenda.
Clearly identifying the different types brings up some issues I have also run into in #1009; in my suggestion here I have just used the most minimalist option, but only realised now that tabular_fits_loader (rather, generic_spectrum_from_table) does not rely on column names at all to identify an uncertainty column, but instead only on position and units. This means the generic loader currently cannot load 'var' or 'ivar', which should be fixed in an accompanying update. Could extend

specutils/specutils/io/parsing_utils.py

Lines 268 to 270 in 1c2ba04

    
            for c in colnames:
                if table[c].unit == table[flux_column].unit:
                    err_column = c

to the search for matching unit powers for Variance and InverseVariance, but that would need some agreement on the order of precedence, and perhaps colnames should be considered in addition.

rosteen · 2023-03-17T14:36:31Z

Thanks for the thoughts @dhomeier - I think I agree that keeping the original type is the way to go, but that's a big enough change (I think) that I'm leaning toward adding a slightly better warning to this and merging this, and then opening a separate follow-up PR for your suggestion to merge into the v2.0-dev branch with other breaking/significant changes.

dhomeier · 2023-03-17T15:32:35Z

Agreed; from experience with the wcs1d loader, that would become a rather more complex endeavour.
This PR might also be considered breaking, though only for cases where results were already broken or wrong before.

In the ``tabular_fits_writer``, the uncertainty is written out as the standard deviation. Some readers, however, load 1D spectra into ``Spectrum1D`` with the uncertainty specified as the inverse variance. Without an explicit conversion of the uncertainty to standard deviation, the code raises an ``astropy.units.core.UnitConversionError`` in these cases when it tries to conform the units of the uncertainty to those of the spectral flux array. This commit utilizes the ``represent_as`` functionality of ``astropy.nddata.NDUncertainty`` arrays to force the representation of the uncertainty in standard deviation form before checking units and writing to FITS.

modified: specutils/io/default_loaders/tabular_fits.py

dhomeier reviewed Mar 15, 2023


tbowers7 and others added 3 commits March 17, 2023 12:40

Add writing SDSS file back out to test

de40ab5

Add slightly more informative error message

3153d57

rosteen force-pushed the tabular_fits_uncertainty branch from f15ea4a to 3153d57 Compare March 17, 2023 16:40

rosteen added 2 commits March 17, 2023 12:42

This test is expected to fail for now

cf9b2f1

Checked for wrong type

1202abc

rosteen merged commit b5deddc into astropy:main Mar 17, 2023

rosteen mentioned this pull request Mar 17, 2023

Allow tabular-fits writer to write out any uncertainty type #1043

Open

tbowers7 deleted the tabular_fits_uncertainty branch March 21, 2023 21:10
