Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

specio.read_spec assumes wavelength and flux are always in column 1 and column 2 #372

Closed
ariedel opened this issue Feb 7, 2024 · 9 comments · Fixed by #384
Closed

specio.read_spec assumes wavelength and flux are always in column 1 and column 2 #372

ariedel opened this issue Feb 7, 2024 · 9 comments · Fixed by #384

Comments

@ariedel
Copy link

ariedel commented Feb 7, 2024

Description

If the wavelength column in a spectrum is not column 1 and the flux/throughput column is not column 2, the units are read incorrectly.

Expected behavior

Loading a file should grab the wavelength unit of the wavelength column, and the flux unit of the flux column.

Actual behavior

The ETC has a file describing the spectral traces for NIRISS SOSS, wherein the columns are x (pixels), wavelength (microns), and trace (pixels). When reading it in, the code crashed because specio.read_spec produced a file where the wavelength was supposedly in pixels, and the trace was supposedly in microns.

Steps to Reproduce

  1. grab this file: https://github.com/spacetelescope/pandeia_data/blob/master/jwst/niriss/wavepix/jwst_niriss_soss-96-ord1_trace_20160927172406.fits
  2. Observe the header units:
import astropy.io.fits as fits

with fits.open("jwst_niriss_soss-96-ord1_trace_20160927172406.fits") as hdulist:
    print(hdulist[1].header)
  1. Read in with synphot:
import astropy.units as u
import synphot

microns = u.def_unit('MICRONS', u.micron)
pixels = u.def_unit("PIXELS", u.pixel)

u.add_enabled_units([microns, pixels])

spectrum = synphot.spectrum.SpectralElement.from_file("jwst_niriss_soss-96-ord1_trace_20160927172406.fits", flux_col="trace")
  1. Observe a crash: UnitConversionError: 'Unit("PIXELS")' is not a scaled version of 'Unit("Angstrom")'. Note that the wavelength unit shouldn't be pixels; it should be microns.

I have tracked the issue down to specio.read_fits_spec line 185 where in we read the data by field name, and the units from the header always as TUNIT1 and TUNIT2.
In this particular case, the wavelength column was 2, and the "flux" column was 3, so TUNIT2 and TUNIT3 should have been read.
I have experimentally written some code to fix that, which I can PR.

System Details

macOS-10.16-x86_64-i386-64bit
Python 3.11.4 (main, Jul 5 2023, 09:00:44) [Clang 14.0.6 ]
Numpy 1.26.3
astropy 6.0.0
Scipy 1.11.4
Matplotlib 3.7.2
Synphot 1.2.1

@ariedel ariedel added the bug label Feb 7, 2024
@pllim pllim added the specio label Feb 7, 2024
@pllim
Copy link
Contributor

pllim commented Feb 7, 2024

The default values (including the units) are set for historical reason. It could be overwritten, see https://synphot.readthedocs.io/en/latest/synphot/overview.html#fits-table-format

The extension header must contain the following keywords (unless you overwrite them with non-default values in read_fits_spec())

And you can pass in all these keywords via **kwargs in https://synphot.readthedocs.io/en/latest/api/synphot.spectrum.SourceSpectrum.html#synphot.spectrum.SourceSpectrum.from_file

If this suggestion works for you, please close the issue. Thanks.

@pllim pllim added question and removed bug labels Feb 7, 2024
@ariedel
Copy link
Author

ariedel commented Feb 7, 2024

I see that what I've got here is an explicitly nonstandard fits file. I'll see if I can put some special case logic in the code; this is for a generic file reader function and the rest of the files are more well-behaved.

@ariedel ariedel closed this as completed Feb 7, 2024
@ariedel ariedel reopened this Feb 7, 2024
@ariedel
Copy link
Author

ariedel commented Feb 7, 2024

Actually, I spoke too soon. read_fits_spec() uses the manually-provided units only if there are no units in the headers, but there are units in these headers. I'm still not certain this deserves to be fixed, versus correcting the input files themselves, given that they're nonstandard.

@ariedel ariedel closed this as completed Feb 7, 2024
@pllim
Copy link
Contributor

pllim commented Feb 7, 2024

Ah, right...

Wavelength and flux units, which default to Angstrom and FLAM, respectively. These are only used if TUNIT1 and TUNIT2 keywords are not present in table (not primary) header.

The supported format is indeed very strict and there is no law in the land that states the units must be TUNIT1 and TUNIT2. This parser probably existed before astropy.table.QTable is a thing. If I could re-implement it, I would probably just use QTable to do the parsing. I think it would grab the correct TUNITn automatically that way.

I will re-open this as a feature request, though I probably won't fix it unless you come back and decide that your format must be supported (instead of fixing the format itself on your side).

@pllim pllim reopened this Feb 7, 2024
@pllim
Copy link
Contributor

pllim commented Feb 7, 2024

A workaround with the released version is possible by not using from_file. Instead, you grab the wavelength and flux/throughput from the file however you see fit, and then use this instead: https://synphot.readthedocs.io/en/latest/synphot/bandpass.html#arrays

@ariedel
Copy link
Author

ariedel commented Feb 7, 2024

I've implemented what you suggested, and reading in the data and creating a SpectralElement does work as a special case. That's sufficient for now, though I will raise the issue of the nonstandard file format with our instrument partners.

@talister
Copy link

talister commented Mar 6, 2024

I ran into a similar when trying to read the output of the ESO Sky Model which are FITS binary tables with wavelength, radiance flux and transmission of the atmosphere. The (cutdown, I removed the identical TFORMx='1D' keywords) header of the resulting skytable.fits looks like:

TTYPE1  = 'lam     '           / label for field   1
TUNIT1  = 'nm      '           / physical unit of field
TTYPE2  = 'flux    '           / label for field   2
TUNIT2  = 'ph/s/m2/micron/arcsec2' / physical unit of field
TTYPE3  = 'dflux1  '           / label for field   3
TUNIT3  = 'ph/s/m2/micron/arcsec2' / physical unit of field
TTYPE4  = 'dflux2  '           / label for field   4
TUNIT4  = 'ph/s/m2/micron/arcsec2' / physical unit of field
TTYPE5  = 'trans   '           / label for field   5
TUNIT5  = '1       '           / physical unit of field

Reading the transmission from the trans column with:

header, wavelengths, transmission= specio.read_spec(
                    'skytable.fits, wave_col='lam', flux_col='trans', wave_unit=u.nm, flux_unit=units.THROUGHPUT
                )

results in transmission having the (incorrect) same units from TUNIT2 even though it was read from column 5. I fixed this with a similar workaround as above since I "know" what I am reading from, I could reset the incorrect units but transparent support would be nice. Also support for flux per area as discussed in #261 would be great for supporting these files properly also... Thanks for all the great work on synphot, it's really useful for my work.

@pllim
Copy link
Contributor

pllim commented Mar 6, 2024

Thanks for the extra data point. It is not good that this assumption breaks so easily now. Maybe I will try to prioritize refactoring the FITS table I/O in the near future (cc @larrybradley). If you have a hard deadline, please let me know. In the meantime, please use the workaround provided. Thanks for your patience!

@pllim
Copy link
Contributor

pllim commented Mar 15, 2024

Theoretically, both your use cases would be covered by #384 but it does have breaking changes. Would you both want to take #384 for a spin and review it? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants