Round tripping non-float Quantities from QTable to fits converts all Quantities to floats #12494

samaloney · 2021-11-18T16:11:36Z

Description

Trying to serialise a QTable to and from fits the non-floats are always converted to floats rather then keeping the original dtypes

Expected behavior

The created QTable to have the same dtype as the underling fits column.

Actual behavior

The created QTable has converted the uint and int to float64

Steps to Reproduce

import astropy.units as u
import numpy as np

from astropy.table import QTable
from astropy.io import fits

my_table = QTable()

my_table['uint'] = (list(range(10)) * u.m).astype(np.uint16)
my_table['int'] = (list(range(10)) * u.m).astype(np.int16)
my_table['float32'] = (list(range(10)) * u.m).astype(np.float32)

my_table.dtype
# dtype([('uint', '<u2'), ('int', '<i2'), ('float32', '<f4')])

my_table.write('test.fits')

table_fromfits = QTable.read('test.fits')
table_fromfits.dtype
#dtype([('uint', '<f8'), ('int', '<f8'), ('float32', '>f4')])

~~A simple but not elegant solution is to use the dtype info from the HDUL to cast the Quantity back to the desired dtype~~
This actually doesn't work for the uint case or cases with mixed quantities an non-quantities dtype([('uint', '<i2'), ('int', '<i2'), ('float32', '<f4')]) should be dtype([('uint', '<u2'), ('int', '<i2'), ('float32', '<f4')])

hdul = fits.open('test.fits')

hdul[1].data.columns

#ColDefs(
#    name = 'uint'; format = 'I'; unit = 'm'; bzero = 32768
#    name = 'int'; format = 'I'; unit = 'm'
#    name = 'float32'; format = 'E'; unit = 'm'
#)

fixed_table = QTable([u.Quantity(table_fromfits[c.name], dtype=c.dtype) for c in hdul[1].data.columns])

fixed_table.dtype
# dtype([('uint', '<i2'), ('int', '<i2'), ('float32', '<f4')])

System Details

Linux-5.4.0-33-generic-x86_64-with-glibc2.29
Python 3.8.10 (default, Sep 28 2021, 16:10:42)
[GCC 9.3.0]
Numpy 1.20.2
pyerfa 1.7.3
astropy 4.3.1
Scipy 1.6.3
Matplotlib 3.4.1

The text was updated successfully, but these errors were encountered:

dhomeier · 2021-11-18T16:32:55Z

The second step of the workaround

fixed_table = QTable([u.Quantity(table_fromfits[c.name], dtype=c.dtype) for c in hdul[1].data.columns])

could be made a bit more convenient at least as
QTable(table_fromfits, dtype=table_fromfits.dtype)
by having QTable._convert_col_for_table apply the dtype from *kwargs individually to q_cls.

dhomeier · 2021-11-18T16:59:53Z

As for the I/O part, I had assumed that it was a design choice to construct all Quantity-type columns as float types, but am wondering now if this has in fact been overlooked when discarding the dtype info in _decode_mixins

astropy/astropy/io/fits/connect.py

Lines 100 to 103 in de53bdc

    
           # Use the `datatype` attribute info to update column attributes that are 
        
           # NOT already handled via standard FITS column keys (name, dtype, unit). 
        
           for col in info['datatype']: 
        
               for attr in ['description', 'meta']:

Edit: whatever the intended default behaviour, setting the attrs above will not make any difference (as the tbl[col['name']].info.dtype in question already/still are of integer type at that point); any datatype information would probably need to be passed on with the tbl.meta['__serialized_columns__'] dict to the _construct_mixin[s] utilities.

dhomeier · 2021-11-19T23:39:08Z

@samaloney I have pushed a fix for correctly reading back the mixins in #12505 ; if you have chance to test if this addresses your problems, feedback would be great.

For a temporary workaround, as you noted that your use cases typically involve reading in large volumes of data, I strongly recommend to use getheader and getdata to read the tables in their original dtypes and then extract the units to directly construct
u.Quantity(tab[col], dtype=tab[col].dtype, name=col, unit=u.Unit(header.get('TUNITn'), format='fits')).
It's a bit cumbersome, but would avoid conversion losses and in particular the temporary creation of arrays 4x the original size or larger, in particular since the type conversion does not let you take advantage of memmap=True.

samaloney · 2021-11-25T10:02:58Z

Sorry for the delay @dhomeier I've just loaded some of my fits files using your branch seems to work as expected the dtypes are respected.

dhomeier · 2021-11-25T23:28:22Z

Sorry for the delay @dhomeier I've just loaded some of my fits files using your branch seems to work as expected the dtypes are respected.

Thanks for the confirmation! Note that the currently favoured implementation will store the dtype information to its own __serialized_columns__ field and thus not recover that info from files written with the current Astropy version. So for existing files the header would still need to be fixed separately. Also we are still discussing in #12505 (comment) whether to set the original dtype automatically or require a new QTable.read() option.

parejkoj · 2022-11-08T20:35:17Z

Poking this one, as we've just discovered this behavior in LSST. At the very least, we should make the green note on this page more prominent.

Any kind of "changes everything to a float" behavior can be dangerous for bit fields, ids or large explicitly countable values. This is related to a problem we've had with pandas, where it changes dtypes to float in order to fill in missing data with NaN.

samaloney added the Bug label Nov 18, 2021

pllim added io.fits unified-io labels Nov 18, 2021

dhomeier added the units label Nov 18, 2021

samaloney mentioned this issue Nov 19, 2021

L1 fits files don't maintain the same types as L0 all int uint of any size become float64 i4Ds/STIXCore#174

Closed

dhomeier linked a pull request Nov 19, 2021 that will close this issue

Preserve non-float dtypes in Quantity columns from serialized mixins #12505

Open

9 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Round tripping non-float Quantities from QTable to fits converts all Quantities to floats #12494

Round tripping non-float Quantities from QTable to fits converts all Quantities to floats #12494

samaloney commented Nov 18, 2021 •

edited

dhomeier commented Nov 18, 2021

dhomeier commented Nov 18, 2021 •

edited

dhomeier commented Nov 19, 2021

samaloney commented Nov 25, 2021

dhomeier commented Nov 25, 2021 •

edited

parejkoj commented Nov 8, 2022

Round tripping non-float Quantities from QTable to fits converts all Quantities to floats #12494

Round tripping non-float Quantities from QTable to fits converts all Quantities to floats #12494

Comments

samaloney commented Nov 18, 2021 • edited

Description

Expected behavior

Actual behavior

Steps to Reproduce

System Details

dhomeier commented Nov 18, 2021

dhomeier commented Nov 18, 2021 • edited

dhomeier commented Nov 19, 2021

samaloney commented Nov 25, 2021

dhomeier commented Nov 25, 2021 • edited

parejkoj commented Nov 8, 2022

samaloney commented Nov 18, 2021 •

edited

dhomeier commented Nov 18, 2021 •

edited

dhomeier commented Nov 25, 2021 •

edited