Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assigning to data of a BinTableHDU risks corrupting the original array #215

Open
braingram opened this issue Sep 21, 2023 · 1 comment
Open

Comments

@braingram
Copy link
Collaborator

braingram commented Sep 21, 2023

If a record array with an unsigned dtype is assigned to the data attribute of a BinTableHDU the data in the original array is modified when the BinTableHDU is saved (as seen in the example below).

from astropy.io import fits
import numpy as np

tbl = np.array([1, 2, 3], dtype=[('a', '<u2')])
hdulist = fits.HDUList()
hdulist.append(fits.PrimaryHDU())
hdulist.append(fits.BinTableHDU())
hdulist[-1].data = tbl
assert tbl['a'][0] == 1, tbl['a'][0]
hdulist.writeto('foo.fits', overwrite=True)
assert tbl['a'][0] == 1, tbl['a'][0]

Fails with: AssertionError: 32769

This occurs in astropy 5.3 and a recently downloaded 6.0 nightly.

There are many places in stdatamodels where hdu.data is assigned. Given that the modification occurs when the file is saved (which may occur at some later point in the code) this seems likely to introduce insidious bugs.

Updating the assignments to hdu.data to either copy the assigned array and/or to define the array at the time of creation of the BinTableHDU (passing the array in via the data keyword of BinTableHDU.__init__ does not appear to have the same issue) should be considered.

@braingram
Copy link
Collaborator Author

braingram commented Sep 22, 2023

There also appears to be a risk of the written data being corrupted as in the following example:

from astropy.io import fits
import numpy as np

tbl = np.array([(1, 'HOT', )], dtype=[('i', '>u2'), ('a', 'U4')])
hdulist = fits.HDUList()
hdulist.append(fits.PrimaryHDU())
hdulist.append(fits.BinTableHDU())
hdulist[-1].data = tbl
fr = hdulist[-1].data
hdulist.writeto('foo.fits', overwrite=True)

with fits.open('foo.fits') as ff:
    print(ff[-1].data)

Outputs:

WARNING: Unexpected extra padding at the end of the file.  This padding may not be preserved when saving changes. [astropy.io.fits.header]
[(32768, '\x00\x00\x10')]

Removing the 'a' column results in a correct output for 'i' (1) and removing 'i' results in the correct output for 'a' ('HOT'). However including both results in both items being corrupted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant