astropy.table writes int8 column in boolean in FITS output #11963

Open
rongpu opened this issue Jul 20, 2021 · 8 comments

Comments

@rongpu

rongpu commented Jul 20, 2021

Description

When writing an astropy Table containing a np.int8 column to a FITS file, that column comes back as boolean when the file is read:

import numpy as np
from astropy.table import Table

t = Table()
t['a'] = np.array([-1, 0, 1, 2], dtype=np.int8)
print(t['a'].dtype)  # int8
t.write('test.fits')

tt = Table.read('test.fits')
print(tt['a'].dtype)  # bool

System Details

Python 3.8.5 (default, Sep 4 2020, 02:22:02)
[Clang 10.0.0 ]
Numpy 1.20.3
astropy 4.2.1

@github-actions

Welcome to Astropy 👋 and thank you for your first issue!

A project member will respond to you as soon as possible; in the meantime, please double-check the guidelines for submitting issues and make sure you've provided the requested details.

If you feel that this issue has not been responded to in a timely manner, please leave a comment mentioning our software support engineer @embray, or send a message directly to the development mailing list. If the issue is urgent or sensitive in nature (e.g., a security vulnerability) please send an e-mail directly to the private e-mail feedback@astropy.org.

@embray
Member

embray commented Jul 21, 2021

Thank you for the report. This is more specifically a lower-level issue about astropy.io.fits than Table or the unified-io interface.

The FITS standard does not allow for signed bytes in table columns. It is simply not one of the supported data types for FITS tables. Only unsigned bytes are supported.

For some historical reason that I can't recall, however, the code interprets an array of signed bytes as a boolean column. I'd have to investigate more to see why that is.

I think, at the very least, we could provide a better warning about this.

@embray
Member

embray commented Jul 21, 2021

In any case, if your goal is to write signed bytes to a FITS table, you will have to change your approach, because it's not something supported by the FITS format, for some reason :/
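
One possible workaround, sketched under the assumption that doubling the column width is acceptable, is to widen int8 data to int16 before writing. The helper name `widen_int8` is hypothetical and not part of astropy; it only relies on NumPy.

```python
import numpy as np


def widen_int8(values):
    """Return int8 input as int16; leave every other dtype untouched."""
    arr = np.asarray(values)
    if arr.dtype == np.int8:
        # int16 corresponds to the FITS "I" (16-bit integer) column format,
        # which can hold every int8 value without loss.
        return arr.astype(np.int16)
    return arr


a = np.array([-1, 0, 1, 2], dtype=np.int8)
wide = widen_int8(a)
print(wide.dtype)     # int16
print(wide.tolist())  # [-1, 0, 1, 2]
```

Assigning the widened array back to the table column before calling `t.write(...)` would be expected to produce a 16-bit integer column rather than a logical one.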

@rongpu
Author

rongpu commented Jul 21, 2021

Thanks @embray. For people that are unaware of this "feature" in the FITS standard, it would be completely unexpected that it silently turns integers into booleans. So, yes, I think it would be great if it prompts a warning or error when this happens.

@embray
Member

embray commented Aug 16, 2021

See also #11996

@maxnoe
Member

maxnoe commented Nov 7, 2021

> The FITS standard does not allow for signed bytes in table columns. It is simply not one of the supported data types for FITS tables. Only unsigned bytes are supported.

This is incorrect. The FITS standard does allow storing int8 using the BSCALE/BZERO header keywords. Actually support for this was just added to astropy.io.fits in #11996, discussion in #11995.

Table should be able to round trip such data through fits.

The corresponding keywords for columns of binary tables are TSCAL<NNN> and TZERO<NNN>. The current FITS standard does not mention it, but they could be in principle used for the same thing.
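
To make the offset idea concrete, here is a minimal NumPy-only sketch of the arithmetic. It assumes the convention physical = stored + TZERO with a scale of 1 and an offset of -128, applied to an unsigned-byte column; it does not call astropy and is not a description of how astropy.io.fits implements it.

```python
import numpy as np

# Assumed convention: a signed byte is stored as an unsigned byte (TFORM "B")
# with TSCAL = 1 and TZERO = -128, so that physical = stored + TZERO.
TZERO = -128

physical = np.array([-128, -1, 0, 1, 127], dtype=np.int8)

# Encode: shift the values into the 0..255 range of an unsigned byte.
# The intermediate int16 avoids overflow during the shift.
stored = (physical.astype(np.int16) - TZERO).astype(np.uint8)

# Decode: apply the offset to recover the signed values.
restored = (stored.astype(np.int16) + TZERO).astype(np.int8)

print(stored.tolist())    # [0, 127, 128, 129, 255]
print(restored.tolist())  # [-128, -1, 0, 1, 127]
```

Because the shift is a bijection between the int8 and uint8 ranges, every signed byte round-trips exactly.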

@maxnoe
Member

maxnoe commented Nov 7, 2021

Edit: I checked how unsigned ints are written through the table interface, and TSCAL<NNN> and TZERO<NNN> are indeed used for this. So the same change as in #11996 for image extensions could be made for table extensions.

@mhvk
Contributor

mhvk commented May 27, 2024

From comments on #16502 (which is a duplicate), for tables two changes would need to be made: L (logical) columns should be mapped to numpy b1 instead of i1, and i1 should be supported using a specific BZERO. The first change is perhaps the most worrying - it might well break things.

The mapping is at

# mapping from TFORM data type to numpy data type (code)
# L: Logical (Boolean)
# B: Unsigned Byte
# I: 16-bit Integer
# J: 32-bit Integer
# K: 64-bit Integer
# E: Single-precision Floating Point
# D: Double-precision Floating Point
# C: Single-precision Complex
# M: Double-precision Complex
# A: Character
FITS2NUMPY = {
    "L": "i1",
    "B": "u1",
    "I": "i2",
    "J": "i4",
    "K": "i8",
    "E": "f4",
    "D": "f8",
    "C": "c8",
    "M": "c16",
    "A": "S",
}
# the inverse dictionary of the above
NUMPY2FITS = {val: key for key, val in FITS2NUMPY.items()}
# Normally booleans are represented as ints in Astropy, but if passed in a numpy
# boolean array, that should be supported
NUMPY2FITS["b1"] = "L"
# Add unsigned types, which will be stored as signed ints with a TZERO card.
NUMPY2FITS["u2"] = "I"
NUMPY2FITS["u4"] = "J"
NUMPY2FITS["u8"] = "K"
# Add half precision floating point numbers which will be up-converted to
# single precision.
NUMPY2FITS["f2"] = "E"
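
The following pure-Python sketch, using a trimmed copy of the mapping quoted above, shows why an int8 array ends up as a logical column: inverting the dictionary makes "i1" point back at "L". The `PROPOSED_*` names are hypothetical and only illustrate the first of the two suggested changes; they are not astropy code.

```python
# Trimmed copy of the current mapping (only the one-byte entries).
FITS2NUMPY = {"L": "i1", "B": "u1"}

# Inverting the mapping makes "i1" resolve to "L" (logical).
NUMPY2FITS = {val: key for key, val in FITS2NUMPY.items()}
NUMPY2FITS["b1"] = "L"

print(NUMPY2FITS["i1"])  # L
print(NUMPY2FITS["u1"])  # B

# Hypothetical variant: map logical columns to numpy booleans instead.
PROPOSED_FITS2NUMPY = {"L": "b1", "B": "u1"}
PROPOSED_NUMPY2FITS = {val: key for key, val in PROPOSED_FITS2NUMPY.items()}

# "i1" no longer has an implicit target, so int8 would need explicit
# handling (for example a "B" column plus an offset).
print("i1" in PROPOSED_NUMPY2FITS)  # False
```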

Detection of a signed byte stored in an unsigned-byte column could then follow logic similar to that used for detecting pseudo-unsigned integers, at

# Handle arrays passed in as unsigned ints as pseudo-unsigned
# int arrays; blatantly tacked in here for now--we need columns
# to have explicit knowledge of whether they treated as
# pseudo-unsigned
bzeros = {
    2: np.uint16(2**15),
    4: np.uint32(2**31),
    8: np.uint64(2**63),
}
if (
    array.dtype.kind == "u"
    and array.dtype.itemsize in bzeros
    and self.bscale in (1, None, "")
    and self.bzero == bzeros[array.dtype.itemsize]
):
    # Basically the array is uint, has scale == 1.0, and the
    # bzero is the appropriate value for a pseudo-unsigned
    # integer of the input dtype, then go ahead and assume that
    # uint is assumed
    numpy_format = numpy_format.replace("i", "u")
    self._pseudo_unsigned_ints = True
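
By analogy with the check above, a signed-byte test might look roughly like the sketch below. Everything in it is an assumption made for illustration: the function name `is_pseudo_signed`, the `SIGNED_BZEROS` table, and the choice of -128 as the marker offset are hypothetical, and none of it is existing astropy code.

```python
import numpy as np

# Hypothetical offset table: item size in bytes -> offset that marks a
# pseudo-signed column. Only one-byte columns need this treatment.
SIGNED_BZEROS = {1: -128}


def is_pseudo_signed(dtype, bscale, bzero):
    """Return True if an unsigned column should be read back as signed."""
    dtype = np.dtype(dtype)
    return (
        dtype.kind == "u"
        and dtype.itemsize in SIGNED_BZEROS
        and bscale in (1, None, "")
        and bzero == SIGNED_BZEROS[dtype.itemsize]
    )


print(is_pseudo_signed(np.uint8, None, -128))  # True
print(is_pseudo_signed(np.uint8, None, 0))     # False
print(is_pseudo_signed(np.uint16, 1, -128))    # False
```

The itemsize test runs before the dictionary lookup, so wider unsigned types fall through without raising a KeyError.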
