fits.open does not detect compression in file-like objects #16171

Open
jak574 opened this issue Mar 6, 2024 · 5 comments

@jak574
Contributor

jak574 commented Mar 6, 2024

Description

astropy's fits.open accepts both a file and a file-like object. When it is given a compressed file, it detects the compression automatically and handles it.

However, for file-like objects such as BytesIO and SpooledTemporaryFile, compression detection does not work, and the following error is raised:

OSError: No SIMPLE card found, this file does not appear to be a valid FITS file. If this is really a FITS file, try with ignore_missing_simple=True

When compression detection fails, the caller has to work around it, either by decompressing manually or by writing the file to disk and reading it back in.
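
The manual-decompression workaround could look roughly like the sketch below. It is not part of astropy: `maybe_gunzip` is a hypothetical helper name, it only checks for the two-byte gzip magic number, and it assumes the object is seekable and opened in binary mode.

```python
import gzip
from io import BytesIO

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(fileobj):
    """Return a readable binary object, decompressed if it holds gzip data.

    Hypothetical helper, not an astropy API. Assumes `fileobj` is a
    seekable binary file-like object (e.g. BytesIO, SpooledTemporaryFile).
    """
    start = fileobj.tell()
    magic = fileobj.read(2)
    fileobj.seek(start)  # rewind to where we started before deciding
    if magic == GZIP_MAGIC:
        # Decompress fully into memory; reasonable for small files only.
        return BytesIO(gzip.decompress(fileobj.read()))
    # Not gzip: hand back the original object, position unchanged.
    return fileobj
```

The result could then be handed on, for example `fits.open(maybe_gunzip(bytesio_fh))`, at the cost of holding the whole decompressed file in memory.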

Why I hit this error: I'm using FastAPI to fetch a FITS file, and it hands me either a bytes object or a file-like SpooledTemporaryFile, which I want to pass to fits.open.

Expected behavior

The following code should work: the gzipped FITS file should be read in, with compression detected automatically.

with open("skymap.fits.gz","rb") as fh:
    gzipped_bytes = fh.read()
bytesio_fh = BytesIO(gzipped_bytes)
hdu = fits.open(bytesio_fh)

Note this is just a toy example.

How to Reproduce

  1. Obtain a gzipped FITS file. In my example it's called skymap.fits.gz, a gzipped HEALPix file.
  2. Run the code below.
  3. The error above occurs.
from io import BytesIO
from astropy.io import fits

with open("skymap.fits.gz","rb") as fh:
    gzipped_bytes = fh.read()
bytesio_fh = BytesIO(gzipped_bytes)
hdu = fits.open(bytesio_fh)

This also fails:

from tempfile import SpooledTemporaryFile
from astropy.io import fits

with open("skymap.fits.gz", "rb") as fh:
    gzipped_bytes = fh.read()

# Named spooled_fh so it does not shadow the tempfile module
spooled_fh = SpooledTemporaryFile()
spooled_fh.write(gzipped_bytes)
spooled_fh.seek(0)

hdu = fits.open(spooled_fh)

Versions

macOS-14.3.1-arm64-arm-64bit
Python 3.11.8 (main, Feb 10 2024, 12:35:50) [Clang 15.0.0 (clang-1500.1.0.2.5)]
astropy 6.0.0
Numpy 1.26.2
pyerfa 2.0.1.1
Scipy 1.12.0
Matplotlib 3.8.3

@saimn
Contributor

saimn commented Mar 6, 2024

Right, it works with a file handle (an io.BufferedReader from open) but not with io.BytesIO, and probably not with other file-like objects either.
The issue is here, where isfile doesn't recognize io.BytesIO as a file object:

if isfile(fileobj):
    self._open_fileobj(fileobj, mode, overwrite)
elif isinstance(fileobj, (str, bytes)):
    self._open_filename(fileobj, mode, overwrite)
else:
    self._open_filelike(fileobj, mode, overwrite)

(compression guessing happens in _open_fileobj).
Changing isfile is doable but we need to be careful with the impact on other parts of the code.
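
To see why the two kinds of object take different branches, here is a sketch using a simplified stand-in for the check. `looks_like_real_file` is a hypothetical function written for illustration; it is an assumption about the general shape of the test, not a copy of astropy's isfile. It unwraps text or buffered wrappers and asks whether an `io.FileIO` sits underneath.

```python
import io
import os
import tempfile


def looks_like_real_file(f):
    """Hypothetical stand-in: True if `f` wraps an OS-level io.FileIO."""
    if isinstance(f, io.FileIO):
        return True
    if hasattr(f, "buffer"):  # text wrapper around a binary buffer
        return looks_like_real_file(f.buffer)
    if hasattr(f, "raw"):  # buffered wrapper around a raw file
        return looks_like_real_file(f.raw)
    return False


# A handle from open() is a BufferedReader whose .raw is an io.FileIO.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "rb") as fh:
    on_disk = looks_like_real_file(fh)
os.remove(path)

# BytesIO has neither .buffer nor .raw, so it falls through.
in_memory = looks_like_real_file(io.BytesIO(b"data"))

print(on_disk, in_memory)  # prints: True False
```

Under a check of this shape, an in-memory object would never reach the branch where compression is guessed, which matches the behavior reported above.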

@jak574
Contributor Author

jak574 commented Mar 6, 2024

Thanks for the reply!

Would changing the isinstance check in isfile to compare against io.IOBase instead of io.FileIO be too broad?

@jak574
Contributor Author

jak574 commented Mar 6, 2024

Answering my own question: it definitely breaks tests if you do that, so clearly something smarter needs to be done in isfile.

@saimn
Contributor

saimn commented Mar 6, 2024

Maybe io.BufferedIOBase, but an additional change is needed to get the file object's mode.
For reference, a good part of this comes from #6373, and the many possibilities make things complicated. I'm not sure what the best approach is. Another option would be to do the compression guessing in _open_filelike as well, but that seems wrong.
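
The point about the mode can be illustrated with the standard library alone: `io.BytesIO` is an `io.BufferedIOBase` but has no `mode` attribute, so a broader type check would need another way to work out the mode. The sketch below infers one from `readable()` and `writable()`. `guess_mode` is a hypothetical helper, and the mode strings it returns are an assumption about what a caller might want, not astropy's behavior.

```python
import io


def guess_mode(fileobj):
    """Hypothetical helper: best-effort binary mode string for a file-like object."""
    mode = getattr(fileobj, "mode", None)
    if isinstance(mode, str):
        return mode  # trust objects that report their own mode string
    readable = fileobj.readable()
    writable = fileobj.writable()
    if readable and writable:
        return "rb+"
    if writable:
        return "wb"
    return "rb"


buf = io.BytesIO(b"data")
is_buffered = isinstance(buf, io.BufferedIOBase)  # True: matches the broader check
has_mode = hasattr(buf, "mode")                   # False: nothing to read the mode from
inferred = guess_mode(buf)                        # falls back to readable()/writable()
```

Whether an inferred mode like this would be safe for every object that passes a broader check is exactly the open question in this thread.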

@jak574
Contributor Author

jak574 commented Mar 7, 2024

So I've been trying some things in the hope that they might lead to a PR.

I tried adding compression guessing to _open_filelike. It works for my code but breaks the test_create_fitshdu_with_compression test. This may be related to the code here:

https://github.com/astropy/astropy/blob/d32a06ad59188cde97b736806407e7c466dd11cb/astropy/io/fits/hdu/nonstandard.py#L35C1-L39C58

which converts the fileobj into a gzip.GzipFile if compression is turned on. I'm guessing this interacts poorly with putting compression detection in _open_filelike.

Any attempt to broaden the types detected in isfile() causes many failures. The one exception is tempfile.SpooledTemporaryFile: adding it breaks none of the tests and allows my code to work, but it's far too narrowly scoped a fix.
