Catalogs headers from products of dev1 pipeline cannot be read by external libraries #438

simonegarrappa · 2024-05-06T10:42:29Z

After processing data with pipeline dev1 version (as of this message date) I cannot read the catalogs header with external libraries (e.g. astropy).

Example file:
/last04e/data1/archive/LAST.01.04.01/2024/05/03/proc/005030v0/LAST.01.04.01_20240504.005310.921_clear_OP313_009_001_016_sci_proc_Cat_1.fits

Error:
I get an error when trying to read the hdu containing the header (2), I'm copying below full error log if it can be of any use. It looks like there is a float where there should be an int, somewhere.

File ~/Analysis/pyabscal/lastcatutils.py:27, in LastCatUtils.tables_from_lastcat(self, catfile)
25 hdu = pyfit.open(catfile)
26 last_cat = hdu[1].data
---> 27 info_cat = hdu[2]
29 return last_cat, info_cat

File ~/anaconda3/envs/LAST/lib/python3.9/site-packages/astropy/io/fits/hdu/hdulist.py:379, in HDUList.getitem(self, key)
375 # Originally this used recursion, but hypothetically an HDU with
376 # a very large number of HDUs could blow the stack, so use a loop
377 # instead
378 try:
--> 379 return self._try_while_unread_hdus(
380 super().getitem, self._positive_index_of(key)
381 )
382 except IndexError as e:
383 # Raise a more helpful IndexError if the file was not fully read.
384 if self._read_all:

File ~/anaconda3/envs/LAST/lib/python3.9/site-packages/astropy/io/fits/hdu/hdulist.py:1248, in HDUList._try_while_unread_hdus(self, func, *args, **kwargs)
1246 return func(*args, **kwargs)
1247 except Exception:
-> 1248 if self._read_next_hdu():
1249 continue
1250 else:

File ~/anaconda3/envs/LAST/lib/python3.9/site-packages/astropy/io/fits/hdu/hdulist.py:1290, in HDUList._read_next_hdu(self)
1287 offset = last._data_offset + last._data_size
1288 fileobj.seek(offset, os.SEEK_SET)
-> 1290 hdu = _BaseHDU.readfrom(fileobj, **kwargs)
1291 except EOFError:
1292 self._read_all = True

File ~/anaconda3/envs/LAST/lib/python3.9/site-packages/astropy/io/fits/hdu/base.py:366, in _BaseHDU.readfrom(cls, fileobj, checksum, ignore_missing_end, **kwargs)
360 hdu = cls._readfrom_internal(
361 fileobj, checksum=checksum, ignore_missing_end=ignore_missing_end, **kwargs
362 )
364 # If the checksum had to be checked the data may have already been read
365 # from the file, in which case we don't want to seek relative
--> 366 fileobj.seek(hdu._data_offset + hdu._data_size, os.SEEK_SET)
367 return hdu

File ~/anaconda3/envs/LAST/lib/python3.9/site-packages/astropy/io/fits/file.py:428, in _File.seek(self, offset, whence)
426 if not hasattr(self._file, "seek"):
427 return
--> 428 self._file.seek(offset, whence)
429 pos = self._file.tell()
430 if self.size and pos > self.size:

TypeError: 'float' object cannot be interpreted as an integer

agioffe · 2024-05-08T13:18:39Z

Can you read this header with some other tools, like HEASARC ftools/fv or ftools/fdump?

agioffe · 2024-05-20T12:33:13Z

The problematic file is already not there?
ocs@last04e> ls /last04e/data1/archive/LAST.01.04.01/2024/05/03/proc/005030v0/LAST.01.04.01_20240504.005310.921_clear_OP313_009_001_016_sci_proc_Cat_1.fits
ls: cannot access '/last04e/data1/archive/LAST.01.04.01/2024/05/03/proc/005030v0/LAST.01.04.01_20240504.005310.921_clear_OP313_009_001_016_sci_proc_Cat_1.fits': No such file or directory
ocs@last04e> ls /last04e/data1/archive/LAST.01.04.01/2024/05/03/proc/005030v0/
ls: cannot access '/last04e/data1/archive/LAST.01.04.01/2024/05/03/proc/005030v0/': No such file or directory
ocs@last04e> ls /last04e/data1/archive/LAST.01.04.01/2024/05/03/proc/
ocs@last04e>

agioffe · 2024-05-20T14:15:53Z

With fv or fdump this file is read fine without errors, while when I read it with matlab as
AH = AstroHeader('LAST.01.04.01_20240504.005310.921_clear_OP313_009_001_016_sci_proc_Cat_1.fits',2)
and ask for data types of AH.Data, I am indeed getting doubles where there should be integers:

AH.Data
96×3 cell array
{'XTENSION'} {'BINTABLE' } {'binary table extension'}
{'BITPIX' } {[ 8]} {'8-bit bytes' }
{'NAXIS' } {[ 2]} {'2-dimensional binary…'}
{'NAXIS1' } {[ 320]} {'width of table in by…'}
{'NAXIS2' } {[ 305]} {'number of rows in ta…'}
{'PCOUNT' } {[ 0]} {'size of special data…'}
{'GCOUNT' } {[ 1]} {'one data group (requ…'}
{'TFIELDS' } {[ 40]} {'number of fields in …'}
for i=2:8, class(AH.Data{i,2})
end

ans = 'double'
ans = 'double'
ans = 'double'
ans = 'double'
ans = 'double'
ans = 'double'
ans = 'double'

Matlab is not sensitive to it, but probably, python is, if astropy tells it that some of these keywords should be of a different type.

As far as I understand, a relatively recent novelty that we have introduced is that we use a MEX function to write FITS headers.
For example, for the catalogs we call it from FITS.writeTable1 as

io.fits.mex.mex_fits_table_write_image_header(Header.Header, FileName);

@chentishler , is it possible that the MEX function puts doubles instead of integers?

agioffe · 2024-05-22T10:27:56Z

It came out that io.fits.mex.mex_fits_table_write_image_header adds a decimal point (".") to all the double values in a header without a decimal point, so there are two ways to solve it:

a local one, when we change the type of NAXIS* and BITPIX keywords to integer in the header just before it is passed to io.fits.mex.mex_fits_table_write_image_header (I have added some lines to FITS.writeTable1 to make sure it works fine, but I did not push it yet)
a global one, when we should somehow check that all the values written to NAXIS* and BITPIX rows of an AstroHeader are of some integer type (uint16 and int8, respectively).

@EranOfek , what is you opinion?

EranOfek · 2024-05-22T10:46:09Z

Option 2

…

On Wed, 22 May 2024 at 13:28 Alexander Krassilchtchikov < ***@***.***> wrote: It came out that *io.fits.mex.mex_fits_table_write_image_header* adds a decimal point (".") to all the double values in a header without a decimal point, so there are two ways to solve it: 1. a local one, when we change the type of NAXIS* and BITPIX keywords to integer in the header just before it is passed to *io.fits.mex.mex_fits_table_write_image_header* (I have added some lines to *FITS.writeTable1* to make sure it works fine, but I did not push it yet) 2. a global one, when we should somehow check that all the values written to NAXIS* and BITPIX rows of an AstroHeader are of some integer type (uint16 and int8, respectively). @EranOfek <https://github.com/EranOfek> , what is you opinion? — Reply to this email directly, view it on GitHub <#438 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABJUQ4IKJP7SGFRSCIGBVDLZDRXMDAVCNFSM6AAAAABHIZMW62VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMRUGQ2TGMRYGA> . You are receiving this because you were mentioned.Message ID: ***@***.***>

agioffe · 2024-05-22T14:28:20Z

How do you see it? We could make a dictionary of FITS header keywords which should take only integer values and implement checks/conversions into FITS.readHeader1, but any later assignment (e.g., when we cut an image and thus change NAXIS1 and NAXIS2) could overwrite them to double again. I am just wondering if it is possible to do it in some general way or we would have to implement explicit checks at any header operation.

agioffe · 2024-06-09T07:29:45Z

Looks like due to the improvements introduced by @chentishler the headers are fine and do not crash astropy anymore.

simonegarrappa · 2024-10-08T10:29:52Z

Reopening this issue:

I have found an header with a -nan value for the key 'FWHM'. This value is not compatible to FITS standard and makes astropy crash with the following error

VerifyError: Unparsable card (FWHM), fix it first with .verify('fix').

Usually astropy runs by itself this .verify('fix') method but it is not able to fix the header entry in this case.

The catalog is

mnt/marvin/LAST.01.05.04/2024/08/02/proc/012938v0/LAST.01.05.04_20240803.013458.549_clear_4Cp2750_017_001_009_sci_proc_Cat_1.fits

The issue is very rare since I've found this error only after reading > 100k catalogs.

agioffe · 2024-10-08T10:30:42Z

OK, I'll check it out.

EastEriq · 2024-10-08T12:37:23Z

For my part, I can say that there are many numeric keys in the raw image headers which could happen to get a NaN value in some circumstances. What would you like me to do? One option could be to leave them like that and rely on the sanitisation done by some reader; another may bet to omit writing altogether the keys which are Nan; a third option would be to use a sentinel unphysical value, like -99999, but I'd find it more prone to misinterpretation.

agioffe · 2024-10-09T08:43:43Z

I can see that in the relevant coadd the FWHM is non-NaN (~2.1), will try to reproduce this in the proc image.

agioffe · 2024-10-09T15:06:04Z

@simonegarrappa, thank you for this finding! After some investigation I found that the FWHM function we are employing was used in the pipleline with a non-optimal step parameter. In the case of a very good and sharp FWHM this led no NaN, and in other cases -- to some overestimation of the FWHM value. Now it is fixed and pushed to dev1: in imProc.psf.fwhm the explicit 'Step' parameter is now passed to imUtil.psf.curve_of_growth as Obj(Iobj).PSFData.fwhm('curveArgs',{'Step',0.5});

agioffe assigned chentishler May 20, 2024

agioffe closed this as completed Jun 9, 2024

simonegarrappa reopened this Oct 8, 2024

agioffe closed this as completed Oct 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Catalogs headers from products of dev1 pipeline cannot be read by external libraries #438

Catalogs headers from products of dev1 pipeline cannot be read by external libraries #438

simonegarrappa commented May 6, 2024

agioffe commented May 8, 2024

agioffe commented May 20, 2024 •

edited

Loading

agioffe commented May 20, 2024

agioffe commented May 22, 2024

EranOfek commented May 22, 2024 via email

agioffe commented May 22, 2024

agioffe commented Jun 9, 2024

simonegarrappa commented Oct 8, 2024

agioffe commented Oct 8, 2024

EastEriq commented Oct 8, 2024

agioffe commented Oct 9, 2024

agioffe commented Oct 9, 2024 •

edited

Loading

Catalogs headers from products of dev1 pipeline cannot be read by external libraries #438

Catalogs headers from products of dev1 pipeline cannot be read by external libraries #438

Comments

simonegarrappa commented May 6, 2024

agioffe commented May 8, 2024

agioffe commented May 20, 2024 • edited Loading

agioffe commented May 20, 2024

agioffe commented May 22, 2024

EranOfek commented May 22, 2024 via email

agioffe commented May 22, 2024

agioffe commented Jun 9, 2024

simonegarrappa commented Oct 8, 2024

agioffe commented Oct 8, 2024

EastEriq commented Oct 8, 2024

agioffe commented Oct 9, 2024

agioffe commented Oct 9, 2024 • edited Loading

agioffe commented May 20, 2024 •

edited

Loading

agioffe commented Oct 9, 2024 •

edited

Loading