Error reading Solar Dynamics Observatory, HMI continuum, FITS file with "nan" fields #10153
Comments
System Details: Windows-10-10.0.18362-SP0 |
Is running Also, for working with SDO/HMI data, I recommend that you use SunPy, since it'll do additional stuff for you. (Under the hood, SunPy calls
>>> import sunpy.map
>>> hmi_map = sunpy.map.Map('hmi.Ic_720s.20170530_000000_TAI.3.continuum.fits')
>>> hmi_map.peek() |
Yeah, if the maker of the data violates the standard, it is out of scope. Did you contact the observatory first? |
Thanks for the help. The fix you suggest is acceptable. I did see the I have sent an email to the observatory at Stanford to let them know. I mostly use Earth science data, which are netCDF based, and I am not a FITS expert. I assumed that the code run by the Joint Science Operations Centre at Stanford for a major NASA mission like SDO would have been thoroughly validated. I thought it much more likely that astropy.io.fits was a little bit behind the latest updates in FITS standards, especially given that NaN is such a sensible idea in many situations. Again, thanks for the help and suggestions. I found and installed SunPy yesterday; I have not looked at it yet, but it sounds very promising. |
Maybe a sensible option is to add some documentation to the |
It's not linked to particularly well, but there is a page devoted to the issue of verification: https://docs.astropy.org/en/stable/io/fits/usage/verification.html I'll emphasize a few sentences from there:
|
Thanks for the link and I think your statement underlines the need for documentation on the Let me know if I can be of any assistance. I would be happy to edit the FITS documentation as a novice contributor if you think that would be helpful. |
Yes, the issue is known (e.g. #873), but it seems that since then nobody has pushed for allowing NaNs in header cards in the standard. And sadly people use these kinds of non-standard things, so I guess we could add an option to allow reading those NaNs; it would not be the first exception. And as mentioned by @ayshih, |
Maybe a link to the verification page could be added at the end of the "Opening a FITS File" paragraph? http://docs.astropy.org/en/latest/io/fits/#opening-a-fits-file |
@ndl303 - your contribution would be most useful. |
Indeed, most of the I definitely did improve it a lot back in the day, by having the reader enforce strict standards by default, but then there are several places where it can identify various common standard violations and in some cases propose fixes. This is done in a very ad-hoc manner though, and there's no good way to control the behavior (e.g.

A smarter approach would be to define some specific verification phases (this is already defined implicitly in the code) such as verifying overall header structure, verifying card structure, verifying individual keywords and values, etc. And there could be a cleaner way to represent (e.g. in the form of exceptions) specific standards that are being violated (perhaps even with reference to the relevant chapter and verse of the FITS standard). This could allow users to register hooks for how to fix and/or ignore specific standard violations. So if you're working with data from an observatory that writes "nan" as card values, and Astropy doesn't otherwise support that case, you could register your own fix for

I think this would all be relatively easy to do, and @ndl303 has already done good work pinpointing some of the relevant parts of the code. But it should definitely be handled with care.

Additionally, it would be nice to have a better UI for fixing violations (alluded to somewhat in #3668) including an interactive mode when working in a REPL, which would allow inspecting invalid data one at a time, seeing what the proposed fix(es) are, and possibly also allowing a manual override. Something like that (I'm thinking e.g. |
Description
I am unable to read a FITS file with astropy.io.fits produced by the Solar Dynamics Observatory for the HMI continuum data product. I obtained the FITS files from the SDO JSOC but this requires users to register an email address and learn how to request data. I have the file temporarily available here if you want to replicate the problem.
I actually think this is not really a bug in astropy.io.fits but is due to the SDO HMI team using NaN values, which is outside the FITS standard (I think). Unfortunately this is a good idea and they obviously have FITS software that lets them create these files. From a pragmatic point of view it might make sense for astropy.io.fits to also support the option, which seems to be relatively straightforward, so it can keep up with other FITS software out there.
I have done some diagnostics outlined below and I have hacked my version of astropy to allow me to keep on working with the SDO HMI data.
Expected behavior
I hoped astropy.io.fits would read the SDO HMI continuum FITS file without issue. It consists of one 2-D image, 4096x4096 numbers.
The problem seems to be that the SDO HMI people have encoded "nan" values into some of the CARD values used in their FITS file. This is not supported by astropy.io.fits and may not be part of the FITS standard. Unfortunately it is used by the HMI instrument.
Other people have seen this problem and reported issues on Stack Overflow.
The sample code is straightforward but must use SDO HMI data files. The example is shown below. The fits.open line works without issue but the code crashes on the line data = hdul[1].data.
Curiously the code does not always crash when using the PyCharm/Python debugger environment. I suspect the debugger is calling many @property based values when it displays variable values and this is circumventing the normal internal caching and checking going on inside the FITS code. This definitely complicates debugging.
The code always crashes when it runs without a debugger. The following errors are output when executing the last line,
System Details
Numpy 1.18.1
astropy 4.0.1.post1
Scipy 1.3.1
Diagnostics
I have debugged the code to the point that I have a temporary fix on my own machine. The problem is in file astropy.io.fits.card.py, in function Card._parse_value(self). Around line 764 there is a piece of code where the problem occurs.
The line
m = self._value_NFSC_RE.match(self._split()[1])
is trying to ensure that a CARD value is valid, and 'nan' is not an acceptable value. The match returns None and the exception is thrown.
The quick fix I have implemented changes the regular expression so it passes the match.
The regular expression in card.py is changed to include a new match group called nan. In my implementation I only match 'nan'. More general implementations might also match 'NaN' and 'NAN'.
Card._parse_value(self), around line 764, is changed to manage this new match group (nan) and use numpy.nan to assign a value.
With these fixes, the code on my machine is working. In practice a robust fix would check some of the other places where CARD values are parsed.
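The two changes described above can be sketched in isolation. This is a minimal sketch with a deliberately simplified stand-in for the value pattern, not the actual _value_NFSC_RE from card.py; the names VALUE_RE and parse_card_value are hypothetical. It adds a nan match group (covering 'nan', 'NaN' and 'NAN') and uses float("nan"), the standard-library equivalent of numpy.nan, so that the sketch has no dependencies.

```python
import re

# Simplified, hypothetical stand-in for the card value pattern.
# The real pattern in card.py also handles strings, booleans, complex
# numbers and comments; this sketch only handles plain real numbers.
VALUE_RE = re.compile(
    r"^\s*(?:"
    r"(?P<nan>[nN][aA][nN])"  # new group: nan / NaN / NAN
    r"|(?P<numr>[+-]?(?:\d+\.?\d*|\.\d+)(?:[eEdD][+-]?\d+)?)"  # ordinary number
    r")\s*$"
)


def parse_card_value(text):
    """Return a float for a numeric or NaN value string, else raise ValueError."""
    m = VALUE_RE.match(text)
    if m is None:
        # Mirrors the failure described above: no match means unparsable.
        raise ValueError(f"unparsable card value: {text!r}")
    if m.group("nan") is not None:
        return float("nan")  # same value as numpy.nan
    # FITS allows 'D' as an exponent marker; Python's float() does not.
    return float(m.group("numr").replace("D", "E").replace("d", "e"))
```

Without the nan alternative, the string 'nan' falls through to the ValueError branch, which corresponds to the match returning None in the original code.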
Please let me know if I can be of any assistance. Astropy is a great package and I would be more than happy to help with this problem.