Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed incorrect string representation of floats in Cards #14508

Merged
merged 11 commits into from Mar 13, 2023

Conversation

kYwzor
Copy link
Member

@kYwzor kYwzor commented Mar 9, 2023

Description

This pull request is to address a bug that caused io.fits.Card to format floats incorrectly, which could force the associated comments to be truncated.

EDIT:

@kYwzor kYwzor requested a review from saimn as a code owner March 9, 2023 11:08
@github-actions
Copy link

github-actions bot commented Mar 9, 2023

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see "When to rebase and squash commits".
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • Is a milestone set? Milestone must be set but we cannot check for it on Actions; do not let the green checkmark fool you.
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may need upper case. Some other comments that are less directly related to your PR.

if "." not in value_str and "E" not in value_str:
value_str += ".0"
elif "E" in value_str:
value_str = str(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may have to do str(value).upper() to ensure we still have an E - that may well be FITS standard.

Copy link
Member Author

@kYwzor kYwzor Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the if and .split() to check for "e" instead of "E" since that is how python represents the exponent. I believe the upper case E is indeed part of the FITS standard, however that is not an issue since I did not change the following line:

value_str = f"{significand}E{sign}{int(exponent):02d}"

This will force the end result to always use the upper case E.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. So, then it is more relevant if we actually can remove that whole stanza (which I think is likely to be the case).

Copy link
Member Author

@kYwzor kYwzor Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds a bit risky but we can give it a try. I notice that the code specifically has int(exponent), but I don't think it's possible that Python ever returns a fractional exponent (1e0.5 isn't even legal syntax). Also, 1e-9 is converted to "1e-09" so it seems Python automatically pads to at least 2 digits. The only way this would fail is if indeed the representation is different in different platforms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part looks very sketchy to me:

elif "." not in value_str:
value_str += ".0"

The only way this if statement would be called is if a integer reached this point, but I don't think that could ever happen because we only call this function in these places:

elif isinstance(value, (float, np.floating)):
return f"{_format_float(value):>20}"
elif isinstance(value, (complex, np.complexfloating)):
val_str = f"({_format_float(value.real)}, {_format_float(value.imag)})"
return f"{val_str:>20}"

I've checked and using both Python built-in complex numbers and numpy's complex numbers, .real and .imag are float even if we use 1j+1. In other words, I don't see how we could get a number that does not have a decimal point while also not using an exponent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that can be removed too! It was needed only with the format string:

In [2]: f"{5.0:16G}"
Out[2]: '               5'

elif "E" in value_str:
value_str = str(value)

if "e" in value_str:
# On some Windows builds of Python (and possibly other platforms?) the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt this is still true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the comments regarding the Windows builds of Python?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think python now no longer relies on anything OS-dependent for typesetting float values (especially since we no longer use the format string).

astropy/io/fits/card.py Show resolved Hide resolved
@pllim pllim added the Bug label Mar 9, 2023
@pllim pllim added this to the v5.3 milestone Mar 9, 2023
@kYwzor
Copy link
Member Author

kYwzor commented Mar 9, 2023

Does anyone know why Read the Docs does not like me using "io.fits.Card" nor "Card" on the changelog?

@kYwzor
Copy link
Member Author

kYwzor commented Mar 9, 2023

I tried to be a bit more aggressive with the changes and only truncate if we are using a scientific notation. I assumed that the string representation of floats in Python couldn't go over 20 characters in instances where scientific notation isn't used. However, this is apparently not true (for example, -0.0021428662759603867 and 0.0014104245585713215 both get represented exactly as they are written), so the old code did truncate these. So I will revert that part to the old way of truncating for now, at least while we don't opt for rounding instead. I do wonder though: would it be possible to have a float that is 22 characters long without scientific notation and just a single decimal place? Because if so, with the way the strings are being truncated, we could hypothetically truncate the decimal point (which would break the standard).

@mhvk
Copy link
Contributor

mhvk commented Mar 9, 2023

Probably best to leave the truncation, rounding for another time. Sounds like standard notation can give 22 characters, negating your second example:

In [3]: len(str(-0.0014104245585713215))
Out[3]: 22

We could only make this 21 by removing the leading 0, so not a real solution.

@kYwzor
Copy link
Member Author

kYwzor commented Mar 9, 2023

Sorry, maybe my comment wasn't clear. What I meant is that from my testing I figured out that floats can surpass 20 chars even when not using scientific notation (which I initially didn't think possible). My question now is: is there a float for which Python gives a string representation that is 1) standard notation (not scientific), 2) more than 20 characters long and 3) has just a single decimal place? Because if such a float exists, the truncation code would break, since it would delete the decimal places, making it an invalid float in FITS.

@mhvk
Copy link
Contributor

mhvk commented Mar 9, 2023

@kYwzor - I think the truncation itself is safe, just (slightly) wrong. Probably best to keep that for another PR, as you suggested earlier...

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks all good to me - let's handle the truncation separately.

Approving now, but will wait with merging to give @saimn a chance to have a look.

Copy link
Contributor

@saimn saimn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good, thanks for the PR @kYwzor (and for the review @mhvk :))

@saimn saimn merged commit 386b52e into astropy:main Mar 13, 2023
20 of 21 checks passed
@keflavich
Copy link
Contributor

This broke some tests downstream in spectral-cube because we're using wcs_out.wcs.compare(result.wcs.wcs). I'm still trying to ID the issue specifically, but I may request a change.

@keflavich
Copy link
Contributor

OK I think I see the problem - the error noted in #14507 is possibly not a bug. It is not possible to round-trip WCS->header->WCS with this PR in place.

However, I can't produce a MWE; I don't know exactly what wcs.compare is doing, and in the case where it's returning False, the WCSes are by-eye identical. I'm still digging.

@keflavich
Copy link
Contributor

MWE:

from astropy.wcs import WCS
ww = WCS(naxis=1)
ww.wcs.cdelt[0] = 1288.2149687900003
ww2 = WCS(ww.to_header())
ww.wcs.compare(ww.wcs) # True
ww.wcs.compare(ww2.wcs) # False

@keflavich
Copy link
Contributor

Could we please add the above example as a regression test and revert this until we come up with a solution?

@keflavich
Copy link
Contributor

To expand a bit: it looks like we have a problem with round-tripping from header->wcs->header identified in #14507, but the solution here breaks the round-trip from wcs->header->wcs.

@pllim
Copy link
Member

pllim commented Mar 13, 2023

PR to revert at #14524

@dhomeier
Copy link
Contributor

(ww.wcs.cdelt[0] - ww2.wcs.cdelt[0]) / ww.wcs.cdelt[0]
1.7650289815899323e-16

is below float64 resolution of 1e-15, so it may not be reasonable to expect this can roundtrip. In fact that example also fails in 5.2, even when setting ww.wcs.cdelt[0] = 1288.214968790003 (i.e. more than resolution above 1288.21496879).
But before and after this PR

fits.Card('CDELT1', ww.wcs.cdelt[0], 'Coordinate increment at reference point')
('CDELT1', 1288.2149687900003, 'Coordinate increment at reference point')
ww.to_header().cards[2]
('CDELT1', 1288.21496879, 'Coordinate increment at reference point')

as the latter is created by WCS.to_header(), with a default precision of {:22.14G} set by WCSLIB (WCSHDO_P14) – changing it to the desired precision here would require

 ww.to_header(relax=wcs.WCSHDO_P17).cards[2]
('CDELT1', 1288.2149687900003, 'Coordinate increment at reference point')

does not seem like it falls within the scope of this PR or the discussion to be had on https://github.com/astropy/astropy/pull/14508/files#r1133599288.

@pllim
Copy link
Member

pllim commented Mar 13, 2023

So does it mean we need to revert or not?

@keflavich
Copy link
Contributor

Ah, I'm afraid you're right, my MWE does not succeed on 5.2 either, it fails on both, so it is not a good MWE. The test that fails in spectral-cube has a little more going on in it; I thought I had narrowed it down to the right spot. I used git bisect to pin the spectral-cube failure down to 3d8a2c6, but I can't tell yet why spectral-cube's test succeeds on 5.2.1 while my MWE still fails. I'll try to produce a better one.

@keflavich
Copy link
Contributor

@pllim we still do need to revert, I think - just because we haven't found where this breaks the spectral-cube test does not mean that this is a safe change.

I'm finding it extremely difficult to produce a MWE that passes on 5.2.1 and fails on main, but spectral-cube's test definitely does.

@pllim
Copy link
Member

pllim commented Mar 13, 2023

Are you sure it is this PR?

@keflavich
Copy link
Contributor

Sure as I can be. I ran git bisect and it points to the commit I referenced above. I verified that spectral-cube still passes when built against last night's nightly build.

@pllim
Copy link
Member

pllim commented Mar 13, 2023

I'd rather see that better MWE first. Or you can also propose a revert PR from your fork.

@mhvk
Copy link
Contributor

mhvk commented Mar 13, 2023

Hmm, sounds like a wcs.is_close (or a loosening in wcs.compare) is called for... It does seem that @kYwzor's comment is right, that the fact that it did work is more "bug-for-bug compatibility" than anything else.

@keflavich
Copy link
Contributor

I've given up on the MWE. I have reproduced the path as precisely as possible and cannot get the MWE to pass with 5.2.1 and fail with main. I'm just changing the comparison to wcs.compare(..., tolerance=1e-12) and moving on.

@mhvk
Copy link
Contributor

mhvk commented Mar 13, 2023

Ah, so it is like .is_close() already - certainly, sounds like a sensible solution to just allow for some floating point errors.

@dhomeier
Copy link
Contributor

Yes; it is correct that that commit has changed the output of

# 5.2.1
fits.card._format_value(1288.2149687900003)
'       1288.21496879'
# 5.3dev
fits.card._format_value(1288.2149687900003)
'  1288.2149687900003'

which happened to just match the precision limitation of wcs.to_header in that specific case (but not even for 1288.21496879003 or 1288.214968790003). So wcs.compare did fail in those examples and still does; seems not a helpful solution to re-introduce the precision loss in fits.card here, but it is a puzzle where card._format_value would have been used in that spectral-cube test so that it could just make up for the WCS roundtrip bug.

@dhomeier
Copy link
Contributor

dhomeier commented Mar 13, 2023

I'm just changing the comparison to wcs.compare(..., tolerance=1e-12) and moving on.

Alternatively you might still try the header_out.update(wcs_out.to_header(relax=wcs.WCSHDO_P17)) option from above; still unclear why the default precision has to be lower (WCSLIB design decision?).
That version does pass with astropy 5.2.1 anyway (which I really was not sure of at this point ;-).

@dhomeier
Copy link
Contributor

dhomeier commented Mar 13, 2023

@keflavich it seems the spectral-cube bug is already introduced when reading the input file, which actually has

CDELT3 = 1.28821496879E+00 /

but is parsed (with any astropy version) as

fits.open('adv.fits')[0].header['CDELT3']
1.28821496879
SpectralCube.read('adv.fits')._header['CDELT3']
1.28821496879
SpectralCube.read('adv.fits').header['CDELT3']
1.2882149687900002

Likely there is some floating point error introduced in BaseSpectralCube.header, perhaps in
https://github.com/radio-astro-tools/spectral-cube/blob/8747c377e190c0f5d193f9dd9916e1fe4b4af1c1/spectral_cube/spectral_cube.py#L2516-L2521
(although in the present case there should be no unit conversion) that is creating the odd values here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants