Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table io null fix #14723

Merged
merged 11 commits into from Nov 3, 2023
Merged

Table io null fix #14723

merged 11 commits into from Nov 3, 2023

Conversation

gpsgibb
Copy link
Contributor

@gpsgibb gpsgibb commented May 1, 2023

Description

This pull request addresses two related bugs in writing/reading astropy.table.Table objects with NULL values in int columns to/from FITS. The first bug is in reading a table from a FITS file, where the fill_value parameter of the MaskedColumn is the default (999999) not the value recorded in the input file's TNULL parameter. The second bug is when writing a table to file, the value of the fill_value column is not propagated through to the written table, so for an int column TNULL is always defaulted to 999999.

The changes proposed by this PR ensure that TNULL/fill_value are correctly propagated to/from the Table object upon read/write.

Also included in this PR is the propagation of the FITS standard's NULL for floats (NaN) and strings ("") into a column's fill_value, rather than the default (from numpy.ma) of 1E20 and "N/A" for floats and strings respectively.

Fixes #14693

@github-actions
Copy link

github-actions bot commented May 1, 2023

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see "When to rebase and squash commits".
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • Is a milestone set? Milestone must be set but we cannot check for it on Actions; do not let the green checkmark fool you.
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to Astropy 👋 and congratulations on your first pull request! 🎉

A project member will respond to you as soon as possible; in the meantime, please have a look over the Checklist for Contributed Code and make sure you've addressed as many of the questions there as possible.

If you feel that this pull request has not been responded to in a timely manner, please send a message directly to the development mailing list. If the issue is urgent or sensitive in nature (e.g., a security vulnerability) please send an e-mail directly to the private e-mail feedback@astropy.org.

@pllim
Copy link
Member

pllim commented May 1, 2023

Thanks! Needs a change log.

I restarted the failed jobs... might be transient.

@pllim pllim added this to the v5.0.7 milestone May 1, 2023
@pllim pllim added 💤 backport-v5.0.x on-merge: backport to v5.0.x 💤 backport-v5.2.x on-merge: backport to v5.2.x 💤 backport-v5.3.x on-merge: backport to v5.3.x labels May 1, 2023
Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gpsgibb - this looks generally good. I just put in a couple of small comments.

astropy/io/fits/connect.py Outdated Show resolved Hide resolved
fill_value=NULL_VALUE,
)
t = Table([c1])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a specific unit test of the Table.as_array() functionality here by doing tn = t.as_array() and confirm the correct fill_value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (see aab7ab6)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this commit was lost in one of the rebase ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rebased and added the commit.

Copy link
Contributor

@saimn saimn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR looks good, thanks @gpsgibb.
Double backticks are needed to avoid reference errors with Sphinx.

docs/changes/io.fits/14723.bugfix.rst Outdated Show resolved Hide resolved
docs/changes/table/14723.bugfix.rst Outdated Show resolved Hide resolved
@saimn
Copy link
Contributor

saimn commented May 18, 2023

@gpsgibb - Can you also rebase on main to get rid of the merge commits ? Thanks.

@gpsgibb
Copy link
Contributor Author

gpsgibb commented May 22, 2023

I think that's me rebased it... (sorry if I have messed up)

@pllim
Copy link
Member

pllim commented May 22, 2023

Re: rebase -- Unless you really wrote 32 commits here, I would say something is wrong. Try this:

git fetch upstream main
git rebase upstream/main
git log

Check to see if the history now makes more sense.

@gpsgibb
Copy link
Contributor Author

gpsgibb commented May 23, 2023

I'm getting conflicts when trying to rebase:

dropping d2b262f3e34e8d65cb8199a44cda5d3c1f247b42 Fix doctest -- patch contents already upstream
Auto-merging astropy/coordinates/tests/test_intermediate_transformations.py
Auto-merging docs/changes/coordinates/14675.api.rst
CONFLICT (add/add): Merge conflict in docs/changes/coordinates/14675.api.rst
error: could not apply e8a0c611a... MNT: Remove misc deprecated stuff
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply e8a0c611a... MNT: Remove misc deprecated stuff in coordinates from a long time ago.

How should I proceed?

@pllim pllim removed the 💤 backport-v5.2.x on-merge: backport to v5.2.x label May 23, 2023
@taldcroft
Copy link
Member

@gpsgibb - sorry, looks like you got your branch into a bad place. The lesson for future PR's is never do a git merge in astropy, that always leads to tears. We've all been there.

I'm not a git expert, but I've been in this mess before and for me the fastest way back was to effectively start from scratch. It sounds painful but looking at your PR, there are really only a handful of change blocks and so you can recover the change in 15 minutes. Specifically:

  • Rename your current (broken) branch git branch -m table-io-null-fix table-io-null-fix-original
  • Make a new branch table-io-null-fix (git switch main then git pull upstream then git switch -c table-io-null-fix)
  • Open the changes in your PR (https://github.com/astropy/astropy/pull/14723/files)
  • Disable "Show comments" by unchecking the box
  • One by one re-apply each of your changes by copy/paste from the web browser into your editor.
  • Make one commit with your updates.
  • Force push the new version of your branch git push -f origin table-io-null-fix.

This should work.

@eerovaher
Copy link
Member

It should be possible to avoid manual copy-pasting by using git reflog, (see also the Data Recovery section at https://git-scm.com/book/en/v2/Git-Internals-Maintenance-and-Data-Recovery).

@saimn
Copy link
Contributor

saimn commented May 24, 2023

It can be fixed "easily", the problem is that you merged with your remote branch after the previous rebase (probably because git push told you to do so, after a rebase you should use git push --force instead). So you need to get rid of the last merge commit:
image

To remove the merge commit :

git reset --hard 538021163

Then rebase:

git fetch upstream main
git rebase upstream/main

And push with --force:

git push --force

@gpsgibb
Copy link
Contributor Author

gpsgibb commented May 29, 2023

Thanks @saimn that seems to have done the trick... down to 10 commits in the diff, all my own

@taldcroft
Copy link
Member

@gpsgibb - note also the pre-commit failure (end of line). And at the end we'll probably also ask you to squash this down to one commit since it has gotten a bit large. Thanks for hanging in there and making this important fix!

@pllim pllim modified the milestones: v5.0.7, v5.0.8 Aug 15, 2023
@pllim pllim removed this from the v5.0.8 milestone Sep 7, 2023
@taldcroft
Copy link
Member

@gpsgibb - I rebased this and added a commit to address my review comment.

@saimn - is this looking OK for io.fits?

@saimn saimn added the backport-v6.0.x on-merge: backport to v6.0.x label Oct 28, 2023
Copy link
Contributor

@saimn saimn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks @gpsgibb !

@saimn saimn enabled auto-merge November 3, 2023 21:43
Last minute fix:

`np.NaN` was removed in the NumPy 2.0 release. Use `np.nan` instead
@saimn saimn merged commit be40cc4 into astropy:main Nov 3, 2023
26 checks passed
@pllim pllim removed Close? 💤 backport-v5.0.x on-merge: backport to v5.0.x 💤 backport-v5.3.x on-merge: backport to v5.3.x labels Nov 4, 2023
@pllim pllim modified the milestones: v5.0.9, v6.0 Nov 4, 2023
@pllim
Copy link
Member

pllim commented Nov 4, 2023

Don't feel like dealing with 3 backport PRs, so just 6.0 now since the other two branches gonna die soon.

@meeseeksdev backport to v6.0.x

@pllim
Copy link
Member

pllim commented Nov 4, 2023

p.s. Not sure why the backport labels didn't trigger in the first place but GitHub was having issues. 🤷

pllim added a commit that referenced this pull request Nov 4, 2023
…723-on-v6.0.x

Backport PR #14723 on branch v6.0.x (Table io null fix)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TNULL not respected/retained by astropy.table when reading/writing from FITS file
5 participants