BUG [table]: fix a bug where column names would be lost when instanciating Table from a list of Row objects #15735
Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.
astropy/table/table.py
Outdated
@@ -593,7 +620,7 @@ class Table:
 Copy the input data. If the input is a Table the ``meta`` is always
 copied regardless of the ``copy`` parameter.
 Default is True.
-rows : numpy ndarray, list of list, optional
+rows : numpy ndarray, list of list, sequence of ``Row``, optional
Adding a new accepted input type also feels like a feature, so I didn't mark this for backport, but the Table maintainers can always change that if they want.
Thanks!
Initializing from a sequence of Row was always intended to work, so this is just documenting that correctly (and fixing a bug).
So, @taldcroft, should we backport?
Yes on the backport in concept, though to be honest I don't know where it will be backported. I thought there is only 6.0.1 which is the next bugfix release forward from 6.0.0.
Also, of course, I haven't reviewed the code yet!
There's one failure I didn't catch locally. I'll get back to fix it in the morning, most likely.
astropy/table/table.py
Outdated
return names

for row in rows[1:]:
    if not isinstance(row, Row) or len(row.colnames) != len(names):
note to reviewers: I'm assuming that passing rows of different sizes would raise an exception sooner or later during initialization, but I think this function should be as resilient as possible, so raising that exception is not its responsibility.
t1["a"] = [1, 2, 3]
t1["b"] = [2.0, 3.0, 4.0]

rows = [row for row in t1]  # noqa: C416
I used a noqa comment here to keep the intention of that line very clear. Otherwise, the linter would push for `rows = list(t1)`, which is equivalent but a bit more opaque in my opinion.
I could also create a list of Rows manually. Here I just took the example from @astrofrog for easier comparison between the test and its linked issue.
b83cb50 to 2381367
2381367 to e3386c8
We should backport.
Since I don't review many PRs, I don't know what I need to do about that though. Set the backport label?
Change the milestone and set the backport label, yes. I already did it here. Thanks for the feedback!
@neutrinoceros - I did a little digging and noticed a solution that has a much smaller footprint; in fact it reduces the line count. Basically this relies on
This also fixes some old inelegant code for getting the column order, which was written way before Python had an ordered dict. It's very easy now to get the column order as simply the order in which the columns appear in successive rows. The previous alphabetical sort was not very nice.
One slightly unfortunate thing about this solution is that it could change the column ordering in existing code, so it would count as an API change for the unusual circumstance where the first row does not have all the columns. So I think this would need to wait for the next release, but the original issue is 6 years old so it can wait a little longer.
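To make the ordering change concrete, here is a minimal pure-Python sketch. It is not the astropy implementation: both helper names are hypothetical, and the "sorted fallback" function only mirrors the previous behavior as it is described in this thread (keep the first row's order when the first row has every column, otherwise sort alphabetically). The second function is the "order of first appearance across successive rows" idea.

```python
def column_order_sorted_fallback(rows):
    """Previous behavior as described in this thread (hypothetical helper)."""
    all_names = set()
    for row in rows:
        all_names.update(row)
    first_row_names = list(rows[0])
    # If the first row already has every column, keep its order.
    if set(first_row_names) == all_names:
        return first_row_names
    # Otherwise fall back to alphabetical order.
    return sorted(all_names)


def column_order_first_seen(rows):
    """Order in which names first appear in successive rows (hypothetical helper)."""
    seen = {}  # dicts preserve insertion order since Python 3.7
    for row in rows:
        for name in row:
            seen.setdefault(name, None)
    return list(seen)


complete_first = [{"b": 1, "a": 2}, {"a": 3, "b": 4}]
partial_first = [{"b": 1, "a": 2}, {"c": 3, "b": 4}]

print(column_order_sorted_fallback(complete_first))  # ['b', 'a']
print(column_order_first_seen(complete_first))       # ['b', 'a']
print(column_order_sorted_fallback(partial_first))   # ['a', 'b', 'c']
print(column_order_first_seen(partial_first))        # ['b', 'a', 'c']
```

The two functions only disagree when the first row is missing a column, which is the unusual circumstance mentioned above.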
715ae5a to 3bea311
@taldcroft thank you so much, I agree your solution is much more elegant and it still passes the test, so I went ahead and replaced mine with it (with you as a "co-author")!
As per Simon's recommendation, I moved milestone to 7.0, added label to request What's New entry, and turned this into draft so we don't merge before v6.1.
Re: #15735 (comment) @saimn, I have not been super careful about API change and v6.1. Do we have to revert anything in https://github.com/astropy/astropy/pulls?q=is%3Apr+is%3Amerged+label%3A%22API+change%22+milestone%3Av6.1.0
I think pushing this to 7.0 is going too far. See #15735 (comment)
@hamogu not exactly: the behaviour that's changed here was documented, even though it wasn't ideal (and it's a side effect of @taldcroft's patch that I adopted, not its primary goal). My own patch was backward compatible, albeit a lot less elegant than @taldcroft's, so I think I can propose a way forward that should be satisfactory to all parties:
@saimn I am not sure how to go about adding a deprecation warning in this case: ideally users should have some sort of control to opt in to the new behaviour before it becomes the default, so they're able to dodge the warning once they are aware of it. In this case I do not see how to provide such an interface without adding a flag to
@hamogu - From what I understand, and @neutrinoceros confirms it above, there is both a bugfix (supporting list of Row) and an API change (ordering of the columns, which also affects instantiating from a list of dict). If the second part can be avoided then the bugfix can even go to 6.0.1, and maybe the API change can be delayed to 7.0, with a warning if possible. @neutrinoceros - Not sure about the warning; it also depends on whether it makes sense to have an option to keep the old behavior. One option is to issue the warning when columns are sorted, stating that the default behavior will change in 7.0, and people can filter the warning if they want. Or add an option to choose the sort behavior, which would allow forcing the old behavior if it makes sense to do so?
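For illustration only, the "warn when columns are sorted" option could look roughly like the sketch below. This is an assumption-laden toy, not astropy code: the function name, the warning text, and the choice of `FutureWarning` are all hypothetical, and only the standard-library `warnings` module is used.

```python
import warnings


def column_order_with_warning(rows):
    """Hypothetical: keep the sorted fallback, but warn that it will change."""
    all_names = set()
    for row in rows:
        all_names.update(row)
    first_row_names = list(rows[0])
    if set(first_row_names) == all_names:
        return first_row_names
    # Only the unusual case (first row missing a column) triggers the warning.
    warnings.warn(
        "Column order falls back to alphabetical sorting here; "
        "a future release may use first-appearance order instead.",
        FutureWarning,
        stacklevel=2,
    )
    return sorted(all_names)


rows = [{"b": 1, "a": 2}, {"c": 3, "b": 4}]

# Users who are aware of the change can filter the warning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)
    names = column_order_with_warning(rows)

print(names)  # ['a', 'b', 'c']
```

Note that affected users still have to touch their code to silence the warning, which is the cost weighed in the rest of this discussion.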
What's great about @taldcroft's patch is that it decreases complexity. If we were to support multiple behaviours then complexity would go back up again, and I'm honestly not sure it's worth it.
This is exactly the sort of situation I was thinking about in discussions for APE-21 which resulted in the statement about being pragmatic. Historically, API changes like this have been common in minor releases. Although we justified them by pointing to the LTS, in practice the project recognized that very few people were actually using the LTS. Breakage could only occur for a pretty unusual situation:
The pragmatic part is accepting that this situation can occur but it is unlikely. Not only does (1) need to be true, but it needs to always be true for all input data sets. Basically the current code is somewhat randomly changing the column ordering based on an arbitrary test. It is true that we could test for situation (1), issue a warning and sort the columns alphabetically. But then users in that situation need to modify their code anyway to suppress the warning. And if they want to permanently keep alphabetical ordering they have to add code to do that sorting so it still works after 7.0, and then they are left with orphaned code to suppress the warning. Users that want the more natural ordering now would be out of luck unless we add some new configuration option. This is just a LOT of complexity for very little gain.
Like #15774 this sounds like space-bar heating: https://xkcd.com/1172/. I don't think we have to backport, but I suggest a 6.1 milestone. We also don't have to do the extra work of breaking this up into two PRs.
In my day job (Chandra Flight Director dealing directly with NASA) I deal with process control authority, A LOT. In this case, there is a delegation of authority:
Only in unusual circumstances would the release managers or the CoCo override the recommendation of the package maintainer(s). Of course a healthy discussion is always encouraged!
Not sure I understand where this discussion is going, but sure, in the end it's up to the maintainers to decide.
It's not up to me to decide, and ultimately I'm happy to iterate as long as it takes to get everyone happy, but I think that going with the patch as is and aiming it at 6.1 is a reasonable way forward, and it indeed falls into the "pragmatic" kind of low-impact breakage that APE 21 defines as acceptable for a minor release.
@saimn - agreed to try avoiding API changes, I'm definitely 100% behind not breaking code unnecessarily. That said, in this case we have talked it through and I believe the complexity that is needed to maintain perfect back-compatibility outweighs the potential impact. Creating Tables from list of dict is common, but having both (1) and (2) from above is not.
57b4c3a to cd2692a
@neutrinoceros - this is converging! I made a number of suggestions on the docs. The code and tests look fine. I think the CI failures are just doc tests (at least from the first one) but you should check them all.
660e796 to c02340a
dbba51f to 1545980
Rebased to resolve a merge conflict.
…ble from a list of Row objects Co-authored-by: Tom Aldcroft <taldcroft@gmail.com>
53ab42e to 7339e2b
Rebased again to resolve conflicts. This time I also squashed my commits to minimize the number of conflicts I need to resolve in a single rebase.
Description
Fixes #5923
This is a somewhat inelegant minimal patch; I'll be happy to iterate on it to improve quality if requested!
ping @hamogu @taldcroft