Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only sort dict columns in from_records for py < 3.6 #22687

Closed
wants to merge 2 commits into from
Closed

Only sort dict columns in from_records for py < 3.6 #22687

wants to merge 2 commits into from

Conversation

llawall
Copy link

@llawall llawall commented Sep 13, 2018

  • closes #xxxx (no existing issue)
  • tests passed (no new test added - can add one if necessary)
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

DataFrame.from_records does not respect dictionary order on Python 3.6+. For example:

>>> df = pd.DataFrame.from_records({'C':[1], 'A':[2]})
>>> df
   A  C
0  2  1

This change respects dict ordering on Python 3.6+ in line with 0.23 pandas behaviour of constructor and from_dict:

>>> df = pd.DataFrame.from_records({'C':[1], 'A':[2]})
>>> df
   C  A
0  1  2

There appears to be further potential to reduce duplication if from_records called from_dict?

@pep8speaks
Copy link

pep8speaks commented Sep 13, 2018

Hello @llawall! Thanks for updating the PR.

Comment last updated on September 13, 2018 at 10:03 Hours UTC

@WillAyd
Copy link
Member

WillAyd commented Sep 13, 2018

Would at the very least need tests bundled with this, but can you also open an issue to discuss? The existing behavior is an explicit sort, not necessarily an undefined order for Python < 3.6 so it's not as if this is a bug being fixed but rather a behavioral change that could use discussion

@WillAyd WillAyd added the DataFrame DataFrame data structure label Sep 13, 2018
@TomAugspurger
Copy link
Contributor

Also, this would be somewhat unique as you have the potential for dicts of different orders in records. I believe in all our other changes to respect dict ordering, the data are columnar so you don't have this potential ambiguity.

@llawall
Copy link
Author

llawall commented Sep 14, 2018

@WillAyd :

  • I'll add a test for this - that is no problem.
  • With respect to "explicit sort", what makes this sort any different to the change to the change in _init_dict used by constructor? Previously an OrderedDict was special cased but the columns were otherwise always sorted too.

@TomAugspurger :

  • I'm not quite sure I understand what you mean by "dicts of different orders in records"?

I've opened #22708 for further discussion about this. Let's move the discussion there.

@codecov
Copy link

codecov bot commented Oct 22, 2018

Codecov Report

Merging #22687 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #22687   +/-   ##
=======================================
  Coverage   92.17%   92.17%           
=======================================
  Files         169      169           
  Lines       50715    50715           
=======================================
  Hits        46747    46747           
  Misses       3968     3968
Flag Coverage Δ
#multiple 90.58% <100%> (ø) ⬆️
#single 42.35% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 97.19% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c040353...ddd8c7f. Read the comment docs.

@WillAyd
Copy link
Member

WillAyd commented Nov 23, 2018

Can you merge master here?

@llawall
Copy link
Author

llawall commented Nov 24, 2018

Sure. I’m away for a few days but will do it on my return.

@jreback
Copy link
Contributor

jreback commented Dec 23, 2018

closing as stale. if you want to continue, pls ping.

@jreback jreback closed this Dec 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DataFrame DataFrame data structure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants