Feature request: sorted() methods on everything #9816
Comments
See #8239 for much of the same discussion. This would actually be a nice solution, as changing the existing behavior of order/sort is backward incompatible. Pull requests are welcome!
jreback added this to the 0.17.0 milestone Apr 5, 2015
Thanks for the vote of interest; I will look for the Pandas team at the PyCon sprints :)
Awesome! This would be amazing to do then!
Agreed, this is a nice solution!
jreback changed the title from Feature request: `sorted()` methods on everything to Feature request: sorted() methods on everything Apr 5, 2015
jorisvandenbossche removed the Reshaping label Apr 6, 2015
From me a +1. But I think there are some other aspects of the interface that need discussion (as seen in #8239): how to specify sorting on a certain column or on a column/index combination, whether the default should be to sort on the index or on the values, ...
BrenBarn commented Apr 26, 2015
Is there a reason that we can't just add an
`order` is actually an odd term (from R, I believe), and sort/sorted is more Pythonic. The intention would be to replicate sort for DataFrame and order for Series.
This was referenced Jul 7, 2015
jreback added a commit to jreback/pandas that referenced this issue Aug 18, 2015: 13d2d71
brandon-rhodes commented Apr 5, 2015
It would make Pandas easier to teach, easier to learn, and easier to use if the sorting behavior were the same between series and dataframes. But the existing `order()` and `sort()` methods are locked into their old behaviors by all of the code that already depends on them.

But a new `sorted()` method could bring symmetry between series and dataframes for code written from now on.

Having this new pair of methods with identical conventions, where possible, would solve several different problems that learners have with Pandas today:

- `Series.sort()` is a special case.
- In Python, a `sort()` method traditionally returns `None` and does an in-place sort, but learners have to discover that `DataFrame.sort()` violates this convention in order to match the behavior of the rest of Pandas.
- `Series.order()` is very difficult to discover, as nothing else in the Python ecosystem is named `order()`, and since one would normally expect an `order()` method to tell you the order (ascending? descending? none?) instead of imposing a new order.
- Python programmers already know `sorted()`, per the universally loved Python built-in, but learners cannot transfer this knowledge to Pandas, where that concept exists but under the two different names `Series.order()` and `DataFrame.sort()`.

Yes, the `ed` at the end of `sorted()` would be one character longer than `order()` and two characters longer than the current practice of `df.sort()`. But, on balance, I think that most programmers would happily cede two characters in order to be able to use the same method name when they are flipping code between handling series and handling dataframes, and would be happy to have the option of using the standard Python name for the concept of a non-in-place sort.

I suspect that deprecating the old names would be overly disruptive at this point, and they could probably live alongside the new `sorted()` methods without much trouble. New documentation could adopt the new, consistent terminology where possible, if the Pandas developers did not want to disrupt current users of the old inconsistent names.
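
To make the convention concrete, here is a minimal sketch in plain Python. The first half shows the built-in behavior the proposal leans on: `list.sort()` sorts in place and returns `None`, while `sorted()` returns a new list. The second half uses a toy class, `TinySeries`, which is purely hypothetical and is not part of Pandas; it and its `ascending` parameter are assumptions made for illustration of what a non-in-place `sorted()` method on a labeled container might look like.

```python
# Built-in convention: sorted() returns a new list and leaves the input alone.
values = [3, 1, 2]
result = sorted(values)
assert result == [1, 2, 3]
assert values == [3, 1, 2]

# Built-in convention: list.sort() works in place and returns None.
returned = values.sort()
assert returned is None
assert values == [1, 2, 3]


class TinySeries:
    """Hypothetical stand-in for a labeled 1-D container (not a Pandas class)."""

    def __init__(self, data):
        # Maps label -> value; dicts preserve insertion order.
        self.data = dict(data)

    def sorted(self, ascending=True):
        # Order the (label, value) pairs by value and return a NEW object;
        # the original is left untouched, mirroring the built-in sorted().
        items = sorted(
            self.data.items(),
            key=lambda pair: pair[1],
            reverse=not ascending,
        )
        return TinySeries(items)


s = TinySeries({"a": 3, "b": 1, "c": 2})
t = s.sorted()

assert list(t.data) == ["b", "c", "a"]   # new object, ordered by value
assert list(s.data) == ["a", "b", "c"]   # original order preserved
```

The point of the sketch is only the naming convention: a method called `sorted()` that returns a new object would behave the way Python programmers already expect from the built-in, regardless of which container it is called on.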