BUG: cut does not respect order of passed labels #16459
Comments
jorisvandenbossche added Categorical Regression labels May 23, 2017
jorisvandenbossche added this to the 0.20.2 milestone May 23, 2017
jorisvandenbossche changed the title from "DocTest Shows Wrong Order" to "BUG: cut does not respect order of passed labels" May 23, 2017
I'm looking into this.
If I have a categorical, and I want it to be ordered, and I pass a list of strings as labels, it doesn't seem like I'd ever want the sorted version of those labels to be my rank ordering. Is there a case where I'd actually want to sort those strings alphabetically?
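For reference, here is a minimal sketch of the behaviour being asked about. It assumes a pandas release that already includes the fix for this issue; the variable names are illustrative only. It compares the category order that comes back from pd.cut with what an alphabetical sort of the same labels would give.

```python
import numpy as np
import pandas as pd

# Same data and labels as the doctest quoted in this issue.
values = np.array([0.2, 1.4, 2.5, 6.2, 9.7, 2.1])
result = pd.cut(values, 3, labels=["good", "medium", "bad"])

# Order of the categories as stored on the resulting Categorical.
# With the fix in place this should match the order the labels were passed in.
passed_order = list(result.categories)

# What an alphabetical sort of the same labels would look like.
alphabetical = sorted(passed_order)

print(passed_order)
print(alphabetical)
print(result.ordered)
```

If the two printed lists are identical on your installation, the labels were sorted and the rank ordering no longer reflects the bins.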
@economy

So, if I'm reading this correctly, updating the test to explicitly set

I think we need a new test in Ideally you would

@TomAugspurger, I can create a new test for this, but if I update

Do you mean the one reported by @ProsperousHeart? That is a doctest, and doctests aren't set up to run automatically (yet; @ProsperousHeart is getting them fixed up so that we can do that). In the meantime we need a regular unit test to verify that the fix works.
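A regular unit test along these lines might look like the sketch below. The test name and its location are hypothetical, not taken from the pandas test suite, and it assumes a pandas version with the fix applied. The labels are deliberately not in alphabetical order, so a sort would make the assertions fail.

```python
import numpy as np
import pandas as pd


def test_cut_respects_label_order():
    # Hypothetical test name; data taken from the doctest in this issue.
    values = np.array([0.2, 1.4, 2.5, 6.2, 9.7, 2.1])
    labels = ["good", "medium", "bad"]

    result = pd.cut(values, 3, labels=labels)

    # Expected result: an ordered categorical whose categories keep
    # the order in which the labels were passed.
    expected = pd.Categorical(
        ["good", "good", "good", "medium", "bad", "good"],
        categories=labels,
        ordered=True,
    )

    assert list(result.categories) == labels
    assert result.ordered
    assert list(result) == list(expected)
    assert result.dtype == expected.dtype


test_cut_respects_label_order()
```

Under pytest the final direct call is unnecessary; it is there only so the snippet can be run as a plain script.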
economy referenced this issue May 23, 2017
Closed: BUG: fixed wrong order of ordered labels in pd.cut() #16466
economy added a commit to economy/pandas that referenced this issue May 23, 2017 (economy, d8bda87)
TomAugspurger added a commit to economy/pandas that referenced this issue May 30, 2017 (economy + TomAugspurger, ac78a64)
jreback closed this in d419be4 Jun 1, 2017
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Jun 1, 2017 (economy + TomAugspurger, 4806a11)
TomAugspurger added a commit that referenced this issue Jun 4, 2017 (economy + TomAugspurger, 418dcb3)
Kiv added a commit to Kiv/pandas that referenced this issue Jun 11, 2017 (economy + Kiv, 785887a)
stangirala added a commit to stangirala/pandas that referenced this issue Jun 11, 2017 (economy + stangirala, 2cecd75)
guillemborrell added a commit to guillemborrell/pandas that referenced this issue Jul 7, 2017 (economy + guillemborrell, 98ce9eb)
ProsperousHeart commented May 23, 2017 • edited by jorisvandenbossche
This issue was found when working on #16432
Examples
--------
>>> pd.cut(np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1]), 3, retbins=True)
([(0.19, 3.367], (0.19, 3.367], (0.19, 3.367], (3.367, 6.533], (6.533, 9.7], (0.19, 3.367]]
Categories (3, interval[float64]): [(0.19, 3.367] < (3.367, 6.533] < ... (6.533, 9.7]],
array([ 0.1905 , 3.36666667, 6.53333333, ... 9.7 ]))

>>> result = pd.cut(np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1]),
...                 3, labels=["good","medium","bad"])
[good, good, good, medium, bad, good]
Categories (3, object): [good < medium < bad]

Problem description
When running pytest -x --doctest-modules core/reshape/tile.py against the work inside of the "reshape-3439" branch, we see the following:
This is backwards.
Expected Output
Should clear this - this is not an issue.
Output of pd.show_versions()