0.17 drop_duplicates() incorrectly dropping non-unique values #11459

Closed
amanhanda opened this issue Oct 28, 2015 · 1 comment
Labels
Duplicate Report Duplicate issue or pull request Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@amanhanda

Under pandas 0.17 with numpy 1.10, drop_duplicates() drops rows that are not duplicates:

In [32]: import pandas

In [33]: df = pandas.DataFrame(data={"a" : [0, 1, 3, 4], "b":[20, 16, 8, 4]})

In [34]: df
Out[34]:
   a   b
0  0  20
1  1  16
2  3   8
3  4   4

In [35]: df.drop_duplicates(['a','b'], keep='last')
Out[35]:
   a  b
3  4  4

0.16.2 has the correct behavior: all four rows are unique, so none are dropped.

In [6]: import pandas

In [7]: df = pandas.DataFrame(data={"a" : [0, 1, 3, 4], "b":[20, 16, 8, 4]})

In [8]: df
Out[8]:
   a   b
0  0  20
1  1  16
2  3   8
3  4   4

In [9]: df.drop_duplicates(['a','b'])
Out[9]:
   a   b
0  0  20
1  1  16
2  3   8
3  4   4
@jreback
Contributor

jreback commented Oct 28, 2015

Thanks for the report. This is a dupe of #11376.

It was already fixed in #11403 and will be in the forthcoming 0.17.1 (it's in master now).
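
For anyone who wants to check whether their installed pandas shows the problem, here is a minimal sketch that reuses the frame from the report. The variable names and the assertions are illustrative additions, not taken from the fix in #11403. On an affected build the first assertion should fail, since only one row comes back.

```python
import pandas as pd

# Frame from the report: every (a, b) pair is unique.
df = pd.DataFrame(data={"a": [0, 1, 3, 4], "b": [20, 16, 8, 4]})

# No row should be flagged as a duplicate of another row.
dup_mask = df.duplicated(subset=["a", "b"], keep="last")

# With no duplicates present, drop_duplicates should return the frame unchanged.
result = df.drop_duplicates(subset=["a", "b"], keep="last")

assert len(result) == len(df), "rows were dropped although none are duplicates"
assert not dup_mask.any(), "unique rows were flagged as duplicates"
print(result)
```

The `keep` argument is used here because the failing call in the report used `keep='last'`; the 0.16.2 session above was run without it.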

@jreback jreback closed this as completed Oct 28, 2015
@jreback jreback added Reshaping Concat, Merge/Join, Stack/Unstack, Explode Duplicate Report Duplicate issue or pull request labels Oct 28, 2015