COMPAT: .unique() should be type preserving #15442

Closed
jreback opened this issue Feb 17, 2017 · 1 comment · Fixed by #29515


jreback commented Feb 17, 2017

xref https://github.com/pandas-dev/pandas/pull/15439/files#r101811209

In [1]: Series([1, 2, 2], dtype='int8').unique()
Out[1]: array([1, 2])

In [3]: Series([1, 2, 2], dtype='int32').unique()
Out[3]: array([1, 2])

In [4]: Series([1, 2, 2], dtype='int64').unique()
Out[4]: array([1, 2])

In [5]: Series([1, 2, 2], dtype='float32').unique()
Out[5]: array([ 1.,  2.])

These should all preserve the original dtype.

We only create hashtables (and use them in unique) for int64, float64 and uint64; we should do this for the other dtypes as well.
https://github.com/pandas-dev/pandas/blob/master/pandas/src/hashtable_class_helper.pxi.in#L222

I thought there was an issue about this, but guess not.

jreback added the Compat (pandas objects compatibility with NumPy or Python functions), Difficulty Intermediate, and Dtype Conversions (unexpected or buggy dtype conversions) labels Feb 17, 2017
jreback added this to the 0.20.0 milestone Feb 17, 2017
jreback modified the milestones: 0.20.0, 0.21.0 Mar 23, 2017
jreback modified the milestones: 0.21.0, Next Major Release Sep 23, 2017
@mroeschke

This looks to be fixed on master: the original dtype is now preserved. Could use a test.

In [82]: Series([1, 2, 2], dtype='int8').unique()
Out[82]: array([1, 2], dtype=int8)

In [83]: Series([1, 2, 2], dtype='int32').unique()
Out[83]: array([1, 2], dtype=int32)

In [84]: Series([1, 2, 2], dtype='int64').unique()
Out[84]: array([1, 2])

In [85]: Series([1, 2, 2], dtype='float32').unique()
Out[85]: array([1., 2.], dtype=float32)

In [86]: pd.__version__
Out[86]: '0.26.0.dev0+684.g953757a3e'
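
A regression test for this could look roughly like the sketch below. The helper name `check_unique_preserves_dtype` and the `DTYPES` list are illustrative only and are not taken from the pandas test suite. The dtypes are the ones used in the examples in this issue, and the sketch assumes `unique()` returns a NumPy array for these dtypes.

```python
import numpy as np
import pandas as pd

# Dtypes taken from the examples in this issue (illustrative, not exhaustive).
DTYPES = ["int8", "int32", "int64", "float32"]


def check_unique_preserves_dtype(dtype):
    """Assert that Series.unique() keeps the input dtype; return the result."""
    ser = pd.Series([1, 2, 2], dtype=dtype)
    result = ser.unique()
    expected = np.array([1, 2], dtype=dtype)
    # The dtype must match the original, not be upcast to int64/float64.
    assert result.dtype == expected.dtype
    np.testing.assert_array_equal(result, expected)
    return result


for dtype in DTYPES:
    check_unique_preserves_dtype(dtype)
```

In an actual test module the loop would more likely be expressed with `pytest.mark.parametrize` over the dtype list, so that each dtype is reported as a separate test case.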

mroeschke added the good first issue and Needs Tests (unit test(s) needed to prevent regressions) labels and removed the Compat (pandas objects compatibility with NumPy or Python functions) and Dtype Conversions (unexpected or buggy dtype conversions) labels Oct 27, 2019
jreback modified the milestones: Contributions Welcome, 1.0 Nov 14, 2019