Backport PR #43150 on branch 1.3.x (BUG: GroupBy.quantile fails with pd.NA) #43417

Merged

debnathshoham
Member

@debnathshoham debnathshoham commented Sep 5, 2021

Backport PR #43150

@debnathshoham
Member Author

debnathshoham commented Sep 5, 2021

@jreback @simonjayhawkins
The result index behaves differently on 1.3.x vs. master when the grouping column of df has a nullable dtype: with an Int64 key, 1.3.x returns an object-dtype Index, while master returns an Int64Index (comparison below).

So I think there are two options:

  1. Backport the PR that changed this behaviour (I was not able to find that PR, though).
  2. Hardcode the test to pass (which I don't think we should do).

In [60]: pd.__version__
Out[60]: '1.3.2+16.g6a54ac4a0a'

In [61]: df = DataFrame({"x": [1, 1], "y": [0.2, np.nan]})

In [62]: df.dtypes
Out[62]: 
x      int64
y    float64
dtype: object

In [63]: df.groupby("x")["y"].quantile(0.5).index
Out[63]: Int64Index([1], dtype='int64', name='x')

In [64]: df=df.astype({"x":"Int64"})

In [65]: df.dtypes
Out[65]: 
x      Int64
y    float64
dtype: object

In [66]: df.groupby("x")["y"].quantile(0.5).index
Out[66]: Index([1], dtype='object', name='x')

In [13]: pd.__version__
Out[13]: '1.4.0.dev0+598.g35d52ff2d5'

In [14]: df = DataFrame({"x": [1, 1], "y": [0.2, np.nan]})

In [15]: df.dtypes
Out[15]: 
x      int64
y    float64
dtype: object

In [16]: df.groupby("x")["y"].quantile(0.5).index
Out[16]: Int64Index([1], dtype='int64', name='x')

In [17]: df=df.astype({"x":"Int64"})

In [18]: df.dtypes
Out[18]: 
x      Int64
y    float64
dtype: object

In [19]: df.groupby("x")["y"].quantile(0.5).index
Out[19]: Int64Index([1], dtype='int64', name='x')
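
For anyone who wants to repeat the comparison on another build, the two sessions above can be condensed into a small script. This is only a sketch: the helper name `quantile_index_dtypes` is made up for illustration and is not part of pandas or of this PR. It reports whatever index dtype the installed version produces for the nullable key, without assuming which one that is.

```python
import numpy as np
import pandas as pd


def quantile_index_dtypes(q=0.5):
    """Hypothetical helper: dtype of the result index for each kind of grouping key."""
    # Same frame as in the sessions above: one group, one missing value in "y".
    df = pd.DataFrame({"x": [1, 1], "y": [0.2, np.nan]})

    # Grouping key with the default NumPy dtype (int64).
    plain = df.groupby("x")["y"].quantile(q).index.dtype

    # Same data, but the grouping key is cast to the nullable Int64 dtype.
    nullable_df = df.astype({"x": "Int64"})
    nullable = nullable_df.groupby("x")["y"].quantile(q).index.dtype

    return {"int64": str(plain), "Int64": str(nullable)}


if __name__ == "__main__":
    # The value under "Int64" is the part that differs between the two builds above.
    print(pd.__version__, quantile_index_dtypes())
```

Running it on each branch and comparing the `"Int64"` entry shows the same difference as Out[66] vs. Out[19].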

@jreback jreback changed the title from "Auto backport of pr 43150 on 1.3.x" to "BUG: GroupBy.quantile fails with pd.NA #43150" Sep 6, 2021
@jreback jreback added the Groupby and NA - MaskedArrays (Related to pd.NA and nullable extension arrays) labels Sep 6, 2021
@jreback jreback added this to the 1.3.3 milestone Sep 6, 2021
@simonjayhawkins simonjayhawkins changed the title from "BUG: GroupBy.quantile fails with pd.NA #43150" to "Backport PR #43150 on branch 1.3.x (BUG: GroupBy.quantile fails with pd.NA)" Sep 7, 2021
@debnathshoham debnathshoham force-pushed the auto-backport-of-pr-43150-on-1.3.x branch from 22cf22f to a55b74b Compare September 9, 2021 07:19
@jreback jreback merged commit 5d6e352 into pandas-dev:1.3.x Sep 9, 2021
@jreback
Contributor

jreback commented Sep 9, 2021

thanks @debnathshoham

@debnathshoham debnathshoham deleted the auto-backport-of-pr-43150-on-1.3.x branch September 9, 2021 19:23