TST: DataFrame operations with ExtensionArrays #28506

Open
jorisvandenbossche opened this issue Sep 18, 2019 · 1 comment
Labels
Enhancement ExtensionArray Extending pandas with custom dtypes or arrays. Testing pandas testing functions or related to the test suite

jorisvandenbossche commented Sep 18, 2019

@jbrockmendel the use case I was trying to explain in words is this (which currently works):

In [4]: df1 = pd.DataFrame({'a': DecimalArray(make_data()[:5]), 'b': DecimalArray(make_data()[:5])}) 

In [5]: df1 
Out[5]: 
                                                   a                                                  b
0  Decimal: 0.12461374221703280795736645814031362...  Decimal: 0.52335889201915997137604108502273447...
1  Decimal: 0.31309497234009509014640570967458188...  Decimal: 0.00133572791466307627672449598321691...
2  Decimal: 0.84547604466523051947035582998069003...  Decimal: 0.73944523257987115893996588056324981...
3  Decimal: 0.78411203581007593577112402272177860...  Decimal: 0.60960149820278708432397252181544899...
4  Decimal: 0.73864636517230020107405152884894050...  Decimal: 0.53212511863152778257557429242297075...

In [6]: df1.dtypes 
Out[6]: 
a    decimal
b    decimal
dtype: object

In [7]: df2 = pd.DataFrame({'a': np.arange(5), 'b': np.arange(5)})

In [8]: df1 + df2 
Out[8]: 
                                         a                                        b
0  Decimal: 0.1246137422170328079573664581  Decimal: 0.5233588920191599713760410850
1   Decimal: 1.313094972340095090146405710   Decimal: 1.001335727914663076276724496
2   Decimal: 2.845476044665230519470355830   Decimal: 2.739445232579871158939965881
3   Decimal: 3.784112035810075935771124023   Decimal: 3.609601498202787084323972522
4   Decimal: 4.738646365172300201074051529   Decimal: 4.532125118631527782575574292

In [9]: (df1 + df2).dtypes 
Out[9]: 
a    decimal
b    decimal
dtype: object

As you mentioned, in those cases the ops code would keep operating column by column (or 1D block by 1D block), and the right-hand 2D integer block will be split.
But I thought you mentioned something about such a case currently not being covered by the tests? (in which case it would be good to add some)
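
A test for this case could look roughly like the sketch below. It is only an illustration under assumptions: the built-in nullable `Int64` extension dtype stands in for `DecimalArray` (which lives in pandas' own test suite), and the function name `test_frame_add_extension_and_numpy_columns` is hypothetical, not an existing pandas test.

```python
import numpy as np
import pandas as pd
import pandas.testing as tm


def test_frame_add_extension_and_numpy_columns():
    # Assumption: "Int64" is used as a stand-in for DecimalArray.
    df1 = pd.DataFrame(
        {
            "a": pd.array([1, 2, 3, 4, 5], dtype="Int64"),
            "b": pd.array([10, 20, 30, 40, 50], dtype="Int64"),
        }
    )
    # Two integer columns, as in the issue's df2.
    df2 = pd.DataFrame({"a": np.arange(5), "b": np.arange(5)})

    result = df1 + df2

    # Build the expected frame column by column, mirroring how the
    # operation is described in the comment above.
    expected = pd.DataFrame({col: df1[col] + df2[col] for col in df1.columns})
    tm.assert_frame_equal(result, expected)

    # The extension dtype should be preserved in every column.
    assert all(isinstance(dtype, pd.Int64Dtype) for dtype in result.dtypes)


test_frame_add_extension_and_numpy_columns()
```

Comparing against per-column Series results keeps the test independent of how the integer block is split internally; it only checks that the frame-level op agrees with the 1D ops.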

@jorisvandenbossche jorisvandenbossche added ExtensionArray Extending pandas with custom dtypes or arrays. Testing pandas testing functions or related to the test suite labels Sep 18, 2019
@jbrockmendel

But I thought you mentioned something about such a case currently not being covered by the tests?

Not sure if this is the case I was referring to, but df2.add(df1["a"], axis=0) currently raises
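
One way to check the `df2.add(df1["a"], axis=0)` call on a given pandas version is a small probe like the one below. This is a sketch under assumptions: `probe_frame_add_series` is a hypothetical helper, and an `Int64` column again stands in for the `DecimalArray` column, so the outcome may differ from the decimal case discussed here.

```python
import numpy as np
import pandas as pd


def probe_frame_add_series(frame, series):
    """Return ("ok", result) or ("raised", exception) for frame.add(series, axis=0)."""
    try:
        return "ok", frame.add(series, axis=0)
    except Exception as exc:  # broad on purpose: we only record what happened
        return "raised", exc


df2 = pd.DataFrame({"a": np.arange(5), "b": np.arange(5)})
# Assumption: an Int64-backed Series replaces the DecimalArray-backed df1["a"].
ea_column = pd.Series(pd.array([1, 2, 3, 4, 5], dtype="Int64"), name="a")

status, outcome = probe_frame_add_series(df2, ea_column)
print(status, type(outcome).__name__)
```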
