When someone has a custom ExtensionType defined in Python, and an array class that gets converted to that (through \_\_arrow_array\_\_), the conversion in pyarrow works with the array class, but not yet for the array stored in a pandas DataFrame.
In [15]: pd_array = pd.period_range("2012-01-01", periods=3, freq="D").array
In [16]: pd_array
Out[16]:
<PeriodArray>
['2012-01-01', '2012-01-02', '2012-01-03']
Length: 3, dtype: period[D]
In [17]: pa.array(pd_array)
Out[17]:
<pyarrow.lib.ExtensionArray object at 0x7f657cf78768>
[
15340,
15341,
15342
]
In [18]: df = pd.DataFrame({'periods': pd_array})
In [19]: pa.table(df)
...
ArrowInvalid: ('Could not convert 2012-01-01 with type Period: did not recognize Python value type when inferring an Arrow data type', 'Conversion failed for column periods with type period[D]')
(This works correctly for array objects whose \_\_arrow_array\_\_ returns a built-in pyarrow Array.)
Joris Van den Bossche / @jorisvandenbossche:
In the end, this turned out not to be related to the fact that they return an Arrow ExtensionType array; it was a bug specific to pandas' Interval and Period types, as those types have somewhat inconsistent (historical) behaviour for Series.values (they return an object ndarray instead of the extension array).
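The inconsistency can be seen with a small sketch, assuming a pandas version in which `Series.values` still has this historical behaviour for Period and Interval data. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

periods = pd.Series(pd.period_range("2012-01-01", periods=3, freq="D"))
intervals = pd.Series(pd.interval_range(start=0, end=3))

for s in (periods, intervals):
    values = s.values  # historical behaviour: object ndarray of scalars
    array = s.array    # the underlying pandas extension array
    print(s.dtype, "|", type(values).__name__, values.dtype, "|", type(array).__name__)

# Flags used to check the behaviour described in the comment above.
period_values_are_objects = (
    isinstance(periods.values, np.ndarray) and periods.values.dtype == object
)
interval_values_are_objects = (
    isinstance(intervals.values, np.ndarray) and intervals.values.dtype == object
)
```

Code that reads a column through `.values` therefore only sees Python `Period` or `Interval` scalars and never reaches the extension array's `__arrow_array__`, which would be consistent with the "did not recognize Python value type" error in the traceback.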
(The example in the description above uses my definition of ArrowPeriodType in pandas-dev/pandas#28371.)
Reporter: Joris Van den Bossche / @jorisvandenbossche
Assignee: Joris Van den Bossche / @jorisvandenbossche
Note: This issue was originally created as ARROW-7022. Please see the migration documentation for further details.