Hi! pa.Schema.from_pandas called on a dataframe whose index is a pandas extension dtype (e.g., string[python]) results in an error:
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({"a": [1, 2]}, index=pd.Index(["A", "B"], dtype="string"))
pa.Schema.from_pandas(df)
produces
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_1827952/3691394220.py in <module>
      1 import pyarrow as pa
      2 df = pd.DataFrame({"a": [1, 2]}, index=pd.Index(["A", "B"], dtype="string"))
----> 3 pa.Schema.from_pandas(df)

~/miniconda3/envs/dask/lib/python3.8/site-packages/pyarrow/types.pxi in pyarrow.lib.Schema.from_pandas()

~/miniconda3/envs/dask/lib/python3.8/site-packages/pyarrow/pandas_compat.py in dataframe_to_types(df, preserve_index, columns)
    527             type_ = pa.array(c, from_pandas=True).type
    528         elif _pandas_api.is_extension_array_dtype(values):
--> 529             type_ = pa.array(c.head(0), from_pandas=True).type
    530         else:
    531             values, type_ = get_datetimetz_type(values, c.dtype, None)

AttributeError: 'Index' object has no attribute 'head'
If I remove the head call, or convert the index to a series manually, things work.
Reported downstream in dask/dask#9186
Related issue from a couple of years ago: https://issues.apache.org/jira/browse/ARROW-8159
Reporter: Ian Rose
Assignee: James Bourbeau / @jrbourbeau
Note: This issue was originally created as ARROW-16838. Please see the migration documentation for further details.
Joris Van den Bossche / @jorisvandenbossche: Issue resolved by pull request #14080