Bug in pyarrow.Table.from_pandas() when the input has a MultiIndex with non-string level names #38983

Open
qsourav opened this issue Nov 29, 2023 · 3 comments

qsourav commented Nov 29, 2023

Describe the bug, including details regarding any error messages, version, and platform.

When the input DataFrame has a MultiIndex whose level names are not strings, pa.Table.from_pandas() emits the first index level twice (both columns named 1) and drops the second level:

import pandas as pd
import pyarrow as pa

print(f"version: {pa.__version__}")

# MultiIndex whose level names are integers rather than strings
index = pd.MultiIndex.from_tuples([(10, 20), (30, 40), (50, 60), (70, 80)], names=[1, 2])
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]}, index=index)
print(df)
print(pa.Table.from_pandas(df))

Outputs:

version: 14.0.1
sys:1: UserWarning: The DataFrame has non-str index name `[1, 1]` which will be converted to string and not roundtrip correctly.
pyarrow.Table
a: int64
b: int64
1: int64
1: int64
----
a: [[1,2,3,4]]
b: [[5,6,7,8]]
1: [[10,30,50,70]]
1: [[10,30,50,70]]
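
The duplicated column can be reproduced with pandas alone. `MultiIndex.get_level_values` accepts either a position or a level name, and a name match takes precedence, so with integer names an intended position can be read as a name. The sketch below assumes the conversion code looks up levels by position `0, 1, ...`; that assumption is not confirmed by the output above.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(10, 20), (30, 40), (50, 60), (70, 80)], names=[1, 2]
)

# 0 is not a level name, so it is treated as a position: the first level.
by_zero = index.get_level_values(0)

# 1 IS a level name, so it selects the level named 1 (the first level again),
# not the level at position 1.
by_one = index.get_level_values(1)

print(by_zero.tolist())  # [10, 30, 50, 70]
print(by_one.tolist())   # [10, 30, 50, 70]
print(by_one.name)       # 1
```

Both lookups return the level named 1, which matches the two identical `1` columns in the table.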

Component(s)

Python

qsourav commented Nov 29, 2023

The following workaround, which changes _get_index_level_values, seems to fix it:

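def _get_index_level_values(index):
    # Number of levels: a plain Index counts as a single level
    n = len(getattr(index, 'levels', [index]))
    if isinstance(index, _pandas_api.pd.MultiIndex):
        # Positional access, so integer level names cannot be mistaken for positions.
        # Note: index.levels[i] holds the unique values of level i, not one value per row.
        return [index.levels[i] for i in range(n)]
    else:
        return [index.get_level_values(i) for i in range(n)]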
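
One caveat with the workaround above: `index.levels[i]` contains each level's unique values, so it only lines up with the rows when no value repeats within a level. A possible alternative, sketched here with plain pandas, is to clear the names before the positional lookup and restore them afterwards. The function name `get_index_level_values` is hypothetical, and this is not a proposed pyarrow patch.

```python
import pandas as pd


def get_index_level_values(index):
    """Return one row-aligned Index per level, looked up strictly by position."""
    if isinstance(index, pd.MultiIndex):
        # With all names set to None, an integer argument can only be a position.
        unnamed = index.set_names([None] * index.nlevels)
        return [
            unnamed.get_level_values(i).rename(name)
            for i, name in enumerate(index.names)
        ]
    # A plain Index is its own single level.
    return [index]


# Repeated values within a level show the difference from index.levels.
idx = pd.MultiIndex.from_tuples([(10, 20), (10, 40), (30, 20)], names=[1, 2])
levels = get_index_level_values(idx)
print([lv.tolist() for lv in levels])  # [[10, 10, 30], [20, 40, 20]]
print(idx.levels[0].tolist())          # [10, 30]
```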

qsourav commented Nov 29, 2023

The problem seems to be in how the "levels" parameter is handled when the index level names are not strings. Even names = ["1", 2] works fine!
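
Building on that observation, a user-side workaround is to convert the index level names to strings before the conversion. The helper below is a minimal pandas-only sketch; `with_string_index_names` is a hypothetical name, not part of pandas or pyarrow.

```python
import pandas as pd


def with_string_index_names(df):
    """Return a copy of df whose index level names are strings (None is kept)."""
    new_names = [None if name is None else str(name) for name in df.index.names]
    out = df.copy()
    out.index = out.index.set_names(new_names)
    return out


index = pd.MultiIndex.from_tuples(
    [(10, 20), (30, 40), (50, 60), (70, 80)], names=[1, 2]
)
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]}, index=index)

fixed = with_string_index_names(df)
print(list(fixed.index.names))                    # ['1', '2']
# With string names, the integer 1 can only mean "position 1".
print(fixed.index.get_level_values(1).tolist())   # [20, 40, 60, 80]
```

Passing `fixed` instead of `df` to `pa.Table.from_pandas()` should then give distinct index columns, although the original integer names will not round-trip.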

qsourav commented Nov 29, 2023

This might be related to:
pandas-dev/pandas#16160
pandas-dev/pandas#2770
