[Python] Failing tests because pandas.Index can now store all numeric dtypes (not only 64bit versions) #34404

jorisvandenbossche · 2023-03-01T18:11:53Z

We have several failing tests in the nightly build (https://github.com/ursacomputing/crossbow/actions/runs/4277727973/jobs/7446784501) because of a change in pandas 2.0: the Index can now store all numeric dtypes, and not just int64/uint64/float64, see https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html#index-can-now-hold-numpy-numeric-dtypes.

Failing tests because of this:

FAILED python/pyarrow/tests/test_pandas.py::test_table_from_pandas_schema_index_columns - AssertionError: DataFrame.index are different
FAILED python/pyarrow/tests/parquet/test_dataset.py::test_read_partitioned_directory[False] - AssertionError: Attributes of DataFrame.iloc[:, 2] (column name="foo") are different
FAILED python/pyarrow/tests/parquet/test_dataset.py::test_read_partitioned_directory_s3fs[False] - AssertionError: Attributes of DataFrame.iloc[:, 2] (column name="foo") are different

I think in all of these cases an int32 dtype is now preserved, whereas pandas would previously have cast it to int64. The expected results in the tests still use int64, which causes the failures.
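To make the dtype change concrete, here is a minimal sketch of the behaviour difference. It is not taken from the failing tests. The variable names are illustrative, and the version check assumes the behaviour switches at pandas major version 2, as the linked whatsnew entry describes.

```python
import numpy as np
import pandas as pd

# Build an Index from 32-bit integer data.
values = np.array([1, 2, 3], dtype="int32")
idx = pd.Index(values)

# Assumption: pandas >= 2.0 keeps int32, older versions upcast to int64.
pandas_major = int(pd.__version__.split(".")[0])
expected_dtype = np.dtype("int32") if pandas_major >= 2 else np.dtype("int64")

# A test that hard-codes int64 here would fail on pandas >= 2.0.
assert idx.dtype == expected_dtype
print(pd.__version__, idx.dtype)
```

A test that has to pass on both sides of the change could pick its expected dtype in this version-dependent way instead of hard-coding int64.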


…ll numeric dtypes (not only 64bit versions) (#34498)

### Rationale for this change

Several failing tests in the nightly build (https://github.com/ursacomputing/crossbow/actions/runs/4277727973/jobs/7446784501)

### What changes are included in this PR?

Due to change in supported dtypes for Index in pandas, the tests expecting `int64` and not `int32` are failing with dev version of pandas. The failing tests are updated to match the new pandas behaviour.

* Closes: #34404

Authored-by: Alenka Frim <frim.alenka@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>

jorisvandenbossche added Component: Python Type: test labels Mar 1, 2023

jorisvandenbossche added this to the 12.0.0 milestone Mar 1, 2023

AlenkaF self-assigned this Mar 2, 2023

github-actions bot mentioned this issue Mar 8, 2023

GH-34404: [Python] Failing tests because pandas.Index can now store all numeric dtypes (not only 64bit versions) #34498

Merged

jorisvandenbossche closed this as completed in #34498 Mar 10, 2023

noloerino mentioned this issue May 3, 2023

BUG: test_read_parquet_pandas_index[pyarrow] is broken at main due to pyarrow 12.0 modin-project/modin#6072

Closed

3 tasks
