[REVIEW] Drop `DataFrame.append` and `Series.append` #12839

galipremsagar · 2023-02-24T00:40:49Z

Description

This PR removes `DataFrame.append` and `Series.append` to match the pandas 2.0 API. Usages in the tests are replaced with `concat` calls.
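For anyone migrating their own code, here is a minimal sketch of the replacement pattern. It uses pandas so it runs without a GPU; the assumption is that `cudf.concat` accepts the same `ignore_index` argument, as the test diff in this PR suggests. The frame and series contents are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
other = pd.DataFrame({"a": [5], "b": [6]})

# Before (removed in pandas 2.0): df.append(other, ignore_index=True)
result = pd.concat([df, other], ignore_index=True)

s = pd.Series([1, 2])
s_other = pd.Series([3])

# Before (removed in pandas 2.0): s.append(s_other, ignore_index=True)
s_result = pd.concat([s, s_other], ignore_index=True)
```

Unlike `append`, `concat` takes a list of objects, so several frames can be combined in one call rather than through chained appends.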

Pytest runs related to these changes:

(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_dataframe.py::test_concat_index
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_dataframe.py ....                                                                                                                                                            [100%]

============================================================================================== 4 passed in 1.68s ===============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_dataframe.py::test_axes
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_dataframe.py ...........                                                                                                                                                     [100%]

============================================================================================== 11 passed in 1.68s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_dataframe.py::test_concat_different_column_dataframe
============================================================================================= test session starts ==============================================================================================                                                                                                                                                                                           
python/cudf/cudf/tests/test_dataframe.py ............                                                                                                                                                    [100%]

============================================================================================== 12 passed in 1.74s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_dataframe.py::test_dataframe_concat_dataframe
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_dataframe.py ............................................................................................................................................................... [ 25%]
........................................................................................................................................................................................................ [ 57%]
........................................................................................................................................................................................................ [ 89%]
.................................................................                                                                                                                                        [100%]

============================================================================================= 624 passed in 4.68s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_dataframe.py::test_dataframe_concat_series
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_dataframe.py ................................................................                                                                                                [100%]

============================================================================================== 64 passed in 1.98s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_dataframe.py::test_dataframe_concat_series_mixed_index
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_dataframe.py .                                                                                                                                                               [100%]

============================================================================================== 1 passed in 1.64s ===============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_dataframe.py::test_dataframe_concat_dataframe_lists
============================================================================================= test session starts ==============================================================================================
python/cudf/cudf/tests/test_dataframe.py ............................................................................................................................................................... [ 30%]
........................................................................................................................................................................................................ [ 67%]
.........................................................................................................................................................................                                [100%]

============================================================================================= 528 passed in 5.57s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_dataframe.py::test_dataframe_concat_series_without_name
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_dataframe.py .                                                                                                                                                               [100%]

============================================================================================== 1 passed in 1.63s ===============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_series.py::test_series_concat_basic
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_series.py ..................                                                                                                                                                 [100%]

============================================================================================== 18 passed in 1.20s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_series.py::test_series_concat_basic_str
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_series.py ................                                                                                                                                                   [100%]

============================================================================================== 16 passed in 1.23s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_series.py::test_series_concat_series_with_index
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_series.py ................                                                                                                                                                   [100%]

============================================================================================== 16 passed in 1.20s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_series.py::test_series_concat_error_mixed_types
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_series.py .                                                                                                                                                                  [100%]

============================================================================================== 1 passed in 1.14s ===============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_series.py::test_series_concat_list_series_with_index
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_series.py ........................                                                                                                                                           [100%]

============================================================================================== 24 passed in 1.79s ==============================================================================================
(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ pytest python/cudf/cudf/tests/test_series.py::test_series_concat_existing_buffers
============================================================================================= test session starts ==============================================================================================

python/cudf/cudf/tests/test_series.py .                                                                                                                                                                  [100%]

============================================================================================== 1 passed in 1.21s ===============================================================================================

(cudfdev) pgali@dt07:/nvme/0/pgali/cudf$ conda list | grep "pandas"
pandas                    2.0.0rc0                 pypi_0    pypi

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

mroeschke · 2023-02-24T22:21:39Z

python/cudf/cudf/tests/test_dataframe.py

+    expected = pd.concat([pdf, other_pd], sort=sort, ignore_index=ignore_index)
+    actual = cudf.concat([gdf, other_gd], sort=sort, ignore_index=ignore_index)
+
+    # In some cases, Pandas creates an empty Index([], dtype="object") for


Could you show an example where this happens? I think pandas would also return an empty RangeIndex if the object is empty-like.

Apologies for the incorrect comment earlier. The actual issue is how cudf and pandas represent empty columns. In pandas it is always an empty RangeIndex, but in cudf it is always an empty Index:

In [1]: import cudf
In [2]: import pandas as pd
In [3]: df = cudf.DataFrame()
In [4]: pdf = pd.DataFrame()
In [5]: df.columns
Out[5]: RangeIndex(start=0, stop=0, step=1)
In [6]: pdf.columns
Out[6]: Index([], dtype='object')

Because of this difference, when we concat such dataframes, pandas still returns a RangeIndex, whereas cudf returns an empty Index. That is why the test needs this special handling.
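One way to express this kind of special handling is sketched below. It is not the code from this PR: `normalize_empty_columns` is a hypothetical helper, and the example uses two pandas frames to stand in for the pandas and cudf results so that it runs without a GPU. The idea is to give an empty column index one canonical type on both sides before comparing.

```python
import pandas as pd


def normalize_empty_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: give an empty column index a canonical RangeIndex."""
    if len(df.columns) == 0:
        df = df.copy()
        df.columns = pd.RangeIndex(0)
    return df


# Two empty frames whose column indexes differ only in type.
left = pd.DataFrame(columns=pd.Index([], dtype="object"))
right = pd.DataFrame(columns=pd.RangeIndex(0))

left_norm = normalize_empty_columns(left)
right_norm = normalize_empty_columns(right)

# After normalization both column indexes have the same type and contents.
same_type = type(left_norm.columns) is type(right_norm.columns)
same_values = left_norm.columns.equals(right_norm.columns)
```

Frames with at least one column pass through unchanged, so the normalization only affects the empty case discussed here.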

Okay cool thanks for the clarification!

galipremsagar added 2 commits February 23, 2023 14:11

Drop append

12d3b0a

Drop _append

57e6470

galipremsagar added the labels "3 - Ready for Review" (Ready for review by team), "Python" (Affects Python cuDF API), "4 - Needs cuDF (Python) Reviewer", "improvement" (Improvement / enhancement to an existing function), and "breaking" (Breaking change) on Feb 24, 2023

galipremsagar requested a review from a team as a code owner February 24, 2023 00:40

galipremsagar self-assigned this Feb 24, 2023

galipremsagar requested review from bdice and mroeschke and removed request for a team February 24, 2023 00:40

galipremsagar added 2 commits February 23, 2023 16:42

fix tests

885b234

Merge remote-tracking branch 'upstream/pandas_2.0_feature_branch' int…

dc04080

…o append_2.0

galipremsagar mentioned this pull request Feb 24, 2023

[FEA] Add pandas-2.0 support for cudf #12794

Closed

galipremsagar requested a review from shwina February 24, 2023 00:44

mroeschke reviewed Feb 24, 2023

galipremsagar added 5 commits March 7, 2023 16:33

Merge branch 'pandas_2.0_feature_branch' into append_2.0

3c3d084

Merge remote-tracking branch 'upstream/pandas_2.0_feature_branch' int…

da01f0e

…o append_2.0

clarify comment

24cce96

isort

ae1f7cc

Merge branch 'pandas_2.0_feature_branch' into append_2.0

25d7043

mroeschke approved these changes Mar 10, 2023

Merge branch 'pandas_2.0_feature_branch' into append_2.0

99a397b

shwina approved these changes Mar 10, 2023

galipremsagar merged commit e115ba5 into rapidsai:pandas_2.0_feature_branch Mar 10, 2023

galipremsagar added the label "5 - Ready to Merge" (Testing and reviews complete, ready to merge) and removed the labels "3 - Ready for Review" (Ready for review by team) and "4 - Needs cuDF (Python) Reviewer" on Mar 10, 2023

galipremsagar commented Feb 24, 2023

mroeschke Feb 24, 2023

galipremsagar Mar 9, 2023

mroeschke Mar 10, 2023
