[REVIEW] Drop DataFrame.append and Series.append #12839
expected = pd.concat([pdf, other_pd], sort=sort, ignore_index=ignore_index)
actual = cudf.concat([gdf, other_gd], sort=sort, ignore_index=ignore_index)

# In some cases, Pandas creates an empty Index([], dtype="object") for
Could you show an example where this happens? I think pandas would also return an empty RangeIndex if the object is empty-like.
Apologies for the earlier incorrect comment. The actual issue is with how cudf and pandas represent the columns of an empty DataFrame. As the session below shows, in cudf it's an empty RangeIndex, but in pandas it's an empty Index:
In [1]: import cudf
In [2]: import pandas as pd
In [3]: df = cudf.DataFrame()
In [4]: pdf = pd.DataFrame()
In [5]: df.columns
Out[5]: RangeIndex(start=0, stop=0, step=1)
In [6]: pdf.columns
Out[6]: Index([], dtype='object')
Due to this difference, when we concat such dataframes, cudf still returns a RangeIndex, whereas pandas returns an empty Index, which is why we needed this special handling here for pytest.
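To make the kind of special handling described above concrete, here is a minimal sketch that uses pandas only. The helper name assert_columns_match is hypothetical and is not taken from the cudf test suite. It assumes that two empty column indexes should compare equal regardless of their type.

```python
import pandas as pd


def assert_columns_match(expected, actual):
    """Compare column indexes, ignoring the index type when both are empty.

    Hypothetical helper for illustration; not the code used in this PR.
    """
    if len(expected) == 0 and len(actual) == 0:
        # Index([], dtype="object") and RangeIndex(0) both mean "no columns".
        return
    pd.testing.assert_index_equal(expected, actual)


# Build both empty representations explicitly, so the sketch does not depend
# on which one a given library version produces for an empty DataFrame.
empty_object_index = pd.Index([], dtype="object")
empty_range_index = pd.RangeIndex(start=0, stop=0, step=1)

# Passes even though the two index types differ.
assert_columns_match(empty_object_index, empty_range_index)
```

Non-empty columns still go through the strict pandas comparison, so the relaxation applies only to the empty case discussed in this thread.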
Okay cool thanks for the clarification!
Description
This PR removes DataFrame.append and Series.append to match the pandas-2.0 API. Test usages are now replaced with .concat API calls.
Pytests related to these changes:
Checklist