Support at/iat indexers in cudf.pandas #16177

Merged on Jul 8, 2024 (8 commits)
12 changes: 10 additions & 2 deletions python/cudf/cudf/core/dataframe.py
@@ -462,6 +462,10 @@ def _setitem_tuple_arg(self, key, value):
self._frame[col].loc[key[0]] = value[i]


+class _DataFrameAtIndexer(_DataFrameLocIndexer):
+    pass
+
+
 class _DataFrameIlocIndexer(_DataFrameIndexer):
     """
     For selection by index.
@@ -584,6 +588,10 @@ def _setitem_tuple_arg(self, key, value):
self._frame[col].iloc[key[0]] = value[i]


+class _DataFrameiAtIndexer(_DataFrameIlocIndexer):
+    pass
+
+
 class DataFrame(IndexedFrame, Serializable, GetAttrGetItemMixin):
     """
     A GPU Dataframe object.
@@ -2581,14 +2589,14 @@ def iat(self):
         """
         Alias for ``DataFrame.iloc``; provided for compatibility with Pandas.
         """
-        return self.iloc
+        return _DataFrameiAtIndexer(self)

     @property
     def at(self):
         """
         Alias for ``DataFrame.loc``; provided for compatibility with Pandas.
         """
-        return self.loc
+        return _DataFrameAtIndexer(self)

     @property  # type: ignore
     @_external_only_api(
12 changes: 12 additions & 0 deletions python/cudf/cudf/pandas/_wrappers/pandas.py
@@ -775,6 +775,18 @@ def Index__new__(cls, *args, **kwargs):
     pd.core.indexing._LocIndexer,
 )

+_AtIndexer = make_intermediate_proxy_type(
+    "_AtIndexer",
+    cudf.core.dataframe._DataFrameAtIndexer,
+    pd.core.indexing._AtIndexer,
+)
+
+_iAtIndexer = make_intermediate_proxy_type(
+    "_iAtIndexer",
+    cudf.core.dataframe._DataFrameiAtIndexer,
+    pd.core.indexing._iAtIndexer,
+)
+
 FixedForwardWindowIndexer = make_final_proxy_type(
     "FixedForwardWindowIndexer",
     _Unusable,
19 changes: 19 additions & 0 deletions python/cudf/cudf_pandas_tests/test_cudf_pandas.py
@@ -1566,3 +1566,22 @@ def test_arrow_string_arrays():
     )

     tm.assert_equal(cu_arr, pd_arr)
+
+
+@pytest.mark.parametrize("indexer", ["at", "iat"])
+def test_at_iat(indexer):
Contributor

Does this cover assigning to a new column that doesn't exist yet? I'd like to see a test that more closely resembles the pattern in the minimal repro of #16112.

Contributor Author

Yup, added a unit test in a43454c to mirror the issue. (It appears that iat.__setitem__ in pandas doesn't support the type of behavior demonstrated in the issue.)

Contributor Author

@mroeschke, Jul 3, 2024

Huh, thanks for encouraging a unit test with the issue. Even though the test passed locally, the CI didn't agree.

Left some investigation notes in #16112 (comment) that I may come back to later.

+    df = xpd.DataFrame(range(3))
+    result = getattr(df, indexer)[0, 0]
+    assert result == 0
+
+    getattr(df, indexer)[0, 0] = 1
+    expected = pd.DataFrame([1, 1, 2])
+    tm.assert_frame_equal(df, expected)


+def test_at_setitem_empty():
+    df = xpd.DataFrame({"name": []}, dtype="float64")
+    df.at[0, "name"] = 1.0
+    df.at[0, "new"] = 2.0
+    expected = pd.DataFrame({"name": [1.0], "new": [2.0]})
+    tm.assert_frame_equal(df, expected)
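
The review thread notes that pandas `iat.__setitem__` does not appear to support the pattern from the issue, which is why `test_at_setitem_empty` only exercises `at`. The following is a minimal sketch of that difference in plain pandas (not `cudf.pandas`), assuming a pandas 2.x install. The helper name `try_scalar_write` is hypothetical and exists only for illustration.

```python
import pandas as pd


def try_scalar_write(indexer_name, row, col):
    """Attempt a scalar write on an empty frame (hypothetical helper).

    Returns the resulting shape on success, or the exception class name
    if the indexer refuses to write outside the frame's current bounds.
    """
    df = pd.DataFrame({"name": []}, dtype="float64")
    try:
        getattr(df, indexer_name)[row, col] = 1.0
    except IndexError as exc:
        return type(exc).__name__
    return df.shape


# Label-based `at` falls back to enlarging the frame for a missing row label.
at_outcome = try_scalar_write("at", 0, "name")

# Position-based `iat` cannot address a position that does not exist yet.
iat_outcome = try_scalar_write("iat", 0, 0)

print(at_outcome, iat_outcome)
```

Under these assumptions the `at` write should grow the empty frame to one row, while the `iat` write should be rejected as out of bounds, so a positional variant of the empty-frame test would not be expected to pass even in plain pandas.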