Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Series.__setitem__ that expands is not displayable #13860

Closed
mroeschke opened this issue Aug 11, 2023 · 5 comments
Closed

[BUG] Series.__setitem__ that expands is not displayable #13860

mroeschke opened this issue Aug 11, 2023 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@mroeschke
Copy link
Contributor

mroeschke commented Aug 11, 2023

Describe the bug
Series.setitem that expands is not displayable

Steps/Code to reproduce bug

In [12]: import cudf

In [13]: se = cudf.Series([1, 2, 3])

In [14]: se[5] = 5.

In [15]: repr(se)
File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/cudf/core/series.py:1442, in Series.__repr__(self)
   1433     output = pd_series.to_string(
   1434         name=self.name,
   1435         dtype=self.dtype,
   (...)
   1439         na_rep=cudf._NA_REP,
   1440     )
   1441 else:
-> 1442     output = repr(preprocess.to_pandas())
   1444 lines = output.split("\n")
   1445 if isinstance(preprocess._column, cudf.core.column.CategoricalColumn):

File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/nvtx/nvtx.py:101, in annotate.__call__.<locals>.inner(*args, **kwargs)
     98 @wraps(func)
     99 def inner(*args, **kwargs):
    100     libnvtx_push_range(self.attributes, self.domain.handle)
--> 101     result = func(*args, **kwargs)
    102     libnvtx_pop_range(self.domain.handle)
    103     return result

File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/cudf/core/series.py:2004, in Series.to_pandas(self, index, nullable, **kwargs)
   2002 if index is True:
   2003     index = self.index.to_pandas()
-> 2004 s = self._column.to_pandas(index=index, nullable=nullable)
   2005 s.name = self.name
   2006 return s

File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/cudf/core/column/numerical.py:690, in NumericalColumn.to_pandas(self, index, nullable, **kwargs)
    687     pd_series = self.to_arrow().to_pandas(**kwargs)
    689 if index is not None:
--> 690     pd_series.index = index
    691 return pd_series

File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/pandas/core/generic.py:5915, in NDFrame.__setattr__(self, name, value)
   5913 try:
   5914     object.__getattribute__(self, name)
-> 5915     return object.__setattr__(self, name, value)
   5916 except AttributeError:
   5917     pass

File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/pandas/_libs/properties.pyx:69, in pandas._libs.properties.AxisProperty.__set__()

File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/pandas/core/series.py:593, in Series._set_axis(self, axis, labels)
    590             pass
    592 # The ensure_index call above ensures we have an Index object
--> 593 self._mgr.set_axis(axis, labels)

File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/pandas/core/internals/managers.py:230, in BaseBlockManager.set_axis(self, axis, new_labels)
    228 def set_axis(self, axis: int, new_labels: Index) -> None:
    229     # Caller is responsible for ensuring we have an Index object.
--> 230     self._validate_set_axis(axis, new_labels)
    231     self.axes[axis] = new_labels

File ~/mambaforge/envs/xdf_dev/lib/python3.10/site-packages/pandas/core/internals/base.py:70, in DataManager._validate_set_axis(self, axis, new_labels)
     67     pass
     69 elif new_len != old_len:
---> 70     raise ValueError(
     71         f"Length mismatch: Expected axis has {old_len} elements, new "
     72         f"values have {new_len} elements"
     73     )

ValueError: Length mismatch: Expected axis has 4 elements, new values have 3 elements

EDIT: I notice something similar happens with loc.__setitem__ and iloc.__seitem__

Expected behavior
This object repr should not be corrupted

In [17]: import pandas

In [18]: se = pandas.Series([1, 2, 3])

In [19]: se[5] = 5.

In [20]: se
Out[20]: 
0    1.0
1    2.0
2    3.0
5    5.0
dtype: float64

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of cuDF install: conda
    • If method of install is [Docker], provide docker pull & docker run commands used

Environment details
Please run and paste the output of the cudf/print_env.sh script here, to gather any other relevant environment details

Additional context
Add any other context about the problem here.

@mroeschke mroeschke added bug Something isn't working Needs Triage Need team to review and classify labels Aug 11, 2023
@galipremsagar galipremsagar self-assigned this Aug 13, 2023
@galipremsagar galipremsagar added python and removed Needs Triage Need team to review and classify labels Aug 13, 2023
@wence-
Copy link
Contributor

wence- commented Aug 15, 2023

EDIT: I notice something similar happens with loc.setitem and iloc.seitem

iloc.__setitem__ should fail right? With positional indexing one can't expand unknown indices AFAIU.

@wence-
Copy link
Contributor

wence- commented Aug 15, 2023

@galipremsagar not sure where you got to, but maybe:

diff --git a/python/cudf/cudf/core/series.py b/python/cudf/cudf/core/series.py
index 6ff0584538..931c67f63d 100644
--- a/python/cudf/cudf/core/series.py
+++ b/python/cudf/cudf/core/series.py
@@ -166,7 +166,7 @@ def _describe_categorical(obj, percentiles):
     return data
 
 
-def _append_new_row_inplace(col: ColumnLike, value: ScalarLike):
+def _append_new_row(col: ColumnLike, value: ScalarLike, *, inplace):
     """Append a scalar `value` to the end of `col` inplace.
     Cast to common type if possible
     """
@@ -174,7 +174,9 @@ def _append_new_row_inplace(col: ColumnLike, value: ScalarLike):
     val_col = as_column(value, dtype=to_type)
     old_col = col.astype(to_type)
 
-    col._mimic_inplace(concat_columns([old_col, val_col]), inplace=True)
+    return col._mimic_inplace(
+        concat_columns([old_col, val_col]), inplace=inplace
+    )
 
 
 class _SeriesIlocIndexer(_FrameIndexer):
@@ -274,8 +276,13 @@ class _SeriesLocIndexer(_FrameIndexer):
                 and not isinstance(self._frame.index, cudf.MultiIndex)
                 and is_scalar(value)
             ):
-                _append_new_row_inplace(self._frame.index._values, key)
-                _append_new_row_inplace(self._frame._column, value)
+                index = self._frame._index._values
+                col = self._frame._column
+                col = _append_new_row(col, value, inplace=False)
+                index = _append_new_row(index, key, inplace=False)
+                self._frame._mimic_inplace(
+                    Series(col, index=index), inplace=True
+                )
                 return
             else:
                 raise e

@mroeschke
Copy link
Contributor Author

EDIT: I notice something similar happens with loc.setitem and iloc.seitem

iloc.__setitem__ should fail right? With positional indexing one can't expand unknown indices AFAIU.

Correct

In [4]: se.iloc[3] = 4
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 se.iloc[3] = 4

File ~/pandas-mroeschke/pandas/core/indexing.py:882, in _LocationIndexer.__setitem__(self, key, value)
    880     key = com.apply_if_callable(key, self.obj)
    881 indexer = self._get_setitem_indexer(key)
--> 882 self._has_valid_setitem_indexer(key)
    884 iloc = self if self.name == "iloc" else self.obj.iloc
    885 iloc._setitem_with_indexer(indexer, value, self.name)

File ~/pandas-mroeschke/pandas/core/indexing.py:1607, in _iLocIndexer._has_valid_setitem_indexer(self, indexer)
   1605 elif is_integer(i):
   1606     if i >= len(ax):
-> 1607         raise IndexError("iloc cannot enlarge its target object")
   1608 elif isinstance(i, dict):
   1609     raise IndexError("iloc cannot enlarge its target object")

IndexError: iloc cannot enlarge its target object

@galipremsagar galipremsagar mentioned this issue Aug 16, 2023
3 tasks
@galipremsagar
Copy link
Contributor

galipremsagar commented Aug 16, 2023

@galipremsagar not sure where you got to, but maybe:

diff --git a/python/cudf/cudf/core/series.py b/python/cudf/cudf/core/series.py
index 6ff0584538..931c67f63d 100644
--- a/python/cudf/cudf/core/series.py
+++ b/python/cudf/cudf/core/series.py
@@ -166,7 +166,7 @@ def _describe_categorical(obj, percentiles):
     return data
 
 
-def _append_new_row_inplace(col: ColumnLike, value: ScalarLike):
+def _append_new_row(col: ColumnLike, value: ScalarLike, *, inplace):
     """Append a scalar `value` to the end of `col` inplace.
     Cast to common type if possible
     """
@@ -174,7 +174,9 @@ def _append_new_row_inplace(col: ColumnLike, value: ScalarLike):
     val_col = as_column(value, dtype=to_type)
     old_col = col.astype(to_type)
 
-    col._mimic_inplace(concat_columns([old_col, val_col]), inplace=True)
+    return col._mimic_inplace(
+        concat_columns([old_col, val_col]), inplace=inplace
+    )
 
 
 class _SeriesIlocIndexer(_FrameIndexer):
@@ -274,8 +276,13 @@ class _SeriesLocIndexer(_FrameIndexer):
                 and not isinstance(self._frame.index, cudf.MultiIndex)
                 and is_scalar(value)
             ):
-                _append_new_row_inplace(self._frame.index._values, key)
-                _append_new_row_inplace(self._frame._column, value)
+                index = self._frame._index._values
+                col = self._frame._column
+                col = _append_new_row(col, value, inplace=False)
+                index = _append_new_row(index, key, inplace=False)
+                self._frame._mimic_inplace(
+                    Series(col, index=index), inplace=True
+                )
                 return
             else:
                 raise e

Yup, that + we need to do some handling for RangeIndex alone. Opened a WIP PR: #13888

@vyasr
Copy link
Contributor

vyasr commented Jan 23, 2024

This issue was resolved in 66fdbe7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants