Can't use iloc to set a subset of a categorical dataframe to scalar None. #4149

Closed
mvashishtha opened this issue Feb 4, 2022 · 0 comments · Fixed by #4160

@mvashishtha
Collaborator

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS Big Sur 11.5.2
  • Modin version (modin.__version__): 0.13.0+11.g62179ef6
  • Python version: 3.9.9
  • Code we can use to reproduce:
import modin.pandas as pd
df = pd.DataFrame([["A"]], dtype="category")
df.iloc[0, 0] = None

Describe the problem

The assignment fails with an AssertionError raised by an internal Modin check, instead of setting the only element in the dataframe to numpy.NaN, as pandas would.
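
For comparison, here is a minimal sketch of the same reproducer run against plain pandas, which is the behavior Modin is expected to match. It assumes a pandas version in which assigning `None` into a categorical column is treated as a missing value; the checks at the end are illustrative and not part of the original report.

```python
import pandas as pd

# Same reproducer as above, but using pandas directly instead of modin.pandas.
df = pd.DataFrame([["A"]], dtype="category")
df.iloc[0, 0] = None

# The cell should now hold a missing value ...
cell_is_missing = bool(pd.isna(df.iloc[0, 0]))
# ... and the column should still be categorical.
still_categorical = isinstance(df.dtypes.iloc[0], pd.CategoricalDtype)

print(cell_is_missing, still_categorical)
```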

Source code / logs

AssertionError                            Traceback (most recent call last)
Input In [20], in <module>
      1 import modin.pandas as pd
      3 df = pd.DataFrame([["A"]], dtype="category")
----> 4 df.iloc[0, 0] = None

File ~/modin/modin/pandas/indexing.py:945, in _iLocIndexer.__setitem__(self, key, item)
    942 self._check_dtypes(col_loc)
    944 row_lookup, col_lookup = self._compute_lookup(row_loc, col_loc)
--> 945 super(_iLocIndexer, self).__setitem__(
    946     row_lookup,
    947     col_lookup,
    948     item,
    949     axis=self._determine_setitem_axis(
    950         row_lookup, col_lookup, row_scalar, col_scalar
    951     ),
    952 )

File ~/modin/modin/pandas/indexing.py:413, in _LocationIndexerBase.__setitem__(self, row_lookup, col_lookup, item, axis)
    411 if not assigning_to_single_category_column:
    412     item = self._broadcast_item(row_lookup, col_lookup, item, to_shape)
--> 413 self._write_items(row_lookup, col_lookup, item)

File ~/modin/modin/pandas/indexing.py:493, in _LocationIndexerBase._write_items(self, row_lookup, col_lookup, item)
    480 def _write_items(self, row_lookup, col_lookup, item):
    481     """
    482     Perform remote write and replace blocks.
    483
   (...)
    491         The new item value that needs to be assigned to `self`.
    492     """
--> 493     new_qc = self.qc.write_items(row_lookup, col_lookup, item)
    494     self.df._create_or_update_from_compiler(new_qc, inplace=True)

File ~/modin/modin/core/storage_formats/pandas/query_compiler.py:3062, in PandasQueryCompiler.write_items(self, row_numeric_index, col_numeric_index, broadcasted_items)
   3059     partition.iloc[row_internal_indices, col_internal_indices] = item
   3060     return partition
-> 3062 new_modin_frame = self._modin_frame.apply_select_indices(
   3063     axis=None,
   3064     func=iloc_mut,
   3065     row_labels=row_numeric_index,
   3066     col_labels=col_numeric_index,
   3067     new_index=self.index,
   3068     new_columns=self.columns,
   3069     keep_remaining=True,
   3070     item_to_distribute=broadcasted_items,
   3071 )
   3072 return self.__constructor__(new_modin_frame)

File ~/modin/modin/core/dataframe/pandas/dataframe/dataframe.py:110, in lazy_metadata_decorator.<locals>.decorator.<locals>.run_f_on_minimally_updated_metadata(self, *args, **kwargs)
    108     elif apply_axis == "rows":
    109         obj._propagate_index_objs(axis=0)
--> 110 result = f(self, *args, **kwargs)
    111 if apply_axis is None and not transpose:
    112     result._deferred_index = self._deferred_index

File ~/modin/modin/core/dataframe/pandas/dataframe/dataframe.py:1981, in PandasDataframe.apply_select_indices(self, axis, func, apply_indices, row_labels, col_labels, new_index, new_columns, keep_remaining, item_to_distribute)
   1979 assert row_labels is not None and col_labels is not None
   1980 assert keep_remaining
-> 1981 assert item_to_distribute is not None
   1982 row_partitions_list = self._get_dict_of_block_index(0, row_labels).items()
   1983 col_partitions_list = self._get_dict_of_block_index(1, col_labels).items()

AssertionError:
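
The last frame suggests that `apply_select_indices` uses `None` to mean "no item was passed", so a caller who really wants to write `None` trips the assertion. One common way to tell the two cases apart is a sentinel default. The sketch below is illustrative only: `_NO_ITEM` and `write_item` are hypothetical names, it operates on a plain list rather than Modin partitions, and it is not necessarily how #4160 resolved the issue.

```python
# Hypothetical sentinel meaning "the caller did not supply an item".
# A unique object() can never collide with a real value such as None.
_NO_ITEM = object()


def write_item(values, index, item_to_distribute=_NO_ITEM):
    """Return a copy of `values` with `values[index]` replaced by the item.

    None is accepted as a legitimate item; only a missing argument is an error.
    """
    if item_to_distribute is _NO_ITEM:
        raise ValueError("an item to write is required")
    result = list(values)
    result[index] = item_to_distribute
    return result


# Writing None now succeeds instead of failing an `is not None` check.
updated = write_item(["A"], 0, None)

# Omitting the item is still rejected.
try:
    write_item(["A"], 0)
    missing_item_rejected = False
except ValueError:
    missing_item_rejected = True
```

With this pattern the check guards against a genuinely missing argument without ruling out `None` as data.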
YarShev pushed a commit that referenced this issue Feb 25, 2022
…ms. (#4160)

Signed-off-by: mvashishtha <mahesh@ponder.io>
vnlitvinov pushed a commit that referenced this issue Mar 17, 2022
…ms. (#4160)

Signed-off-by: mvashishtha <mahesh@ponder.io>