BUG: np.str_ not supported in df.loc #45580

Closed
2 of 3 tasks
auderson opened this issue Jan 24, 2022 · 1 comment · Fixed by #45626
Labels
Bug Compat pandas objects compatability with Numpy or Python functions Indexing Related to indexing on series/frames, not to indexes themselves Strings String extension data type and string data

@auderson
Contributor

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd
df = pd.DataFrame(index=pd.date_range('2021', '2022'))
df.loc[np.array(['2021/6/1'])[0]:]

Issue Description

The above code throws the following error:

Traceback (most recent call last):
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3444, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-8-71a904a69823>", line 1, in <module>
    df.loc[np.array(['2021/6/1'])[0]:]
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexing.py", line 966, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexing.py", line 1179, in _getitem_axis
    return self._get_slice_axis(key, axis=axis)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexing.py", line 1213, in _get_slice_axis
    indexer = labels.slice_indexer(slice_obj.start, slice_obj.stop, slice_obj.step)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py", line 743, in slice_indexer
    return Index.slice_indexer(self, start, end, step, kind=kind)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 6246, in slice_indexer
    start_slice, end_slice = self.slice_locs(start, end, step=step)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 6456, in slice_locs
    start_slice = self.get_slice_bound(start, "left")
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py", line 778, in get_slice_bound
    return super().get_slice_bound(label, side=side, kind=kind)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 6365, in get_slice_bound
    label = self._maybe_cast_slice_bound(label, side)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py", line 694, in _maybe_cast_slice_bound
    label = super()._maybe_cast_slice_bound(label, side, kind=kind)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/datetimelike.py", line 310, in _maybe_cast_slice_bound
    parsed, reso = self._parse_with_reso(label)
  File "/home/auderson/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/datetimelike.py", line 231, in _parse_with_reso
    parsed, reso_str = parsing.parse_time_string(label, self.freq)
TypeError: Argument 'arg' has incorrect type (expected str, got numpy.str_)

Expected Behavior

I sometimes slice a DataFrame with an element taken from a NumPy string array. Such an element is an np.str_ instance rather than a plain str. Should pandas support this?
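
On versions where np.str_ is rejected, one possible workaround is to convert the element to a built-in str before slicing. A minimal sketch, reusing the frame from the reproducible example (the variable names `key` and `result` are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(index=pd.date_range('2021', '2022'))

# Indexing a NumPy string array yields np.str_, a subclass of str.
key = np.array(['2021/6/1'])[0]

# Converting to a plain str sidesteps the strict type check in the date parser.
result = df.loc[str(key):]
print(result.index[0])
```

This only changes the type of the slice bound, so the selected rows should be the same as with a literal '2021/6/1'.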

Installed Versions

INSTALLED VERSIONS

commit : d023ba7
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.8.0-63-generic
Version : #71-Ubuntu SMP Tue Jul 13 15:59:12 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.4.0rc0
numpy : 1.20.3
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : 0.29.24
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.2
html5lib : 1.1
pymysql : 1.0.2
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 7.29.0
pandas_datareader: 0.9.0
bs4 : None
bottleneck : None
fsspec : 2021.10.1
fastparquet : None
gcsfs : None
matplotlib : 3.5.0
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : 6.0.1
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : 1.4.27
tables : None
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
numba : 0.54.1
zstandard : None

@auderson auderson added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 24, 2022
@lithomas1 lithomas1 added Compat pandas objects compatability with Numpy or Python functions Strings String extension data type and string data Indexing Related to indexing on series/frames, not to indexes themselves and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 27, 2022
@jreback jreback added this to the 1.5 milestone Jan 28, 2022
@BrieucRyelandt

BrieucRyelandt commented Mar 1, 2022

I have the same issue in a project that uses a custom subclass of str to access DataFrame fields.
The fix implemented in #45626 only works for numpy.str_, but there may be many packages whose objects inherit from str.
Why not update the fix to use isinstance(arg, str) (or issubclass(type(arg), str)) so that all these cases are handled at once?
That would avoid having to handle each case one by one.
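
To illustrate the suggestion: the "expected str" message in the traceback suggests the parser accepts only the exact built-in type, so a general fix would need to normalize any str subclass before calling it. A rough sketch of that idea, not the actual pandas change; `FieldName` and `to_builtin_str` are hypothetical names made up for this example:

```python
class FieldName(str):
    """Hypothetical str subclass, standing in for np.str_ or a user-defined label type."""


def to_builtin_str(label):
    """Hypothetical helper: return an exact built-in str for any str subclass instance."""
    # isinstance() is True for str itself and for every subclass of str.
    if isinstance(label, str) and type(label) is not str:
        # Assumes the subclass does not override __str__ to return something else.
        return str(label)
    # Exact str and non-string labels pass through unchanged.
    return label


label = FieldName("2021/6/1")
converted = to_builtin_str(label)
print(type(label).__name__, "->", type(converted).__name__)  # FieldName -> str
```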

edit:
If you are looking for a workaround, adding a __call__ method to the custom class avoids the error:

class CustomClass(str):
    def __call__(self, _) -> str:
        return super().__str__()
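
My understanding of why this works: .loc accepts a callable as a key and calls it with the DataFrame, then indexes with whatever it returns. A str subclass that defines __call__ is therefore replaced by the plain str it returns. A small sketch under that assumption (the frame and the label "x" are made up for illustration):

```python
import pandas as pd


class CustomClass(str):
    def __call__(self, _) -> str:
        # Return an exact built-in str, ignoring the object pandas passes in.
        return super().__str__()


df = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
key = CustomClass("x")

# key is callable, so .loc calls key(df) and looks up the returned "x".
row = df.loc[key]
print(row["a"])
```

As far as I can tell, this does not help when the object is used as a slice bound (df.loc[key:]), because the key is then a slice object rather than a callable.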
