BUG: Regex parameter from replace function doesn't work on string dtype #36472

@GYHHAHA

>>> import pandas as pd
>>> print(pd.__version__)
1.1.1
>>> s1 = pd.Series(['a', 'b'], dtype='string')
>>> s2 = pd.Series(['a', 'b'], dtype='object')
>>> s1.replace(r'[a]', pd.NA, regex=True)
0    a
1    b
dtype: string
>>> s2.replace(r'[a]', pd.NA, regex=True)
0    <NA>
1       b
dtype: object

Hi, it seems that regex=True has no effect in Series.replace when the Series has string dtype: the values are returned unchanged. The same call works as expected on object dtype.
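
A possible workaround, sketched here under the assumption that the object-dtype path behaves as shown above, is to cast to object, apply the regex replacement there, and cast back to string. The variable name `result` is only illustrative, and I have not checked this across pandas versions.

```python
import pandas as pd

s1 = pd.Series(["a", "b"], dtype="string")

# Assumed workaround: round-trip through object dtype, where the
# regex replacement is applied, then restore the string dtype.
result = (
    s1.astype("object")
    .replace(r"[a]", pd.NA, regex=True)
    .astype("string")
)
print(result)
```

This costs two extra conversions, so it is a stopgap rather than a fix for the string dtype behaviour itself.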

In addition, passing pd.NA as repl to the str.replace method raises an error on both string and object dtype.
I think it would be better not to raise here, because pd.NA is a valid missing value for a string.

>>> pd.Series(['a', 'b'], dtype='string').str.replace('a', pd.NA)
TypeError: repl must be a string or callable
>>> pd.Series(['a', 'b'], dtype='object').str.replace('a', pd.NA)
TypeError: repl must be a string or callable
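
For comparison, the result I have in mind can be approximated today with str.contains and Series.mask. This is only a sketch: it assumes that a match should turn the whole value into pd.NA (rather than substituting part of the string), and the names `matches` and `masked` are illustrative.

```python
import pandas as pd

s = pd.Series(["a", "b"], dtype="string")

# Flag the rows that contain the literal substring "a".
matches = s.str.contains("a", regex=False)

# Assumed semantics: a matching row becomes missing as a whole.
masked = s.mask(matches, pd.NA)
print(masked)
```

Whether str.replace itself should adopt these semantics for repl=pd.NA is the open question here.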

Thanks!
