Feature - Adding Dataframe applicability to some Series string methods #22911

Open
kaybhutani opened this issue Sep 30, 2018 · 6 comments
Labels: Enhancement, Strings

Comments

@kaybhutani

kaybhutani commented Sep 30, 2018

Hello,
Currently, string methods such as replace, lower, zfill and strip are available only on Series (through the .str accessor).
It would be useful to have a way to apply them to a whole DataFrame too. Methods like strip won't affect numeric columns, since those contain no whitespace anyway. If a method could affect a numeric column, that column could be skipped through an exclude parameter (which would need to be added).
A simple way of doing this is demonstrated below.

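import pandas as pd

data = pd.read_csv("nba.csv")
# Remember each column's original dtype so it can be restored afterwards.
dtypes = data.dtypes
for column in data.columns:
    # Cast to str so the .str accessor works on every column, numeric or not.
    data[column] = data[column].astype(str)
    data[column] = data[column].str.replace(" ", "")
    # Convert the column back to its original dtype.
    data[column] = data[column].astype(dtypes[column])
data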

In this example, the method works on columns of every dtype because each column is first cast to str. After the method has been applied, each column is converted back to its original dtype.
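The loop above, together with the proposed exclude parameter, could be wrapped in a small helper. The following is only a sketch of the idea: `apply_str_method` is a hypothetical name and not part of pandas, and the sample frame is a made-up stand-in for nba.csv.

```python
import pandas as pd


def apply_str_method(frame, method, *args, exclude=(), **kwargs):
    """Apply a Series.str method to every column not listed in `exclude`.

    Hypothetical helper, not part of pandas. Each column is cast to str,
    transformed, then cast back to its original dtype, as in the loop above.
    """
    result = frame.copy()
    for column in result.columns:
        if column in exclude:
            continue  # leave excluded columns untouched
        original_dtype = result[column].dtype
        as_text = result[column].astype(str)
        # Look up the requested method on the .str accessor, e.g. .str.strip
        transformed = getattr(as_text.str, method)(*args, **kwargs)
        result[column] = transformed.astype(original_dtype)
    return result


# Made-up sample data standing in for nba.csv.
data = pd.DataFrame(
    {"Name": [" Avery Bradley ", "Jae Crowder"], "Number": [0, 99]}
)

cleaned = apply_str_method(data, "strip")
no_spaces = apply_str_method(data, "replace", " ", "", exclude=["Number"])
```

Because every value takes a round trip through str, missing values may not survive unchanged, so a real implementation would probably need to handle them separately.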

If this proposal is approved, I would like to work on it and contribute the feature.

@TomAugspurger
Contributor

It's not clear what should happen when you have a mix of numeric and string dtypes.

We typically recommend something like

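# Select the string columns (stored with the object dtype by default at the time).
columns = data.select_dtypes(include="object").columns
# Apply the same string method to each selected column.
data[columns] = data[columns].apply(lambda s: s.str.replace(" ", ","))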

for applying the same column-wise transformation to a subset of the columns.
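As a runnable illustration of this column-subset pattern, here is a minimal sketch on a made-up frame. It assumes that every non-numeric column holds strings, so it selects columns with `exclude="number"` instead of naming a string dtype, and it removes spaces as in the issue's original example.

```python
import pandas as pd

# Made-up sample data; the column names are assumptions for illustration.
data = pd.DataFrame(
    {
        "Name": ["Avery Bradley", "Jae Crowder"],
        "Team": ["Boston Celtics", "Boston Celtics"],
        "Number": [0, 99],
    }
)

# Assumption: all non-numeric columns contain strings.
text_columns = data.select_dtypes(exclude="number").columns

# Apply the same string method column by column to the selected subset only.
data[text_columns] = data[text_columns].apply(lambda s: s.str.replace(" ", ""))
```

The numeric column is never touched, so no round trip through str is needed.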

@kaybhutani
Author

kaybhutani commented Sep 30, 2018 via email

sinhrks added the API Design and Strings labels Oct 1, 2018
@TomAugspurger
Contributor

Ah, in that case I think that #17211 may cover everything you're asking for here. Can you confirm?

@kaybhutani
Author

kaybhutani commented Oct 8, 2018 via email

@kaybhutani
Author

kaybhutani commented Oct 8, 2018 via email

@TomAugspurger
Contributor

TomAugspurger commented Oct 8, 2018 via email

4 participants