[ENH] Wrap pandas string methods #360

Closed
ericmjl opened this issue May 14, 2019 · 5 comments · Fixed by #694
Labels: available for hacking, enhancement, good intermediate issue

Comments

@ericmjl
Member

ericmjl commented May 14, 2019

Brief Description

One thing that might be nice is to wrap all of the string methods such that they are now method-chainable.

If this could be done programmatically, that would be awesome. Otherwise, manually wrapping ones of interest, and building up a submodule of string method operations, would be a superb alternative.

Example API

The example API can be found in this notebook.

@szuckerman
Collaborator

What about this?

Instead of:

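import pandas_flavor as pf

@pf.register_dataframe_method
def str_join(df, column_name: str, sep: str):
    """
    Wrapper around `Series.str.join`.
    Joins the items of each list in `column_name` using `sep`.
    """
    # Overwrite the column with the joined strings and return df for chaining.
    df[column_name] = df[column_name].str.join(sep)
    return df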

Make the function generic:

@pf.register_dataframe_method
def str(df, method_type, column_name, *args, **kwargs):
    # Look up the requested string method by name and forward all arguments to it.
    df[column_name] = getattr(df[column_name].str, method_type)(*args, **kwargs)
    return df

df.str('join', 'column_name', sep=',')
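
For anyone who wants to try the generic idea without registering anything, here is a minimal sketch using plain pandas and `df.pipe`. The name `str_method` and the sample data are made up for illustration, and unlike the snippets above this version works on a copy rather than modifying `df` in place.

```python
import pandas as pd


def str_method(df, method_type, column_name, *args, **kwargs):
    """Apply the pandas string method named `method_type` to one column."""
    # Copy first so the caller's DataFrame is left untouched
    # (an assumption of this sketch, not part of the proposal above).
    df = df.copy()
    method = getattr(df[column_name].str, method_type)
    df[column_name] = method(*args, **kwargs)
    return df


# Hypothetical sample data: a column holding lists of strings.
df = pd.DataFrame({"tags": [["a", "b"], ["c", "d", "e"]]})

# Chain two string operations on the same column.
result = (
    df.pipe(str_method, "join", "tags", sep=",")
      .pipe(str_method, "upper", "tags")
)
```

Because the method is looked up with `getattr`, a misspelled `method_type` raises `AttributeError` at call time, which may or may not be the error message you would want users to see.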

I've also been looking for a way to make the function name generic (i.e., so we could call df.str_join, with pyjanitor building the function name as something like "str_" + method_name), but it looks like that requires using setattr within a class, which might not work with @pf.register_dataframe_method.
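
One possible way to get generated names like `str_join` is a small function factory that sets `__name__` on each wrapper. This is only a sketch with plain pandas: `make_str_wrapper`, `METHODS`, and the sample data are hypothetical, and it does not show (or confirm) how such functions would interact with `@pf.register_dataframe_method`.

```python
import pandas as pd


def make_str_wrapper(method_name):
    """Build a function named `str_<method_name>` that wraps `Series.str`."""

    def wrapper(df, column_name, *args, **kwargs):
        # Copy so the input DataFrame is not mutated (a choice for this sketch).
        df = df.copy()
        df[column_name] = getattr(df[column_name].str, method_name)(*args, **kwargs)
        return df

    # Give each generated function a distinct, descriptive name.
    wrapper.__name__ = f"str_{method_name}"
    return wrapper


# Hypothetical selection of string methods to expose.
METHODS = ["lower", "strip", "replace"]
wrappers = {f"str_{name}": make_str_wrapper(name) for name in METHODS}

df = pd.DataFrame({"name": ["  Alice ", " BOB"]})
cleaned = (
    df.pipe(wrappers["str_strip"], "name")
      .pipe(wrappers["str_lower"], "name")
)
```

Here the wrappers are kept in a dictionary and applied with `df.pipe`; registering each one as a DataFrame method would be a separate step.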

@zbarry added the available for hacking, enhancement, and good intermediate issue labels on Jul 12, 2019
@eyaltrabelsi
Contributor

eyaltrabelsi commented Sep 22, 2019

@szuckerman @ericmjl I would like to work on this. What testing strategy do you think I should take?

@samukweku
Collaborator

@ericmjl, kindly assign this to me if it is still available. @eyaltrabelsi asked first, though, so please let’s check whether he/she is still interested in it. Thanks.

@ericmjl
Member Author

ericmjl commented Jun 27, 2020

Oh my, it looks like I totally let this one drop. @eyaltrabelsi, I'm so sorry.

@samukweku, given the time interval, I think you can go ahead without any issues. Please do so 😄.

@ericmjl
Member Author

ericmjl commented Jul 13, 2020

🎉
