Handling column names when creating new data #514

Closed
argenisleon opened this issue Apr 29, 2019 · 4 comments

@argenisleon
Member

commented Apr 29, 2019

When an operation generates new data, the target columns can be:

  • The existing column, overwritten in place
  • A new column named by the user
  • Automatically named columns, when an operation is applied to multiple columns
  • Multiple columns named manually by the user

We need to define how the user is going to handle this. Currently, some methods create new columns and some do not.

@issue-label-bot


commented Apr 29, 2019

Issue-Label Bot is automatically applying the label feature_request to this issue, with a confidence of 0.53. Please mark this comment with 👍 or 👎 to give our bot feedback!

@argenisleon argenisleon self-assigned this Apr 29, 2019

@argenisleon argenisleon added this to To do in Road to 2.3 via automation Apr 29, 2019

@argenisleon argenisleon moved this from To do to In progress in Road to 2.3 May 9, 2019

@wilmeragsgh

Collaborator

commented May 16, 2019

Is there a specific example or use case where you see this? I would like to propose some ideas based on those scenarios and see what you think.

@argenisleon

Member Author

commented May 16, 2019

Hi @wilmeragsgh,

It works like this:

# Cast in the same column
df.cols.cast("col_1", "str")

# Cast and create col_new
df.cols.cast("col_1", "str", output_cols="col_new")

# Cast multiple columns to col_1_new, col_2_new, col_3_new
df.cols.cast(["col_1", "col_2", "col_3"], "str", output_cols="_new")

# Cast multiple columns to col_1_new, col_2_new, col_3_new, specifying each column name
df.cols.cast(["col_1", "col_2", "col_3"], "str", output_cols=["col_1_new", "col_2_new", "col_3_new"])
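
The naming rule behind the four calls above can be sketched as a small standalone helper. This is only an illustration of the behavior described in this comment, not the actual Optimus implementation; the function name `resolve_output_cols` and its signature are hypothetical.

```python
from typing import List, Optional, Union

ColumnSpec = Union[str, List[str]]


def resolve_output_cols(input_cols: ColumnSpec,
                        output_cols: Optional[ColumnSpec] = None) -> List[str]:
    """Hypothetical helper: map input column names to output column names."""
    inputs = [input_cols] if isinstance(input_cols, str) else list(input_cols)

    # No output given: the operation overwrites the input columns.
    if output_cols is None:
        return inputs

    # Explicit list: names must pair up one-to-one with the inputs.
    if not isinstance(output_cols, str):
        outputs = list(output_cols)
        if len(outputs) != len(inputs):
            raise ValueError("output_cols must have one name per input column")
        return outputs

    # One input and one string: the string is the new column name.
    if len(inputs) == 1:
        return [output_cols]

    # Several inputs and one string: the string is appended as a suffix.
    return [name + output_cols for name in inputs]
```

Under these assumptions, `resolve_output_cols(["col_1", "col_2", "col_3"], "_new")` yields the suffixed names, while passing a list of the wrong length fails early instead of silently misnaming columns.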

It's implemented in all the methods that affect columns, such as 'replace', 'impute', 'fill_na' and some others. You can check this at https://github.com/ironmussa/Optimus/blob/develop/optimus/dataframe/columns.py

We hope to have this merged into master soon. What do you have in mind?

@argenisleon argenisleon moved this from In progress to Done in Road to 2.3 May 17, 2019

@wilmeragsgh

Collaborator

commented May 20, 2019

Oh, all right. I just wanted to help with this:

We need to define how the user is going to handle this. Currently, some methods create new columns and some do not.

But it seems pretty stable for the moment.
