keep original columns for clean_data #1

Open
dirkschumacher opened this issue Oct 19, 2018 · 4 comments

Comments

@dirkschumacher
Member

Having a magical function that does everything is great, but I can imagine that keeping the original values of the columns helps to trust the transformation.

E.g. the values before the transformation could be kept in the resulting data frame under an added suffix, such as date_of_onset_original.
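
A minimal sketch of the suffix idea, in base R only. It assumes a hypothetical stand-in cleaner (`toy_clean`) and a hypothetical wrapper (`clean_keep_original`); neither is part of linelist, and the real `clean_data()` may transform columns quite differently.

```r
# Hypothetical stand-in for a cleaning step; not linelist's actual logic.
toy_clean <- function(x) {
  if (is.character(x)) tolower(trimws(x)) else x
}

# Clean every column and keep the untouched values under "<name>_original".
clean_keep_original <- function(df, suffix = "_original") {
  originals <- df
  names(originals) <- paste0(names(df), suffix)
  cleaned <- df
  cleaned[] <- lapply(df, toy_clean)
  cbind(cleaned, originals, stringsAsFactors = FALSE)
}

raw <- data.frame(
  date_of_onset = c(" 2018-10-19", "2018-10-20 "),
  location = c("  Paris", "LONDON"),
  stringsAsFactors = FALSE
)
res <- clean_keep_original(raw)
names(res)
```

Here the cleaned columns keep their original names and the raw values move to the suffixed columns, so downstream code that expects `date_of_onset` keeps working.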

zkamvar added a commit that referenced this issue Jan 11, 2019
This is in accordance with @dirkschumacher's suggestion in #1 Also, I
found the `comment()` function, which seems really useful for this task
:)
@zkamvar
Member

zkamvar commented Jan 21, 2019

IIRC, @thibautjombart is a bit opposed to this idea because the user can already do:

old_data <- the_data
the_data <- linelist::clean_data(the_data)

@dirkschumacher
Member Author

Keeping the original and modified data together helps spot errors, especially if you use a magic function like "clean_data" that might behave differently in future releases of linelist. What about adding a parameter that includes the original columns by default?

@zkamvar
Member

zkamvar commented Jan 21, 2019

I think that's a good idea! Plus, there could be a function that uses diffObj to compare the cleaned and original columns.
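
A rough sketch of what such a comparison helper could look like. To stay dependency-free it uses base R rather than diffObj, and `compare_cleaned` is a hypothetical name, not an existing linelist function. It assumes both data frames have the same column names and row order.

```r
# Report every cell where the cleaned value differs from the original.
# Assumes identical column names and row order in both data frames.
compare_cleaned <- function(original, cleaned) {
  stopifnot(
    identical(names(original), names(cleaned)),
    nrow(original) == nrow(cleaned)
  )
  diffs <- lapply(names(original), function(col) {
    before <- as.character(original[[col]])
    after <- as.character(cleaned[[col]])
    # NA vs. value counts as a change; NA vs. NA does not.
    differs <- (before != after) | (is.na(before) != is.na(after))
    differs[is.na(differs)] <- FALSE
    changed <- which(differs)
    data.frame(
      column = rep(col, length(changed)),
      row = changed,
      before = before[changed],
      after = after[changed],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, diffs)
}

original <- data.frame(
  location = c("  Paris", "london", NA),
  sex = c("M", "f", "f"),
  stringsAsFactors = FALSE
)
cleaned <- data.frame(
  location = c("paris", "london", NA),
  sex = c("m", "f", "f"),
  stringsAsFactors = FALSE
)
changes <- compare_cleaned(original, cleaned)
changes
```

The result is one row per changed cell, which is easy to filter or print; a diffObj-based version would instead render a side-by-side textual diff.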

@thibautjombart
Contributor

I agree, as an additional argument. I would add the columns so that the original and 'cleaned' variables are next to each other:

toto   toto_clean  tata   tata_clean ...
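
The side-by-side layout could be sketched as below. The argument names `keep_original` and `suffix`, the wrapper `clean_interleaved`, and the stand-in cleaner `toy_clean` are all assumptions made for illustration; none of them is confirmed as linelist's API.

```r
# Hypothetical stand-in for the cleaning step; not linelist's actual logic.
toy_clean <- function(x) {
  if (is.character(x)) tolower(trimws(x)) else x
}

# `keep_original` and `suffix` are assumed argument names.
clean_interleaved <- function(df, keep_original = TRUE, suffix = "_clean") {
  cleaned <- df
  cleaned[] <- lapply(df, toy_clean)
  if (!keep_original) {
    return(cleaned)
  }
  names(cleaned) <- paste0(names(df), suffix)
  combined <- cbind(df, cleaned, stringsAsFactors = FALSE)
  # Interleave: original 1, cleaned 1, original 2, cleaned 2, ...
  n <- ncol(df)
  combined[, as.vector(rbind(seq_len(n), n + seq_len(n))), drop = FALSE]
}

raw <- data.frame(
  toto = c(" A", "b "),
  tata = c("X ", " y"),
  stringsAsFactors = FALSE
)
out <- clean_interleaved(raw)
names(out)
```

With `keep_original = FALSE` the function returns only the cleaned columns under their original names, which matches the behaviour without the extra argument.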

@zkamvar zkamvar mentioned this issue Feb 8, 2019
zkamvar pushed a commit that referenced this issue Oct 16, 2019
Merge branch regex-varnames into master
3 participants