Four date/time columns preserved after clean_raw_data() #7

Open
mackerman44 opened this issue Jun 9, 2020 · 0 comments

Comments

@mackerman44
Owner

This may already be resolved. We should re-run clean_raw_data() and check whether it now behaves as intended.

@mackerman44 April 13, 2020:
Running clean_raw_data() on the "raw" data generates multiple date and/or time columns, leaving four in the output: orig_date, date, time, and date_time. Those columns are also preserved through round_tag_codes().

Are those columns redundant, i.e. can we carry only the date_time column through clean_raw_data()? Or is there a reason to keep the additional columns?

@KevinSee April 14, 2020:
The orig_date column was kept to identify which rows had an original date-stamp of "00/00/00", either for QA/QC purposes or in case we decided that assigning a date to those observations based on the file name was not a good idea. If we're settled on how that works, we could remove the orig_date, date, and time columns from the data when it's run through clean_raw_data().

@mackerman44 April 14, 2020:
That makes sense to me. But maybe leave the orig_date, date, and time columns in place for now, until we're further along in the build process, and keep this issue open as a reminder to remove them later?
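
For reference, the two steps discussed above (note which rows carried the "00/00/00" date-stamp, then drop the extra columns) could look roughly like the sketch below. It is illustrative only and is not the code behind clean_raw_data(). The column names orig_date, date, time, and date_time come from this thread; the function names, the tag_code column, and the sample values are hypothetical.

```python
# Hypothetical sketch: not the actual clean_raw_data() implementation.
PLACEHOLDER_DATE = "00/00/00"  # original date-stamp value mentioned in this thread
REDUNDANT_COLUMNS = ("orig_date", "date", "time")


def placeholder_date_rows(rows):
    """Return the indices of rows whose original date-stamp was the placeholder."""
    return [i for i, row in enumerate(rows) if row.get("orig_date") == PLACEHOLDER_DATE]


def keep_only_date_time(rows):
    """Return copies of the rows without the orig_date, date, and time columns."""
    return [
        {key: value for key, value in row.items() if key not in REDUNDANT_COLUMNS}
        for row in rows
    ]


# Made-up records; tag_code and all values are for illustration only.
rows = [
    {"tag_code": "A1", "orig_date": "04/13/20", "date": "2020-04-13",
     "time": "08:15:00", "date_time": "2020-04-13 08:15:00"},
    {"tag_code": "B2", "orig_date": "00/00/00", "date": "2020-04-14",
     "time": "09:30:00", "date_time": "2020-04-14 09:30:00"},
]

# Record the QA/QC information first, because it is lost once orig_date is dropped.
flagged = placeholder_date_rows(rows)
slimmed = keep_only_date_time(rows)
```

Collecting the flagged indices before dropping the columns keeps the QA/QC check possible even if only date_time is carried forward.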
