Four date/time columns preserved after clean_raw_data() #7

mackerman44 · 2020-06-09T15:29:37Z

This one may already be resolved. We probably just need to re-run clean_raw_data() and confirm that it now works as intended.

@mackerman44 April 13, 2020:
When running the clean_raw_data() function on the "raw" data, multiple date and/or time columns are generated and afterwards we end up with 4 columns (orig_date, date, time, and date_time). Those columns are also preserved through round_tag_codes().

Are those columns duplicative, i.e., can we bring only the date_time column through clean_raw_data()? Or is there a reason for keeping the additional columns?

@KevinSee April 14, 2020:
The orig_date column was left in to identify which rows had an original date-stamp of "00/00/00", for QA/QC purposes, or in case we realized that assigning a date for those observations based on the file name was not a good idea. If we're settled on how that works, we could remove the orig_date, date, and time columns from the data when it's run through clean_raw_data().

@mackerman44 April 14, 2020:
That makes sense to me. But maybe leave the orig_date, date, and time columns for now until we're further along in the build process? But also keep this issue open to remind ourselves later to remove?
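
For reference when this gets revisited, here is a rough sketch of the column handling discussed above. It is not the actual clean_raw_data() code. The helper names (`clean_row`, `drop_redundant_columns`), the `date_from_file_name` flag, and the MM/DD/YY and HH:MM:SS formats are all assumptions made for illustration; only the "00/00/00" placeholder and the four column names come from this thread. The idea is that a single logical flag could preserve the QA/QC information from orig_date once the other three columns are dropped.

```python
from datetime import datetime

# Placeholder date-stamp mentioned in the thread.
PLACEHOLDER_DATE = "00/00/00"


def clean_row(row, file_date):
    """Add 'date' and 'date_time' to one observation (hypothetical helper).

    row: dict with at least 'orig_date' and 'time' (assumed HH:MM:SS).
    file_date: date taken from the file name (assumed MM/DD/YY); used only
               when the original date-stamp is the placeholder.
    """
    date = file_date if row["orig_date"] == PLACEHOLDER_DATE else row["orig_date"]
    date_time = datetime.strptime(f"{date} {row['time']}", "%m/%d/%y %H:%M:%S")
    return {**row, "date": date, "date_time": date_time}


def drop_redundant_columns(row, keep_flag=True):
    """Keep only date_time, optionally with a flag for file-name-derived dates."""
    out = {k: v for k, v in row.items() if k not in ("orig_date", "date", "time")}
    if keep_flag:
        # Hypothetical column: True when the date came from the file name.
        out["date_from_file_name"] = row["orig_date"] == PLACEHOLDER_DATE
    return out


if __name__ == "__main__":
    raw = {"tag": "A1", "orig_date": "00/00/00", "time": "13:05:00"}
    cleaned = clean_row(raw, file_date="04/13/20")
    print(drop_redundant_columns(cleaned))
```

With `keep_flag=False` the result carries only date_time, which is the end state proposed in this issue; the flag is one possible way to keep the "00/00/00" rows identifiable without retaining orig_date.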
