Script to amend monthly rollup microdata #1650

nmdefries · 2022-06-28T21:06:55Z

Description

Add a `state` field based on county FIPS. Rename the `wave` field to
`version`.

Changelog

Add script microdata_add_state_col__rename_wave.R

Add state field based on county FIPS. Change name of `wave` field to `version`.
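For readers without the script at hand, the two changes can be sketched roughly as below. This is only an illustration, not the code in the PR: `state_lookup`, `amend_microdata`, the lowercase postal abbreviations, and the toy data are assumptions, and the actual script may rely on mapping files instead. The sketch uses the fact that the first two digits of a 5-digit county FIPS code identify the state.

```r
# Hypothetical lookup from 2-digit state FIPS prefix to a state abbreviation.
# Only three entries are shown; the real script may use a mapping file.
state_lookup <- c("01" = "al", "06" = "ca", "42" = "pa")

# Hypothetical helper: add `state` from county FIPS and rename `wave`.
amend_microdata <- function(data) {
  # The first two digits of a 5-digit county FIPS code identify the state.
  # Rows with a missing FIPS code end up with a missing state.
  data$state <- unname(state_lookup[substr(data$fips, 1, 2)])
  # Rename `wave` to `version` without touching any other column.
  names(data)[names(data) == "wave"] <- "version"
  data
}

# Toy input, invented for illustration.
toy <- data.frame(
  fips = c("42003", "06037", NA),
  wave = c(11, 12, 12),
  stringsAsFactors = FALSE
)
amended <- amend_microdata(toy)
```

Keeping FIPS codes as character strings matters here, since leading zeros (as in `"06037"`) would be lost if the column were numeric.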

capnrefsmmat

I think this will be fine, just slightly more complicated than it has to be, since the FIPS conversion can be easier.

Not sure if it's worth changing the conversion if it already works. Could you maybe add an assertion after line 42 that checks that all rows with FIPS codes get a non-NA state? That should definitely happen, but if there's something weird about our mapping files (like with territories), we could get issues, and we don't want to have missingness because of that.
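A minimal sketch of the kind of assertion being requested, assuming the data frame has `fips` and `state` columns at that point in the script. The helper name `check_state_coverage` and the toy data are hypothetical; the second check mirrors the later commit "check that state is missing where fips is missing".

```r
# Hypothetical helper: fail loudly if the FIPS-to-state mapping left gaps.
check_state_coverage <- function(data) {
  has_fips <- !is.na(data$fips)
  # Every row with a FIPS code must have received a non-NA state.
  stopifnot(!anyNA(data$state[has_fips]))
  # Conversely, rows without a FIPS code should not have a state.
  stopifnot(all(is.na(data$state[!has_fips])))
  invisible(data)
}

# Passes: the only row without a state also has no FIPS code.
ok <- data.frame(
  fips = c("42003", NA),
  state = c("pa", NA),
  stringsAsFactors = FALSE
)
check_state_coverage(ok)

# Fails: a Puerto Rico county FIPS (state prefix 72) that the mapping missed.
bad <- data.frame(
  fips = c("42003", "72001"),
  state = c("pa", NA),
  stringsAsFactors = FALSE
)
failed <- inherits(try(check_state_coverage(bad), silent = TRUE), "try-error")
```

The territory case is the one the comment worries about: the row has a valid FIPS code, but a mapping that covers only states would silently produce `NA`.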

facebook/microdata_add_state_col__rename_wave.R

nmdefries · 2022-06-30T16:21:51Z

facebook/amend_monthly_microdata.R

+  # some people enter 9-digit ZIPs, which could make them easily identifiable in
+  # the individual output files. rather than truncating to 5 digits -- which may
+  # turn nonsense entered by some respondents into a valid ZIP5 -- we simply
+  # replace these ZIPs with NA.
+  data$zip5 <- ifelse(nchar(data$zip5) > 5, NA_character_,


The logic for these two functions is borrowed from delphiFacebook/R/responses.R, but I'm wondering if this line is supposed to be handling A3. We drop zip5 anyway, so nulling it out here seems unnecessary. @capnrefsmmat

capnrefsmmat · 2022-06-30

`filter_complete_responses` does the other half of the work:

covidcast-indicators/facebook/delphiFacebook/R/responses.R

Lines 772 to 777 in 3590e64

    # what zip5 values have a large enough population (>100) to include in micro
    # output. Those with too small of a population are blanked to NA
    zip_metadata <- produce_zip_metadata(params$static_dir)[, c("zip5", "keep_in_agg")]
    zipitude <- left_join(data_full, zip_metadata, by = "zip5")
    change_zip <- !is.na(zipitude$keep_in_agg) & !zipitude$keep_in_agg
    data_full$A3[change_zip] <- NA

We join with the ZIP metadata based on `zip5`, then blank out `A3` based on the results (not `zip5`). Don't ask me why we need to use two different columns; undoubtedly the code did that in early 2020 and the logic just got ported over to this version.
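To make the two-column behavior concrete, here is a small base-R sketch with invented data. It swaps `left_join` for `match()`, which gives the same row alignment as long as `zip5` is unique in the metadata (an assumption here), and it stands in a hand-built `zip_metadata` for the output of `produce_zip_metadata`.

```r
# Toy stand-in for the ZIP metadata; values are invented for illustration.
zip_metadata <- data.frame(
  zip5 = c("15213", "99999"),
  keep_in_agg = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Toy responses: `zip5` is the cleaned column, `A3` the raw survey answer.
data_full <- data.frame(
  zip5 = c("15213", "99999", "00000", NA),
  A3 = c("15213", "99999", "00000", NA),
  stringsAsFactors = FALSE
)

# Look up keep_in_agg by zip5; ZIPs absent from the metadata yield NA.
keep <- zip_metadata$keep_in_agg[match(data_full$zip5, zip_metadata$zip5)]

# Blank A3 only where the ZIP is known and flagged as too small.
change_zip <- !is.na(keep) & !keep
data_full$A3[change_zip] <- NA
```

Note that with this logic a ZIP that does not appear in the metadata at all (`"00000"` above) keeps its `A3` value, because the `!is.na(...)` guard excludes unmatched rows from blanking.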

nmdefries · 2022-07-11T19:05:56Z

@krivard This is ready to merge.

script to amend monthly rollup microdata

09b1648

Add state field based on county FIPS. Change name of `wave` field to `version`.

nmdefries requested a review from capnrefsmmat June 28, 2022 21:06

make hyphen optional in pattern

b9f5f93

capnrefsmmat reviewed Jun 29, 2022

facebook/microdata_add_state_col__rename_wave.R

nmdefries added 3 commits June 29, 2022 18:27

check that state is missing where fips is missing

1ba94ef

drop responses from territories

e7bd51b

blank zips with low population

c7186eb

nmdefries requested a review from capnrefsmmat June 30, 2022 16:12

nmdefries commented Jun 30, 2022

capnrefsmmat approved these changes Jul 11, 2022

krivard merged commit 7e5c8a4 into main Jul 11, 2022

krivard deleted the ndefries/microdata-state-col branch July 11, 2022 19:36

krivard mentioned this pull request Jul 27, 2022

Release covidcast-indicators 0.3.19 #1663 (merged)
