update: new web agg write #105
Yay thank you for doing this! And thank you for adding `max_na_rm` as a function too! A few little thoughts:
- Do you think it's worth having a `state` param that works the same way as in the other `calc_aggregate_counts`? Obviously extremely easy to get the `state = F` version from the output of this, so not sure if it's worth it.
- Is there a reason you called the web grouping `Prison` instead of `State`? (thinking about federal prisons)
- Re: MP – I think we have to assign the whole totals to the state web grouping...? Probably whenever MP value > our value (as opposed to just when we don't have a value) to catch Texas, Ohio, etc., I think?
One thing I'm still iffy about is rates (literally never sure about rates). For `State`-`Web.Group` combinations where we have `Residents.Population` and/or `Population.Feb20` for ALL (or basically all) facilities, that seems super straightforward. And for combinations where we don't have ANY population data, that also seems like a straightforward NA situation. But when we have population values for some (but not all) facilities in a grouping, do we use the total numerator for the measure but the partial denominator? Even if that overestimates rates? I honestly don't know what we're doing right now on the website to address this, so this might be a perfect-is-the-enemy-of-good situation.
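To make the partial-denominator question concrete, here's a rough sketch on made-up numbers. The `Residents.Confirmed` column and every value below are hypothetical, and this is not what the website currently does; it only contrasts "total numerator over partial denominator" with "numerator restricted to facilities that have a population value".

```r
library(dplyr)

# Hypothetical grouping: three facilities, one missing its population value.
toy_df <- data.frame(
  State = "A",
  Web.Group = "Prison",
  Residents.Confirmed = c(10, 20, 30),
  Population.Feb20 = c(100, 200, NA),
  stringsAsFactors = FALSE
)

rates <- toy_df %>%
  group_by(State, Web.Group) %>%
  summarise(
    # Total numerator over partial denominator: 60 / 300 (overestimates)
    rate_mixed = sum(Residents.Confirmed, na.rm = TRUE) /
      sum(Population.Feb20, na.rm = TRUE),
    # Numerator limited to facilities with a denominator: 30 / 300
    rate_matched = sum(Residents.Confirmed[!is.na(Population.Feb20)], na.rm = TRUE) /
      sum(Population.Feb20, na.rm = TRUE),
    # Share of facilities with population data, to flag low-coverage groups
    pop_coverage = mean(!is.na(Population.Feb20)),
    .groups = "drop"
  )
```

Carrying something like `pop_coverage` alongside the rate would at least let us decide on a threshold later instead of picking one approach up front.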
    Jurisdiction == "immigration" ~ "ICE",
    Jurisdiction == "federal" ~ "Federal",
    Age == "Juvenile" ~ "Juvenile",
    Jurisdiction == "state" ~ "Prison",
    Jurisdiction == "psychiatric" ~ "Psychiatric",
    Jurisdiction == "county" ~ "County",
The order here means that
- facilities that are `Juvenile` and `immigration` or `federal` would NOT be `Juvenile`, but
- facilities that are `Juvenile` and `county`, `psychiatric`, or `state` WOULD be

There's only one `federal` or `immigration` row this affects right now (population data for `ALL BOP CONTRACT JUVENILES`), so I feel like it's fine the way it is – but maybe worth adding a comment for future selves!
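A quick way to see the ordering effect, as a sketch on hypothetical toy rows (the `case_when()` conditions are copied from the snippet above; the data frame is made up):

```r
library(dplyr)

# Hypothetical rows covering the overlapping cases.
toy_df <- data.frame(
  Jurisdiction = c("federal", "county", "state", "immigration"),
  Age = c("Juvenile", "Juvenile", "Adult", "Juvenile"),
  stringsAsFactors = FALSE
)

toy_df <- toy_df %>%
  mutate(Web.Group = case_when(
    # case_when() returns the first matching branch, so immigration and
    # federal rows are labelled before the Juvenile check is reached.
    Jurisdiction == "immigration" ~ "ICE",
    Jurisdiction == "federal" ~ "Federal",
    Age == "Juvenile" ~ "Juvenile",
    Jurisdiction == "state" ~ "Prison",
    Jurisdiction == "psychiatric" ~ "Psychiatric",
    Jurisdiction == "county" ~ "County"
  ))

# Web.Group is now: "Federal", "Juvenile", "Prison", "ICE"
```

So the federal and immigration juvenile rows land in `Federal` and `ICE`, while the county juvenile row lands in `Juvenile`.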
R/alt_aggregate_counts.R
Outdated
    ))

    fac_long_df <- ucla_df %>%
        filter(State != "Not Available") %>%
Thoughts on leaving these in and just keeping the state as `Not Available`? Means we'd keep both federal / ice facilities that we haven't gotten around to adding yet, but also federal aggregations (e.g. RRCs) where they're cross-state so the actual assigned state is `Not Available`.
I like this idea but I'm not sure, practically, whether this matters. If state = "Not Available", would the numbers show up anywhere on the website? I don't think so
just checking, since these are aggregated they wouldn't show up here right? this is only for "facilities"
that's my understanding!
R/alt_aggregate_counts.R
Outdated
    Age == "Juvenile" ~ "Juvenile",
    Jurisdiction == "state" ~ "Prison",
    Jurisdiction == "psychiatric" ~ "Psychiatric",
    Jurisdiction == "cou nty" ~ "County",
misspelling here on "county"
Aside from one tiny typo, this all looks good to me! Thank you so much for dealing with all this craziness and using good variable names to wit :)
This looks wonderful! Made one suggestion, and I still feel like we should keep the `Not Available` state rows, but overall looks great // thank you so much for working through this!!
    pri_df <- state_df %>%
        filter(Web.Group == "Prison") %>%
        full_join(mp_df, by = c("State", "Date", "Measure"))
Do we want the `Web.Group` for everything from MP to be `Prison`? If so, should we assign that here? (or wherever might make more sense) otherwise there are NA `Web.Group`s so that data won't make it onto the website :(
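For reference, a minimal sketch of what assigning it at the join could look like. The toy data frames and the `UCLA.Value` / `MP.Value` columns are hypothetical stand-ins, and assigning `Prison` to every MP row is the assumption being asked about, not a settled decision.

```r
library(dplyr)

# Hypothetical stand-ins for state_df and mp_df.
state_df <- data.frame(
  State = "Texas",
  Date = as.Date("2021-01-01"),
  Measure = "Residents.Confirmed",
  Web.Group = "Prison",
  UCLA.Value = 100,
  stringsAsFactors = FALSE
)
mp_df <- data.frame(
  State = c("Texas", "Ohio"),
  Date = as.Date("2021-01-01"),
  Measure = "Residents.Confirmed",
  MP.Value = c(120, 80),
  stringsAsFactors = FALSE
)

joined_df <- state_df %>%
  filter(Web.Group == "Prison") %>%
  full_join(mp_df, by = c("State", "Date", "Measure"))

# The Ohio row exists only in mp_df, so its Web.Group is NA after the join.
n_missing_before <- sum(is.na(joined_df$Web.Group))

# Every row here is either a filtered Prison row or an MP-only row,
# so the whole column can be set to "Prison" in one step.
pri_df <- joined_df %>%
  mutate(Web.Group = "Prison")
```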
R/read_mpap_pop_data.R
Outdated
    #'
    #' @examples
    #' \dontrun{
    #' read_staff_popfeb20()
I think this should be `read_mpap_pop_data()`.
    @@ -0,0 +1,25 @@
    #' Get Feb 20, 2020 population data for applicable rows in the fac_data
Do we use this function anywhere? // I think it's useful, but curious why it ended up here!
I wanted to add it in case we start using staff data denominators anywhere. Not using it yet.
Yay ty ty lgtm! Added a few last comments, but I don't think they need to preclude merging this. I guess the next step is to update `write_latest_data` // let HO know?
New aggregate function which aggregates data by different jurisdictions. The function logic is very similar to `calc_aggregate_counts` except each state has a count for each jurisdiction. The main part of the code is here. Leaving this up as a draft for now to see if we agree on this categorization and how we want to deal with incorporating MP data.