Could not fetch variables in microdata #292

Closed
AlexLi-design opened this issue Sep 8, 2020 · 4 comments

@AlexLi-design

Thanks for the terrific package for getting the microdata! I hope someone can help me solve my problem. I tried to pull PUMS data with the tidycensus package, but something goes wrong. The setup code runs fine, but when I run the code to fetch the data, I get this error:
Error: Your API call has errors. The API message returned is <title>Error report</title>

HTTP Status 500 -

.
I don't know what is happening. I double-checked the variable names and my code, and everything looks correct.
I've pasted my code below:

rm(list = ls())
install.packages("devtools")
remotes::install_github("walkerke/tidycensus",force=TRUE)
library(tidycensus)
library(tidyverse)
library(dplyr)
census_api_key("my key",install=TRUE,overwrite=TRUE)
readRenviron("~/.Renviron")
pums_vars_2018 <- pums_variables %>%
  filter(year == 2018, survey == "acs5")
View(pums_vars_2018)
var_housing <- c(
  "AGEP", "SEX", "RAC1P", "HISP",
  "HINCP", "SCHL",
  "DDRS", "DEAR", "DEYE", "DOUT", "DPHY",
  "MAR", "NPF", "FES",
  "MV", "WORKSTAT", "HHT", "OCPIP", "GRPIP", "DIVISION", "BLD", "TEN", "VEH", "YBL"
)
ACS <- get_pums(
  variables = var_housing,
  state = "all",
  year = 2018,
  survey = "acs5",
  show_call = TRUE
)

Thanks!

@mfherman
Collaborator

mfherman commented Sep 8, 2020

Hi -- I'm not totally sure why this is failing, but my guess is that the API is having trouble pulling the 15M+ records for all states. If you need PUMS data for all states, you might consider looping through the states one at a time and building in some error handling with tryCatch() or purrr::possibly().

Here is an example with your variables from just one (small!) state:

library(tidycensus)

var_housing <- c(
  "AGEP", "SEX", "RAC1P", "HISP", "HINCP","SCHL",
  "DDRS", "DEAR", "DEYE", "DOUT", "DPHY", "MAR",
  "NPF", "FES", "MV", "WORKSTAT", "HHT", "OCPIP", "GRPIP",
  "DIVISION", "BLD", "TEN", "VEH", "YBL"
  )

pums_data <- get_pums(
  variables = var_housing,
  state = "VT",
  year = 2018,
  survey = "acs5"
  )
#> Getting data from the 2014-2018 5-year ACS Public Use Microdata Sample
pums_data
#> # A tibble: 31,883 x 29
#>    SERIALNO SPORDER  WGTP PWGTP  AGEP  HINCP   NPF OCPIP GRPIP DIVISION ST   
#>    <chr>    <chr>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <chr>    <chr>
#>  1 2014000~ 1           5     6    55  39000     1    26     0 1        50   
#>  2 2014000~ 2           5     7    56  39000     1    26     0 1        50   
#>  3 2014000~ 1          11    12    57 100000     2     7     0 1        50   
#>  4 2014000~ 2          11    13    61 100000     2     7     0 1        50   
#>  5 2014000~ 1           8     8    71  23200     1    25     0 1        50   
#>  6 2014000~ 1          12    12    56  57600     2    11     0 1        50   
#>  7 2014000~ 2          12    13    53  57600     2    11     0 1        50   
#>  8 2014000~ 1           5     5    67  25410     2    18     0 1        50   
#>  9 2014000~ 2           5     5    67  25410     2    18     0 1        50   
#> 10 2014000~ 1           7     7    55  94700     2     0    10 1        50   
#> # ... with 31,873 more rows, and 18 more variables: BLD <chr>, TEN <chr>,
#> #   VEH <chr>, YBL <chr>, FES <chr>, HHT <chr>, MV <chr>, WORKSTAT <chr>,
#> #   DDRS <chr>, DEAR <chr>, DEYE <chr>, DOUT <chr>, DPHY <chr>, MAR <chr>,
#> #   SCHL <chr>, SEX <chr>, HISP <chr>, RAC1P <chr>

Created on 2020-09-08 by the reprex package (v0.3.0)
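
For the state-by-state approach mentioned above, here is a rough sketch of what the loop could look like using base R's tryCatch(). Note that `fetch_states()` and its `fetch_one` argument are hypothetical names made up for this illustration, not part of tidycensus; the idea is just that one failed state gets recorded instead of stopping the whole run.

```r
# Minimal sketch (not from the thread): fetch one state at a time and keep going
# when a single request fails. `fetch_one` is a hypothetical function that takes
# a state abbreviation and returns a data frame.
fetch_states <- function(states, fetch_one) {
  results <- lapply(states, function(st) {
    tryCatch(
      fetch_one(st),
      error = function(e) {
        message("Request failed for ", st, ": ", conditionMessage(e))
        NULL  # mark this state as failed
      }
    )
  })
  is_failed <- vapply(results, is.null, logical(1))
  list(
    data = do.call(rbind, results[!is_failed]),  # NULL if every state failed
    failed = states[is_failed]                   # states worth retrying later
  )
}

# Intended usage with tidycensus (needs an API key and network access):
# out <- fetch_states(
#   c("VT", "WY"),
#   function(st) get_pums(variables = var_housing, state = st,
#                         year = 2018, survey = "acs5")
# )
# out$failed  # re-run just these states if anything is listed
```

Returning the failed states separately makes it easy to retry only those, rather than re-requesting everything.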

@mfherman
Collaborator

mfherman commented Sep 8, 2020

Alternatively, you could download the entire PUMS file from the Census FTP:

https://www2.census.gov/programs-surveys/acs/data/pums/2018/5-Year/

Look for the files with the us suffix, such as csv_hus.zip.
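
Once a bulk file is downloaded and unzipped, the CSVs are large, so it may help to read only the columns you need. Here is a minimal base R sketch under that assumption. `read_pums_columns()` is a hypothetical helper written for this example, and the file path in the usage comment is a placeholder, not a confirmed file name.

```r
# Minimal sketch: read only selected columns from a large CSV.
# Columns are read as character so codes with leading zeros are preserved.
read_pums_columns <- function(path, columns) {
  # Read just the header (plus one row) to learn the column names.
  header <- names(read.csv(path, nrows = 1, check.names = FALSE))
  missing <- setdiff(columns, header)
  if (length(missing) > 0) {
    stop("Columns not found in file: ", paste(missing, collapse = ", "))
  }
  # "NULL" tells read.csv() to skip a column entirely.
  col_classes <- ifelse(header %in% columns, "character", "NULL")
  read.csv(path, colClasses = col_classes, check.names = FALSE)
}

# Intended usage (placeholder path):
# housing <- read_pums_columns("path/to/unzipped_housing_file.csv",
#                              c("SERIALNO", "WGTP", "BLD", "TEN", "VEH", "YBL"))
```

You would then convert numeric fields such as weights with as.numeric() as needed.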

@AlexLi-design
Author

AlexLi-design commented Sep 8, 2020 via email

@walkerke
Owner

walkerke commented Sep 9, 2020

@mfherman's solution is the right one. My general advice to tidycensus users has always been to look to other sources if you need bulk Census data pulls, as large requests are going to inevitably put pressure on the API. I'd recommend IPUMS and its companion R package {ipumsr} for these sorts of bulk requests.

@walkerke walkerke closed this as completed Sep 9, 2020