Is there any way to download multiple year data at one time #131

Closed
waholulu opened this issue Nov 12, 2018 · 5 comments

Comments

@waholulu

I want to plot how statistics change over the years. But when I set the year argument to a sequence of years, I get an error like
Error in is.url(url) : length(url) == 1 is not TRUE

How can I solve this issue? Thanks.

@mfherman
Collaborator

One approach to get multiple years is to loop over a set of years with the same get_acs() call. I like using the map() family of functions from the purrr package for this.

Note, this approach is similar to the discussion in #121 and #129, in which you can loop over multiple county/state combinations. The main difference is that I loop over a named list of years. This allows me to use the names of the list as the id variable when I bind the rows into one data frame.

library(tidyverse)
library(tidycensus)

years <- lst(2015, 2016, 2017)

map_dfr(
  years,
  ~ get_acs(
    geography = "county",
    variables = "B01003_001",
    state = "ME",
    year = .x,
    survey = "acs1"
    ),
  .id = "year"
  )
#> Getting data from the 2015 1-year ACS
#> The one-year ACS provides data for geographies with populations of 65,000 and greater.
#> Getting data from the 2016 1-year ACS
#> The one-year ACS provides data for geographies with populations of 65,000 and greater.
#> Getting data from the 2017 1-year ACS
#> The one-year ACS provides data for geographies with populations of 65,000 and greater.
#> # A tibble: 18 x 6
#>    year  GEOID NAME                       variable   estimate   moe
#>    <chr> <chr> <chr>                      <chr>         <dbl> <dbl>
#>  1 2015  23001 Androscoggin County, Maine B01003_001   107233    NA
#>  2 2015  23003 Aroostook County, Maine    B01003_001    68628    NA
#>  3 2015  23005 Cumberland County, Maine   B01003_001   289977    NA
#>  4 2015  23011 Kennebec County, Maine     B01003_001   119980    NA
#>  5 2015  23019 Penobscot County, Maine    B01003_001   152692    NA
#>  6 2015  23031 York County, Maine         B01003_001   201169    NA
#>  7 2016  23001 Androscoggin County, Maine B01003_001   107319    NA
#>  8 2016  23003 Aroostook County, Maine    B01003_001    67959    NA
#>  9 2016  23005 Cumberland County, Maine   B01003_001   292041    NA
#> 10 2016  23011 Kennebec County, Maine     B01003_001   120569    NA
#> 11 2016  23019 Penobscot County, Maine    B01003_001   151806    NA
#> 12 2016  23031 York County, Maine         B01003_001   202343    NA
#> 13 2017  23001 Androscoggin County, Maine B01003_001   107651    NA
#> 14 2017  23003 Aroostook County, Maine    B01003_001    67653    NA
#> 15 2017  23005 Cumberland County, Maine   B01003_001   292500    NA
#> 16 2017  23011 Kennebec County, Maine     B01003_001   121821    NA
#> 17 2017  23019 Penobscot County, Maine    B01003_001   151957    NA
#> 18 2017  23031 York County, Maine         B01003_001   204191    NA

Created on 2018-11-12 by the reprex package (v0.2.1)
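
Note that the year column created by .id is character (<chr> in the output above). For plotting change over time, it may help to convert it to an integer. A minimal sketch, assuming a stand-in function fake_acs() (hypothetical, with made-up estimates) in place of get_acs() so that it runs without a Census API key:

```r
library(purrr)
library(tibble)
library(dplyr)

# lst() names each element after its expression: "2015", "2016", "2017"
years <- lst(2015, 2016, 2017)

# Hypothetical stand-in for get_acs(); returns one made-up row per call
fake_acs <- function(year) {
  tibble(GEOID = "23001", estimate = 100000 + year)
}

# Bind the per-year results, then turn the character id column into an integer
multi_year <- map_dfr(years, ~ fake_acs(.x), .id = "year") %>%
  mutate(year = as.integer(year))

multi_year
```

The same mutate() step should apply unchanged to the result of the real get_acs() loop.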

@waholulu
Author

Thanks @mfherman
But this method drops the geometry from the dataset because the data are forced to merge. I need to plot this data, so I have to keep the geometry. Is there any other option?

@mfherman
Collaborator

You have to do a couple extra steps to get the geometry because bind_rows() currently doesn't work on sf objects (see r-spatial/mapedit#46).

The basic approach is the same, but instead of map_dfr() I use map() to keep the results in a list, then add the id column for each year and finally combine them using rbind().

library(tidyverse)
library(tidycensus)

years <- lst(2015, 2016, 2017)

multi_year <- map(
  years,
  ~ get_acs(
    geography = "county",
    variables = "B01003_001",
    state = "ME",
    year = .x,
    survey = "acs1",
    geometry = TRUE
    )
  ) %>% 
  map2(years, ~ mutate(.x, id = .y))

reduce(multi_year, rbind)
#> Simple feature collection with 18 features and 6 fields
#> geometry type:  MULTIPOLYGON
#> dimension:      XY
#> bbox:           xmin: -70.98904 ymin: 42.97776 xmax: -67.75042 ymax: 47.45969
#> epsg (SRID):    4269
#> proj4string:    +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
#> First 10 features:
#>    GEOID                       NAME   variable estimate moe   id
#> 1  23001 Androscoggin County, Maine B01003_001   107233  NA 2015
#> 2  23003    Aroostook County, Maine B01003_001    68628  NA 2015
#> 3  23005   Cumberland County, Maine B01003_001   289977  NA 2015
#> 4  23011     Kennebec County, Maine B01003_001   119980  NA 2015
#> 5  23019    Penobscot County, Maine B01003_001   152692  NA 2015
#> 6  23031         York County, Maine B01003_001   201169  NA 2015
#> 7  23001 Androscoggin County, Maine B01003_001   107319  NA 2016
#> 8  23003    Aroostook County, Maine B01003_001    67959  NA 2016
#> 9  23005   Cumberland County, Maine B01003_001   292041  NA 2016
#> 10 23011     Kennebec County, Maine B01003_001   120569  NA 2016
#>                          geometry
#> 1  MULTIPOLYGON (((-70.48529 4...
#> 2  MULTIPOLYGON (((-70.01975 4...
#> 3  MULTIPOLYGON (((-69.94153 4...
#> 4  MULTIPOLYGON (((-70.13259 4...
#> 5  MULTIPOLYGON (((-69.35567 4...
#> 6  MULTIPOLYGON (((-70.61725 4...
#> 7  MULTIPOLYGON (((-70.48529 4...
#> 8  MULTIPOLYGON (((-70.01975 4...
#> 9  MULTIPOLYGON (((-69.94153 4...
#> 10 MULTIPOLYGON (((-70.13259 4...

Created on 2018-11-12 by the reprex package (v0.2.1)

@waholulu
Author

Thanks @mfherman
But it seems something is still keeping me from plotting the data frame.

library(tidyverse)
library(tidycensus)
years <- lst(2013, 2014, 2015)
multi_year <- map(
  years,
  ~ get_acs(geography = "county", variables = "B01003_001", state = "NC", year = .x, geometry = TRUE)
) %>%
  map2(years, ~ mutate(.x, id = .y))
nc <- reduce(multi_year, rbind)
ggplot(nc) + geom_sf(aes(fill = estimate), color = NA) + coord_sf(datum = NA) + facet_wrap(~id)

> ggplot(nc) + geom_sf(aes(fill = estimate), color = NA) +coord_sf(datum = NA)   +  facet_wrap(~id)
Error in st_sfc(NextMethod(), crs = st_crs(old), precision = st_precision(old)) : 
  found multiple dimensions: XY XYZ

@mfherman
Collaborator

Hmm... It's working on my end. Maybe try updating to the latest versions of ggplot2 and sf?
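
If updating does not help, one possible workaround is to drop the Z dimension from each year's geometries with sf::st_zm() before combining them. This assumes the error comes from some years returning XY geometries and others XYZ, which is not confirmed for this case. A minimal sketch with two toy point layers (made-up coordinates) standing in for the per-year get_acs() results:

```r
library(sf)
library(purrr)

# Toy stand-ins for two years of results: one XY layer and one XYZ layer
xy  <- st_sf(id = 2013, geometry = st_sfc(st_point(c(-80, 35)), crs = 4269))
xyz <- st_sf(id = 2014, geometry = st_sfc(st_point(c(-80, 35, 0)), crs = 4269))

# Drop any Z/M dimension from each piece, then combine as before
combined <- list(xy, xyz) %>%
  map(st_zm, drop = TRUE, what = "ZM") %>%
  reduce(rbind)

combined
```

In the code from the question, the equivalent step would be multi_year <- map(multi_year, st_zm) just before reduce(multi_year, rbind).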

Aside from the code, another issue is that you probably shouldn't be comparing overlapping 5-year datasets (see https://www.census.gov/programs-surveys/acs/guidance/comparing-acs-data.html). In the code you posted, you are getting data from 2009-2013, 2010-2014, and 2011-2015. To compare across these years, you would need to use the ACS 1-year estimates. If you want to use 5-year estimates, you could compare say 2007-2011 to 2012-2016.

library(tidyverse)
library(tidycensus)

years <- lst(2013, 2014, 2015)

multi_year <-
  map(
    years,
    ~ get_acs(
      geography = "county",
      variables = "B01003_001",
      state = "NC",
      year = .x,
      geometry = TRUE
      )
  ) %>%
  map2(years, ~ mutate(.x, id = .y))

nc <- reduce(multi_year, rbind)

ggplot(nc) +
  geom_sf(aes(fill = estimate), color = NA) +
  coord_sf(datum = NA) +
  facet_wrap(~ id)


Created on 2018-11-12 by the reprex package (v0.2.1)
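
To check whether two 5-year datasets overlap before comparing them, the arithmetic above can be sketched as follows. The helper names acs5_window() and acs5_overlap() are hypothetical and not part of tidycensus; they only assume that a 5-year ACS labeled with end year y covers y - 4 through y.

```r
# Hypothetical helper: period covered by the 5-year ACS with a given end year
acs5_window <- function(end_year) {
  c(start = end_year - 4, end = end_year)
}

# TRUE if the two 5-year periods share at least one year
acs5_overlap <- function(end_a, end_b) {
  abs(end_a - end_b) < 5
}

acs5_window(2013)         # start 2009, end 2013
acs5_overlap(2013, 2014)  # TRUE: 2009-2013 and 2010-2014 overlap
acs5_overlap(2011, 2016)  # FALSE: 2007-2011 and 2012-2016 do not
```

With non-overlapping end years, the loop above could then use years <- lst(2011, 2016) with survey = "acs5".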
