vignettes/articles/geo_context_of_svi.Rmd

---
title: "Geographic Context of SVI"
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

Since SVI is calculated from percentile ranks of census variables, it's important to keep in mind the SVI values may be affected by the total area that we are ranking each geographic unit against. Several options are available when it comes to the total area included in SVI calculations. 

Currently [CDC/ATSDR Social Vulnerability Index (SVI) database](https://www.atsdr.cdc.gov/placeandhealth/svi/data_documentation_download.html) provides nationwide and statewide SVI at the census tract and county level. Nationwide-SVI will obviously include SVI for all the states, but the value may be different from the statewide-SVI for the corresponding state due to the ranking method. For example, nationwide-SVI at the census tract level is obtained by ranking the variables in each census tract among all tracts in the US, while statewide-SVI for PA at the census tract level is obtained by ranking the variables in each tract among all tracts in PA. Despite the likely different SVI values, both versions offers valuable information about the relative social vulnerability of communities within a different geographic context.

Building upon this, in addition to nationwide or statewide SVI calculations for flexible geographic levels (e.g. at the ZCTA level; mentioned in [Introduction to findSVI](findSVI.html)), findSVI also supports SVI calculations for more specific geographic context (below the state level, e.g. county). Similarly, SVI of any geographic level below state will be included in nationwide and statewide SVI, but the values calculated from findSVI may be different because the percentile ranking is done within a different area (different total units are used in ranking).

Here, we'll briefly show the tract-level SVI values for Philadelphia (County), PA on maps using CDC's nationwide and statewide SVI, as well as a county-specific SVI calculated from findSVI.

```{r setup, warning=FALSE, message=FALSE}
library(dplyr)
library(findSVI)
library(tidyr)
library(stringr)
library(sf)
library(tmap)
```


# Nationwide and statewide SVI

Although we could use findSVI for nationwide- and statewide-SVI, it may be more convenient to download the data from [CDC/ATSDR Social Vulnerability Index (SVI) database](https://www.atsdr.cdc.gov/placeandhealth/svi/data_documentation_download.html), especially for nationwide census tract-level data with geometry.

```{r, eval=FALSE}
#source: https://www.atsdr.cdc.gov/placeandhealth/svi/data_documentation_download.html
#choose map data (recently changed to ESTI geodatabase), unzip files

# nationwide
us_svi <- st_read("../../SVI2020_US_tract.gdb/")

# statewide (PA)
pa_svi <- st_read("../../SVI2020_PENNSYLVANIA_tract.gdb/")
# old format: pa_svi <- st_read("pa_ct_2020_shapefile/SVI2020_PENNSYLVANIA_tract.shp")

# alternative: findSVI
# us_data <- get_census_data(2020, 
#   state = "US", #"PA" for statewide data
#   geography = "tract", 
#   geometry = TRUE)
# 
# us_svi <- get_svi(2020, us_data)

```

After obtaining the nationwide and statewide(Pennsylvania) SVI by either findSVI or CDC's database, we could filter the result by the county and only keep the data for Philadelphia.

```{r, echo=FALSE}
load(system.file("extdata", "us_ct_svi_phl_2020_geo.rda", package = "findSVI"))
load(system.file("extdata", "pa_ct_svi_phl_2020_geo.rda", package = "findSVI"))
load(system.file("extdata", "phl_ct_svi_2020_geo.rda", package = "findSVI"))
```

```{r, eval=FALSE}
# nationwide
us_svi_phl <- us_svi %>%
  select(1:7, contains("RPL_THEME")) %>%
  rename(GEOID = FIPS,
    #format switched to ESRI gdb, col name change -7/28/23
    geometry = Shape) %>%
    #CDC use -999 as NAs
  filter(RPL_THEMES >= 0,
    ST_ABBR == "PA",
    COUNTY == "Philadelphia")

glimpse(us_svi_phl)
```
```{r, echo=FALSE}
glimpse(us_svi_phl)
```

```{r, eval=FALSE}
# statewide (PA)
pa_svi_phl <- pa_svi %>%
  select(1:7, contains("RPL_THEME")) %>%
  rename(GEOID = FIPS) %>%
  filter(RPL_THEMES>= 0,
    COUNTY == "Philadelphia")

glimpse(pa_svi_phl)
```
```{r, echo=FALSE}
glimpse(pa_svi_phl)
```

# County-specific SVI

For retrieving data and calculation at any geographic level below state, we need to use `get_census_data()` followed by `get_svi()`. (For nation- or state-level data processing without geometry, the easiest option is one-step `find_svi()`.)

```{r, eval=FALSE}
phl_ct_2020_data <- get_census_data(
  2020, 
  state = "PA",
  county = "Philadelphia",
  geography = "tract",
  geometry = TRUE
)

phl_ct_svi_2020 <- get_svi(2020, phl_ct_2020_data)%>%
  select(GEOID, contains("RPL_theme")) %>%
  drop_na()
```

```{r}
glimpse(phl_ct_svi_2020)

# matching id from nation/statewide SVI for mapping
match_id <- pa_svi_phl$GEOID

phl_svi <- phl_ct_svi_2020 %>% 
  filter(GEOID %in% match_id)
```
# 3 angles of a story

For visualization, we'll create maps on the same color scale (using [tmap](https://r-tmap.github.io/tmap/)).

```{r}
nation <- us_svi_phl %>%
  select(GEOID, geometry, RPL_THEMES) %>%
  drop_na() %>%
  #using NAD83 in tmap
  tm_shape(projection = sf::st_crs(26915))+
  tm_polygons("RPL_THEMES",
    palette = c("orange","navy"),
    style = "cont",
    breaks = c(0, 0.25, 0.5, 0.75, 1),
    title = "SVI (by tract)")+
  tm_layout(title = "Nationwide SVI: \nPhiladelphia subset",
    title.size = 1,
    title.position = c("left", "TOP"),
    legend.position = c("RIGHT", "bottom"),
    legend.title.size = 0.9,
    legend.width = 2)

state <- pa_svi_phl %>%
  select(GEOID, geometry, RPL_THEMES) %>%
  drop_na() %>%
  tm_shape(projection = sf::st_crs(26915))+
  tm_polygons("RPL_THEMES",
    palette = c("orange","navy"),
    style = "cont",
    breaks = c(0, 0.25, 0.5, 0.75, 1),
    title = "SVI (by tract)")+
  tm_layout(title = "Statewide SVI: \nPhiladelphia subset",
    title.size = 1,
    title.position = c("left", "TOP"),
    legend.position = c("RIGHT", "bottom"),
    legend.title.size = 0.9,
    legend.width = 2)

county <- phl_svi %>%
  select(GEOID, geometry, RPL_themes) %>%
  drop_na() %>%
  tm_shape(projection = sf::st_crs(26915))+
  tm_polygons("RPL_themes",
    palette = c("orange","navy"),
    style = "cont",
    breaks = c(0, 0.25, 0.5, 0.75, 1),
    title = "SVI (by tract)")+
  tm_layout(title = "County-specific SVI: \nPhiladelphia",
    title.size = 1,
    title.position = c("left", "TOP"),
    legend.position = c("RIGHT", "bottom"),
    legend.title.size = 0.9,
    legend.width = 2)

plots <- list(nation, state, county)

current.mode <- tmap_mode("plot")

tmap_arrange(
  plots,
  nrow = 1,
  width = c(0.34, 0.33, 0.33)
)

```

As we can see, while the SVI values (actual color) vary among the three maps depending on how the percentile ranking is performed to obtain the SVI, they generally show the same pattern (which tracts are darker/lighter in each map). 

Overall, while total areas included in percentile ranking affect resulting SVI values, they provide different perspectives to look at the same story. On another note, when studying the social vulnerability within a smaller region, especially metropolitan areas/cities, it may be helpful to consider the geographic context in interpreting SVI to better understand the disparities among communities.