# Data Integration for Austin Granular Model

## Data categories and sources

### Census block group population data by age

See `epimodels/notebooks/AustinGranularModel/CBG/TRAVISCO_TX_CBG_age_populations.r` for details on data download.

### Census block group geometries by year

See `epimodels/notebooks/AustinGranularModel/CBG/TRAVISCO_TX_CBG_age_populations.r` for details on data download.

### Census block group level private school enrollment

See `epimodels/notebooks/AustinGranularModel/Schools/enrollment_by_district.r` for details on data download.

### AISD school attendance boundaries

- 2020-21 downloaded from AISD website
- 2019-20 and 2018-19 obtained through open records request to AISD

### AISD school calendars



### Baseline contact rates

Derived from Mossong J, Hens N, Jit M, Beutels P, Auranen K, Mikolajczyk R, et al. (2008) Social Contacts and Mixing Patterns Relevant to the Spread of Infectious Diseases. PLoS Med 5(3): e74. https://doi.org/10.1371/journal.pmed.0050074

### Mobility patterns

From Safegraph/Kelly Gaither

## Core assumptions

1. Population is uniformly distributed within each census block group and school attendance boundary (commercial and other non-residential areas are not accounted for).
2. The same percentage of children in each CBG are enrolled in private school across all age groups (the percent of elementary school aged students enrolled in private elementary schools is the same as the percent of middle school aged students enrolled in private middle schools, etc.).
3. Safegraph travel for people 13 and older reflects travel patterns across all ages.
4. School attendance transfers have a negligible impact on travel and contact and can be disregarded.

## Integration workflow

### 1. Subtract private school students from each census block group's child population

1. Multiply the child population in each census block group by the percent of children enrolled in public school in that census block group. The private vs. public enrollment data has only a coarse age breakdown, so the percentage is assumed to be constant across all age groups.
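
The adjustment above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual code: the CBG identifiers, age bands, counts, and the `public_school_population` helper are all hypothetical.

```python
# Hypothetical inputs: child population by CBG and age band, plus each CBG's
# private school enrollment fraction. All names and values are made up.
child_population = {
    "CBG_A": {"5-10": 240.0, "11-13": 90.0, "14-17": 70.0},
    "CBG_B": {"5-10": 100.0, "11-13": 80.0, "14-17": 70.0},
}
private_fraction = {"CBG_A": 0.10, "CBG_B": 0.20}


def public_school_population(by_age, private_frac):
    """Scale every age band by the CBG's public school share.

    The same fraction is applied to all age bands, mirroring the assumption
    that private enrollment is constant across age groups within a CBG.
    """
    if not 0.0 <= private_frac <= 1.0:
        raise ValueError("private_frac must be between 0 and 1")
    public_frac = 1.0 - private_frac
    return {age: count * public_frac for age, count in by_age.items()}


public_population = {
    cbg: public_school_population(by_age, private_fraction[cbg])
    for cbg, by_age in child_population.items()
}
```

Multiplying by the public share is equivalent to subtracting the private school students, which is why this step's heading and its instruction describe the same operation.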

### 2. Trim census block groups to AISD boundary

1. Intersect the census block group shapefile with the AISD boundary shapefile
2. For each census block group, calculate the fraction of its area that falls inside the AISD boundary.
3. If that fraction is less than 100% (for CBGs on the edge of the district), multiply the CBG population by the fraction.
4. Save a trimmed census block group shapefile and an adjusted population dataset.
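
As a rough sketch of the population adjustment in steps 2 and 3, assuming the areas have already been computed from the intersected shapefiles (the geometric intersection itself is not shown), the scaling could look like the following. The `trim_population` helper and all numbers are hypothetical.

```python
def trim_population(population, cbg_area, area_inside_district):
    """Scale a CBG population by the share of its area inside the district.

    Relies on the uniform-population assumption: people are spread evenly
    over the CBG, so the population share equals the area share.
    """
    if cbg_area <= 0:
        raise ValueError("cbg_area must be positive")
    # Clamp to [0, 1] to guard against small geometric rounding errors.
    overlap = max(0.0, min(area_inside_district / cbg_area, 1.0))
    return population * overlap


# Hypothetical CBGs: (population, total area, area inside the AISD boundary).
# Areas can be in any unit as long as both use the same one.
cbgs = {
    "interior": (1200.0, 2.0, 2.0),  # fully inside: population unchanged
    "edge": (800.0, 4.0, 1.0),       # one quarter inside: population * 0.25
}

adjusted_population = {
    name: trim_population(pop, area, inside)
    for name, (pop, area, inside) in cbgs.items()
}
```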

### 3. Students attending school by home census block group (trimmed)

1. Calculate the area of intersection between each census block group and every school attendance boundary it intersects (typically one elementary school, one middle school, and one high school).
2. Calculate the percentage of each census block group's area that is assigned to each school.
3. Stratify census block group population data by age.
    - ages 5-10: elementary school
    - ages 11-13: middle school
    - ages 14-17: high school
4. Calculate the number of students from each CBG$_{i}$ attending school$_{j}$ at each school level $k$ as

    $$\text{attendance}_{ijk} = \frac{\text{areal overlap}_{ij}}{\text{area CBG}_{i}} \times \text{population CBG}_{ik}$$
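
A small worked example of the attendance calculation may help. It is only a sketch under stated assumptions: the single-year-of-age counts, school names, overlap areas, and helper functions are hypothetical, and the overlap areas are assumed to be precomputed.

```python
# Age ranges for each school level, as listed in step 3.
SCHOOL_LEVEL_AGES = {
    "elementary": range(5, 11),  # ages 5-10
    "middle": range(11, 14),     # ages 11-13
    "high": range(14, 18),       # ages 14-17
}


def population_by_level(population_by_age):
    """Sum single-year-of-age counts into school-level totals."""
    return {
        level: sum(population_by_age.get(age, 0.0) for age in ages)
        for level, ages in SCHOOL_LEVEL_AGES.items()
    }


def attendance(overlap_area, cbg_area, level_population):
    """Students from one CBG at one school: areal share times level population."""
    return overlap_area / cbg_area * level_population


# Hypothetical CBG with area 10 and 50 children at every age from 5 to 17.
cbg_area = 10.0
levels = population_by_level({age: 50.0 for age in range(5, 18)})

# Hypothetical overlaps: the CBG is split between two elementary zones.
overlaps = {("E1", "elementary"): 6.0, ("E2", "elementary"): 4.0}

students = {
    (school, level): attendance(area, cbg_area, levels[level])
    for (school, level), area in overlaps.items()
}
```

Here the elementary population is 300, so the 60/40 split in area sends roughly 180 students to `E1` and 120 to `E2`.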
    
### 4. Integrate mobility data

1. Un-pivot the CBG visits matrix to a long-form table with source and destination columns.
2. Group mobility data by source and calculate the percentage of each source CBG traveling to each destination CBG.
3. For weekends/non-school-days, multiply each source CBG's population (all age groups) by these percentages to estimate travel to each destination CBG.
4. For weekdays/school days, apply these percentages to the source CBG populations for ages 18 and up only.
5. For weekdays/school days, append the student attendance data calculated in section 3 to account for the school age groups.
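
The mobility steps can be illustrated with plain Python data structures. This is a sketch, not the pipeline's implementation: the visit counts, populations, student rows, and function names are all hypothetical, and the real visits matrix may be stored differently.

```python
from collections import defaultdict

# Hypothetical wide-form visits matrix: visits[source][destination] = count.
visits = {"A": {"A": 60, "B": 40}, "B": {"A": 10, "B": 30}}


def unpivot(matrix):
    """Convert the wide matrix to long-form (source, destination, visits) rows."""
    return [(src, dst, n) for src, row in matrix.items() for dst, n in row.items()]


def travel_shares(long_rows):
    """Share of each source CBG's visits going to each destination CBG."""
    totals = defaultdict(float)
    for src, _dst, n in long_rows:
        totals[src] += n
    return {
        (src, dst): n / totals[src]
        for src, dst, n in long_rows
        if totals[src] > 0
    }


def flows(shares, population):
    """Multiply each source CBG's population by its travel shares."""
    return {(src, dst): share * population[src] for (src, dst), share in shares.items()}


shares = travel_shares(unpivot(visits))

# Weekends/non-school days: apply the shares to all age groups.
weekend = flows(shares, {"A": 1000.0, "B": 400.0})

# School days: apply the shares to adults (18+) only, then append the
# student attendance rows (hypothetical values) for the school age groups.
weekday_adults = flows(shares, {"A": 700.0, "B": 300.0})
student_rows = [("A", "E1", "elementary", 180.0), ("A", "E2", "elementary", 120.0)]
weekday = [(src, dst, "adult", n) for (src, dst), n in weekday_adults.items()]
weekday += student_rows
```

Keeping the adult flows and the student rows in one long-form list, distinguished by a group label, is one simple way to represent the "append" in step 5; other layouts would work equally well.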
