Redistricting analytics data & shared code
To get the data locally, clone the repository:
$ git clone https://github.com/alecramsay/rdabase
$ cd rdabase
To use the shared code, install the package:
$ pip install rdabase
The data are stored by state in the data directory.
These pages describe each dataset:
- Data: Census and election data by precinct.
- Shapes: Shape properties by precinct.
- Graph: Precinct adjacency graph.
At present, data for fifteen states have been extracted. In the future, we may extract data for other states.
Some shared code and scripts are described here:
- Shared Code: Common code used in multiple applications.
- Scripts: Scripts to re-format the data for specific applications.
The data come from the following sources:
- The total census population & VAP demographics data come from the 2020_census_XX-N.csv files in the DRA vtd_data GitHub repository, where XX is the state abbreviation and N is the version suffix. We take the latest version of the data, which is the one with the highest N.
- The election data come from the 2020_election_XX-N.csv files in the same repo.
- The shapes are copies of tl_2020_FF_vtd20.zip from the Census Bureau, where FF is the state FIPS code, e.g., 37 for North Carolina.
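The file-naming conventions above can be sketched in a few lines of Python. This is only an illustration, not code from rdabase: the helper names `latest_versioned_file` and `vtd_shapefile_name` are hypothetical, and the sketch assumes the suffix N is a plain integer, so that it is compared numerically rather than as text.

```python
import re
from typing import Iterable, Optional


def latest_versioned_file(
    filenames: Iterable[str], prefix: str, state: str
) -> Optional[str]:
    """Return the file named '<prefix>_<state>-N.csv' with the highest N.

    Hypothetical helper. Assumes N is an integer; returns None if no
    filename matches the pattern.
    """
    pattern = re.compile(
        rf"^{re.escape(prefix)}_{re.escape(state)}-(\d+)\.csv$"
    )
    best_name: Optional[str] = None
    best_n = -1
    for name in filenames:
        match = pattern.match(name)
        if match is None:
            continue
        n = int(match.group(1))  # numeric comparison, so 10 beats 2
        if n > best_n:
            best_name, best_n = name, n
    return best_name


def vtd_shapefile_name(fips: str) -> str:
    """Build the Census Bureau VTD shapefile name for a state FIPS code."""
    return f"tl_2020_{fips}_vtd20.zip"


# Made-up filenames, for illustration only.
names = [
    "2020_census_NC-2.csv",
    "2020_census_NC-10.csv",
    "2020_election_NC-3.csv",
]
print(latest_versioned_file(names, "2020_census", "NC"))
print(vtd_shapefile_name("37"))
```

Comparing N as an integer matters once suffixes reach two digits, because a plain string sort would rank `-2` above `-10`.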
Some things to be aware of:
- Where adjusted population data exist, we use them instead of the official 2020 census total population data.
- For Florida, the official VTDs from the Census Bureau are bad. We used DRA's corrected precinct shapes (GeoJSON), removed the intersections, and then converted them to a shapefile.
- We simplify the precinct shapes (see extract_shape_data.py) to approximate the simplification that DRA does, so that compactness measurements align.
To run the tests:
$ pytest