# Download Census Tracts

In this notebook we download the census tract shapefiles for each state, then merge them into a single shapefile with ogr2ogr. We run this process for both the 2000 and 2010 census tract definitions to create two "master" census tract shapefiles that are used in later scripts.

This notebook uses `boundary_shapefiles/counties/tl_2010_us_county10.shp` as an input and creates `boundary_shapefiles/census_tracts/tl_2010_all_tract00.shp` and `boundary_shapefiles/census_tracts/tl_2010_all_tract10.shp`.

In [1]:
import os
import subprocess
import fiona

In [2]:
# create the directories that we will use in this notebook
dirs = [
    "boundary_shapefiles/census_tracts/zips/",
    "boundary_shapefiles/census_tracts/single_2000/",
    "boundary_shapefiles/census_tracts/single_2010/"
]
for path in dirs:
    if not os.path.exists(path):
        os.makedirs(path)

In [4]:
# use the 2010 county boundary shapefile to collect the 2-digit state codes
# (we need these to build the per-state census tract URLs)
state_ids = set()
with fiona.open("boundary_shapefiles/counties/tl_2010_us_county10.shp", "r") as f:
    for shape in f:
        state_ids.add(shape["properties"]["STATEFP10"])
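The county shapefile holds one record per county, so each state code appears many times and the set collapses them to one entry per state. A minimal sketch of that step on in-memory data follows. The `records` list and the name `unique_codes` are hypothetical stand-ins for illustration; real records read by fiona carry many more properties.

```python
# Hypothetical stand-ins for county records; only STATEFP10 matters here.
records = [
    {"properties": {"STATEFP10": "01"}},
    {"properties": {"STATEFP10": "01"}},
    {"properties": {"STATEFP10": "02"}},
]

# A set comprehension removes the repeated codes; sorting is optional and
# only makes the order of later downloads reproducible between runs.
unique_codes = sorted({rec["properties"]["STATEFP10"] for rec in records})
print(unique_codes)
```

Sets have no stable iteration order, so sorting is one way to make the later per-state loops run in the same order each time.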

## Download/Merge 2010 Tracts

Download and merge all the 2010 tract definitions.

In [5]:
# download per-state census tract files
for state_id in state_ids:
    subprocess.call([
        "wget",
        "--directory-prefix=boundary_shapefiles/census_tracts/zips",
        "https://www2.census.gov/geo/tiger/TIGER2010/TRACT/2010/tl_2010_%s_tract10.zip" % (state_id)
    ])

In [6]:
# make sure that each file was actually downloaded (checking the return code from wget would be an alternative)
for state_id in state_ids:
    assert os.path.exists("boundary_shapefiles/census_tracts/zips/tl_2010_%s_tract10.zip" % (state_id))
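The existence check above catches missing files, but not a command that exited with an error. A minimal sketch of checking the exit status instead, assuming Python 3.5+ for `subprocess.run`; the helper name `run_checked` is hypothetical and not part of this notebook.

```python
import subprocess
import sys

def run_checked(cmd):
    """Run a command and raise RuntimeError if it exits with a nonzero status."""
    result = subprocess.run(cmd)
    if result.returncode != 0:
        raise RuntimeError(
            "command failed with exit code %d: %s" % (result.returncode, " ".join(cmd))
        )
    return result.returncode

# Demonstrated with the Python interpreter so the sketch runs without network
# access; in this notebook, cmd would be the wget argument list used above.
run_checked([sys.executable, "-c", "pass"])
```

The same helper could wrap the `unzip` and `ogr2ogr` calls, since all three are invoked as argument lists.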

In [7]:
# unzip each downloaded file
for state_id in state_ids:
    subprocess.call([
        "unzip",
        os.path.join("boundary_shapefiles/census_tracts/zips/", "tl_2010_%s_tract10.zip" % (state_id)),
        "-d", "boundary_shapefiles/census_tracts/single_2010/"
    ])

In [8]:
# use ogr2ogr to merge all of the individual shapefiles into one layer
# (no reprojection is done: I'm pretty sure that all of these are in the epsg:4269 coordinate system)
for state_id in state_ids:
    subprocess.call([
        "ogr2ogr",
        "-f", "ESRI Shapefile",
        "-update",
        "-append",
        "boundary_shapefiles/census_tracts",
        os.path.join("boundary_shapefiles/census_tracts/single_2010/", "tl_2010_%s_tract10.shp" % (state_id)),
        "-nln", "tl_2010_all_tract10"
    ])
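Because `-append` adds features to a layer that already exists, running the merge cell a second time would add every tract again. A minimal sketch of clearing a previous merged output first; the helper name `remove_merged_layer` is hypothetical, and the sketch assumes the merged layer's files all share the layer name as their stem.

```python
import glob
import os

def remove_merged_layer(directory, layer_name):
    """Delete every file of one shapefile layer (.shp, .shx, .dbf, .prj, ...)."""
    removed = []
    # Match only files whose stem is exactly the layer name.
    for path in glob.glob(os.path.join(directory, layer_name + ".*")):
        os.remove(path)
        removed.append(path)
    return sorted(removed)
```

Calling `remove_merged_layer("boundary_shapefiles/census_tracts", "tl_2010_all_tract10")` before the merge loop would leave the per-state files in `single_2010/` and the other merged layer untouched.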

## Download/Merge 2000 Tracts

Repeat the above steps for the 2000 tract definitions.
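The 2000 and 2010 steps differ only in the year directory and the `tract00`/`tract10` suffix. A minimal sketch of deriving every name from the year, so that one loop could serve both vintages; the helper `tract_paths` and the constants are hypothetical, and `"01"` is just an example state code.

```python
import os

BASE_URL = "https://www2.census.gov/geo/tiger/TIGER2010/TRACT"
TRACT_DIR = "boundary_shapefiles/census_tracts"

def tract_paths(state_id, year):
    """Return the URL, zip path, shapefile path, and merged layer name
    for one state and one tract vintage (2000 or 2010)."""
    suffix = "tract%s" % str(year)[-2:]  # "tract00" or "tract10"
    stem = "tl_2010_%s_%s" % (state_id, suffix)
    return {
        "url": "%s/%d/%s.zip" % (BASE_URL, year, stem),
        "zip": os.path.join(TRACT_DIR, "zips", stem + ".zip"),
        "shp": os.path.join(TRACT_DIR, "single_%d" % year, stem + ".shp"),
        "layer": "tl_2010_all_%s" % suffix,
    }

print(tract_paths("01", 2000)["url"])
```

The file names keep the `tl_2010_` prefix for both vintages, matching the URLs used in this notebook.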

In [9]:
# download per-state census tract files (2000 definitions)
for state_id in state_ids:
    subprocess.call([
        "wget",
        "--directory-prefix=boundary_shapefiles/census_tracts/zips",
        "https://www2.census.gov/geo/tiger/TIGER2010/TRACT/2000/tl_2010_%s_tract00.zip" % (state_id)
    ])

In [10]:
# make sure that each file was actually downloaded
for state_id in state_ids:
    assert os.path.exists("boundary_shapefiles/census_tracts/zips/tl_2010_%s_tract00.zip" % (state_id))

In [11]:
# unzip each downloaded file
for state_id in state_ids:
    subprocess.call([
        "unzip",
        os.path.join("boundary_shapefiles/census_tracts/zips/", "tl_2010_%s_tract00.zip" % (state_id)),
        "-d", "boundary_shapefiles/census_tracts/single_2000/"
    ])

In [12]:
# use ogr2ogr to merge all of the individual shapefiles into one layer
for state_id in state_ids:
    subprocess.call([
        "ogr2ogr",
        "-f", "ESRI Shapefile",
        "-update",
        "-append",
        "boundary_shapefiles/census_tracts",
        os.path.join("boundary_shapefiles/census_tracts/single_2000/", "tl_2010_%s_tract00.shp" % (state_id)),
        "-nln", "tl_2010_all_tract00"
    ])