# Index Counties

To connect the geographic shapes to the Sanborn metadata, each shape in the geographic shapes file (us.json) needs to record which item(s) in the Sanborn data it corresponds to. The Sanborn data is organized as ordered lists (a list of states, each holding a list of counties), so a pair of indices is enough to look up the matching record.

## Load Modules and Files

First, let's load in the modules and files:

In [1]:
import json

In [2]:
with open('../../sanborn-with-fips.json') as f:
    sanborn = json.load(f)

In [3]:
with open('../../us-indexed.json') as f:  # previously us.json
    us = json.load(f)

## Create FIPS to Index Dictionary

Next, loop through the Sanborn data and build a dictionary that maps each FIPS code to the location(s) of its record(s). Each value is a list of dictionaries with 'state' and 'county' keys, which hold the state index and the county index, respectively.

In [33]:
fips2index = dict()

# Map each FIPS code to every (state index, county index) pair that lists it.
for i, state in enumerate(sanborn):
    for j, county in enumerate(state['counties']):
        for code in county['fips']:
            # setdefault creates the list the first time a code is seen
            fips2index.setdefault(code, []).append({'state': i, 'county': j})
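To make the shape of the result concrete, here is a minimal sketch of the same loop on a tiny, made-up slice of the data. The names toy_sanborn and toy_fips2index are hypothetical, the two county records are borrowed from the Hardin County example in this notebook, and collections.defaultdict is swapped in as one alternative way to create the empty lists:

```python
from collections import defaultdict

# Hypothetical miniature of the Sanborn structure (not the real file):
# a list of states, each holding a list of counties with FIPS codes.
toy_sanborn = [
    {'state': 'Iowa', 'counties': [
        {'county': 'Hardin and Franklin Counties', 'fips': [19083, 19069]},
        {'county': 'Hardin County', 'fips': [19083]},
    ]},
]

# defaultdict(list) creates the empty list on first access to a new code.
toy_fips2index = defaultdict(list)
for i, state in enumerate(toy_sanborn):
    for j, county in enumerate(state['counties']):
        for code in county['fips']:
            toy_fips2index[code].append({'state': i, 'county': j})

print(dict(toy_fips2index))
```

In this toy data, the code 19083 ends up with two entries because it is listed under both county records.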

In [34]:
fips2index

{1067: [{'state': 0, 'county': 0}],
 1055: [{'state': 0, 'county': 1}],
 1123: [{'state': 0, 'county': 2}, {'state': 0, 'county': 51}],
 1107: [{'state': 0, 'county': 3}],
 1015: [{'state': 0, 'county': 4}],
 1083: [{'state': 0, 'county': 5}],
 1081: [{'state': 0, 'county': 6}, {'state': 0, 'county': 43}],
 1073: [{'state': 0, 'county': 7}],
 1053: [{'state': 0, 'county': 8}],
 1071: [{'state': 0, 'county': 9}],
 1109: [{'state': 0, 'county': 10}],
 1117: [{'state': 0, 'county': 11}],
 1131: [{'state': 0, 'county': 12}],
 1097: [{'state': 0, 'county': 13}],
 1021: [{'state': 0, 'county': 14}],
 1005: [{'state': 0, 'county': 15}],
 1069: [{'state': 0, 'county': 16}],
 1043: [{'state': 0, 'county': 17}],
 1103: [{'state': 0, 'county': 18}],
 1091: [{'state': 0, 'county': 19}],
 1031: [{'state': 0, 'county': 20}],
 1063: [{'state': 0, 'county': 21}],
 1035: [{'state': 0, 'county': 22}],
 1039: [{'state': 0, 'county': 23}],
 1077: [{'state': 0, 'county': 24}],
 1003: [{'state': 0, 'county'

Note: Some FIPS codes appear more than once in the Sanborn data because some cities span multiple counties. For example, the code 19083 is Hardin County, Iowa. In the Sanborn data, this county appears once as Hardin County and a second time as Hardin and Franklin Counties:

In [9]:
for state in sanborn:
    for county in state['counties']:
        if 19083 in county['fips']:
            print(county['county'], county['fips'], state['state'])

Hardin and Franklin Counties [19083, 19069] Iowa
Hardin County [19083] Iowa
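One way to see how common this is would be to filter the dictionary for codes with more than one entry. The sketch below assumes a small sample copied from the fips2index output shown earlier; sample_fips2index and shared are hypothetical names, and on the real data you would iterate over fips2index instead:

```python
# Hypothetical sample: three entries copied from the fips2index output above.
sample_fips2index = {
    1067: [{'state': 0, 'county': 0}],
    1123: [{'state': 0, 'county': 2}, {'state': 0, 'county': 51}],
    1081: [{'state': 0, 'county': 6}, {'state': 0, 'county': 43}],
}

# Keep only the codes that point at more than one Sanborn record.
shared = {code: records
          for code, records in sample_fips2index.items()
          if len(records) > 1}

print(sorted(shared))
```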


## Connect to Geographic Shapes

Now, we need to loop through each county in us, check its FIPS code, and attach the matching list of Sanborn indices.

Let's first take a look at how the us data is structured:

In [10]:
us

{'type': 'Topology',
 'objects': {'counties': {'type': 'GeometryCollection',
   'bbox': [-179.1473399999999,
    17.67439566600018,
    179.7784800000003,
    71.38921046500008],
   'geometries': [{'type': 'MultiPolygon',
     'id': 53073,
     'arcs': [[[0, 1, 2]]],
     'properties': {'index': [{'state': 47, 'county': 11}], 'count': 24}},
    {'type': 'Polygon',
     'id': 30105,
     'arcs': [[3, 4, 5, 6, 7, 8]],
     'properties': {'index': [{'state': 26, 'county': 38}], 'count': 13}},
    {'type': 'Polygon',
     'id': 30029,
     'arcs': [[9, 10, 11, 12, 13, 14, 15, 16, 17, 18]],
     'properties': {'index': [{'state': 26, 'county': 8}], 'count': 11}},
    {'type': 'Polygon',
     'id': 16021,
     'arcs': [[19, 20, 21, 22]],
     'properties': {'index': [{'state': 12, 'county': 10}], 'count': 3}},
    {'type': 'Polygon',
     'id': 30071,
     'arcs': [[-8, 23, 24, 25, 26, 27]],
     'properties': {'index': [{'state': 26, 'county': 13}], 'count': 8}},
    {'type': 'Polygon',
   

We can see that the county shapes are nested under objects['counties']. To access each county, we go into the list of geometries, and the FIPS code is stored in each geometry's 'id' element.

In [4]:
us['objects']['counties']['geometries'][0]['id']

53073

Then, we can add the corresponding list of index dictionaries to the county's properties. Since this file is in [TopoJSON](https://github.com/topojson/topojson/wiki) format, this information must be added under the properties key. Otherwise, the TopoJSON reader will not understand the file.

In [39]:
for countyBounds in us['objects']['counties']['geometries']:
    # Reset properties (any existing keys are discarded), then attach the index list.
    # Counties with no Sanborn record get an empty list.
    countyBounds['properties'] = {'index': fips2index.get(countyBounds['id'], [])}
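Counties without a Sanborn record end up with an empty index list, so it can be useful to count them. Here is a small sketch on made-up data: toy_fips2index and toy_geometries are hypothetical, the first geometry reuses the values shown in the output above, and the id 99999 is invented to stand in for an unmatched county:

```python
# Hypothetical lookup with a single known code.
toy_fips2index = {53073: [{'state': 47, 'county': 11}]}

# Hypothetical geometries; 99999 is an invented id with no Sanborn match.
toy_geometries = [
    {'type': 'MultiPolygon', 'id': 53073, 'arcs': [[[0, 1, 2]]]},
    {'type': 'Polygon', 'id': 99999, 'arcs': [[3, 4]]},
]

# Attach the index list under 'properties', defaulting to an empty list.
for geometry in toy_geometries:
    geometry['properties'] = {'index': toy_fips2index.get(geometry['id'], [])}

# Collect the ids that did not match any Sanborn record.
unmatched = [g['id'] for g in toy_geometries if not g['properties']['index']]
print(unmatched)
```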

We'll do one more check to make sure the first county now carries its index list, and it does:

In [40]:
us['objects']['counties']['geometries'][0]

{'type': 'MultiPolygon',
 'id': 53073,
 'arcs': [[[0, 1, 2]]],
 'properties': {'index': [{'state': 47, 'county': 11}]}}
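The stored indices can then be followed back into the Sanborn data. The helper below is a hypothetical sketch (lookup_records is not part of the original notebook), run on toy data modeled on the Hardin County example rather than on the real files:

```python
def lookup_records(geometry, sanborn_data):
    """Return (state name, county name) pairs for a geometry's index entries."""
    pairs = []
    for entry in geometry['properties'].get('index', []):
        state = sanborn_data[entry['state']]
        county = state['counties'][entry['county']]
        pairs.append((state['state'], county['county']))
    return pairs

# Hypothetical miniature of the Sanborn data.
toy_sanborn = [{'state': 'Iowa', 'counties': [
    {'county': 'Hardin and Franklin Counties', 'fips': [19083, 19069]},
    {'county': 'Hardin County', 'fips': [19083]},
]}]

# Hypothetical geometry; the arcs value is a placeholder.
toy_geometry = {
    'type': 'Polygon',
    'id': 19083,
    'arcs': [[0]],
    'properties': {'index': [{'state': 0, 'county': 0},
                             {'state': 0, 'county': 1}]},
}

print(lookup_records(toy_geometry, toy_sanborn))
```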

And finally, we'll write our data out into a file.

In [41]:
with open('us-indexed.json', 'w') as f:
    json.dump(us, f)
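As an optional sanity check, the written file can be read back and compared with the in-memory data. The sketch below does this on a toy topology in a temporary directory so that it does not touch the real output; toy_us and round_trip_ok are hypothetical names:

```python
import json
import os
import tempfile

# Hypothetical miniature of the indexed topology.
toy_us = {
    'type': 'Topology',
    'objects': {'counties': {'type': 'GeometryCollection', 'geometries': [
        {'type': 'MultiPolygon', 'id': 53073, 'arcs': [[[0, 1, 2]]],
         'properties': {'index': [{'state': 47, 'county': 11}]}},
    ]}},
}

# Write to a temporary directory, then read the file back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'us-indexed.json')
    with open(path, 'w') as f:
        json.dump(toy_us, f)
    with open(path) as f:
        reloaded = json.load(f)

# The data survives the round trip if both structures compare equal.
round_trip_ok = reloaded == toy_us
print(round_trip_ok)
```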