# Records Counter

This notebook counts the records in each geographic subset (state and county) of the existing Sanborn Maps dataset and adds those counts to the geographic file. The updated file will then be used to visualize the counts on the map.

First, we need to import the JSON module and load in the two files.

In [1]:
import json

In [2]:
with open('sanborn-with-fips.json') as f:
    sanborn = json.load(f)

with open('us-indexed.json') as f:
    us = json.load(f)

## Data Structure

The us-indexed.json file contains shapes for counties and states, so we'll want to count the number of records in each state as well as in each county from the Sanborn data. To do that, we can create a list of dictionaries that parallels the structure of the Sanborn data: one dictionary per state, holding the state's total count and a list of its per-county counts.

In [9]:
sanborn[0]

{'state': 'Alabama',
 'counties': [{'county': 'Henry County',
   'cities': [{'city': 'Abbeville',
     'items': [{'name': 'Sanborn Fire Insurance Map from Abbeville, Henry County, Alabama.',
       'date': '1907-06',
       'thumbnail_urls': ['https://tile.loc.gov/storage-services/service/gmd/gmd397m/g3974m/g3974am/g3974am_g000011907/00001_1907-0001.gif',
        'https://tile.loc.gov/storage-services/service/gmd/gmd397m/g3974m/g3974am/g3974am_g000011907/00001_1907-0001.gif#h=150&w=126'],
       'iiif_urls': ['https://tile.loc.gov/image-services/iiif/service:gmd:gmd397m:g3974m:g3974am:g3974am_g000011907:00001_1907-0001/full/pct:12.5/0/default.jpg',
        'https://tile.loc.gov/image-services/iiif/service:gmd:gmd397m:g3974m:g3974am:g3974am_g000011907:00001_1907-0001/full/pct:12.5/0/default.jpg'],
       'item_url': 'https://www.loc.gov/item/sanborn00001_001/'},
      {'name': 'Sanborn Fire Insurance Map from Abbeville, Henry County, Alabama.',
       'date': '1913-08',
       'thumbnai

In [13]:
us['objects']['counties']['geometries'][0]

{'type': 'MultiPolygon',
 'id': 53073,
 'arcs': [[[0, 1, 2]]],
 'properties': {'index': [{'state': 47, 'county': 11}]}}

In [14]:
us['objects']['states']['geometries'][0]

{'type': 'MultiPolygon',
 'arcs': [[[6960,
    -6779,
    -6725,
    -6740,
    -6751,
    -6750,
    -6812,
    -6811,
    -6818,
    -6817,
    -6833,
    6996,
    -7015,
    -7019,
    7048,
    -7189,
    -7192,
    7226,
    -7446,
    7518,
    7519,
    -7600,
    7723,
    7724,
    -7870,
    7896,
    -8039,
    8080,
    -8132,
    -8178,
    8215,
    -8312,
    8339,
    8340,
    -8502,
    8502,
    -8602,
    8701,
    8702,
    8709,
    8710,
    8711,
    8589,
    8590,
    8704,
    8705,
    8706,
    8696,
    8697,
    8722,
    8699,
    8723,
    8724,
    8725,
    -8640,
    8540,
    -8455,
    8270,
    8271,
    -8148,
    7976,
    7977,
    -7850,
    7703,
    -7627,
    7474,
    -7447,
    7327,
    -7278,
    7200,
    -6974,
    -6977]],
  [[8693, 8694]]],
 'id': 1}

## Counting and Matching

We can also see that the states are already arranged in the same (alphabetical) order in both sets, so they can be matched by position. Each county in the us-indexed file also has an index property that lists the state and county positions it corresponds to in the Sanborn data, so counties can be matched through that.

In [23]:
counts = []

for state in sanborn:
    # Number of items in each county, in the same order as state['counties']
    counties = []
    for county in state['counties']:
        counties.append(sum(len(city['items']) for city in county['cities']))
    # 'count' is the state total; 'counties' holds the per-county counts.
    # Both keys are always set, even for a state with no counties.
    counts.append({'count': sum(counties), 'counties': counties})
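
As a quick way to check the counting logic by hand, here is a minimal sketch run on a small made-up sample instead of the real file. The sample data, the `count_records` helper, and the `sample_counts` name are assumptions for illustration only; they are not part of the Sanborn dataset or the notebook above.

```python
# Hypothetical miniature of the Sanborn structure:
# state -> counties -> cities -> items.
sample = [
    {'state': 'State A',
     'counties': [
         {'county': 'County 1',
          'cities': [{'city': 'X', 'items': [{}, {}]},
                     {'city': 'Y', 'items': [{}]}]},
         {'county': 'County 2',
          'cities': [{'city': 'Z', 'items': [{}]}]},
     ]},
    {'state': 'State B', 'counties': []},
]


def count_records(states):
    """Return one {'count', 'counties'} dict per state, in input order."""
    result = []
    for state in states:
        # Sum the items of every city within each county
        per_county = [sum(len(city['items']) for city in county['cities'])
                      for county in state['counties']]
        result.append({'count': sum(per_county), 'counties': per_county})
    return result


sample_counts = count_records(sample)
print(sample_counts)
# [{'count': 4, 'counties': [3, 1]}, {'count': 0, 'counties': []}]
```

Keeping the per-county counts in a list, in the same order as the source data, is what lets the county index property look them up by position later.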

In [26]:
counts

[{'count': 392,
  'counties': [6,
   16,
   12,
   1,
   24,
   7,
   11,
   20,
   7,
   8,
   10,
   8,
   3,
   8,
   1,
   12,
   13,
   6,
   12,
   7,
   4,
   6,
   5,
   7,
   7,
   2,
   5,
   8,
   9,
   7,
   4,
   6,
   11,
   2,
   2,
   1,
   3,
   5,
   5,
   4,
   12,
   11,
   5,
   1,
   1,
   5,
   6,
   7,
   12,
   2,
   11,
   1,
   6,
   6,
   5,
   1,
   4,
   1]},
 {'count': 25, 'counties': [2, 4, 1, 1, 6, 2, 1, 1, 1, 2, 2, 2]},
 {'count': 167, 'counties': [26, 18, 23, 15, 11, 12, 14, 12, 6, 6, 7, 9, 8]},
 {'count': 573,
  'counties': [12,
   8,
   11,
   11,
   15,
   10,
   6,
   13,
   10,
   21,
   10,
   15,
   2,
   26,
   11,
   11,
   13,
   14,
   9,
   11,
   6,
   7,
   11,
   1,
   2,
   8,
   8,
   2,
   9,
   14,
   10,
   7,
   10,
   6,
   9,
   6,
   8,
   16,
   21,
   5,
   7,
   9,
   1,
   9,
   2,
   10,
   7,
   13,
   9,
   1,
   5,
   3,
   10,
   6,
   4,
   3,
   5,
   9,
   3,
   9,
   7,
   7,
   9,
   4,
   6,
   1,
   1,
   1,
   

Now we have a list of dictionaries that we can use to add the counts into us.

In [38]:
for i, state in enumerate(us['objects']['states']['geometries']):
    if 'properties' not in state:
        state['properties'] = dict()
    # Matching is by position, so this relies on the states being in the
    # same order as in sanborn. Geometries with an id of 57 or more are
    # assigned a count of 0.
    if state['id'] < 57:
        state['properties']['count'] = counts[i]['count']
    else:
        state['properties']['count'] = 0
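
Because the loop above matches states purely by position, it may be worth confirming that every geometry that will receive a real count has an entry at the same position in counts. The sketch below uses toy data, and `check_alignment` is a hypothetical helper, not something from the notebook. It only checks that the positions exist; it cannot tell whether the two lists are in the same order.

```python
def check_alignment(geometries, counts, max_state_id=57):
    """Return positions of geometries whose id is below max_state_id
    but that have no entry at the same position in counts."""
    return [i for i, geom in enumerate(geometries)
            if geom['id'] < max_state_id and i >= len(counts)]


# Toy stand-ins for us['objects']['states']['geometries'] and counts
toy_geometries = [{'id': 1}, {'id': 2}, {'id': 72}]
toy_counts = [{'count': 392}, {'count': 25}]

missing = check_alignment(toy_geometries, toy_counts)
print(missing)  # an empty list means no position would raise an IndexError
```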

In [44]:
us['objects']['states']['geometries'][0]

{'type': 'MultiPolygon',
 'arcs': [[[6960,
    -6779,
    -6725,
    -6740,
    -6751,
    -6750,
    -6812,
    -6811,
    -6818,
    -6817,
    -6833,
    6996,
    -7015,
    -7019,
    7048,
    -7189,
    -7192,
    7226,
    -7446,
    7518,
    7519,
    -7600,
    7723,
    7724,
    -7870,
    7896,
    -8039,
    8080,
    -8132,
    -8178,
    8215,
    -8312,
    8339,
    8340,
    -8502,
    8502,
    -8602,
    8701,
    8702,
    8709,
    8710,
    8711,
    8589,
    8590,
    8704,
    8705,
    8706,
    8696,
    8697,
    8722,
    8699,
    8723,
    8724,
    8725,
    -8640,
    8540,
    -8455,
    8270,
    8271,
    -8148,
    7976,
    7977,
    -7850,
    7703,
    -7627,
    7474,
    -7447,
    7327,
    -7278,
    7200,
    -6974,
    -6977]],
  [[8693, 8694]]],
 'id': 1,
 'properties': {'count': 392}}

In [49]:
for county in us['objects']['counties']['geometries']:
    county['properties']['count'] = 0
    # The index list may hold zero or more Sanborn entries; add up all of them
    for index_set in county['properties']['index']:
        state_index = index_set['state']
        county_index = index_set['county']
        county['properties']['count'] += counts[state_index]['counties'][county_index]
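
To make the index lookup concrete, here is a minimal sketch on made-up data. It covers three cases: a county with one index entry, a county with two entries whose counts are added together, and a county with an empty index. The `demo_counts` and `demo_counties` values are assumptions for illustration, not values from the real files.

```python
# Toy counts in the same shape as the counts list built earlier
demo_counts = [{'count': 4, 'counties': [3, 1]},
               {'count': 5, 'counties': [5]}]

# Toy county geometries; each index entry points at a (state, county) position
demo_counties = [
    {'id': 1, 'properties': {'index': [{'state': 0, 'county': 1}]}},
    {'id': 2, 'properties': {'index': [{'state': 0, 'county': 0},
                                       {'state': 1, 'county': 0}]}},
    {'id': 3, 'properties': {'index': []}},
]

for county in demo_counties:
    props = county['properties']
    # sum() over an empty index yields 0, so unmatched counties get a count of 0
    props['count'] = sum(demo_counts[ix['state']]['counties'][ix['county']]
                         for ix in props['index'])

print([c['properties']['count'] for c in demo_counties])
# [1, 8, 0]
```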

In [52]:
us['objects']['counties']['geometries'][:10]

[{'type': 'MultiPolygon',
  'id': 53073,
  'arcs': [[[0, 1, 2]]],
  'properties': {'index': [{'state': 47, 'county': 11}], 'count': 24}},
 {'type': 'Polygon',
  'id': 30105,
  'arcs': [[3, 4, 5, 6, 7, 8]],
  'properties': {'index': [{'state': 26, 'county': 38}], 'count': 13}},
 {'type': 'Polygon',
  'id': 30029,
  'arcs': [[9, 10, 11, 12, 13, 14, 15, 16, 17, 18]],
  'properties': {'index': [{'state': 26, 'county': 8}], 'count': 11}},
 {'type': 'Polygon',
  'id': 16021,
  'arcs': [[19, 20, 21, 22]],
  'properties': {'index': [{'state': 12, 'county': 10}], 'count': 3}},
 {'type': 'Polygon',
  'id': 30071,
  'arcs': [[-8, 23, 24, 25, 26, 27]],
  'properties': {'index': [{'state': 26, 'county': 13}], 'count': 8}},
 {'type': 'Polygon',
  'id': 38079,
  'arcs': [[28, 29, 30, 31]],
  'properties': {'index': [], 'count': 0}},
 {'type': 'Polygon',
  'id': 30053,
  'arcs': [[-18, 32, 33, -20, 34]],
  'properties': {'index': [{'state': 26, 'county': 32}], 'count': 4}},
 {'type': 'Polygon',
  'id'

Finally, we need to write the file back out.

In [53]:
with open('us-indexed.json', 'w') as f:
    json.dump(us, f)

### Checking Highest Counts

For my project, I'm also interested in the maximum count for states and for counties, since I want to visualize these counts on the map.

In [2]:
with open('us-indexed.json') as f:
    us = json.load(f)

In [4]:
max_state = 0
max_county = 0

for county in us['objects']['counties']['geometries']:
    if county['properties']['count'] > max_county:
        max_county = county['properties']['count']
        
for state in us['objects']['states']['geometries']:
    if state['properties']['count'] > max_state:
        max_state = state['properties']['count']

In [5]:
max_county

192

In [6]:
max_state

2447
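
The two loops above could also be written with Python's built-in `max()`. The sketch below shows that alternative on toy data. The `toy_us` structure and the `max_count` helper are assumptions for illustration; the numbers are not taken from the real file.

```python
# Toy stand-in for the us object, with only the fields used here
toy_us = {'objects': {
    'counties': {'geometries': [{'properties': {'count': 24}},
                                {'properties': {'count': 0}},
                                {'properties': {'count': 13}}]},
    'states': {'geometries': [{'properties': {'count': 392}},
                              {'properties': {'count': 25}}]},
}}


def max_count(geometries):
    """Largest 'count' property among the geometries, or 0 if there are none."""
    return max((g['properties']['count'] for g in geometries), default=0)


toy_max_county = max_count(toy_us['objects']['counties']['geometries'])
toy_max_state = max_count(toy_us['objects']['states']['geometries'])
print(toy_max_county, toy_max_state)
# 24 392
```

The `default=0` argument mirrors the loops above, which start from 0 and so also return 0 for an empty list.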