# Sanborn Maps Data Collection

This notebook queries the Library of Congress API for the Sanborn map records from each state, organizes them by county and city, and writes them to a JSON file. Be aware that this takes a while to run! It took me several hours, and the final data file ended up around 23 MB.

We'll start by importing the requests and json modules, which let us query the API and write the results out as JSON.

In [2]:
import requests
import json

Then, let's take a look at what information we get from the API and how it's organized.

I'm going to query by state because that organization will be useful in my project. For large amounts of data, you need to split collection into mutually exclusive chunks, because the Library's API limits how many requests you can make in a given period of time for security purposes. You can do this by adding parameters to your request URL. In this case, I'm adding "&fa=location:alabama" to limit the search to Alabama.

In [3]:
alabama = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:alabama").json()
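
Since the same URL pattern is repeated for every state, a small helper can build it. This is only a sketch: `build_state_url` and `per_page` are names I made up for illustration, and the optional `c` parameter (results per page) is the one used later in this notebook.

```python
BASE_URL = "https://www.loc.gov/collections/sanborn-maps/"

def build_state_url(state, per_page=None):
    """Return the collection URL filtered to one state, as JSON."""
    # Multi-word states use "+" in place of spaces, e.g. "new+hampshire".
    location = state.lower().replace(" ", "+")
    url = f"{BASE_URL}?fo=json&fa=location:{location}"
    if per_page is not None:
        # "c" sets the number of results returned per page.
        url += f"&c={per_page}"
    return url

print(build_state_url("Alabama"))
print(build_state_url("New Hampshire", per_page=400))
```

The returned string can be passed straight to requests.get(), exactly like the hand-written URLs in this notebook.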

The results are in the parsed JSON response, in a list under the "results" key.

In [9]:
alabama["results"][0]

{'item': {'digital_id': ['https://www.loc.gov/item/sanborn00001_001',
   'http://hdl.loc.gov/loc.gmd/g3974am.g3974am_g000011907',
   'https://www.loc.gov/resource/g3974am.g3974am_g000011907'],
  'repository': 'Library of Congress Geography and Map Division Washington, D.C. 20540-4650 USA',
  'language': 'English',
  'title': 'Sanborn Fire Insurance Map from Abbeville, Henry County, Alabama.',
  'notes': [' Jun 1907. ', ' 2. ', ' Acquired after 1981. '],
  'format': 'cartographic',
  'created_published': 'Sanborn Map Company, Jun 1907'},
 'access_restricted': False,
 'site': ['ammem'],
 'original_format': ['map'],
 'id': 'http://www.loc.gov/item/sanborn00001_001/',
 'partof': ['american memory', 'sanborn maps', 'geography and maps division'],
 'index': 1,
 'group': ['sanborn'],
 'title': 'Sanborn Fire Insurance Map from Abbeville, Henry County, Alabama.',
 'segments': [{'url': 'https://www.loc.gov/resource/g3974am.g3974am_g000011907/',
   'count': 2,
   'link': 'https://www.loc.gov/coll

## Defining Functions

Of this information, I want to keep or construct the name, date, thumbnail URLs, IIIF URLs, and the URL that links back to the LOC site. The functions defined below are helpers for each of those tasks, with a test statement after a function where I wanted to confirm that it produces the desired result.

Also, instead of using the location in the metadata, I pulled city, county, and state from the item title. While not perfect, this method standardizes the records: the titles always list city, county, and state, in that order and separated by commas. In the metadata's list of place names, the order is not always the same, and even the number of places listed can vary.

In [2]:
# function to return location in an array [city, county, state]
def getLocation(name):
    nameArr = name.split(", ")
    nameArr[0] = nameArr[0][32:]  # drop the 32-character prefix "Sanborn Fire Insurance Map from "
    nameArr[2] = nameArr[2][:-1]  # drop the trailing period
    return nameArr

print(getLocation("Sanborn Fire Insurance Map From City, County, State."))

['City', 'County', 'State']
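
Because the slicing above relies on a fixed prefix length, a slightly more defensive variant may be useful when a title does not follow the pattern. The sketch below is an assumption-laden alternative, not the function used for the data in this notebook; `parse_title` and `PREFIX` are names I chose for illustration.

```python
PREFIX = "Sanborn Fire Insurance Map from "

def parse_title(title):
    """Split a Sanborn item title into [city, county, state], or return None."""
    # Compare case-insensitively, since the capitalization of "from" may vary.
    if not title.lower().startswith(PREFIX.lower()):
        return None
    place = title[len(PREFIX):]
    if place.endswith("."):
        place = place[:-1]  # drop the single trailing period
    parts = place.split(", ")
    # Anything other than exactly three parts does not fit the pattern.
    return parts if len(parts) == 3 else None

print(parse_title("Sanborn Fire Insurance Map from Abbeville, Henry County, Alabama."))
print(parse_title("Some other title"))
```

Returning None instead of raising lets the caller decide whether to skip or log titles that do not match.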


While I have not yet incorporated the IIIF URLs into my project, I decided to construct them from the metadata so that I can use the high-resolution images later if needed. For more information on the International Image Interoperability Framework, see the Library's [guide to using IIIF](https://github.com/LibraryOfCongress/data-exploration/blob/master/IIIF.ipynb).

In [3]:
# function to get the IIIF urls for the images
def getIIIF(image_urls):
    iiif_urls = []
    # only the second half of the list is used
    start = int(len(image_urls)/2)
    for i in range(start, len(image_urls)):
        # drop everything after "#" to keep only the image URL
        format_url = image_urls[i].split("#")
        iiif_urls.append(format_url[0])
    return iiif_urls

print(getIIIF(["test.jpg23rwe", "test.jpg#adfadsf"]))

['test.jpg']
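
To make the midpoint split easier to see, here is a small sketch that returns both halves at once. It assumes, as the function above does, that the first half of the list holds thumbnails and the second half holds the larger images; I have not checked that against the API documentation. `split_image_urls` and the sample URLs are made up for illustration.

```python
def split_image_urls(image_urls):
    """Return (thumbnails, larger_images) by splitting the list at its midpoint."""
    mid = len(image_urls) // 2
    thumbnails = image_urls[:mid]
    # Drop anything after "#" so only the bare image URL remains.
    larger = [url.split("#")[0] for url in image_urls[mid:]]
    return thumbnails, larger

# Hypothetical URLs, for illustration only.
sample = ["a_thumb.jpg", "b_thumb.jpg", "a_full.jpg#h=100", "b_full.jpg#h=200"]
thumbs, full = split_image_urls(sample)
print(thumbs)
print(full)
```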


This function writes a single item's information to the output file. I first create an empty dictionary, temp, then add each piece of information to it. json.dumps() then formats the dictionary as a JSON string, which is written to the file.

In [10]:
# function to write the info of one item to a JSON file
def writeItem(item, filewriter):
    all_image_urls = item["image_url"]
    temp = dict()
    temp["name"] = item["title"]
    temp["thumbnail_urls"] = all_image_urls[:int(len(all_image_urls)/2)]
    temp["iiif_urls"] = getIIIF(all_image_urls)
    temp["item_url"] = item["url"]
    if "date" in item:
        temp["date"] = item["date"]
    else:
        temp["date"] = "null"
        # these print statements flag items that need to be fixed manually
        print(temp["name"])
        print(temp["item_url"])

    # json.dumps() is used to add the double quotes where expected
    filewriter.write(json.dumps(temp))
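
A quick way to check this kind of function without touching a real file is to write into an in-memory buffer and parse the result back. The sketch below uses a simplified stand-in, `build_record`, and a made-up item; it also stores a missing date as None (JSON null) rather than the string "null", which is a different choice from the function above.

```python
import io
import json

def build_record(item):
    """Collect the fields kept for one item into a plain dictionary."""
    urls = item["image_url"]
    return {
        "name": item["title"],
        "thumbnail_urls": urls[: len(urls) // 2],
        "item_url": item["url"],
        # Use None (JSON null) when the record has no date.
        "date": item.get("date"),
    }

# Hypothetical item with made-up values, shaped like an API result.
sample_item = {
    "title": "Sanborn Fire Insurance Map from City, County, State.",
    "image_url": ["thumb.jpg", "full.jpg#h=100"],
    "url": "https://example.org/item/1/",
}

buffer = io.StringIO()  # stands in for the open file
buffer.write(json.dumps(build_record(sample_item)))
round_trip = json.loads(buffer.getvalue())
print(round_trip["name"])
print(round_trip["date"])
```

If json.loads() accepts what was written, the output for that item is valid JSON.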

The separate function groups the results, given the list of items and the level to group by. It creates a new dictionary that maps each county name to the list of items in that county (with level==1), and can do the same for city names and the list of items in each city (with level==0).

In [5]:
# function to put results into buckets
# should work for state level separating counties and county level separating cities
def separate(parent_items, level):
    child_dict = dict()
    for item in parent_items:
        child = getLocation(item["title"])[level]
        if child not in child_dict.keys():
            child_dict[child] = []
        child_dict[child].append(item)
    return child_dict
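
The same bucketing pattern can be written with collections.defaultdict, which creates the empty list automatically the first time a key is seen. This is a generic sketch rather than a replacement for the function above; `group_by` and the sample records are invented for illustration.

```python
from collections import defaultdict

def group_by(items, key_func):
    """Group items into a dict of lists keyed by key_func(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key_func(item)].append(item)
    return dict(groups)

# Hypothetical, already-parsed records for illustration.
records = [
    {"city": "Abbeville", "county": "Henry County"},
    {"city": "Headland", "county": "Henry County"},
    {"city": "Mobile", "county": "Mobile County"},
]
by_county = group_by(records, lambda r: r["county"])
print(sorted(by_county))
print(len(by_county["Henry County"]))
```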

In [6]:
# function for one state - separating and writing
def writeState(state_items, state, filewriter):
    # keep only the items whose title places them in this state
    state_items = [item for item in state_items
                   if getLocation(item["title"])[2] == state]
    # separate by county, then by city
    by_county = separate(state_items, 1)
    for county in by_county.keys():
        by_county[county] = separate(by_county[county], 0)
    
    filewriter.write('{"state": '+ json.dumps(state) + ', "counties": [')
    
    ccounty=0
    for county in by_county.keys():
        ccounty += 1
        filewriter.write('{"county": ' + json.dumps(county) + ', "cities": [')
        ccity=0
        for city in by_county[county].keys():
            ccity += 1
            filewriter.write('{"city": ' + json.dumps(city) + ', "items": [')
            citem=0
            for item in by_county[county][city]:
                citem += 1
                writeItem(item, filewriter)
                if citem < len(by_county[county][city]):
                    filewriter.write(', ')
                else:
                    filewriter.write(']}')
            if ccity < len(by_county[county]):
                filewriter.write(', ')
            else:
                filewriter.write(']}')
        if ccounty < len(by_county.keys()):
            filewriter.write(', ')
        else:
            filewriter.write(']}')
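
An alternative to tracking commas and closing brackets by hand is to build the whole nested structure in memory and serialize it once. The sketch below assumes the input is already grouped as {county: {city: [items]}}, which is the shape produced by the two separate() calls; `state_to_dict` is a hypothetical name, not part of the code used to produce the data file.

```python
import json

def state_to_dict(state, by_county):
    """Convert {county: {city: [items]}} into the nested layout written above."""
    return {
        "state": state,
        "counties": [
            {
                "county": county,
                "cities": [
                    {"city": city, "items": items}
                    for city, items in cities.items()
                ],
            }
            for county, cities in by_county.items()
        ],
    }

# Hypothetical grouped data; each item would normally be a full record.
grouped = {"Henry County": {"Abbeville": [{"name": "example"}]}}
text = json.dumps(state_to_dict("Alabama", grouped))
parsed = json.loads(text)
print(parsed["counties"][0]["cities"][0]["city"])
```

One side effect of this approach is that a state with no matching items still produces valid JSON, with an empty "counties" list.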

## Querying and Writing Data

As mentioned above, I separated requests by state. This was useful for the way my project is organized, but may not suit other projects. Where I could, I also added a parameter ("&c=[results per page]") to reduce the number of pages, though I found through trial and error that a count per page greater than about 900 caused other issues.

The file is opened in write mode ("w") the first time, which creates it, and in append mode ("a") every time after that, so that each state is added to the end.
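
As a minimal illustration of the two modes, the sketch below writes a tiny JSON array in two steps and reads it back. It uses a temporary directory and made-up content so that it does not touch the real data file; the `with` statement closes the file automatically, which is an optional convenience over calling close().

```python
import json
import os
import tempfile

# Write to a temporary directory so the sketch does not touch real data files.
path = os.path.join(tempfile.mkdtemp(), "example.json")

with open(path, "w") as f:  # "w" creates or truncates the file
    f.write("[")
    f.write(json.dumps({"state": "First"}))

with open(path, "a") as f:  # "a" appends to what is already there
    f.write(", ")
    f.write(json.dumps({"state": "Second"}))
    f.write("]")

with open(path) as f:
    data = json.load(f)
print(len(data))
```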

In [130]:
# 392
alabama = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:alabama&c=400").json()

In [131]:
f = open("sanborn-maps-data-all.json", "w")
f.write("[")
writeState(alabama["results"], "Alabama", f)
f.write(", ")
f.close()

In [132]:
# 26
alaska = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:alaska").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(alaska["results"], "Alaska", f)
f.write(", ")
f.close()

In [133]:
# 167
arizona = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:arizona&c=200").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(arizona["results"], "Arizona", f)
f.write(", ")
f.close()

In [134]:
# 573
arkansas = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:arkansas&c=600").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(arkansas["results"], "Arkansas", f)
f.write(", ")
f.close()

In [135]:
# 716
california = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:california&c=750").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(california["results"], "California", f)
f.write(", ")
f.close()

In [136]:
# 527
colorado = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:colorado&c=550").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(colorado["results"], "Colorado", f)
f.write(", ")
f.close()

In [8]:
# 463
connecticut = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:connecticut&c=470").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(connecticut["results"], "Connecticut", f)
f.write(", ")
f.close()

In [9]:
# 101
delaware = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:delaware&c=120").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(delaware["results"], "Delaware", f)
f.write(", ")
f.close()

In [10]:
# 428
florida = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:florida&c=450").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(florida["results"], "Florida", f)
f.write(", ")
f.close()

In [20]:
# 592
georgia = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:georgia&c=600").json()

f = open("georgia-fix.json", "w") # in a different file because had a missing date, updated writeItem to account for that
writeState(georgia["results"], "Georgia", f)
f.write(", ")
f.close()

Sanborn Fire Insurance Map from Darien, McIntosh County, Georgia.
https://www.loc.gov/item/sanborn01417_009/


In [22]:
# 39
hawaii = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:hawaii&c=50").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(hawaii["results"], "Hawaii", f)
f.write(", ")
f.close()

In [23]:
# 395
idaho = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:idaho&c=400").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(idaho["results"], "Idaho", f)
f.write(", ")
f.close()

Illinois is an example of a state with so many results that I needed to loop through the pages. This particular code has an issue: each page was written separately instead of being combined and written together. That created multiple Illinois entries with some overlapping counties, which I later had to fix with [another script](https://github.com/selenaqian/sanborn-maps-navigator/blob/master/data/sanborn/paginatedstate-fix-script.py).

In [26]:
# 1883
illinois = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:illinois").json()

while True: #As long as we have a next page, go and fetch it
    f = open("illinois-fix.json", "a")
    writeState(illinois["results"], "Illinois", f)
    f.write(", ")
    f.close()
    next_page = illinois["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        illinois = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=3
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=4
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=5
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=6
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=7
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=8
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=9
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=10
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=11
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=12
https://www.loc.gov/collections/sanborn-maps/?fa=location:illinois&fo=json&sp=13
https://www.loc.gov/collections/sanb
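
One way to avoid the duplicated state entries described above would be to gather the results from every page first and write the state only once. The sketch below shows that idea with a stand-in for the API so it runs without a network connection; `collect_all_results`, `fetch`, and the fake pages are hypothetical, and this is not the fix script linked above.

```python
def collect_all_results(first_url, fetch):
    """Follow pagination links and return every result in one list."""
    results = []
    url = first_url
    while url is not None:
        page = fetch(url)
        results.extend(page["results"])
        # "next" is None on the last page, which ends the loop.
        url = page["pagination"]["next"]
    return results

# Stand-in for the API: two fake pages keyed by made-up URLs.
fake_pages = {
    "page1": {"results": [1, 2], "pagination": {"next": "page2"}},
    "page2": {"results": [3], "pagination": {"next": None}},
}
all_results = collect_all_results("page1", fake_pages.get)
print(all_results)
```

With the real API, `fetch` could be something like `lambda url: requests.get(url).json()`, and the combined list could then be passed to writeState() in a single call.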

In [27]:
# 1271
indiana = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:indiana&c=100").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(indiana["results"], "Indiana", f)
    f.write(", ")
    f.close()
    next_page = indiana["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        indiana = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=3
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=4
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=5
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=6
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=7
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=8
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=9
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=10
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=11
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:indiana&fo=json&sp=12
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=locatio

In [28]:
# 1186
iowa = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:iowa&c=100").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(iowa["results"], "Iowa", f)
    f.write(", ")
    f.close()
    next_page = iowa["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        iowa = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=3
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=4
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=5
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=6
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=7
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=8
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=9
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=10
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=11
https://www.loc.gov/collections/sanborn-maps/?c=100&fa=location:iowa&fo=json&sp=12
None


In [29]:
# 967
kansas = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:kansas&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(kansas["results"], "Kansas", f)
    f.write(", ")
    f.close()
    next_page = kansas["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        kansas = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:kansas&fo=json&sp=2
None


In [30]:
# 570
kentucky = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:kentucky&c=600").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(kentucky["results"], "Kentucky", f)
f.write(", ")
f.close()

In [31]:
# 434
louisiana = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:louisiana&c=450").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(louisiana["results"], "Louisiana", f)
f.write(", ")
f.close()

In [32]:
# 597
maine = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:maine&c=600").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(maine["results"], "Maine", f)
f.write(", ")
f.close()

Sanborn Fire Insurance Map from South Portland, Cumberland County, Maine.
https://www.loc.gov/item/sanborn03545_002/


In [34]:
# 329
maryland = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:maryland&c=350").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(maryland["results"], "Maryland", f)
f.write(", ")
f.close()

In [35]:
# 1011
massachusetts = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:massachusetts&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(massachusetts["results"], "Massachusetts", f)
    f.write(", ")
    f.close()
    next_page = massachusetts["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        massachusetts = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

Sanborn Fire Insurance Map from Cohasset, Norfolk County, Massachusetts.
https://www.loc.gov/item/sanborn03710_006/
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:massachusetts&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:massachusetts&fo=json&sp=3
None


In [36]:
# 1370
michigan = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:michigan&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(michigan["results"], "Michigan", f)
    f.write(", ")
    f.close()
    next_page = michigan["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        michigan = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

Sanborn Fire Insurance Map from Bay City, Bay County, Michigan.
https://www.loc.gov/item/sanborn03921_008/
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:michigan&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:michigan&fo=json&sp=3
None


In [37]:
# 857
minnesota = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:minnesota&c=900").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(minnesota["results"], "Minnesota", f)
f.write(", ")
f.close()

Sanborn Fire Insurance Map from Duluth, Saint Louis County, Minnesota.
https://www.loc.gov/item/sanborn04287_015/
Sanborn Fire Insurance Map from Minneapolis, Hennepin County, Minnesota.
https://www.loc.gov/item/sanborn04339_014.5/


In [38]:
# 456
mississippi = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:mississippi&c=500").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(mississippi["results"], "Mississippi", f)
f.write(", ")
f.close()

In [39]:
# 1155
missouri = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:missouri&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(missouri["results"], "Missouri", f)
    f.write(", ")
    f.close()
    next_page = missouri["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        missouri = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:missouri&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:missouri&fo=json&sp=3
None


In [40]:
# 421
montana = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:montana&c=450").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(montana["results"], "Montana", f)
f.write(", ")
f.close()

In [41]:
# 525
nebraska = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:nebraska&c=550").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(nebraska["results"], "Nebraska", f)
f.write(", ")
f.close()

In [42]:
# 94
nevada = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:nevada&c=100").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(nevada["results"], "Nevada", f)
f.write(", ")
f.close()

In [43]:
# 362
new_hampshire = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:new+hampshire&c=400").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(new_hampshire["results"], "New Hampshire", f)
f.write(", ")
f.close()

In [44]:
# 848
new_jersey = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:new+jersey&c=850").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(new_jersey["results"], "New Jersey", f)
f.write(", ")
f.close()

Sanborn Fire Insurance Map from Newark, Essex County, New Jersey.
https://www.loc.gov/item/sanborn05571_009.2/
Sanborn Fire Insurance Map from Newark, Essex County, New Jersey.
https://www.loc.gov/item/sanborn05571_009.4/
Sanborn Fire Insurance Map from Newark, Essex County, New Jersey.
https://www.loc.gov/item/sanborn05571_009.6/
Sanborn Fire Insurance Map from Rutherford, Bergen County, New Jersey.
https://www.loc.gov/item/sanborn05620_006/
Sanborn Fire Insurance Map from Port Norris, Cumberland County, New Jersey.
https://www.loc.gov/item/sanborn05604_001.5/
Sanborn Fire Insurance Map from Paterson, Passaic County, New Jersey.
https://www.loc.gov/item/sanborn05590_006.5/


In [45]:
# 153
new_mexico = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:new+mexico&c=160").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(new_mexico["results"], "New Mexico", f)
f.write(", ")
f.close()

In [46]:
# 2500
new_york = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:new+york&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(new_york["results"], "New York", f)
    f.write(", ")
    f.close()
    next_page = new_york["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        new_york = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

Sanborn Fire Insurance Map from Buffalo, Erie County, New York.
https://www.loc.gov/item/sanborn05793_020.5/
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:new+york&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:new+york&fo=json&sp=3
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:new+york&fo=json&sp=4
Sanborn Fire Insurance Map from Staten Island (Borough Of Richmond), Richmond County, New York.
https://www.loc.gov/item/sanborn06213_004.5/
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:new+york&fo=json&sp=5
None


In [47]:
# 553
north_carolina = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:north+carolina&c=600").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(north_carolina["results"], "North Carolina", f)
f.write(", ")
f.close()

In [48]:
# 227
north_dakota = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:north+dakota&c=250").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(north_dakota["results"], "North Dakota", f)
f.write(", ")
f.close()

In [49]:
# 623
ohio = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:ohio&c=650").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(ohio["results"], "Ohio", f)
f.write(", ")
f.close()

In [50]:
# 1191
oklahoma = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:oklahoma&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(oklahoma["results"], "Oklahoma", f)
    f.write(", ")
    f.close()
    next_page = oklahoma["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        oklahoma = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

Sanborn Fire Insurance Map from Claremore, Rogers County, Oklahoma.
https://www.loc.gov/item/sanborn07040_012/
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:oklahoma&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:oklahoma&fo=json&sp=3
None


In [52]:
# 594
oregon = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:oregon&c=600").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(oregon["results"], "Oregon", f)
f.write(", ")
f.close()

In [53]:
# 2029
pennsylvania = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:pennsylvania&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(pennsylvania["results"], "Pennsylvania", f)
    f.write(", ")
    f.close()
    next_page = pennsylvania["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        pennsylvania = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:pennsylvania&fo=json&sp=2
Sanborn Fire Insurance Map from Lancaster, Lancaster County, Pennsylvania.
https://www.loc.gov/item/sanborn07754_009/
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:pennsylvania&fo=json&sp=3
Sanborn Fire Insurance Map from New Bethlehem, Clarion County, Pennsylvania.
https://www.loc.gov/item/sanborn07855_008/
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:pennsylvania&fo=json&sp=4
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:pennsylvania&fo=json&sp=5
None


In [55]:
# 131
rhode_island = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:rhode+island&c=150").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(rhode_island["results"], "Rhode Island", f)
f.write(", ")
f.close()

Sanborn Fire Insurance Map from Warwick, Kent County, Rhode Island.
https://www.loc.gov/item/sanborn08105_003/


In [57]:
# 377
south_carolina = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:south+carolina&c=400").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(south_carolina["results"], "South Carolina", f)
f.write(", ")
f.close()

In [58]:
# 426
south_dakota = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:south+dakota&c=450").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(south_dakota["results"], "South Dakota", f)
f.write(", ")
f.close()

In [59]:
# 545
tennessee = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:tennessee&c=550").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(tennessee["results"], "Tennessee", f)
f.write(", ")
f.close()

Sanborn Fire Insurance Map from Nashville, Davidson County, Tennessee.
https://www.loc.gov/item/sanborn08356_026.14/
Sanborn Fire Insurance Map from Nashville, Davidson County, Tennessee.
https://www.loc.gov/item/sanborn08356_026.16/
Sanborn Fire Insurance Map from Nashville, Davidson County, Tennessee.
https://www.loc.gov/item/sanborn08356_026.2/
Sanborn Fire Insurance Map from Nashville, Davidson County, Tennessee.
https://www.loc.gov/item/sanborn08356_026.3/
Sanborn Fire Insurance Map from Nashville, Davidson County, Tennessee.
https://www.loc.gov/item/sanborn08356_026.4/
Sanborn Fire Insurance Map from Nashville, Davidson County, Tennessee.
https://www.loc.gov/item/sanborn08356_026.5/
Sanborn Fire Insurance Map from Nashville, Davidson County, Tennessee.
https://www.loc.gov/item/sanborn08356_026.6/


In [60]:
# 1598
texas = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:texas&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(texas["results"], "Texas", f)
    f.write(", ")
    f.close()
    next_page = texas["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        texas = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:texas&fo=json&sp=2
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:texas&fo=json&sp=3
https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:texas&fo=json&sp=4
None


In [61]:
# 164
utah = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:utah&c=200").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(utah["results"], "Utah", f)
f.write(", ")
f.close()

In [62]:
# 327
vermont = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:vermont&c=350").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(vermont["results"], "Vermont", f)
f.write(", ")
f.close()

In [63]:
# 158
virginia = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:virginia&c=175").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(virginia["results"], "Virginia", f)
f.write(", ")
f.close()

In [14]:
# 879
washington = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:washington&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(washington["results"], "Washington", f)
    f.write(", ")
    f.close()
    next_page = washington["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        washington = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

ConnectionError: HTTPSConnectionPool(host='www.loc.gov', port=443): Max retries exceeded with url: /collections/sanborn-maps/?fo=json&fa=location:washington&c=500 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x104336450>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
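
A connection error like this one can sometimes be handled by waiting and trying again. The sketch below is a generic retry helper under my own assumptions, not something this notebook used: `with_retries` is a made-up name, the failure is simulated, and the waiting function is injectable so the example runs instantly. It catches OSError, which is the base class of both Python's built-in ConnectionError and the exceptions raised by requests.

```python
import time

def with_retries(action, attempts=3, delay=1.0, sleep=time.sleep):
    """Call action(); on OSError, wait and try again up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the error
            sleep(delay * attempt)  # wait a little longer each time

# Stand-in for a flaky request: fails twice, then succeeds.
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"

result = with_retries(flaky, sleep=lambda seconds: None)
print(result)
```

In the notebook, `action` could wrap one request, for example `lambda: requests.get(url).json()`.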

In [124]:
# washington had some errors so:
washington = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:washington&c=1000").json()

test = open("washington-fix.json", "w")
writeState(washington["results"], "Washington", test)
test.close()

In [67]:
# 296
west_virginia = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:west+virginia&c=300").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(west_virginia["results"], "West Virginia", f)
f.write(", ")
f.close()

In [68]:
# 978
wisconsin = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:wisconsin&c=500").json()

while True: #As long as we have a next page, go and fetch it
    f = open("sanborn-maps-data-all.json", "a")
    writeState(wisconsin["results"], "Wisconsin", f)
    f.write(", ")
    f.close()
    next_page = wisconsin["pagination"]["next"] #get the next page url
    print(next_page) #check to make sure working as expected
    if next_page is not None: #make sure we haven't hit the end of the pages
        wisconsin = requests.get(next_page).json()
    else:
        break #we are done and can stop looping

https://www.loc.gov/collections/sanborn-maps/?c=500&fa=location:wisconsin&fo=json&sp=2
None


In [69]:
# 139
wyoming = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:wyoming&c=150").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(wyoming["results"], "Wyoming", f)
f.write(", ")
f.close()

In [70]:
# 4
dc = requests.get("https://www.loc.gov/collections/sanborn-maps/?fo=json&fa=location:washington+d.c.").json()

f = open("sanborn-maps-data-all.json", "a")
writeState(dc["results"], "District of Columbia", f)
f.write("]")
f.close()

Now we have all the data, ready to be further manipulated if needed and used!
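
As a simple sanity check on the finished file, the nested structure can be loaded and walked to count the items. The sketch below runs on a tiny made-up sample in the same state, county, city layout; `count_items` is a hypothetical helper, and with the real file the list would come from json.load() on "sanborn-maps-data-all.json".

```python
import json

def count_items(states):
    """Count items across the state > county > city nesting."""
    return sum(
        len(city["items"])
        for state in states
        for county in state["counties"]
        for city in county["cities"]
    )

# Tiny made-up sample in the same layout as the output file.
sample_text = (
    '[{"state": "A", "counties": [{"county": "B", "cities": ['
    '{"city": "C", "items": [{"name": "x"}, {"name": "y"}]}]}]}]'
)
total = count_items(json.loads(sample_text))
print(total)
```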