# `britts_in_harvard`

A program to fetch the names and birthplaces of all British persons listed in the Harvard Art Museums API.





In [0]:
# start with importing requests
import requests

Get an API key first

In [0]:
'''
An API key is needed; request one by submitting
the form at: https://docs.google.com/forms/d/e/1FAIpQLSfkmEBqH76HLMMiCC-GPPnhcvHC9aJS86E32dOd0Z8MpY2rvQ/viewform
'''

api = 'fc2c1ee0-871e-11ea-901a-298e9b8e7864'
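
As an alternative to pasting the key into the notebook, the key could be read from an environment variable. This is a minimal sketch; the variable name `HARVARD_API_KEY` and the helper `load_api_key` are hypothetical and not part of the original notebook or of the API.

```python
import os

def load_api_key(env_var="HARVARD_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var)
    if not key:
        # fail early with a clear message instead of sending an empty key
        raise RuntimeError("Set the {} environment variable to your API key".format(env_var))
    return key
```

With this in place, `api = load_api_key()` would replace the hard-coded assignment.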


Objective: Get the names and birthplaces of ALL Brits from the Harvard API.
  
Endpoint to use: Person

Query parameters available at: https://github.com/harvardartmuseums/api-docs/blob/master/sections/person.md

An example in the documentation shows that `culture` can be used to query persons of a certain nationality.


In [0]:
# base URL then becomes
url = 'https://api.harvardartmuseums.org/person?q=culture:British'
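
To see how the base URL combines with the other query parameters used later (`apikey`, `size`, `page`), the full request URL can be assembled with the standard library. This is only a sketch: `build_person_url` is a hypothetical helper, and `'YOUR_API_KEY'` is a placeholder rather than a real key.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.harvardartmuseums.org/person'

def build_person_url(api_key, culture='British', size=100, page=1):
    """Build the person-endpoint URL for one page of results (hypothetical helper)."""
    params = {
        'q': 'culture:{}'.format(culture),  # same filter as the base URL above
        'apikey': api_key,
        'size': size,
        'page': page,
    }
    # safe=':' keeps the colon in 'culture:British' readable instead of %3A
    return '{}?{}'.format(BASE_URL, urlencode(params, safe=':'))

print(build_person_url('YOUR_API_KEY', page=2))
```

In the notebook itself, `requests` performs the same encoding when the parameters are passed as a dictionary.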


##  `process_data`

Helper function to extract the attributes of interest from the person records returned by the Harvard Museum (HM) API


In [0]:
def process_data(raw_incidents):
  '''
  List of dictionaries -> List of dictionaries
  purpose: takes a list of person records and extracts 3 attributes of interest for each person

  args(raw_incidents): records of persons with culture = British
  '''
  # iterate over the records and grab only the properties we are interested in
  return [{
      'name': incident['displayname'],
      'birthplace': incident['birthplace'],
      'nationality': incident['culture']
  } for incident in raw_incidents]
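
The extraction step can be tried on a small hand-written list before calling the API. The sketch below uses a hypothetical variant, `extract_fields`, that reads keys with `.get()` so a record missing a field yields `None` instead of raising `KeyError`; that tolerance is an assumption of this example, not behaviour of the notebook's helper. The sample records reuse two names from the notebook's output, and the `note` key is made up to show that extra fields are dropped.

```python
def extract_fields(records):
    """Keep only name, birthplace and nationality from each person record."""
    return [
        {
            'name': record.get('displayname'),
            'birthplace': record.get('birthplace'),
            'nationality': record.get('culture'),
        }
        for record in records
    ]

# hand-written sample input; 'note' is an illustrative extra field
sample_records = [
    {'displayname': 'John Fulleylove', 'birthplace': 'Leicester, England',
     'culture': 'British', 'note': 'ignored'},
    {'displayname': 'Alastair Wright', 'birthplace': None, 'culture': 'British'},
]

result = extract_fields(sample_records)
print(result)
```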
  

##  `britts_in_hm_data`

Function to extract British nationals from Harvard Museum (hm) 'person' records that span multiple pages

In [0]:

def britts_in_hm_data(): # hm= Harvard Museum
  import requests
  # declare a list to store all results
  incidents = []
  # track the current page number, since the data is spread across multiple pages
  page = 1
  # base URL for the person endpoint, filtered by culture
  url = 'https://api.harvardartmuseums.org/person?q=culture:British'
  # define the API key
  api = 'fc2c1ee0-871e-11ea-901a-298e9b8e7864'
  # request the maximum number of records per call (100), starting from the first page
  query = {"apikey": api, "size": 100, "page": page}

  # make the initial call
  response = requests.get(url, params=query)
  
  # make sure we got a valid response
  if(response.ok):
    # get the full data from the response
    data = response.json()
    
    # get the meta data
    meta_data = data['info']

    total = meta_data['totalrecords']
    # print('There is a total of {} results to fetch'.format(total))

    # data spread over multiple pages
    total_pages = meta_data['pages']
    # print('And a total of {} pages to process'.format(total_pages))
    
    # process the results we have so far
    incidents = process_data(data['records'])
    
    # increment page number to skip those already accessed
    page = page + 1
    
    # loop over the remaining pages. Alternatively could write a for loop since total_pages is pre-determined
    while page <= total_pages:
      query = {"apikey": api, "size": 100, "page": page}
      response = requests.get(url, params=query)
      if not response.ok:
        # stop on a failed request; otherwise the same page would be requested forever
        break
      # now incidents will be the old values plus the new ones
      incidents = incidents + process_data(response.json()['records'])
      # increment page number
      page = page + 1
  # return whatever was collected (an empty list if the first request failed)
  return incidents
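
The paging logic can also be exercised without network access by passing the page-fetching step in as a function. This is a sketch under stated assumptions: `fetch_all_pages` and `fake_pages` are hypothetical, the fake responses only mimic the `info`/`pages` and `records` keys used above, and the split of three names across two pages is invented for illustration.

```python
def fetch_all_pages(fetch_page):
    """Collect records from every page.

    fetch_page(page) must return the parsed JSON for that page,
    or None when the request failed.
    """
    records = []
    page = 1
    total_pages = 1  # replaced by the real count after the first response
    while page <= total_pages:
        data = fetch_page(page)
        if data is None:
            # stop rather than retry the same page forever
            break
        total_pages = data['info']['pages']
        records.extend(data['records'])
        page += 1
    return records

# stand-in for the API: two pages of made-up paging over names from the output
fake_pages = {
    1: {'info': {'pages': 2},
        'records': [{'displayname': 'Alastair Wright'},
                    {'displayname': 'David Charles Read'}]},
    2: {'info': {'pages': 2},
        'records': [{'displayname': 'Herbert Barraud'}]},
}

collected = fetch_all_pages(fake_pages.get)
print([record['displayname'] for record in collected])
```

Separating the loop from the HTTP call keeps the termination condition easy to check: the loop ends either when the last page has been read or when a fetch fails.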

In [0]:
# call the function
britts_in_hm_data()

[{'birthplace': None, 'name': 'Alastair Wright', 'nationality': 'British'},
 {'birthplace': None, 'name': 'David Charles Read', 'nationality': 'British'},
 {'birthplace': None, 'name': 'Herbert Barraud', 'nationality': 'British'},
 {'birthplace': 'Leicester, England',
  'name': 'John Fulleylove',
  'nationality': 'British'},
 {'birthplace': None, 'name': 'Barraud & Jerrard', 'nationality': 'British'},
 {'birthplace': None, 'name': 'Joachim Smith', 'nationality': 'British'},
 {'birthplace': None, 'name': 'John I. Frost', 'nationality': 'British'},
 {'birthplace': 'Berlin', 'name': 'Frank Auerbach', 'nationality': 'British'},
 {'birthplace': 'London',
  'name': 'Hablot Knight Browne ("Phiz")',
  'nationality': 'British'},
 {'birthplace': 'Melbourne, Australia',
  'name': 'Horace Brodzky',
  'nationality': 'British'},
 {'birthplace': None, 'name': 'James Clarke Hook', 'nationality': 'British'},
 {'birthplace': 'London, England',
  'name': 'Constance Mary Pott',
  'nationality': 'British'}