Import the Python library "requests" to query the MET Museum's public API and retrieve information on landscape artworks from the Asian Art department.



Import the Python library "pandas", a data manipulation library that provides data structures and analysis tools for working with tables. Here, pandas is used to organize and analyze the data returned by the API.

Import Python library "time" to delay the API requests in order to avoid overloading the API server with too many requests in a short period of time.
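The delay described above can be sketched as a small helper. This is a minimal illustration, not part of the original notebook: the names `fetch_with_delay`, `fake_fetch`, and `delay_seconds` are hypothetical, and the stand-in fetcher replaces the real API call so the sketch runs without network access.

```python
import time
from typing import Callable, Iterable


def fetch_with_delay(
    object_ids: Iterable[int],
    fetch: Callable[[int], dict],
    delay_seconds: float = 0.1,
) -> list[dict]:
    """Call fetch(object_id) for each ID, pausing between consecutive calls."""
    records = []
    for position, object_id in enumerate(object_ids):
        if position > 0:
            # Pause before every request except the first one.
            time.sleep(delay_seconds)
        records.append(fetch(object_id))
    return records


# Hypothetical stand-in for the real request, used only for illustration.
def fake_fetch(object_id: int) -> dict:
    return {"objectID": object_id}


records = fetch_with_delay([1, 2, 3], fake_fetch, delay_seconds=0.01)
print(records)
```

In the notebook itself, the `fetch` argument could be a function that wraps `requests.get(...)` and returns `response.json()`. A suitable delay depends on the API's usage guidelines, so the default here is only a placeholder.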

In [1]:
import requests

import pandas as pd

import time

Create a variable url that contains the URL of the API endpoint used to retrieve the data. The URL consists of the base URL for the search endpoint, https://collectionapi.metmuseum.org/public/collection/v1/search, and two query parameters: departmentId=6 and q=landscape. The departmentId=6 parameter restricts the results to the Asian Art department, and the q=landscape parameter restricts them to objects whose data matches the search term "landscape". Note that this is a text search, so a match does not guarantee that the artwork is itself a landscape.
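As an illustration of how the base URL and the two query parameters combine, the same URL can be assembled from a dictionary with the standard library. This is a sketch only; the names `BASE_URL` and `params` are not in the original notebook.

```python
from urllib.parse import urlencode

# Base URL of the search endpoint, as given in the text above.
BASE_URL = "https://collectionapi.metmuseum.org/public/collection/v1/search"

# Query parameters: department 6 (Asian Art) and the search term "landscape".
params = {"departmentId": 6, "q": "landscape"}

# urlencode joins the pairs as key=value separated by "&".
url = f"{BASE_URL}?{urlencode(params)}"
print(url)
```

The requests library can do the same encoding itself when the dictionary is passed as `requests.get(BASE_URL, params=params)`, which keeps the parameters easy to change.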



In [2]:
url = "https://collectionapi.metmuseum.org/public/collection/v1/search?departmentId=6&q=landscape"

response = requests.get(url)  # send a GET request to the search endpoint

results_json = response.json()  # parse the JSON body of the response into a Python dictionary

Use the results_json variable, which contains the JSON data returned by the MET Museum's API, to look up each matching object and extract specific information about it: "objectID", "title", "culture", "period", "accessionYear", "primaryImageSmall" and "dimensions". Each field is stored in its own list.

Note that this code expects each object record returned by the API to contain the fields "objectID", "title", "culture", "period", "accessionYear", "primaryImageSmall" and "dimensions". If a field is missing from a record, "N/A" is stored in its place, so all lists stay the same length. A search response that does not contain a list of "objectIDs" will still raise an error.
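The fallback behaviour described above can be sketched with `dict.get`. This is a minimal illustration under stated assumptions: the helper `extract_fields`, the constant `FIELDS`, and both sample dictionaries are made up for the example, and the empty-search case assumes the API may return a null "objectIDs" value when nothing matches.

```python
# Fields the notebook collects for every artwork.
FIELDS = [
    "objectID",
    "title",
    "culture",
    "period",
    "accessionYear",
    "primaryImageSmall",
    "dimensions",
]


def extract_fields(record: dict, fields=FIELDS, default="N/A") -> dict:
    """Return the requested fields, using a default for any that are missing."""
    return {field: record.get(field, default) for field in fields}


# Made-up record with several fields missing, for illustration only.
sample_record = {"objectID": 123, "title": "Example landscape", "culture": "China"}
row = extract_fields(sample_record)
print(row)

# Made-up search response with no matches (assumed shape).
empty_search = {"total": 0, "objectIDs": None}
object_ids = empty_search.get("objectIDs") or []  # fall back to an empty list
print(object_ids)
```

Collecting one dictionary per artwork also keeps the fields of a record together, and a list of such dictionaries can be passed directly to `pd.DataFrame`.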

In [3]:
all_objects = results_json["objectIDs"]  # list of object IDs stored under the "objectIDs" key of results_json

# creating empty lists for storing specific information about each artwork. 

objectID = []
title = []
culture = []
period = []
accessionYear = []
primaryImageSmall = []
dimensions = []

In [4]:
for idx, i in enumerate(all_objects):
    #print(f"{idx} of {len(all_objects)}.")
    url2 = "https://collectionapi.metmuseum.org/public/collection/v1/objects/" + str(i)
    response2 = requests.get(url2)
    results_json2 = response2.json()

    # Append each field, falling back to "N/A" when the record lacks it.
    objectID.append(results_json2.get("objectID", "N/A"))
    title.append(results_json2.get("title", "N/A"))
    culture.append(results_json2.get("culture", "N/A"))
    period.append(results_json2.get("period", "N/A"))
    accessionYear.append(results_json2.get("accessionYear", "N/A"))
    primaryImageSmall.append(results_json2.get("primaryImageSmall", "N/A"))
    dimensions.append(results_json2.get("dimensions", "N/A"))

    # Pause between requests; the value is in seconds.
    # 0 means no delay, so increase it to reduce load on the API server.
    time.sleep(0)


In [5]:
all_data = pd.DataFrame(
    {
        "objectID": objectID,
        "title": title,
        "culture": culture,
        "period": period,
        "accessionYear": accessionYear,
        "primaryImageSmall": primaryImageSmall,
        "dimensions": dimensions,
    }
)

In [6]:
all_data.to_csv("landscape_asian_art.csv", index=False)