Ever wondered where records related to a certain topic are located within the hierarchical structure of a large collection? Or where you might find particular kinds of formats, like photographs or moving image materials? What if you could use visual methods to find out more about what's in a collection? This notebook walks you through the process of answering some of those questions.

# Setup
First, we need to import the `json` and `requests` libraries so we can fetch and process JSON data from the RAC Collections API.

In [7]:
import json
import requests

We'll also need to import some libraries that will help us to create the visualizations:

In [4]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Fetching data
We'll be working with the [Rockefeller Foundation records](https://dimes.rockarch.org/collections/WY7fpswEV3oLhyjiArpHES), a large collection with a deeply nested structure.

The first thing we want to do is get some basic information about this collection from the RAC API using the `collections` endpoint.

In [9]:
rf_records = requests.get("https://api.rockarch.org/collections/WY7fpswEV3oLhyjiArpHES").json()
print(json.dumps(rf_records, indent=4))

{
    "uri": "/collections/WY7fpswEV3oLhyjiArpHES",
    "title": "Rockefeller Foundation records",
    "type": "collection",
    "category": "collection",
    "offset": null,
    "group": {
        "identifier": "/collections/WY7fpswEV3oLhyjiArpHES",
        "title": "Rockefeller Foundation records"
    },
    "external_identifiers": [
        {
            "identifier": "/repositories/2/resources/13058",
            "source": "archivesspace"
        }
    ],
    "level": "collection",
    "parent": null,
    "languages": [
        {
            "expression": "English",
            "identifier": "eng"
        }
    ],
    "description": "The collection comprehensively documents the philanthropic activities of the Rockefeller Foundation through available records in the areas of projects (grants), fellowships, general correspondence, administration, program and policy, board minutes and officers' actions, China Medical Board records, the International Health Board/Division (IHB/IHD), the
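
The full response is long, so it can help to pull out just the fields needed for orientation. The following is a minimal sketch, not part of the original notebook: `summarize_record` is a hypothetical helper, and `sample_record` is a trimmed copy of the response printed above (the description is left out) so that the cell runs without a network call.

```python
def summarize_record(record):
    """Return a small dict of the fields most useful for orientation.

    Hypothetical helper; it only reads keys that appear in the
    response printed above and tolerates missing or null values.
    """
    return {
        "title": record.get("title"),
        "level": record.get("level"),
        "uri": record.get("uri"),
        # `languages` is a list of objects with `expression` and `identifier`.
        "languages": [
            lang.get("expression") for lang in record.get("languages") or []
        ],
        # Map each external source name to its identifier.
        "external_ids": {
            ext.get("source"): ext.get("identifier")
            for ext in record.get("external_identifiers") or []
        },
    }


# Trimmed copy of the response shown above (description omitted).
sample_record = {
    "uri": "/collections/WY7fpswEV3oLhyjiArpHES",
    "title": "Rockefeller Foundation records",
    "type": "collection",
    "level": "collection",
    "parent": None,
    "languages": [{"expression": "English", "identifier": "eng"}],
    "external_identifiers": [
        {"identifier": "/repositories/2/resources/13058", "source": "archivesspace"}
    ],
}

summary = summarize_record(sample_record)
print(summary)
```

In the notebook itself, `summarize_record(rf_records)` should give the same summary for the live response, assuming the API still returns the fields shown above.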

We can also get a list of the record groups in this collection by using the `children` endpoint:

In [10]:
record_groups = requests.get("https://api.rockarch.org/collections/WY7fpswEV3oLhyjiArpHES/children").json()
print(json.dumps(record_groups, indent=4))

{
    "count": 24,
    "next": null,
    "previous": null,
    "results": [
        {
            "title": "Rockefeller Foundation records, Projects (Grants), RG 1",
            "type": "collection",
            "online": false,
            "hit_count": null,
            "online_hit_count": null,
            "uri": "/collections/H7ZDcPjFivwzjBG7s7eiiB",
            "dates": "1913-2000",
            "description": "Project files including grant actions, reports, assessments, and associated administrative records.",
            "group": {
                "identifier": "/collections/WY7fpswEV3oLhyjiArpHES",
                "title": "Rockefeller Foundation records"
            }
        },
        {
            "title": "Rockefeller Foundation records, General Correspondence, RG 2",
            "type": "collection",
            "online": false,
            "hit_count": null,
            "online_hit_count": null,
            "uri": "/collections/9uHv2eRkaUdURyy8tgbbqU",
            "dates":
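
The `children` response is paginated (`count`, `next`, `previous`, `results`), and each result carries a `dates` string. A minimal sketch of how the results might be gathered and flattened for later plotting follows. It rests on assumptions: `collect_results`, `parse_years`, and `to_rows` are hypothetical helpers, `next` is assumed to hold something that can be passed to a fetch function to get the following page, and the second page in the sample is a made-up placeholder used only to exercise the pagination loop.

```python
import re


def collect_results(first_page, fetch_page):
    """Gather `results` from a page and every page reachable via `next`.

    `fetch_page` is any callable that takes the value of `next` and
    returns the decoded JSON for that page (an assumption about the API).
    """
    results = list(first_page.get("results", []))
    next_url = first_page.get("next")
    while next_url:
        page = fetch_page(next_url)
        results.extend(page.get("results", []))
        next_url = page.get("next")
    return results


def parse_years(dates):
    """Return the first and last four-digit years in a date string."""
    years = re.findall(r"\d{4}", dates or "")
    if not years:
        return None, None
    return int(years[0]), int(years[-1])


def to_rows(results):
    """Flatten child records into plain dicts, one per record group."""
    rows = []
    for item in results:
        start, end = parse_years(item.get("dates"))
        rows.append({
            "title": item.get("title"),
            "uri": item.get("uri"),
            "online": item.get("online"),
            "start_year": start,
            "end_year": end,
        })
    return rows


# First page: one record copied from the response printed above.
# The "page-2" token is a placeholder, not a real API URL.
first_page = {
    "next": "page-2",
    "previous": None,
    "results": [{
        "title": "Rockefeller Foundation records, Projects (Grants), RG 1",
        "uri": "/collections/H7ZDcPjFivwzjBG7s7eiiB",
        "dates": "1913-2000",
        "online": False,
    }],
}

# Hypothetical second page, invented only to exercise pagination.
fake_pages = {
    "page-2": {
        "next": None,
        "previous": None,
        "results": [{
            "title": "Placeholder child (hypothetical)",
            "uri": "/collections/example",
            "dates": None,
            "online": False,
        }],
    }
}

rows = to_rows(collect_results(first_page, fake_pages.__getitem__))
for row in rows:
    print(row)
```

Against the live API, `fetch_page` could be something like `lambda url: requests.get(url).json()`, assuming `next` contains a full URL. The resulting list of dicts can be passed directly to `pd.DataFrame(rows)` for use with the plotting libraries imported earlier.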

# To do
- The API needs to be updated before the items below can be done (needs offset and a minimap endpoint)
- search terms
    - Yellow fever
    - agriculture
    - nursing education
    - tuberculosis
    - refugee scholar
- digitized content
- formats (Moving Images, Sound recordings, Photographs, and Annual reports)