Here's how I built the database.

Download the data from https://github.com/artsmia/collection and unzip it here. That makes a directory called `collection-main` that has an `objects` subdirectory with a JSON file for each object. They're arranged in numbered sub-sub-directories, so we'll need to get all of those. Because of how the JSON files are structured, `pandas` didn't do what I wanted, so I fell back on the `json` library. I also didn't want to keep every single column (the `see_also` field was causing problems).

In [None]:
import pandas as pd
import json
import glob
import os

# columns to keep
keys = ['accession_number', 'artist', 'life_date', 'title', 'classification', 'department', 'continent', 'country', 'culture',
        'creditline', 'dated', 'description', 'dimensions', 'medium', 'style', 'text', 
        'markings', 'room']


# And a function to process the files, only keeping the requested columns
def get_data(filename):
    with open(filename) as f:
        temp = json.load(f)
    return {k: temp.get(k) for k in keys}

# The json parser fails on empty files, so check the size before parsing
def is_non_empty(filename):
    return os.stat(filename).st_size > 0
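
To see why the size check matters, here is a small sketch. The temporary file is made up for illustration and is not part of the dataset. `json.load` raises `json.JSONDecodeError` on a zero-byte file, so filtering on `st_size` first keeps the list comprehension from failing partway through.

```python
import json
import os
import tempfile

def is_non_empty(filename):
    return os.stat(filename).st_size > 0

# Create a zero-byte file to stand in for an empty record (illustrative only)
with tempfile.NamedTemporaryFile(suffix='.json', delete=False) as tmp:
    empty_file = tmp.name

# Parsing the empty file directly raises an error...
try:
    with open(empty_file) as f:
        json.load(f)
    parse_failed = False
except json.JSONDecodeError:
    parse_failed = True

# ...while the size check lets us skip it before parsing
skipped = not is_non_empty(empty_file)
os.remove(empty_file)
print(parse_failed, skipped)
```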

It seemed simplest to use `pandas` to convert the parsed records into a SQL table here.

In [None]:
parsed = [get_data(x) for x in glob.glob('collection-main/objects/*/*.json') if is_non_empty(x)]
df = pd.DataFrame(parsed)

In [None]:
import sqlalchemy

# Connect to the local postgres database named `art`
engine = sqlalchemy.create_engine('postgresql://rich:testpass@localhost:5432/art')
connection = engine.connect()

# Write the DataFrame to a table called `mia`
df.to_sql('mia', connection)

If you're going to do this yourself, you may find it easier to work in `sqlite` instead of setting up a `postgres` server on your laptop. If you do, you might need to change the `%s` placeholders we used to protect against SQL injection to `?` (though it depends on the versions of everything).
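
If you go the `sqlite` route, a minimal sketch might look like the following. It uses only the standard-library `sqlite3` module with an in-memory database, and the two rows are made up to stand in for the real table. The `?` placeholder plays the role that `%s` plays with the postgres driver. `pandas` also accepts a `sqlite3` connection directly, so `df.to_sql('mia', conn)` should work in place of the manual inserts.

```python
import sqlite3

# In-memory database for illustration; pass a filename such as 'art.db' to persist it
conn = sqlite3.connect(':memory:')

# A tiny stand-in for the real table (made-up rows, not actual collection data)
conn.execute('CREATE TABLE mia (title TEXT, artist TEXT)')
conn.executemany(
    'INSERT INTO mia (title, artist) VALUES (?, ?)',
    [('Example Title A', 'Example Artist'),
     ('Example Title B', "O'Example; DROP TABLE mia")],
)

# sqlite3 uses `?` placeholders; the value is passed separately,
# never pasted into the SQL string itself
artist = "O'Example; DROP TABLE mia"
rows = conn.execute('SELECT title FROM mia WHERE artist = ?', (artist,)).fetchall()
print(rows)
conn.close()
```

Because the value travels separately from the query text, the quote and the `DROP TABLE` inside the artist string are treated as plain data.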

Also, if you see the `pandas` above and think that you can use that to _read_ from the SQL database, you're correct. However, it's a bit dangerous: if the table is too large, `pandas` will still try to pull the whole thing into memory and may crash your kernel.
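
One way to sidestep that, sketched here under the assumption that you only need a slice of the table at a time: either put a `LIMIT` in the query, or pass `chunksize` to `pd.read_sql_query` so it hands back an iterator of smaller DataFrames instead of one big one. The in-memory `sqlite3` table and its rows are stand-ins for illustration.

```python
import sqlite3
import pandas as pd

# Stand-in table with made-up rows (illustrative, not the real collection)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE mia (id INTEGER, title TEXT)')
conn.executemany('INSERT INTO mia VALUES (?, ?)',
                 [(i, f'object {i}') for i in range(10)])

# Option 1: let the database do the trimming
preview = pd.read_sql_query('SELECT * FROM mia LIMIT 3', conn)

# Option 2: stream the result in fixed-size chunks
chunk_sizes = [len(chunk)
               for chunk in pd.read_sql_query('SELECT * FROM mia', conn, chunksize=4)]

print(len(preview), chunk_sizes)
conn.close()
```

With some database drivers the client may still buffer the full result set behind the scenes even when `chunksize` is set, so `LIMIT` is probably the safer of the two.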