# Fetching Setlist Data from Setlist.fm

We'll use the Setlist.fm API to fetch Dead & Company and Grateful Dead setlists. The eventual goal for this data is to train a neural network that predicts setlists, but in the interim some charting of song-playing patterns might be interesting.

## Getting the Data

I've already registered an API key with Setlist.fm, so we'll use that to get the data. We're limited to 1440 API calls a day at a maximum rate of 2 per second, and each request returns only 20 results. Even so, we should be able to retrieve all the data we're interested in without hitting those limits. The main cost is that we'll need more than a hundred calls to get all the Grateful Dead setlists.

We'll be pulling data for Grateful Dead and Dead & Company setlists. It's unclear how large each response will be, so we'll see how this goes.

### Relevant API Calls

Since I'm only interested in these two artists, we can use the `1.0/artist/{mbid}/setlists` call.

`{mbid}` refers to the artist's id in the [MusicBrainz](https://musicbrainz.org/) database.

|Artist|`{mbid}`|Setlists Available|API Calls Needed|
|------|------|------:|------:|
|Grateful Dead| 6faa7ca7-0d99-4a5e-bfa6-1fd5037520c6 | 2329 | 117 |
|Dead & Company| 94f8947c-2d9c-4519-bcf9-6d11a24ad006 | 156 | 8 |
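
The "API Calls Needed" column follows from the 20-results-per-page limit. As a quick sanity check, the sketch below recomputes the page counts and compares the total against the daily quota. The constant and function names here are my own, not part of the API.

```python
import math

RESULTS_PER_PAGE = 20    # page size returned by each request
DAILY_CALL_LIMIT = 1440  # daily quota mentioned above

def calls_needed(setlists_available, per_page=RESULTS_PER_PAGE):
    """Number of paged requests needed to retrieve every setlist."""
    return math.ceil(setlists_available / per_page)

# setlist counts taken from the table above
counts = {"Grateful Dead": 2329, "Dead & Company": 156}
calls = {name: calls_needed(n) for name, n in counts.items()}
total_calls = sum(calls.values())

print(calls)                                       # {'Grateful Dead': 117, 'Dead & Company': 8}
print(total_calls, total_calls <= DAILY_CALL_LIMIT)  # 125 True
```

Both artists together fit comfortably inside one day's quota.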

My API key is stored in a local environment variable called `setlistkey`. We'll reference that to make our API calls.
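
Since `os.environ.get` returns `None` when the variable is missing, a missing key would only surface later as a failed request. A minimal sketch of a helper that fails early instead is below. The name `build_headers` is hypothetical; the header names match the ones used in this notebook.

```python
import os

def build_headers(env_var="setlistkey"):
    """Build request headers, raising early if the API key is not set."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Environment variable {env_var!r} is not set")
    return {"x-api-key": api_key, "Accept": "application/json"}
```

Calling `build_headers()` once at the top of the notebook would catch a missing key before any API calls are spent.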

In [28]:
# import needed packages

import os
import requests
import urllib.parse
import json
import pandas as pd
import math
import time
from pprint import pprint
from tqdm.notebook import tqdm

In [31]:
# create class for artists

class Artist:
    """
    Class to house artist name, mbid, setlists available and api calls needed
    Automatically generate url for use in api calls
    """
    def __init__(self, name, abbrev, mbid, sets):
        self.name = name
        self.abbrev = abbrev
        self.mbid = mbid
        self.sets = sets
        self.pages = math.ceil(sets/20)
        self.base_url = 'https://api.setlist.fm/rest/1.0/artist/' + self.mbid + '/setlists' 

In [32]:
# instantiate dead and company and grateful dead
dead_and_co = Artist("Dead & Company", 'dc', '94f8947c-2d9c-4519-bcf9-6d11a24ad006', 156)
grateful_dead = Artist("Grateful Dead", 'gd', '6faa7ca7-0d99-4a5e-bfa6-1fd5037520c6', 2329)

In [17]:
dead_and_co.base_url

'https://api.setlist.fm/rest/1.0/artist/94f8947c-2d9c-4519-bcf9-6d11a24ad006/setlists'

In [18]:
# set header for api calls
headers = {'x-api-key': os.environ.get("setlistkey"),
           'Accept' : 'application/json'}

In [19]:
# Request first page for Dead and Company as a test
r = requests.get(dead_and_co.base_url, headers=headers, params={'p':1}) # p represents page number

In [20]:
# enabled scrolling because this is really long
pprint(r.json())

{'itemsPerPage': 20,
 'page': 1,
 'setlist': [{'artist': {'disambiguation': '',
                         'mbid': '94f8947c-2d9c-4519-bcf9-6d11a24ad006',
                         'name': 'Dead & Company',
                         'sortName': 'Dead & Company',
                         'url': 'https://www.setlist.fm/setlists/dead-and-company-2bc42076.html'},
              'eventDate': '24-07-2020',
              'id': '13843195',
              'info': 'Scheduled to play. Sold out. COVID-19. Tour cancelled. '
                      'Show cancelled. ',
              'lastUpdated': '2020-07-23T06:31:01.000+0000',
              'sets': {'set': []},
              'url': 'https://www.setlist.fm/setlist/dead-and-company/2020/wrigley-field-chicago-il-13843195.html',
              'venue': {'city': {'coords': {'lat': 41.850033,
                                            'long': -87.6500523},
                                 'country': {'code': 'US',
                                             'na

In [21]:
# parse the response body as JSON; we'll cut it down to the relevant parts next
dc_p1 = json.loads(r.text)

In [22]:
# pick a show to test on
# 12/30/2019 at Chase Center (one of my favorites)
set_test = dc_p1['setlist'][7]

In [23]:
set_test['sets']['set']

[{'name': 'Set 1:',
  'song': [{'name': 'Shakedown Street',
    'cover': {'mbid': '6faa7ca7-0d99-4a5e-bfa6-1fd5037520c6',
     'name': 'Grateful Dead',
     'sortName': 'Grateful Dead',
     'disambiguation': '',
     'url': 'https://www.setlist.fm/setlists/grateful-dead-bd6ad4a.html'},
    'info': '>'},
   {'name': 'Mississippi Half-Step Uptown Toodeloo',
    'cover': {'mbid': '6faa7ca7-0d99-4a5e-bfa6-1fd5037520c6',
     'name': 'Grateful Dead',
     'sortName': 'Grateful Dead',
     'disambiguation': '',
     'url': 'https://www.setlist.fm/setlists/grateful-dead-bd6ad4a.html'},
    'info': '>'},
   {'name': 'Cumberland Blues',
    'cover': {'mbid': '6faa7ca7-0d99-4a5e-bfa6-1fd5037520c6',
     'name': 'Grateful Dead',
     'sortName': 'Grateful Dead',
     'disambiguation': '',
     'url': 'https://www.setlist.fm/setlists/grateful-dead-bd6ad4a.html'}},
   {'name': 'It Hurts Me Too',
    'cover': {'mbid': '1b62df85-00d2-464f-81bc-a5c0cdcad278',
     'name': 'Tampa Red',
     'sortName'

In [24]:
# Generate set lists as sequences
# Not relevant for data fetching, but ensures that the data we collect will be useful
# Later we'll probably make these into lists rather than printing
for s in set_test['sets']['set']:  # named s to avoid shadowing the built-in set
    if 'encore' in s:
        print("Encore:")
    else:
        print(s['name'])
    for song in s['song']:
        print(song['name'])
        if song.get('info') == '>':  # '>' marks a segue into the next song
            print(">")

Set 1:
Shakedown Street
>
Mississippi Half-Step Uptown Toodeloo
>
Cumberland Blues
It Hurts Me Too
High Time
Cold Rain and Snow
Bird Song
Set 2:
The Music Never Stopped
>
Deal
>
St. Stephen
>
William Tell Bridge
>
The Eleven
>
Turn On Your Love Light
>
Drums
>
Space
>
The Wheel
>
Stella Blue
Casey Jones
Encore:
Quinn the Eskimo (The Mighty Quinn)
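
As a rough sketch of the "lists rather than printing" idea, the same traversal can collect tokens into a list. The function name `setlist_to_sequence` is hypothetical, and the sample below is a small hand-made structure modeled on the response shown above, not real API output (in particular, the value stored under `'encore'` is an assumption).

```python
def setlist_to_sequence(sets):
    """Flatten the list found under ['sets']['set'] into a list of tokens."""
    tokens = []
    for s in sets:
        # label encores explicitly, otherwise use the set's own name
        tokens.append("Encore:" if "encore" in s else s.get("name", ""))
        for song in s.get("song", []):
            tokens.append(song["name"])
            if song.get("info") == ">":  # keep the segue marker as its own token
                tokens.append(">")
    return tokens

# hand-made sample, shaped like the API response
sample = [
    {"name": "Set 1:", "song": [{"name": "Shakedown Street", "info": ">"},
                                {"name": "Cumberland Blues"}]},
    {"encore": 1, "song": [{"name": "Quinn the Eskimo (The Mighty Quinn)"}]},
]
print(setlist_to_sequence(sample))
```

Unlike the printing loop, this version uses `.get` so that a set without a `name` or `song` key doesn't raise a `KeyError`.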


In [62]:
def fetch_setlists(artist, outfolder='data'):
    """
    Fetch each page of setlist data and save it to file
    Takes in the artist objects we made earlier
    """
    os.makedirs(outfolder, exist_ok=True)  # make sure the output folder exists
    for p in tqdm(range(1, artist.pages + 1)):
        time.sleep(0.75)  # stay under the limit of 2 requests per second
        file_path = os.path.join(outfolder, f"{artist.abbrev}_{p}.json")
        r = requests.get(artist.base_url, headers=headers, params={'p': p})
        if r.status_code != 200:
            # report the failure and skip the page rather than saving an error body
            print(f"Response {r.status_code} for page {p}")
            continue
        # the with block closes the file, so no explicit close() is needed
        with open(file_path, 'wb') as file:
            file.write(r.content)
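
The fixed `time.sleep(0.75)` keeps us well under 2 requests per second, but it also waits the full 0.75 seconds even when the request itself was slow. One possible alternative, sketched below, sleeps only for the time remaining since the previous call. The `Throttle` class and its `clock` and `sleep` parameters are hypothetical helpers of my own, not part of `requests` or the Setlist.fm API.

```python
import time

class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock    # injectable so the class can be tested without real waiting
        self.sleep = sleep
        self._last = None     # time of the previous call, None before the first one

    def wait(self):
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now

# 2 requests per second means at least 0.5 seconds between requests
throttle = Throttle(min_interval=0.5)
```

Calling `throttle.wait()` just before each `requests.get` would take the place of the fixed sleep.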

In [58]:
fetch_setlists(dead_and_co)

Page 1 requested: response 200
Page 2 requested: response 200
Page 3 requested: response 200
Page 4 requested: response 200
Page 5 requested: response 200
Page 6 requested: response 200
Page 7 requested: response 200
Page 8 requested: response 200



In [61]:
# test reading in the saved data
with open('data/dc_1.json', 'rb') as file:
    dc_test = json.load(file)
    
dc_test

{'type': 'setlists',
 'itemsPerPage': 20,
 'page': 1,
 'total': 156,
 'setlist': [{'id': '13843195',
   'versionId': '1b260d70',
   'eventDate': '24-07-2020',
   'lastUpdated': '2020-07-23T06:31:01.000+0000',
   'artist': {'mbid': '94f8947c-2d9c-4519-bcf9-6d11a24ad006',
    'name': 'Dead & Company',
    'sortName': 'Dead & Company',
    'disambiguation': '',
    'url': 'https://www.setlist.fm/setlists/dead-and-company-2bc42076.html'},
   'venue': {'id': '53d6cf29',
    'name': 'Wrigley Field',
    'city': {'id': '4887398',
     'name': 'Chicago',
     'state': 'Illinois',
     'stateCode': 'IL',
     'coords': {'lat': 41.850033, 'long': -87.6500523},
     'country': {'code': 'US', 'name': 'United States'}},
    'url': 'https://www.setlist.fm/venue/wrigley-field-chicago-il-usa-53d6cf29.html'},
   'sets': {'set': []},
   'info': 'Scheduled to play. Sold out. COVID-19. Tour cancelled. Show cancelled. ',
   'url': 'https://www.setlist.fm/setlist/dead-and-company/2020/wrigley-field-chicago-
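
The saved page carries `total` and `itemsPerPage` fields, so the page count could be derived from the first response instead of being hardcoded when the artist is created. A minimal sketch, assuming those two fields are always present (the function name is hypothetical):

```python
import math

def pages_from_payload(payload):
    """Derive the number of pages from a response's 'total' and 'itemsPerPage'."""
    return math.ceil(payload["total"] / payload["itemsPerPage"])

# top-level fields copied from the first Dead & Company page above
first_page = {"type": "setlists", "itemsPerPage": 20, "page": 1, "total": 156}
print(pages_from_payload(first_page))  # 8
```

This would also keep the page count correct as new shows are added to Setlist.fm.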

In [63]:
fetch_setlists(grateful_dead)





In [65]:
with open('data/gd_117.json', 'rb') as file:
    gd_test = json.load(file)
    
gd_test

{'type': 'setlists',
 'itemsPerPage': 20,
 'page': 117,
 'total': 2329,
 'setlist': [{'id': 'bf2212e',
   'versionId': '23b8a8a7',
   'eventDate': '17-09-1965',
   'lastUpdated': '2015-12-19T21:51:50.000+0000',
   'artist': {'mbid': '6faa7ca7-0d99-4a5e-bfa6-1fd5037520c6',
    'name': 'Grateful Dead',
    'sortName': 'Grateful Dead',
    'disambiguation': '',
    'url': 'https://www.setlist.fm/setlists/grateful-dead-bd6ad4a.html'},
   'venue': {'id': '23d5ec83',
    'name': 'The In Room',
    'city': {'id': '5327455',
     'name': 'Belmont',
     'state': 'California',
     'stateCode': 'CA',
     'coords': {'lat': 37.5202145, 'long': -122.2758008},
     'country': {'code': 'US', 'name': 'United States'}},
    'url': 'https://www.setlist.fm/venue/the-in-room-belmont-ca-usa-23d5ec83.html'},
   'sets': {'set': []},
   'url': 'https://www.setlist.fm/setlist/grateful-dead/1965/the-in-room-belmont-ca-bf2212e.html'},
  {'id': '5bdbbf3c',
   'versionId': '3b105044',
   'eventDate': '01-09-1965
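
With every page saved to disk, a natural next step is to combine the pages into a single list per artist. The sketch below assumes the `data/{abbrev}_{page}.json` naming used by `fetch_setlists`; the function name `load_saved_setlists` is hypothetical.

```python
import json
import os

def load_saved_setlists(abbrev, pages, folder="data"):
    """Read the saved page files for one artist and return one flat list of setlists."""
    setlists = []
    for p in range(1, pages + 1):
        file_path = os.path.join(folder, f"{abbrev}_{p}.json")
        with open(file_path, "rb") as file:
            # each page stores its shows under the 'setlist' key
            setlists.extend(json.load(file).get("setlist", []))
    return setlists
```

For example, `load_saved_setlists(grateful_dead.abbrev, grateful_dead.pages)` should return one dictionary per show, in the order the API returned them.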