## Fun with the Genius API

*I wrote version 1.0 of this notebook in Fall 2019. It has since been supplemented with material from Melanie Walsh's chapter [Song Genius API](https://melaniewalsh.github.io/Intro-Cultural-Analytics/features/Data-Collection/Genius-API.html) from her online textbook [_Introduction to Cultural Analytics & Python_](https://melaniewalsh.github.io/Intro-Cultural-Analytics/features/welcome.html) as well as from Prof. Dan Sinykin's 2020 iteration of QTM 340. I last revised this notebook in Fall 2021*.

Many web sites and organizations offer web APIs. We're going to go over how one API in particular works: the [Genius API](https://docs.genius.com/). By working through this one API, you'll learn the tools necessary to sign up for, query, and interpret APIs from other providers.

### Signing up for an API Key (aka Client Access Token)

Before you can use the Genius API, you need to sign up for a "client access token," which is another name for an API key. Do so by filling out the [New API Client form](https://genius.com/api-clients/new). If you don't yet have an account on Genius.com, you'll be prompted to register first. 

The next questions don't really apply to our use in class, but they're required to get your token. You'll be prompted to fill out a short form about the "App" that you need the Genius API for. You only need to fill out "App Name" and "App Website URL." You can enter any words you want in "App Name." Similarly, you can enter any URL in the "App Website URL," like so:

<img src="http://lklein.com/wp-content/uploads/2021/09/Screen-Shot-2021-09-15-at-11.21.25-AM.png" style="width:400px">

When you click "Save," you'll be given a series of API keys: a "Client ID" and a "Client Secret." To generate your "Client Access Token," which is the API key that we'll be using in this notebook, you need to click "Generate Access Token".

The token is just a string of letters and numbers. It'll look something like this:

    6617c28c371f0a138f7912a35365564afe538605
    
That's your "key" for that API. Whenever you make a request to that API, you'll need to include your key in the request. The exact method for including the key will be explained below. (Note: the key above is just something I made up; it's not a valid key; don't try using it in actual requests.)

In [None]:
# sign up for a client access token from Genius

Copy and paste your "Client Access Token" between the quotation marks below, then run the cell to save it as a variable.

In [1]:
client_access_token = ""
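
If you'd rather not paste your token directly into a notebook that you might share, one common alternative is to read it from an environment variable. The sketch below assumes a variable named `GENIUS_ACCESS_TOKEN`; that name and the `load_token` helper are hypothetical choices for illustration, and nothing in the Genius API requires them.

```python
import os

def load_token(env_var="GENIUS_ACCESS_TOKEN", default=""):
    """Return the value of the environment variable, or `default` if it is unset.

    The variable name is an assumption made for this example.
    """
    return os.environ.get(env_var, default)

token_from_env = load_token()
if not token_from_env:
    print("No token found in the environment; paste it into client_access_token instead.")
```

If the variable is set, you could then assign `client_access_token = token_from_env` and carry on with the rest of the notebook unchanged.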



### Making an API Request

Remember: making an API request looks a lot like typing a specially-formatted URL. That's kind of what it is. But instead of getting a rendered HTML web page in return, you get some data in return.

There are a few different ways that we can query the Genius API, all of which are discussed in the [Genius API documentation](https://docs.genius.com/#/getting-started-h1). (In general, an API's documentation will explain how to use the API.) The approach we'll cover in this lesson is the [basic search](https://docs.genius.com/#songs-h2), which lets you retrieve Genius data about any artist or song that you search for. A search request looks something like this:

`http://api.genius.com/search?q={search_term}&access_token={client_access_token}`

Let's break it down. But first, we need to import the `requests` library:

In [7]:
import requests # requests again

Then we need the base URL for the Genius API. We'll assign it like this:

In [18]:
base_url = "http://api.genius.com" # this is the URL for the Genius API; we're just storing it as a string
base_url

'http://api.genius.com'

Up next, we add '/search', which is what we learned about from reading the documentation. It tells the Genius API that we want to do a basic search. We'll add it to the end of the base_url (which is just a string) like so:

In [19]:
search_url = base_url + "/search" 
search_url

'http://api.genius.com/search'

Next, we have '?q={search_term}'.

The "q" is Genius's search parameter; it tells Genius that what follows is what we're searching _for_. Let's search for the first song in our candidate playlist: Aretha Franklin's "Respect."

In [20]:
search_term = "Respect" 

Finally, we have '&access_token={client_access_token}'. You've already defined this term above with your own token!

We can put it all back together now:

In [21]:
genius_search_url = f'http://api.genius.com/search?q={search_term}&access_token={client_access_token}'

But wait! What's that 'f' doing in front of the URL?

This is yet another way of formatting strings, known as a [formatted string literal or f-string](https://cito.github.io/blog/f-strings/).

What it means is that, if you preface a string with an "f", any variables placed in curly braces ( `{}` ) will be evaluated inline. So in this case, `{search_term}` will be replaced by the value of our `search_term` variable, and `{client_access_token}` will be replaced by the value of our `client_access_token` variable.

Note that you could *also* do: 

In [22]:
genius_search_url2 = search_url + "?q=" + search_term + "&access_token=" + client_access_token

But in this case the f-string is a bit more legible.
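
One caveat applies to both approaches: a search term that contains spaces or characters such as `&` should be URL-encoded before it goes into the query string. Here is a minimal sketch using `urllib.parse.urlencode` from Python's standard library. The `build_search_url` helper and the placeholder token are hypothetical, not part of the Genius API.

```python
from urllib.parse import urlencode

def build_search_url(search_term, token, base_url="http://api.genius.com"):
    """Build a search URL with the query parameters safely encoded."""
    # urlencode escapes spaces, commas, ampersands, and similar characters
    query = urlencode({"q": search_term, "access_token": token})
    return f"{base_url}/search?{query}"

# "PLACEHOLDER_TOKEN" stands in for a real client access token
print(build_search_url("Money, Power & Respect", "PLACEHOLDER_TOKEN"))
```

The `requests` library can also handle this encoding for you if you pass the parameters as a dictionary through the `params` argument of `requests.get`.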

So now here we go with the API call!

In [26]:
# and here's the API call
resp = requests.get(genius_search_url)
data = resp.json()

data

{'meta': {'status': 200},
 'response': {'hits': [{'highlights': [],
    'index': 'song',
    'type': 'song',
    'result': {'annotation_count': 23,
     'api_path': '/songs/4007572',
     'full_title': "Respect My Cryppin' by\xa0Blueface",
     'header_image_thumbnail_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.300x300x1.jpg',
     'header_image_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.1000x1000x1.jpg',
     'id': 4007572,
     'lyrics_owner_id': 1532287,
     'lyrics_state': 'complete',
     'path': '/Blueface-respect-my-cryppin-lyrics',
     'pyongs_count': 24,
     'song_art_image_thumbnail_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.300x300x1.jpg',
     'song_art_image_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.1000x1000x1.jpg',
     'stats': {'unreviewed_annotations': 13,
      'concurrents': 2,
      'hot': False,
      'pageviews': 578895},
     'title': 'Respect My Cryppin’',
     'title

This request returns songs that match the search string `Respect`.

As described in the [documentation](https://docs.genius.com/#/response-format-h1), the results take the form of a dictionary with two keys: `response` (which points to a dictionary that contains a list of dictionaries; phew!) and `meta`, whose value is a dictionary with a `status` key that gives you the HTTP status code for the response (i.e., whether the request was successful).

Because the response is a dictionary, we can isolate the two top-level keys to get an overall view of the response:

In [27]:
data.keys()

dict_keys(['meta', 'response'])

The `meta` value in the full output above reports a status of 200, so we know that the request was successful.
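
Rather than reading the status by eye, you can check it in code. The sketch below uses small hand-written dictionaries that imitate the shape of the response; they are not real API output, and `request_succeeded` is a hypothetical helper name.

```python
def request_succeeded(data):
    """Return True if the parsed response reports an HTTP status of 200."""
    # .get() avoids a KeyError if 'meta' or 'status' is missing
    return data.get("meta", {}).get("status") == 200

# Hand-built stand-ins for parsed responses (not real API output)
sample_ok = {"meta": {"status": 200}, "response": {"hits": []}}
sample_failed = {"meta": {"status": 401}}

print(request_succeeded(sample_ok))
print(request_succeeded(sample_failed))
```

With a real response, you would call `request_succeeded(data)` before digging into `data['response']`.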

But let's dig a little deeper into the `response` key. It itself is a dictionary, so we can look at _its_ keys.

In [28]:
data['response'].keys()

dict_keys(['hits'])

So there is only one key, `hits`, which I will tell you contains a _further_ list of dictionaries: one for each of the hits in the search result.

Let's take a look at the first result:

In [29]:
data['response']['hits'][0]

{'highlights': [],
 'index': 'song',
 'type': 'song',
 'result': {'annotation_count': 23,
  'api_path': '/songs/4007572',
  'full_title': "Respect My Cryppin' by\xa0Blueface",
  'header_image_thumbnail_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.300x300x1.jpg',
  'header_image_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.1000x1000x1.jpg',
  'id': 4007572,
  'lyrics_owner_id': 1532287,
  'lyrics_state': 'complete',
  'path': '/Blueface-respect-my-cryppin-lyrics',
  'pyongs_count': 24,
  'song_art_image_thumbnail_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.300x300x1.jpg',
  'song_art_image_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.1000x1000x1.jpg',
  'stats': {'unreviewed_annotations': 13,
   'concurrents': 2,
   'hot': False,
   'pageviews': 578895},
  'title': 'Respect My Cryppin’',
  'title_with_featured': "Respect My Cryppin'",
  'url': 'https://genius.com/Blueface-respect-my-cryppin-lyrics',
  '

So this is what we want: the dictionary for each of the search results.

But lo and behold, it contains an additional level of data: four keys, each with its own value.

Three of the four (`highlights`, `index`, and `type`) hold simple values: an empty list and two short strings.

But the `result` dictionary is where the good stuff is. 

Important items in this dictionary are the song title itself (`title`), the URL for the song lyrics (`url`), and the `primary_artist` key, which points to *another* dictionary with the name of the artist (`name`).

The artist name could be used with a different API endpoint to get more detail about a particular artist. But this information is enough for our purposes today.

To get a more compact view of the results of our initial query for "Respect," let's see if we can print out the song title for each search hit:

In [31]:
# Remember list comprehension format: [ expression FOR temporary variable name IN source list ]

titles = [song['result']['title'] for song in data['response']['hits']]

titles

# This means, for each song in data['response']['hits'], add its ['result']['title'] to a new list called "titles"


['Respect My Cryppin’',
 'N. J Respect R',
 'Respect',
 'Respect the Game',
 'Respect',
 'BTS - Respect (English Translation)',
 'No Respect Freestyle',
 'All Due Respect',
 'Money, Power & Respect',
 'Money, Power, Respect']
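
If the list comprehension feels dense, it may help to see the same logic written as an ordinary `for` loop. The sketch below runs on a tiny made-up `hits` list shaped like `data['response']['hits']`, so it works without an API call.

```python
# A hand-built stand-in for data['response']['hits'] (not real API output)
hits = [
    {"result": {"title": "Respect"}},
    {"result": {"title": "All Due Respect"}},
]

# The long way: start with an empty list and append one title at a time
titles_loop = []
for song in hits:
    titles_loop.append(song["result"]["title"])

# The list comprehension builds the same list in a single line
titles_comp = [song["result"]["title"] for song in hits]

print(titles_loop == titles_comp)
```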

**Question:** What key would we change to list the URLs for the lyrics of each of these songs?

In [34]:
urls = [song['result']['url'] for song in data['response']['hits']]

urls

['https://genius.com/Blueface-respect-my-cryppin-lyrics',
 'https://genius.com/Damso-n-j-respect-r-lyrics',
 'https://genius.com/Aretha-franklin-respect-lyrics',
 'https://genius.com/Meek-mill-respect-the-game-lyrics',
 'https://genius.com/The-notorious-big-respect-lyrics',
 'https://genius.com/Genius-english-translations-bts-respect-english-translation-lyrics',
 'https://genius.com/Lil-peep-no-respect-freestyle-lyrics',
 'https://genius.com/Run-the-jewels-all-due-respect-lyrics',
 'https://genius.com/The-lox-money-power-and-respect-lyrics',
 'https://genius.com/Travis-scott-money-power-respect-lyrics']

**Exercise:** Adapting the syntax above, list the name of the artist for each of these songs.
    
**Hint:** Remember that the artist `name` is contained within the dictionary `primary_artist`.

In [15]:
[song['result']['primary_artist']['name'] for song in data['response']['hits']]

['Blueface',
 'Damso',
 'Meek Mill',
 'Aretha Franklin',
 'The Notorious B.I.G.',
 'Lil Peep',
 'Genius English Translations',
 'Run The Jewels',
 'Travis Scott',
 'Wiz Khalifa']
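
Since the title, artist, and URL all live in the same `result` dictionary, you could also collect them in one pass. This is only a sketch: `summarize_hits` is a hypothetical helper, and the `sample_hits` list is hand-built from two of the results shown above rather than fetched from the API.

```python
def summarize_hits(hits):
    """Return one (title, artist, url) tuple per search hit."""
    return [
        (
            hit["result"]["title"],
            hit["result"]["primary_artist"]["name"],
            hit["result"]["url"],
        )
        for hit in hits
    ]

# Hand-built stand-in for data['response']['hits']
sample_hits = [
    {"result": {"title": "Respect",
                "primary_artist": {"name": "Aretha Franklin"},
                "url": "https://genius.com/Aretha-franklin-respect-lyrics"}},
    {"result": {"title": "All Due Respect",
                "primary_artist": {"name": "Run The Jewels"},
                "url": "https://genius.com/Run-the-jewels-all-due-respect-lyrics"}},
]

for title, artist, url in summarize_hits(sample_hits):
    print(f"{title} by {artist}: {url}")
```

Keeping the three values together in one tuple means each title stays paired with its own artist and URL, which separate lists do not guarantee.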

### Working with responses

Now we have a response from the API, and we've parsed it into a Python data structure that we know how to use (a dictionary). But now what do we do with it?

Let's find the URL for the lyrics for Aretha Franklin's "Respect."

Remember that we've already got `search_term` stored from way back up at the top: it's what we searched for in our initial query:

In [35]:
search_term

'Respect'

In [36]:
artist = "Aretha Franklin" # you should already be thinking: how can I hook this up with the NYT article data...
lyrics_url = None # stays None if no hit matches the artist

In [37]:
for song in data['response']['hits']:
    if song['result']['primary_artist']['name'] == artist:
        lyrics_url = song['result']['url']
        break # stop at the first hit by this artist
print(lyrics_url)

https://genius.com/Aretha-franklin-respect-lyrics


We've got our URL!
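
If you expect to repeat this lookup for other artists, one option is to wrap the loop in a function. The version below is a sketch under a few assumptions: `find_lyrics_url` is a hypothetical name, it returns `None` when no hit matches, and the `sample_hits` list is hand-built to mimic `data['response']['hits']`.

```python
def find_lyrics_url(hits, artist):
    """Return the lyrics URL of the first hit by `artist`, or None if there is no match."""
    for hit in hits:
        if hit["result"]["primary_artist"]["name"] == artist:
            return hit["result"]["url"]
    return None

# Hand-built stand-in for data['response']['hits']
sample_hits = [
    {"result": {"primary_artist": {"name": "Blueface"},
                "url": "https://genius.com/Blueface-respect-my-cryppin-lyrics"}},
    {"result": {"primary_artist": {"name": "Aretha Franklin"},
                "url": "https://genius.com/Aretha-franklin-respect-lyrics"}},
]

print(find_lyrics_url(sample_hits, "Aretha Franklin"))
print(find_lyrics_url(sample_hits, "Otis Redding"))
```

With the real response, the call would be `find_lyrics_url(data['response']['hits'], artist)`.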