## Fun with the Genius API

_Lauren F. Klein wrote version 1.0 of this notebook. I have supplemented it with material from Melanie Walsh's chapter [Song Genius API](https://melaniewalsh.github.io/Intro-Cultural-Analytics/features/Data-Collection/Genius-API.html) from her online textbook [_Introduction to Cultural Analytics & Python_](https://melaniewalsh.github.io/Intro-Cultural-Analytics/features/welcome.html)._

Many web sites and organizations offer web APIs. We're going to go over how one API in particular works: the [Genius API](https://docs.genius.com/). Working through this one API will teach you the tools necessary to sign up for, query, and interpret APIs from other providers.

### Signing up for an API Key (aka Client Access Token)

Before you can use the Genius API, you need to sign up for a "client access token," which is another name for an API key. Do so by filling out the [New API Client form](https://genius.com/api-clients/new). If you don't yet have an account on Genius.com, you'll be prompted to register first. 

The next questions don't really apply to our use in class, but they're required to get your token. You'll be prompted to fill out a short form about the "App" that you need the Genius API for. You only need to fill out "App Name" and "App Website URL." You can enter any words you want in "App Name." Similarly, you can enter any URL in the "App Website URL."

When you click "Save," you'll be given a series of API keys: a "Client ID" and a "Client Secret." To generate your "Client Access Token," which is the API key that we'll be using in this notebook, you need to click "Generate Access Token".

The token is just a string of letters and numbers. It'll look something like this:

    6617c28c371f0a138f7912a35365564afe538605
    
That's your "key" for that API. Whenever you make a request to that API, you'll need to include your key in the request. The exact method for including the key will be explained below. (Note: the key above is just something I made up; it's not a valid key; don't try using it in actual requests.)

In [None]:
# sign up for a client access token from Genius

Copy and paste your "Client Access Token" into the quotation marks below, and run the cell to save it as a variable.

In [1]:
client_access_token = "0xTf4yE2Jnlgy7euDhEIYHh0AwJ73dus_S4icC1aEjESZUiHVk8B6m3fkOnMRF_1"

### Making an API Request

Remember: making an API request looks a lot like typing a specially-formatted URL. That's kind of what it is. But instead of getting a rendered HTML web page in return, you get some data in return.

There are a few different ways that we can query the Genius API, all of which are discussed in the [Genius API documentation](https://docs.genius.com/#/getting-started-h1). (In general, an API's documentation will explain how to use the API.) The way we're going to cover in this lesson is the [basic search](https://docs.genius.com/#songs-h2), which allows you to get a bunch of Genius data about any artist or songs that you search for, and it looks something like this:

http://api.genius.com/search?q={search_term}&access_token={client_access_token}

Let's break it down. But first, we need to: 

In [2]:
import requests # requests again

Okay. We have the URL for the Genius API. We'll call this `base_url`:

In [3]:
base_url = "http://api.genius.com" # this is the URL for the Genius API

Up next, we add '/search', which tells the Genius API that we want to do a basic search. We'll add it to the end of the base_url, like so:

In [4]:
search_url = base_url + "/search" # remember, this is how you format the URL for a search, as described above

Next, we have '?q={search_term}'. The q is Genius's search parameter; it tells Genius that what follows is what we're searching _for_. Let's search for the first song in our candidate playlist: Aretha Franklin's "Respect."

In [5]:
search_term = "Respect" # Remember you have to put quotation marks around the term

Finally, we have '&access_token={client_access_token}'. You've already defined this term above with your own token!

We can put it all back together now:

In [6]:
genius_search_url = f'http://api.genius.com/search?q={search_term}&access_token={client_access_token}'

But wait! What's that 'f' doing in front of the URL? See how '{search_term}' and '{client_access_token}' are in black font unlike the rest of the URL? That's because of the f, which designates the string that follows as a [formatted string literal or f-string](https://cito.github.io/blog/f-strings/). It means that {search_term} will be replaced by the value of our search_term variable, in this case "Respect", and {client_access_token} will be replaced by the value of our client_access_token variable.
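One caveat when building the URL by hand: a search term that contains spaces or punctuation needs to be URL-encoded before it goes into the query string. Here is a minimal sketch of one way to do that with the standard library's `urllib.parse.urlencode`. The helper name `build_search_url` and the placeholder token are assumptions made for illustration; they are not part of the Genius API or of the original notebook.

```python
from urllib.parse import urlencode

def build_search_url(search_term, client_access_token,
                     base_url="http://api.genius.com"):
    """Return a Genius search URL with both query parameters URL-encoded.

    build_search_url is a hypothetical helper written for this example.
    """
    # urlencode turns the dictionary into "q=...&access_token=...",
    # escaping spaces, commas, and other special characters for us.
    query = urlencode({"q": search_term, "access_token": client_access_token})
    return f"{base_url}/search?{query}"

# "YOUR_TOKEN_HERE" is a placeholder, not a real token.
example_url = build_search_url("Money, Power, Respect", "YOUR_TOKEN_HERE")
print(example_url)
```

The `requests` library can do the same encoding for you if you pass the parameters as a dictionary, as in `requests.get(search_url, params={"q": search_term, "access_token": client_access_token})`.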

In [7]:
# and here's the API call
resp = requests.get(genius_search_url)
data = resp.json()

This request searches for songs that match the search string `Respect`. As described in the [documentation](https://docs.genius.com/#/response-format-h1), the results take the form of a dictionary with two keys: `response` (which points to a dictionary containing a list of dictionaries; phew!) and `meta`, whose value is a dictionary with the key `status`, which gives you the HTTP status code for the response (i.e. whether the request was successful).
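Rather than eyeballing the status code, you could check it in code before digging into the results. The sketch below assumes the response has already been parsed into a dictionary shaped like the one described above; `request_succeeded` is a hypothetical helper name, and the two sample dictionaries are hand-written stand-ins, not real API output.

```python
def request_succeeded(data):
    """Return True if the parsed response reports HTTP status 200."""
    # .get() avoids a KeyError if 'meta' or 'status' is missing.
    return data.get("meta", {}).get("status") == 200

# Hand-written stand-ins for parsed responses (not real API output).
sample_ok = {"meta": {"status": 200}, "response": {"hits": []}}
sample_unauthorized = {"meta": {"status": 401}}

print(request_succeeded(sample_ok))
print(request_succeeded(sample_unauthorized))
```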

The JSON data that we get from our API query looks something like this:

In [8]:
data

{'meta': {'status': 200},
 'response': {'hits': [{'highlights': [],
    'index': 'song',
    'type': 'song',
    'result': {'annotation_count': 18,
     'api_path': '/songs/4007572',
     'full_title': "Respect My Cryppin' by\xa0Blueface",
     'header_image_thumbnail_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.300x300x1.jpg',
     'header_image_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.1000x1000x1.jpg',
     'id': 4007572,
     'lyrics_owner_id': 1532287,
     'lyrics_state': 'complete',
     'path': '/Blueface-respect-my-cryppin-lyrics',
     'pyongs_count': 21,
     'song_art_image_thumbnail_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.300x300x1.jpg',
     'song_art_image_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.1000x1000x1.jpg',
     'stats': {'unreviewed_annotations': 8, 'hot': False, 'pageviews': 530318},
     'title': 'Respect My Cryppin’',
     'title_with_featured': "Respect My Cryppin'

Remember, JSON data takes the form of a dictionary, so we can isolate the keys to get a view of the top level:

In [9]:
data.keys()

dict_keys(['meta', 'response'])

We have just two keys: `meta` and `response`.

From the output, we know that the response was successful. So we can ignore whatever is associated with the `meta` key. But let's dig a little deeper into the `response` key. It itself is a dictionary, so we can look at _its_ keys.

In [10]:
data['response'].keys()

dict_keys(['hits'])

So there is only one key, `hits`, which I will tell you contains a list of _further_ dictionaries: one for each of the hits in the search result.

Let's take a look at the first result:

In [11]:
data['response']['hits'][0]

{'highlights': [],
 'index': 'song',
 'type': 'song',
 'result': {'annotation_count': 18,
  'api_path': '/songs/4007572',
  'full_title': "Respect My Cryppin' by\xa0Blueface",
  'header_image_thumbnail_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.300x300x1.jpg',
  'header_image_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.1000x1000x1.jpg',
  'id': 4007572,
  'lyrics_owner_id': 1532287,
  'lyrics_state': 'complete',
  'path': '/Blueface-respect-my-cryppin-lyrics',
  'pyongs_count': 21,
  'song_art_image_thumbnail_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.300x300x1.jpg',
  'song_art_image_url': 'https://images.genius.com/8abd9b34eae9e1d97e3862287337532f.1000x1000x1.jpg',
  'stats': {'unreviewed_annotations': 8, 'hot': False, 'pageviews': 530318},
  'title': 'Respect My Cryppin’',
  'title_with_featured': "Respect My Cryppin'",
  'url': 'https://genius.com/Blueface-respect-my-cryppin-lyrics',
  'primary_artist': {'api_path'

We can tell that this data describes the song "Respect My Cryppin'" and contains other information about the song, such as its number of Genius annotations, its number of web page views, and links to images of its album cover.

So this is what we want: the dictionary for each of the search results. But each of these dictionaries itself contains further nested data. The first three keys are `highlights`, `index`, and `type`; the one we want is the fourth, `result`, whose value is another dictionary.

Important items in this dictionary are the song title itself (`title`), the URL for the song lyrics (`url`), and the `primary_artist` key, which points to *another* dictionary with the name of the artist (`name`).

The artist name could be used with a different API endpoint to get more detail about a particular artist. But this information is enough for our purposes today.
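To make the nesting concrete, here is a small sketch that pulls those items out of a single hit. The `sample_hit` dictionary is a trimmed, hand-assembled stand-in that uses only values shown elsewhere in this notebook, and `summarize_hit` is a hypothetical helper name chosen for this example.

```python
# A trimmed stand-in for one entry of data['response']['hits'].
sample_hit = {
    "type": "song",
    "result": {
        "title": "Respect",
        "url": "https://genius.com/Aretha-franklin-respect-lyrics",
        "stats": {"pageviews": 324605},
        "primary_artist": {"name": "Aretha Franklin"},
    },
}

def summarize_hit(hit):
    """Flatten the fields we care about from one search hit."""
    result = hit["result"]  # step into the nested 'result' dictionary
    return {
        "title": result["title"],
        "artist": result["primary_artist"]["name"],  # one level deeper
        "url": result["url"],
    }

summary = summarize_hit(sample_hit)
print(summary)
```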

To get a sense of what we're looking at, let's get the song title for each search hit:

In [12]:
# Remember list comprehension format: [ expression FOR temporary variable name IN source list ]

[song['result']['title'] for song in data['response']['hits']]

['Respect My Cryppin’',
 'N. J Respect R',
 'Respect the Game',
 'Respect',
 'Respect',
 'No Respect Freestyle',
 'BTS - Respect (English Translation)',
 'All Due Respect',
 'Money, Power, Respect',
 'Respect']

We can also get the song title AND its page views (this time, for fun, using a for loop):

In [13]:
for song in data['response']['hits']:
    print(song['result']['title'], song['result']['stats']['pageviews'])

Respect My Cryppin’ 530318
N. J Respect R 510858
Respect the Game 259092
Respect 324605
Respect 219001
No Respect Freestyle 178055
BTS - Respect (English Translation) 162971
All Due Respect 81529
Money, Power, Respect 62504
Respect 50654
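Notice that the hits are not strictly ordered by page views: the fourth hit above has more views than the third. If you wanted them ranked by page views, one option is Python's built-in `sorted` with a `key` function. This is only a sketch; `sample_hits` is a hand-made subset of the results above, trimmed to the fields the example needs.

```python
# A hand-made subset of the search hits, keeping only the fields we use.
sample_hits = [
    {"result": {"title": "Respect the Game", "stats": {"pageviews": 259092}}},
    {"result": {"title": "Respect", "stats": {"pageviews": 324605}}},
    {"result": {"title": "All Due Respect", "stats": {"pageviews": 81529}}},
]

# Sort from most to least viewed; the key function digs out the page views.
by_views = sorted(
    sample_hits,
    key=lambda hit: hit["result"]["stats"]["pageviews"],
    reverse=True,
)

ranked_titles = [hit["result"]["title"] for hit in by_views]
print(ranked_titles)
```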


**Exercise:** Using the syntax above, list the URLs for the lyrics of each of these songs.

In [14]:
[song['result']['url'] for song in data['response']['hits']]

['https://genius.com/Blueface-respect-my-cryppin-lyrics',
 'https://genius.com/Damso-n-j-respect-r-lyrics',
 'https://genius.com/Meek-mill-respect-the-game-lyrics',
 'https://genius.com/Aretha-franklin-respect-lyrics',
 'https://genius.com/The-notorious-big-respect-lyrics',
 'https://genius.com/Lil-peep-no-respect-freestyle-lyrics',
 'https://genius.com/Genius-english-translations-bts-respect-english-translation-lyrics',
 'https://genius.com/Run-the-jewels-all-due-respect-lyrics',
 'https://genius.com/Travis-scott-money-power-respect-lyrics',
 'https://genius.com/Wiz-khalifa-respect-lyrics']

**Exercise:** Adapting the syntax above, list the name of the artist for each of these songs.
    
Hint: Remember that `name` is contained within the dictionary `primary_artist`.

In [15]:
[song['result']['primary_artist']['name'] for song in data['response']['hits']]

['Blueface',
 'Damso',
 'Meek Mill',
 'Aretha Franklin',
 'The Notorious B.I.G.',
 'Lil Peep',
 'Genius English Translations',
 'Run The Jewels',
 'Travis Scott',
 'Wiz Khalifa']

### Working with responses

Now we have a response from the API, and we've parsed it into a Python data structure that we know how to use (a dictionary). But now what do we do with it?

Let's find the URL for the lyrics for Aretha Franklin's "Respect."

Remember that we've already got search_term stored from way back up at the top. It's what we searched for:

In [16]:
search_term

'Respect'

In [17]:
artist = "Aretha Franklin" # you should already be thinking: how can I hook this up with the NYT article data...
lyrics_url = None # will hold the URL (a string) once we find a match

In [18]:
for song in data['response']['hits']:
    # compare the primary artist of each hit to the artist we want
    if song['result']['primary_artist']['name'] == artist:
        lyrics_url = song['result']['url']
        break # stop at the first match
print(lyrics_url)

https://genius.com/Aretha-franklin-respect-lyrics


We've got our URL!
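If you expect to repeat this lookup for other songs, the loop can be wrapped in a function. The version below is a sketch under a few assumptions: `find_lyrics_url` is a hypothetical name, it returns the first matching hit, and it returns `None` when no hit matches the artist. The `sample_hits` list is a hand-made subset of the results shown earlier.

```python
def find_lyrics_url(hits, artist):
    """Return the lyrics URL of the first hit by `artist`, or None."""
    for hit in hits:
        result = hit["result"]
        if result["primary_artist"]["name"] == artist:
            return result["url"]
    return None  # no hit matched the artist

# A hand-made subset of the hits above, with only the fields we need.
sample_hits = [
    {"result": {"url": "https://genius.com/Meek-mill-respect-the-game-lyrics",
                "primary_artist": {"name": "Meek Mill"}}},
    {"result": {"url": "https://genius.com/Aretha-franklin-respect-lyrics",
                "primary_artist": {"name": "Aretha Franklin"}}},
]

found_url = find_lyrics_url(sample_hits, "Aretha Franklin")
print(found_url)
```

With the real response, the call would be `find_lyrics_url(data['response']['hits'], artist)`.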