## Nested dictionaries refresher

In [1]:
people_dic = {"person_1": {'name': 'John', 'age': '27', 'sex': 'Male'},
          "person_2": {'name': 'Marie', 'age': '22', 'sex': 'Female'}}

print(people_dic["person_1"]['name'])
print(people_dic["person_1"]['age'])
print(people_dic["person_1"]['sex'])

John
27
Male


<b>Iterating through a nested dictionary</b>

In [2]:
for p_id, p_info in people_dic.items():
    print("\nPerson ID:", p_id)
    
    for key in p_info:
        print(key + ':', p_info[key])


Person ID: person_1
name: John
age: 27
sex: Male

Person ID: person_2
name: Marie
age: 22
sex: Female


# Installing Spotipy

What we need:

- BeautifulSoup, a powerful webscraping library. Install it into your conda environment with `conda install -c anaconda beautifulsoup4`

- requests, a library to send HTTP requests. Install it with `conda install -c anaconda requests`

- spotipy, an API wrapper library to conveniently access the Spotify API in Python. We will need it towards the middle of the week. Install it via `conda install -c conda-forge spotipy`

- A Spotify account. A free one is fine; it does not need to be a paid plan. You can use your existing account, but it is better to create a separate account for development instead of using your personal one.

In order to get access to the Spotify API, we need our own client_id and client_secret. Never share these with anyone and don't upload them to GitHub (you can reset the secret if you accidentally publish it). To get them, go to developer.spotify.com, log in with your account credentials and accept the terms and conditions. Then click on "create an app", give it a name, and confirm that you understand the TOS. The green rectangle that now appears is your app (see screenshots). Click it and copy your client_id and client_secret from there.


In [3]:
import spotipy

## Loading credentials from another config file

In Python, a configuration file is a file that contains settings or configuration options for a program or application. These files are typically written in a format that can be easily parsed by Python, such as INI, YAML, or JSON.

An import statement in Python allows you to load and use code from another Python file or module in your program. So, when we say "import config file" in Python, we typically mean that we are loading a Python module or file that contains configuration settings for our program.

To create a config file in Python, you can create a separate Python module that defines the settings as global variables, a dictionary, or another data structure. For example, you might create a file called "config.py". Then, in your main program, you can load these settings with the "import" statement.

This will load the variables defined in the "config.py" file and make them available for use in the "main.py" program. By keeping configuration settings separate from the main program logic, you can more easily update and manage these settings without affecting the core functionality of your program.
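
As a minimal sketch of the idea above, the snippet below writes a hypothetical `config.py` with placeholder values to a temporary folder and loads it. The variable names `client_id` and `client_secret` match the ones used later in this notebook; the values and the `load_config` helper are assumptions for illustration only. In a real project you would simply create `config.py` next to the notebook and run `import config`.

```python
import importlib.util
import pathlib
import tempfile

# Hypothetical contents of config.py. The values are placeholders, not real credentials.
CONFIG_SOURCE = (
    'client_id = "your-client-id"\n'
    'client_secret = "your-client-secret"\n'
)


def load_config(path):
    """Load a Python file as a module so its variables become attributes."""
    spec = importlib.util.spec_from_file_location("config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = pathlib.Path(tmp_dir) / "config.py"
    config_path.write_text(CONFIG_SOURCE)
    config = load_config(config_path)

# The settings are now available as attributes, just as after "import config".
print(config.client_id)
print(config.client_secret)
```

Keeping the real `config.py` out of version control (for example by listing it in `.gitignore`) is one way to avoid publishing the secret by accident.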





In [4]:
import config

ModuleNotFoundError: No module named 'config'

## Starting with Spotify API

To use the Spotify API through Spotipy, we first need the Spotify account and app credentials described above. Once we have them, we initialize the client and look at the search method, which takes a query "q". In this example we search for "Lose yourself":

In [3]:
import spotipy
import json
from spotipy.oauth2 import SpotifyClientCredentials


# Initialize Spotipy with the app credentials
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id=config.client_id,
                                                           client_secret=config.client_secret))

# The "sp" object has two functions that are useful here.
# The first one is:
# .search(q='', limit=n)
# .search(q="track:"+song_name+" artist:"+artist_name, limit=5) to restrict to a song name and artist.
# The "q" keyword is the query you want to perform on Spotify: song name, artist, ...
# The "limit" keyword limits the number of returned results.
#
# The second one is:
# .audio_features([URL|URI|ID])
# which returns some features of the song that, after cleanup, we can use to characterize it.

results = sp.search(q="Lose yourself",limit=3,market="GB")
results
#json_results = json.dumps(results, ensure_ascii=True)
#json_results
#results['tracks']['items'][0]['external_urls']["spotify"]

NameError: name 'config' is not defined
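
The field-filter syntax described in the comments above can be wrapped in a small helper. This is only a sketch: `build_track_query` is a hypothetical name, not part of spotipy, and it does nothing more than assemble the string that is passed as `q`.

```python
def build_track_query(song_name, artist_name=None):
    """Build a search string such as 'track:Lose Yourself artist:Eminem'."""
    query = "track:" + song_name.strip()
    if artist_name:
        query += " artist:" + artist_name.strip()
    return query


query = build_track_query("Lose Yourself", "Eminem")
print(query)
# With an authenticated client this could be used as: sp.search(q=query, limit=5)
```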

<b>More readable version</b>

In [None]:
import pprint

pprint.pprint(results)

<b>Navigating through the dictionary</b>

In [None]:
results['tracks']["items"][0].keys()

In [None]:
results['tracks']['items'][0]

<b>Getting the track ID</b>

In [None]:
results['tracks']['items'][0]['id']

In [None]:
for item in results['tracks']['items']:
    print("The name of song is: '{}' and the id is: {}".format(item['name'],item["id"]))

In [None]:
import pandas as pd

song = sp.search(q="Bad Guy", limit=50,market="GB") 
song

In [None]:
song["tracks"]["items"][2]

In [None]:
#pprint.pprint(song['tracks']['items'][0]['uri'])
song["tracks"]["items"][0]["uri"]

# Understanding the json

Understanding the hierarchy of a JSON document can drive you mad. You can therefore consider using an online viewer where you paste your JSON and see the "tree" structure of the file.

https://codebeautify.org/jsonviewer

So, copy the JSON output from the previous query and paste it into the website's left panel. On the right panel you will be able to see the hierarchy of the JSON file.
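
If you prefer to stay in the notebook, a few lines of Python can print a similar tree. The sketch below is an illustration under assumptions: `key_tree_lines` and the `sample` dictionary are made up here, the helper shows only keys, and for lists it descends into the first element only.

```python
def key_tree_lines(data, indent=0, max_depth=3):
    """Return indented lines describing the keys of nested dicts and lists."""
    lines = []
    if indent >= max_depth:
        return lines
    if isinstance(data, dict):
        for key, value in data.items():
            lines.append("  " * indent + str(key))
            lines.extend(key_tree_lines(value, indent + 1, max_depth))
    elif isinstance(data, list) and data:
        # Lists of results usually share one structure, so inspect the first item only.
        lines.append("  " * indent + "[0]")
        lines.extend(key_tree_lines(data[0], indent + 1, max_depth))
    return lines


# Made-up miniature of a search response, for illustration only.
sample = {"tracks": {"items": [{"name": "Song A", "id": "abc"}], "total": 1}}
tree = key_tree_lines(sample, max_depth=4)
print("\n".join(tree))
```

Calling `key_tree_lines(results)` on a real search response would list the same keys that the next cells walk through by hand.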

Let's get used to the json at hand.

In [None]:
print("The json file has the following keys: ",list(results.keys())) # We can see that we only have tracks
print("The 'tracks' key has the following child keys: ",list(results["tracks"].keys())) # Let's check the values
print("The query we made is: ",results["tracks"]["href"]) # URL of the query we have made
print("The song's info is contained in: ",results["tracks"]["items"]) # Items (the actual tracks)
print("The limit of the query we've made is: ",results["tracks"]["limit"]) # Limit we have chosen
print("The next page, if any: ",results["tracks"]["next"]) # Link to the next page (the next `limit` tracks)
print("The offset (starting position): ",results["tracks"]["offset"]) # Actual offset (starting point)
print("The previous page, if any: ",results["tracks"]["previous"]) # Link to the previous page
print("Total number of results: ",results["tracks"]["total"]) # Number of matches

## Checking albums

In [None]:
print(results["tracks"]["items"][0]["album"]) # we have more info about the album
print("****************\n")
print(list(results["tracks"]["items"][0]["album"].keys())) # Will check artists, id, name, release date, total tracks 
print("****************\n")
print(results["tracks"]["items"][0]["album"]["artists"]) # List with artists and information
print("****************\n")
print("The album ID is: ",results["tracks"]["items"][0]["album"]["id"]) # Album ID 
print("****************\n")
print(results["tracks"]["items"][0]["album"]["name"]) # Album name (if it's a single you'll get the name of the song)

## Other Info

In [None]:
results["tracks"]["items"][0]["artists"] # Track artists
results["tracks"]["items"][0]["id"] # Track ID
results["tracks"]["items"][0]["name"] # Track name
results["tracks"]["items"][0]["popularity"] # Popularity index
results["tracks"]["items"][0]["uri"] # Spotify URI (the track ID prefixed with 'spotify:track:')

## Embedded track player

{'spotify': 'https://open.spotify.com/track/4O2N861eOnF9q8EtpH8IJu'}

In [None]:
from IPython.display import IFrame

#track_id = "1rfORa9iYmocEsnnZGMVC4"
# The embed URL expects the bare track ID, not the full 'spotify:track:...' URI
track_id = '3hgl7EQwTutSm6PESsB7gZ'
IFrame(src="https://open.spotify.com/embed/track/"+track_id+"?utm_source=generator",
       width="320",
       height="80",
       frameborder="0",
       allowtransparency="true",
       allow="encrypted-media",
      )

In [None]:
def play_song(track_id):
    return IFrame(src="https://open.spotify.com/embed/track/"+track_id,
       width="320",
       height="80",
       frameborder="0",
       allowtransparency="true",
       allow="encrypted-media",
      )

In [None]:
play_song("1rfORa9iYmocEsnnZGMVC4")
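
The embed URL is built from a bare track ID, while this notebook also shows URIs (`spotify:track:<id>`) and open.spotify.com links. A small helper can normalise all three forms before calling `play_song`. This is a sketch based only on the formats that appear in this notebook, and `track_id_from_reference` is a hypothetical name.

```python
def track_id_from_reference(reference):
    """Return the bare track ID from an ID, a 'spotify:track:<id>' URI, or a track URL."""
    reference = reference.strip()
    if reference.startswith("spotify:track:"):
        return reference.split(":")[-1]
    if "open.spotify.com/track/" in reference:
        # Drop any query string (e.g. '?si=...') after the ID.
        return reference.split("/track/")[-1].split("?")[0]
    return reference


print(track_id_from_reference("spotify:track:3hgl7EQwTutSm6PESsB7gZ"))
print(track_id_from_reference("https://open.spotify.com/track/4O2N861eOnF9q8EtpH8IJu"))
print(track_id_from_reference("1rfORa9iYmocEsnnZGMVC4"))
```

For example, `play_song(track_id_from_reference(results["tracks"]["items"][0]["uri"]))` would accept the URI taken straight from a search result.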

# Getting the audio features of a song

In [None]:
results["tracks"]["items"][0]["id"]

In [None]:
sp.audio_features(results["tracks"]["items"][0]["id"] )

In [None]:
## example of a Beethoven song
sp.audio_features("1Y25uib0Cu5kYTtNuRqyRU")

## Building a DataFrame of audio features

In [None]:
sp.audio_features(song["tracks"]["items"][0]["uri"])

In [None]:
#my_dict = sp.audio_features(song["tracks"]["items"][0]["uri"])[0] # you can also provide a list of URIs

# Collect the audio features of every song returned by the search
list_of_songs = []
for item in song["tracks"]["items"]:
    list_of_songs.append(sp.audio_features(item["uri"])[0])
df = pd.DataFrame(list_of_songs)
df=df[["danceability","energy","loudness","speechiness","acousticness",
    "instrumentalness","liveness","valence","tempo","id","duration_ms"]]

df
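
As the commented-out line in the cell above hints, `sp.audio_features` also accepts a list of IDs or URIs, which avoids one request per song. The sketch below assumes the Web API accepts at most 100 IDs per audio-features request (check the current Spotify documentation) and uses a stand-in `fake_fetch` function so that it runs without credentials. `chunked` and `fetch_audio_features` are hypothetical helper names.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` containing at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_audio_features(track_ids, fetch, batch_size=100):
    """Call `fetch` once per batch and collect the non-empty results."""
    features = []
    for batch in chunked(track_ids, batch_size):
        # Tracks without audio features can come back as None, so skip those.
        features.extend(f for f in fetch(batch) if f is not None)
    return features


# Stand-in for sp.audio_features so the example runs offline.
def fake_fetch(batch):
    return [{"id": track_id, "tempo": 120.0} for track_id in batch]


ids = ["track_{}".format(n) for n in range(250)]
features = fetch_audio_features(ids, fake_fetch)
print(len(features))
# With a real client: fetch_audio_features(ids, sp.audio_features)
```

The resulting list of dictionaries can be passed to `pd.DataFrame` in the same way as `list_of_songs`.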

## Searching a playlist

In [None]:
playlist = sp.user_playlist_tracks("spotify", "7beGd4yYY1qpsBv6K3clFZ",market="GB")

In [None]:
playlist["items"][0]

## Extracting a song from a playlist

In [None]:
playlist["items"][0]["track"]["id"]

In [None]:
play_song(playlist["items"][0]["track"]["id"])

In [None]:
print(list(playlist.keys())) # Let's look at items and total:
print("Total number of songs in the playlist: ",playlist["total"]) #  Let's check items:
len(playlist["items"]) # It is limited to 100 tracks, we will have to fix it:

## Extracting the songs of a playlist

Pagination using "next"

When you collect songs from a playlist using sp.playlist_tracks, you're limited by the limit parameter, which has a maximum (and default) value of 100. When the playlist has more than 100 songs, you have to collect them by navigating through the "pages" of the results.

The parameter offset allows you to retrieve results starting at a certain position. It is zero-based, so with a limit of 100 an offset of 100 skips the first 100 songs and returns the second "page" (songs 101 to 200). An offset of 200 would give you the third page, and so on.

The function sp.next() does the same, but in a simpler way: it can be used on the results from any request to directly retrieve the results for the next page.

We can check whether there's a next page or not by accessing the key next on the results from any request.
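
The loop this describes can be sketched without touching the API. Below, `FakeClient` is a made-up stand-in that mimics the two things the pattern relies on: each page is a dict with `items` and `next`, and a `next()` method returns the following page. The names `FakeClient`, `get_page` and `collect_all_items` are assumptions for illustration; the real `sp` object is used in the same way in the function that follows.

```python
class FakeClient:
    """Offline stand-in for a paginated API client (illustration only)."""

    def __init__(self, all_items, page_size):
        self.all_items = all_items
        self.page_size = page_size

    def get_page(self, offset=0):
        end = offset + self.page_size
        # 'next' holds the offset of the following page, or None on the last page.
        next_offset = end if end < len(self.all_items) else None
        return {"items": self.all_items[offset:end], "next": next_offset}

    def next(self, page):
        return self.get_page(page["next"])


def collect_all_items(client):
    """Follow 'next' until it is empty, accumulating the items of every page."""
    page = client.get_page()
    items = list(page["items"])
    while page["next"]:
        page = client.next(page)
        items.extend(page["items"])
    return items


client = FakeClient(all_items=list(range(250)), page_size=100)
all_items = collect_all_items(client)
print(len(all_items))
```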

In [None]:
def get_playlist_tracks(username, playlist_id):
    results = sp.user_playlist_tracks(username,playlist_id,market="GB")
    tracks = results['items']
    while results['next']:
        results = sp.next(results)
        tracks.extend(results['items'])
    return tracks

In [None]:
tracks=get_playlist_tracks("spotify", "4rnleEAOdmFAbRcNCgZMpY")

## Extracting the song IDs from the playlist

In [None]:
list_of_audio_features=[]
for item in range(0,10):
    #print (tracks[item]["track"]["id"])
    list_of_audio_features.append(sp.audio_features(tracks[item]["track"]["id"])[0])

In [None]:
df=pd.DataFrame(list_of_audio_features)    
df=df[["danceability","energy","loudness","speechiness","acousticness",
    "instrumentalness","liveness","valence","tempo","id","duration_ms"]]

df

## Extra useful functions

## Getting the artists of the playlist 

In [None]:
def get_artists_from_track(track):
    return [artist["name"] for artist in track["artists"]]

In [None]:
def get_artists_from_playlist(playlist_id):
    tracks_from_playlist = get_playlist_tracks("spotify", playlist_id)
    # Flatten the per-track artist lists and keep each name once
    unique_artists = {artist
                      for item in tracks_from_playlist
                      for artist in get_artists_from_track(item["track"])}
    return list(unique_artists)

In [None]:
get_artists_from_playlist("4rnleEAOdmFAbRcNCgZMpY")

# Getting albums 

In this section we will work with albums to extract information. We will start by extracting all the albums of an artist.

In [None]:
def get_albums_from_artist(artist_id):
    results = sp.artist_albums(artist_id, limit=50, country="GB")
    albums = results['items']
    while results['next']:
        results = sp.next(results)
        albums.extend(results['items'])
    return albums

# Same pagination loop, but returning only the album IDs (no country filter here)
def get_album_ids_from_artist(artist_id):
    results = sp.artist_albums(artist_id, limit=50)
    albums = results['items']
    while results['next']:
        results = sp.next(results)
        albums.extend(results['items'])
    return [album["id"] for album in albums]

Example: Coldplay

In [None]:
coldplay_id = "4gzpq5DPGxSnKTe4SA8HAU"
coldplay_albums = get_albums_from_artist(coldplay_id)
coldplay_album_ids = get_album_ids_from_artist(coldplay_id)

# Check which artists appear on Coldplay's albums
set([artist["name"] for album in coldplay_albums for artist in album["artists"]])

## Getting the songs of a given album

In [None]:
def get_track_ids_from_albums(album_ids):
    return list(set([i["id"] for j in album_ids for i in sp.album(j)["tracks"]["items"]]))

In [None]:
coldplay_songs = get_track_ids_from_albums(coldplay_album_ids)

len(coldplay_songs)