# ***TDm 1 : Exploring Spotipy, the Python Library for the Spotify API***


## *Prerequisites*

* To complete this lab, you'll need:

  * The Spotipy package (install it with pip: `pip install spotipy`)
  * A Spotify account

In [None]:
# Replace ... by missing code
# Spotipy install command
...
!pip install beautifulsoup4

![Image of spotipy](https://i.morioh.com/2020/03/26/aa197f4fbded.jpg)

## *Setup*

To get started, you'll need to create an application on the Spotify Developer Dashboard. Follow these steps:

  1. Go to the Spotify Developer Dashboard https://developer.spotify.com/dashboard/
  2. Log in with your Spotify account
  3. Click "Create an App" and fill in the required information
  4. Once your app is created, click on it to view the app details
  5. Copy the "Client ID" and "Client Secret" values

## **Task 1 : Spotipy Authorization**

In order to use the Spotify API, we need to authenticate our requests using OAuth. Follow these steps to authorize your application:

  6. Import the Spotipy library :

In [None]:
# Import commands
import ...
import os 
import requests
from bs4 import BeautifulSoup

  7. Set your client ID and client secret as environment variables of this notebook :

In [None]:
# Your credentials
os.environ['SPOTIPY_CLIENT_ID']= '...'
os.environ['SPOTIPY_CLIENT_SECRET']= '...'
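
Before creating the client, it can help to check that both variables were actually filled in. The snippet below is a minimal sketch: `missing_credentials` and `REQUIRED_VARS` are hypothetical names introduced for illustration, and the check simply treats an empty value or the `'...'` placeholder as "not set".

```python
import os

# The two variable names used in the cell above.
REQUIRED_VARS = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")

def missing_credentials(env=None):
    """Return the names of required variables that are unset, empty, or still '...'."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS
            if env.get(name, "").strip() in ("", "...")]

# Example with a plain dict standing in for os.environ.
print(missing_credentials({"SPOTIPY_CLIENT_ID": "abc", "SPOTIPY_CLIENT_SECRET": "..."}))
```

Calling `missing_credentials()` with no argument inspects the real environment of the notebook.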

  8. Create a spotipy client object. You can use SpotifyClientCredentials() which will use the environment variables you just set.\
  https://spotipy.readthedocs.io/en/2.22.1/#spotipy.oauth2.SpotifyClientCredentials

In [None]:
# Import the proposed method from spotipy
from spotipy.oauth2 import ...

# Create your client using the method you imported
spotify = spotipy.Spotify(client_credentials_manager=...)

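  9. Use the current_user() method to test that your authorization is working correctly : 
  https://spotipy.readthedocs.io/en/2.22.1/#spotipy.client.Spotify.current_user \
   Remember to use the client you created to call the method. If the call is rejected, keep in mind that current_user() reads the profile of a logged-in user and therefore needs user authorization (for example SpotifyOAuth), which SpotifyClientCredentials() does not provide; in that case, a successful search() call is enough to confirm that the client works.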

In [None]:
# Call the method
user = ...
print(user['display_name'])

## **Task 2 : Search for tracks**

Now that we're authorized, let's try searching for some tracks. Use the search() method to search for a track by name : \
https://spotipy.readthedocs.io/en/2.22.1/#spotipy.client.Spotify.search

In [None]:
# Complete the code to search for your favorite song
results = spotify ...

# Find the correct way to access the 'items' key in results
items = results ...

# Print the name of each track, the name of its album and the name of its first artist
for item in items:
    print(item[...], item[...][...], item['artists'][0][...])
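
Track objects are nested dictionaries, so a missing key raises a `KeyError` in the middle of a loop. The sketch below works on a hand-written stand-in for one track (every value is a placeholder, not real API output) and uses a hypothetical helper, `summarize_track`, that falls back to `None` when a part is absent.

```python
# Hand-written stand-in for one track object; every value is a placeholder.
sample_item = {
    "id": "track-id-1",
    "name": "Example Track",
    "album": {"name": "Example Album"},
    "artists": [
        {"id": "artist-id-1", "name": "Example Artist"},
        {"id": "artist-id-2", "name": "Featured Artist"},
    ],
}

def summarize_track(item):
    """Return (track name, album name, first artist name); None for missing parts."""
    album = item.get("album") or {}
    artists = item.get("artists") or [{}]
    return item.get("name"), album.get("name"), artists[0].get("name")

print(summarize_track(sample_item))
```

Returning a tuple keeps the helper easy to unpack inside a `for` loop.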

## **Task 3 : Explore an artist**

Now that we're used to it, let's explore artists. Use the search() method again.

### **Step 1 : your new query**

In [None]:
# Choose your artist
your_favorite_artist  = '...'

# Use search method. Set the limit to max. 
artist = spotify ...

What is the maximum limit your query accepts ?
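
Because each query returns at most `limit` results, larger result sets are read page by page by combining `limit` with the `offset` parameter of search(). As a rough sketch, assuming a page size of 50 (the value used later in this lab) and a hypothetical helper named `page_offsets`, the offsets to request can be computed like this:

```python
def page_offsets(total, limit=50):
    """Return the offset of every page needed to cover `total` results."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return list(range(0, total, limit))

# 120 results with pages of 50 need three requests.
print(page_offsets(120))  # [0, 50, 100]
```

Each offset would then be passed to one search() call alongside the same `limit`.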

### **Step 2 : exploring your artist's information**
Print out the keys associated with your artist. What do you notice ?\
Then print out keys one level below. Is there more information available ?\
What's the return type (the output) of your search ?

In [None]:
# Code the printing of keys.
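
One possible way to look at keys level by level is a small recursive helper. This is only a sketch: `key_tree` is a hypothetical name, and `sample` is a hand-written dictionary with placeholder values rather than a real response.

```python
def key_tree(node, max_depth=2, depth=0):
    """Return the key names of nested dicts as indented lines, down to max_depth levels."""
    lines = []
    if isinstance(node, dict) and depth < max_depth:
        for key, value in node.items():
            lines.append("  " * depth + str(key))
            lines.extend(key_tree(value, max_depth, depth + 1))
    return lines

# Hand-written stand-in for a response; the values are placeholders.
sample = {"tracks": {"href": "...", "items": [], "limit": 50, "next": None}}
print("\n".join(key_tree(sample)))
print(type(sample))
```

Raising `max_depth` shows deeper levels; lists are not expanded, so their elements have to be inspected separately.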

Now, let's access the next song in your query. Print out its name.

In [None]:
# Find the next song
next_song = ...
next_song_name = ...

# Print it out


Bonus : You could even launch a new query using the URL in 'next'.

In [None]:
# Code here
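
Following 'next' repeatedly amounts to a loop that stops when the field is empty. The sketch below keeps the fetching step abstract: `collect_items` and its `fetch_next` argument are hypothetical, and the two pages are fake data so the loop can run without network access. With Spotipy, `fetch_next` could wrap the client's `next()` method; for search results the reply is expected to be nested under 'tracks' again, which is an assumption worth verifying on a real response.

```python
def collect_items(first_page, fetch_next, max_pages=3):
    """Gather 'items' from a paging object and from the pages that follow it."""
    items, page, seen = [], first_page, 0
    while page is not None and seen < max_pages:
        items.extend(page["items"])
        seen += 1
        # Only ask for another page when the current one advertises a 'next' URL.
        page = fetch_next(page) if page.get("next") else None
    return items

# Fake two-page result, used only to exercise the loop.
page2 = {"items": ["c"], "next": None}
page1 = {"items": ["a", "b"], "next": "url-of-page-2"}
print(collect_items(page1, lambda page: page2))  # ['a', 'b', 'c']
```

The `max_pages` cap keeps an experiment from issuing an unbounded number of requests.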

Last but not least, define a function returning the IDs of the artists on a song.

In [None]:
# Choose the correct information to return

def get_artist_id(artists):

  artist_id = []
  
  for i in artists:
    artist_id.append(...)

  return artist_id
  
get_artist_id(artist['tracks']['items'][0]['artists'])

## **Task 4 : Have some lyrics and create your dataset**

Here is a list of artists that you can change as you like. From this list, you'll build a dataset containing the lyrics and metrics of their songs.

In [None]:
artists = ['Beyoncé', 'Ed Sheeran', 'Shakira', 'Adele', 'A.R. Rahman', 'Abida Parveen', 'Amr Diab', 'Khaled', 'Fela Kuti', 'Angélique Kidjo', 'Cesaria Evora', 'Buena Vista Social Club', 'Santana', 'Juanes', 'Julio Iglesias', 'Avicii', 'ABBA', 'Zara Larsson', 'Sigur Rós', 'Björk', 'Mozart l\'Opéra Rock', 'Indochine', 'Christine and the Queens', 'Jean-Jacques Goldman', 'Stromae', 'Gotye', 'Sia', 'AC/DC', 'Cold Chisel', 'Keith Urban', 'BTS', 'EXO', 'G-Dragon', 'Blackpink', 'Tarkan', 'Sezen Aksu', 'Nusrat Fateh Ali Khan', 'Cheb Mami', 'Souad Massi', 'Fairuz', 'Omar Khairat', 'Seu Jorge', 'Caetano Veloso', 'Os Mutantes', 'Gilberto Gil', 'Tito Puente', 'Rubén Blades', 'Daddy Yankee', 'Enrique Iglesias', 'Rosalía', 'Eminem', 'Bob Dylan', 'Ray Charles', 'Bob Marley', 'Wolfgang Amadeus Mozart', 'Johann Sebastian Bach', 'Antonio Vivaldi', 'Ludwig van Beethoven', 'Frédéric Chopin', 'Felix Mendelssohn', 'Franz Schubert', 'Hector Berlioz', 'Richard Wagner', 'Johnny Hallyday', 'The Rolling Stones', 'The Beatles', 'Jimi Hendrix', 'The Animals', 'Édith Piaf', 'Charles Trenet', 'Gilbert Bécaud', 'Louis Armstrong', 'Jacques Brel', 'Michel Polnareff', 'Stromae', 'Coldplay', 'Daft Punk', 'David Guetta', 'Tupac Shakur (2Pac)', 'Ben E. King', 'Mariah Carey', 'Whitney Houston', 'Michael Jackson', 'Prince', 'Madonna', 'Elton John', 'Billy Joel', 'Bruce Springsteen', 'Taylor Swift', 'Lady Gaga', 'Bruno Mars', 'Katy Perry', 'Rihanna', 'Justin Bieber', 'Ariana Grande', 'Drake', 'Kendrick Lamar', 'Cardi B', 'Travis Scott', 'Post Malone'] + \
 ['Ali Farka Touré', 'Tinariwen', 'Bombino', 'Salif Keita', 'Amadou & Mariam', 'Oumou Sangaré', 'Bassekou Kouyaté', 'Toumani Diabaté', 'Rokia Traoré', 'Cheikh Lô', 'Youssou N\'Dour', 'Baaba Maal', 'Nina Simone', 'Billie Holiday', 'Ella Fitzgerald', 'Nat King Cole', 'Louis Armstrong', 'Count Basie', 'Duke Ellington', 'John Coltrane', 'Miles Davis', 'Charlie Parker', 'Sonny Rollins', 'Thelonious Monk', 'Erik Satie', 'Claude Debussy', 'Maurice Ravel', 'Camille Saint-Saëns', 'Georges Bizet', 'Gabriel Fauré', 'Francis Poulenc', 'Henri Dutilleux', 'Arnold Schoenberg', 'Igor Stravinsky', 'Gustav Mahler', 'Richard Strauss', 'Benjamin Britten', 'Aaron Copland', 'Steve Reich', 'Philip Glass', 'Arvo Pärt', 'Kaija Saariaho', 'Ludovico Einaudi', 'Max Richter', 'Nils Frahm', 'Jóhann Jóhannsson', 'Ólafur Arnalds', 'Agnes Obel', 'Ane Brun', 'Sufjan Stevens', 'The National', 'Bon Iver', 'Fleet Foxes', 'Grizzly Bear', 'Animal Collective', 'Dirty Projectors', 'Arcade Fire', 'The xx', 'LCD Soundsystem', 'Radiohead', 'Sigur Rós', 'Mogwai', 'Godspeed You! Black Emperor', 'Swans', 'My Bloody Valentine', 'Cocteau Twins', 'Joy Division', 'New Order', 'The Cure', 'Depeche Mode', 'The Smiths', 'The Jesus and Mary Chain', 'Sonic Youth', 'Pixies', 'REM', 'Pearl Jam', 'Nirvana', 'Soundgarden', 'Alice in Chains', 'Tool', 'A Perfect Circle', 'Opeth', 'Mastodon', 'Meshuggah', 'Between the Buried and Me', 'The Dillinger Escape Plan', 'Converge', 'Neurosis', 'Isis', 'Cult of Luna', 'Russian Circles', 'Deafheaven', 'Emma Ruth Rundle', 'Chelsea Wolfe', 'King Woman', 'LINGUA IGNOTA', 'Pharmakon', 'Hildur Guðnadóttir', 'Anna von Hausswolff'] + \
 ['A.R. Rahman', 'Ilaiyaraaja', 'Yuvan Shankar Raja', 'Harris Jayaraj', 'Anirudh Ravichander', 'Shankar-Ehsaan-Loy', 'Vishal-Shekhar', 'G.V. Prakash Kumar', 'Amit Trivedi', 'Pritam', 'Hans Zimmer', 'Ludwig van Beethoven', 'Johann Sebastian Bach', 'Wolfgang Amadeus Mozart', 'Richard Wagner', 'Johannes Brahms', 'Franz Schubert', 'Felix Mendelssohn', 'Robert Schumann', 'Ludwig Minkus', 'Pyotr Ilyich Tchaikovsky', 'Sergei Prokofiev', 'Dmitri Shostakovich', 'Igor Stravinsky', 'Claude Debussy', 'Maurice Ravel', 'Erich Wolfgang Korngold', 'Max Steiner', 'Ennio Morricone', 'John Williams', 'Hans Zimmer', 'James Horner', 'Alan Menken', 'Randy Newman', 'Thomas Newman', 'Danny Elfman', 'Howard Shore', 'Michael Giacchino', 'Hans Zimmer', 'A.R. Rahman', 'Ilaiyaraaja', 'Yuvan Shankar Raja', 'Harris Jayaraj', 'Anirudh Ravichander', 'Shankar-Ehsaan-Loy', 'Vishal-Shekhar', 'G.V. Prakash Kumar', 'Amit Trivedi', 'Pritam', 'A.R. Rahman', 'Ilaiyaraaja', 'Yuvan Shankar Raja', 'Harris Jayaraj', 'Anirudh Ravichander', 'Shankar-Ehsaan-Loy', 'Vishal-Shekhar', 'G.V. Prakash Kumar', 'Amit Trivedi', 'Pritam', 'Ravi Shankar', 'Amjad Ali Khan', 'Ali Akbar Khan', 'Zakir Hussain', 'Ustad Vilayat Khan', 'Hariprasad Chaurasia', 'L. Subramaniam', 'N. Ravikiran', 'U. Srinivas', 'M.S. Subbulakshmi', 'Aruna Sairam', 'Bombay Jayashri', 'Sudha Ragunathan', 'T.M. Krishna', 'Sanjay Subrahmanyan', 'Shobha Gurtu', 'Kishori Amonkar', 'M. Balamuralikrishna', 'Pandit Jasraj', 'Shujaat Khan', 'Abida Parveen', 'Nusrat Fateh Ali Khan', 'Ghulam Ali', 'Mehdi Hassan', 'Reshma', 'Lata Mangeshkar', 'Mohammed Rafi', 'Kishore Kumar', 'Asha Bhosle', 'R.D. Burman', 'S.D. Burman', 'Mukesh', 'Talat Mahmood', 'Geeta Dutt', 'K.J. Yesudas', 'S. Janaki', 'P. Susheela', 'S.P. Balasubrahmanyam', 'K.S. Chithra', 'Usha Uthup', 'Alka Yagnik', 'Sunidhi Chauhan', 'Shreya Ghoshal']

artists = list(set(artists))
print(artists)
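
Removing duplicates with `set` also discards the original order, and the resulting order can differ between runs. If a stable order matters (for example to resume an interrupted loop), one alternative is `dict.fromkeys`, sketched here on a short sample list:

```python
# Short sample with duplicates, taken from the names above.
names = ['Stromae', 'ABBA', 'Stromae', 'Björk', 'ABBA']

# dict keys are unique and keep insertion order, so the first occurrence wins.
unique_in_order = list(dict.fromkeys(names))
print(unique_in_order)  # ['Stromae', 'ABBA', 'Björk']
```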

Feel free to add the metrics you want to the dataset, given two constraints : 
  * The number of metrics must be between 2 and 5 ;
  * The metrics must include popularity.

You can explore the available metrics in the API docs or directly by inspecting the queries you already made using search().

*Hint : you can search the API doc for a more straightforward function retrieving* ***features***... ;)

In [None]:
dataset = []

for artist in artists:

  artist = spotify.search(artist, limit = 50)

  # Retrieve songs of each artist
  dataset+=[(i['id'],i['name'],get_artist_id(i['artists'])) for i in artist['...']['...']]
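
The rows built above are tuples, which cannot be modified once created, while the next cells need to attach lyrics and metrics to each song. One option, sketched below with a hypothetical helper `to_row` and placeholder values, is to convert each tuple into a dictionary first.

```python
def to_row(song):
    """Convert an (id, name, artist_ids) tuple into a dict that can receive new fields."""
    track_id, name, artist_ids = song
    return {"id": track_id, "name": name, "artist_ids": artist_ids, "lyrics": None}

# Placeholder tuple with the same shape as the rows built in the loop above.
sample_dataset = [("track-id-1", "Example Track", ["artist-id-1"])]
rows = [to_row(song) for song in sample_dataset]

# Fields can now be filled in later, one song at a time.
rows[0]["lyrics"] = "placeholder lyrics"
print(rows[0])
```

A list of dictionaries like this can also be passed directly to `pandas.DataFrame` if a table is wanted at the end.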

Now that you have songs, find their lyrics. Unfortunately, this is not possible using the API. So, as usual, when in trouble, search the web...

In [None]:
# Search for lyrics for each song using the azlyrics.com website
for song in ...:

    # The track() function uses id as a parameter...
    track = spotify.track(...)
    track_name = track['name'].replace(' ', '').lower()
    artist_name = track['artists'][0]['name'].replace(' ', '').lower()

    url = f'https://www.azlyrics.com/lyrics/{artist_name}/{track_name}.html'
    response = requests.get(url)

    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html.parser')
        lyrics = soup.find_all('div', class_='col-xs-12 col-lg-8 text-center')[0].find_all('div')[6].get_text()
        # Add the lyrics to the dataset
        ... = lyrics
    else:
        # If no lyrics found, insert None
        ... = None
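
Removing spaces alone leaves accents and punctuation in the URL (think of 'Beyoncé' or 'AC/DC'), which is a likely cause of failed requests. The helper below is a sketch built on the assumption that the site's URLs contain only lowercase ASCII letters and digits; `az_slug` is a hypothetical name.

```python
import re
import unicodedata

def az_slug(text):
    """Lowercase the text, strip accents, and keep only ASCII letters and digits."""
    # NFKD splits accented letters into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping non-ASCII bytes removes the combining marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]", "", ascii_only.lower())

print(az_slug("Beyoncé"))         # beyonce
print(az_slug("AC/DC"))           # acdc
print(az_slug("Youssou N'Dour"))  # youssoundour
```

Letters without an ASCII decomposition are simply dropped, so some names may still need manual handling.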

Finally, you can add metrics.

In [None]:
# Obtain Spotify metrics for each song. Choose which to add...
for song in ... :
    features = spotify.audio_...(...)[0]
    # 'popularity' is a field of the track object, not of the audio features
    ... = spotify.track(...)['popularity']
    ...
    ...
    ...
    ...
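
Requesting metrics one song at a time means one HTTP call per song. The `audio_...` call used above takes a list of IDs, so several songs can be sent together; the batch size of 100 used below is an assumption to check against the Spotipy documentation for your version. `chunked` is a hypothetical helper, and the IDs are placeholders.

```python
def chunked(ids, size=100):
    """Split a list of IDs into consecutive batches of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [ids[i:i + size] for i in range(0, len(ids), size)]

# Placeholder IDs standing in for the track IDs stored in the dataset.
track_ids = [f"id-{n}" for n in range(250)]
batches = chunked(track_ids)
print([len(batch) for batch in batches])  # [100, 100, 50]
```

Each batch would then be passed to a single call, and the returned list read back in the same order as the IDs.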
