# Step 1: Create a Spotify developer account

A developer account is needed to obtain the Client ID and Client Secret used to authenticate with the API.

[Spotify Developer](https://developer.spotify.com/dashboard/login)

**"endpoint"**

![image.png](attachment:image.png)


# Step 2: Import all necessary libraries

In [None]:
import spotipy  # Python client library for the Spotify Web API
from spotipy.oauth2 import SpotifyClientCredentials
import pandas as pd
import time  # used to pause between API requests in the loop

# Step 3: Connect to the API
We must log in and establish a connection to Spotify's API. We'll need our "Client ID" and "Client Secret" to do this.

In [None]:
client_id = ""
client_secret = ""

client_credentials_manager = SpotifyClientCredentials(client_id, client_secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
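Hard-coding the Client ID and Client Secret in a notebook makes them easy to leak when the notebook is shared. A minimal sketch of one alternative is to read them from environment variables. The helper name `load_spotify_credentials` is hypothetical; the variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the ones spotipy itself looks for when no credentials are passed explicitly.

```python
import os


def load_spotify_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    `env` defaults to os.environ; a plain dict can be passed in for testing.
    """
    env = os.environ if env is None else env
    client_id = env.get("SPOTIPY_CLIENT_ID", "")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET", "")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before connecting."
        )
    return client_id, client_secret
```

The returned pair can then be passed to `SpotifyClientCredentials(client_id, client_secret)` exactly as in the cell above.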

# Step 4: Track ID
I created a playlist in my Spotify account and added all the tracks from Burna Boy's Love, Damini album.

Playlist link: https://open.spotify.com/playlist/2ltQFUnLHmcIFYaMDOlTFZ

Now we’ll write a function to get the IDs for each track of this playlist.

In [None]:
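def getTrackIDs(user, playlist_id):
    """Return the IDs of the tracks in the given user's playlist."""
    ids = []
    # The playlist response embeds only the first page of tracks (up to 100)
    playlist = sp.user_playlist(user, playlist_id)
    for item in playlist['tracks']['items']:
        track = item['track']
        ids.append(track['id'])
    return ids

ids = getTrackIDs('Temidayo', '2ltQFUnLHmcIFYaMDOlTFZ')  # collect the track IDs from the playlist

# 'Temidayo': passed as the `user` argument (the playlist owner's Spotify username)
# '2ltQFUnLHmcIFYaMDOlTFZ': the playlist ID, taken from the end of the playlist link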
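The `getTrackIDs` function above reads the `tracks` object embedded in the playlist response, which holds a single page of results. That is enough for this playlist, but a longer playlist would be cut off. The sketch below shows one way to walk every page, assuming a spotipy version that provides `playlist_items` and `next`. The function name `get_all_track_ids` is hypothetical, and the client is passed in as a parameter rather than read from the global `sp`.

```python
def get_all_track_ids(sp, playlist_id):
    """Collect track IDs from every page of a playlist.

    `sp` is an authenticated spotipy.Spotify client (assumed).
    """
    ids = []
    page = sp.playlist_items(playlist_id, additional_types=("track",))
    while page:
        for item in page["items"]:
            track = item.get("track")
            # Skip entries without a usable track (for example removed or local files)
            if track and track.get("id"):
                ids.append(track["id"])
        # Follow the "next" link until the last page is reached
        page = sp.next(page) if page.get("next") else None
    return ids
```

It would be called as `ids = get_all_track_ids(sp, '2ltQFUnLHmcIFYaMDOlTFZ')`.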

![image.png](attachment:image.png)

Let's get the number of tracks in the playlist and the track IDs.

In [None]:
print(len(ids))
print(ids)

**Total tracks:** 66

Track IDs

# Make a function that will extract all track information from IDs.

Now that we have the track IDs, we need a function that retrieves specific information about each track.

## List of features from the track ID:

- Name: Track name.
- Album: Name of the album the track belongs to.
- Artist: Artist name.
- Release Date: Date on which the album was released.
- Length: Duration of the track in milliseconds.
- Popularity: Song popularity, on a scale from 0 to 100.
- Acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic.
- Danceability: How suitable a track is for dancing, based on a combination of musical elements such as tempo, rhythm stability, beat strength, and overall regularity.
- Energy: A perceptual measure of intensity and activity, ranging from 0.0 to 1.0.
- Instrumentalness: Predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context, while rap and spoken-word tracks are clearly vocal. The closer the value is to 1.0, the more likely the track is vocal-free. Values above 0.5 are intended to represent instrumental tracks, but confidence increases as the value approaches 1.0.
- Liveness: Detects the presence of an audience in the recording. Higher values indicate a higher likelihood that the track was performed live; a value above 0.8 indicates a strong probability that the track is live.
- Loudness: The overall loudness of the track in decibels (dB).
- Speechiness: Detects the presence of spoken words in a track.
- Tempo: The estimated overall tempo of a track in beats per minute (BPM). In musical terms, tempo is the speed or pace of a piece and derives directly from the average beat duration.
- Time Signature: The estimated number of beats in each bar.
- Valence: A measure from 0.0 to 1.0 of the musical positivity of a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
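Because Length is reported in milliseconds, the raw values are hard to read at a glance. As a small illustration, the hypothetical helper `format_duration` below converts a millisecond value to a `minutes:seconds` string; it is not part of spotipy.

```python
def format_duration(length_ms):
    """Convert a duration in milliseconds to an 'm:ss' string."""
    total_seconds = length_ms // 1000  # drop the fractional second
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"
```

For example, a track with a length of 201000 ms would be shown as `3:21`.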

In [None]:
def getTrackFeatures(track_id):
    """Return the metadata and audio features of one track as a list."""
    meta = sp.track(track_id)
    features = sp.audio_features(track_id)

    # Metadata
    name = meta['name']
    album = meta['album']['name']
    artist = meta['album']['artists'][0]['name']  # first artist credited on the album
    release_date = meta['album']['release_date']
    length = meta['duration_ms']
    popularity = meta['popularity']

    # Audio features (audio_features returns a list with one entry per track)
    acousticness = features[0]['acousticness']
    danceability = features[0]['danceability']
    energy = features[0]['energy']
    instrumentalness = features[0]['instrumentalness']
    liveness = features[0]['liveness']
    loudness = features[0]['loudness']
    speechiness = features[0]['speechiness']
    tempo = features[0]['tempo']
    valence = features[0]['valence']
    time_signature = features[0]['time_signature']

    # Return the values in the same order as the DataFrame columns
    track = [name, album, artist, release_date, length, popularity, danceability, acousticness, energy, instrumentalness, liveness, loudness, speechiness, tempo, valence, time_signature]
    return track

In [None]:
# Collect the features of every track
tracks = []
for track_id in ids:
    time.sleep(.5)  # pause between requests to avoid hitting the API rate limit
    track = getTrackFeatures(track_id)
    tracks.append(track)

# Create the DataFrame once, after all tracks have been collected
df = pd.DataFrame(tracks, columns = ['name', 'album', 'artist', 'release_date', 'length', 'popularity', 'danceability', 'acousticness', 'energy', 'instrumentalness', 'liveness', 'loudness', 'speechiness', 'tempo', 'Valence', 'time_signature'])
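The loop above makes two API calls per track and sleeps between tracks. Since `sp.audio_features` also accepts a list of track IDs (the Web API allows up to 100 per request), the audio features could instead be fetched in batches. The following is a sketch under that assumption; `chunked` and `get_audio_features_batched` are hypothetical helper names, and the client is passed in as a parameter.

```python
def chunked(seq, size):
    """Yield consecutive slices of `seq` with at most `size` items each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def get_audio_features_batched(sp, track_ids, batch_size=100):
    """Fetch audio features for many tracks with one request per batch.

    `sp` is an authenticated spotipy.Spotify client (assumed).
    """
    features = []
    for batch in chunked(track_ids, batch_size):
        # One request returns one feature entry per ID in the batch
        features.extend(sp.audio_features(batch))
    return features
```

With 66 track IDs this needs a single request for the audio features instead of 66.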
    

In [None]:
df.columns

In [None]:
df.info()

In [None]:
df.shape

In [None]:
df.head(50)

In [None]:
df.to_csv("Burnaboy_Albums_v2.csv", sep = ',')

## Reference

- https://www.theverge.com/tldr/2018/2/5/16974194/spotify-recommendation-algorithm-playlist-hack-nelson
- https://betterprogramming.pub/how-to-extract-any-artists-data-using-spotify-s-api-python-and-spotipy-4c079401bc37
- https://github.com/plamere/spotipy/tree/master/examples
- https://spotipy.readthedocs.io/en/master/
- https://developer.spotify.com/documentation/web-api/

# Section 2

# Step 1: Data Visualization

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly
import plotly.express as px

In [None]:
Burna_df = pd.read_csv("Burnaboy_Albums_v2.csv")
Burna_df.head()

In [None]:
Burna_df.drop(columns=['Unnamed: 0'], inplace=True)  # remove the unwanted "Unnamed: 0" column
Burna_df.head()
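The `Unnamed: 0` column dropped above is the row index that `to_csv` writes by default. As a hedged illustration, the sketch below shows that saving with `index=False` avoids the extra column in the first place. It uses a small made-up DataFrame and an in-memory buffer instead of the real CSV file.

```python
import io

import pandas as pd

# Made-up sample data, for illustration only
sample = pd.DataFrame({"name": ["Track A", "Track B"], "popularity": [70, 65]})

buffer = io.StringIO()
sample.to_csv(buffer, index=False)  # do not write the row index
buffer.seek(0)

reloaded = pd.read_csv(buffer)
print(reloaded.columns.tolist())
```

If the file is saved this way, the `drop` step is no longer needed when reading it back.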

In [None]:
Burna_df.shape #Get the number of Rows & Columns

In [None]:
Burna_df.info() # Get information about the data

In [None]:
# Descriptive statistics summarize the central tendency, dispersion and shape
# of a dataset's distribution, excluding NaN values.
Burna_df.describe().transpose()

In [None]:
Burna_df.isnull().sum()  # count the missing values in each column

In [None]:
Burna_df.columns #Show column header

**Sort based on Popularity**

Sort the songs by popularity in descending order, from highest to lowest.

In [None]:
# Optional: make the release date the index
# Burna_df.set_index("release_date", inplace=True)
# Burna_df.head()

# General form: DataFrameName.set_index("column_name_to_set_as_index", inplace=True/False)

In [None]:
Burna_df.sort_values(by="popularity", ascending=False).head(20)

In [None]:
%matplotlib inline

f,ax = plt.subplots(figsize=(14,10))
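# numeric_only=True skips text columns such as name and album (needs pandas 1.5 or later)
sns.heatmap(Burna_df.corr(numeric_only=True), annot=True, fmt=".1f", ax=ax)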
plt.show()
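The heatmap is built from a correlation matrix, which is only defined for numeric columns, while the DataFrame also holds text columns such as `name` and `album`. One version-independent way to handle this, sketched below on made-up data, is to select the numeric columns explicitly before calling `corr`.

```python
import pandas as pd

# Made-up sample data, for illustration only
sample = pd.DataFrame({
    "name": ["Track A", "Track B", "Track C"],
    "energy": [0.2, 0.5, 0.8],
    "loudness": [-12.0, -8.0, -4.0],
})

numeric = sample.select_dtypes(include="number")  # keep numeric columns only
corr = numeric.corr()
print(corr)
```

The resulting `corr` can be passed to `sns.heatmap` in the same way as in the cell above.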

In [None]:
Burna_df.head()

In [None]:
# Let's separate the year from the release date

In [None]:
Burna_df["release_date"].head()

In [None]:
Burna_df["Release Year"] = pd.DatetimeIndex(Burna_df["release_date"]).year
Burna_df["Release Year"].head()
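Spotify reports release dates with day, month, or year precision, so a `release_date` column can mix values such as `2022-07-08`, `2019-07`, and `2013`. Depending on the pandas version, such mixed strings may not parse cleanly as dates. Since the year is always the first four characters, a minimal alternative sketch is to slice it out of the string. The sample values below are made up.

```python
import pandas as pd

# Made-up release dates with day, month, and year precision
release_dates = pd.Series(["2022-07-08", "2019-07", "2013"])

# The year is always the first four characters of the string
release_years = release_dates.str[:4].astype(int)
print(release_years.tolist())
```

Applied to the notebook's data, this would read `Burna_df["release_date"].str[:4].astype(int)`.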

In [None]:
Dance_df = Burna_df.groupby('Release Year', as_index=False).agg({"danceability": "mean"})
Dance_df.head()

In [None]:
px.line(x="Release Year", y="danceability", data_frame=Dance_df, title="Trend of Song Danceability")

In [None]:
popularity_df = Burna_df.groupby('Release Year', as_index=False).agg({"popularity": "mean"})
popularity_df.head()

In [None]:
px.bar(x="Release Year", y="popularity", data_frame=popularity_df, title="Trend of Song Popularity")

In [None]:
album_df = Burna_df.groupby('album', as_index=False).agg({"popularity": "mean"})
album_df.head()

In [None]:
px.bar(x="album", y="popularity", data_frame=album_df, title="Album by Popularity")

In [None]:
px.scatter(Burna_df, x='energy', y='loudness', trendline="lowess", title="Loudness vs Energy Correlation")

In [None]:
px.scatter(Burna_df, x='acousticness', y='popularity', trendline="lowess", title="Popularity vs Acousticness Correlation")

In [None]:
duration = Burna_df.groupby('Release Year', as_index=False).agg({"length": "mean"})
duration.head()

In [None]:
px.line(x="Release Year", y="length", data_frame=duration, title="Trend of Song Duration in ms")

In [None]:
# Let's analyze the Love, Damini album.
LoveDamini_df = Burna_df.loc[Burna_df["album"] == "Love, Damini"]  # keep only the rows from the Love, Damini album

In [None]:
LoveDamini_df.shape

In [None]:
px.bar(x="name", y="popularity", data_frame=LoveDamini_df, title="Song by Popularity")

In [None]:
px.bar(x="name", y="energy", data_frame=LoveDamini_df, title="Love, Damini Energy")

In [None]:
px.bar(x="name", y="danceability", data_frame=LoveDamini_df, title="Love, Damini Danceability")