# Individual Project - Spotify Library Analysis and Music Recommendation
### 2019/12/4
### Darren Wang (hsiangw2@illinois.edu)
This project contains three main parts: 1. Working with the Spotify API and data wrangling 2. Making music recommendations based on my music library 3. Reflection

#### All code was tested under Databricks Runtime 6.1 (includes Apache Spark 2.4.4, Scala 2.11). The authorization flow below should not require any login credentials other than pasting two URIs back and forth. If you can't get past the authorization, please let me know.

# 1. Working with Spotify API and Data Wrangling

### (1) Getting Past Spotify Authorization in Databricks

1. Install Spotipy (Spotify API for Python) onto the current cluster and restart Python <br>
2. Check current version of Spark and Python

In [6]:
dbutils.library.installPyPI('spotipy')
dbutils.library.restartPython()

# Check Current Version
from platform import python_version
print('Spark Version: ', sc.version)
print('Python version: ', python_version())

Spotipy authorizes an application as follows: <br>
1. Input *username, scope (what data can be retrieved by this application), client_id, client_secret, and redirect_uri*
2. The original spotipy.util.prompt_for_user_token shows an authorization URL built from the input redirect_uri and waits for input via Python's raw_input function
3. The user opens that URL in any web browser and pastes the redirected URL back into Python
4. Authorization complete

**However**, Databricks does not support raw_input, which means I had to modify the function. <br>
To accept input in a Databricks notebook, Databricks provides the **widgets.text** method in its **dbutils** module, which shows a widget at the top of the notebook and expects a text input.

Now my customized authorization flow works as follows: <br>
1. Input *username, scope, client_id, client_secret, and redirect_uri* into prompt_for_user_token_modified()
2. A URL pops up, and we obtain the redirected authorization URI the same way as above
3. Invoke another function, get_token(), which grabs the redirect URI from the widget and gets the access token
4. With the token, invoke spotipy.Spotify
5. Authorization complete <br>

**Authorization lasts for one hour. To re-authorize, create a new cluster and re-run the authorization flow.**

**The chunk below defines the two modified functions and creates the widget**

In [8]:
from __future__ import print_function
import spotipy
import spotipy.util as util
import os
import spotipy.oauth2 as oauth2

dbutils.widgets.text("Redirect URI", "Please paste the redirect URL here")

def prompt_for_user_token_modified(username, scope = None, client_id = None,
        client_secret = None, redirect_uri = None, cache_path = None):
    ''' prompts the user to login if necessary and returns
        the user token suitable for use with the spotipy.Spotify 
        constructor
        Parameters:
         - username - the Spotify username
         - scope - the desired scope of the request
         - client_id - the client id of your app
         - client_secret - the client secret of your app
         - redirect_uri - the redirect URI of your app
         - cache_path - path to location to save tokens
    '''

    if not client_id:
        client_id = os.getenv('SPOTIPY_CLIENT_ID')

    if not client_secret:
        client_secret = os.getenv('SPOTIPY_CLIENT_SECRET')

    if not redirect_uri:
        redirect_uri = os.getenv('SPOTIPY_REDIRECT_URI')

    if not client_id:
        print('''
            You need to set your Spotify API credentials. You can do this by
            setting environment variables like so:
            export SPOTIPY_CLIENT_ID='your-spotify-client-id'
            export SPOTIPY_CLIENT_SECRET='your-spotify-client-secret'
            export SPOTIPY_REDIRECT_URI='your-app-redirect-url'
            Get your credentials at     
                https://developer.spotify.com/my-applications
        ''')
        raise spotipy.SpotifyException(550, -1, 'no credentials set')

    cache_path = cache_path or ".cache-" + username
    global sp_oauth
    sp_oauth = oauth2.SpotifyOAuth(client_id, client_secret, redirect_uri, 
        scope=scope, cache_path=cache_path)
    # try to get a valid token for this user, from the cache,
    # if not in the cache, then create a new one (this will send
    # the user to a web page where they can authorize this app)

    token_info = sp_oauth.get_cached_token()

    if not token_info:
        print('''
            User authentication requires interaction with your
            web browser. Once you enter your credentials and
            give authorization, you will be redirected to
            a url.  Paste that url you were directed to to
            complete the authorization.
        ''')
        auth_url = sp_oauth.get_authorize_url()
        try:
            import webbrowser
            webbrowser.open(auth_url)
            print("Opened %s in your browser" % auth_url)
        except Exception:
            print("Please navigate here: %s" % auth_url)

        print()
        
def get_token():
  response = dbutils.widgets.get("Redirect URI")
  code = sp_oauth.parse_response_code(response)
  token_info = sp_oauth.get_access_token(code)
  # Auth'ed API request
  if token_info:
    return token_info['access_token']
  else:
    return None
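
The key step in `get_token()` is pulling the authorization code out of the pasted redirect URL. As a rough illustration only (this is a standard-library sketch, not Spotipy's actual implementation; the helper name `extract_auth_code` and the sample URL are made up), the idea looks like this:

```python
from urllib.parse import urlparse, parse_qs

def extract_auth_code(redirected_url):
    """Return the `code` query parameter of a redirected URL, or None."""
    query = parse_qs(urlparse(redirected_url).query)
    values = query.get("code")
    return values[0] if values else None

# Hypothetical redirected URL, shaped like what the browser shows after login
sample_url = "https://www.google.com/?code=AQB-example-code&state=xyz"
auth_code = extract_auth_code(sample_url)
print(auth_code)
```

If the pasted text has no `code` parameter (for example, the widget still holds its placeholder text), the helper returns `None` instead of raising.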

Prompt for user token

In [10]:
username = ''
scope = ''
prompt_for_user_token_modified(username,
                                   scope,
                                   client_id = '',
                                   client_secret = '',
                                   redirect_uri = 'https://www.google.com/')

Get token and invoke spotipy.Spotify

In [12]:
token = get_token()
sp = spotipy.Spotify(auth=token)
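
Because the access token is only valid for about an hour, it can help to check how much time is left before starting a long series of requests. The sketch below assumes a token dictionary with an `expires_at` field holding a Unix timestamp, which is my understanding of how Spotipy lays out its token info; the helper name and the sample values are hypothetical.

```python
import time

def seconds_remaining(token_info, now=None):
    """Seconds until the token expires (negative if already expired)."""
    now = time.time() if now is None else now
    return token_info["expires_at"] - now

# Hypothetical token info issued at t=1000 with a one-hour lifetime
sample_token_info = {"access_token": "placeholder", "expires_at": 1000 + 3600}
print(seconds_remaining(sample_token_info, now=1000))   # full hour left
print(seconds_remaining(sample_token_info, now=5000))   # already expired
```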

### (2) Working with Spotify API and Data Wrangling

Now that we have passed the authorization, it's time to get some data from Spotify. <br>
First, get information on all saved songs in my personal library using the **current_user_saved_tracks** method of the **spotipy** module. <br>
Each request returns at most 50 songs, so we have to set up a simple for loop.
<br>


The data is returned as a list of dictionaries. <br>
Information includes *song_id*, *artist*, *available country*, *album*, *number of songs in album*, *issued date*, *album cover*, ... etc <br>
Currently, my library contains 606 songs.

In [14]:
import math
saved_tracks = []
tracks_cnt = sp.current_user_saved_tracks()['total']

for i in range(math.ceil(tracks_cnt / 50)):
    offset = 50 * i
    tracks = sp.current_user_saved_tracks(limit = 50, offset = offset)
    saved_tracks.extend(tracks['items'])

print('Total number of my saved songs: {}'.format(len(saved_tracks)))
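
The paging arithmetic above can be checked without calling the API. This minimal sketch (the helper `page_offsets` is an illustrative name) lists the offsets the loop would request for a library of 606 songs at 50 songs per page:

```python
import math

def page_offsets(total, page_size=50):
    """Offsets needed to page through `total` items, `page_size` at a time."""
    return [page_size * i for i in range(math.ceil(total / page_size))]

offsets = page_offsets(606)
print(len(offsets), offsets[0], offsets[-1])
```

The last page starts at offset 600 and simply returns fewer than 50 songs.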

With the *id* in each song's information, we can use **spotipy**'s **audio_features** method to get the audio features for each song. The code below requests one song at a time, so it loops through all saved tracks.<br>

The method returns 13 audio features: *duration_ms, key, mode, time_signature, acousticness, danceability, energy, instrumentalness, liveness, loudness, speechiness, valence, tempo*, as well as 5 other variables such as *id* and *track_href*. <br>

Four unwanted variables were dropped: *uri, track_href, analysis_url, type*. *Artist name, album name, release date*, and *song title* were added to the data for later analysis. <br>

Because scientific notation causes problems when converting to a Spark dataframe, *instrumentalness* was converted to a fixed-point string here and is cast back to double type in the Spark dataframe later.

For more information about each variable, please refer to: https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/

In [16]:
tracks_features = []
for i in range(len(saved_tracks)):
  # get track id
  song_id = saved_tracks[i]['track']['id']
  # get track audio features
  features = sp.audio_features(song_id)[0]
  # modify result
  # 1. add artist name
  features['name'] = saved_tracks[i]['track']['album']['artists'][0]['name']
  # 2. add album title
  features['album'] = saved_tracks[i]['track']['album']['name']
  # 3. add release_date
  features['release_date'] = saved_tracks[i]['track']['album']['release_date']
  # 4. add song title
  features['title'] = saved_tracks[i]['track']['name']
  # convert possible scientific notation to float
  features['instrumentalness'] = format(features['instrumentalness'], '.8f')
  # delete unwanted content
  del features['uri']
  del features['track_href']
  del features['analysis_url']
  del features['type']
  # append
  tracks_features.append(features)
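
To see what the formatting step does, here is a small standalone check using one of the tiny *instrumentalness* values that Python would otherwise print in scientific notation:

```python
value = 2.37e-05

as_default = str(value)             # Python's default string form uses scientific notation
as_fixed = format(value, '.8f')     # fixed-point string with 8 decimal places

print(as_default)
print(as_fixed)
print(float(as_fixed))              # the string converts back to a float
```

Keep in mind that eight decimal places cannot represent very small values; anything well below 1e-08 becomes `0.00000000`.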

Now, with the desired data, create a Spark dataframe.

In [18]:
from pyspark.sql import Row
features_df = spark.createDataFrame(Row(**x) for x in tracks_features)

And convert *instrumentalness* back to double type.

In [20]:
# convert instrumentalness back to double type

from pyspark.sql.types import DoubleType
features_df = features_df.withColumn("instrumentalness", features_df["instrumentalness"].cast(DoubleType()))

The schema and each column type are shown as follows:

In [22]:
# show schema and column type
features_df.schema.fields

### Take a look at my music library

In [24]:
# first view of the data
display(features_df)

acousticness,album,danceability,duration_ms,energy,id,instrumentalness,key,liveness,loudness,mode,name,release_date,speechiness,tempo,time_signature,valence
0.0376,二十一世紀的破青年,0.528,309987,0.655,1SooDjb7CRmKCojdxespi7,0.00267,1,0.0977,-5.675,1,無妄合作社,2019-10-01,0.0258,86.581,4,0.589
0.751,Winter Sweet,0.799,329840,0.518,300O5aB8IWTCGL2xIfx2xq,0.000799,7,0.0959,-9.048,0,Soft Lipa,2009-12-31,0.0667,95.009,4,0.475
0.0683,收斂水,0.699,244320,0.765,2JItrvgRjJtv3BYs5QHh6p,0.0,4,0.283,-8.854,0,Soft Lipa,2009-07-22,0.038,96.01,4,0.706
0.298,收斂水,0.854,181600,0.682,6xAtGKQApLCPft6DYR6fFK,2.37e-05,8,0.116,-9.984,1,Soft Lipa,2009-07-22,0.0333,102.001,4,0.842
0.255,你所不知道的杜振熙之內部整修,0.582,210000,0.883,6i2oWpLh1Bt8F5BVaiyHmr,0.0214,10,0.109,-7.635,0,Soft Lipa,2013-07-31,0.0656,175.925,4,0.558
0.0309,Awake,0.549,277560,0.663,4yX50PMifhGzVEo1wv3guc,0.888,5,0.0938,-7.555,0,Tycho,2014-03-18,0.0464,114.494,4,0.245
0.0279,Those Who Ride With Giants [Deluxe],0.585,358870,0.319,1J1m1KU9TUVPu9U9WT70BK,0.752,0,0.212,-16.684,1,Those Who Ride With Giants,2014-05-30,0.0287,129.992,4,0.226
0.786,魚仔,0.666,280955,0.217,2sb6AZQLeoD3VgA4zglQB6,0.0,1,0.0766,-11.042,0,Crowd Lu,2017-05-26,0.0335,82.011,4,0.395
0.0435,Wild Cherry,0.814,300000,0.672,5uuJruktM9fMdN9Va0DUMl,0.0,9,0.061,-12.068,1,Wild Cherry,1976,0.0619,109.394,4,0.933
0.434,太陽,0.602,316160,0.481,7a6y8KUW6hpSkv5GJuH7j0,2.4e-05,4,0.0896,-8.773,1,Cheer Chen,2009-01-22,0.0253,103.885,4,0.306


With **spark.sql** and the **display** function provided by Databricks, we can take a closer look.
#### For example, **when were the songs in my library released?**
##### Ans: **From 1965 to 2019, with an obvious increasing pattern over time.**

In [26]:
# Release year of my library
features_df.createOrReplaceTempView("features")

song_years = spark.sql("SELECT substring(release_date, 0, 4) AS year, COUNT(substring(release_date, 0, 4)) as NUMBER FROM features GROUP BY year ORDER BY year ASC")

display(song_years)

year,NUMBER
1965,1
1967,1
1968,2
1969,2
1970,3
1971,1
1972,1
1973,4
1974,1
1975,2
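
The same grouping can be reproduced in plain Python, which is handy for sanity-checking the SQL on a few rows. The release dates below are taken from the sample rows shown earlier; note that one of them is only a year (`1976`), which slicing the first four characters handles as well.

```python
from collections import Counter

release_dates = ["2019-10-01", "2009-12-31", "2009-07-22", "2009-07-22",
                 "2013-07-31", "2014-03-18", "2014-05-30", "2017-05-26",
                 "1976", "2009-01-22"]

# Take the first four characters as the year, then count per year
year_counts = Counter(date[:4] for date in release_dates)

for year in sorted(year_counts):
    print(year, year_counts[year])
```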


Or,

#### Who are my top 10 artists?

##### Ans:

In [28]:
# My top 10 artists
top_artists = spark.sql("SELECT name AS artist, COUNT(name) as songs FROM features GROUP BY artist ORDER BY songs DESC")

display(top_artists.head(10))

artist,songs
Radiohead,28
落日飛車 Sunset Rollercoaster,23
Kendrick Lamar,20
toe,14
甜梅號,13
MONO,13
Broken Social Scene,12
Jay Chou,11
CHTHONIC,10
J. Cole,10


And,

#### How many albums are there?

##### Ans: 389 in total.

In [30]:
# How many albums are there?
albums_num = spark.sql("SELECT COUNT(DISTINCT album) as album_number FROM features")

display(albums_num)

album_number
389


Even more,

#### Duration of my saved songs, in minutes.

In [32]:
# How about song duration?
# duration is recorded in ms; for ease of understanding, let's convert it to minutes
dura_mins = spark.sql("SELECT ROUND(duration_ms/60000) as minutes, COUNT(ROUND(duration_ms/60000)) as count FROM features GROUP BY minutes ORDER BY minutes ASC")

display(dura_mins)

minutes,count
2.0,19
3.0,61
4.0,190
5.0,187
6.0,72
7.0,35
8.0,18
9.0,9
10.0,5
11.0,3
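
One detail if you reproduce this outside Spark: Spark SQL's `ROUND` rounds halves up, while Python's built-in `round` rounds halves to the nearest even number. A small sketch using the `decimal` module to mimic the Spark behaviour (the helper name `duration_minutes` is illustrative):

```python
from decimal import Decimal, ROUND_HALF_UP

def duration_minutes(duration_ms):
    """Round a duration in milliseconds to whole minutes, with halves rounding up."""
    minutes = Decimal(duration_ms) / Decimal(60000)
    return int(minutes.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

print(duration_minutes(309987))  # first song in the sample rows
print(duration_minutes(150000))  # exactly 2.5 minutes
print(round(150000 / 60000))     # Python's built-in round, for comparison
```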


Now, in order to make recommendations, we need data on songs that I don't like. However, the Spotify Web API doesn't provide access to a user's disliked songs. <br>

In that case, the easiest way to get data on disliked songs would be to create a playlist containing all the songs I don't like, but that requires a lot of time and effort, and it also means that I would have to revisit all the songs I dislike! <br>

My alternative plan is to choose playlists from genres I don't like that much (*K-Pop*, *Electronic Dance Music*) and playlists I definitely cannot enjoy (*Songs from Disney films*, *Audio books of scary stories*, etc.), and merge them together. <br>

Since **Spotipy** does not provide a method for extracting song information from a given playlist, I went with a plain HTTP GET request here (the equivalent of **curl GET**), issued with **Python**'s **requests** package; the JSON it returns is parsed with **Python**'s **json** package. <br>

##### The code chunk below defines a function that, given a *playlist id* found in the Spotify app, creates a Spark dataframe containing all the needed audio features and information.

In [35]:
def audio_features_collector(playlistid):
  
  import json
  import math
  import requests
  # count total songs in a playlist
  url = "https://api.spotify.com/v1/playlists/{}/tracks".format(playlistid)
  headers = {'Authorization': "Bearer {}".format(token)}
  r = requests.get(url, headers=headers)
  parsed = json.loads(r.text)
  count_songs = parsed['total']

  # get information for all songs
  tracks = []
  for i in range(math.ceil(count_songs / 50)):
      offset = 50 * i
      current_url = "https://api.spotify.com/v1/playlists/{}/tracks?limit=50&offset={}".format(playlistid, offset)
      req = requests.get(current_url, headers=headers)
      current_parsed = json.loads(req.text)
      tracks.extend(current_parsed['items'])

  print('Total number of songs in playlist: {}'.format(len(tracks)))
 
  # get tracks features
  tracks_features = []
  
  # iterate over the tracks actually fetched
  for j in range(len(tracks)):
    # get track id
    song_id = tracks[j]['track']['id']
    # get track audio features
    features = sp.audio_features(song_id)[0]
    # modify result
    # 1. add artist name
    features['name'] = tracks[j]['track']['album']['artists'][0]['name']
    # 2. add album title
    features['album'] = tracks[j]['track']['album']['name']
    # 3. add release date
    features['release_date'] = tracks[j]['track']['album']['release_date']
    # 4. add song title
    features['title'] = tracks[j]['track']['name']
    # convert possible scientific notation to float
    features['instrumentalness'] = format(features['instrumentalness'], '.8f')
    # delete unwanted content
    del features['uri']
    del features['track_href']
    del features['analysis_url']
    del features['type']
    # append
    tracks_features.append(features)
    
  print('Total number of song features collected: {}'.format(len(tracks_features)))
  
  # convert to Spark DF
  from pyspark.sql import Row
  features_df = spark.createDataFrame(Row(**x) for x in tracks_features)
  
  # convert instrumentalness back to double type
  from pyspark.sql.types import DoubleType
  features_df = features_df.withColumn("instrumentalness", features_df["instrumentalness"].cast(DoubleType()))
  
  return(features_df)
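
One thing the function above does not handle is missing data: as far as I understand the API, playlist items may come back with an empty `track`, and audio features may be `None` for some tracks, either of which would raise an error inside the loop. The following is a minimal, API-free sketch of a guard; the helper name `usable_pairs` and the sample data are hypothetical, and it assumes the features list is aligned with the items list.

```python
def usable_pairs(items, features_list):
    """Yield (track, features) pairs where both pieces of data are present."""
    for item, features in zip(items, features_list):
        track = item.get("track")
        if track is None or features is None:
            continue  # skip entries that lack a track or its audio features
        yield track, features

# Hypothetical sample: the second item has no track, the third has no features
items = [{"track": {"id": "a", "name": "Song A"}},
         {"track": None},
         {"track": {"id": "c", "name": "Song C"}}]
features_list = [{"id": "a", "tempo": 120.0}, None, None]

kept = list(usable_pairs(items, features_list))
print([track["name"] for track, _ in kept])
```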

##### Get audio features for songs in 9 different playlists.

In [37]:
# dance party from Dance Party
dance_party = audio_features_collector("37i9dQZF1DXaXB8fQg7xif")

# k-pop from Essential K-Pop
kpop = audio_features_collector("37i9dQZF1DX14fiWYoe7Oh")

# kids song from Silly Songs for Kids
kids = audio_features_collector("37i9dQZF1DX2ls3pMfEx4A")

# pop from Pop Rising
pop = audio_features_collector("37i9dQZF1DWUa8ZRTfalHk")

# dance music from Bass Arcade
dance = audio_features_collector("37i9dQZF1DX0hvSv9Rf41p")

# scary_stories from Scary Stories
scary_stories = audio_features_collector("37i9dQZF1DX0RGhgSIsFBm")

# stories from Short Stories
stories = audio_features_collector("37i9dQZF1DWXmUJqjFaQNQ") 

# Disney songs from Disney Favorites
disney = audio_features_collector("37i9dQZF1DWVs8I62NcHks")

# k-pop dance music from K-Party Dance Mix
kpop_dance = audio_features_collector("37i9dQZF1DX4RDXswvP6Mj")

##### Merge all playlists into one Spark dataframe named *disliked*, combine it with the audio features from my music library, and create a new column that specifies whether each song came from *disliked* or my music library.

In [39]:
# append all disliked playlists together
disliked = dance_party.union(kpop).union(kids).union(pop).union(dance).union(scary_stories).union(stories).union(disney).union(kpop_dance)

# add a new column that indicates like or dislike
from pyspark.sql.functions import lit

# 0 for all disliked songs
disliked = disliked.withColumn('like', lit(0))

# 1 for all liked(saved) songs
liked = features_df.withColumn('like', lit(1))

# join them together
full_data = disliked.union(liked)

# 2. Making Recommendation

## (1) Data Preprocessing for Building Model

Unlike many other modeling tools, **pyspark.ml** requires all features to be assembled into one single column of vectors. Almost all of our audio features are numerical or a ready-to-use binary variable coded as 0 or 1 (*mode*). However, the key of a song should not be treated as numerical and should therefore be encoded as a dummy vector. <br>

With **OneHotEncoderEstimator** from **pyspark.ml.feature** this is done as follows:

In [41]:
# since the key of a song should not be treated as numerical
# we need to transform key to dummy vector
from pyspark.ml.feature import OneHotEncoderEstimator

encoder = OneHotEncoderEstimator(inputCols=['key'], outputCols=['keyVec'])
full_data = encoder.fit(full_data).transform(full_data)
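
For intuition, this is roughly what the encoder produces for the *key* column. The sketch assumes 12 possible keys (0 to 11) and mirrors the encoder's default of dropping the last category, so key 11 becomes an all-zero vector; it uses plain lists rather than Spark's sparse vectors, and `one_hot_key` is an illustrative name.

```python
def one_hot_key(key, num_keys=12, drop_last=True):
    """Encode a key index as a 0/1 list, optionally dropping the last category."""
    size = num_keys - 1 if drop_last else num_keys
    vector = [0] * size
    if 0 <= key < size:
        vector[key] = 1
    return vector

print(one_hot_key(1))    # 1 in position 1
print(one_hot_key(11))   # last category: all zeros
```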

With **VectorAssembler** from **pyspark.ml.feature**, the audio features are further assembled into one vector column named *features*.

In [43]:
# specify needed columns
model_cols = ['acousticness', 'danceability', 'duration_ms', 'energy', 'instrumentalness', 'liveness', 'loudness', 'mode', 'speechiness', 'tempo', 'time_signature', 'valence', 'keyVec']

from pyspark.ml.feature import VectorAssembler
assembler = VectorAssembler(inputCols=model_cols,outputCol="features")
# Now let us use the transform method to transform our dataset
model_data = assembler.transform(full_data)

Most of the time, it's a good idea to scale features before modeling. With the default settings of **StandardScaler** from **pyspark.ml.feature** (*withStd=True*, *withMean=False*), the features are scaled to unit standard deviation but are not centered.

In [45]:
from pyspark.ml.feature import StandardScaler
standardscaler = StandardScaler().setInputCol("features").setOutputCol("scaled_features")
model_data = standardscaler.fit(model_data).transform(model_data)
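
As a rough picture of what this step does to a single feature, the sketch below divides a column by its sample standard deviation without subtracting the mean, which matches `StandardScaler`'s defaults (`withStd=True`, `withMean=False`). The tempo values are taken from the sample rows shown earlier, and `scale_to_unit_std` is an illustrative name.

```python
import statistics

def scale_to_unit_std(values):
    """Divide each value by the sample standard deviation (no centering)."""
    std = statistics.stdev(values)
    return [v / std for v in values]

tempos = [86.581, 95.009, 96.01, 102.001]
scaled = scale_to_unit_std(tempos)

print(scaled)
print(statistics.stdev(scaled))  # approximately 1.0
```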

## (2) Variable Selection via Chi-square Test and Train Test Split

With **ChiSqSelector** from **pyspark.ml.feature**, select features that differ significantly between liked and disliked songs at the 95% confidence level.

In [47]:
# feature selection with chisquareSelector
from pyspark.ml.feature import ChiSqSelector

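# selectorType='fpr' is needed for the fpr threshold to take effect;
# the default selector type is 'numTopFeatures'
css = ChiSqSelector(featuresCol='scaled_features', outputCol='selected_features', labelCol='like', selectorType='fpr', fpr=0.05)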
model_data = css.fit(model_data).transform(model_data)

##### Randomly split the data into training set and testing set by 80:20 proportion.

In [49]:
# random train test split
train, test = model_data.randomSplit([0.8, 0.2], seed=480)
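
`randomSplit` assigns each row to a split at random, so the resulting sizes are only approximately 80:20, and the seed makes the assignment repeatable. A pure-Python sketch of the same idea (not Spark's actual sampling code; `random_split` is an illustrative helper):

```python
import random

def random_split(rows, train_fraction=0.8, seed=480):
    """Assign each row to train or test with an independent random draw."""
    rng = random.Random(seed)
    train, test = [], []
    for row in rows:
        (train if rng.random() < train_fraction else test).append(row)
    return train, test

rows = list(range(1000))
train_rows, test_rows = random_split(rows)
print(len(train_rows), len(test_rows))
```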

## (3) Train a Logistic Regression Model for Making Recommendation

##### Predictions were made on the testing data set; the first five predictions and their true values are shown below

In [51]:
from pyspark.ml.classification import LogisticRegression

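# logistic regression; note that it is trained on 'scaled_features' (all features),
# not on the 'selected_features' column produced by the chi-square selector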
lr = LogisticRegression(labelCol='like', featuresCol='scaled_features', maxIter=50)
logit_model = lr.fit(train)
predict_train = logit_model.transform(train)
predict_test = logit_model.transform(test)
predict_test.select('like', 'prediction').show(5)
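
Under the hood, a fitted logistic regression turns a feature vector into a probability by passing a weighted sum through the sigmoid function, and predicts 1 when that probability exceeds the threshold (0.5 by default in **pyspark.ml**). A tiny sketch with made-up weights, not the coefficients of the model above:

```python
import math

def predict_like(features, weights, intercept, threshold=0.5):
    """Return (probability, label) for one feature vector."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability, int(probability > threshold)

# Made-up weights for two scaled features, purely for illustration
weights = [1.5, -2.0]
intercept = 0.25

print(predict_like([1.0, 0.2], weights, intercept))
print(predict_like([0.1, 1.0], weights, intercept))
```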

## (4) Evaluating the Model

With **BinaryClassificationEvaluator** from **pyspark.ml.evaluation** we can get the AUC and use it to evaluate the model. <br>
The model has a training AUC of 0.83 and a testing AUC of 0.78, which is already a pretty good result, but we can improve it further.

In [53]:
from pyspark.ml.evaluation import BinaryClassificationEvaluator
evaluator = BinaryClassificationEvaluator(rawPredictionCol='rawPrediction', labelCol='like')
predict_test.select("like", "rawPrediction", "prediction","probability").show(5)
print("AUC for training set is {}".format(evaluator.evaluate(predict_train)))
print("AUC for testing set is {}".format(evaluator.evaluate(predict_test)))
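
AUC can be read as the probability that a randomly chosen liked song receives a higher score than a randomly chosen disliked one. The sketch below computes it by brute force over all pairs, which is fine for a toy example but is not how Spark computes it; the scores and labels are invented.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(pairwise_auc(labels, scores))
```

Here three of the four liked/disliked pairs are ordered correctly, so the result is 0.75.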

## (5) Improving the Model and Trying Other Models

We can use grid search over some parameters to find the best model. Also, to reduce the chance of over-fitting, 5-fold cross-validation was used.
*Because of the particular grid chosen here, the CV doesn't improve model performance: the grid only varies elasticNetParam, which has no effect while regParam stays at its default of 0.0. Given more time to try different grids, the model may improve.*

In [55]:
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator

paramGrid = ParamGridBuilder()\
    .addGrid(lr.elasticNetParam, [0.0, 0.5, 1.0])\
    .addGrid(lr.maxIter, [50])\
    .build()

# 5-fold crossValidator
cv = CrossValidator(estimator=lr, estimatorParamMaps=paramGrid, evaluator=evaluator, numFolds=5)
# fit model
# this is going to take a while
cvModel = cv.fit(train)
# make predictions
predict_train = cvModel.transform(train)
predict_test = cvModel.transform(test)
print("AUC for training set with CV is {}".format(evaluator.evaluate(predict_train)))
print("AUC for testing set with CV is {}".format(evaluator.evaluate(predict_test)))
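
For reference, 5-fold cross-validation splits the training rows into five parts and, for each parameter setting, trains on four parts and evaluates on the remaining one. A minimal sketch of the index bookkeeping (Spark assigns rows to folds randomly; this version uses a simple round-robin for clarity, and `k_fold_indices` is an illustrative name):

```python
def k_fold_indices(n_rows, n_folds=5):
    """Yield (train_indices, validation_indices) for each fold."""
    folds = [list(range(i, n_rows, n_folds)) for i in range(n_folds)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

splits = list(k_fold_indices(10))
for train_idx, val_idx in splits:
    print(len(train_idx), val_idx)
```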

## (6) Make Recommendation on Random Songs

Grab a random, noisy-looking playlist for making recommendations; take New Music Friday as an example. <br>
Prepare the data for the model to make recommendations.

In [57]:
# get audio features
new_music = audio_features_collector("37i9dQZF1DX4JAvHpjipBk")

# encode key variable
new_music_data = encoder.fit(new_music).transform(new_music)

# assemble features
new_music_data = assembler.transform(new_music_data)

# scale features
new_music_data = standardscaler.fit(new_music_data).transform(new_music_data)
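
A caveat on the cell above: it fits the encoder and the scaler again on the new playlist, so the new songs are scaled by their own statistics rather than by those of the training data. The usual alternative is to keep the fitted objects from training and only call `transform` on new data. The idea, shown in plain Python with made-up numbers (the helper names are illustrative):

```python
import statistics

def fit_scale(train_values):
    """Learn the scaling factor (sample standard deviation) from training data."""
    return statistics.stdev(train_values)

def apply_scale(values, std):
    """Scale values using a standard deviation learned elsewhere."""
    return [v / std for v in values]

train_tempos = [80.0, 100.0, 120.0]   # made-up training values
new_tempos = [90.0, 95.0]             # made-up new-playlist values

train_std = fit_scale(train_tempos)
reused = apply_scale(new_tempos, train_std)               # scaled like the training data
refit = apply_scale(new_tempos, fit_scale(new_tempos))    # scaled by its own spread

print(reused)
print(refit)
```

The two results differ noticeably, which is why a model trained on one scaling can behave unpredictably on data scaled another way.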

##### Make prediction

In [59]:
predict_new_music = logit_model.transform(new_music_data)

#### And... These songs are probably worth checking out:

In [61]:
predict_new_music.createOrReplaceTempView("recommendation")

recommendation_que = spark.sql("SELECT title, album, name AS artist FROM recommendation WHERE prediction = 1")

display(recommendation_que)

title,album,artist
Heartless,Heartless,The Weeknd
Fantasía,Fantasía,Ozuna
Blinding Lights,Blinding Lights,The Weeknd
Glittery - From The Kacey Musgraves Christmas Show,The Kacey Musgraves Christmas Show,Kacey Musgraves
My Name is Dark - Art Mix,My Name is Dark (Art Mix),Grimes
Everything Else Has Gone Wrong,Everything Else Has Gone Wrong,Bombay Bicycle Club
Don't Let It Break Your Heart - Single Edit,Don't Let It Break Your Heart,Louis Tomlinson
W (feat. Gunna),W (feat. Gunna),Koffee
Trust3000 (feat. Dijon),Trust3000 (feat. Dijon),No Rome
California Halo Blue,California Halo Blue,AWOLNATION


# Reflection

#### (1) In order to improve the accuracy of recommendation: 

There are lots of methods worth trying to improve recommendation accuracy, such as trying different models (random forest, GBM, etc.) or adding customized features (genres, male or female vocals, etc.). One thing that should be done first, though, is gathering songs that I actually don't like. The playlists I used to construct the *disliked* data were collected from certain genres and are therefore very structured. However, there are songs from other genres that I dislike, and songs from my favorite genres that I dislike for no particular reason, so more precise disliked-song data is required to make predictions more accurate. Another thing worth trying would be adjusting the model over time by listening to its recommendations and carefully labeling them as liked or disliked.

#### (2) If given more time:

Originally, I was trying to build a real-time music recommender, but I spent too much time getting past authorization in Databricks and wrangling the data, and ran out of time. I am still going to try building an automatic music recommender after the semester. It would rely on **Spotipy**'s ability to update a user's playlists.

To realize an automatic music recommender, a cron job as well as more advanced Spark features are required.

# References

1. Analysing My Spotify Music Library with Jupyter and a Bit of Pandas: https://vsupalov.com/analyze-spotify-music-library-with-jupyter-pandas
2. Making Your Own Spotify Discover Weekly Playlist – Towards Data Science: https://towardsdatascience.com/making-your-own-discover-weekly-f1ac7546fedb
3. Music Recommendation Service with the Spotify API, Spark MLlib and Databricks: https://medium.com/@polomarcus/music-recommendation-service-with-the-spotify-api-spark-mllib-and-databricks-7cde9b16d35d
4. Extract Songs from the Spotify API: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/6937750999095841/1043779841828769/6197123402747553/latest.html

# Documentation

1. Spotify for Developers: https://developer.spotify.com/
2. Spotipy Documentation: https://spotipy.readthedocs.io/en/latest/#