# Music Recommendation System using Spotify API

A music recommendation system is an application of data science that helps users discover new and relevant music based on their preferences and listening behaviour. Personalized recommendations have become an essential part of the digital music landscape: streaming platforms such as Spotify and Apple Music rely on them to keep the listening experience engaging. If you want to learn how to build one, this article is for you. I'll take you through building a music recommendation system using the Spotify API and Python.

### Step 1: Set Up Your Spotify Account

First things first, you'll need a Spotify account. If you don't have one yet, no worries! Just head over to Spotify's website and sign up for free. Once you've got your account sorted, log in.

### Step 2: Navigate to Your Spotify Developer Dashboard

Now it's time to open your Spotify Developer Dashboard. This is where the magic happens. Go to the [Spotify Developer Dashboard](https://developer.spotify.com/dashboard/). If this is your first visit, make sure to accept the agreement and verify your email. Once that's done, we can proceed to the next step.

### Step 3: Create Your App

Exciting stuff! Click on the option to create a new app. Fill in the necessary details for your app description. After that, you'll be directed to your client ID and client secret. Keep these credentials safe, as we'll need them for building our Music Recommendation System.



## How to Use

We'll use Python together with the Spotify API. Every request to the API must be authenticated with an access token, so once the required libraries are installed, the first code cell requests one using the Client Credentials flow.

## Required Libraries

Before diving into the code, make sure you have the following dependencies installed:

- Requests
- Spotipy
- Pandas
- NumPy
- Scikit-learn

In [1]:
import requests
import base64

# Replace with your own Client ID and Client Secret (never publish real credentials)
CLIENT_ID = 'YOUR_CLIENT_ID'
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'

# Base64 encode the client ID and client secret
client_credentials = f"{CLIENT_ID}:{CLIENT_SECRET}"
client_credentials_base64 = base64.b64encode(client_credentials.encode())

# Request the access token
token_url = 'https://accounts.spotify.com/api/token'
headers = {
    'Authorization': f'Basic {client_credentials_base64.decode()}'
}
data = {
    'grant_type': 'client_credentials'
}
response = requests.post(token_url, data=data, headers=headers)

if response.status_code == 200:
    access_token = response.json()['access_token']
    print("Access token obtained successfully.")
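else:
    # Fail loudly: the rest of the notebook cannot run without a token
    raise RuntimeError(f"Error obtaining access token (HTTP {response.status_code}).")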

Access token obtained successfully.
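
The cell above keeps the credentials directly in the notebook. As a minimal sketch of an alternative, the helper below reads them from environment variables and builds the same `Basic` authorization header. The variable names `SPOTIFY_CLIENT_ID` and `SPOTIFY_CLIENT_SECRET` and both function names are assumptions chosen for illustration, not part of the original code.

```python
import base64
import os


def build_basic_auth_header(client_id: str, client_secret: str) -> dict:
    """Return the Authorization header used for the token request."""
    raw = f"{client_id}:{client_secret}".encode()
    return {"Authorization": f"Basic {base64.b64encode(raw).decode()}"}


def load_credentials() -> tuple:
    """Read the credentials from the environment (hypothetical variable names)."""
    client_id = os.environ.get("SPOTIFY_CLIENT_ID", "")
    client_secret = os.environ.get("SPOTIFY_CLIENT_SECRET", "")
    if not client_id or not client_secret:
        raise RuntimeError("Set SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET first.")
    return client_id, client_secret


# Dummy values, so no real secret appears in the notebook
print(build_basic_auth_header("id", "secret"))
# {'Authorization': 'Basic aWQ6c2VjcmV0'}
```

With real credentials, `build_basic_auth_header(*load_credentials())` could be passed as `headers` to the `requests.post` call shown earlier.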


In [4]:
import pandas as pd
import spotipy

def get_trending_playlist_data(playlist_id, access_token):
    # Set up Spotipy with the access token
    sp = spotipy.Spotify(auth=access_token)

    # Get the tracks from the playlist
    playlist_tracks = sp.playlist_tracks(playlist_id, fields='items(track(id, name, artists, album(id, name)))')

    # Extract relevant information and store in a list of dictionaries
    music_data = []
    for track_info in playlist_tracks['items']:
        track = track_info['track']
        track_name = track['name']
        artists = ', '.join([artist['name'] for artist in track['artists']])
        album_name = track['album']['name']
        album_id = track['album']['id']
        track_id = track['id']

        # Get audio features for the track (tracks without a Spotify ID are skipped)
        audio_features = sp.audio_features(track_id)[0] if track_id else None

        # Get release date of the album
        try:
            album_info = sp.album(album_id) if album_id else None
            release_date = album_info['release_date'] if album_info else None
        except Exception:
            release_date = None

        # Get the full track details; a separate name keeps the loop
        # variable track_info from being overwritten
        try:
            track_details = (sp.track(track_id) if track_id else None) or {}
        except Exception:
            track_details = {}
        popularity = track_details.get('popularity')

        # Add additional track information to the track data
        track_data = {
            'Track Name': track_name,
            'Artists': artists,
            'Album Name': album_name,
            'Album ID': album_id,
            'Track ID': track_id,
            'Popularity': popularity,
            'Release Date': release_date,
            'Duration (ms)': audio_features['duration_ms'] if audio_features else None,
            'Explicit': track_details.get('explicit'),
            'External URLs': track_details.get('external_urls', {}).get('spotify'),
            'Danceability': audio_features['danceability'] if audio_features else None,
            'Energy': audio_features['energy'] if audio_features else None,
            'Key': audio_features['key'] if audio_features else None,
            'Loudness': audio_features['loudness'] if audio_features else None,
            'Mode': audio_features['mode'] if audio_features else None,
            'Speechiness': audio_features['speechiness'] if audio_features else None,
            'Acousticness': audio_features['acousticness'] if audio_features else None,
            'Instrumentalness': audio_features['instrumentalness'] if audio_features else None,
            'Liveness': audio_features['liveness'] if audio_features else None,
            'Valence': audio_features['valence'] if audio_features else None,
            'Tempo': audio_features['tempo'] if audio_features else None,
            # Add more attributes as needed
        }

        music_data.append(track_data)

    # Create a pandas DataFrame from the list of dictionaries
    df = pd.DataFrame(music_data)

    return df
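
The function above calls `sp.audio_features` once per track. Spotipy's `audio_features` also accepts a list of track IDs, and the Web API allows up to 100 IDs per request, so the lookups could be batched. The sketch below shows one way to do that; `chunked` and `fetch_audio_features_batched` are hypothetical helper names, and `sp` is assumed to be the `spotipy.Spotify` client created earlier.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_audio_features_batched(sp, track_ids, batch_size=100):
    """Map each track ID to its audio-features dict (or None if unavailable)."""
    features_by_id = {}
    for batch in chunked(track_ids, batch_size):
        # One request per batch instead of one request per track
        for track_id, features in zip(batch, sp.audio_features(batch)):
            features_by_id[track_id] = features
    return features_by_id
```

Inside `get_trending_playlist_data`, the resulting dictionary could replace the per-track call, for example `audio_features = features_by_id.get(track_id)`.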

In [8]:
# The playlist ID is the last path segment of the playlist link. For example, for
# https://open.spotify.com/playlist/37i9dQZF1DX76Wlfdnj7AP the playlist ID is
# "37i9dQZF1DX76Wlfdnj7AP".
playlist_id = '37i9dQZF1DX76Wlfdnj7AP'  # Replace with your own playlist ID

# Call the function to get the music data from the playlist and store it in a DataFrame
music_df = get_trending_playlist_data(playlist_id, access_token)

# Display the DataFrame
music_df.head(10)

| | Track Name | Artists | Album Name | Album ID | Track ID | Popularity | Release Date | Duration (ms) | Explicit | External URLs | ... | Energy | Key | Loudness | Mode | Speechiness | Acousticness | Instrumentalness | Liveness | Valence | Tempo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Lovin On Me | Jack Harlow | Lovin On Me | 6VCO0fDBGbRW8mCEvV95af | 4xhsWYTOGcal8zt0J161CU | 98 | 2023-11-10 | 138411 | True | https://open.spotify.com/track/4xhsWYTOGcal8zt... | ... | 0.558 | 2 | -4.911 | 1 | 0.0568 | 0.0026 | 2e-06 | 0.0937 | 0.606 | 104.983 |
| 1 | redrum | 21 Savage | american dream | 2RRYaYHY7fIIdvFlvgb5vq | 52eIcoLUM25zbQupAZYoFh | 97 | 2024-01-12 | 270698 | True | https://open.spotify.com/track/52eIcoLUM25zbQu... | ... | 0.74 | 2 | -8.445 | 1 | 0.0481 | 0.00529 | 0.000224 | 0.5 | 0.246 | 172.089 |
| 2 | BELLAKEO | Peso Pluma, Anitta | BELLAKEO | 3VLY9g3CAG1Y5r2eGVEaZ0 | 05WVKTdZhlIMX4qqMLuo0f | 95 | 2023-12-07 | 197333 | True | https://open.spotify.com/track/05WVKTdZhlIMX4q... | ... | 0.88 | 9 | -2.834 | 1 | 0.101 | 0.0562 | 0.06 | 0.153 | 0.463 | 180.011 |
| 3 | FE!N (feat. Playboi Carti) | Travis Scott, Playboi Carti | UTOPIA | 18NOKLkZETa4sWwLMIm0UZ | 42VsgItocQwOQC3XWZ8JNA | 93 | 2023-07-28 | 191701 | True | https://open.spotify.com/track/42VsgItocQwOQC3... | ... | 0.882 | 3 | -2.777 | 0 | 0.06 | 0.0316 | 0.0 | 0.142 | 0.201 | 148.038 |
| 4 | Prada | cassö, RAYE, D-Block Europe | Prada | 5MU0RmBSpoSxOPYBfcobDc | 59NraMJsLaMCVtwXTSia8i | 93 | 2023-08-11 | 132359 | True | https://open.spotify.com/track/59NraMJsLaMCVtw... | ... | 0.717 | 8 | -5.804 | 1 | 0.0375 | 0.001 | 2e-06 | 0.113 | 0.422 | 141.904 |
| 5 | Rich Baby Daddy (feat. Sexyy Red & SZA) | Drake, Sexyy Red, SZA | For All The Dogs | 4czdORdCWP9umpbhFXK2fW | 1yeB8MUNeLo9Ek1UEpsyz6 | 93 | 2023-10-06 | 319192 | True | https://open.spotify.com/track/1yeB8MUNeLo9Ek1... | ... | 0.729 | 2 | -4.56 | 1 | 0.0528 | 0.0377 | 0.0 | 0.384 | 0.142 | 146.01 |
| 6 | I'm Good (Blue) | David Guetta, Bebe Rexha | I'm Good (Blue) | 7M842DMhYVALrXsw3ty7B3 | 4uUG5RXrOk84mYEfFvj3cK | 92 | 2022-08-26 | 175238 | True | https://open.spotify.com/track/4uUG5RXrOk84mYE... | ... | 0.965 | 7 | -3.673 | 0 | 0.0343 | 0.00383 | 7e-06 | 0.371 | 0.304 | 128.04 |
| 7 | fukumean | Gunna | a Gift & a Curse | 5qmZefgh78fN3jsyPPlvuw | 4rXLjWdF2ZZpXCVTfWcshS | 92 | 2023-06-16 | 125040 | True | https://open.spotify.com/track/4rXLjWdF2ZZpXCV... | ... | 0.622 | 1 | -6.747 | 0 | 0.0903 | 0.119 | 0.0 | 0.285 | 0.22 | 130.001 |
| 8 | Save Your Tears | The Weeknd | After Hours | 4yP0hdKOZPNshxUOjY0cZj | 5QO79kh1waicV47BqGRL3g | 90 | 2020-03-20 | 215627 | True | https://open.spotify.com/track/5QO79kh1waicV47... | ... | 0.826 | 0 | -5.487 | 1 | 0.0309 | 0.0212 | 1.2e-05 | 0.543 | 0.644 | 118.051 |
| 9 | Vois sur ton chemin - Techno Mix | BENNETT | Vois sur ton chemin (Techno Mix) | 79Cyc8GRWnLyjdJSMyJ0dB | 31nfdEooLEq7dn3UMcIeB5 | 90 | 2023-08-04 | 178156 | False | https://open.spotify.com/track/31nfdEooLEq7dn3... | ... | 0.824 | 2 | -3.394 | 0 | 0.047 | 0.0908 | 0.0711 | 0.119 | 0.371 | 137.959 |
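
The playlist ID used above is the last path segment of the playlist link. As a small sketch, the helper below extracts it from a full link so that the link can be pasted as-is; `extract_playlist_id` is a hypothetical name, and the `?si=example` query string in the usage line is made up to show that trailing parameters are ignored.

```python
from urllib.parse import urlparse


def extract_playlist_id(playlist_link: str) -> str:
    """Return the ID from a link such as https://open.spotify.com/playlist/<id>."""
    path_parts = [part for part in urlparse(playlist_link).path.split("/") if part]
    if len(path_parts) < 2 or path_parts[-2] != "playlist":
        raise ValueError(f"Not a playlist link: {playlist_link!r}")
    return path_parts[-1]


link = "https://open.spotify.com/playlist/37i9dQZF1DX76Wlfdnj7AP?si=example"
print(extract_playlist_id(link))  # 37i9dQZF1DX76Wlfdnj7AP
```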


In [9]:
music_df.isnull().sum()

Track Name          0
Artists             0
Album Name          0
Album ID            0
Track ID            0
Popularity          0
Release Date        0
Duration (ms)       0
Explicit            0
External URLs       0
Danceability        0
Energy              0
Key                 0
Loudness            0
Mode                0
Speechiness         0
Acousticness        0
Instrumentalness    0
Liveness            0
Valence             0
Tempo               0
dtype: int64

## Generating Awesome Recommendations

We're going to employ two main approaches for generating music recommendations: content-based filtering and popularity-based filtering.

In [10]:
import pandas as pd
import numpy as np
from datetime import datetime
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics.pairwise import cosine_similarity

### Popularity-Based Filtering

Here, we recommend tracks based on their popularity scores, with a twist: recent releases are weighted more heavily.

In [11]:
# Function to calculate a recency weight from the release date
def calculate_weighted_popularity(release_date):
    # Spotify reports release dates as 'YYYY-MM-DD', 'YYYY-MM', or 'YYYY',
    # so try each format in turn
    for date_format in ('%Y-%m-%d', '%Y-%m', '%Y'):
        try:
            released = datetime.strptime(release_date, date_format)
            break
        except ValueError:
            continue
    else:
        raise ValueError(f"Unrecognized release date: {release_date!r}")

    # Calculate the time span between the release date and today (never negative)
    days_since_release = max((datetime.now() - released).days, 0)

    # More recent releases get a higher weight: 1 for today, 1/2 for yesterday, and so on
    weight = 1 / (days_since_release + 1)
    return weight
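
To make the weighting concrete, here is a small worked example of the same `1 / (days + 1)` formula. It uses a fixed reference date instead of `datetime.now()` so the numbers are reproducible; `recency_weight` and the sample dates are assumptions made for illustration.

```python
from datetime import datetime


def recency_weight(release_date: str, today: datetime) -> float:
    """Weight of a release: 1 / (days since release + 1)."""
    released = datetime.strptime(release_date, "%Y-%m-%d")
    return 1 / ((today - released).days + 1)


today = datetime(2024, 1, 31)  # fixed reference date (assumption)
print(recency_weight("2024-01-31", today))  # 1.0  (released today)
print(recency_weight("2024-01-22", today))  # 0.1  (9 days old)
print(recency_weight("2023-01-31", today))  # 365 days old, i.e. 1/366
```

The weight drops off quickly: a track released a year ago keeps under 0.3% of the weight of one released today, so multiplying popularity by this weight strongly favours very recent releases.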

In [12]:
# Normalize the music features using Min-Max scaling
scaler = MinMaxScaler()
music_features = music_df[['Danceability', 'Energy', 'Key', 
                           'Loudness', 'Mode', 'Speechiness', 'Acousticness',
                           'Instrumentalness', 'Liveness', 'Valence', 'Tempo']].values
music_features_scaled = scaler.fit_transform(music_features)
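
Min-Max scaling maps each feature column to the range [0, 1] using `(x - min) / (max - min)`, so that features with large raw values such as tempo do not dominate features such as danceability. The sketch below spells out that arithmetic in plain Python for a single column; `min_max_scale` and the tempo values are made up for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no information; map it to zeros
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


tempos = [100.0, 150.0, 200.0]  # made-up tempo values in BPM
print(min_max_scale(tempos))  # [0.0, 0.5, 1.0]
```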

### Content-Based Filtering

This method recommends tracks whose audio features are similar to those of the input track. We use the features already fetched with Spotipy, scaled to the [0, 1] range, and compare tracks by cosine similarity.


In [13]:
# A function to get content-based recommendations based on music features
def content_based_recommendations(input_song_name, num_recommendations=5):
    if input_song_name not in music_df['Track Name'].values:
        print(f"'{input_song_name}' not found in the dataset. Please enter a valid song name.")
        return

    # Get the index of the input song in the music DataFrame
    input_song_index = music_df[music_df['Track Name'] == input_song_name].index[0]

    # Calculate the similarity scores based on music features (cosine similarity)
    similarity_scores = cosine_similarity([music_features_scaled[input_song_index]], music_features_scaled)[0]

    # Rank songs from most to least similar, skipping the input song itself
    ranked_indices = similarity_scores.argsort()[::-1]
    similar_song_indices = [i for i in ranked_indices if i != input_song_index][:num_recommendations]

    # Get the details of the most similar songs; the result uses its own name
    # so it does not shadow the function
    recommendations = music_df.iloc[similar_song_indices][['Track Name', 'Artists', 'Album Name', 'Release Date', 'Popularity']]

    return recommendations
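
Cosine similarity compares the direction of two feature vectors: 1 means they point the same way, 0 means they share nothing. The sketch below computes it by hand and ranks a few toy vectors, mirroring what `cosine_similarity` and `argsort` do in the function above. The two-dimensional vectors and the names `cosine` and `most_similar` are assumptions for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_index, vectors, k):
    """Indices of the k vectors most similar to vectors[query_index]."""
    scored = [
        (cosine(vectors[query_index], vector), index)
        for index, vector in enumerate(vectors)
        if index != query_index  # never recommend the query itself
    ]
    scored.sort(reverse=True)
    return [index for _, index in scored[:k]]


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy scaled features
print(most_similar(0, features, 2))  # [1, 2]
```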

### Hybrid Approach

The hybrid approach combines both ideas: it takes the content-based recommendations for the input song and ranks them by popularity, so the most popular of the similar tracks appear first.

In [14]:
# A function to get hybrid recommendations based on weighted popularity
# (alpha is currently unused)
def hybrid_recommendations(input_song_name, num_recommendations=5, alpha=0.5):
    if input_song_name not in music_df['Track Name'].values:
        print(f"'{input_song_name}' not found in the dataset. Please enter a valid song name.")
        return

    # Get content-based recommendations
    content_based_rec = content_based_recommendations(input_song_name, num_recommendations)

    # Look up the input song's row once
    input_song = music_df.loc[music_df['Track Name'] == input_song_name].iloc[0]

    # Calculate the weighted popularity score of the input song
    weighted_popularity_score = input_song['Popularity'] * calculate_weighted_popularity(input_song['Release Date'])

    # Add the input song with its weighted popularity score
    # (DataFrame.append was removed in pandas 2.0, so use pd.concat)
    input_song_row = pd.DataFrame([{
        'Track Name': input_song_name,
        'Artists': input_song['Artists'],
        'Album Name': input_song['Album Name'],
        'Release Date': input_song['Release Date'],
        'Popularity': weighted_popularity_score
    }])
    hybrid_recs = pd.concat([content_based_rec, input_song_row], ignore_index=True)

    # Sort the hybrid recommendations by popularity score
    hybrid_recs = hybrid_recs.sort_values(by='Popularity', ascending=False)

    # Remove the input song from the recommendations
    hybrid_recs = hybrid_recs[hybrid_recs['Track Name'] != input_song_name]

    return hybrid_recs
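
The `alpha` parameter in the function above is accepted but never used. One possible way to put it to work, shown here purely as an assumption and not as what the function does, is a weighted blend of the similarity score and the popularity (which Spotify reports on a 0 to 100 scale). The name `blended_score` and the candidate tracks are hypothetical.

```python
def blended_score(similarity, popularity, alpha=0.5):
    """Blend similarity (0-1) with popularity (0-100) using weight alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * similarity + (1 - alpha) * (popularity / 100)


# (name, similarity to the input song, popularity): made-up values
candidates = [("Track A", 0.95, 60), ("Track B", 0.80, 95)]

for alpha in (0.5, 1.0):
    ranked = sorted(
        candidates,
        key=lambda c: blended_score(c[1], c[2], alpha),
        reverse=True,
    )
    print(alpha, [name for name, _, _ in ranked])
# 0.5 ['Track B', 'Track A']
# 1.0 ['Track A', 'Track B']
```

With `alpha=1.0` the ranking is purely content-based, and lowering it lets popular tracks overtake closer matches.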

## Putting It to the Test

To see the system in action, simply provide an input song name. The system will then generate recommendations based on this input, using our hybrid approach.

In [31]:
input_song_name = input("Hey, give me a song: ")
recommendations = hybrid_recommendations(input_song_name, num_recommendations=10)  # Can be changed
print(f"Songs I would recommend for '{input_song_name}':")
recommendations

Hey, give me a song: SICKO MODE
Songs I would recommend for 'SICKO MODE':


| | Track Name | Artists | Album Name | Release Date | Popularity |
|---|---|---|---|---|---|
| 2 | BELLAKEO | Peso Pluma, Anitta | BELLAKEO | 2023-12-07 | 95.0 |
| 6 | Prada | cassö, RAYE, D-Block Europe | Prada | 2023-08-11 | 93.0 |
| 9 | IDGAF (feat. Yeat) | Drake, Yeat | For All The Dogs | 2023-10-06 | 91.0 |
| 3 | Pepas | Farruko | Pepas | 2021-06-24 | 83.0 |
| 5 | SAY MY GRACE (feat. Travis Scott) | Offset, Travis Scott | SET IT OFF | 2023-10-13 | 83.0 |
| 1 | PUFFIN ON ZOOTIEZ | Future | I NEVER LIKED YOU | 2022-04-29 | 82.0 |
| 7 | 10:35 | Tiësto, Tate McRae | 10:35 | 2022-11-03 | 81.0 |
| 0 | Princess Diana (with Nicki Minaj) | Ice Spice, Nicki Minaj | Princess Diana (with Nicki Minaj) | 2023-04-14 | 80.0 |
| 4 | MONEY ON THE DASH | Elley Duhé, Whethan | MONEY ON THE DASH | 2023-01-20 | 78.0 |
| 8 | La Jumpa | Arcángel, Bad Bunny | La Jumpa | 2022-11-30 | 78.0 |


## Wrapping Up

Building a music recommendation system with the Spotify API and Python is a practical way to explore personalized music discovery. Try it with your own playlists and see which tracks it surfaces for your tastes.

PS: This project was inspired by Aman Kharwal's project ideas. You can check out Aman's original post [here](https://thecleverprogrammer.com/2023/07/31/music-recommendation-system-using-python/) for more inspiration and detailed insights!