## **MUSIC RECOMMENDATION SYSTEM USING SPOTIFY API AND PYTHON**

### <u>INTRODUCTION-</u>

##### "In this project, we dive into building a personalized music recommendation system. Music recommendation systems help users discover new songs and artists based on their preferences. Using Python, NLP, and deep learning techniques, we’ll harness the Spotify API to create recommendations that mirror the algorithms behind streaming giants like Spotify and Apple Music." 

### <u>Objective-</u>

#### "To implement a recommendation system that recommends music tailored to user preferences and musical history, helping users explore new music in a personalized manner."

### <u> How Music Recommandation System Works-</u>

#### This section explains the recommendation algorithms and processes:
####
#### "Recommendation systems use advanced algorithms to analyze users' listening history, liked and disliked tracks, and even feedback. Here are some popular approaches:
####
#### Content-Based Filtering: Recommends music based on similarities to previously liked songs.
#### Collaborative Filtering: Finds users with similar listening habits to generate recommendations.
#### Hybrid Systems: Combines both methods to boost recommendation quality." #####

#### *HYBRID SYSTEMS APPROACH IS USED IN IT*

#### <u>Why Use Spotify API?</u>

#### <u>Spotify API Introduction:</u>

#### "The Spotify API provides access to a vast catalog of music data, including track information, artist details, playlists, and audio features like tempo and energy. By integrating Spotify data, we can create rich user profiles and more accurate music recommendations."

#### <U> Setup and Authentication:</u>

#### "To access Spotify data, you'll need to create a Spotify developer account and generate an API key, which will allow us to interact with the Spotify catalog programmatically."

### <u> DIVING INTO THE PROJECT-</u>

#### I hope you have understood what a **Music Recommendation System** is. Now, in this section, I’ll take you through building a Music Recommendation System using **Spotify API** and **Python**.

#### To get started with building a Music Recommendation System, we first need to have an **access token**. The access token serves as a temporary authorization credential, allowing the code to make authenticated requests to the Spotify API on behalf of the application. Below is how we can get it:


In [27]:
import requests
import base64

CLIENT_ID = '7d2df6c8c1b9435882308038dff6d342'
CLIENT_SECRET = '3f647d4b8fa447e8b2be350438fdaa48'

client_credentials = f"{CLIENT_ID}:{CLIENT_SECRET}"
client_credentials_base64 = base64.b64encode(client_credentials.encode())

token_url = 'https://accounts.spotify.com/api/token'
headers = {
    'Authorization': f'Basic {client_credentials_base64.decode()}'
}
data = {
    'grant_type': 'client_credentials'
}
response = requests.post(token_url, data=data, headers=headers)

if response.status_code == 200:
    access_token = response.json()['access_token']
    print("Access token obtained successfully.")
else:
    print("Error obtaining access token.")
    exit()

Access token obtained successfully.


#### <u> Setting Up Spotify API Access Credentials-</u>

#### In the code, the *CLIENT_ID* and *CLIENT_SECRET* variables hold your unique credentials for accessing the Spotify API. **These credentials are essential for authenticating the application* and are obtained from Spotify’s developer dashboard when you register your application.

#### - *CLIENT_ID*: Uniquely identifies the application making requests to the Spotify API.
#### - **CLIENT_SECRET*: A confidential key used alongside the *CLIENT_ID* for secure access.


#### Now, I’ll write a function to get music data from any playlist on Spotify. For this task, you need to install the Spotipy library, which is a Python library providing access to Spotify’s web API. Here’s how to install it on your system by writing the command mentioned below in your command prompt or terminal:

#### *pip install spotipy*
#### Below I am defining a function responsible for collecting music data from any playlist on Spotify using the Spotipy library:


In [28]:
import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyOAuth

def get_trending_playlist_data(playlist_id, access_token):

    sp = spotipy.Spotify(auth=access_token)

    playlist_tracks = sp.playlist_tracks(playlist_id, fields='items(track(id, name, artists, album(id, name)))')

    music_data = []
    for track_info in playlist_tracks['items']:
        track = track_info['track']
        track_name = track['name']
        artists = ', '.join([artist['name'] for artist in track['artists']])
        album_name = track['album']['name']
        album_id = track['album']['id']
        track_id = track['id']

        
        audio_features = sp.audio_features(track_id)[0] if track_id != 'Not available' else None

    
        try:
            album_info = sp.album(album_id) if album_id != 'Not available' else None
            release_date = album_info['release_date'] if album_info else None
        except:
            release_date = None

        
        try:
            track_info = sp.track(track_id) if track_id != 'Not available' else None
            popularity = track_info['popularity'] if track_info else None
        except:
            popularity = None

    
        track_data = {
            'Track Name': track_name,
            'Artists': artists,
            'Album Name': album_name,
            'Album ID': album_id,
            'Track ID': track_id,
            'Popularity': popularity,
            'Release Date': release_date,
            'Duration (ms)': audio_features['duration_ms'] if audio_features else None,
            'Explicit': track_info.get('explicit', None),
            'External URLs': track_info.get('external_urls', {}).get('spotify', None),
            'Danceability': audio_features['danceability'] if audio_features else None,
            'Energy': audio_features['energy'] if audio_features else None,
            'Key': audio_features['key'] if audio_features else None,
            'Loudness': audio_features['loudness'] if audio_features else None,
            'Mode': audio_features['mode'] if audio_features else None,
            'Speechiness': audio_features['speechiness'] if audio_features else None,
            'Acousticness': audio_features['acousticness'] if audio_features else None,
            'Instrumentalness': audio_features['instrumentalness'] if audio_features else None,
            'Liveness': audio_features['liveness'] if audio_features else None,
            'Valence': audio_features['valence'] if audio_features else None,
            'Tempo': audio_features['tempo'] if audio_features else None,
        
        }

        music_data.append(track_data)

    
    df = pd.DataFrame(music_data)

    return df

In [29]:
playlist_id = '7bUJaaksI2okk2ccSq0u2G'

# Call the function to get the music data from the playlist and store it in a DataFrame
music_df = get_trending_playlist_data(playlist_id, access_token)

# Display the DataFrame
print(music_df)

                  Track Name  \
0                Mari Antaga   
1                Life Of Ram   
2              Nee Prashnalu   
3            Nammaka Tappani   
4   Ee Kshnam Oke Oka Korika   
..                       ...   
95             Nammave Ammai   
96                Alupannadi   
97            Nenani Neevani   
98  Nindu Noorella Version 2   
99            Cheppave Prema   

                                              Artists  \
0   Mickey J. Meyer, Sreerama Chandra, Sirivennela...   
1                                       Pradeep Kumar   
2                              S. P. Balasubrahmanyam   
3                                    Sagar, Sumangaly   
4                                       K. S. Chithra   
..                                                ...   
95                  Harish Raghavendra, K. S. Chithra   
96                                      K. S. Chithra   
97                                      Shweta Pandit   
98                         Gopika Poornima, K

#### <u> Extracting Music Data from a Spotify Playlist- </u>

#### In this code snippet, we use a sample **playlist ID*: 7bUJaaksI2okk2ccSq0u2G. The code calls the *get_trending_playlist_data* function, using the provided *access_token* to extract music data from this playlist. The collected data is stored in a DataFrame called *music_df*, and the DataFrame is printed to display the extracted music data.


#### Now, let’s check if the data contains any null values.


In [30]:
print(music_df.isnull().sum())

Track Name          0
Artists             0
Album Name          0
Album ID            0
Track ID            0
Popularity          0
Release Date        0
Duration (ms)       0
Explicit            0
External URLs       0
Danceability        0
Energy              0
Key                 0
Loudness            0
Mode                0
Speechiness         0
Acousticness        0
Instrumentalness    0
Liveness            0
Valence             0
Tempo               0
dtype: int64


#### <U> Importing Necessary Python Libraries- </u>

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from datetime import datetime
from sklearn.metrics.pairwise import cosine_similarity

data = music_df

#### While providing music recommendations to users, it is important to recommend the latest releases. For this, we need to give more weight to the latest releases in the recommendations. Let’s write a function to solve this problem:

In [None]:

def calculate_weighted_popularity(release_date):
    
    release_date = datetime.strptime(release_date, '%Y-%m-%d')

    
    time_span = datetime.now() - release_date

    
    weight = 1 / (time_span.days + 1)
    return weight

#### This function calculates a weighted popularity score based on a track's release date.
#### It takes the release date (format: 'YYYY-MM-DD') as input and converts it to a datetime object.
#### The time span between the release date and today's date is calculated using datetime.now() - release_date.
# 
#### Weighted popularity score formula:
####   weight = 1 / (time_span.days + 1)
# 
#### This helps prioritize recent releases in the recommendation system.


### Now let’s normalize the music features before moving forward:

In [31]:
# Normalize the music features using Min-Max scaling
scaler = MinMaxScaler()
music_features = music_df[['Danceability', 'Energy', 'Key', 
                           'Loudness', 'Mode', 'Speechiness', 'Acousticness',
                           'Instrumentalness', 'Liveness', 'Valence', 'Tempo']].values
music_features_scaled = scaler.fit_transform(music_features)

#### Generates music recommendations based on audio similarity using an input song name (input_song_name).
#### Checks if input_song_name exists in music_df (data: 'Track Name', 'Artists', 'Album Name', 'Release Date', 'Popularity').
#### Calculates cosine similarity between the input song’s audio features and other songs, excluding the input song itself.
#### Retrieves top recommendations based on similarity and prepares for a hybrid approach with weighted popularity.


In [None]:

def content_based_recommendations(input_song_name, num_recommendations=5):
    if input_song_name not in music_df['Track Name'].values:
        print(f"'{input_song_name}' not found in the dataset. Please enter a valid song name.")
        return

    
    input_song_index = music_df[music_df['Track Name'] == input_song_name].index[0]

    
    similarity_scores = cosine_similarity([music_features_scaled[input_song_index]], music_features_scaled)

    
    similar_song_indices = similarity_scores.argsort()[0][::-1][1:num_recommendations + 1]


    content_based_recommendations = music_df.iloc[similar_song_indices][['Track Name', 'Artists', 'Album Name', 'Release Date', 'Popularity']]

    return content_based_recommendations

#### Hybrid approach combines content similarity and weighted popularity for personalized recommendations.
#### The function first gets content-based recommendations for input_song_name using content_based_recommendations function.
#### It calculates the popularity score of the input song and its weighted popularity (via calculate_weighted_popularity).
#### Alpha parameter adjusts the balance between content-based and popularity-based recommendations.

#### Combines content-based recommendations with input song details (track name, artists, album, release date, popularity).
#### Sorts hybrid_recommendations DataFrame by weighted popularity, placing most popular, relevant songs at the top.
#### Finally, the input song is excluded from the final recommendations to avoid redundancy.


In [25]:
import pandas as pd

def hybrid_recommendations(input_song_name, num_recommendations=5, alpha=0.5):
    if input_song_name not in music_df['Track Name'].values:
        print(f"'{input_song_name}' not found in the dataset. Please enter a valid song name.")
        return

    content_based_rec = content_based_recommendations(input_song_name, num_recommendations)

    popularity_score = music_df.loc[music_df['Track Name'] == input_song_name, 'Popularity'].values[0]

    weighted_popularity_score = popularity_score * calculate_weighted_popularity(
        music_df.loc[music_df['Track Name'] == input_song_name, 'Release Date'].values[0]
    )

    new_entry = pd.DataFrame({
        'Track Name': [input_song_name],
        'Artists': [music_df.loc[music_df['Track Name'] == input_song_name, 'Artists'].values[0]],
        'Album Name': [music_df.loc[music_df['Track Name'] == input_song_name, 'Album Name'].values[0]],
        'Release Date': [music_df.loc[music_df['Track Name'] == input_song_name, 'Release Date'].values[0]],
        'Popularity': [weighted_popularity_score]
    })

    hybrid_recommendations = pd.concat([content_based_rec, new_entry], ignore_index=True)

    hybrid_recommendations = hybrid_recommendations.sort_values(by='Popularity', ascending=False)

    hybrid_recommendations = hybrid_recommendations[hybrid_recommendations['Track Name'] != input_song_name]

    return hybrid_recommendations

In [26]:
input_song_name = "Mellaga"
recommendations = hybrid_recommendations(input_song_name, num_recommendations=5)
print(f"Hybrid recommended songs for '{input_song_name}':")
print(recommendations)

Hybrid recommended songs for 'Mellaga':
               Track Name                                            Artists  \
2               Ghal Ghal                             S. P. Balasubrahmanyam   
4             Avunu Nijam                                        KK, Sunitha   
1           Sri Anjaneyam                        M.L.R. Karthikeyan, Jr. NTR   
3         Musugu Veyyoddu                                            Kalpana   
0  Intikeldam Padavammo..  S. P. Balasubrahmanyam, K. S. Chithra, Swarnal...   

                                     Album Name Release Date  Popularity  
2                    Nuvvostanante Nenoddantana   2005-01-14        52.0  
4                                        Athadu         2005        49.0  
1                                   Oosaravelli   2011-09-16        43.0  
3                                       Khadgam   2002-10-29        39.0  
0  Golden Hits Of Sirivennela Seetharama Sastry   2022-05-16        15.0  


### <u>Conclusion-</u>

#### I hope you found this guide helpful for building a **Music Recommendation System** using the **Spotify API** and Python! This project demonstrates how Data Science can enhance user experiences by offering personalized music recommendations based on listening preferences. 

#### Through this project, you've learned how to utilize APIs to collect real-time data and put it to practical use in a recommendation system. Feel free to reach out with any questions or if you encounter issues signing up for a Spotify developer account.
