<a href="https://colab.research.google.com/github/anjanakp3103-wq/Music-Recommendation-System/blob/main/Music_Recommendation_System.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Music Recommendation System using Content-Based Filtering




## (A) Problem Definition & Objective

### Project Track
Recommendation System

### Problem Statement
With the rapid growth of digital music platforms, users are exposed to a massive collection of songs. Discovering music that matches individual preferences becomes challenging.

This project focuses on building a **content-based music recommendation system** that suggests similar songs based on their audio features.

### Objective
The objective of this project is to recommend songs similar to a given input song by comparing audio features such as energy, tempo, danceability, and valence (a measure of how positive a track sounds).

### Real-World Relevance
Music recommendation systems are widely used in platforms like Spotify and Apple Music to enhance user experience by helping users discover new and relevant music.


## (B) Data Understanding & Preparation

### Dataset Source
The dataset used in this project is a Spotify music dataset stored in CSV format. Each row represents a song along with its corresponding audio features.

### Important Columns
- track_name: Name of the song
- artist_name: Artist of the song
- danceability
- energy
- loudness
- speechiness
- acousticness
- instrumentalness
- liveness
- valence
- tempo

### Preprocessing Steps
- Selected relevant numerical audio features
- Sampled 3,000 tracks at random (`random_state=42`) to keep the pairwise similarity matrix small enough to fit in memory
- Applied feature scaling using StandardScaler
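
As a rough illustration of the scaling step, the sketch below standardizes a small made-up feature matrix with NumPy. The values are invented for illustration and are not taken from the dataset. `StandardScaler` applies the same per-column transformation, using the population standard deviation.

```python
import numpy as np

# Hypothetical rows: [danceability, loudness (dB), tempo (BPM)]; values are made up.
X_toy = np.array([
    [0.50, -5.0, 120.0],
    [0.70, -7.0, 100.0],
    [0.30, -9.0, 140.0],
])

# z = (x - column mean) / column standard deviation (ddof=0, as StandardScaler does)
mean = X_toy.mean(axis=0)
std = X_toy.std(axis=0)
X_toy_scaled = (X_toy - mean) / std

print(X_toy_scaled.round(3))
```

Without this step, features measured on large scales, such as tempo and loudness, would dominate the 0-to-1 features in the similarity computation.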


## (C) Model / System Design

### Recommendation Technique
This project uses a **content-based recommendation approach**, where songs are recommended based on similarity in their audio features.

### Similarity Method
Cosine similarity is used to measure how close two songs are in terms of their feature vectors.
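
For reference, the cosine similarity of two vectors is their dot product divided by the product of their lengths, giving a value between -1 and 1. A minimal NumPy sketch follows; the helper name `cosine_sim` and the vectors are made up for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between vectors a and b (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up scaled feature vectors for three songs
song_a = [1.0, 0.5, -0.2]
song_b = [2.0, 1.0, -0.4]   # same direction as song_a, twice the magnitude
song_c = [-1.0, -0.5, 0.2]  # opposite direction to song_a

print(cosine_sim(song_a, song_b))
print(cosine_sim(song_a, song_c))
```

Because the measure depends only on direction, `song_a` and `song_b` score as maximally similar even though their magnitudes differ.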

### Reason for Choosing This Method
- Does not require user history
- Simple and efficient
- Suitable for understanding similarity between songs


## (D) Core Implementation

This section contains the implementation of the music recommendation system, including data loading, preprocessing, similarity computation, and recommendation generation.


## 1. Import Required Libraries


In [None]:
import pandas as pd
import numpy as np

from sklearn.preprocessing import StandardScaler
from sklearn.metrics.pairwise import cosine_similarity


## 2. Upload Dataset


In [None]:
from google.colab import files
uploaded = files.upload()



Saving spotify_tracks[1].csv to spotify_tracks[1] (5).csv


## 3. Check Uploaded Files


In [None]:
import os
os.listdir()


['.config',
 'spotify_tracks[1] (5).csv',
 'spotify_tracks[1] (3).csv',
 'spotify_tracks[1] (4).csv',
 'spotify_tracks[1] (1).csv',
 'spotify_tracks[1].csv',
 'spotify_tracks[1] (2).csv',
 'sample_data']

## 4. Load Dataset


In [None]:
df = pd.read_csv("spotify_tracks[1].csv")
df = df.sample(3000, random_state=42)
df = df.reset_index(drop=True)
df.head()

Unnamed: 0,track_id,track_name,artist_name,year,popularity,artwork_url,album_name,acousticness,danceability,duration_ms,...,key,liveness,loudness,mode,speechiness,tempo,time_signature,valence,track_url,language
0,5bQTuAxaBADfWk4vD8pCa5,"Adiye (From ""Kadal"")","A.R. Rahman, Sid Sriram",2019,15,https://i.scdn.co/image/ab67616d0000b273542d5b...,Rahman Rewind: Absolute Hits,0.411,0.387,301973.0,...,2.0,0.0843,-5.684,1.0,0.079,45.989,4.0,0.166,https://open.spotify.com/track/5bQTuAxaBADfWk4...,Unknown
1,4LrrOBgHGJNvRfmq2FRFBM,"Milne Hai Mujhse Aayi (From ""Aashiqui 2"")",Arijit Singh,2017,21,https://i.scdn.co/image/ab67616d0000b273c8d81c...,Best Of Shraddha Kapoor,0.0357,0.515,295798.0,...,11.0,0.134,-5.467,0.0,0.0426,143.752,4.0,0.304,https://open.spotify.com/track/4LrrOBgHGJNvRfm...,Hindi
2,1P4CbDljM2EGHjaElwUwOa,The Vinciguerra Affair,Daniel Pemberton,2015,1,https://i.scdn.co/image/ab67616d0000b2738f5aee...,The Man from U.N.C.L.E. (Original Motion Pictu...,0.31,0.772,202240.0,...,2.0,0.089,-13.234,0.0,0.032,120.008,4.0,0.475,https://open.spotify.com/track/1P4CbDljM2EGHja...,English
3,6sFq4k5MEWPafTW7GolQzP,Aadhi Bhagavan,"Ilaiyaraaja, Sriram Parthasarathy",2010,9,https://i.scdn.co/image/ab67616d0000b2735934ad...,Baba Pugazh Maalai,0.132,0.67,324600.0,...,1.0,0.298,-5.47,1.0,0.0267,91.008,4.0,0.346,https://open.spotify.com/track/6sFq4k5MEWPafTW...,Tamil
4,4wwi2d5XsvVDTA0rbTbOpQ,Beautiful,Amit Trivedi,2022,42,https://i.scdn.co/image/ab67616d0000b2739cfaea...,Goodbye (Original Motion Picture Soundtrack),0.514,0.743,171375.0,...,2.0,0.115,-7.922,1.0,0.032,109.98,4.0,0.512,https://open.spotify.com/track/4wwi2d5XsvVDTA0...,Hindi


## 5. Dataset Information

The dataset contains metadata of songs including their musical characteristics.
Each row represents a unique track.


In [None]:
print(df.shape)  # a bare `df.shape` is only displayed when it is the cell's last expression
df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3000 entries, 0 to 2999
Data columns (total 22 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   track_id          3000 non-null   object 
 1   track_name        3000 non-null   object 
 2   artist_name       3000 non-null   object 
 3   year              3000 non-null   int64  
 4   popularity        3000 non-null   int64  
 5   artwork_url       3000 non-null   object 
 6   album_name        3000 non-null   object 
 7   acousticness      3000 non-null   float64
 8   danceability      3000 non-null   float64
 9   duration_ms       3000 non-null   float64
 10  energy            3000 non-null   float64
 11  instrumentalness  3000 non-null   float64
 12  key               3000 non-null   float64
 13  liveness          3000 non-null   float64
 14  loudness          3000 non-null   float64
 15  mode              3000 non-null   float64
 16  speechiness       3000 non-null   float64


## 6. Checking Missing Values


In [None]:
df.isnull().sum()


Unnamed: 0,0
track_id,0
track_name,0
artist_name,0
year,0
popularity,0
artwork_url,0
album_name,0
acousticness,0
danceability,0
duration_ms,0


## 7. Feature Selection


In [None]:
features = [
    'danceability',
    'energy',
    'loudness',
    'speechiness',
    'acousticness',
    'instrumentalness',
    'liveness',
    'valence',
    'tempo'
]

X = df[features]


## 8. Feature Scaling


In [None]:
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)


## Similarity Computation

In [None]:
from sklearn.metrics.pairwise import cosine_similarity

similarity_matrix = cosine_similarity(X_scaled)
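
The full matrix above holds n × n values, which is one reason the dataset was sampled down to 3,000 rows. As an alternative sketch (not what this notebook does), the similarities for a single query can be computed on demand from unit-length rows. The function name `top_k_similar` and the random stand-in data are assumptions for illustration.

```python
import numpy as np

def top_k_similar(X, query_index, k=5):
    """Return indices of the k rows most cosine-similar to row query_index."""
    X = np.asarray(X, dtype=float)
    # Scaling rows to unit length makes the dot product equal to cosine similarity
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X_unit = X / np.clip(norms, 1e-12, None)
    scores = X_unit @ X_unit[query_index]
    scores[query_index] = -np.inf  # never recommend the query itself
    return np.argsort(scores)[::-1][:k]

# Random stand-in for X_scaled: 1000 songs, 9 features
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 9))
print(top_k_similar(X_demo, query_index=0, k=5))
```

This needs memory proportional to the number of songs per query rather than its square, at the cost of recomputing scores for each request.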


## 9. Recommendation Function

In [None]:
def recommend_songs(song_name, num_recommendations=50):
    """Return the tracks most similar to `song_name` by cosine similarity."""
    matches = df.index[df['track_name'] == song_name]
    if len(matches) == 0:
        return "Song not found in dataset"

    # If several rows share this title, use the first one as the query
    index = matches[0]

    # Rank all songs by similarity to the query, highest first
    ranked = np.argsort(similarity_matrix[index])[::-1]

    # Drop the query row by index; with duplicate or tied rows it is not
    # guaranteed to sort first, so slicing off position 0 is unreliable
    ranked = ranked[ranked != index][:num_recommendations]
    return df.iloc[ranked][['track_name', 'artist_name']]






## Output / Demo

In [None]:
recommend_songs("Blinding Lights")



Unnamed: 0,track_name,artist_name
2960,Blinding Lights,The Weeknd
2225,Dancing Like Butterfly Wings,ATEEZ
2415,"Aaj Phir (From ""Hate Story 2)","Arijit Singh, Samira Koppikar"
487,Dua Karo,"Arijit Singh, Bohemia, Sachin-Jigar"
125,Love On The Brain,Rihanna
2401,Thinking of You (I Drive Myself Crazy),*NSYNC
891,Bird Set Free,Sia
2600,BEST SHOT,ASTRO
1849,Holy Ground (Taylor's Version),Taylor Swift
1753,Ekkadikelle Daaridhi,"Anirudh Ravichander, Sri Krishna, Shakthisree ..."


## (E) Evaluation & Analysis

### Evaluation Method
The system is evaluated qualitatively by observing whether the recommended songs share similar musical characteristics with the input song.
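
One way to make this check less subjective, offered as a sketch rather than something the notebook does, is to measure how often recommendations share a metadata label with the query, for example the dataset's `language` column. Treating a shared label as a sign of relevance is an assumption, and `label_agreement` is a hypothetical helper name.

```python
def label_agreement(query_label, recommended_labels):
    """Fraction of recommendations whose label matches the query's label."""
    if not recommended_labels:
        return 0.0
    matches = sum(1 for label in recommended_labels if label == query_label)
    return matches / len(recommended_labels)

# Made-up example: language labels of a query song and five recommendations
query_language = "English"
rec_languages = ["English", "Hindi", "English", "Korean", "English"]
print(label_agreement(query_language, rec_languages))  # 3 of 5 labels match
```

A score near the share of that label in the whole dataset would suggest the audio features carry little information about it.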

### Example Output
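For the input song "Blinding Lights", the system returns the tracks whose scaled audio-feature vectors are closest to it by cosine similarity, such as "Love On The Brain" (Rihanna) and "Bird Set Free" (Sia). The output shown also lists "Blinding Lights" itself in the first row, which points to a duplicate entry or a tie in similarity scores.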

### Limitations
- No user personalization
- Depends only on audio features
- Does not consider lyrics or user feedback


## (F) Ethical Considerations & Responsible AI

### Dataset Bias
The dataset may be biased toward popular genres or artists.

### Lack of Personalization
The system does not adapt to individual user preferences.

### Limited Diversity
Content-based systems may repeatedly recommend similar types of music.


## (G) Conclusion & Future Scope

### Conclusion
A content-based music recommendation system was developed using cosine similarity over standardized audio features. For a given input track, it returns the most similar tracks in a 3,000-song sample. The relevance of the results was assessed qualitatively only.

### Future Scope
- Incorporating user listening history
- Using hybrid recommendation techniques
- Including lyrics and popularity data
- Improving diversity in recommendations
