## STEP 7: Model Deployment

### Import Necessary Libraries

In [1]:
import pandas as pd
import numpy as np
import os
from sklearn.preprocessing import StandardScaler
from scipy.spatial.distance import cdist
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML
import warnings

warnings.filterwarnings("ignore")

### 7.1 Load datasets from file paths

In [2]:
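# Paths to datasets. os.path.join keeps the separators portable and avoids
# backslash escape sequences inside the string literals.
data_path = os.path.join('..', 'Dataset', 'data_cleaned_clustering.csv')
genre_data_path = os.path.join('..', 'Dataset', 'genre_data_cleaned_clustering.csv')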

# Check if files exist and load them
if os.path.exists(data_path) and os.path.exists(genre_data_path):
    data_original = pd.read_csv(data_path)
    genre_data_original = pd.read_csv(genre_data_path)
    print("Info: Data and genre data successfully loaded.")
else:
    print("Attention: One or both files are not found in the specified directory.")
    raise FileNotFoundError("Dataset files are missing. Please check the paths.")

# Create copies of the datasets for manipulation
data = data_original.copy()
genre_data = genre_data_original.copy()

Info: Data and genre data successfully loaded.



### 7.2 Types of Recommendation System

### 7.2.1 Collaborative Filtering

Collaborative Filtering helps suggest songs you might like by looking at what other people who like the same music listen to.

#### 7.2.1.1 User-based Collaborative Filtering
- **What it does**: It’s like getting song recommendations from a friend who has similar taste in music. The system finds other users who like what you like and suggests songs they enjoy.
- **Why it’s good**: It’s personalized — you get suggestions that fit your own music taste.

#### 7.2.1.2 Item-based Collaborative Filtering
- **What it does**: If you like a specific song, this method finds other songs that fans of that song also like and recommends them to you.
- **Why it’s good**: It’s reliable because it doesn’t depend much on changing user preferences; it focuses on the songs themselves.

**Note**: Both methods need a lot of data about what songs people listen to in order to work well.
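As a rough illustration of the two variants, the sketch below computes user-to-user and song-to-song similarity from a small play-count matrix. The matrix values and the helper names (`cosine_similarity_matrix`, `most_similar`) are made up for this example; our datasets contain no interaction data of this kind.

```python
import numpy as np

# Hypothetical play counts: rows are users, columns are songs.
plays = np.array([
    [5, 3, 0, 0],
    [4, 0, 0, 1],
    [0, 0, 4, 5],
    [0, 1, 5, 4],
], dtype=float)

def cosine_similarity_matrix(matrix):
    """Pairwise cosine similarity between the rows of `matrix`."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    normalized = matrix / np.where(norms == 0, 1.0, norms)  # avoid division by zero
    return normalized @ normalized.T

# User-based view compares rows (users); item-based view compares columns (songs).
user_similarity = cosine_similarity_matrix(plays)
item_similarity = cosine_similarity_matrix(plays.T)

def most_similar(similarity, index):
    """Index of the row most similar to `index`, excluding `index` itself."""
    scores = similarity[index].copy()
    scores[index] = -np.inf
    return int(np.argmax(scores))

print("User most similar to user 0:", most_similar(user_similarity, 0))
print("Song most similar to song 2:", most_similar(item_similarity, 2))
```

In the user-based case, songs played by the most similar user become candidates; in the item-based case, the most similar songs to one the user already likes become candidates.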

### 7.2.2 Content-Based Filtering

This method recommends songs by looking at the features of the songs themselves.

- **Example**: If you like songs with lots of guitar or a specific beat, the system will find other songs with similar music features and recommend them to you.
- **Why it’s useful**: It gives you songs that match the musical styles you like, without needing to know what others like.
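A minimal sketch of the idea, assuming three audio features per song. The song names and feature values are invented for illustration, and `nearest_songs` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

# Hypothetical audio features: columns are [acousticness, danceability, energy].
song_names = ["acoustic_a", "acoustic_b", "dance_a", "dance_b"]
features = np.array([
    [0.90, 0.30, 0.20],
    [0.85, 0.35, 0.25],
    [0.10, 0.90, 0.80],
    [0.15, 0.85, 0.90],
])

def nearest_songs(features, query_index, n=1):
    """Return (indices, cosine distances) of the n songs closest to the query song."""
    # Standardize each column so no single feature dominates the comparison.
    scaled = (features - features.mean(axis=0)) / features.std(axis=0)
    query = scaled[query_index]
    similarity = scaled @ query / (np.linalg.norm(scaled, axis=1) * np.linalg.norm(query))
    distances = 1.0 - similarity
    # Rank by distance and drop the query song itself.
    order = np.argsort(distances)
    order = order[order != query_index][:n]
    return order, distances[order]

indices, distances = nearest_songs(features, query_index=0, n=1)
print("Closest to", song_names[0], "is", song_names[indices[0]])
```

No listening history is involved: the match comes entirely from the feature vectors.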

### 7.2.3 Hybrid Models

Hybrid models mix both collaborative and content-based methods to give you better song recommendations.

- **How it works**: These models use both what you like and what songs are like to make suggestions. This way, even if there’s not a lot of data on what you’ve listened to, you can still get good recommendations.
- **Why it’s beneficial**: It helps fill in the gaps when there’s limited information about user preferences or new songs, making sure you still get recommendations that you are likely to enjoy.
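One simple way to combine the two signals is a weighted blend whose weight depends on how much listening history is available. The sketch below is an assumption for illustration only: `hybrid_scores`, the `saturation` parameter, and the linear weighting are hypothetical choices, not a method used in this notebook.

```python
import numpy as np

def hybrid_scores(collab_scores, content_scores, interaction_count, saturation=20):
    """Blend collaborative and content-based scores for the same candidate songs.

    The collaborative weight grows linearly with the number of interactions
    and is capped at 1 once `interaction_count` reaches `saturation`
    (an assumed, tunable value).
    """
    weight = min(interaction_count / saturation, 1.0)
    collab = np.asarray(collab_scores, dtype=float)
    content = np.asarray(content_scores, dtype=float)
    return weight * collab + (1.0 - weight) * content

# A new user (no history) is scored purely on content.
print(hybrid_scores([0.9, 0.1], [0.2, 0.8], interaction_count=0))
```

With no history the result equals the content-based scores; with plenty of history it equals the collaborative scores.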


## 7.3 Choosing the Best Model for Music Recommendation

After analyzing our datasets, `genre_data_cleaned.csv` and `data_cleaned.csv`, we need to decide on the most suitable recommendation model based on the characteristics of each dataset.

### 7.3.1 Dataset Overview

#### 1. Genre Data (`genre_data_cleaned.csv`)
This dataset provides extensive descriptive metadata about songs, which includes:

- **Audio Features**: Metrics such as acousticness, danceability, energy, and instrumentalness.
- **Categorical Data**: Attributes like genre, popularity category, and mode category, which describe the emotional and stylistic characteristics of the songs.
- **Artist and Song Information**: Details on the artists and individual tracks.

**Appropriate Model**: Given the rich content-based features, this dataset is ideal for a **Content-Based Filtering** recommendation system.

#### 2. Data Cleaned (`data_cleaned.csv`)
This dataset includes:

- **Song Features**: Contains attributes similar to the genre dataset, such as acousticness, danceability, and energy.
- **Lack of User-Specific Interaction Data**: It does not contain user-item interactions like ratings or play history, which are crucial for collaborative filtering.

**Appropriate Model**: The absence of user interaction data makes this dataset less suited for collaborative filtering and more appropriate for a **Content-Based Filtering** approach.

#### Conclusion

**Model Choice**
- The extensive details in `data_cleaned.csv` make it ideal for analyzing and recommending songs based on their inherent characteristics.
- We will proceed with a **Content-Based Filtering** model using the rich metadata available in our datasets.

**Technique**
- We will implement **Cosine Similarity** to identify and recommend songs with similar features, so that recommendations align closely with the selected song.
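For reference, the standard definition of cosine similarity between two feature vectors, written here as a and b (symbols introduced only for this illustration), together with the cosine distance that SciPy's `cdist(..., 'cosine')` returns:

```latex
\operatorname{sim}(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert},
\qquad
d(\mathbf{a}, \mathbf{b}) = 1 - \operatorname{sim}(\mathbf{a}, \mathbf{b})
```

A distance of 0 means the two vectors point in the same direction. Because standardized features can be negative, the distance can range from 0 to 2.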


## 7.4 Music Recommendation System

**Objective**

The recommendation system provides a user interface through which the user makes a selection and receives personalized recommendations based on song features.

**Overview**

**User Interface**

Users interact with dropdowns to select songs, artists, or genres and receive recommendations. Three dropdowns are provided:

- All Songs: Lists all songs.
- By Artist: Lists the available artists.
- By Genre: Lists the available genres.

These dropdowns let the user choose what the recommendations are based on.

**Recommendation Logic**
- Based on the selected song, the system finds the 5 most similar songs by cosine distance over the scaled numeric features.
- The recommendation screen also shows the average cosine distance between the selected song and the recommended songs.
- It calculates the cluster alignment, the fraction of recommended songs that share the selected song's cluster, and gives feedback based on that score:
    - Excellent match (alignment > 0.8)
    - Good match (alignment > 0.5)
    - Poor match (alignment <= 0.5)
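The alignment score and its thresholds can be sketched in isolation as follows. The helper names `cluster_alignment` and `alignment_label` are hypothetical; the thresholds are the ones listed above.

```python
def cluster_alignment(input_cluster, recommended_clusters):
    """Fraction of recommended songs whose cluster equals the selected song's cluster."""
    if not recommended_clusters:
        raise ValueError("at least one recommendation is required")
    matches = sum(1 for cluster in recommended_clusters if cluster == input_cluster)
    return matches / len(recommended_clusters)

def alignment_label(alignment):
    """Map an alignment score to the feedback categories used in the UI."""
    if alignment > 0.8:
        return "Excellent match"
    if alignment > 0.5:
        return "Good match"
    return "Poor match"

# Selected song is in cluster 3; three of the five recommendations share it.
score = cluster_alignment(3, [3, 3, 3, 1, 2])
print(score, alignment_label(score))
```

With 5 recommendations the score moves in steps of 0.2, so "Excellent match" requires all 5 songs to share the cluster, while 3 or 4 matching songs count as "Good match".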


In [None]:
# Shuffle the rows with a fixed seed (reproducible order) and reset the index
# so that index labels match row positions
data_cleaned_shuffled = data.sample(frac=1, random_state=42).reset_index(drop=True)

# Relevant numeric columns for recommendations
number_cols = [
    'valence', 'year', 'acousticness', 'danceability', 'duration_min', 
    'energy', 'explicit', 'instrumentalness', 'key', 'liveness', 
    'loudness_scaled', 'mode', 'popularity', 'speechiness', 'tempo'
]

# Initialize and fit the scaler
scaler = StandardScaler()
scaler.fit(data_cleaned_shuffled[number_cols])

# Dropdowns for selecting songs, artists, or genres
all_songs_dropdown = widgets.Dropdown(description='🎵 All Songs:', layout={'width': '50%'})
artist_dropdown = widgets.Dropdown(description='🎤 By Artist:', layout={'width': '50%'})
genre_dropdown = widgets.Dropdown(description='🎧 By Genre:', layout={'width': '50%'})

# Parse the stringified artist lists once. ast.literal_eval only accepts Python
# literals, so unlike eval it cannot execute arbitrary code read from the CSV.
import ast
data_cleaned_shuffled['artist_names'] = data_cleaned_shuffled['artists'].apply(
    lambda x: ', '.join(ast.literal_eval(x))
)

# Populate dropdowns with formatted options
all_songs_dropdown.options = [(f"{row['name']} ({row['year']}) - {row['artist_names']} 🎶", index) for index, row in data_cleaned_shuffled.iterrows()]
# Each artist maps to the row index of their first song, so the selected value is a valid song index
artist_dropdown.options = [(artist, index) for index, artist in data_cleaned_shuffled['artist_names'].drop_duplicates().items()]
# Note: the genre values are positions in the list of unique genres, not song indices
genre_dropdown.options = [(genre, index) for index, genre in enumerate(genre_data['genres'].unique())]

# Outputs for displaying recommendations and metrics
output = widgets.Output()
metrics_output = widgets.Output()

# Function to recommend songs and update dropdown for further recommendations
def recommend_songs(song_index, data, genre_data, n_songs=5):
    song_data = data.iloc[song_index]
    # Scale every song once; the selected song's row doubles as the query vector
    data_features = scaler.transform(data[number_cols])
    song_features = data_features[[song_index]]
    distances = cdist(song_features, data_features, 'cosine')[0]

    # Rank by distance, drop the selected song (matched by id), keep the n_songs closest
    indices = np.argsort(distances)
    indices = indices[data['id'].to_numpy()[indices] != song_data['id']][:n_songs]
    recommended_songs = data.iloc[indices].copy()

    # Merge genre data using cluster label from genre_data
    genre_map = genre_data.set_index('cluster')['genres'].to_dict()  # Map cluster to genre
    recommended_songs['genre'] = recommended_songs['cluster_label'].map(genre_map)

    # Rows and distances are returned in the same order
    return recommended_songs, distances[indices]

# Function to display recommendations and metrics
def display_recommendations(change):
    song_index = change.new
    recommendations, distances = recommend_songs(song_index, data_cleaned_shuffled, genre_data, 5)
    
    with output:
        clear_output()
        display(HTML("<strong>🎧 Recommendations:</strong>"))
        
        recommendations_dropdown = widgets.Dropdown(
            options=[(f"🎶 {row['name']} - {row['artist_names']} 🎤 - {row['genre']} 🎧", idx) for idx, row in recommendations.iterrows()],
            description='Further Recommendations:',
            layout={'width': '50%'}
        )
        
        recommendations_dropdown.observe(display_recommendations, names='value')
        display(recommendations_dropdown)
        
        # Show only the necessary columns, including genre; rename returns a new
        # frame, so the recommendations DataFrame itself is left unchanged
        filtered_df = recommendations[['name', 'year', 'artist_names', 'duration_min', 'genre', 'cluster_label']].rename(columns={'artist_names': 'artists'})
        display(HTML(filtered_df.to_html(index=False)))  # Displaying the filtered DataFrame
    
    with metrics_output:
        clear_output()
        average_distance = np.mean(distances)
        input_cluster = data_cleaned_shuffled.iloc[song_index]['cluster_label']
        
        # Calculate the alignment of clusters
        alignment = np.mean(recommendations['cluster_label'] == input_cluster)
        display(HTML(f"<strong>📏 Average Cosine Distance:</strong> {average_distance:.4f}<br><strong>🔄 Cluster Alignment:</strong> {alignment:.2f}"))
        
        # Conclusion based on alignment score
        if alignment > 0.8:
            conclusion = "👌 Excellent match! 🔥"
        elif alignment > 0.5:
            conclusion = "👍 Good match!"
        else:
            conclusion = "⚠️ Poor match, consider refining features or the model."
        
        display(HTML(f"<strong>📝 Conclusion:</strong> {conclusion}"))

# Connect dropdowns to the recommendation function
all_songs_dropdown.observe(display_recommendations, names='value')
artist_dropdown.observe(display_recommendations, names='value')
genre_dropdown.observe(display_recommendations, names='value')

# Display UI components
tab = widgets.Tab([all_songs_dropdown, artist_dropdown, genre_dropdown])
tab.set_title(0, '🎶 All Songs')
tab.set_title(1, '🎤 By Artist')
tab.set_title(2, '🎧 By Genre')

# Set up the layout and show everything on the UI
display(widgets.VBox([tab, output, metrics_output]))



## 7.5 Future Scope

**Real-Time Processing**
- Build user profiles, integrate real-time feedback (likes/dislikes), and apply collaborative filtering for more accurate, personalized suggestions.

**Advanced Models**
- Implement hybrid models combining collaborative, content-based, and context-aware filtering for better recommendations.

**Enhanced Features**
- Incorporate audio features, sentiment analysis of lyrics, and temporal trends for better recommendations.
