## Music Similarity Finder - Rijalda Šaćirbegović (20184)

### Installation and Imports

In this first step, I install the libraries needed for audio processing and the user interface.  
- **librosa** is used for loading audio files and extracting features.  
- **mutagen** will later be used to read metadata (title, artist, genre) from MP3 files.

After installation, I import all libraries required for the project, including:
- `numpy` and `pandas` for data handling
- `matplotlib` for visualizations
- `gradio` for building the interactive UI
- `librosa` for audio analysis
- `cosine_similarity` from scikit-learn for comparing feature vectors

This cell prepares the environment for the rest of the project.

In [1]:
!pip install librosa -q
!pip install mutagen

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import gradio as gr
import librosa
from sklearn.metrics.pairwise import cosine_similarity

Collecting mutagen
  Downloading mutagen-1.47.0-py3-none-any.whl.metadata (1.7 kB)
Downloading mutagen-1.47.0-py3-none-any.whl (194 kB)
Installing collected packages: mutagen
Successfully installed mutagen-1.47.0


In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### Extracting Metadata from Audio Files

In this step, I scan the folder containing all MP3 tracks and collect basic metadata for each file.  
I use the `mutagen` library to read ID3 tags such as **title**, **artist**, and **genre**.  
If a file does not contain ID3 tags, I fall back to using the filename as the title and set the unknown fields to default values.

For each track, I create a dictionary with:
- a unique `track_id`
- the song title
- artist name
- genre
- the original filename

All metadata is stored in a pandas DataFrame and saved to `metadata.csv` in my Drive.  
This file will be used later when building the similarity system.
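
The filename fallback described above can be sketched as a small standalone helper. This is only an illustration: the name `title_from_filename` is hypothetical, and it assumes that underscores and hyphens are the only separators worth replacing.

```python
import os

def title_from_filename(filename):
    """Derive a readable title from a file name (hypothetical helper)."""
    base, _ext = os.path.splitext(filename)  # drop the ".mp3" extension
    # Replace common separators with spaces, then collapse repeated whitespace
    cleaned = base.replace("_", " ").replace("-", " ")
    return " ".join(cleaned.split())

print(title_from_filename("my_song-demo.mp3"))
```

Stripping the extension here keeps titles free of the `.mp3` suffix even for files that have no usable title tag.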

In [3]:
import os
import pandas as pd
from mutagen.easyid3 import EasyID3
import mutagen

TRACKS_DIR = "/content/drive/MyDrive/ADS project /tracks"

files = sorted([
    f for f in os.listdir(TRACKS_DIR)
    if f.lower().endswith(".mp3")
])

metadata = []

for i, f in enumerate(files, start=1):
    path = os.path.join(TRACKS_DIR, f)

    try:
        audio = EasyID3(path)
        title = audio.get("title", [f])[0]
        artist = audio.get("artist", ["Unknown Artist"])[0]
        genre = audio.get("genre", ["Unknown"])[0]
    except mutagen.id3.ID3NoHeaderError:
        # fallback if file has no ID3 tag
        base = f.rsplit(".", 1)[0]
        title = base.replace("_", " ").replace("-", " ")
        artist = "Unknown Artist"
        genre = "Unknown"

    metadata.append({
        "track_id": i,
        "title": title,
        "artist": artist,
        "genre": genre,
        "filename": f
    })

df = pd.DataFrame(metadata)

OUTPUT_PATH = "/content/drive/MyDrive/ADS project /metadata.csv"
df.to_csv(OUTPUT_PATH, index=False)

df

Unnamed: 0,track_id,title,artist,genre,filename
0,1,A Stroll - The Grey Room _ Density & Time.mp3,Unknown Artist,Unknown,A Stroll - The Grey Room _ Density & Time.mp3
1,2,At All Costs - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,At All Costs - The Grey Room _ Golden Palms.mp3
2,3,Boogie Down - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,Boogie Down - The Grey Room _ Golden Palms.mp3
3,4,By Myself - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,By Myself - The Grey Room _ Clark Sims.mp3
4,5,Claim To Fame - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Claim To Fame - The Grey Room _ Clark Sims.mp3
5,6,Cooked - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,Cooked - The Grey Room _ Golden Palms.mp3
6,7,Down The Rabbit Hole - The Grey Room _ Density...,Unknown Artist,Unknown,Down The Rabbit Hole - The Grey Room _ Density...
7,8,F16 - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,F16 - The Grey Room _ Golden Palms.mp3
8,9,Flutter - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Flutter - The Grey Room _ Clark Sims.mp3
9,10,Frame-Dragging - The Grey Room _ Density & Tim...,Unknown Artist,Unknown,Frame-Dragging - The Grey Room _ Density & Tim...


### Loading the Metadata

After creating the metadata file, I load it back into the notebook so it can be used for feature extraction and later steps of the project.  
`tracks_df` now contains one row per audio file, including its title, artist, genre, and filename.  
The `head()` function displays the first few rows to confirm that the metadata was loaded correctly.


In [4]:
# Path to folder with audio files
TRACKS_DIR = "/content/drive/MyDrive/ADS project /tracks"

# Path to metadata CSV
METADATA_PATH = "/content/drive/MyDrive/ADS project /metadata.csv"

# Load metadata
tracks_df = pd.read_csv(METADATA_PATH)

tracks_df.head()


Unnamed: 0,track_id,title,artist,genre,filename
0,1,A Stroll - The Grey Room _ Density & Time.mp3,Unknown Artist,Unknown,A Stroll - The Grey Room _ Density & Time.mp3
1,2,At All Costs - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,At All Costs - The Grey Room _ Golden Palms.mp3
2,3,Boogie Down - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,Boogie Down - The Grey Room _ Golden Palms.mp3
3,4,By Myself - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,By Myself - The Grey Room _ Clark Sims.mp3
4,5,Claim To Fame - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Claim To Fame - The Grey Room _ Clark Sims.mp3


### Feature Extraction Function

This function converts an audio file into a single numerical feature vector.  
I use `librosa` to load the audio and extract several standard audio features:

1. **MFCCs** - capture timbre and overall sound texture  
2. **Spectral centroid** - relates to brightness of the track  
3. **Chroma features** - describe harmonic and pitch content  
4. **Tempo** - gives an estimate of the song's speed or rhythm

Each feature originally has multiple values across time, so I take the **mean along the time axis** and then flatten the result using `.ravel()` so that every output becomes a simple 1D vector. The reason is that every track must produce a feature vector of the same fixed length, so the vectors can be concatenated and stored in one consistent feature matrix.

Finally, all extracted features are concatenated into one vector.  
If the file cannot be processed, the function returns `None`.

This function is applied to every audio file in the dataset.

In [5]:
#Feature extraction function
TRACKS_DIR = "/content/drive/MyDrive/ADS project /tracks"  # FIXED PATH

def extract_features_for_file(filepath, sr=22050):
    """
    Extracts a 1D feature vector from an audio file.
    Returns a 1D numpy array of features.
    """
    try:
        y, sr = librosa.load(filepath, sr=sr, mono=True)

        # 1. MFCC (mel-frequency cepstral coefficients)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        mfcc_mean = mfcc.mean(axis=1).ravel()   # (13,)

        # 2. Spectral centroid
        spec_centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
        spec_centroid_mean = spec_centroid.mean(axis=1).ravel()  # (1,)

        # 3. Chroma features
        chroma = librosa.feature.chroma_stft(y=y, sr=sr)
        chroma_mean = chroma.mean(axis=1).ravel()  # (12,)

        # 4. Tempo (ensure scalar; beat_track may return a 1-element array)
        tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
        tempo_val = float(np.atleast_1d(tempo)[0])
        tempo_arr = np.array([tempo_val])

        # Concatenate ALL features as clean 1D vectors
        features = np.concatenate([
            mfcc_mean,
            spec_centroid_mean,
            chroma_mean,
            tempo_arr
        ])

        return features

    except Exception as e:
        print(f"Error processing {filepath}: {e}")
        return None



### Extracting Features for All Tracks

Here I loop over all rows in `tracks_df` and apply the `extract_features_for_file()` function to each MP3 file:

- For every filename in the metadata table, I build the full path.
- If the file exists, I print its name and extract the features.
- Only successful extractions are stored in `feature_list`, and I keep the corresponding indices in `valid_indices`.

At the end, I stack all feature vectors into a NumPy array called `features` and create a cleaned DataFrame `tracks_valid_df` that only contains tracks with valid features.

For my dataset, the output is:

- `Features shape: (32, 27)` - 32 tracks, each represented by a 27-dimensional feature vector  
- `tracks_valid_df.head()` - shows the first few tracks with their metadata (track_id, title, artist, genre, filename)




In [6]:
#Extract features for all tracks
feature_list = []
valid_indices = []

for idx, row in tracks_df.iterrows():
    fname = row["filename"]
    path = os.path.join(TRACKS_DIR, fname)

    if not os.path.exists(path):
        print(f"File not found, skipping: {path}")
        continue

    print(f"Processing: {fname}")
    feats = extract_features_for_file(path)

    if feats is not None:
        feature_list.append(feats)
        valid_indices.append(idx)

features = np.array(feature_list)
tracks_valid_df = tracks_df.iloc[valid_indices].reset_index(drop=True)

print("Features shape:", features.shape)
tracks_valid_df.head()

Processing: A Stroll - The Grey Room _ Density & Time.mp3


Processing: At All Costs - The Grey Room _ Golden Palms.mp3
Processing: Boogie Down - The Grey Room _ Golden Palms.mp3
Processing: By Myself - The Grey Room _ Clark Sims.mp3
Processing: Claim To Fame - The Grey Room _ Clark Sims.mp3
Processing: Cooked - The Grey Room _ Golden Palms.mp3
Processing: Down The Rabbit Hole - The Grey Room _ Density & Time.mp3
Processing: F16 - The Grey Room _ Golden Palms.mp3
Processing: Flutter - The Grey Room _ Clark Sims.mp3
Processing: Frame-Dragging - The Grey Room _ Density & Time.mp3
Processing: High Noon - The Grey Room _ Density & Time.mp3
Processing: In The Morning - The Grey Room _ Clark Sims.mp3
Processing: Missed My Chance - The Grey Room _ Clark Sims.mp3
Processing: Nebula - The Grey Room _ Density & Time.mp3
Processing: On The Flip - The Grey Room _ Density & Time.mp3
Processing: Overboard - The Grey Room _ Golden Palms.mp3
Processing: Pawn - The Grey Room _ Golden Palms.mp3
Processing: Pulsar - The Grey Room _ Density & Time.mp3
Processing: 

Unnamed: 0,track_id,title,artist,genre,filename
0,1,A Stroll - The Grey Room _ Density & Time.mp3,Unknown Artist,Unknown,A Stroll - The Grey Room _ Density & Time.mp3
1,2,At All Costs - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,At All Costs - The Grey Room _ Golden Palms.mp3
2,3,Boogie Down - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,Boogie Down - The Grey Room _ Golden Palms.mp3
3,4,By Myself - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,By Myself - The Grey Room _ Clark Sims.mp3
4,5,Claim To Fame - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Claim To Fame - The Grey Room _ Clark Sims.mp3


### Why did we need to adjust the feature extraction function?

The librosa feature functions for MFCCs, spectral centroid, and chroma return matrices with different shapes, for example:

- MFCC - (13, time_frames)
- Spectral centroid - (1, time_frames)
- Chroma - (12, time_frames)

After taking the mean over time (axis=1), these should become simple 1D vectors:

- MFCC - (13,)
- Centroid - (1,)
- Chroma - (12,)

However, for some tracks the averaged results still had shapes like (1, 1) or (12, 1).  
To fix this, we used `.ravel()` so every feature becomes a clean 1D array.

Tempo also sometimes came back as a small array instead of a scalar, so we converted it to a float and then wrapped it in a single-value array.

The result:  
all features have the same length, NumPy can stack them without errors, and each track gets a consistent feature vector.
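
The shape handling can be checked without any audio. The sketch below uses random arrays as stand-ins for the librosa outputs (the frame count and the values are made up) and only demonstrates that the averaged, flattened pieces concatenate into a 27-element vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames = 100  # hypothetical number of time frames

# Random stand-ins with the same shapes as the librosa feature matrices
mfcc = rng.normal(size=(13, n_frames))
centroid = rng.normal(size=(1, n_frames))
chroma = rng.random(size=(12, n_frames))
tempo = np.array([120.0])  # tempo delivered as a 1-element array

parts = [
    mfcc.mean(axis=1).ravel(),       # (13,)
    centroid.mean(axis=1).ravel(),   # (1,)
    chroma.mean(axis=1).ravel(),     # (12,)
    np.atleast_1d(tempo).astype(float).ravel()[:1],  # (1,)
]
features = np.concatenate(parts)
print([p.shape for p in parts], features.shape)
```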


### Mood Tagging Based on Tempo and Spectral Brightness

Here we add a simple “mood” label to each track using two audio features:

- **Tempo** - indicates how fast or energetic the track feels  
- **Spectral brightness** (spectral centroid) - indicates whether the sound is more “dark” or “bright”

We add these values to the DataFrame and use the median brightness as the dark/bright threshold. The rules are checked in order, and the first match wins:

- tempo below 90 BPM + below-median brightness - **Calm**
- tempo above 120 BPM + above-median brightness - **Energetic**
- otherwise, below-median brightness - **Dark**
- otherwise - **Bright**

This mood tagging is basic but useful. It gives an extra layer of information to the tracks and makes the similarity results easier to interpret.
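
As a quick way to see the rules in action without the audio files, here is a minimal sketch on made-up numbers. The function name `tag_moods` and the toy values are hypothetical; the 90 and 120 BPM cut-offs mirror the ones used in the notebook cell.

```python
from statistics import median

def tag_moods(tracks, slow_bpm=90, fast_bpm=120):
    """tracks: list of (tempo_bpm, brightness) pairs. Returns one mood per track."""
    brightness_med = median(b for _, b in tracks)
    moods = []
    for tempo, brightness in tracks:
        if tempo < slow_bpm and brightness < brightness_med:
            moods.append("Calm")
        elif tempo > fast_bpm and brightness > brightness_med:
            moods.append("Energetic")
        elif brightness < brightness_med:
            moods.append("Dark")
        else:
            moods.append("Bright")
    return moods

# Toy data: the median brightness of these four values is 2000
toy = [(70, 1000), (130, 3000), (100, 1200), (100, 2800)]
print(tag_moods(toy))
```

Because the median is computed from the data itself, the same track can receive a different label when the collection changes.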



In [8]:
# === Mood tagging based on tempo and spectral brightness (run after features and tracks_valid_df exist) ===

# tempo is the last element of the feature vector (mfcc 13 + centroid 1 + chroma 12 + tempo 1 = 27)
tempo_vals = features[:, -1]

# the spectral centroid mean comes right after the MFCCs (index 13)
brightness_vals = features[:, 13]

# add both to the DataFrame for convenience
tracks_valid_df["tempo"] = tempo_vals
tracks_valid_df["brightness"] = brightness_vals

# use the median as the threshold separating "dark" from "bright"
brightness_med = tracks_valid_df["brightness"].median()

def map_mood(row):
    t = row["tempo"]
    b = row["brightness"]

    # simple heuristic rules
    if t < 90 and b < brightness_med:
        return "Calm"
    elif t > 120 and b > brightness_med:
        return "Energetic"
    elif b < brightness_med:
        return "Dark"
    else:
        return "Bright"

tracks_valid_df["mood"] = tracks_valid_df.apply(map_mood, axis=1)

tracks_valid_df[["title", "tempo", "brightness", "mood"]].head()



Unnamed: 0,title,tempo,brightness,mood
0,A Stroll - The Grey Room _ Density & Time.mp3,75.99954,1084.110942,Calm
1,At All Costs - The Grey Room _ Golden Palms.mp3,80.749512,2112.195762,Bright
2,Boogie Down - The Grey Room _ Golden Palms.mp3,117.453835,2495.84686,Bright
3,By Myself - The Grey Room _ Clark Sims.mp3,112.347147,2003.952662,Bright
4,Claim To Fame - The Grey Room _ Clark Sims.mp3,95.703125,2322.430439,Bright


### Computing the Similarity Matrix

After extracting all features, I calculate how similar each song is to every other song.
To do this, I use **cosine similarity**, which measures how closely two feature vectors point in the same direction.

- If the value is close to **1**, the two songs have very similar feature vectors.  
- If it is close to **0**, their feature vectors have little in common.

This gives us a matrix where each row and column represents a song, and the numbers inside show their similarity.
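
For intuition, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes it by hand on toy vectors (all values are made up). The last example illustrates a caveat that may apply here, since the notebook concatenates raw feature values: one large-valued feature can dominate the score.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length sequences of numbers."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine([1, 2, 3], [2, 4, 6]))   # same direction: close to 1
print(cosine([1, 0], [0, 1]))         # perpendicular: 0
# The large first component dominates, so the score stays near 1
# even though the two small components are swapped.
print(cosine([2000, 0.1, 0.9], [2100, 0.9, 0.1]))
```

Standardizing each feature column before computing the cosine (for example with scikit-learn's `StandardScaler`) is one common remedy; whether it improves these recommendations would need to be checked on the dataset.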


In [9]:
#Similarity matrix and helper function
# Compute similarity matrix between all tracks
from sklearn.metrics.pairwise import cosine_similarity

similarity_matrix = cosine_similarity(features)

similarity_matrix.shape


(32, 32)

### Helper Function for Finding Similar Tracks

This function returns the most similar songs for a given track.  
The input is the track's index, and the output is a small table showing the top recommendations.

Here is what the function does step by step:

1. It takes the row from the similarity matrix that belongs to the selected song.
2. It sorts all other songs from most similar to least similar.
3. It removes the song itself from the list (so a track doesn't recommend itself).
4. It picks the top `k` most similar songs.
5. It creates a DataFrame containing the metadata of those songs plus their similarity scores.

This function is the core of the recommendation system.
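
The five steps can be traced on a tiny hand-written matrix. This is a plain-Python sketch with made-up scores; `top_k_similar` is a hypothetical stand-in for the notebook's helper and returns `(index, score)` pairs instead of a DataFrame.

```python
def top_k_similar(sim_matrix, track_index, top_k=2):
    """Return (index, score) pairs of the top_k most similar tracks, excluding the query."""
    row = sim_matrix[track_index]                       # step 1: row of the query track
    order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)  # step 2
    order = [j for j in order if j != track_index]      # step 3: drop the track itself
    return [(j, row[j]) for j in order[:top_k]]         # steps 4-5

toy = [
    [1.0, 0.9, 0.2, 0.6],
    [0.9, 1.0, 0.3, 0.5],
    [0.2, 0.3, 1.0, 0.4],
    [0.6, 0.5, 0.4, 1.0],
]
print(top_k_similar(toy, 0))
```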

In [10]:
#helper function
def get_similar_tracks(track_index, top_k=5):
    """
    Given the index of a track (row in tracks_valid_df),
    return a DataFrame of the top_k most similar songs (excluding itself).
    """
    sim_scores = similarity_matrix[track_index]

    # Get indices of tracks sorted by similarity (high to low)
    # Exclude itself
    indices = np.argsort(sim_scores)[::-1]  # descending
    indices = [i for i in indices if i != track_index]

    top_indices = indices[:top_k]
    top_scores = sim_scores[top_indices]

    # Build result DataFrame
    result = tracks_valid_df.iloc[top_indices].copy()
    result["similarity_score"] = top_scores

    return result


## Validation & Testing

To ensure that the music similarity system works correctly and consistently, we performed several simple validation tests:

**1. Multiple Query Track Test**

I tested the system on several different tracks (indexes 0, 5, 12, 20) to see whether it returns reasonable recommendations for each one.
This helps confirm that the similarity algorithm works for the whole dataset, not just one example.

**2. Stability Test**

I ran the recommendation function twice for the same track and checked whether the results were identical.
If both outputs match, the system is deterministic (always gives the same answer for the same input).

**3. Self-Similarity Test**

Here I checked that the system never recommends the same song as one of its “similar tracks.”
The recommendation list should always exclude the query track itself.

**4. Similarity Score Distribution**

Finally, I looked at the minimum, maximum, and mean values of all similarity scores.
This gives a quick sense of how spread-out or clustered the features are, and whether the similarity matrix behaves normally.


In [13]:
# Validation & Testing

# 1. Test similar tracks for multiple songs
print("Test: Multiple Query Tracks")
for idx in [0, 5, 12, 20]:
    print(f"\nQuery track index: {idx}")
    display(get_similar_tracks(idx, top_k=5))


# 2. Stability test: same input - same output
print("Test: Stability Test ")
res1 = get_similar_tracks(0, top_k=5)
res2 = get_similar_tracks(0, top_k=5)
print("Stable results:", res1.equals(res2))


# 3. Self-similarity test: ensure the track does not recommend itself
print("Test: Self-Similarity Test")

def identity_test():
    for i in range(len(similarity_matrix)):
        sims = get_similar_tracks(i, top_k=5)
        if any(sims.index == i):
            return False
    return True

print("Self-similarity satisfied:", identity_test())


# 4. Similarity score distribution
print("Test: Similarity Score Distribution")
all_scores = similarity_matrix.flatten()
print("Min similarity:", all_scores.min())
print("Max similarity:", all_scores.max())
print("Mean similarity:", all_scores.mean())


Test: Multiple Query Tracks

Query track index: 0


Unnamed: 0,track_id,title,artist,genre,filename,tempo,brightness,mood,similarity_score
20,21,Rapid Unscheduled Disassembly - The Grey Room ...,Unknown Artist,Unknown,Rapid Unscheduled Disassembly - The Grey Room ...,80.749512,1497.991418,Calm,0.997721
14,15,On The Flip - The Grey Room _ Density & Time.mp3,Unknown Artist,Unknown,On The Flip - The Grey Room _ Density & Time.mp3,83.354335,1299.961461,Calm,0.997404
21,22,Resolution Or Reflection - The Grey Room _ Cla...,Unknown Artist,Unknown,Resolution Or Reflection - The Grey Room _ Cla...,95.703125,1213.706217,Dark,0.997012
27,28,good for the ghost - Alge.mp3,Unknown Artist,Unknown,good for the ghost - Alge.mp3,161.499023,1400.034882,Dark,0.99589
29,30,test demo - Alge.mp3,Unknown Artist,Unknown,test demo - Alge.mp3,161.499023,1400.034882,Dark,0.99589



Query track index: 5


Unnamed: 0,track_id,title,artist,genre,filename,tempo,brightness,mood,similarity_score
18,19,Purple Desire - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Purple Desire - The Grey Room _ Clark Sims.mp3,135.999178,2320.707084,Energetic,0.999795
11,12,In The Morning - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,In The Morning - The Grey Room _ Clark Sims.mp3,99.384014,2213.176773,Bright,0.999785
2,3,Boogie Down - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,Boogie Down - The Grey Room _ Golden Palms.mp3,117.453835,2495.84686,Bright,0.999744
3,4,By Myself - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,By Myself - The Grey Room _ Clark Sims.mp3,112.347147,2003.952662,Bright,0.99974
4,5,Claim To Fame - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Claim To Fame - The Grey Room _ Clark Sims.mp3,95.703125,2322.430439,Bright,0.999716



Query track index: 12


Unnamed: 0,track_id,title,artist,genre,filename,tempo,brightness,mood,similarity_score
3,4,By Myself - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,By Myself - The Grey Room _ Clark Sims.mp3,112.347147,2003.952662,Bright,0.999875
8,9,Flutter - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Flutter - The Grey Room _ Clark Sims.mp3,95.703125,1858.73799,Bright,0.999818
7,8,F16 - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,F16 - The Grey Room _ Golden Palms.mp3,107.666016,2190.573602,Bright,0.99981
22,23,Ruff Money - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Ruff Money - The Grey Room _ Clark Sims.mp3,99.384014,2348.657785,Bright,0.999707
11,12,In The Morning - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,In The Morning - The Grey Room _ Clark Sims.mp3,99.384014,2213.176773,Bright,0.999692



Query track index: 20


Unnamed: 0,track_id,title,artist,genre,filename,tempo,brightness,mood,similarity_score
14,15,On The Flip - The Grey Room _ Density & Time.mp3,Unknown Artist,Unknown,On The Flip - The Grey Room _ Density & Time.mp3,83.354335,1299.961461,Calm,0.998939
21,22,Resolution Or Reflection - The Grey Room _ Cla...,Unknown Artist,Unknown,Resolution Or Reflection - The Grey Room _ Cla...,95.703125,1213.706217,Dark,0.998306
19,20,Push Thru - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,Push Thru - The Grey Room _ Golden Palms.mp3,89.102909,1618.619247,Calm,0.998112
0,1,A Stroll - The Grey Room _ Density & Time.mp3,Unknown Artist,Unknown,A Stroll - The Grey Room _ Density & Time.mp3,75.99954,1084.110942,Calm,0.997721
29,30,test demo - Alge.mp3,Unknown Artist,Unknown,test demo - Alge.mp3,161.499023,1400.034882,Dark,0.996793


Test: Stability Test 
Stable results: True
Test: Self-Similarity Test
Self-similarity satisfied: True
Test: Similarity Score Distribution
Min similarity: 0.5036851346204818
Max similarity: 1.0000000000000002
Mean similarity: 0.9508135817143187


**Explanation of Validation Output**

**1. Multiple Query Tracks (indexes 0, 5, 12, 20)**

For each selected track, the system returned the top 5 most similar songs based on cosine similarity. In all cases:

- The recommended tracks have high similarity scores (all above 0.99 in the tables shown).
- The results seem reasonable: songs from the same artist or in a similar style often appear together.
- The recommendations do not include the track itself, which is correct.

This shows that the similarity function behaves consistently across the dataset.

**2. Stability Test**

The output `Stable results: True` means that asking twice for recommendations for the same song gives exactly the same results.
This confirms that the system is deterministic: no randomness affects the outcome.

**3. Self-Similarity Test**

The result `Self-similarity satisfied: True` means that the system never recommends the query track as one of its own similar songs, which is exactly what we want.
The filtering step works correctly.

**4. Similarity Score Distribution**

The printed values show:

- Min similarity (about 0.50) - the least similar pair of tracks
- Max similarity (about 1.0) - this value comes from the diagonal of the matrix, where each track is compared with itself; the closest distinct pairs in the tables above score around 0.999
- Mean similarity (about 0.95) - the general similarity level of the dataset, diagonal included

This confirms that the feature extraction process creates consistent vectors. The high mean indicates that most scores fall in a narrow band near 1, so the ranking of tracks is more informative than the absolute score values.
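
The printed maximum of about 1.0 comes from the diagonal of the similarity matrix, where each track is compared with itself. A minimal sketch, on a made-up 3x3 matrix, of how the statistics could be restricted to pairs of distinct tracks:

```python
import numpy as np

# Toy symmetric similarity matrix (values are made up)
sim = np.array([
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.7],
    [0.5, 0.7, 1.0],
])

# Boolean mask that is False on the diagonal and True everywhere else
off_diag = sim[~np.eye(len(sim), dtype=bool)]

print("Min:", off_diag.min())
print("Max:", off_diag.max())
print("Mean:", off_diag.mean())
```

Applied to the notebook's `similarity_matrix`, the same mask would report the closest pair of different tracks instead of the self-similarity.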

## Similarity Threshold (Sensitivity Control)

This part adds a simple way for the user to control how strict the recommendations should be.

If the threshold is high, the system will only return songs that are very similar.

If the threshold is low, the system will return more songs, even if they are not extremely close.

The function looks at all similarity scores for the selected song and keeps only the ones that are at or above the user's chosen value (`min_sim`).
If fewer than `top_k` songs pass this threshold, the function returns the `top_k` closest ones instead, so the user always gets results.

Overall, this feature gives the user a basic “sensitivity” setting for how similar the recommended tracks should be.

In [14]:
# Helper with similarity threshold (sensitivity)

def get_similar_tracks_with_threshold(track_index, min_sim=0.0, top_k=3):
    """
    Returns up to top_k most similar tracks whose similarity_score >= min_sim.
    If not enough tracks pass the threshold, it simply returns the top_k closest.
    """
    scores = similarity_matrix[track_index]

    # sort indices by similarity (highest to lowest)
    sorted_ids = scores.argsort()[::-1]
    # exclude the query track itself
    sorted_ids = [i for i in sorted_ids if i != track_index]

    # filter by threshold
    if min_sim > 0.0:
        filtered_ids = [i for i in sorted_ids if scores[i] >= min_sim]
    else:
        filtered_ids = sorted_ids

    # if too few tracks pass the threshold, fall back to the unfiltered top_k
    if len(filtered_ids) < top_k:
        filtered_ids = sorted_ids[:top_k]
    else:
        filtered_ids = filtered_ids[:top_k]

    result = tracks_valid_df.iloc[filtered_ids].copy()
    result["similarity_score"] = scores[filtered_ids]
    return result
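
One consequence of the fallback is worth noting. Because the candidates are sorted in descending order, the tracks that pass the threshold always form a leading slice of that list, so both branches of the function above end up returning the same first `top_k` entries, whatever `min_sim` is. Below is a minimal sketch of a stricter variant, assuming it is acceptable to return fewer than `top_k` tracks and to fall back to the single closest track only when nothing passes. The name `filter_by_threshold` and the toy scores are hypothetical.

```python
def filter_by_threshold(scores, track_index, min_sim=0.0, top_k=3):
    """Return indices of up to top_k tracks with score >= min_sim (query excluded)."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    order = [j for j in order if j != track_index]
    passing = [j for j in order if scores[j] >= min_sim]
    if not passing:
        # Nothing passes: return only the single closest track
        return order[:1]
    return passing[:top_k]

# Toy similarity row for query track 0 (values are made up)
toy_scores = [1.0, 0.99, 0.95, 0.80, 0.60]
print(filter_by_threshold(toy_scores, 0, min_sim=0.9))
print(filter_by_threshold(toy_scores, 0, min_sim=0.5))
print(filter_by_threshold(toy_scores, 0, min_sim=0.995))
```

With this variant the slider changes how many tracks come back. The Gradio callback already pads its outputs to three entries, so shorter result lists would still display.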


## Waveform and Spectrogram Plots

To make the audio easier to understand, we added two helper functions that draw simple visualizations of each song:

**1. Waveform plot**

- Shows how the sound changes over time
- Time is on the x-axis, amplitude on the y-axis
- This helps us see where the song gets louder, quieter, or more energetic

**2. Mel Spectrogram**

- Shows how strong different frequencies are over time
- Bright colors mean stronger (louder) frequencies
- This gives a quick visual idea of the song's tone and texture

These visuals help the user see what the audio looks like, not just listen to it.
We later display both plots in the Gradio interface when a track is selected.
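
The spectrogram colors are drawn on a decibel scale relative to the loudest value. The sketch below shows the core of that conversion on made-up power values. It is a simplification: `librosa.power_to_db` additionally applies a small floor (`amin`) and clips the dynamic range (`top_db`), which this hypothetical `power_to_db_simple` omits.

```python
import math

def power_to_db_simple(power_values):
    """Convert positive power values to dB relative to the maximum (simplified)."""
    ref = max(power_values)
    return [10.0 * math.log10(p / ref) for p in power_values]

# Each factor of 10 in power corresponds to a step of 10 dB
print(power_to_db_simple([1.0, 0.1, 0.01]))
```

The loudest value maps to 0 dB and everything else is negative, which matches the colorbar format used in the plotting cell.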

In [15]:
# Waveform and spectrogram plotting helpers :
import matplotlib.pyplot as plt
import librosa
import librosa.display

def plot_waveform(filepath, sr=22050):
    """Return a matplotlib Figure showing the waveform of the audio file."""
    y, sr = librosa.load(filepath, sr=sr, mono=True)
    fig, ax = plt.subplots()
    librosa.display.waveshow(y, sr=sr, ax=ax)
    ax.set_title("Waveform")
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")
    plt.tight_layout()
    return fig

def plot_spectrogram(filepath, sr=22050):
    """Return a matplotlib Figure showing a mel-spectrogram."""
    y, sr = librosa.load(filepath, sr=sr, mono=True)
    S = librosa.feature.melspectrogram(y=y, sr=sr)
    S_db = librosa.power_to_db(S, ref=np.max)

    fig, ax = plt.subplots()
    img = librosa.display.specshow(S_db, sr=sr, x_axis="time", y_axis="mel", ax=ax)
    fig.colorbar(img, ax=ax, format="%+2.0f dB")
    ax.set_title("Mel Spectrogram")
    plt.tight_layout()
    return fig


Before building the full UI, we added a small helper function that lists all available tracks in a clean table format.
It shows basic information such as:

- `title`
- `artist`
- `genre`
- `filename`
- `track_index` (added for easier referencing)

This makes it easier to check whether the metadata was loaded correctly and to know which track index to use when testing the similarity functions.

In [16]:
##Simple text-based UI
def show_all_tracks():
    display_cols = ["title", "artist", "genre", "filename"]
    display_df = tracks_valid_df[display_cols].reset_index().rename(columns={"index": "track_index"})
    return display_df

show_all_tracks()


Unnamed: 0,track_index,title,artist,genre,filename
0,0,A Stroll - The Grey Room _ Density & Time.mp3,Unknown Artist,Unknown,A Stroll - The Grey Room _ Density & Time.mp3
1,1,At All Costs - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,At All Costs - The Grey Room _ Golden Palms.mp3
2,2,Boogie Down - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,Boogie Down - The Grey Room _ Golden Palms.mp3
3,3,By Myself - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,By Myself - The Grey Room _ Clark Sims.mp3
4,4,Claim To Fame - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Claim To Fame - The Grey Room _ Clark Sims.mp3
5,5,Cooked - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,Cooked - The Grey Room _ Golden Palms.mp3
6,6,Down The Rabbit Hole - The Grey Room _ Density...,Unknown Artist,Unknown,Down The Rabbit Hole - The Grey Room _ Density...
7,7,F16 - The Grey Room _ Golden Palms.mp3,Unknown Artist,Unknown,F16 - The Grey Room _ Golden Palms.mp3
8,8,Flutter - The Grey Room _ Clark Sims.mp3,Unknown Artist,Unknown,Flutter - The Grey Room _ Clark Sims.mp3
9,9,Frame-Dragging - The Grey Room _ Density & Tim...,Unknown Artist,Unknown,Frame-Dragging - The Grey Room _ Density & Tim...


## Gradio User Interface: Music Similarity App

In this part, I build the full web interface using Gradio.
The goal is to let the user:

- **pick a track from a dropdown**
- **choose how strict the similarity should be (with a slider)**
- **see visualizations**
- **listen to the selected song and the recommended songs**


**1. Track labels and index mapping**

First, I create a list of labels like:
`"Song Title - Artist"` so the dropdown looks clean.
Each label is linked to the correct row in the DataFrame so we can easily find the selected track.
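
One detail to keep in mind with a label-to-index dictionary: if two tracks share the same title and artist, they produce the same label and the later one overwrites the earlier one. A minimal sketch of one way around this, assuming it is acceptable to prefix each label with its row index (the function `build_labels` and the sample rows are hypothetical):

```python
def build_labels(rows):
    """rows: list of (title, artist) pairs. Returns (labels, label_to_index)."""
    labels = [f"{i}: {title} - {artist}" for i, (title, artist) in enumerate(rows)]
    label_to_index = {label: i for i, label in enumerate(labels)}
    return labels, label_to_index

rows = [
    ("Demo", "Unknown Artist"),
    ("Demo", "Unknown Artist"),   # same title and artist as the first row
    ("Other", "Unknown Artist"),
]
labels, label_to_index = build_labels(rows)
print(labels)
print(len(label_to_index))
```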

**2. Similarity bar plot**

The function `plot_similarities()` draws a simple bar chart.
It shows how similar each recommended song is to the selected one.
Higher bars = more similar.

**3. Main function for the UI**

The function `recommend_for_gradio()` runs when the user clicks the “Find similar tracks” button.
It does several things:

- Reads which track the user selected
- Shows basic info about that track (title, artist, genre, mood)
- Loads the audio so the user can play it
- Shows the waveform and spectrogram of the track
- Finds the top 3 most similar tracks (based on the similarity threshold chosen with the slider)
- Shows a bar chart of similarity scores
- Displays and plays the 3 recommended songs

So the user gets audio + visuals + explanations all in one place.

**4. Building the layout**

Using `gr.Blocks()`, I set up the UI:

- a title
- a dropdown with all songs
- a slider to choose similarity sensitivity
- a button to run the system
- all the outputs: plots, audio players, descriptions, etc.

Finally, `music_app.launch()` starts the app.


In [17]:
import gradio as gr

# Build a mapping from label to track_index for the dropdown (this cell replaces the old Gradio code)
track_labels = [
    f"{row['title']} - {row['artist']}"
    for _, row in tracks_valid_df.iterrows()
]
label_to_index = {label: i for i, label in enumerate(track_labels)}

def plot_similarities(sim_df):
    """
    Create a bar plot showing similarity scores for top similar tracks.
    """
    titles = sim_df["title"].tolist()
    scores = sim_df["similarity_score"].tolist()

    x = np.arange(len(titles))

    fig, ax = plt.subplots()
    ax.bar(x, scores)
    ax.set_xticks(x)
    ax.set_xticklabels(titles, rotation=45, ha="right")
    ax.set_ylabel("Similarity")
    ax.set_title("Top similar tracks")
    plt.tight_layout()
    return fig

def recommend_for_gradio(selected_label, sensitivity):
    """
    Gradio callback:
    - selected_label: string from dropdown ("Title - Artist")
    - sensitivity: float in [0, 1], similarity threshold
    Returns:
    - markdown with selected track info (+ mood)
    - audio file path for selected track
    - waveform plot
    - spectrogram plot
    - similarity bar plot
    - title + audio for top 3 similar tracks (with mood in title)
    """
    idx = label_to_index[selected_label]

    # Selected track row
    selected = tracks_valid_df.iloc[idx]
    selected_md = (
        f"**Selected track:** {selected['title']} - {selected['artist']} "
        f"(_{selected.get('genre', 'unknown genre')}_, mood: **{selected.get('mood', 'Unknown')}**)"
    )
    selected_audio = os.path.join(TRACKS_DIR, selected["filename"])

    # Waveform + spectrogram for selected track
    wave_fig = plot_waveform(selected_audio)
    spec_fig = plot_spectrogram(selected_audio)

    # Get up to 3 similar tracks using threshold (sensitivity)
    sim_df = get_similar_tracks_with_threshold(idx, min_sim=sensitivity, top_k=3)
    sim_fig = plot_similarities(sim_df)

    # Prepare audio previews for top 3 similar tracks
    sim_titles = []
    sim_audios = []
    for _, row in sim_df.iterrows():
        mood = row.get("mood", "Unknown")
        sim_titles.append(f"{row['title']} - {row['artist']} (Mood: {mood})")
        sim_audios.append(os.path.join(TRACKS_DIR, row["filename"]))

    # Pad to always have 3 outputs
    while len(sim_titles) < 3:
        sim_titles.append("")
        sim_audios.append(None)

    return (
        selected_md,
        selected_audio,
        wave_fig,
        spec_fig,
        sim_fig,
        sim_titles[0], sim_audios[0],
        sim_titles[1], sim_audios[1],
        sim_titles[2], sim_audios[2],
    )

with gr.Blocks() as music_app:
    gr.Markdown("# 🎵 Music Similarity Finder")
    gr.Markdown(
        "Select a song from the dropdown and adjust the similarity sensitivity slider. "
        "The app shows the selected track, its waveform and spectrogram, and the top 3 similar tracks "
        "based on audio features, along with their mood tags and similarity scores."
    )

    with gr.Row():
        track_dropdown = gr.Dropdown(
            choices=track_labels,
            value=track_labels[0],
            label="Choose a track",
        )
        sensitivity_slider = gr.Slider(
            minimum=0.0,
            maximum=1.0,
            value=0.9,
            step=0.01,
            label="Similarity sensitivity (threshold)"
        )

    run_button = gr.Button("Find similar tracks")

    # OUTPUTS
    selected_info = gr.Markdown()
    selected_audio_player = gr.Audio(label="Selected track", interactive=False)
    waveform_plot = gr.Plot(label="Waveform")
    spectrogram_plot = gr.Plot(label="Spectrogram")
    similarity_plot = gr.Plot(label="Similarity scores")

    gr.Markdown("### Top 3 similar tracks")

    with gr.Row():
        with gr.Column():
            sim1_title = gr.Markdown()
            sim1_audio = gr.Audio(label="Similar track 1", interactive=False)
        with gr.Column():
            sim2_title = gr.Markdown()
            sim2_audio = gr.Audio(label="Similar track 2", interactive=False)
        with gr.Column():
            sim3_title = gr.Markdown()
            sim3_audio = gr.Audio(label="Similar track 3", interactive=False)

    run_button.click(
        fn=recommend_for_gradio,
        inputs=[track_dropdown, sensitivity_slider],
        outputs=[
            selected_info,
            selected_audio_player,
            waveform_plot,
            spectrogram_plot,
            similarity_plot,
            sim1_title, sim1_audio,
            sim2_title, sim2_audio,
            sim3_title, sim3_audio,
        ]
    )

music_app.launch()


It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://1e05e71f338f7cb868.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




**Interpretation of the Gradio Output**

When the user selects a song, the app shows a bar chart with the similarity scores of the top 3 recommended tracks.
These scores are very high (almost 1.0), which means the songs share very similar audio features.

Under the chart, the user can listen to each recommended track.
This makes it easy to compare the songs both visually (chart) and by listening.

Together, the high scores and the audible similarity of the recommended tracks suggest that the system is working as intended.

## Original Features Added to the Project

To make the Music Similarity Finder more useful and different from standard solutions, I added three extra features that improve the user experience and explainability of the recommendations. These features are built directly on top of the audio analysis and similarity model.

**1. Adjustable Similarity Sensitivity (User-Controlled Threshold)**

I added a similarity sensitivity slider so the user can decide how strict the recommendations should be.

- A higher threshold returns only tracks that sound almost identical.
- A lower threshold allows more variety.

This gives the user more control over how the recommendation system behaves and makes the tool more flexible.
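The effect of the threshold can be illustrated with a minimal sketch. This is not the notebook's `get_similar_tracks_with_threshold` function; the helper name `similar_above_threshold` and the toy feature vectors are assumptions made up for this example, and the sketch only assumes that each track is represented by one numeric feature vector.

```python
import numpy as np

def similar_above_threshold(features, query_idx, min_sim=0.9, top_k=3):
    """Return (index, score) pairs for the tracks most similar to query_idx.

    Hypothetical helper for illustration: `features` is an
    (n_tracks, n_features) array-like of numeric feature vectors.
    """
    X = np.asarray(features, dtype=float)
    # Normalise each row so that a dot product equals cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.where(norms == 0, 1.0, norms)
    scores = X @ X[query_idx]

    # Rank by descending similarity, skipping the query track itself.
    order = [i for i in np.argsort(-scores) if i != query_idx]
    # Keep only tracks at or above the threshold, then cap at top_k.
    kept = [(int(i), float(scores[i])) for i in order if scores[i] >= min_sim]
    return kept[:top_k]

# Toy feature vectors (made up, not real audio features).
toy_features = [
    [1.0, 0.0],  # track 0 (the query)
    [0.9, 0.1],  # track 1: almost the same direction as track 0
    [0.5, 0.5],  # track 2: 45 degrees away from track 0
    [0.0, 1.0],  # track 3: orthogonal to track 0
]

strict = similar_above_threshold(toy_features, 0, min_sim=0.9)
loose = similar_above_threshold(toy_features, 0, min_sim=0.5)
print("strict:", strict)
print("loose:", loose)
```

With the strict threshold only track 1 passes, while the lower threshold also lets track 2 through. This is the behaviour the slider exposes: fewer but closer matches at high values, more variety at low values.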

**2. Real-Time Audio Visualization (Waveform and Spectrogram)**

Before showing recommendations, the system displays two simple visualizations of the selected track:

- a waveform, showing how the audio changes over time
- a mel spectrogram, showing how energy is spread across frequencies

These visuals help the user understand the “shape” of the audio, not just the similarity score. They also connect directly to the features used by the model (brightness, tempo, frequency content).

**3. Mood-Based Tagging (Calm, Energetic, Dark, Bright)**

I added a small mood-classification step based on two extracted features:

- tempo
- spectral brightness

Using simple rules, each track gets one of four moods: Calm, Energetic, Dark, or Bright.
These mood labels appear both for the selected song and the recommended ones, which helps the user quickly understand the emotional tone of each track, not just how similar they are numerically.
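As a rough illustration of what such a rule-based tagger can look like, here is a minimal sketch. The real rules and cut-off values are defined in an earlier cell of the notebook; the function name `tag_mood`, the cut-offs (110 BPM and 2000 Hz), and the mapping of the four combinations to mood names are all assumptions chosen for this example.

```python
def tag_mood(tempo_bpm, brightness_hz, tempo_cut=110.0, brightness_cut=2000.0):
    """Assign one of four mood labels from tempo and spectral brightness.

    Hypothetical rules for illustration only:
    - fast and bright -> Energetic
    - fast and dark   -> Dark
    - slow and bright -> Bright
    - slow and dark   -> Calm
    """
    fast = tempo_bpm >= tempo_cut
    bright = brightness_hz >= brightness_cut

    if fast and bright:
        return "Energetic"
    if fast:
        return "Dark"
    if bright:
        return "Bright"
    return "Calm"

# Example inputs (made-up values, not measured from real tracks).
examples = [(140.0, 3200.0), (135.0, 1200.0), (80.0, 2800.0), (70.0, 900.0)]
moods = [tag_mood(tempo, brightness) for tempo, brightness in examples]
print(moods)
```

Because the rules are plain comparisons, every label can be traced back to the two feature values that produced it, which fits the goal of making the recommendations easier to explain.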

## Final Conclusion
In this project, I built a small Music Similarity Finder that can compare songs and recommend tracks that sound alike. I used a collection of MP3 files, extracted basic audio features (MFCC, brightness, chroma, tempo), and turned each song into a numerical feature vector. Then, using cosine similarity, the system measures how close two songs are based on these features.

I also created a simple Gradio interface where the user can pick a track, listen to it, see its waveform and spectrogram, and view the top similar songs. The UI also shows a bar chart with similarity scores, which makes the results easy to read and understand.

All validation tests showed that the system works consistently: it returns sensible results, never recommends the selected song itself, and produces logical similarity rankings.

Hope you enjoyed it!