# Audio Embeddings Generator (OpenL3, PANNs) + Song Recommendations

This notebook demonstrates how to generate audio embeddings using two different state-of-the-art models:
- **OpenL3**: Audio embeddings using deep learning
- **PANNs**: Large-scale Pretrained Audio Neural Networks

We'll then use these embeddings to build a music recommendation system.


### SECTION 1 ‚Äî SETUP of modules


In [None]:
# Install required libraries
%conda install openl3 torch torchaudio tensorflow pandas numpy librosa scikit-learn tqdm soundfile faiss-cpu pyarrow -q


Note: you may need to restart the kernel to use updated packages.


### 1.3 Set Random Seed

Set random seeds for reproducibility across different libraries.


In [10]:
%pip install torch torchvision

Collecting torchvision
  Downloading torchvision-0.24.1-cp311-cp311-win_amd64.whl.metadata (5.9 kB)
Collecting filelock (from torch)
  Using cached filelock-3.20.0-py3-none-any.whl.metadata (2.1 kB)
Collecting sympy>=1.13.3 (from torch)
  Using cached sympy-1.14.0-py3-none-any.whl.metadata (12 kB)
Collecting fsspec>=0.8.5 (from torch)
  Using cached fsspec-2025.10.0-py3-none-any.whl.metadata (10 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy>=1.13.3->torch)
  Using cached mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Downloading torchvision-0.24.1-cp311-cp311-win_amd64.whl (4.0 MB)
   ---------------------------------------- 0.0/4.0 MB ? eta -:--:--
   ----- ---------------------------------- 0.5/4.0 MB 2.4 MB/s eta 0:00:02
   ------- -------------------------------- 0.8/4.0 MB 2.6 MB/s eta 0:00:02
   ------- -------------------------------- 0.8/4.0 MB 2.6 MB/s eta 0:00:02
   ---------- ----------------------------- 1.0/4.0 MB 1.0 MB/s eta 0:00:03
   ------------------ ---------------

In [12]:
import os
import numpy as np
import pandas as pd
import torch
import torchaudio
import librosa
import soundfile as sf
from pathlib import Path
from tqdm import tqdm
import warnings
warnings.filterwarnings('ignore')

# Audio embedding libraries
import openl3
# Similarity search
import faiss

# Utilities
from sklearn.preprocessing import StandardScaler
from typing import Union, List, Dict, Optional
import json

print("All libraries imported successfully!")


All libraries imported successfully!


In [13]:
# Set random seeds for reproducibility
RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
torch.manual_seed(RANDOM_SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed(RANDOM_SEED)
    torch.cuda.manual_seed_all(RANDOM_SEED)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

print(f"Random seed set to {RANDOM_SEED}")


Random seed set to 42


### SECTION 2 ‚Äî LOAD DATA


In [14]:
# Load the filtered CSV file (contains only records with existing WAV files)
CSV_PATH = "track_reference_filtered.csv"

if not os.path.exists(CSV_PATH):
    raise FileNotFoundError(f"Filtered CSV file not found: {CSV_PATH}. Please run filter_csv_by_wav_files.py first.")

# Load the dataset
df = pd.read_csv(CSV_PATH)
print(f"Loaded {len(df)} tracks from {CSV_PATH}")
print(f"\nColumns: {df.columns.tolist()}")

# Create file_path column: WAV files are in 'rec' folder, named as musicbrainz_id.wav
REC_FOLDER = "rec"
df['file_path'] = df['musicbrainz_id'].apply(lambda x: os.path.join(REC_FOLDER, f"{x}.wav"))

print(f"\nWAV files location: {REC_FOLDER}/{{musicbrainz_id}}.wav")
print(f"\nFirst few rows:")
print(df[['musicbrainz_id', 'title', 'artist', 'file_path']].head())


Loaded 1084 tracks from track_reference_filtered.csv

Columns: ['musicbrainz_id', 'title', 'artist', 'artist_id', 'album', 'album_id', 'release_date', 'country', 'length']

WAV files location: rec/{musicbrainz_id}.wav

First few rows:
                         musicbrainz_id                       title  \
0  00b1397d-7f3e-4c59-bb42-ccd7fa17ee10  raindrops (an angel cried)   
1  00c9dcab-4abf-47f5-9755-c5c805b779c7            Through the Wire   
2  012e3459-b54d-49e9-b48d-d0922d295c5a            I'll Cry Instead   
3  013a7fe3-0113-4604-a295-f74a0b88bf05        She‚Äôs Always a Woman   
4  01564f1c-99b2-466a-a60d-4e22a5008525                       angel   

            artist                                     file_path  
0    Ariana Grande  rec\00b1397d-7f3e-4c59-bb42-ccd7fa17ee10.wav  
1               Ye  rec\00c9dcab-4abf-47f5-9755-c5c805b779c7.wav  
2      The Beatles  rec\012e3459-b54d-49e9-b48d-d0922d295c5a.wav  
3       Billy Joel  rec\013a7fe3-0113-4604-a295-f74a0b88bf05.wav  
4

In [16]:
df.head()

Unnamed: 0,musicbrainz_id,title,artist,artist_id,album,album_id,release_date,country,length,file_path
0,00b1397d-7f3e-4c59-bb42-ccd7fa17ee10,raindrops (an angel cried),Ariana Grande,f4fdbb4c-e4b7-47a0-b83b-d91bbfcfa387,"sweetener / thank u, next tour - live at Coach...",6cd36f2a-0c90-45ea-b63b-0e922f1df4ba,2019-04-19,XW,36000.0,rec\00b1397d-7f3e-4c59-bb42-ccd7fa17ee10.wav
1,00c9dcab-4abf-47f5-9755-c5c805b779c7,Through the Wire,Ye,164f0d73-1234-4e2c-8743-d77bf2191051,BET Awards: '04 Nominees,d9f9fa38-f06e-4d22-abf8-73b60983ef8f,2004-01-01,US,270386.0,rec\00c9dcab-4abf-47f5-9755-c5c805b779c7.wav
2,012e3459-b54d-49e9-b48d-d0922d295c5a,I'll Cry Instead,The Beatles,b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d,"UK EP Collection, Volume 1",51443d4d-fdb8-4d4e-8b61-58237764e6ae,2000-01-01,XW,107000.0,rec\012e3459-b54d-49e9-b48d-d0922d295c5a.wav
3,013a7fe3-0113-4604-a295-f74a0b88bf05,She‚Äôs Always a Woman,Billy Joel,64b94289-9474-4d43-8c93-918ccc1920d1,"Retold, Volume 3: Souvenir of a Stranger at th...",c5c69643-b9d3-446b-a773-57747bc1ad08,1995-01-01,US,120560.0,rec\013a7fe3-0113-4604-a295-f74a0b88bf05.wav
4,01564f1c-99b2-466a-a60d-4e22a5008525,angel,Kacey Musgraves,d1393ecb-431b-4fde-a6ea-d769f2f040cb,star‚Äêcrossed,80ec0d1a-00cf-465c-b832-26f15b558b57,2021-09-10,,140000.0,rec\01564f1c-99b2-466a-a60d-4e22a5008525.wav


## 2.2 Filter Valid WAV Files

Filter the dataset to only include tracks with valid WAV files that exist on disk.


In [17]:
def validate_wav_file(file_path: str) -> bool:
    """Check if a WAV file exists and is readable."""
    try:
        if not os.path.exists(file_path):
            return False
        # Try to load the file to ensure it's valid
        data, sr = librosa.load(file_path, sr=None, duration=1.0)
        return len(data) > 0
    except Exception as e:
        print(f"Error validating {file_path}: {e}")
        return False

# Filter tracks with valid WAV files
print("Validating WAV files...")
valid_mask = df['file_path'].apply(validate_wav_file)
df_valid = df[valid_mask].copy().reset_index(drop=True)

print(f"\nOriginal tracks: {len(df)}")
print(f"Valid tracks: {len(df_valid)}")
print(f"Invalid/removed tracks: {len(df) - len(df_valid)}")

if len(df_valid) == 0:
    print("\n‚ö†Ô∏è  WARNING: No valid WAV files found!")
    print("Please update the CSV_PATH and ensure file_paths in the CSV are correct.")
else:
    print(f"\n‚úÖ Successfully validated {len(df_valid)} tracks")
    print("\nSample of valid tracks:")
    print(df_valid.head())


Validating WAV files...

Original tracks: 1084
Valid tracks: 1084
Invalid/removed tracks: 0

‚úÖ Successfully validated 1084 tracks

Sample of valid tracks:
                         musicbrainz_id                       title  \
0  00b1397d-7f3e-4c59-bb42-ccd7fa17ee10  raindrops (an angel cried)   
1  00c9dcab-4abf-47f5-9755-c5c805b779c7            Through the Wire   
2  012e3459-b54d-49e9-b48d-d0922d295c5a            I'll Cry Instead   
3  013a7fe3-0113-4604-a295-f74a0b88bf05        She‚Äôs Always a Woman   
4  01564f1c-99b2-466a-a60d-4e22a5008525                       angel   

            artist                             artist_id  \
0    Ariana Grande  f4fdbb4c-e4b7-47a0-b83b-d91bbfcfa387   
1               Ye  164f0d73-1234-4e2c-8743-d77bf2191051   
2      The Beatles  b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d   
3       Billy Joel  64b94289-9474-4d43-8c93-918ccc1920d1   
4  Kacey Musgraves  d1393ecb-431b-4fde-a6ea-d769f2f040cb   

                                               album 

### SECTION 3 ‚Äî AUDIO EMBEDDING MODELS using OpenL3

#### 3.1 OpenL3 Embedding Model

OpenL3 is a deep learning model for audio embeddings that uses a combination of audio and visual information.


In [18]:
class OpenL3Embedding:
    """OpenL3 audio embedding model wrapper."""
    
    def __init__(self, input_repr="mel256", content_type="music", embedding_size=512):
        """
        Initialize OpenL3 model.
        
        Parameters:
        -----------
        input_repr : str
            Input representation: "linear", "mel128", or "mel256"
        content_type : str
            Content type: "music" or "env"
        embedding_size : int
            Embedding size: 512 or 6144
        """
        self.input_repr = input_repr
        self.content_type = content_type
        self.embedding_size = embedding_size
        self.model = None
        print(f"OpenL3Embedding initialized with input_repr={input_repr}, "
              f"content_type={content_type}, embedding_size={embedding_size}")
    
    def get_embedding(self, wav_path: str) -> np.ndarray:
        """
        Extract embedding from audio file.
        
        Parameters:
        -----------
        wav_path : str
            Path to WAV file
            
        Returns:
        --------
        np.ndarray
            Audio embedding vector
        """
        try:
            # Load audio file
            audio, sr = librosa.load(wav_path, sr=48000)  # OpenL3 expects 48kHz
            
            # Get embedding using OpenL3
            # OpenL3 returns embeddings with shape (n_frames, embedding_size)
            # We'll take the mean across frames to get a single vector
            embedding, _ = openl3.get_audio_embedding(
                audio,
                sr,
                input_repr=self.input_repr,
                content_type=self.content_type,
                embedding_size=self.embedding_size,
                center=True,
                hop_size=0.1,
                verbose=False
            )
            
            # Average over time frames to get a single embedding vector
            if len(embedding.shape) > 1:
                embedding = np.mean(embedding, axis=0)
            
            return embedding.astype(np.float32)
            
        except Exception as e:
            print(f"Error extracting OpenL3 embedding from {wav_path}: {e}")
            # Return zero vector of correct size as fallback
            return np.zeros(self.embedding_size, dtype=np.float32)


In [None]:
# Install panns_inference for PANNs model support
# Uncomment the line below if panns_inference is not installed
%pip install panns_inference Pillow


Collecting panns_inference
  Downloading panns_inference-0.1.1-py3-none-any.whl.metadata (2.4 kB)
Collecting matplotlib (from panns_inference)
  Downloading matplotlib-3.10.7-cp311-cp311-win_amd64.whl.metadata (11 kB)
Collecting contourpy>=1.0.1 (from matplotlib->panns_inference)
  Downloading contourpy-1.3.3-cp311-cp311-win_amd64.whl.metadata (5.5 kB)
Collecting cycler>=0.10 (from matplotlib->panns_inference)
  Using cached cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib->panns_inference)
  Downloading fonttools-4.60.1-cp311-cp311-win_amd64.whl.metadata (114 kB)
Collecting kiwisolver>=1.3.1 (from matplotlib->panns_inference)
  Downloading kiwisolver-1.4.9-cp311-cp311-win_amd64.whl.metadata (6.4 kB)
Collecting pyparsing>=3 (from matplotlib->panns_inference)
  Using cached pyparsing-3.2.5-py3-none-any.whl.metadata (5.0 kB)
Downloading panns_inference-0.1.1-py3-none-any.whl (8.3 kB)
Downloading matplotlib-3.10.7-cp311-cp311-win_amd64.whl (8.

### 3.3 PANNs Embedding Model

PANNs (Pretrained Audio Neural Networks) are large-scale pretrained models for audio classification and feature extraction.


In [36]:
import panns_inference

ModuleNotFoundError: No module named 'PIL.PngImagePlugin'

In [27]:
class PANNsEmbedding:
    """PANNs audio embedding model wrapper."""
    
    def __init__(self):
        """Initialize PANNs model using torchaudio pretrained pipeline."""
        print("Loading PANNs model...")
        try:
            # Load PANNs pipeline from torchaudio
            self.pipeline = torchaudio.pipelines.PANNs_CNNonly()
            self.model = self.pipeline.get_model()
            self.model.eval()
            self.sample_rate = self.pipeline.sample_rate
            print(f"‚úÖ PANNs model loaded successfully (sample_rate={self.sample_rate})")
        except Exception as e:
            print(f"‚ö†Ô∏è  Error loading PANNs model: {e}")
            print("Attempting alternative loading method...")
            try:
                # Alternative: try the full PANNs model
                self.pipeline = torchaudio.pipelines.PANNs_AS20K()
                self.model = self.pipeline.get_model()
                self.model.eval()
                self.sample_rate = self.pipeline.sample_rate
                print(f"‚úÖ PANNs model loaded successfully (sample_rate={self.sample_rate})")
            except Exception as e2:
                print(f"‚ùå Failed to load PANNs model: {e2}")
                self.model = None
                self.sample_rate = 32000
    
    def _preprocess_audio(self, wav_path: str) -> torch.Tensor:
        """
        Preprocess audio file for PANNs.
        
        Parameters:
        -----------
        wav_path : str
            Path to WAV file
            
        Returns:
        --------
        torch.Tensor
            Preprocessed audio tensor
        """
        try:
            # Load audio
            waveform, sample_rate = torchaudio.load(wav_path)
            
            # Resample if necessary
            if sample_rate != self.sample_rate:
                resampler = torchaudio.transforms.Resample(sample_rate, self.sample_rate)
                waveform = resampler(waveform)
            
            # Convert to mono if stereo
            if waveform.shape[0] > 1:
                waveform = torch.mean(waveform, dim=0, keepdim=True)
            
            return waveform
            
        except Exception as e:
            print(f"Error preprocessing audio {wav_path}: {e}")
            return torch.zeros(1, self.sample_rate * 10)  # 10 seconds of zeros
    
    def get_embedding(self, wav_path: str) -> np.ndarray:
        """
        Extract embedding from audio file.
        
        Parameters:
        -----------
        wav_path : str
            Path to WAV file
            
        Returns:
        --------
        np.ndarray
            Audio embedding vector (2048D)
        """
        if self.model is None:
            print("PANNs model not loaded, returning zero vector")
            return np.zeros(2048, dtype=np.float32)
        
        try:
            # Preprocess audio
            waveform = self._preprocess_audio(wav_path)
            
            # Get embedding
            with torch.no_grad():
                # Forward pass through the model
                # PANNs model typically returns features from an intermediate layer
                # We need to extract the embedding layer output (before classification head)
                
                # Get the feature extractor part of the model
                # The model structure may vary, so we'll try to get embeddings
                # by accessing intermediate layers
                
                # Method 1: Try to get features directly if model supports it
                if hasattr(self.model, 'get_embedding'):
                    embedding = self.model.get_embedding(waveform)
                else:
                    # Method 2: Forward pass and extract intermediate features
                    # Most PANNs models have a feature extractor that outputs 2048D
                    features = self.model(waveform)
                    
                    # If features is a tuple, take the first element (embeddings)
                    if isinstance(features, tuple):
                        embedding = features[0]
                    else:
                        embedding = features
                    
                    # If the output is 2D (batch, features), take the first sample
                    if len(embedding.shape) > 1:
                        embedding = embedding[0] if embedding.shape[0] == 1 else embedding.mean(dim=0)
                
                # Convert to numpy
                if isinstance(embedding, torch.Tensor):
                    embedding = embedding.cpu().numpy()
                
                # Ensure correct dimensionality (2048D)
                if embedding.shape[0] != 2048:
                    if embedding.shape[0] < 2048:
                        embedding = np.pad(embedding, (0, 2048 - embedding.shape[0]))
                    else:
                        embedding = embedding[:2048]
                
                return embedding.astype(np.float32)
                
        except Exception as e:
            print(f"Error extracting PANNs embedding from {wav_path}: {e}")
            import traceback
            traceback.print_exc()
            return np.zeros(2048, dtype=np.float32)


#### 3.4 Audio Embedding Factory

Factory class to create and manage different embedding models.


In [28]:
class AudioEmbeddingFactory:
    """Factory class for creating audio embedding models."""
    
    _models = {
        "openl3": OpenL3Embedding,
        "panns": PANNsEmbedding
    }
    
    @classmethod
    def create_model(cls, model_type: str, **kwargs):
        """
        Create an audio embedding model instance.
        
        Parameters:
        -----------
        model_type : str
            Type of model: "openl3" or "panns"
        **kwargs
            Additional arguments to pass to the model constructor
            
        Returns:
        --------
        Embedding model instance
        """
        model_type = model_type.lower()
        
        if model_type not in cls._models:
            raise ValueError(
                f"Unknown model type: {model_type}. "
                f"Available models: {list(cls._models.keys())}"
            )
        
        model_class = cls._models[model_type]
        return model_class(**kwargs)
    
    @classmethod
    def get_available_models(cls):
        """Get list of available model types."""
        return list(cls._models.keys())
    
    @classmethod
    def get_embedding_dimension(cls, model_type: str) -> int:
        """
        Get the embedding dimension for a given model type.
        
        Parameters:
        -----------
        model_type : str
            Type of model
            
        Returns:
        --------
        int
            Embedding dimension
        """
        dimensions = {
            "openl3": 512,
                        "panns": 2048
        }
        return dimensions.get(model_type.lower(), 512)

# Test the factory
print("Available embedding models:")
print(AudioEmbeddingFactory.get_available_models())
print("\nEmbedding dimensions:")
for model_type in AudioEmbeddingFactory.get_available_models():
    dim = AudioEmbeddingFactory.get_embedding_dimension(model_type)
    print(f"  {model_type}: {dim}D")


Available embedding models:
['openl3', 'panns']

Embedding dimensions:
  openl3: 512D
  panns: 2048D



### SECTION 4 ‚Äî COMPUTE EMBEDDINGS FOR 10 RANDOM SONGS


We'll randomly sample 10 songs from the dataset and compute embeddings using both models.


#### 4.1 Sample 10 Random Songs

Randomly select 10 songs from the validated dataset.


In [23]:
df_valid.sample(2)

Unnamed: 0,musicbrainz_id,title,artist,artist_id,album,album_id,release_date,country,length,file_path
693,aafe72e2-9b95-41db-b1d5-599bc21ecdfe,She's Leaving Home,The Beatles,b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d,Sgt. Pepper Naked,e5f85542-4987-4c41-8b8d-36daf0a3d118,2005-01-01,US,216456.0,rec\aafe72e2-9b95-41db-b1d5-599bc21ecdfe.wav
56,0c1481db-8f8f-46ba-ba51-a238617c62b1,my hair,Ariana Grande,f4fdbb4c-e4b7-47a0-b83b-d91bbfcfa387,my hair,e4958a84-b691-4f90-a82b-880abc67aef6,2021-07-14,XW,197000.0,rec\0c1481db-8f8f-46ba-ba51-a238617c62b1.wav


In [29]:
# Sample 10 random songs from the validated dataset
N_SAMPLES = 10

if len(df_valid) >= N_SAMPLES:
    df_sample = df_valid.sample(n=N_SAMPLES, random_state=RANDOM_SEED).reset_index(drop=True)
    print(f"‚úÖ Sampled {len(df_sample)} songs from {len(df_valid)} total tracks")
    print("\nSampled tracks:")
    print(df_sample[['musicbrainz_id', 'file_path']].head(10))
else:
    print(f"‚ö†Ô∏è  Only {len(df_valid)} tracks available, using all of them")
    df_sample = df_valid.copy()
    print("\nUsing all available tracks:")
    print(df_sample[['musicbrainz_id', 'file_path']].head())


‚úÖ Sampled 10 songs from 1084 total tracks

Sampled tracks:
                         musicbrainz_id  \
0  aafe72e2-9b95-41db-b1d5-599bc21ecdfe   
1  0c1481db-8f8f-46ba-ba51-a238617c62b1   
2  42bdde78-259c-47eb-b69f-151e3b42bf2b   
3  f5e54962-d384-435f-8006-fbb872790c73   
4  2fedfd5e-d88e-458c-9890-0265e8c41a8c   
5  141c8038-ebe2-4aa3-88af-effc8621c7f9   
6  61e9d086-e9fb-4b32-8984-03eaaf0c9fbf   
7  6ff8a258-7649-4f36-bdf4-5bc990ed5d44   
8  4643da60-5491-4c21-8b9c-b2fae2829f4e   
9  8d206074-aec2-4357-87b9-5b2fd18811fb   

                                      file_path  
0  rec\aafe72e2-9b95-41db-b1d5-599bc21ecdfe.wav  
1  rec\0c1481db-8f8f-46ba-ba51-a238617c62b1.wav  
2  rec\42bdde78-259c-47eb-b69f-151e3b42bf2b.wav  
3  rec\f5e54962-d384-435f-8006-fbb872790c73.wav  
4  rec\2fedfd5e-d88e-458c-9890-0265e8c41a8c.wav  
5  rec\141c8038-ebe2-4aa3-88af-effc8621c7f9.wav  
6  rec\61e9d086-e9fb-4b32-8984-03eaaf0c9fbf.wav  
7  rec\6ff8a258-7649-4f36-bdf4-5bc990ed5d44.wav  
8  rec\4643da60

#### 4.2 Compute Embeddings for both models

Generate embeddings for the sampled songs using OpenL3 and PANNs models.


In [30]:
def compute_embeddings_for_model(
    df: pd.DataFrame,
    model_type: str,
    **model_kwargs
) -> Dict[str, np.ndarray]:
    """
    Compute embeddings for all tracks in dataframe using specified model.
    
    Parameters:
    -----------
    df : pd.DataFrame
        Dataframe with 'musicbrainz_id' and 'file_path' columns
    model_type : str
        Type of embedding model: "openl3" or "panns"
    **model_kwargs
        Additional arguments for model initialization
        
    Returns:
    --------
    Dict[str, np.ndarray]
        Dictionary mapping musicbrainz_id to embedding vector
    """
    print(f"\n{'='*60}")
    print(f"Computing {model_type.upper()} embeddings...")
    print(f"{'='*60}")
    
    # Create model instance
    model = AudioEmbeddingFactory.create_model(model_type, **model_kwargs)
    
    # Dictionary to store embeddings: {musicbrainz_id: embedding_vector}
    embeddings_dict = {}
    
    # Process each track
    for idx, row in tqdm(df.iterrows(), total=len(df), desc=f"Processing {model_type}"):
        musicbrainz_id = row['musicbrainz_id']
        wav_path = row['file_path']
        
        try:
            embedding = model.get_embedding(wav_path)
            embeddings_dict[musicbrainz_id] = embedding
        except Exception as e:
            print(f"\n‚ö†Ô∏è  Error processing {musicbrainz_id}: {e}")
            # Store zero vector as fallback
            embedding_dim = AudioEmbeddingFactory.get_embedding_dimension(model_type)
            embeddings_dict[musicbrainz_id] = np.zeros(embedding_dim, dtype=np.float32)
    
    print(f"‚úÖ Successfully computed {len(embeddings_dict)} {model_type.upper()} embeddings")
    return embeddings_dict

# Compute embeddings for both models
print("Starting embedding computation for all models...\n")

# OpenL3 embeddings
openl3_embeddings = compute_embeddings_for_model(df_sample, "openl3")

# PANNs embeddings
panns_embeddings = compute_embeddings_for_model(df_sample, "panns")

print("\n" + "="*60)
print("‚úÖ All embeddings computed successfully!")
print("="*60)


Starting embedding computation for all models...


Computing OPENL3 embeddings...
OpenL3Embedding initialized with input_repr=mel256, content_type=music, embedding_size=512


Processing openl3: 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 10/10 [20:50<00:00, 125.07s/it]


‚úÖ Successfully computed 10 OPENL3 embeddings

Computing PANNS embeddings...
Loading PANNs model...
‚ö†Ô∏è  Error loading PANNs model: module 'torchaudio.pipelines' has no attribute 'PANNs_CNNonly'
Attempting alternative loading method...
‚ùå Failed to load PANNs model: module 'torchaudio.pipelines' has no attribute 'PANNs_AS20K'


Processing panns: 100%|‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 10/10 [00:00<?, ?it/s]

PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
PANNs model not loaded, returning zero vector
‚úÖ Successfully computed 10 PANNS embeddings

‚úÖ All embeddings computed successfully!





## 4.3 Save Embeddings to Parquet Files

Convert embeddings dictionaries to DataFrames and save as Parquet files for efficient storage and retrieval.


In [37]:
def save_embeddings_to_parquet(
    embeddings_dict: Dict[str, np.ndarray],
    df_metadata: pd.DataFrame,
    output_path: str
) -> pd.DataFrame:
    """
    Convert embeddings dictionary to DataFrame and save as Parquet.
    
    Parameters:
    -----------
    embeddings_dict : Dict[str, np.ndarray]
        Dictionary mapping musicbrainz_id to embedding vector
    df_metadata : pd.DataFrame
        Dataframe with track metadata
    output_path : str
        Path to save Parquet file
        
    Returns:
    --------
    pd.DataFrame
        DataFrame with embeddings and metadata
    """
    # Create list of records
    records = []
    for musicbrainz_id, embedding in embeddings_dict.items():
        # Get metadata for this track
        track_meta = df_metadata[df_metadata['musicbrainz_id'] == musicbrainz_id].iloc[0].to_dict()
        
        # Create record with embedding as list (Parquet-friendly)
        record = track_meta.copy()
        record['embedding'] = embedding.tolist()
        records.append(record)
    
    # Create DataFrame
    df_embeddings = pd.DataFrame(records)
    
    # Save to Parquet
    df_embeddings.to_parquet(output_path, index=False)
    print(f"üíæ Saved {len(df_embeddings)} embeddings to {output_path}")
    print(f"   Embedding dimension: {len(embeddings_dict[list(embeddings_dict.keys())[0]])}D")
    
    return df_embeddings



In [38]:
# Save embeddings for each model
print("Saving embeddings to Parquet files...\n")

df_openl3 = save_embeddings_to_parquet(
    openl3_embeddings,
    df_sample,
    "openl3_embeddings.parquet"
)

Saving embeddings to Parquet files...

üíæ Saved 10 embeddings to openl3_embeddings.parquet
   Embedding dimension: 512D


In [40]:
df_openl3.head()

Unnamed: 0,musicbrainz_id,title,artist,artist_id,album,album_id,release_date,country,length,file_path,embedding
0,aafe72e2-9b95-41db-b1d5-599bc21ecdfe,She's Leaving Home,The Beatles,b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d,Sgt. Pepper Naked,e5f85542-4987-4c41-8b8d-36daf0a3d118,2005-01-01,US,216456.0,rec\aafe72e2-9b95-41db-b1d5-599bc21ecdfe.wav,"[2.6655681133270264, 1.8816463947296143, 2.359..."
1,0c1481db-8f8f-46ba-ba51-a238617c62b1,my hair,Ariana Grande,f4fdbb4c-e4b7-47a0-b83b-d91bbfcfa387,my hair,e4958a84-b691-4f90-a82b-880abc67aef6,2021-07-14,XW,197000.0,rec\0c1481db-8f8f-46ba-ba51-a238617c62b1.wav,"[2.2107479572296143, 2.256802797317505, 3.4405..."
2,42bdde78-259c-47eb-b69f-151e3b42bf2b,"Sunflower, Vol. 6",Harry Styles,7eb1ce54-a355-41f9-8d68-e018b096d427,Fine Line,4325c1d4-cef1-4650-9d09-57af47b23cf4,2022-02-02,XW,221826.0,rec\42bdde78-259c-47eb-b69f-151e3b42bf2b.wav,"[2.4756479263305664, 2.1465394496917725, 3.059..."
3,f5e54962-d384-435f-8006-fbb872790c73,I Hate U,SZA,272989c8-5535-492d-a25c-9f58803e027f,I Hate U,26073acb-74f1-4035-9f37-ef6951873f50,2021-12-03,XW,173933.0,rec\f5e54962-d384-435f-8006-fbb872790c73.wav,"[2.3763434886932373, 2.065270185470581, 2.7041..."
4,2fedfd5e-d88e-458c-9890-0265e8c41a8c,Mine,Taylor Swift,20244d07-534f-4eff-b4d4-930878889970,Taylor Swift Karaoke: Speak Now,6cdd62a6-db01-4902-9481-d6e27ce756d9,2010-12-20,US,241106.0,rec\2fedfd5e-d88e-458c-9890-0265e8c41a8c.wav,"[2.3198599815368652, 2.1622135639190674, 3.129..."


In [None]:

## NOT running due to inability of PANNs model to run
# df_panns = save_embeddings_to_parquet(
#     panns_embeddings,
#     df_sample,
#     "panns_embeddings.parquet"
# )

# print("\n‚úÖ All embeddings saved successfully!")
# print("\nSample of saved embeddings structure:")
# print(df_openl3.head())

#### 5.1 Recommendation System Class

Create a recommendation system using FAISS for fast similarity search.


In [41]:
class AudioRecommendationEngine:
    """FAISS-based recommendation engine for audio embeddings."""
    
    def __init__(self, embeddings_dict: Dict[str, np.ndarray], df_metadata: pd.DataFrame):
        """
        Initialize recommendation engine.
        
        Parameters:
        -----------
        embeddings_dict : Dict[str, np.ndarray]
            Dictionary mapping musicbrainz_id to embedding vector
        df_metadata : pd.DataFrame
            Dataframe with track metadata
        """
        self.embeddings_dict = embeddings_dict
        self.df_metadata = df_metadata
        self.musicbrainz_ids = list(embeddings_dict.keys())
        self.index = None
        self.embedding_dim = None
        self._build_index()
    
    def _build_index(self):
        """Build FAISS index from embeddings."""
        if len(self.embeddings_dict) == 0:
            raise ValueError("No embeddings provided")
        
        # Get embedding dimension
        first_embedding = list(self.embeddings_dict.values())[0]
        self.embedding_dim = len(first_embedding)
        
        # Convert embeddings to numpy array
        embeddings_array = np.array([self.embeddings_dict[tid] for tid in self.musicbrainz_ids], dtype=np.float32)
        
        # Normalize embeddings for cosine similarity
        faiss.normalize_L2(embeddings_array)
        
        # Create FAISS index (using Inner Product for cosine similarity after normalization)
        self.index = faiss.IndexFlatIP(self.embedding_dim)
        self.index.add(embeddings_array)
        
        print(f"‚úÖ Built FAISS index with {self.index.ntotal} vectors (dim={self.embedding_dim})")
    
    def get_recommendations(
        self,
        query_musicbrainz_id: str,
        top_k: int = 5,
        exclude_query: bool = True
    ) -> pd.DataFrame:
        """
        Get top-k similar tracks for a given query track.
        
        Parameters:
        -----------
        query_musicbrainz_id : str
            Track ID to find similar tracks for
        top_k : int
            Number of recommendations to return
        exclude_query : bool
            Whether to exclude the query track from results
            
        Returns:
        --------
        pd.DataFrame
            DataFrame with recommendations including musicbrainz_id, similarity score, and metadata
        """
        if query_musicbrainz_id not in self.embeddings_dict:
            raise ValueError(f"Track ID {query_musicbrainz_id} not found in embeddings")
        
        # Get query embedding
        query_embedding = self.embeddings_dict[query_musicbrainz_id].astype(np.float32).reshape(1, -1)
        faiss.normalize_L2(query_embedding)
        
        # Search for similar tracks
        k = top_k + 1 if exclude_query else top_k  # +1 to account for excluding query
        distances, indices = self.index.search(query_embedding, k)
        
        # Prepare results
        results = []
        for i, (distance, idx) in enumerate(zip(distances[0], indices[0])):
            if idx >= len(self.musicbrainz_ids):
                continue
            
            recommended_musicbrainz_id = self.musicbrainz_ids[idx]
            
            # Skip query track if exclude_query is True
            if exclude_query and recommended_musicbrainz_id == query_musicbrainz_id:
                continue
            
            # Get metadata
            track_meta = self.df_metadata[self.df_metadata['musicbrainz_id'] == recommended_musicbrainz_id]
            if len(track_meta) > 0:
                result = track_meta.iloc[0].to_dict()
                result['similarity_score'] = float(distance)
                result['rank'] = len(results) + 1
                results.append(result)
            
            if len(results) >= top_k:
                break
        
        return pd.DataFrame(results)
    
    def get_recommendations_from_embedding(
        self,
        query_embedding: np.ndarray,
        top_k: int = 5
    ) -> pd.DataFrame:
        """
        Get top-k similar tracks for a given embedding vector.
        
        Parameters:
        -----------
        query_embedding : np.ndarray
            Query embedding vector
        top_k : int
            Number of recommendations to return
            
        Returns:
        --------
        pd.DataFrame
            DataFrame with recommendations
        """
        query_embedding = query_embedding.astype(np.float32).reshape(1, -1)
        faiss.normalize_L2(query_embedding)
        
        # Search
        distances, indices = self.index.search(query_embedding, top_k)
        
        # Prepare results
        results = []
        for distance, idx in zip(distances[0], indices[0]):
            if idx >= len(self.musicbrainz_ids):
                continue
            
            recommended_musicbrainz_id = self.musicbrainz_ids[idx]
            track_meta = self.df_metadata[self.df_metadata['musicbrainz_id'] == recommended_musicbrainz_id]
            
            if len(track_meta) > 0:
                result = track_meta.iloc[0].to_dict()
                result['similarity_score'] = float(distance)
                result['rank'] = len(results) + 1
                results.append(result)
        
        return pd.DataFrame(results)

print("‚úÖ RecommendationEngine class defined")


‚úÖ RecommendationEngine class defined


In [42]:
# Build recommendation engines for each model
print("Building FAISS indices for all models...\n")

rec_engine_openl3 = AudioRecommendationEngine(openl3_embeddings, df_sample)
# rec_engine_panns = AudioRecommendationEngine(panns_embeddings, df_sample)

print("\n‚úÖ All recommendation engines built successfully!")


Building FAISS indices for all models...

‚úÖ Built FAISS index with 10 vectors (dim=512)

‚úÖ All recommendation engines built successfully!


#### 5.3 Generate Recommendations for All Sampled Tracks

For each of the 10 sampled tracks, get top-5 recommendations using each model.


In [46]:
def display_recommendations(
    query_musicbrainz_id: str,
    df_metadata: pd.DataFrame,
    rec_engines: Dict[str, AudioRecommendationEngine],
    top_k: int = 5
):
    """
    Display recommendations for a query track from all models.
    
    Parameters:
    -----------
    query_musicbrainz_id : str
        Track ID to get recommendations for
    df_metadata : pd.DataFrame
        Metadata dataframe
    rec_engines : Dict[str, AudioRecommendationEngine]
        Dictionary of recommendation engines by model name
    top_k : int
        Number of recommendations
    """
    # Get query track info
    query_info = df_metadata[df_metadata['musicbrainz_id'] == query_musicbrainz_id]
    if len(query_info) == 0:
        print(f"‚ö†Ô∏è  Track {query_musicbrainz_id} not found")
        return
    
    query_track = query_info.iloc[0]
    
    print("="*80)
    print(f"QUERY TRACK: {query_track.get('title', 'N/A')} by {query_track.get('artist', 'N/A')}")
    print(f"Track ID: {query_musicbrainz_id}")
    if 'genre' in query_track:
        print(f"Genre: {query_track['genre']}")
    print("="*80)
    
    # Get recommendations from each model
    for model_name, rec_engine in rec_engines.items():
        print(f"\nüìä {model_name.upper()} Recommendations (Top-{top_k}):")
        print("-" * 80)
        
        try:
            recommendations = rec_engine.get_recommendations(query_musicbrainz_id, top_k=top_k)
            
            if len(recommendations) == 0:
                print("  No recommendations found")
            else:
                for idx, row in recommendations.iterrows():
                    print(f"  {row['rank']}. {row.get('title', 'N/A')} by {row.get('artist', 'N/A')}")
                    print(f"     Track ID: {row['musicbrainz_id']}")
                    print(f"     Similarity: {row['similarity_score']:.4f}")
                    if 'genre' in row:
                        print(f"     Genre: {row['genre']}")
                    print()
        except Exception as e:
            print(f"  ‚ö†Ô∏è  Error getting recommendations: {e}")
    
    print("\n" + "="*80 + "\n")



In [47]:
# Generate recommendations for all sampled tracks
print("Generating recommendations for all sampled tracks...\n")

rec_engines = {
    "OpenL3": rec_engine_openl3
    # "PANNs": rec_engine_panns
}

for musicbrainz_id in df_sample['musicbrainz_id']:
    display_recommendations(musicbrainz_id, df_sample, rec_engines, top_k=5)
    print("-----"*20)

Generating recommendations for all sampled tracks...

QUERY TRACK: She's Leaving Home by The Beatles
Track ID: aafe72e2-9b95-41db-b1d5-599bc21ecdfe

üìä OPENL3 Recommendations (Top-5):
--------------------------------------------------------------------------------
  1. Love song by Lana Del Rey
     Track ID: 8d206074-aec2-4357-87b9-5b2fd18811fb
     Similarity: 0.9937

  2. I Hate U by SZA
     Track ID: f5e54962-d384-435f-8006-fbb872790c73
     Similarity: 0.9918

  3. Mine by Taylor Swift
     Track ID: 2fedfd5e-d88e-458c-9890-0265e8c41a8c
     Similarity: 0.9912

  4. Only Angel by Harry Styles
     Track ID: 6ff8a258-7649-4f36-bdf4-5bc990ed5d44
     Similarity: 0.9900

  5. Sunflower, Vol. 6 by Harry Styles
     Track ID: 42bdde78-259c-47eb-b69f-151e3b42bf2b
     Similarity: 0.9888



----------------------------------------------------------------------------------------------------
QUERY TRACK: my hair by Ariana Grande
Track ID: 0c1481db-8f8f-46ba-ba51-a238617c62b1

üìä OPENL

In [45]:
df_sample

Unnamed: 0,musicbrainz_id,title,artist,artist_id,album,album_id,release_date,country,length,file_path
0,aafe72e2-9b95-41db-b1d5-599bc21ecdfe,She's Leaving Home,The Beatles,b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d,Sgt. Pepper Naked,e5f85542-4987-4c41-8b8d-36daf0a3d118,2005-01-01,US,216456.0,rec\aafe72e2-9b95-41db-b1d5-599bc21ecdfe.wav
1,0c1481db-8f8f-46ba-ba51-a238617c62b1,my hair,Ariana Grande,f4fdbb4c-e4b7-47a0-b83b-d91bbfcfa387,my hair,e4958a84-b691-4f90-a82b-880abc67aef6,2021-07-14,XW,197000.0,rec\0c1481db-8f8f-46ba-ba51-a238617c62b1.wav
2,42bdde78-259c-47eb-b69f-151e3b42bf2b,"Sunflower, Vol. 6",Harry Styles,7eb1ce54-a355-41f9-8d68-e018b096d427,Fine Line,4325c1d4-cef1-4650-9d09-57af47b23cf4,2022-02-02,XW,221826.0,rec\42bdde78-259c-47eb-b69f-151e3b42bf2b.wav
3,f5e54962-d384-435f-8006-fbb872790c73,I Hate U,SZA,272989c8-5535-492d-a25c-9f58803e027f,I Hate U,26073acb-74f1-4035-9f37-ef6951873f50,2021-12-03,XW,173933.0,rec\f5e54962-d384-435f-8006-fbb872790c73.wav
4,2fedfd5e-d88e-458c-9890-0265e8c41a8c,Mine,Taylor Swift,20244d07-534f-4eff-b4d4-930878889970,Taylor Swift Karaoke: Speak Now,6cdd62a6-db01-4902-9481-d6e27ce756d9,2010-12-20,US,241106.0,rec\2fedfd5e-d88e-458c-9890-0265e8c41a8c.wav
5,141c8038-ebe2-4aa3-88af-effc8621c7f9,That‚Äôs My Kind of Night,Luke Bryan,aab35942-f176-4f77-bbf9-1d6aa98ccf3f,That's My Kind of Night,4c7736f2-9e6c-4937-a492-b6e0430f0bd0,2013-08-05,XW,190186.0,rec\141c8038-ebe2-4aa3-88af-effc8621c7f9.wav
6,61e9d086-e9fb-4b32-8984-03eaaf0c9fbf,Garden of Eden,Lady Gaga,650e7db6-b795-4eb5-a702-5ea2fc46c848,,,,,,rec\61e9d086-e9fb-4b32-8984-03eaaf0c9fbf.wav
7,6ff8a258-7649-4f36-bdf4-5bc990ed5d44,Only Angel,Harry Styles,7eb1ce54-a355-41f9-8d68-e018b096d427,Harry Styles,f037f8ba-5263-497a-b36c-f83bf4dc0817,2017-05-12,CA,291000.0,rec\6ff8a258-7649-4f36-bdf4-5bc990ed5d44.wav
8,4643da60-5491-4c21-8b9c-b2fae2829f4e,Arkansas,Chris Stapleton,71d58182-aa37-4c04-b21a-efe46ea0f221,Arkansas,07257089-3e93-4083-91df-c54d095f1107,2020-10-23,XW,178000.0,rec\4643da60-5491-4c21-8b9c-b2fae2829f4e.wav
9,8d206074-aec2-4357-87b9-5b2fd18811fb,Love song,Lana Del Rey,b7539c32-53e7-4908-bda3-81449c367da6,NFR!,5322431e-0e8b-417f-aebf-1b3c78ea11c0,2019-08-30,US,229000.0,rec\8d206074-aec2-4357-87b9-5b2fd18811fb.wav


#### 5.4 Summary Statistics

Analyze recommendation patterns across models.


In [44]:
# Collect all recommendations for analysis
all_recommendations = []

for musicbrainz_id in df_sample['musicbrainz_id']:
    for model_name, rec_engine in rec_engines.items():
        try:
            recs = rec_engine.get_recommendations(musicbrainz_id, top_k=5)
            recs['query_musicbrainz_id'] = musicbrainz_id
            recs['model'] = model_name
            all_recommendations.append(recs)
        except Exception as e:
            print(f"Error getting recommendations for {musicbrainz_id} with {model_name}: {e}")

if all_recommendations:
    df_all_recs = pd.concat(all_recommendations, ignore_index=True)
    
    print("üìà Recommendation Statistics:")
    print("="*60)
    print(f"Total recommendations generated: {len(df_all_recs)}")
    print(f"\nAverage similarity scores by model:")
    print(df_all_recs.groupby('model')['similarity_score'].agg(['mean', 'std', 'min', 'max']))
    
    print(f"\nRecommendations per model:")
    print(df_all_recs['model'].value_counts())
    
    # Check genre consistency (if genre column exists)
    if 'genre' in df_all_recs.columns:
        print(f"\nGenre diversity in recommendations:")
        genre_counts = df_all_recs.groupby('model')['genre'].nunique()
        print(genre_counts)
else:
    print("‚ö†Ô∏è  No recommendations generated")


üìà Recommendation Statistics:
Total recommendations generated: 50

Average similarity scores by model:
            mean       std       min       max
model                                         
OpenL3  0.994416  0.002827  0.985572  0.997532

Recommendations per model:
model
OpenL3    50
Name: count, dtype: int64



### SECTION 6 ‚Äî COMPARISON & INSIGHTS


Analysis and comparison of the three embedding models.


## 6.1 Model Characteristics

### OpenL3
- **Embedding Dimension**: 512D
- **Training**: Trained on audio-visual data (AudioSet + video)
- **Strengths**: 
  - Good at capturing general audio content
  - Works well for music classification
  - Fast inference
- **Use Cases**: 
  - General music similarity
  - Content-based filtering
  - Quick prototyping

### PANNs
- **Embedding Dimension**: 2048D
- **Training**: Large-scale pretraining on AudioSet (20K classes)
- **Strengths**:
  - Rich feature representation (higher dimensionality)
  - Excellent for fine-grained audio analysis
  - State-of-the-art on many audio tasks
- **Use Cases**:
  - Detailed audio similarity
  - When you need the most expressive features
  - Production systems with computational resources


## 6.2 Which Embedding Method Captures Timbre Better?

**Timbre** refers to the quality or "color" of sound that distinguishes different instruments or voices playing the same note.

### Analysis:

1. **PANNs** likely captures timbre best:
   - Highest dimensionality (2048D) allows for more nuanced feature representation
   - Trained on AudioSet with many instrument classes
   - Deep CNN architecture captures spectral-temporal patterns well

2. **OpenL3** provides general timbre features:
   - Good baseline for timbre similarity
   - May miss fine-grained distinctions
   - Faster but less detailed

### Recommendation:
For **timbre-focused** applications, use **PANNs** for the richest representation, .


## 6.3 Which Retrieves More Genre-Similar Songs?

**Genre similarity** refers to how well recommendations match the genre of the query track.

### Analysis:

1. **OpenL3** provides decent genre clustering:
   - Trained on AudioSet which has genre labels
   - Good at high-level music categorization
   - May group similar genres together

2. **PANNs** captures detailed features:
   - May focus on acoustic features rather than genre boundaries
   - Could retrieve songs with similar instrumentation but different genres
   - More fine-grained, potentially less genre-focused

### Recommendation:
For **genre-based** recommendations, use **OpenL3**.  in your metadata, while OpenL3 provides a good balance of speed and genre awareness.


## 6.4 Where Each Model May Be Used in the Final Recommendation Engine

### Hybrid Recommendation Strategy:

#### **Tier 1: Fast Filtering (OpenL3)**
- Use for initial candidate generation from large catalogs
- Quick similarity search to reduce search space
- Good for real-time recommendations
- **When to use**: 
  - Cold start scenarios
  - Large-scale filtering (millions of tracks)
  - Real-time recommendation APIs

#### **Tier 2: Fine-Grained Matching (PANNs)**
- Use for final ranking and detailed similarity
- Best for precision when you need exact matches
- Highest quality but slower
- **When to use**:
  - Final ranking stage
  - When computational resources allow
  - Precision-critical applications
  - Detailed audio analysis

### Recommended Pipeline:

```
1. User Query ‚Üí OpenL3 ‚Üí Get top 1000 candidates (fast)
2. Filter by metadata/genre ‚Üí Reduce to top 100
4. PANNs fine-grained ‚Üí Final top 10 recommendations
```

### Alternative: Ensemble Approach
- Combine embeddings from both models
- Weighted average or concatenation
- Train a meta-model to learn optimal combination
- **Best for**: Production systems with sufficient resources


## 6.5 Practical Considerations

### Computational Requirements:
- **OpenL3**: Fastest, lowest memory (~512D embeddings)
- **PANNs**: Slowest, highest memory (~2048D embeddings)

### Storage:
- **OpenL3**: ~2KB per track (512 floats)
- **PANNs**: ~8KB per track (2048 floats)

### Accuracy vs Speed Trade-off:
- **Speed priority**: Use OpenL3
- **Accuracy priority**: Use PANNs

### Final Recommendation:
For a **production recommendation engine**, consider:
1. **Start with OpenL3** for scalability
2. **Use PANNs** for premium/precision features
3. **Implement caching** for frequently accessed tracks
4. **Use FAISS** for efficient similarity search at scale



# SUMMARY


This notebook provides a complete implementation of:

‚úÖ **Two audio embedding models** (OpenL3, PANNs) using a factory pattern  
‚úÖ **Embedding extraction** for audio files  
‚úÖ **FAISS-based recommendation engine** for fast similarity search  
‚úÖ **Parquet storage** for efficient embedding persistence  
‚úÖ **Comprehensive comparison** and usage recommendations  

### Key Takeaways:

1. **OpenL3**: Best for fast, general-purpose music similarity
2. **PANNs**: Best for detailed, fine-grained audio analysis

### Next Steps:

- Scale to larger datasets
- Implement ensemble methods
- Add evaluation metrics (precision@k, recall@k)
- Deploy as a production API
- Fine-tune models on your specific music catalog

### Files Generated:

- `openl3_embeddings.parquet`: OpenL3 embeddings
- `panns_embeddings.parquet`: PANNs embeddings

All embeddings are stored with track metadata for easy retrieval and analysis.



# SECTION 5 ‚Äî SONG RECOMMENDATIONS


Build a recommendation system using the generated embeddings.


## 5.1 Recommendation System Class

Create a recommendation system using FAISS for fast similarity search.
