# Song Genre Classification Demo (Colab)

This notebook demonstrates the key functionalities of the Song Genre Classification project in a Google Colab environment.

## 1. Setup and Dependencies

First, we need to clone the repository and install the required libraries from `requirements.txt`.

In [5]:
# Clone the repository (replace with your repo URL)
!git clone https://github.com/itsmskoff-byte/song-genre-classification.git
%cd song-genre-classification

Cloning into 'song-genre-classification'...
/content/song-genre-classification/song-genre-classification


In [9]:
# Install dependencies
!pip install -r requirements.txt
# Install Spleeter model (2stems model)
!spleeter install

ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'
/bin/bash: line 1: spleeter: command not found
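
The clone output above ends in a nested `song-genre-classification/song-genre-classification` path, which can happen when the clone cell is run more than once, and `pip` then did not find `requirements.txt` in the working directory. The following is a minimal sketch for checking where the file actually is before installing. The helper name `find_requirements` and its `max_depth` parameter are hypothetical and not part of the repository.

```python
from pathlib import Path


def find_requirements(start=".", filename="requirements.txt", max_depth=2):
    """Return the first `filename` found at or below `start`, or None.

    Hypothetical helper for this demo; it looks at most `max_depth` levels down.
    """
    start = Path(start).resolve()
    candidates = [start / filename]
    pattern = filename
    for _ in range(max_depth):
        pattern = "*/" + pattern
        candidates.extend(sorted(start.glob(pattern)))
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


req = find_requirements()
if req is None:
    print("requirements.txt not found; check the repository URL and the working directory.")
else:
    print(f"Found {req}; install with: pip install -r {req}")
```

If the file is reported one level down, changing into that directory (or passing the full path to `pip install -r`) should let the install cell proceed.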


## 2. Download Sample Data

Download a few sample audio files from different genres for demonstration.

In [7]:
# Create a directory for sample data
!mkdir -p sample_data/blues sample_data/classical

# Download sample audio files. The URLs below are placeholders: replace them with
# public links to small audio files, e.g. clips from the Free Music Archive (FMA).

# Example using wget (replace URLs)
# !wget -O sample_data/blues/sample_blues.mp3 https://example.com/sample_blues.mp3
# !wget -O sample_data/classical/sample_classical.wav https://example.com/sample_classical.wav

# Placeholder until real download URLs are filled in
print("Placeholder for downloading sample audio files. Please replace with actual download commands.")
# Optional: create dummy files to test the directory structure only.
# Note: these are plain text files, not valid audio, so an audio loader cannot decode them.
# !echo "dummy audio content" > sample_data/blues/dummy_blues.wav
# !echo "dummy audio content" > sample_data/classical/dummy_classical.wav

Placeholder for downloading sample audio files. Please replace with actual download commands.
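
When no public clips are at hand, a synthetic tone can stand in for real audio so that the later cells have a decodable file to work on. The sketch below writes a mono 16-bit WAV using only the standard library. The function name `write_sine_wav` and the file names are made up for illustration, and a pure tone carries no genre information, so it only checks that the pipeline runs.

```python
import math
import struct
import wave
from pathlib import Path


def write_sine_wav(path, freq_hz=440.0, duration_sec=5.0, sample_rate=22050, amplitude=0.3):
    """Write a mono 16-bit PCM sine tone to `path` and return the number of frames."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    n_frames = int(duration_sec * sample_rate)
    frames = bytearray()
    for n in range(n_frames):
        sample = amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
        frames += struct.pack("<h", int(sample * 32767))  # little-endian signed 16-bit
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return n_frames


# Hypothetical placeholder files, one per genre folder created above
write_sine_wav("sample_data/blues/tone_blues.wav", freq_hz=220.0)
write_sine_wav("sample_data/classical/tone_classical.wav", freq_hz=440.0)
```

To use one of these files in Section 3, point `sample_audio_path` at it instead of `sample_data/blues/sample_blues.mp3`.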


## 3. Audio Processing and Feature Extraction

This section uses the functions in `utils.py` to load audio, perform source separation, and extract features.

In [10]:
# Import utility functions (utils.py must be in the project root or otherwise importable)
import os

from IPython.display import Audio, display
from utils import load_audio, separate_stems, sliding_window, extract_classical_features, extract_mel_spectrogram

# Sample audio file path (replace with the actual path after downloading)
sample_audio_path = 'sample_data/blues/sample_blues.mp3'

# Defined up front so later cells can test `audio_data is not None` even if loading is skipped
audio_data, sample_rate = None, None

# Check that the sample file exists before processing
if not os.path.exists(sample_audio_path):
    print(f"Sample audio file not found at {sample_audio_path}. Please download sample data first.")
else:
    print(f"Processing sample audio: {sample_audio_path}")

    # --- Load Audio ---
    print("\n### Loading Audio")
    audio_data, sample_rate = load_audio(sample_audio_path)

    if audio_data is not None:
        print(f"Audio loaded successfully. Sample rate: {sample_rate}, Duration: {len(audio_data)/sample_rate:.2f} seconds")

        # --- Source Separation ---
        print("\n### Source Separation")
        separated_output_dir = 'demo_separated_stems'
        vocal_path, instrumental_path = separate_stems(sample_audio_path, output_dir=separated_output_dir)

        if vocal_path and instrumental_path:
            print(f"Stems saved to {separated_output_dir}")
            # Inline audio players for the two stems
            print("Vocal stem:")
            display(Audio(filename=vocal_path))
            print("Instrumental stem:")
            display(Audio(filename=instrumental_path))
        else:
            print("Source separation failed.")

        # --- Feature Extraction (Classical) ---
        print("\n### Classical Feature Extraction")
        # Use a small segment for demonstration
        window_size_sec = 3.0
        hop_length_sec = 3.0  # Same as the window size, so segments do not overlap
        segments = list(sliding_window(audio_data, sample_rate, window_size_sec, hop_length_sec))

        if segments:
            first_segment = segments[0]
            classical_features = extract_classical_features(first_segment, sample_rate)
            if classical_features is not None:
                print(f"Extracted {len(classical_features)} classical features from the first segment.")
                print("Sample Classical Features (first 5):", classical_features[:5])
            else:
                print("Classical feature extraction failed.")
        else:
            print("Audio is too short for segmentation.")

        # --- Feature Extraction (Deep Learning) ---
        print("\n### Deep Learning Feature Extraction")
        if segments:
            first_segment = segments[0]
            mel_spectrogram = extract_mel_spectrogram(first_segment, sample_rate)
            if mel_spectrogram is not None:
                print(f"Extracted Mel-Spectrogram with shape: {mel_spectrogram.shape}")
                # Rendering the spectrogram as an image would need matplotlib or similar;
                # for simplicity, print the shape and a small corner of the values
                print("Sample Mel-Spectrogram snippet (first 5x5 values):")
                print(mel_spectrogram[:5, :5])
            else:
                print("Deep learning feature extraction failed.")
        else:
            print("Audio is too short for segmentation.")

    else:
        print("Failed to load audio data.")

ModuleNotFoundError: No module named 'utils'
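
Since `utils.py` could not be imported here, the behaviour of `sliding_window` is not visible in this notebook. The sketch below shows one plausible reading of it, assuming the function yields fixed-length windows and drops an incomplete trailing window; the real implementation may differ (for example, it might pad the last window). The name `sliding_window_sketch` is hypothetical.

```python
def sliding_window_sketch(samples, sample_rate, window_size_sec, hop_length_sec):
    """Yield fixed-length windows of `samples`; an incomplete trailing window is dropped."""
    window = int(window_size_sec * sample_rate)
    hop = int(hop_length_sec * sample_rate)
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    # The last valid start index is len(samples) - window
    for start in range(0, len(samples) - window + 1, hop):
        yield samples[start:start + window]


# 10 seconds of silence at a toy sample rate of 100 Hz
signal = [0.0] * 1000
segments = list(sliding_window_sketch(signal, 100, 3.0, 1.5))
print(len(segments), len(segments[0]))  # 5 300
```

With a 3.0 s window and a 1.5 s hop, consecutive windows overlap by half, so a clip yields roughly twice as many segments as with the non-overlapping 3.0 s hop used in the cell above. Input shorter than one window yields no segments, which corresponds to the "too short for segmentation" branch.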

## 4. Load Pre-trained Models and Label Encoder

This section assumes that the models (`classical_model.joblib`, `deep_learning_model.h5`) and the label encoder (`label_encoder.pkl`) are available, e.g. from a training run or a pre-trained download.

In [None]:
import os
import pickle
import joblib
import tensorflow as tf
from sklearn.preprocessing import LabelEncoder

# Define paths (replace with actual paths if different)
classical_model_path = 'trained_models/classical_model.joblib'
deep_learning_model_path = 'trained_models/deep_learning_model.h5'
label_encoder_path = 'trained_models/label_encoder.pkl'

# --- Load Classical Model ---
print("\n### Loading Classical Model")
classical_model = None
if os.path.exists(classical_model_path):
    try:
        classical_model = joblib.load(classical_model_path)
        print("Classical model loaded successfully.")
    except Exception as e:
        print(f"Error loading classical model: {e}")
else:
    print(f"Classical model not found at {classical_model_path}. Please train it first or provide the correct path.")

# --- Load Deep Learning Model ---
print("\n### Loading Deep Learning Model")
deep_learning_model = None
if os.path.exists(deep_learning_model_path):
    try:
        # compile=False: the model is only used for inference here
        deep_learning_model = tf.keras.models.load_model(deep_learning_model_path, compile=False)
        print("Deep learning model loaded successfully.")
    except Exception as e:
        print(f"Error loading deep learning model: {e}")
else:
    print(f"Deep learning model not found at {deep_learning_model_path}. Please train it first or provide the correct path.")

# --- Load Label Encoder ---
print("\n### Loading Label Encoder")
label_encoder = None
if os.path.exists(label_encoder_path):
    try:
        with open(label_encoder_path, 'rb') as f:
            label_encoder = pickle.load(f)
        print("Label encoder loaded successfully.")
        print(f"Genres: {list(label_encoder.classes_)}")
    except Exception as e:
        print(f"Error loading label encoder: {e}")
else:
    print(f"Label encoder not found at {label_encoder_path}. Please train a model first to generate it or provide the correct path.")
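
The three loading steps above share one pattern: check that the file exists, try to load it, and fall back to `None` with a message. As a possible simplification, that pattern can be factored into a single helper. The names `load_if_exists` and `load_pickle` are hypothetical and do not come from the project.

```python
import os
import pickle


def load_if_exists(path, loader, description):
    """Return loader(path), or None if the file is missing or loading fails."""
    if not os.path.exists(path):
        print(f"{description} not found at {path}.")
        return None
    try:
        artifact = loader(path)
    except Exception as exc:  # report and continue so the demo keeps running
        print(f"Error loading {description}: {exc}")
        return None
    print(f"{description} loaded successfully.")
    return artifact


def load_pickle(path):
    """Unpickle a single object from `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


encoder_demo = load_if_exists("trained_models/label_encoder.pkl", load_pickle, "Label encoder")
```

The other two artifacts would fit the same call by passing `joblib.load`, or `lambda p: tf.keras.models.load_model(p, compile=False)`, as the `loader` argument.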

## 5. Perform Inference

Use the loaded models and the prediction logic (similar to `predict.py`) to classify the sample audio file.

In [None]:
import numpy as np

# Assumes sample_audio_path, audio_data, and sample_rate are available from Section 3,
# and classical_model, deep_learning_model, and label_encoder from Section 4

# Define segment parameters (must match training)
WINDOW_SIZE_SEC = 3.0
HOP_LENGTH_SEC = 1.5
TARGET_SR = 22050
N_MELS = 128

def predict_song(audio_data, sample_rate, model, model_type, feature_type, label_encoder):
    """
    Performs prediction on a full audio data array.
    Simplified prediction logic for the demo notebook.
    """
    segment_predictions = []

    # Get input shape for deep learning model if applicable
    input_shape = None
    if model_type == 'deep_learning' and model:
        try:
            if hasattr(model, 'input_shape') and len(model.input_shape) > 1:
                input_shape = model.input_shape[1:]
            elif model.layers and hasattr(model.layers[0], 'input_shape') and len(model.layers[0].input_shape) > 1:  # For Functional API models
                input_shape = model.layers[0].input_shape[0][1:]
            else:
                print("Warning: Could not determine input shape from deep learning model.")
        except Exception as e:
            print(f"Error determining deep learning input shape: {e}")

    # Apply sliding window and collect segment predictions
    segments_generator = sliding_window(audio_data, sample_rate, WINDOW_SIZE_SEC, HOP_LENGTH_SEC)
    for i, segment in enumerate(segments_generator):
        try:
            if feature_type == 'classical':
                features = extract_classical_features(segment, sample_rate)
                if features is None:
                    print(f"Warning: Feature extraction failed for segment {i+1}.")
                    continue
                features = features.reshape(1, -1)

            elif feature_type == 'deep_learning':
                features = extract_mel_spectrogram(segment, sample_rate, n_mels=N_MELS)
                if features is None:
                    print(f"Warning: Feature extraction failed for segment {i+1}.")
                    continue
                features = np.expand_dims(features, axis=-1)
                features = np.expand_dims(features, axis=0)

                if input_shape is not None and features.shape[1:] != input_shape:
                    print(f"Warning: Extracted feature shape {features.shape[1:]} does not match expected input shape {input_shape} for segment {i+1}. Skipping.")
                    continue

            else:
                print(f"Error: Unsupported feature type: {feature_type}")
                return None, None

            # Perform prediction on the segment
            if model_type == 'classical':
                if hasattr(model, 'predict_proba'):
                    prediction_probs = model.predict_proba(features)
                else:
                    print("Warning: Classical model lacks predict_proba. Using predict.")
                    # Fallback: predict class and create a one-hot probability array
                    predicted_class_idx = model.predict(features)[0]
                    num_classes = len(label_encoder.classes_)
                    prediction_probs = np.zeros((1, num_classes))
                    prediction_probs[0, predicted_class_idx] = 1.0

            elif model_type == 'deep_learning':
                prediction_probs = model.predict(features)

            else:
                print(f"Error: Unsupported model type: {model_type}")
                return None, None

            segment_predictions.append(prediction_probs[0])
        except Exception as e:
            print(f"Error processing segment {i+1}: {e}")
            continue

    # Aggregate predictions
    if not segment_predictions:
        print("No successful segment predictions.")
        return None, None

    all_predictions = np.vstack(segment_predictions)
    average_probabilities = np.mean(all_predictions, axis=0)
    predicted_class_index = np.argmax(average_probabilities)
    predicted_genre = label_encoder.inverse_transform([predicted_class_index])[0]

    return predicted_genre, average_probabilities

# --- Perform Prediction with Classical Model ---
print("\n### Prediction with Classical Model")
if classical_model and label_encoder and audio_data is not None:
    print("Classifying sample audio using Classical Model...")
    predicted_genre_classical, probs_classical = predict_song(audio_data, sample_rate, classical_model, 'classical', 'classical', label_encoder)

    if predicted_genre_classical:
        print(f"Classical Model Predicted Genre: {predicted_genre_classical}")
        print("Probabilities:")
        genre_probs = list(zip(label_encoder.classes_, probs_classical))
        genre_probs_sorted = sorted(genre_probs, key=lambda item: item[1], reverse=True)
        for genre, prob in genre_probs_sorted:
            print(f"  {genre}: {prob:.4f}")
    else:
        print("Classical model prediction failed.")
elif audio_data is None:
    print("Audio data not loaded. Cannot run classical prediction demo.")
else:
    print("Classical model or label encoder not loaded. Cannot run classical prediction demo.")

# --- Perform Prediction with Deep Learning Model ---
print("\n### Prediction with Deep Learning Model")
# Note: Ensure the deep learning model was trained with deep learning features (mel-spectrograms)
if deep_learning_model and label_encoder and audio_data is not None:
    print("Classifying sample audio using Deep Learning Model...")
    # predict_song reads the expected input shape from the model itself
    predicted_genre_dl, probs_dl = predict_song(audio_data, sample_rate, deep_learning_model, 'deep_learning', 'deep_learning', label_encoder)

    if predicted_genre_dl:
        print(f"Deep Learning Model Predicted Genre: {predicted_genre_dl}")
        print("Probabilities:")
        genre_probs = list(zip(label_encoder.classes_, probs_dl))
        genre_probs_sorted = sorted(genre_probs, key=lambda item: item[1], reverse=True)
        for genre, prob in genre_probs_sorted:
            print(f"  {genre}: {prob:.4f}")
    else:
        print("Deep learning model prediction failed.")
elif audio_data is None:
    print("Audio data not loaded. Cannot run deep learning prediction demo.")
else:
    print("Deep learning model or label encoder not loaded. Cannot run deep learning prediction demo.")
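
The aggregation step at the end of `predict_song` averages the per-segment probability vectors and picks the class with the highest mean. The sketch below isolates that step in plain Python so it runs without any trained model; the notebook itself does the same thing with `np.mean(..., axis=0)` and `np.argmax`. The function name and the numbers are made up for illustration.

```python
def aggregate_segment_probabilities(segment_probs, class_names):
    """Average per-segment probability vectors and return (best_class, averaged_probs)."""
    if not segment_probs:
        return None, None
    n_classes = len(class_names)
    if any(len(p) != n_classes for p in segment_probs):
        raise ValueError("every segment needs one probability per class")
    # Column-wise mean over all segments
    averaged = [sum(p[k] for p in segment_probs) / len(segment_probs) for k in range(n_classes)]
    # Index of the largest mean; ties resolve to the first index
    best = max(range(n_classes), key=averaged.__getitem__)
    return class_names[best], averaged


# Made-up probabilities for three segments and two genres
genre, probs = aggregate_segment_probabilities(
    [[0.9, 0.1], [0.4, 0.6], [0.8, 0.2]],
    ["blues", "classical"],
)
print(genre, [round(p, 2) for p in probs])  # blues [0.7, 0.3]
```

In this example the second segment on its own favours "classical", but averaging over all segments still gives "blues", which is the point of classifying the song from many windows rather than from one.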

## 6. Running `train.py` and `app.py` in Colab

While this notebook demonstrates key components, the full training and web application are typically run as separate scripts. Here's how you would execute them in Colab:

### Running `train.py`

You would first need to prepare your data as described in the `README.md` and the data preparation steps (Sections 2 & 3 of this notebook demonstrate parts of this). Assuming you have processed data (e.g., `processed_features.pkl`), you can train a model:

In [None]:
# Example command to train a classical model
# Replace 'path/to/your/processed_data.pkl' with the actual path
# Ensure you have prepared data before running this.
# !python train.py --model_type classical --data_path path/to/your/processed_data.pkl --save_dir trained_models

# Example command to train a deep learning model
# Replace 'path/to/your/processed_data_mel.npy' with the actual path to mel-spectrogram data
# Ensure you have prepared data before running this.
# !python train.py --model_type deep_learning --data_path path/to/your/processed_data_mel.npy --save_dir trained_models

print("Uncomment the above lines to run training commands in Colab.")
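
The example commands pass three flags: `--model_type`, `--data_path`, and `--save_dir`. `train.py` itself is not shown in this notebook, so the parser below is only a sketch of how a script could accept those flags. The allowed choices, the required/optional split, and the default for `--save_dir` are assumptions.

```python
import argparse


def build_train_parser():
    """Parser mirroring the flags in the example commands (assumed, not copied from train.py)."""
    parser = argparse.ArgumentParser(description="Train a genre classifier (sketch).")
    parser.add_argument("--model_type", choices=["classical", "deep_learning"], required=True)
    parser.add_argument("--data_path", required=True, help="Path to the processed data file")
    parser.add_argument("--save_dir", default="trained_models", help="Directory for saved models")
    return parser


# Parse an explicit argument list instead of sys.argv so this runs inside a notebook
args = build_train_parser().parse_args(
    ["--model_type", "classical", "--data_path", "path/to/your/processed_data.pkl"]
)
print(args.model_type, args.data_path, args.save_dir)
```

Running `!python train.py --help` in the cloned repository shows the flags the actual script accepts, if it is built on `argparse`.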

### Running `app.py` (Streamlit Web App)

To run the Streamlit app in Colab, you typically need to use `ngrok` or a similar service to expose the local server.

In [None]:
# Install ngrok
!pip install ngrok

# Run Streamlit app and expose it via ngrok
# This will output a public URL to access the app.
# Note: This will block the notebook execution until the app is stopped.
# !streamlit run app.py & npx ngrok http 8501 --log=stdout > ngrok.log &
# import time
# time.sleep(5) # Give ngrok time to start
# The public hostname suffix depends on the ngrok version and plan, so match any ngrok domain
# !grep -o 'https://[^ ]*\.ngrok[^ ]*' ngrok.log || echo "ngrok URL not found. Check ngrok.log"

**Note:** Running `!streamlit run app.py` directly in a Colab cell might not display the app correctly within the notebook output, because the app is served on a port of the Colab machine. A tunnelling service such as `ngrok` is a common way to reach Streamlit apps hosted in Colab.
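
A fixed `time.sleep(5)` can be too short on a slow start and needlessly long otherwise. As an alternative, the sketch below polls the local port until something accepts a TCP connection. It only checks the Streamlit server on the Colab machine (port 8501, as in the command above), not the tunnel; the helper name `wait_for_port` is hypothetical.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout_sec=30.0, poll_interval_sec=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=poll_interval_sec):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and retry
            time.sleep(poll_interval_sec)
    return False


# A short timeout keeps this demo quick; use a longer one after launching the app
if wait_for_port(8501, timeout_sec=2.0):
    print("Streamlit is accepting connections on port 8501.")
else:
    print("Nothing is listening on port 8501 yet.")
```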