🎵 Audio Classifier

A binary classification model that labels 10-second audio segments from full songs as either "good" (1) or "bad" (0) for use in audio-based applications like guessing games, highlight extraction, or clip curation.


🔍 Overview

This project powers automated clip selection for Bollyguess, improving the quality of daily audio snippets by predicting which segments are likely to be familiar yet challenging.

The model is trained on manually labeled segments and extracts audio features such as MFCC, chroma, tempo, and pitch using Librosa. It uses a feedforward neural network built in PyTorch, and is evaluated with standard classification metrics.


🛠 Tech Stack

  • Python
  • PyTorch – model architecture and training
  • TensorFlow – experimentation support
  • Librosa – feature extraction (MFCC, chroma, tempo, pitch)
  • FFmpeg – audio slicing and preprocessing
  • Scikit-learn – evaluation (confusion matrix, F1-score)
  • Pandas

🧠 Model Architecture

  • 4 hidden layers with ReLU activation
  • Final layer uses Sigmoid for binary classification
  • Trained on labeled 10-second audio clips (0 = not suitable, 1 = suitable)
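The architecture above can be sketched in PyTorch. The input feature dimension and hidden-layer widths below are assumptions for illustration; the actual definition lives in `model_trainer/model.py` and may differ.

```python
import torch
import torch.nn as nn

class AudioClassifier(nn.Module):
    """Feedforward binary classifier: 4 hidden ReLU layers, sigmoid output.

    The input dimension (27) and hidden widths are illustrative guesses,
    not the values used in model_trainer/model.py.
    """

    def __init__(self, in_features: int = 27):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, 1), nn.Sigmoid(),  # probability the clip is "good"
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = AudioClassifier()
probs = model(torch.randn(4, 27))  # batch of 4 feature vectors
```

At inference time, thresholding `probs` at 0.5 yields the 0/1 label used downstream.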

⚙️ Pipeline

  1. Audio Preprocessing
    Full songs are sliced into 10-second segments using FFmpeg.

  2. Feature Extraction
    Each segment is converted to a feature vector using Librosa:

    • MFCCs
    • Chroma
    • Tempo
    • Pitch

  3. Model Training
    A binary classifier is trained on the extracted features using PyTorch.

  4. Evaluation
    Model performance is analyzed using:

    • F1-score
    • Confusion matrix

  5. Deployment (Optional)
    The classifier can be integrated into apps for automated segment selection.


📊 Example Use Case

Used in Bollyguess to select ideal audio clips for a daily Bollywood music guessing game.


🚀 Getting Started

  1. Clone the repo

    git clone https://github.com/omn25/audio-classifier.git
    cd audio-classifier

  2. Install dependencies

    pip install -r requirements.txt

  3. Run preprocessing

    python model_trainer/build_dataset.py --input songs/ --output clips/

  4. Extract features

    python model_trainer/utils.py --input clips/ --output features.csv

  5. Train the model

    python model_trainer/train.py --features features.csv

🧪 Sample Output

  • Segment: 00:50–01:00 → Predicted: 1
  • Segment: 02:15–02:25 → Predicted: 0
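The F1-score and confusion matrix from the pipeline's evaluation step can be computed with scikit-learn. The label arrays below are made up for illustration:

```python
from sklearn.metrics import f1_score, confusion_matrix

# Hypothetical ground-truth labels and model predictions for six segments.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1]

f1 = f1_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)  # rows: true class, cols: predicted class
```

Here one "good" segment is missed and one "bad" segment slips through, giving an F1 of 2/3.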

📂 Project Structure

audio-classifier/
│
├── model_trainer/
│   ├── build_dataset.py
│   ├── model.py
│   ├── test_adaptive_segments.py
│   ├── test_model.py
│   ├── train.py
│   └── utils.py
│
├── songs/                  # Optional: raw input songs (if used)
├── test_songs/             # Sample songs for testing predictions
│   ├── abhi_na_jao_chod_kar.mp3
│   └── ghar_more_pardesiya.mp3

📄 License

MIT License


🙋‍♂️ Contact

Built by Om Nathwani
Email: ornathwa@uwaterloo.ca
GitHub: omn25
