🎵 Audio Classifier

A binary classification model that labels 10-second audio segments from full songs as either "good" (1) or "bad" (0) for use in audio-based applications like guessing games, highlight extraction, or clip curation.

🔍 Overview

This project powers automated clip selection for Bollyguess, improving the quality of daily audio snippets by predicting which segments are likely to be familiar yet challenging.

Manually labeled segments are converted to audio features (MFCCs, chroma, tempo, and pitch) with Librosa. A feedforward neural network built in PyTorch is trained on those features and evaluated with F1-score and a confusion matrix.

🛠 Tech Stack

Python
PyTorch – model architecture and training
TensorFlow – experimentation support
Librosa – feature extraction (MFCC, chroma, tempo, pitch)
FFmpeg – audio slicing and preprocessing
Scikit-learn – evaluation (confusion matrix, F1-score)
Pandas

🧠 Model Architecture

4 hidden layers with ReLU activation
Final layer uses Sigmoid for binary classification
Trained on labeled 10-second audio clips (0 = not suitable, 1 = suitable)
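
The description above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the code in `model.py`: the function name `build_classifier`, the hidden-layer widths, and the input size of 27 are all assumptions chosen for the example.

```python
import torch
from torch import nn


def build_classifier(n_features: int, hidden=(128, 64, 32, 16)) -> nn.Sequential:
    """Feedforward net: four ReLU hidden layers, then one sigmoid output unit.

    The widths in `hidden` are assumed values for illustration only.
    """
    layers = []
    in_dim = n_features
    for width in hidden:
        layers += [nn.Linear(in_dim, width), nn.ReLU()]
        in_dim = width
    # One output squashed into (0, 1): the predicted probability of label 1.
    layers += [nn.Linear(in_dim, 1), nn.Sigmoid()]
    return nn.Sequential(*layers)


model = build_classifier(n_features=27)   # 27 is an assumed feature count
batch = torch.randn(8, 27)                # 8 hypothetical feature vectors
probs = model(batch)                      # shape (8, 1), values between 0 and 1
```

A sigmoid output pairs naturally with binary cross-entropy as the training loss, and a segment would typically be labeled 1 when its probability crosses a chosen threshold.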

⚙️ Pipeline

Audio Preprocessing
Full songs are sliced into 10-second segments using FFmpeg.
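
A slicing step of this kind might look like the sketch below. It is not taken from `build_dataset.py`: the helper names, the mono WAV output, and the file-naming scheme are assumptions, and the song duration is expected to be supplied by the caller (for example from `ffprobe`). Only standard FFmpeg options (`-ss`, `-t`, `-i`, `-vn`, `-ac`, `-y`) are used.

```python
import subprocess
from pathlib import Path

CLIP_SECONDS = 10


def clip_starts(song_seconds: float, clip_seconds: int = CLIP_SECONDS) -> list[int]:
    """Start offsets of the non-overlapping, full-length clips that fit in the song."""
    return list(range(0, int(song_seconds) - clip_seconds + 1, clip_seconds))


def ffmpeg_cmd(src: Path, start: int, dst: Path, clip_seconds: int = CLIP_SECONDS) -> list[str]:
    """Build an ffmpeg call that writes one mono clip starting at `start` seconds."""
    return [
        "ffmpeg", "-y",             # overwrite an existing output file
        "-ss", str(start),          # seek to the clip start
        "-t", str(clip_seconds),    # read this many seconds
        "-i", str(src),
        "-vn",                      # drop any embedded cover-art video stream
        "-ac", "1",                 # downmix to mono
        str(dst),
    ]


def slice_song(src: Path, out_dir: Path, song_seconds: float) -> list[Path]:
    """Run ffmpeg once per clip and return the paths that were written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for start in clip_starts(song_seconds):
        dst = out_dir / f"{src.stem}_{start:04d}.wav"   # hypothetical naming scheme
        subprocess.run(ffmpeg_cmd(src, start, dst), check=True)
        written.append(dst)
    return written
```

A trailing remainder shorter than 10 seconds is dropped here, so every clip has the same length.
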
Feature Extraction
Each segment is converted to a feature vector using Librosa:
- MFCCs
- Chroma
- Tempo
- Pitch
Model Training
A binary classifier is trained on the extracted features using PyTorch.
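
A training loop for such a classifier could be sketched as below. It is not the contents of `train.py`: the random placeholder data, the layer widths, binary cross-entropy loss, the Adam optimizer, the learning rate, and full-batch updates are all assumptions made to keep the example self-contained.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Placeholder data standing in for real feature vectors and 0/1 labels (assumption).
X = torch.randn(64, 27)
y = (X[:, 0] > 0).float().unsqueeze(1)        # toy labels, shape (64, 1)

# Four ReLU hidden layers followed by a sigmoid output; widths are illustrative.
widths = [27, 64, 32, 16, 8]
layers = []
for n_in, n_out in zip(widths, widths[1:]):
    layers += [nn.Linear(n_in, n_out), nn.ReLU()]
model = nn.Sequential(*layers, nn.Linear(widths[-1], 1), nn.Sigmoid())

loss_fn = nn.BCELoss()                        # binary cross-entropy on probabilities
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for epoch in range(50):
    optimizer.zero_grad()                     # clear gradients from the previous step
    loss = loss_fn(model(X), y)
    loss.backward()                           # backpropagate
    optimizer.step()                          # update weights
    losses.append(loss.item())
```

With real data, a held-out validation split would be needed so that the evaluation metrics are not computed on the training clips.
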
Evaluation
Model performance is analyzed using:
- F1-score
- Confusion matrix
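
To make these two metrics concrete, here is a dependency-free sketch of what they compute for binary labels. scikit-learn's `confusion_matrix` and `f1_score` report the same quantities; the helper names and the example labels below are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels 0/1."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1                 # good clip, predicted good
        elif truth == 0 and pred == 0:
            tn += 1                 # bad clip, predicted bad
        elif truth == 0 and pred == 1:
            fp += 1                 # bad clip wrongly accepted
        else:
            fn += 1                 # good clip wrongly rejected
    return tn, fp, fn, tp


def f1_binary(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    _, fp, fn, tp = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels for eight segments.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
print([[tn, fp], [fn, tp]])          # [[3, 1], [1, 3]], same layout as scikit-learn
print(f1_binary(y_true, y_pred))     # 0.75
```
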
Deployment (Optional)
The classifier can be integrated into apps for automated segment selection.

📊 Example Use Case

The classifier is used in Bollyguess, a daily Bollywood music guessing game, to select which audio clips to serve.

🚀 Getting Started

Clone the repo

git clone https://github.com/omn25/audio-classifier.git
cd audio-classifier

Install dependencies
```
pip install -r requirements.txt
```

Run preprocessing

python model_trainer/build_dataset.py --input songs/ --output clips/

Extract features

python model_trainer/utils.py --input clips/ --output features.csv

Train the model

python model_trainer/train.py --features features.csv

🧪 Sample Output

Segment: 00:50–01:00 → Predicted: 1
Segment: 02:15–02:25 → Predicted: 0
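
Lines in this format could be produced from the model's sigmoid output as in the sketch below. The helper names, the 0.5 decision threshold, and the example probabilities are assumptions for illustration.

```python
def mmss(seconds: int) -> str:
    """Format a whole number of seconds as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def format_prediction(start: int, prob: float,
                      clip_seconds: int = 10, threshold: float = 0.5) -> str:
    """Turn a clip start offset and a predicted probability into a report line."""
    label = int(prob >= threshold)   # 1 = suitable, 0 = not suitable
    return f"Segment: {mmss(start)}–{mmss(start + clip_seconds)} → Predicted: {label}"


print(format_prediction(50, 0.91))    # Segment: 00:50–01:00 → Predicted: 1
print(format_prediction(135, 0.12))   # Segment: 02:15–02:25 → Predicted: 0
```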

📂 Project Structure

audio-classifier/
│
├── model_trainer/
│   ├── build_dataset.py
│   ├── model.py
│   ├── test_adaptive_segments.py
│   ├── test_model.py
│   ├── train.py
│   └── utils.py
│
├── songs/                  # Optional: raw input songs (if used)
├── test_songs/            # Sample songs for testing predictions
│   ├── abhi_na_jao_chod_kar.mp3
│   └── ghar_more_pardesiya.mp3

📄 License

MIT License

🙋‍♂️ Contact

Built by Om Nathwani
Email: ornathwa@uwaterloo.ca
GitHub: omn25
