A machine-learning model designed to detect "drops" (high-energy payoffs) in musical tracks, with a specific optimization for Electronic Dance Music (EDM).
This project builds on the work of Yadati et al. (ISMIR 2014) on content-based drop detection. Using EDM-specific features and an XGBoost classifier, this implementation significantly improves detection accuracy:
| Model | F1 Score |
|---|---|
| Yadati et al. (2014) | 0.71 |
| Drop Detector (This Repo) | 0.95 |
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/edm-drop-detector.git
  cd edm-drop-detector
  ```

- Install Python dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Install FFmpeg: This tool relies on `ffmpeg` for audio processing.
  - Mac: `brew install ffmpeg`
  - Linux: `sudo apt install ffmpeg`
  - Windows: Download binary and add to PATH.
You can use the CLI tool `model_predict.py` to scan folders of audio and detect drops.

```bash
python model_predict.py [OPTIONS]
```

| Flag | Argument | Description |
|---|---|---|
| `-f, --folder` | PATH | Required. Path to the root directory containing audio files to scan. |
| `-m, --model` | PATH | Path to the `.joblib` model file. Defaults to `./model.joblib`. |
| `-T, --threshold` | FLOAT | Manual confidence threshold (0.0 - 1.0). Overrides the model's optimal default. |
| `-k, --topk` | INT | Output only the top K drops per track. If set without a threshold, ignores confidence scores. |
| `-c, --csv` | PATH | Save predictions to a CSV file. Default: `./model_predictions.csv`. |
| `-t, --tag` | N/A | Write drop times (e.g., `DROP_TIME=60.5,124.2`) into the audio file metadata. |
Scan the test folder, keep the top 3 drops per track with confidence above 90%, save the results to CSV, and tag the audio files themselves:

```bash
python model_predict.py \
  -f "/Users/Admin/Music/Download/test" \
  --csv \
  --topk 3 \
  --threshold 0.90 \
  --tag
```
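The predictions CSV can be consumed by any downstream tool. The snippet below is a minimal, hypothetical example that loads it with pandas and groups drops by track; the column names used here (`file`, `drop_time`) are assumptions and may not match the actual schema written by `model_predict.py`.

```python
# Hypothetical post-processing of model_predictions.csv.
# Column names ("file", "drop_time") are assumptions, not the tool's documented schema.
import pandas as pd

preds = pd.read_csv("model_predictions.csv")
for track, rows in preds.groupby("file"):
    times = ", ".join(f"{t:.1f}s" for t in rows["drop_time"])
    print(f"{track}: drops at {times}")
```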
If you wish to retrain the model on your own dataset:

- Place your raw audio files into the `dataset_train/` folder.
- Run the labeling assistant:

  ```bash
  python dataset_build.py
  ```

  This script will iterate through your tracks, propose candidates, and ask you to verify if they are true drops.

- Once labeled, process the data into a clean CSV format for the model:

  ```bash
  python dataset_clean.py
  ```

- Run the training pipeline:

  ```bash
  python model_build.py
  ```

  This will:
  - Extract features for all labeled candidates.
  - Run Bayesian Hyperparameter Optimization (Optuna).
  - Train an XGBoost classifier.
  - Output the final `model.joblib` file.
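After training, the saved artifact can be loaded for quick experiments. The sketch below assumes `model.joblib` holds a scikit-learn-compatible classifier and that each candidate is already expressed as a row of the 29 features in training order; if the artifact bundles anything else (such as the calibrated threshold), adapt accordingly.

```python
# Minimal sketch: load the trained model and score candidate feature rows.
# Assumes model.joblib is a scikit-learn-compatible classifier; the dummy matrix
# below only illustrates the expected shape (one row of 29 features per candidate).
import joblib
import numpy as np

model = joblib.load("model.joblib")
X_new = np.zeros((3, 29))                      # placeholder: 3 candidates x 29 features
drop_confidence = model.predict_proba(X_new)[:, 1]
print(drop_confidence)
```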
The system uses a three-stage pipeline: candidate generation, feature extraction, and classification.

Candidate generation: the code uses fast signal-processing heuristics to identify potential points of interest:
- Bass Boost & Envelope: Applies a low-shelf filter and scans the volume envelope for sharp energy rises.
- Transient Snapping: Mathematically aligns timestamps to the exact "kick" or transient.
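A rough sketch of these heuristics is shown below. It is illustrative only: it uses a simple low-pass filter as a stand-in for the low-shelf "bass boost", librosa's RMS envelope and onset detector, and thresholds that the real candidate generator may choose differently.

```python
# Illustrative candidate generator, not the repository's exact implementation.
# A low-pass filter stands in for the low-shelf "bass boost"; thresholds are guesses.
import numpy as np
import librosa
from scipy.signal import butter, sosfilt

def candidate_drops(path, sr=22050, hop=512, rise_db=6.0):
    y, _ = librosa.load(path, sr=sr, mono=True)

    # Bass boost & envelope: emphasize low frequencies, then track the RMS envelope in dB
    sos = butter(2, 150, btype="lowpass", fs=sr, output="sos")
    rms = librosa.feature.rms(y=sosfilt(sos, y), hop_length=hop)[0]
    db = librosa.amplitude_to_db(rms, ref=np.max)

    # Sharp frame-to-frame energy rises become candidate drop points
    rise = np.diff(db, prepend=db[0])
    frames = np.where(rise > rise_db)[0]
    times = librosa.frames_to_time(frames, sr=sr, hop_length=hop)

    # Transient snapping: move each candidate to the nearest detected onset (kick)
    onsets = librosa.onset.onset_detect(y=y, sr=sr, hop_length=hop, units="time")
    if len(onsets) and len(times):
        times = np.array([onsets[np.argmin(np.abs(onsets - t))] for t in times])
    return np.unique(np.round(times, 2))
```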
Feature extraction: for each candidate, the model extracts 29 context-aware features, comparing the audio after the impact against the build-up before it:
- RMS Energy: Does the volume increase?
- Future Energy: Is there a louder section coming up later?
- Grid Alignment: Does the candidate land on a significant 4, 8, 16, or 32-bar boundary?
- Pulse Clarity: Is there a strong, defined rhythmic pulse, or is the texture messy?
- Transient Dominance: Does the initial "kick" stand out compared to its surroundings?
- Bass Ratio: Is the sound spectrum suddenly dominated by low-frequency energy?
- Bass Continuity: Does the bassline sustain, or does it fade?
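To illustrate the before/after contrast idea, the sketch below computes a couple of these comparisons (RMS ratio and bass ratio) for a single candidate. The window length, cutoff frequency, and feature names are assumptions; the repository's 29 features are richer than this.

```python
# Illustrative before/after contrast features for one candidate; names and values are assumptions.
import numpy as np

def contrast_features(y, sr, drop_time, window_s=4.0):
    """Compare the audio just after a candidate drop with the build-up before it."""
    n, t = int(window_s * sr), int(drop_time * sr)
    before, after = y[max(0, t - n):t], y[t:t + n]

    def rms(x):
        return float(np.sqrt(np.mean(x ** 2) + 1e-12))

    def bass_ratio(x):
        # Share of spectral energy below ~150 Hz
        spec = np.abs(np.fft.rfft(x)) ** 2
        freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
        return float(spec[freqs < 150].sum() / (spec.sum() + 1e-12))

    return {
        "rms_ratio": rms(after) / rms(before),                 # does the volume increase?
        "bass_ratio_after": bass_ratio(after),                 # suddenly bass-dominated?
        "bass_ratio_delta": bass_ratio(after) - bass_ratio(before),
    }
```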
Classification: the core logic is handled by a gradient-boosted decision tree (XGBoost).
- Optuna Tuning: uses Bayesian optimization to search for the best-performing hyperparameter combination.
- Dynamic Thresholding: automatically calibrates the probability cutoff to maximize the F1 score.
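A condensed sketch of this training loop is shown below; the actual search space, validation split, and trial count in `model_build.py` may differ.

```python
# Condensed sketch of Optuna tuning + dynamic thresholding; search space and split are assumptions.
import numpy as np
import optuna
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

def train(X, y, n_trials=50):
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

    def objective(trial):
        params = {
            "n_estimators": trial.suggest_int("n_estimators", 100, 600),
            "max_depth": trial.suggest_int("max_depth", 3, 8),
            "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
            "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        }
        model = XGBClassifier(**params, eval_metric="logloss")
        model.fit(X_tr, y_tr)
        return f1_score(y_val, model.predict(X_val))

    study = optuna.create_study(direction="maximize")   # Bayesian (TPE) search
    study.optimize(objective, n_trials=n_trials)

    best = XGBClassifier(**study.best_params, eval_metric="logloss").fit(X_tr, y_tr)

    # Dynamic thresholding: pick the probability cutoff that maximizes F1 on validation data
    probs = best.predict_proba(X_val)[:, 1]
    cutoffs = np.linspace(0.05, 0.95, 91)
    f1s = [f1_score(y_val, (probs >= c).astype(int)) for c in cutoffs]
    return best, float(cutoffs[int(np.argmax(f1s))])
```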
- Tracks with irregular time signatures or a lack of percussion may yield fewer candidates.
- Extreme compression or sparse bass mixing can mask the specific contrast features the model looks for.
- "Fake drops" can trick the model if they are rhythmically similar to a real drop.