pitch-detection

Autocorrelation-based C++ pitch detection algorithms with O(n log n) or lower running time:

McLeod pitch method - 2005 paper - visualization
YIN(-FFT) - 2002 paper - visualization
Probabilistic YIN - 2014 paper
Probabilistic MPM - my own invention

The size of the FFT used is the same as the size of the input waveform, such that the output is a single pitch for the entire waveform.
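To make the "autocorrelation-based" idea concrete, here is a minimal sketch of a naive time-domain autocorrelation pitch estimate. It is not the library's implementation: the function name `naive_autocorrelation_pitch` is hypothetical, the running time is O(n^2) rather than the FFT-based O(n log n) described above, and the peak picking is far simpler than what MPM or YIN do. Like the library, it returns one pitch for the whole input buffer. Returning -1 for unpitched input is a convention chosen for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: naive O(n^2) autocorrelation pitch estimate.
// Returns the estimated pitch in Hz, or -1 if no periodicity is found
// (the -1 convention is an assumption of this sketch).
template <typename T>
T naive_autocorrelation_pitch(const std::vector<T>& audio, int sample_rate)
{
    assert(sample_rate > 0);
    const std::size_t n = audio.size();
    const std::size_t max_lag = n / 2;

    // r[lag] = sum over j of audio[j] * audio[j + lag]
    std::vector<T> r(max_lag + 1, T(0));
    for (std::size_t lag = 0; lag <= max_lag; ++lag) {
        for (std::size_t j = 0; j + lag < n; ++j) {
            r[lag] += audio[j] * audio[j + lag];
        }
    }

    // Skip the trivial peak at lag 0 by advancing to the first negative value.
    std::size_t lag = 0;
    while (lag <= max_lag && r[lag] >= T(0)) {
        ++lag;
    }
    if (lag > max_lag) {
        return T(-1); // never went negative: treat as unpitched
    }

    // The highest remaining peak marks the period in samples.
    std::size_t best = lag;
    for (; lag <= max_lag; ++lag) {
        if (r[lag] > r[best]) {
            best = lag;
        }
    }
    if (r[best] <= T(0)) {
        return T(-1);
    }
    return static_cast<T>(sample_rate) / static_cast<T>(best);
}
```

The nested loop is what an FFT-based autocorrelation replaces: computing all lags at once through a forward and inverse transform brings the cost down to O(n log n).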

Librosa (among other libraries) uses the STFT to create frames of the input waveform, and applies pitch tracking to each frame with a fixed FFT size (typically 2048 or some other power of two). If you want to track the temporal evolution of pitches in sub-sections of the waveform, you have to handle the waveform splitting yourself (look at wav_analyzer for more details).

📯 Latest news 📰

Dec 27, 2023 🎅 release:

Removed SWIPE' algorithm
- It is not based on autocorrelation, I skipped it in all of the tests, and my implementation was basically copy-pasted from kylebgorman/swipe: just use their code instead!
Fix autocorrelation (in YIN and MPM) for power-of-two sizes in FFTS (see ffts issue #65) by using r2c/c2r transforms (addresses bug #72 reported by jeychenne)
Fix PYIN bugs to pass all test cases (addresses jansommer's comments in pull-request #84)
Added many more unit tests, all passing (228/228)

Other programming languages

Go: Go implementation of YIN in this repo (for tutorial purposes)
Rust: Rust implementation of MPM in this repo (for tutorial purposes)
Python: transcribe is a Python version of MPM for a proof-of-concept of primitive pitch transcription
JavaScript (WebAssembly): pitchlite has WASM modules of MPM/YIN running at realtime speeds in the browser. It also introduces sub-chunk detection, which returns both the overall pitch of the chunk and the temporal sub-sequence of pitches within the chunk

Usage

Suggested usage of this library can be seen in the utility wav_analyzer, which divides a wav file into chunks of 0.01s and checks the pitch of each chunk. Basic usage on a single chunk:

std::vector<float> chunk; // chunk of audio

float pitch_mpm = pitch::mpm(chunk, sample_rate);
float pitch_yin = pitch::yin(chunk, sample_rate);
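The chunk-by-chunk approach described above can be sketched as follows. This is a hedged illustration, not the code of wav_analyzer: the helper name `pitch_per_chunk` is hypothetical, the detector is passed in as a callable so the sketch stays self-contained, and dropping a trailing partial chunk is an assumption made here for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): split `audio` into
// consecutive chunks of `chunk_seconds` and run `detector` on each one.
// `detector` is any callable taking (const std::vector<T>&, int sample_rate)
// and returning a pitch in Hz. A trailing partial chunk is dropped.
template <typename T, typename Detector>
std::vector<T> pitch_per_chunk(const std::vector<T>& audio, int sample_rate,
                               double chunk_seconds, Detector detector)
{
    // Round to the nearest whole number of samples per chunk.
    const auto chunk_size =
        static_cast<std::size_t>(sample_rate * chunk_seconds + 0.5);
    assert(chunk_size > 0);

    std::vector<T> pitches;
    for (std::size_t start = 0; start + chunk_size <= audio.size();
         start += chunk_size) {
        // Copy the current chunk into its own buffer for the detector.
        const std::vector<T> chunk(audio.data() + start,
                                   audio.data() + start + chunk_size);
        pitches.push_back(detector(chunk, sample_rate));
    }
    return pitches;
}
```

With the library available, the detector could be a lambda such as `[](const std::vector<float>& c, int sr) { return pitch::mpm<float>(c, sr); }`, called as `pitch_per_chunk(audio, 48000, 0.01, detector)`. Since every chunk has the same size, a reusable `pitch_alloc` object would avoid reallocating on each call.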

Tests

Unit tests

There are unit tests that use sinewaves (both generated with std::sin and with librosa.tone), and instrument tests using txt files containing waveform samples from the University of Iowa MIS recordings:

$ ./build/pitch_tests
Running main() from ./googletest/src/gtest_main.cc
[==========] Running 228 tests from 22 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from MpmSinewaveTestManualAllocFloat
[ RUN      ] MpmSinewaveTestManualAllocFloat.OneAllocMultipleFreqFromFile
[       OK ] MpmSinewaveTestManualAllocFloat.OneAllocMultipleFreqFromFile (38 ms)
...
[----------] 5 tests from YinInstrumentTestFloat
...
[ RUN      ] YinInstrumentTestFloat.Acoustic_E2_44100
[       OK ] YinInstrumentTestFloat.Acoustic_E2_44100 (1 ms)
[ RUN      ] YinInstrumentTestFloat.Classical_FSharp4_48000
[       OK ] YinInstrumentTestFloat.Classical_FSharp4_48000 (58 ms)
[----------] 5 tests from YinInstrumentTestFloat (174 ms total)
...
[----------] 5 tests from MpmInstrumentTestFloat
[ RUN      ] MpmInstrumentTestFloat.Violin_A4_44100
[       OK ] MpmInstrumentTestFloat.Violin_A4_44100 (61 ms)
[ RUN      ] MpmInstrumentTestFloat.Piano_B4_44100
[       OK ] MpmInstrumentTestFloat.Piano_B4_44100 (24 ms)

...
[==========] 228 tests from 22 test suites ran. (2095 ms total)
[  PASSED  ] 228 tests.
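For readers who want to reproduce the sine-wave style of test input mentioned above, a buffer can be generated with std::sin along these lines. This is a sketch under assumptions: `make_sinewave` is a hypothetical helper, not the repository's actual test fixture, and unit amplitude with zero phase is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical test helper (not the repository's fixture code):
// generate `num_samples` of a unit-amplitude, zero-phase sine wave.
template <typename T>
std::vector<T> make_sinewave(std::size_t num_samples, double frequency_hz,
                             int sample_rate)
{
    assert(sample_rate > 0);
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<T> samples(num_samples);
    for (std::size_t i = 0; i < num_samples; ++i) {
        const double t = static_cast<double>(i) / sample_rate; // seconds
        samples[i] = static_cast<T>(std::sin(two_pi * frequency_hz * t));
    }
    return samples;
}
```

A buffer built this way can be handed to `pitch::mpm<double>` or `pitch::yin<double>` and the result compared against `frequency_hz` within a tolerance of your choosing.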

Degraded audio tests

All testing files are here; the progressive degradations are described by the respective numbered JSON files, generated using audio-degradation-toolbox. The original clip is a viola playing E3 from the University of Iowa MIS. The results come from parsing the output of wav_analyzer to count how many 0.1s slices of the input clip were in the ballpark of the expected value of 164.81 Hz. I considered anything from 160 to 169 Hz acceptable:

Degradation level	MPM # correct	YIN # correct
0	26	22
1	23	21
2	19	21
3	18	19
4	19	19
5	18	19
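The scoring rule behind the table can be sketched as a simple band check. This is an illustration, not the script that produced the numbers above: `count_in_band` is a hypothetical helper, and treating both band edges as inclusive is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scoring helper: count how many per-slice pitch estimates
// fall inside the inclusive band [low_hz, high_hz].
template <typename T>
std::size_t count_in_band(const std::vector<T>& pitches, T low_hz, T high_hz)
{
    assert(low_hz <= high_hz);
    std::size_t hits = 0;
    for (const T pitch : pitches) {
        if (pitch >= low_hz && pitch <= high_hz) {
            ++hits;
        }
    }
    return hits;
}
```

For the viola E3 clip, the call would be `count_in_band(pitches, 160.0, 169.0)`, where `pitches` holds one estimate per slice.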

Build and install

You need Linux, CMake, and gcc (I don't officially support other platforms). The library depends on ffts and mlpack. The tests depend on libnyquist, googletest, and google benchmark.

Build and install with cmake:

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build "build"

# install to your system
make -C build install

# run tests and benches 
./build/pitch_tests
./build/pitch_bench

# run wav_analyzer
./build/wav_analyzer

Docker

To simplify the setup, there's a Dockerfile that sets up an Ubuntu container with all the dependencies for compiling the library and running the included tests and benchmarks:

# build
$ docker build --rm --pull -f "Dockerfile" -t pitchdetection:latest "."

# run
$ docker run --rm --init -it pitchdetection:latest

n.b. You can pull the esimkowitz/pitchdetection image from DockerHub, but I can't promise that it's up-to-date.

Detailed usage

Read the header and the example wav_analyzer program.

The namespaces are pitch and pitch_alloc. The functions and classes are templated for <double> and <float> support.

The pitch namespace functions perform automatic buffer allocation, while pitch_alloc::{Yin, Mpm} give you a reusable object (useful for computing pitch for multiple uniformly-sized buffers):

#include <pitch_detection.h>

#include <vector>

std::vector<double> audio_buffer(8192);

// pitch:: functions allocate their working buffers automatically
double pitch_yin = pitch::yin<double>(audio_buffer, 48000);
double pitch_mpm = pitch::mpm<double>(audio_buffer, 48000);
double pitch_pyin = pitch::pyin<double>(audio_buffer, 48000);
double pitch_pmpm = pitch::pmpm<double>(audio_buffer, 48000);

// pitch_alloc:: objects are constructed once for a buffer size and reused
pitch_alloc::Mpm<double> ma(8192);
pitch_alloc::Yin<double> ya(8192);

for (int i = 0; i < 10000; ++i) {
    auto pitch_yin = ya.pitch(audio_buffer, 48000);
    auto pitch_mpm = ma.pitch(audio_buffer, 48000);
    auto pitch_pyin = ya.probabilistic_pitch(audio_buffer, 48000);
    auto pitch_pmpm = ma.probabilistic_pitch(audio_buffer, 48000);
}
