pfa

This is an implementation of landmark-based Audio FingerPrint. Also the baseline model used in my thesis, "Improvement of Neural Network- and Landmark-based Audio Fingerprinting". See https://github.com/stdio2016/pfann for more information.

To recognize a music, you need to build a fingerprint database, then search from this database. In this repo, I refer to the songs as music, because audio fingerprint shouldn't be limited to songs with vocals, it should be able to search instrumental music as well.

Requirements

C++ programs (builder, matcher, getlm and finddup):

OpenMP enabled C++ compiler
Make
(if you want to compile finddup) MPI (Message Passing Interface)

Python programs (in tools/ and adversary/ folder):

pydub
matplotlib
scipy
miniaudio
soundfile
numpy
FFMPEG

Building

On linux, just type make all to compile builder, matcher and getlm.

On Windows, you need to compile with MinGW, or run the following commands in Visual Studio command prompt:

cl /EHsc /O2 /openmp builder.cpp Landmark.cpp Database.cpp lib/WavReader.cpp lib/Timing.cpp lib/ReadAudio.cpp lib/BmpReader.cpp lib/Signal.cpp lib/utils.cpp lib/Sound.cpp PeakFinder.cpp PeakFinderDejavu.cpp Analyzer.cpp
cl /EHsc /O2 /openmp matcher.cpp Landmark.cpp Database.cpp lib/WavReader.cpp lib/Timing.cpp lib/ReadAudio.cpp lib/BmpReader.cpp lib/Signal.cpp lib/utils.cpp lib/Sound.cpp .\PeakFinder.cpp .\PeakFinderDejavu.cpp Analyzer.cpp
cl /EHsc /O2 /openmp getlm.cpp Landmark.cpp Database.cpp lib/WavReader.cpp lib/Timing.cpp lib/ReadAudio.cpp lib/BmpReader.cpp lib/Signal.cpp lib/utils.cpp lib/Sound.cpp PeakFinder.cpp PeakFinderDejavu.cpp Analyzer.cpp

Note: I have only tested on Visual Studio 2015.

On Mac, you might need to type:

make CXXFLAGS='-std=c++11 -O3 -Xpreprocessor -fopenmp -I /opt/homebrew/include' LIBS=/opt/homebrew/lib/libomp.dylib all

To compile finddup, please type make finddup. MPI is required. Currently only tested on Linux.

Usage

Build a fingerprint database

./builder <music list file> <output database location>

Music list file is a file containing list of music file paths. File must be UTF-8 without BOM. For example:

/path/to/fma_medium/000/000002.mp3
/path/to/fma_medium/000/000005.mp3
/path/to/your/music/aaa.wav
/path/to/your/music/bbb.wav

This program supports both MP3 and WAV audio format. Relative paths are supported but not recommended.

Recognize music

./matcher <query list> <database location> <result file>

Query list is a file containing list of query file paths. For example:

/path/to/queries/out2_snr2/000002.wav
/path/to/queries/out2_snr2/000005.wav
/path/to/song_recorded_on_street1.wav
/path/to/song_recorded_on_street2.wav

Database location is the place where builder saves database.

The result file will be a TSV file with 2 fields: query file path, and matched music path, but without header. It may look like this:

/path/to/queries/out2_snr2/000002.wav	/path/to/fma_medium/000/000002.mp3
/path/to/queries/out2_snr2/000005.wav	/path/to/fma_medium/000/000005.mp3
/path/to/song_recorded_on_street1.wav	/path/to/your/music/aaa.wav
/path/to/song_recorded_on_street2.wav	/path/to/your/music/aaa.wav

Matcher will also generate a CSV file and a BIN file. CSV file contains more information about the matches. It has 5 columns: query, answer, score, time, and part_scores.

query: Query file path
answer: Matched music path
score: Matching score, used in my thesis
time: The time when the query clip starts in the matched music, in seconds
part_scores: Mainly used for debugging, currently empty

BIN file contains matching scores of every database music for each query. It is used in my ensemble experiments. The file format is a flattened 2D array of following structure, without header:

struct match_t {
  float score; // Matching score
  float offset; // The time when the query clip starts in the matched music, in seconds
};

The matching score of j-th database music in i-th query is at index i * database size + j.

getlm

getlm is a utility for landmark extraction. Usage:

./getlm <audio file> <output> [REPEAT_COUNT]

REPEAT_COUNT is for my experiment, "對查詢做多次時間位移," in section 4.5.3 of my thesis. REPEAT_COUNT is called N in that section. (TODO: translate my thesis)

Accourding to my thesis, REPEAT_COUNT is 1 when creating database, then REPEAT_COUNT is 1, 2, or 4 when querying.

Output file will be a binary file without header, with extension .lm. Its format is an array of the following struct:

struct Landmark {
  int32_t time1;
  int32_t freq1;
  float energy1;
  int32_t time2;
  int32_t freq2;
  float energy2;
};

If REPEAT_COUNT is set to peaks, then it outputs array of peaks instead.

struct Peak {
  int32_t time;
  int32_t freq;
  float energy;
};

finddup

finddup is a utility for finding possibly duplicate music. It uses landmark-based audio fingerprint, the same technique as this repo. It works by splitting every music to 10-second clips, then query those clips in database. It is designed to be run on a supercomputer, so it uses both OpenMP and MPI. Usage:

mpiexec [mpi params] ./finddup $peak_file_list $database_dir $result_file

You need to first preprocess music:

Build database using ./builder $music_list $database_dir.
For each $music in $music_list, extract peaks using ./getlm $music $out_peak_file peaks.
Save each $out_peak_file name to a file $peak_file_list.

The program will output to ${result_file}-pid-${pid}, for each process $pid. They are CSV files with 32 fields, without header. Each row corresponds to a clip of music.

Field 1: music id (0-based)
Field 2: position of clip in that music, in seconds
Field 3n: n-th best music match
Field 3n + 1: landmark match count
Field 3n + 2: the time when the query clip starts in the matched music

The program only outputs top-10 matches to save space.

Name		Name	Last commit message	Last commit date
Latest commit History 88 Commits
adversary		adversary
lib		lib
tools		tools
.gitignore		.gitignore
Analyzer.cpp		Analyzer.cpp
Analyzer.hpp		Analyzer.hpp
Database.cpp		Database.cpp
Database.hpp		Database.hpp
Landmark.cpp		Landmark.cpp
Landmark.hpp		Landmark.hpp
Makefile		Makefile
PeakFinder.cpp		PeakFinder.cpp
PeakFinder.hpp		PeakFinder.hpp
PeakFinderDejavu.cpp		PeakFinderDejavu.cpp
PeakFinderDejavu.hpp		PeakFinderDejavu.hpp
PitchDatabase.cpp		PitchDatabase.cpp
PitchDatabase.hpp		PitchDatabase.hpp
README.md		README.md
builder.cpp		builder.cpp
finddup.cpp		finddup.cpp
getlm.cpp		getlm.cpp
matchServer.cpp		matchServer.cpp
matcher.cpp		matcher.cpp
qbshServer.cpp		qbshServer.cpp

stdio2016/pfa

Folders and files

Latest commit

History

Repository files navigation

pfa

Requirements

Building

Usage

Build a fingerprint database

Recognize music

getlm

finddup

About

Topics

Resources

Stars

Watchers

Forks

Languages