hbredin edited this page Jan 3, 2012 · 6 revisions

What is Audio Fingerprinting

Audio fingerprinting denotes a set of techniques to perform audio identification. The latter covers the detection and the identification of an audio excerpt (a music track, an advertisement, a jingle ...) in an audio recording (either a short excerpt or a broadcast stream).

While audio watermarking relies on embedding meta-information within the very audio signal to be processed, audio fingerprinting (sometimes called audio hashing) is based on the detection of audio occurrences through the recognition of signature codes extracted from short snippets of the signal.

The design of such signature codes must jointly satisfy several constraints:

  • Robustness: the representation must be as invariant as possible with regard to typical audio distortions, such as:
    • Noise addition
    • Transmission distortions (channel filtering, analog/digital conversion...)
    • Time-scale change, with the resulting pitch shift
    • Amplitude changes, including dynamic range compression
    • Typical audio encodings, e.g. MPEG encoding, Real Audio, WMA, or even GSM...
    • Temporal shift between the reference track and the search cue
  • Compactness: the complexity of searching for new codes among the database codes is directly related to the dimension of the codes. The signature code must therefore be as compact as possible.
  • Discrimination: however, a compact code implies a narrower range of values, so codes from different tracks get closer and harder to discriminate. The signature code must therefore strike an acceptable compromise between compactness and discrimination.
  • Computability: finally, the codes must be easy and quick to compute, so that any audio query can be processed live.
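To make these constraints concrete, here is a minimal sketch of a toy signature code. It is not the method of any particular system, and certainly not part of PyAFE: the function names (`frame_energies`, `toy_signature`, `hamming_distance`) and the sample values are assumptions made up for illustration. The code keeps one bit per pair of consecutive frames (does the energy rise or not?), which makes it compact and unaffected by a constant gain, at the price of poor discrimination.

```python
def frame_energies(samples, frame_size):
    """Sum of squared samples for each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def toy_signature(samples, frame_size=4):
    """One bit per frame transition: 1 if the energy rises, else 0."""
    energies = frame_energies(samples, frame_size)
    return [1 if b > a else 0 for a, b in zip(energies, energies[1:])]


def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))


# Hypothetical signal: a quiet frame, a loud one, a quiet one, a louder one.
signal = [0.1, -0.1, 0.1, -0.1,
          0.8, -0.7, 0.9, -0.8,
          0.2, -0.1, 0.1, -0.2,
          1.0, -0.9, 1.0, -1.0]
quieter = [0.5 * s for s in signal]  # amplitude change: constant gain of 0.5

reference_code = toy_signature(signal)
distorted_code = toy_signature(quieter)
print(reference_code, hamming_distance(reference_code, distorted_code))
```

Sixteen samples are reduced to three bits, and the gain change leaves the code untouched. Many unrelated signals would map to the same three bits, though, which is exactly the compactness versus discrimination compromise described above.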

An audio fingerprinting method classically combines two key elements: the design of the signature code, described above, and an efficient search strategy to retrieve an unknown code within the database. The search strategy is essential to the scalability of the algorithm, i.e. its ability to scale up to very large databases containing several million reference tracks. Scalability does have a computational aspect, but the main bottleneck generally lies in the handling of multiple memory and hard drive accesses. Another key constraint on complexity is that audio identification systems are often applied to the live monitoring of audio streams, so real-time operation is required. Of course, both issues must be addressed while preserving the accuracy (the performance) of the system.
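One common way to organise such a search is an inverted index combined with a vote on the time shift between query and reference. The sketch below illustrates that general idea only; it is not taken from PyAFE or from any specific system, and the names `build_index` and `identify`, the track identifiers and the integer codes are all hypothetical. It also keeps everything in memory, so it says nothing about the disk-access bottleneck mentioned above.

```python
from collections import Counter, defaultdict


def build_index(reference_codes):
    """Map each code to the (track_id, position) pairs where it occurs."""
    index = defaultdict(list)
    for track_id, codes in reference_codes.items():
        for position, code in enumerate(codes):
            index[code].append((track_id, position))
    return index


def identify(index, query_codes):
    """Vote for (track_id, time shift) pairs; return the best one or None."""
    votes = Counter()
    for query_position, code in enumerate(query_codes):
        # .get() avoids creating empty entries for unknown codes.
        for track_id, position in index.get(code, ()):
            votes[(track_id, position - query_position)] += 1
    if not votes:
        return None
    (track_id, shift), count = votes.most_common(1)[0]
    return track_id, shift, count


# Hypothetical database: a sequence of integer codes per reference track.
references = {
    "track_a": [12, 7, 33, 7, 91, 45, 60],
    "track_b": [5, 33, 18, 7, 2, 77, 14],
}
index = build_index(references)

# Query cut from track_a starting at position 2, with one corrupted code (99).
query = [33, 7, 99, 45]
result = identify(index, query)
print(result)
```

Each query code costs one dictionary lookup, whatever the number of reference tracks, and the vote on the shift tolerates both the corrupted code and the unknown temporal offset between the reference track and the search cue.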

While the complexity aspect is not considered here, the PyAFE toolkit provides experimenters with a consistent framework for the evaluation of any audio fingerprinting system.

PyAFE toolkit

The PyAFE toolkit was developed within the Evaluation work package of the Quaero project and is made freely available as open-source software. It is designed as a modular piece of software, so that it can easily be extended in the future:

  • two modules provide the functions needed to parse groundtruth and detection XML files.
  • the core module includes the actual implementation of the audio fingerprinting score metrics. It also computes the number of correct detections, misses and false alarms.

The PyAFE toolkit also includes an all-in-one command-line evaluation tool, which provides an easy-to-use, straightforward way of getting evaluation results. It is as simple as typing:

python full_eval.py --groundtruth=/path/to/groundtruth/files --submission=/path/to/detection/files
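For readers unfamiliar with the three counts mentioned above, the following sketch shows one possible way to derive them. It does not reproduce PyAFE's code, API or matching rules: the function `count_outcomes`, the tuple layouts and the rule "a detection is correct if it names the right track and falls inside a not-yet-matched groundtruth interval" are assumptions chosen for illustration.

```python
def count_outcomes(groundtruth, detections):
    """Return (correct, misses, false_alarms).

    groundtruth: list of (track_id, start, end) tuples, times in seconds.
    detections:  list of (track_id, time) tuples, times in seconds.
    """
    matched = set()  # indices of groundtruth events already detected
    false_alarms = 0
    for track_id, time in detections:
        for i, (gt_id, start, end) in enumerate(groundtruth):
            if i not in matched and gt_id == track_id and start <= time <= end:
                matched.add(i)
                break
        else:
            # No unmatched groundtruth event explains this detection.
            false_alarms += 1
    correct = len(matched)
    misses = len(groundtruth) - correct
    return correct, misses, false_alarms


# Hypothetical data, made up for this example.
groundtruth = [
    ("jingle_1", 10.0, 15.0),
    ("song_7", 40.0, 220.0),
    ("ad_3", 300.0, 330.0),
]
detections = [
    ("jingle_1", 11.2),  # inside the jingle_1 interval: correct
    ("song_7", 35.0),    # right track, wrong time: false alarm
    ("ad_3", 310.5),     # inside the ad_3 interval: correct
    ("ad_9", 500.0),     # track absent from the groundtruth: false alarm
]
counts = count_outcomes(groundtruth, detections)
print(counts)
```

Under these assumed rules, the example yields two correct detections, one miss (song_7 is never found inside its interval) and two false alarms.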

Documentation can be found below.


Clone the GitHub repository for the current development version

Release notes

  • Version 0.5.2 - 26/01/2011

    • Date/time format has changed. Microseconds are now mandatory.
    • Updated sample data accordingly.
  • Version 0.5.1 - 14/12/2010

    • Updated sample data
  • Version 0.5.0 - 14/12/2010

    • First public release
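The 0.5.2 note above makes microseconds mandatory in date/time values. The exact format expected by PyAFE is not given on this page, so the layout below (`ASSUMED_FORMAT`) and the helper `parse_timestamp` are purely hypothetical; check the sample data for the real format. The sketch only shows how a mandatory fractional part can be enforced with the standard `datetime` module.

```python
from datetime import datetime

# Assumed layout, for illustration only; not confirmed by the PyAFE page.
ASSUMED_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_timestamp(text):
    """Parse a timestamp; raise ValueError if the fractional part is missing."""
    return datetime.strptime(text, ASSUMED_FORMAT)


stamp = parse_timestamp("2011-01-26 14:03:07.250000")

try:
    parse_timestamp("2011-01-26 14:03:07")  # no microseconds
    accepted = True
except ValueError:
    accepted = False

print(stamp.microsecond, accepted)
```

Because the format string ends with `.%f`, a value without a fractional part is rejected instead of being silently read as zero microseconds.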



Quaero Music Identification Database

Available soon (last update: 2011-01-11)