# 🎼 Audio Chord Estimation (ACE) with Madmom

Madmom is an audio signal processing library written in Python with a strong focus on music information retrieval (MIR) tasks.

It includes pretrained models for chord estimation, among other things.
The ACE models implemented in Madmom were originally proposed in the following [paper](https://archives.ismir.net/ismir2016/paper/000178.pdf):
```
Filip Korzeniowski and Gerhard Widmer, “Feature Learning for Chord Recognition: The Deep Chroma Extractor”, Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), 2016.
```
In this notebook, we show how to use Madmom to estimate the chords of the audio files in the dataset and how to evaluate the performance of the models implemented in the library.

For more information about Madmom, please visit the [official website](https://madmom.readthedocs.io/en/v0.16.1/index.html).

## 🪛 Setup and installation

Madmom has some compatibility issues, mainly related to the NumPy/SciPy version.
This notebook was tested with Python `3.10.12` and madmom `0.16.1`. If you run into any issues,
please make sure the correct versions of madmom and NumPy are installed.

Furthermore, this notebook applies a few workarounds to make madmom usable on Python 3.10, the version of Python currently distributed in Google Colab.

In [10]:
# python version check
!python --version

Python 3.10.12


**!!!Session restart required after installing libraries!!!**

Please click on `RESTART SESSION` once the installation process has finished so that the pinned NumPy version takes effect.

In [1]:
# install libraries
!pip install numpy==1.23.5
!pip install madmom==0.16.1
!pip install mir_eval



In [2]:
# patch madmom's source so that it can be imported on Python 3.10,
# where the abstract base classes are only available from collections.abc
!sed -i "s/from collections import/from collections.abc import/" /usr/local/lib/python3.10/dist-packages/madmom/processors.py
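
The `sed` command above hard-codes the Colab `dist-packages` path. As a hedged alternative, the same substitution could be applied from Python by locating the installed package first. This is a minimal sketch: the helper names `patch_collections_import` and `patch_madmom_processors` are hypothetical and not part of madmom, and it assumes `processors.py` sits directly inside the installed `madmom` package, as the path in the `sed` command suggests.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def patch_collections_import(source: str) -> str:
    """Apply the same textual substitution as the sed command above."""
    return source.replace("from collections import", "from collections.abc import")


def patch_madmom_processors() -> Optional[Path]:
    """Patch madmom/processors.py in place.

    Returns the patched file's path, or None if madmom (or the file) is not found.
    """
    # find_spec locates the package on disk without importing it,
    # which matters because the unpatched import fails on Python 3.10
    spec = importlib.util.find_spec("madmom")
    if spec is None or spec.origin is None:
        return None
    # assumption: processors.py lives next to the package's __init__.py
    target = Path(spec.origin).parent / "processors.py"
    if not target.exists():
        return None
    original = target.read_text()
    patched = patch_collections_import(original)
    # only rewrite the file when the substitution changed something
    if patched != original:
        target.write_text(patched)
    return target
```

Calling `patch_madmom_processors()` in a cell after installation would then play the role of the `sed` line. The substitution is idempotent, so running it twice leaves the file unchanged.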

## 💽 Load audio files
The notebook looks for audio files uploaded to the main directory of Google Colab with one of the following extensions: `.wav`, `.mp3`, `.mp4`, `.flac`.

If more than one audio file is uploaded, only the first one found is processed, given the showcase purpose of this notebook.

In [3]:
# imports
import os

In [4]:
audio_extensions = [".mp3", ".mp4", ".wav", ".flac"]
audio_files = [file for file in os.listdir('./') if file.endswith(tuple(audio_extensions))]

assert len(audio_files) > 0, "No audio files uploaded in the main `content` folder."

file_path = audio_files[0]
f"Processing audio file: {file_path}"

'Processing audio file: Chains of Tomorrow - Beat it - Michael Jackson.mp3'
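
`os.listdir` returns entries in arbitrary order and `str.endswith` is case-sensitive, so which file counts as the first one can vary between sessions, and a file named `song.MP3` would be skipped. Below is a minimal sketch of a more predictable lookup, assuming alphabetical order is an acceptable way to pick the file. The helper name `find_audio_files` is hypothetical.

```python
from pathlib import Path

# same extensions as listed above, compared in lower case
AUDIO_EXTENSIONS = {".wav", ".mp3", ".mp4", ".flac"}


def find_audio_files(directory="."):
    """Return the names of the audio files in `directory`, sorted alphabetically."""
    return sorted(
        path.name
        for path in Path(directory).iterdir()
        # skip sub-directories and match the extension case-insensitively
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )
```

With this helper, `find_audio_files("./")[0]` would select the alphabetically first audio file on every run.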

## 1 Chord recognition from Deep Chroma Features

Recognise major and minor chords from deep chroma vectors using a Conditional Random Field.

In [5]:
# imports
from madmom.audio.chroma import DeepChromaProcessor  # type: ignore
from madmom.features.chords import DeepChromaChordRecognitionProcessor  # type: ignore

In [6]:
# initialise the DeepChromaProcessor to extract the chroma vectors from audio
dcp = DeepChromaProcessor()
# create a DeepChromaChordRecognitionProcessor to decode a chord sequence from the extracted chromas
decode = DeepChromaChordRecognitionProcessor()

In [8]:
# get chord predictions
chroma = dcp(file_path)
chords = decode(chroma)
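
The decoder returns a sequence of `(start, end, label)` segments, with times in seconds. A common way to store such segments is a whitespace-delimited `.lab` file with one segment per line, the kind of labelled-interval file that evaluation tools such as `mir_eval` work with. The sketch below shows one way to produce that text. `segments_to_lab` is a hypothetical helper, and the sample segments are illustrative values rather than the output for the file above.

```python
def segments_to_lab(segments, ndigits=3):
    """Format (start, end, label) segments as tab-separated lines, one per segment."""
    lines = [
        f"{float(start):.{ndigits}f}\t{float(end):.{ndigits}f}\t{label}"
        for start, end, label in segments
    ]
    return "\n".join(lines) + "\n"


# illustrative segments in the same (start, end, label) layout
example_segments = [(0.0, 0.3, "N"), (0.3, 2.8, "D#:maj"), (2.8, 4.9, "D:maj")]
lab_text = segments_to_lab(example_segments)
print(lab_text)
```

Passing the `chords` array from the cell above instead of `example_segments` should work the same way, since each row unpacks into the same three fields.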

## 2 Chord recognition from learned features

Recognise major and minor chords from learned features extracted by a convolutional neural network.

In [56]:
from madmom.features.chords import CNNChordFeatureProcessor, CRFChordRecognitionProcessor

In [57]:
featproc = CNNChordFeatureProcessor()
crf_decode = CRFChordRecognitionProcessor()

In [58]:
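# note: this cell uses a hard-coded file name; pass `file_path` instead to process the file selected above
feats = featproc('Digital Dreaming - I Got You (I Feel Good) - James Brown.mp3')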
crf_decode(feats)

array([(  0. ,   0.3, 'N'), (  0.3,   2.8, 'D#:maj'),
       (  2.8,   4.9, 'D:maj'), (  4.9,   7.9, 'G:min'),
       (  7.9,   9.1, 'C:maj'), (  9.1,   9.7, 'G:min'),
       (  9.7,  10.5, 'C:min'), ( 10.5,  13.3, 'D#:maj'),
       ( 13.3,  16.5, 'G:min'), ( 16.5,  17.2, 'G:maj'),
       ( 17.2,  18. , 'G:min'), ( 18. ,  19. , 'C:min'),
       ( 19. ,  20.4, 'G:min'), ( 20.4,  23. , 'D#:maj'),
       ( 23. ,  24.6, 'D:maj'), ( 24.6,  25.4, 'F:maj'),
       ( 25.4,  26.6, 'A#:maj'), ( 26.6,  28.1, 'G:min'),
       ( 28.1,  30.4, 'D:maj'), ( 30.4,  33.1, 'G:min'),
       ( 33.1,  35.5, 'C:maj'), ( 35.5,  38.3, 'D#:maj'),
       ( 38.3,  40.2, 'D:maj'), ( 40.2,  42.1, 'G:min'),
       ( 42.1,  44.6, 'A#:maj'), ( 44.6,  44.9, 'C:min'),
       ( 44.9,  47.1, 'G:min'), ( 47.1,  49.1, 'A#:maj'),
       ( 49.1,  50.3, 'C:maj'), ( 50.3,  52.4, 'G:min'),
       ( 52.4,  53.5, 'A#:maj'), ( 53.5,  54.5, 'D#:maj'),
       ( 54.5,  55.7, 'F:maj'), ( 55.7,  58.4, 'D#:maj'),
       ( 58.4,  60.4, 'D:

In [20]:
# split the (start, end, label) segments in `chords` into parallel lists
starts = [start for (start, end, chord) in chords]
durations = [end - start for (start, end, chord) in chords]
chord_list = [chord for (start, end, chord) in chords]
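
Subtracting floating-point times leaves rounding noise such as `0.7000000000000002` in the serialised values. The sketch below shows one possible way to export the segments as a single JSON list of records with rounded times. The helper name `segments_to_records` and the three-decimal precision are assumptions, and the sample segments reuse the first values from this notebook's output.

```python
import json


def segments_to_records(segments, ndigits=3):
    """Turn (start, end, label) segments into JSON-friendly dicts with rounded times."""
    records = []
    for start, end, label in segments:
        # convert to built-in types so the result does not depend on NumPy scalars
        start, end = float(start), float(end)
        records.append({
            "start": round(start, ndigits),
            "duration": round(end - start, ndigits),
            "chord": str(label),
        })
    return records


example_segments = [
    (0.0, 1.6, "C:maj"),
    (1.6, 2.6, "A#:maj"),
    (2.6, 3.3000000000000003, "D#:maj"),
]
records = segments_to_records(example_segments)
payload = json.dumps(records)
print(payload)
```

Keeping start, duration and label together in one record avoids having to keep three separate lists aligned on the consumer side.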

In [16]:
# Convert to json

In [17]:
import json

In [18]:
json.dumps(starts)

'[0.0, 1.6, 2.6, 3.3000000000000003, 5.300000000000001, 6.6000000000000005, 8.6, 9.3, 10.200000000000001, 12.0, 12.700000000000001, 13.600000000000001, 15.5, 16.2, 17.0, 19.0, 19.700000000000003, 20.700000000000003, 22.3, 23.0, 24.200000000000003, 25.8, 26.400000000000002, 27.400000000000002, 29.0, 30.0, 32.7, 34.2, 35.800000000000004, 36.7, 37.5, 39.5, 40.900000000000006, 42.1, 43.300000000000004, 44.2, 46.2, 47.7, 49.7, 50.5, 51.6, 53.2, 53.900000000000006, 55.1, 56.6, 57.300000000000004, 58.300000000000004, 60.0, 60.800000000000004, 61.5, 63.400000000000006, 64.10000000000001, 65.0, 66.8, 67.5, 68.4, 70.10000000000001, 71.4, 73.60000000000001, 74.4, 75.3, 77.10000000000001, 78.7, 80.30000000000001, 82.10000000000001, 83.80000000000001, 85.5, 87.30000000000001, 88.10000000000001, 88.9, 90.80000000000001, 92.4, 93.5, 94.30000000000001, 95.80000000000001, 98.10000000000001, 101.2, 101.80000000000001, 102.60000000000001, 104.4, 106.2, 108.0, 108.7, 109.5, 111.2, 113.0, 122.0, 122.300000

In [19]:
json.dumps(durations)

'[1.6, 1.0, 0.7000000000000002, 2.0000000000000004, 1.2999999999999998, 1.9999999999999991, 0.7000000000000011, 0.9000000000000004, 1.799999999999999, 0.7000000000000011, 0.9000000000000004, 1.8999999999999986, 0.6999999999999993, 0.8000000000000007, 2.0, 0.7000000000000028, 1.0, 1.5999999999999979, 0.6999999999999993, 1.2000000000000028, 1.5999999999999979, 0.6000000000000014, 1.0, 1.5999999999999979, 1.0, 2.700000000000003, 1.5, 1.6000000000000014, 0.8999999999999986, 0.7999999999999972, 2.0, 1.4000000000000057, 1.1999999999999957, 1.2000000000000028, 0.8999999999999986, 2.0, 1.5, 2.0, 0.7999999999999972, 1.1000000000000014, 1.6000000000000014, 0.7000000000000028, 1.1999999999999957, 1.5, 0.7000000000000028, 1.0, 1.6999999999999957, 0.8000000000000043, 0.6999999999999957, 1.9000000000000057, 0.7000000000000028, 0.8999999999999915, 1.7999999999999972, 0.7000000000000028, 0.9000000000000057, 1.7000000000000028, 1.2999999999999972, 2.200000000000003, 0.7999999999999972, 0.89999999999999

In [14]:
json.dumps(chord_list)

'["C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "C:min", "C:maj", "A#:maj", "C:min", "C:maj", "A#:maj", "C:min", "C:maj", "A#:maj", "G#:maj", "G:maj", "C:min", "A#:maj", "D#:maj", "G#:maj", "G:maj", "C:maj", "C:min", "D#:maj", "G#:maj", "A#:maj", "C:maj", "A#:maj", "C:min", "C:maj", "A#:maj", "C:min", "C:maj", "A#:maj", "C:min", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "D#:maj", "G#:maj", "A#:maj", "C:maj", "A#:maj", "D#:maj", "G#:maj", "G:maj", "C:maj", "D#:maj", "G#:maj", "A#:maj", "C:maj", "A#:maj", "D#:maj", "G#:maj", "G:maj", "C:maj", "C:min", "D#:maj", "G#:maj", "C:maj", "A#:maj", "D#:maj", "G#:maj", "G:maj", "C:maj", "A#:maj", "D#:maj", "G#:maj", "A#:maj", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "C:min", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "D#:maj", "C:maj", "A#:maj", "D#:maj", "C:maj", "A