eleurent/melodic-dictation

melodic-dictation

This project is a MATLAB script for automated melodic dictation.

How it works

First, the spectrogram of the music track is extracted using a Short-Time Fourier Transform (STFT, or TFCT in French).

[Figure: STFT]
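As a rough illustration of this step, here is a minimal magnitude-STFT sketch. It is written in Python rather than MATLAB and is not the project's code: the function name `stft_magnitude`, the Hann window, the frame size and the hop size are all assumptions chosen for the example. It uses a naive DFT for readability, where a real implementation would use an FFT.

```python
import cmath
import math


def stft_magnitude(signal, frame_size=64, hop=32):
    """Return a magnitude spectrogram V with V[k][t] = |bin k| of frame t.

    Hypothetical helper: window type and sizes are illustrative choices.
    """
    # Periodic Hann window to reduce spectral leakage between bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        # Naive O(N^2) DFT of the windowed frame.
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    # Transpose so that rows are frequency bins and columns are time frames.
    return [[frames[t][k] for t in range(len(frames))] for k in range(n_bins)]


# A pure 1000 Hz tone sampled at 8000 Hz: each bin spans 8000 / 64 = 125 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
V = stft_magnitude(tone)
peak_bin = max(range(len(V)), key=lambda k: V[k][0])
```

With these settings the energy of the tone concentrates in the bin whose center frequency is 1000 Hz, which is the kind of structure the factorization step relies on.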

Then, the idea is to factor the spectrogram matrix V into a product of two matrices, V ≈ W*H, where:

  • W is a precomputed matrix containing the spectrum of each note in the chromatic scale
  • H is a matrix describing which notes are played at each time frame, a.k.a. the music score.

[Figure: chromatic scale]
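The README does not say how W is built (for instance from recorded notes or from synthetic ones). The sketch below, in Python for illustration only, assumes a synthetic harmonic template: each note gets one column with peaks at the bins nearest to its first few harmonics. The names `note_frequency` and `build_templates`, the 1/h amplitude decay and the number of harmonics are hypothetical choices. The only standard fact used is the equal-temperament formula with A4 (MIDI note 69) at 440 Hz.

```python
def note_frequency(midi_note):
    """Equal-temperament frequency in Hz, with A4 (MIDI 69) = 440 Hz."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


def build_templates(midi_notes, sample_rate, frame_size, n_harmonics=4):
    """Return W with W[k][j] = energy of note j in frequency bin k.

    Assumption: a note is modeled as a few harmonics with 1/h amplitudes.
    """
    n_bins = frame_size // 2 + 1
    W = [[0.0] * len(midi_notes) for _ in range(n_bins)]
    for j, note in enumerate(midi_notes):
        f0 = note_frequency(note)
        for h in range(1, n_harmonics + 1):
            # Nearest bin to the h-th harmonic of this note.
            k = round(h * f0 * frame_size / sample_rate)
            if k < n_bins:
                W[k][j] += 1.0 / h
    return W


# One octave of the chromatic scale, C4 (MIDI 60) to B4 (MIDI 71).
chromatic_octave = list(range(60, 72))
W = build_templates(chromatic_octave, sample_rate=8000, frame_size=1024)
```

Each column of W then acts as a fixed "dictionary entry" for one note, so only H remains to be estimated.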

W is easy to generate, so the goal is to find H using Non-negative Matrix Factorization (NMF). A paper by Lee and Seung describes an algorithm for this based on the Kullback-Leibler divergence.
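For reference, the Lee and Seung multiplicative update for the Kullback-Leibler divergence, restricted to H because W is kept fixed, multiplies each H[j][t] by sum_k W[k][j] * V[k][t] / (W*H)[k][t] and divides by sum_k W[k][j]. The Python sketch below applies that rule to a toy problem. It is an illustration, not the project's MATLAB code: the function name `estimate_activations`, the all-ones initialization, the iteration count and the small `eps` guard are assumptions.

```python
def estimate_activations(V, W, n_iter=100, eps=1e-12):
    """Estimate H >= 0 such that V ~ W*H, keeping W fixed.

    Uses the multiplicative update for the KL divergence;
    eps only guards against division by zero.
    """
    n_bins, n_frames = len(V), len(V[0])
    n_notes = len(W[0])
    H = [[1.0] * n_frames for _ in range(n_notes)]
    col_sums = [sum(W[k][j] for k in range(n_bins)) + eps
                for j in range(n_notes)]
    for _ in range(n_iter):
        # Current reconstruction W*H, computed once per iteration.
        WH = [[sum(W[k][j] * H[j][t] for j in range(n_notes)) + eps
               for t in range(n_frames)] for k in range(n_bins)]
        for j in range(n_notes):
            for t in range(n_frames):
                numer = sum(W[k][j] * V[k][t] / WH[k][t]
                            for k in range(n_bins))
                H[j][t] *= numer / col_sums[j]
    return H


# Toy example: 3 frequency bins, 2 notes, 3 time frames.
W_toy = [[1.0, 0.0],
         [0.0, 1.0],
         [0.5, 0.5]]
H_true = [[1.0, 0.0, 1.0],   # note 0 sounds in frames 0 and 2
          [0.0, 1.0, 1.0]]   # note 1 sounds in frames 1 and 2
V_toy = [[sum(W_toy[k][j] * H_true[j][t] for j in range(2))
          for t in range(3)] for k in range(3)]
H_est = estimate_activations(V_toy, W_toy)
```

Because the update only multiplies by non-negative factors, H stays non-negative throughout, and on this exactly factorable toy input the estimate approaches `H_true`.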

Applied to the previous spectrogram, we get the following H matrix:

[Figure: calculated H]

Some signal processing then helps determine the start and end of each note:

[Figure: filtered H]
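The README does not detail which processing is applied to H. One simple possibility, sketched here in Python purely as an assumption, is to threshold each row of H and keep only the runs that last long enough. The function name `detect_notes`, the threshold value and the minimum duration are hypothetical.

```python
def detect_notes(activation, threshold, min_frames=2):
    """Return (start_frame, end_frame) pairs for one row of H.

    A note is a run of frames at or above the threshold lasting at least
    min_frames frames; end_frame is exclusive.
    """
    notes = []
    start = None
    for t, value in enumerate(activation):
        if value >= threshold and start is None:
            start = t  # the note begins
        elif value < threshold and start is not None:
            if t - start >= min_frames:
                notes.append((start, t))  # long enough: keep it
            start = None  # the note ends (or a short blip is discarded)
    # Close a note that is still sounding at the end of the track.
    if start is not None and len(activation) - start >= min_frames:
        notes.append((start, len(activation)))
    return notes


row = [0.0, 0.1, 0.9, 1.0, 0.8, 0.1, 0.7, 0.0, 0.9, 0.9]
notes = detect_notes(row, threshold=0.5)
# notes == [(2, 5), (8, 10)]; the single-frame blip at frame 6 is dropped
```

Multiplying the frame indices by the hop duration would give note times in seconds.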

Finally, the result can be exported to a music score or a MIDI file:

[Figure: music score]

A few examples

| Original   | Reconstruction |
|------------|----------------|
| tetris.mid | tetris_out.mid |
| zelda.mid  | zelda_out.mid  |
| mario.mid  | mario_out.mid  |
