<a href="https://colab.research.google.com/github/carlosholivan/ColabNotebooksforAudio/blob/master/3_CREPE_tracking_wav_to_MIDI_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://www.unizar.es/sites/default/files/identidadCorporativa/imagen/logoUZ.png"  width="480">

# <a name="top"></a>WAV TO MIDI ISOLATED VOICE (21/05/2020)


Authors: José Ramón Beltrán and Carlos Hernández

Department of Electronic Engineering and Communications, Universidad de Zaragoza, Calle María de Luna 3, 50018 Zaragoza

This notebook converts a mono WAV audio file into a MIDI file by processing the pitch curve predicted by the CREPE neural network.

## Table of Contents

- [1. CREPE Onsets Detection and Pitch Processing](#crepe)
    - [1.1. Crepe Raw Pitch Curves](#crepe-extract)
    - [1.2. Tracking Algorithm - MIDI Writing](#midi)
    - [1.3. Results](#results)
    - [1.4. Conclusions and Future Work](#crepe-conclusions)





## INSTALLING DEPENDENCIES

In [None]:
#@title Install MIDIUtil
!pip install MIDIUtil

In [None]:
#@title Install Librosa

!pip install librosa

In [None]:
#@title Install Google Magenta

print('Installing dependencies...')
!apt-get update -qq && apt-get install -qq libfluidsynth1 fluid-soundfont-gm build-essential libasound2-dev libjack-dev
!pip install -qU pyfluidsynth pretty_midi

!pip install -qU magenta

# Hack to allow python to pick up the newly-installed fluidsynth lib. 
# This is only needed for the hosted Colab environment.
import ctypes.util
orig_ctypes_util_find_library = ctypes.util.find_library
def proxy_find_library(lib):
  if lib == 'fluidsynth':
    return 'libfluidsynth.so.1'
  else:
    return orig_ctypes_util_find_library(lib)
ctypes.util.find_library = proxy_find_library

print('Importing libraries and defining some helper functions...')
from google.colab import files

import magenta.music as mm
import magenta
import tensorflow

print('🎉 Done!')
print(magenta.__version__)
print(tensorflow.__version__)

In [None]:
#@title Clone CREPE

!git clone https://github.com/marl/crepe.git

In [None]:
cd crepe/

In [None]:
#@title Download CREPE trained models

!python setup.py install

In [None]:
cd ..

In [None]:
#@title Install Soundfile

!pip install SoundFile

In [None]:
#@title Install hmmlearn

!pip install git+https://github.com/hmmlearn/hmmlearn.git

<img src="https://4.bp.blogspot.com/-WELZsAfX1U0/Vl7UxvJNHdI/AAAAAAAAF34/9Kl1x1y0Uv4/s1600/separador.png" style="width:500px;"/>

In [None]:
#@title Upload Audio File (wav)
import os

from google.colab import files
uploaded = files.upload()

for name, data in uploaded.items():
  with open(name, 'wb') as f:
    f.write(data)
    #os.rename(f.name, 'file.wav')
    song = f.name[:-4]

# <a name="crepe"></a>1. CREPE PITCH TO MIDI

In [None]:
import scipy
from scipy.io import wavfile
from collections import Counter
import soundfile
import os
import csv
from midiutil import MIDIFile
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
import librosa

import sys
sys.path.insert(1, 'crepe/crepe/')
import core #CREPE's core module, so we can run the prediction in-process instead of generating a csv file

In [None]:
"-------------------------------------------------------------------------"
"-----------------------STEP 1: CREATING 16bit WAV FILE-------------------"
"-------------------------------------------------------------------------"
wav = song + '.wav'

#Reading WAV attributes
file = soundfile.SoundFile(wav)
print('Sample rate: {}'.format(file.samplerate))
print('Channels: {}'.format(file.channels))
print('Subtype: {}'.format(file.subtype))

#CREPE works with 16-bit WAV files, so if the imported file is 24-bit we
#convert it to a 16-bit WAV file and write it to a new path
if file.subtype == 'PCM_24':
    data, samplerate = soundfile.read(wav)
    wav_16bit = '16bitwav_' + song + '.wav'
    soundfile.write(wav_16bit, data, samplerate, subtype='PCM_16')
else:
    wav_16bit = song + '.wav'

## <a name="crepe-extract"></a>1.1. CREPE RAW PITCH CURVES

### <a name="crepe-pitch"></a>1.1.1. CREPE PITCH EXTRACTION (PREDICTION)

In [None]:
"----------------------PITCH ESTIMATION WITH CREPE NN---------------------"    
"-------------------------STEP 2: CREPE PREDICTION------------------------"
"-------------------------------------------------------------------------"

sr, audio = wavfile.read(wav_16bit)
time, frequency, confidence, activation = core.predict(audio, sr, viterbi=True, step_size=10)

#### PLOTTING

The pitch bins of the activation curve are calculated as:
\begin{equation}
pitchbin = 1200 \cdot \log_2\left(\frac{f}{10}\right)
\end{equation}

where 10 Hz is the reference frequency.
This unit (the cent) provides a logarithmic pitch scale in which 100 cents equal one semitone.
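
As a quick check of the formula above, the following sketch converts a few frequencies to cents relative to the 10 Hz reference. The helper name `hz_to_cents` is only illustrative and is not part of CREPE.

```python
import math

F_REF = 10.0  # reference frequency in Hz, as in the formula above


def hz_to_cents(f_hz, f_ref=F_REF):
    """Convert a frequency in Hz to cents relative to f_ref (illustrative helper)."""
    return 1200.0 * math.log2(f_hz / f_ref)


# One octave above the reference frequency corresponds to 1200 cents.
octave = hz_to_cents(20.0)

# Two frequencies one semitone apart differ by 100 cents.
a4 = hz_to_cents(440.0)
a_sharp4 = hz_to_cents(440.0 * 2 ** (1 / 12))
semitone = a_sharp4 - a4

print(octave, semitone)
```

Because the scale is logarithmic, the 100-cent step between adjacent semitones is the same in every octave.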

In [None]:
#We can plot the activation curve from CREPE
plt.figure(figsize=(20, 20))
plt.title('Activation curve from CREPE')
plt.xlabel('time (frames, 10 ms step)')
plt.ylabel('Pitch bins')
plt.imshow(1-activation.T, origin='lower', cmap='gray')  

#We can also plot the pitch curve predicted by CREPE as frequency against time
plt.figure(figsize=(20, 5))
plt.title('Pitch curve from CREPE (f0, t)')
plt.xlabel('time (s)')
plt.ylabel('Fundamental frequency (Hz)')
plt.plot(time, frequency)

## <a name="midi"></a>1.2. TRACKING ALGORITHM - MIDI WRITING

We now have to convert the frequencies to MIDI notes.

The fundamental frequency f0 (in Hz) is converted to a MIDI note as follows:
\begin{equation}
MIDInote = 12 \cdot (\log_2(f) - \log_2(440)) + 69
\end{equation}

<img src="https://i.pinimg.com/originals/fe/13/eb/fe13ebb344f12175dca1b0a63617bf73.gif" style="width:500px;"/>
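
As a worked example of the conversion above, the sketch below evaluates the formula for a few frequencies and rounds the result to the nearest MIDI note, which is how the notebook later obtains integer pitches. The function name `hz_to_midi` is an illustrative choice.

```python
import math


def hz_to_midi(f_hz):
    """Fractional MIDI note number for a frequency in Hz (A4 = 440 Hz = note 69)."""
    return 12.0 * (math.log2(f_hz) - math.log2(440.0)) + 69.0


# A4, middle C, and a slightly sharp A4 that still rounds to note 69
for f in (440.0, 261.63, 450.0):
    note = hz_to_midi(f)
    print(f, round(note, 2), int(round(note)))
```

Rounding maps every frequency within half a semitone of a note to that note, so small pitch deviations such as vibrato do not change the written pitch unless they cross that boundary.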

In [None]:
#Plot the mapping between frequency and MIDI note given by the formula above
x = np.arange(27.5, 3500, 1)

mid_notes_f = [12*(np.log2(p) - np.log2(440)) + 69 for p in x]

plt.figure(figsize=(20, 10))
plt.plot(mid_notes_f, x, linewidth=3.0)
plt.xlabel('MIDI note')
plt.ylabel('Frequency (Hz)')

We'll write 2 MIDI files:
* CREPE_raw: MIDI file written directly from the CREPE pitch curve
* CREPE_tracked: MIDI file written after removing the frames whose confidence < min_confidence and joining consecutive frames that have the same note
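
The tracking idea described above can be sketched as a small standalone function. This is a simplified variant under stated assumptions, not the exact loop used later in the notebook: the name `group_frames`, the `(pitch, onset, offset)` tuple layout, and the `min_duration` parameter in seconds are all illustrative.

```python
def group_frames(times, pitches, confidences, min_confidence=0.85, min_duration=0.05):
    """Group consecutive confident frames with equal pitch into notes.

    times: frame timestamps in seconds
    pitches: rounded MIDI note per frame
    confidences: pitch confidence per frame
    Returns a list of (pitch, onset, offset) tuples, times in seconds.
    """
    notes = []
    i, n = 0, len(times)
    while i < n:
        if confidences[i] <= min_confidence:
            i += 1  # skip low-confidence frames
            continue
        j = i
        # extend the note while the next frame is confident and has the same pitch
        while (j + 1 < n and confidences[j + 1] > min_confidence
               and pitches[j + 1] == pitches[i]):
            j += 1
        onset, offset = times[i], times[j]
        if offset - onset >= min_duration:  # drop notes that are too short
            notes.append((pitches[i], onset, offset))
        i = j + 1
    return notes


# Toy input: 20 frames with a 10 ms step
times = [k * 0.01 for k in range(20)]
pitches = [60] * 8 + [62] * 2 + [64] * 10
confidences = [0.9] * 10 + [0.5] * 2 + [0.95] * 8

notes = group_frames(times, pitches, confidences)
print(notes)
```

In this toy input the two-frame note 62 is dropped for being too short, and the first two frames of note 64 are skipped because their confidence is below the threshold.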

In [None]:
#f0 to MIDI notes
midi_notes = [12*(np.log2(i) - np.log2(440.0)) + 69 for i in frequency] 

#Minimum confidence of the presence of a pitch to take as a valid pitch
min_confidence = 0.85

In [None]:
"-------------------------------------------------------------------------"
"-----------------------RAW CREPE PITCH CURVE TO MIDI---------------------"
"-------------------------------------------------------------------------"    
# create your MIDI object
mf_raw = MIDIFile(numTracks = 1)     # only 1 track
track_raw = 0   # the only track

time_note_raw = 0    # start at the beginning
bpm = 110
mf_raw.addTrackName(track_raw, time_note_raw, "Sample Track")
mf_raw.addTempo(track_raw, time_note_raw, bpm)
ms_in_1beat = 60000/bpm

# add some notes
channel_raw = 0
volume_raw = 100

pitch_raw_list = []
note_on_raw_list = []
note_off_raw_list = []
for i in range(len(frequency)-1):
    pitch = int(round(midi_notes[i])) #write MIDI note
    note_on_raw = time[i]*1000 / ms_in_1beat          
    note_off_raw = time[i+1]*1000 / ms_in_1beat 
    duration = note_off_raw - note_on_raw
    mf_raw.addNote(track_raw, channel_raw, pitch, note_on_raw, duration, volume_raw)
    
    pitch_raw_list.append(pitch)
    note_on_raw_list.append(note_on_raw*ms_in_1beat/1000)
    note_off_raw_list.append(note_off_raw*ms_in_1beat/1000)
    
# write it to disk
with open('CREPE_raw_' + song + '.mid', 'wb') as outf:
    mf_raw.writeFile(outf)

print("CREPE MIDI file has been created")

In [None]:
"-------------------------------------------------------------------------"
"--------------------------TRACKING ALGORITHM-----------------------------"
"-------------------------------------------------------------------------" 
# create your MIDI object
mf_raw_clean = MIDIFile(numTracks = 1)     # only 1 track
track_raw = 0   # the only track

time_note_raw = 0    # start at the beginning
bpm = 110
mf_raw_clean.addTrackName(track_raw, time_note_raw, "Sample Track")
mf_raw_clean.addTempo(track_raw, time_note_raw, bpm)
ms_in_1beat = 60000/bpm

# add some notes
channel_raw = 0
volume_raw = 100

pitch_list = []
note_on_list = []
note_off_list = []
j = 0
for i in range(len(time)-1):
    if j >= len(time)-1: #stop before the last frame, since the search below reads frame j+1
        break
    if j != 0:
        i = j
    if confidence[i] > min_confidence: #if confidence is low we don't write the note
        note_on = time[i]*1000 / ms_in_1beat
        j = i
        #we search pitches that are the same in the future frames
        while (round(midi_notes[j+1]) == round(midi_notes[j])) and (confidence[j] > min_confidence):
            j += 1
            if j >= len(time)-1:
                break

        note_off = time[j]*1000 / ms_in_1beat

        pitch = int(round(midi_notes[i]))
        duration = note_off - note_on #duration in beats

        if duration*ms_in_1beat/1000 > 0.05: #if duration<50ms don't write the note
            pitch_list.append(pitch)
            note_on_list.append(note_on*ms_in_1beat/1000)
            note_off_list.append(note_off*ms_in_1beat/1000)
            mf_raw_clean.addNote(track_raw, channel_raw, pitch, note_on, duration, volume_raw)
        else:
            j+=1
    else:
        j +=1

# write it to disk
with open('CREPE_tracked_' + song + '.mid', 'wb') as outf:
    mf_raw_clean.writeFile(outf)

print("CREPE tracked MIDI file has been created")

In [None]:
#General function to plot the piano-roll representation and compare the MIDI and WAV files
import bokeh.plotting
import magenta.music as mm

def midi_pianoroll(file):
    print(file)

    note_seq = mm.midi_file_to_sequence_proto(file)

    # This is a colab utility method that visualizes a NoteSequence.
    fig = mm.plot_sequence(note_seq, show_figure=False)
    bokeh.plotting.output_notebook()
    bokeh.plotting.show(fig)

    # This is a colab utility method that plays a NoteSequence.
    mm.play_sequence(note_seq,synth=mm.fluidsynth)
    return

In [None]:
#original song
ipd.Audio(wav_16bit, rate=sr)

In [None]:
"-------------------MIDI-------------------------"
#MIDI file from raw pitch curve
file = 'CREPE_raw_' + song + '.mid'
midi_pianoroll(file)

In [None]:
"-------------------MIDI-------------------------"
#MIDI file from tracked pitch curve
file = 'CREPE_tracked_' + song + '.mid'
midi_pianoroll(file)

In [None]:
#Plotting results
from matplotlib.ticker import MultipleLocator
def plot_pianoroll(pitch_list, note_on_list, note_off_list, plot_title=''):
    plt.style.use('dark_background')
    fig, ax = plt.subplots(figsize=(20, 5))
    plt.title(plot_title)
    plt.xlabel('time s')
    plt.ylabel('Pitch')
    for p, pitch in enumerate(pitch_list):
        plt.vlines(x=note_on_list[p], ymin=pitch, ymax=pitch+1,
                   color='#E8B2FF', linewidth=0.01)
        ax.add_patch(plt.Rectangle((note_on_list[p],pitch), 
                                 width=note_off_list[p]-note_on_list[p],
                                 height=1,
                                 edgecolor='#E8B2FF',facecolor='#C232FF'))
    ax.yaxis.set_major_locator(MultipleLocator(1))
    ax.yaxis.grid(linewidth=0.25)
    ax.set_facecolor('#282828')
    return

plot_pianoroll(pitch_raw_list, note_on_raw_list, note_off_raw_list, 'Raw pitch curve from CREPE')

plot_pianoroll(pitch_list, note_on_list, note_off_list,
               'Pitch curve from CREPE after joining notes and removing f0 values whose confidence < {}'.format(min_confidence))

In [None]:
#Comparison plot
fig, ax = plt.subplots(figsize=(20, 5))
plt.title('Comparison between raw (green) and tracked (purple) MIDI files')
plt.xlabel('time s')
plt.ylabel('Pitch')
for p, pitch in enumerate(pitch_raw_list):
    plt.vlines(x=note_on_raw_list[p], ymin=pitch, ymax=pitch+1,
               color='#6CFF9A', linewidth=0.01)
    ax.add_patch(plt.Rectangle((note_on_raw_list[p],pitch), 
                                 width=note_off_raw_list[p]-note_on_raw_list[p],
                                 height=1,
                                 edgecolor='#6CFF9A',facecolor='#0AFE57'))
for p, pitch in enumerate(pitch_list):
    plt.vlines(x=note_on_list[p], ymin=pitch, ymax=pitch+1,
               color='#E8B2FF', linewidth=0.01)
    ax.add_patch(plt.Rectangle((note_on_list[p],pitch), 
                                 width=note_off_list[p]-note_on_list[p],
                                 height=1,
                                 edgecolor='#E8B2FF',facecolor='#C232FF'))
ax.yaxis.set_major_locator(MultipleLocator(1))
ax.yaxis.grid(linewidth=0.25)
ax.set_facecolor('#282828')


In [None]:
#@title Download Raw MIDI File (optional)

files.download('CREPE_raw_' + song + '.mid')

In [None]:
#@title Download Tracked MIDI File (optional)

files.download('CREPE_tracked_' + song + '.mid')

## <a name="crepe-conclusions"></a>1.4. CONCLUSIONS AND FUTURE WORK

* Join notes that have the same pitch as the preceding note but were split apart (for example, by frames whose confidence fell below the threshold).
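
One possible way to approach this is sketched below. It is only an illustration and not part of the current notebook: the function name `merge_close_notes`, the `(pitch, onset, offset)` tuple layout, and the `max_gap` threshold in seconds are assumptions, and the notes are assumed to be sorted by onset.

```python
def merge_close_notes(notes, max_gap=0.05):
    """Merge consecutive notes of equal pitch separated by at most max_gap seconds.

    notes: list of (pitch, onset, offset) tuples sorted by onset, times in seconds.
    """
    merged = []
    for pitch, onset, offset in notes:
        if merged and merged[-1][0] == pitch and onset - merged[-1][2] <= max_gap:
            # same pitch and a short gap: extend the previous note
            prev_pitch, prev_onset, prev_offset = merged[-1]
            merged[-1] = (prev_pitch, prev_onset, max(prev_offset, offset))
        else:
            merged.append((pitch, onset, offset))
    return merged


notes = [(60, 0.00, 0.30), (60, 0.32, 0.50), (62, 0.50, 0.80), (60, 1.50, 1.80)]
print(merge_close_notes(notes))
```

With the lists built by the tracking cell, the input could be assembled as `list(zip(pitch_list, note_on_list, note_off_list))`. A suitable value for `max_gap` would need to be tuned on real recordings.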

<a href="#top">Back to top</a>

## REFERENCES

* https://github.com/marl/crepe

<img src="https://4.bp.blogspot.com/-WELZsAfX1U0/Vl7UxvJNHdI/AAAAAAAAF34/9Kl1x1y0Uv4/s1600/separador.png" style="width:500px;"/>