<a href="https://colab.research.google.com/github/carlosholivan/ColabNotebooksforAudio/blob/master/2_CREPE_onsets_wav_to_MIDI_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://www.unizar.es/sites/default/files/identidadCorporativa/imagen/logoUZ.png"  width="480">

# <a name="top"></a>WAV TO MIDI ISOLATED VOICE (05/2020)


Authors: José Ramón Beltrán and Carlos Hernández

Department of Electronic Engineering and Communications, Universidad de Zaragoza, Calle María de Luna 3, 50018 Zaragoza

This notebook converts a mono WAV audio file into a MIDI file by processing the pitch curve predicted by the CREPE neural network.

## Table of Contents

- [1. CREPE Onsets Detection and Pitch Processing](#crepe)
    - [1.1. Onsets Detection](#onsets)
        - [1.1.1. Crepe Pitch Extraction](#crepe-pitch)
        - [1.1.2. Processing Crepe Pitch Curve Estimation](#estimation)
        - [1.1.3. Librosa Onsets Detection vs CREPE Onsets Detection from Pitch Curve](#librosa)
    - [1.2. Crepe Pitch Processing](#crepe-processing)
    - [1.3. MIDI Writing](#midi)
    - [1.4. Results](#results)
    - [1.5. Conclusions and Future Work](#crepe-conclusions)





## INSTALLING DEPENDENCIES

In [None]:
#@title Install MIDIUtil
!pip install MIDIUtil

In [None]:
#@title Install Librosa

!pip install librosa

In [None]:
#@title Install Google Magenta

print('Installing dependencies...')
!apt-get update -qq && apt-get install -qq libfluidsynth1 fluid-soundfont-gm build-essential libasound2-dev libjack-dev
!pip install -qU pyfluidsynth pretty_midi

!pip install -qU magenta

# Hack to allow python to pick up the newly-installed fluidsynth lib. 
# This is only needed for the hosted Colab environment.
import ctypes.util
orig_ctypes_util_find_library = ctypes.util.find_library
def proxy_find_library(lib):
  if lib == 'fluidsynth':
    return 'libfluidsynth.so.1'
  else:
    return orig_ctypes_util_find_library(lib)
ctypes.util.find_library = proxy_find_library

print('Importing libraries and defining some helper functions...')
from google.colab import files

import magenta.music as mm
import magenta
import tensorflow

print('🎉 Done!')
print(magenta.__version__)
print(tensorflow.__version__)

In [None]:
#@title Clone CREPE

!git clone https://github.com/marl/crepe.git

In [None]:
cd crepe/

In [None]:
#@title Download CREPE trained models

!python setup.py install

In [None]:
cd ..

In [None]:
#@title Install Soundfile

!pip install SoundFile

In [None]:
#@title Install hmmlearn

!pip install git+https://github.com/hmmlearn/hmmlearn.git

<img src="https://4.bp.blogspot.com/-WELZsAfX1U0/Vl7UxvJNHdI/AAAAAAAAF34/9Kl1x1y0Uv4/s1600/separador.png" style="width:500px;"/>

In [None]:
#@title Upload Audio File (wav)
import os

from google.colab import files
uploaded = files.upload()

for name, data in uploaded.items():
  with open(name, 'wb') as f:
    f.write(data)
    #os.rename(f.name, 'file.wav')
    song = f.name[:-4]

# <a name="crepe"></a>1. CREPE ONSETS DETECTION AND PITCH PROCESSING 

Process:
* Convert wav to 16bit to process it with CREPE
* Pitch extraction with CREPE
* Onsets detection with Librosa or CREPE (derivative of the Pitch curve)
* Average, mode, most expected value... of the pitch from the current onset to the next one
* MIDI or note sequence conversion

In [None]:
import scipy
from scipy.io import wavfile
from collections import Counter
import soundfile
import os
import csv
from midiutil import MIDIFile
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
import librosa

import sys
sys.path.insert(1, 'crepe/crepe/')
import core #this will allow us to run CREPE in the script to not generate a csv file

In [None]:
"-------------------------------------------------------------------------"
"-----------------------STEP 1: CREATING 16bit WAV FILE-------------------"
"-------------------------------------------------------------------------"
wav = song + '.wav'

#Reading WAV attributes
file = soundfile.SoundFile(wav)
print('Sample rate: {}'.format(file.samplerate))
print('Channels: {}'.format(file.channels))
print('Subtype: {}'.format(file.subtype))

#Crepe NN works with 16bit wav files so if the imported file is 24bit we
#convert it into a 16bit wav file and we export it into another path
if file.subtype == 'PCM_24':
    data, samplerate = soundfile.read(wav)
    soundfile.write('16bitwav_' + song + '.wav', data, samplerate, subtype='PCM_16')
    wav_16bit = '16bitwav_' + song + '.wav'
else:
  wav_16bit = song + '.wav'

## <a name="onsets"></a>1.1. ONSETS  DETECTION

### <a name="crepe-pitch"></a>1.1.1. CREPE PITCH EXTRACTION (PREDICTION)

In [None]:
"----------------------PITCH ESTIMATION WITH CREPE NN---------------------"    
"-------------------------STEP 2: CREPE PREDICTION------------------------"
"-------------------------------------------------------------------------"

sr, audio = wavfile.read(wav_16bit)
time, frequency, confidence, activation = core.predict(audio, sr, viterbi=True, step_size=10)

Pitch bins in the activation curve are expressed in cents, calculated as:
\begin{equation}
\mathrm{pitchbin} = 1200 \cdot \log_2\left(\frac{f}{10}\right)
\end{equation}

where 10 Hz is the reference frequency.
This unit provides a logarithmic pitch scale where 100 cents equal one semitone.
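
As a quick check of the cents scale, the conversion can be sketched in a few lines. This is only an illustration: the helper names `hz_to_cents` and `cents_to_hz` are ours and are not part of CREPE.

```python
import numpy as np

REFERENCE_HZ = 10.0  # reference frequency of the cents scale described above


def hz_to_cents(frequency_hz, reference_hz=REFERENCE_HZ):
    """Convert a frequency in Hz to cents above the reference frequency."""
    return 1200.0 * np.log2(np.asarray(frequency_hz, dtype=float) / reference_hz)


def cents_to_hz(cents, reference_hz=REFERENCE_HZ):
    """Inverse conversion: cents above the reference frequency back to Hz."""
    return reference_hz * 2.0 ** (np.asarray(cents, dtype=float) / 1200.0)


a4 = hz_to_cents(440.0)
a_sharp4 = hz_to_cents(440.0 * 2 ** (1 / 12))  # one equal-tempered semitone above A4
print("A4 in cents:", float(a4))
print("Distance between A4 and A#4 in cents:", float(a_sharp4 - a4))
```

Doubling the frequency always adds 1200 cents, so an octave spans 12 semitones of 100 cents each.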

In [None]:
#We can plot the activation curve from CREPE
plt.figure(figsize=(20, 20))
plt.title('Activation curve from CREPE')
plt.xlabel('time (frames of 10 ms)')
plt.ylabel('Pitch bins')
plt.imshow(1-activation.T, origin='lower', cmap='gray')  

#We can plot the pitch curve predicted by CREPE with frequency and time variables
plt.figure(figsize=(20, 5))
plt.title('Pitch curve from CREPE (f0, t)')
plt.xlabel('time (s)')
plt.ylabel('Fundamental frequency (Hz)')
plt.plot(time, frequency)

### <a name="estimation"></a>1.1.2. PROCESSING CREPE PITCH CURVE ESTIMATION

In [None]:
"----------------------PITCH ESTIMATION WITH CREPE NN---------------------"    
"-------------------------STEP 3: CREPE PROCESSING------------------------"
"-------------------------------------------------------------------------"
#We discard the frames (frequency and time) whose confidence is lower than 'min_confidence'
min_confidence = 0.87
f_clean = [] #list to store the frequencies whose confidence is >= min_confidence
t_clean = [] #list to store the times of those frequencies
for i in range(len(frequency)):
    if confidence[i] >= min_confidence:
        f_clean.append(frequency[i])
        t_clean.append(time[i])


#The removed f0 must be plotted at their real time in the song
plt.figure(figsize=(20, 5))
plt.title('Activation curve from CREPE after removing f0 whose confidence < {}'.format(min_confidence))
plt.xlabel('time s*100')
plt.ylabel('Pitch bins')
plt.imshow(1-activation.T, origin='lower', cmap='gray')  
#we mark the removed f0 whose confidence < min_confidence
for j in range(len(confidence)):
    if confidence[j] < min_confidence:
        plt.axvline(time[j]*100, color='r', linestyle='dotted')
        
#frequency plot against the index of the kept frames
plt.figure(figsize=(20, 5))
plt.title('Pitch curve from CREPE after removing f0 whose confidence < {}'.format(min_confidence))
plt.xlabel('Index of kept frame')
plt.ylabel('Fundamental frequency (Hz)')
plt.plot(f_clean)

#frequency plot against the real time of the kept frames
plt.figure(figsize=(20, 5))
plt.title('Pitch curve from CREPE after removing f0 whose confidence < {}'.format(min_confidence))
plt.ylabel('Fundamental frequency (Hz)')
plt.xlabel('Time (s)')
plt.plot(t_clean, f_clean)

#### DERIVATIVES

We now compute the 1st and 2nd order derivatives from which we'll extract the onsets.

<img src="https://i.ytimg.com/vi/3n-z5C30YPE/maxresdefault.jpg" style="width:500px;"/>
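
To see why the derivatives are useful for onsets, here is a minimal sketch on a synthetic pitch curve. The frequencies and the threshold are made-up values, and the criterion (absolute value of the first difference) is a simplification of the one used in this notebook.

```python
import numpy as np

# Synthetic pitch curve in Hz: three held notes of 5 frames each (hypothetical values)
f_demo = np.array([220.0] * 5 + [247.0] * 5 + [262.0] * 5)

# Discrete derivatives: np.diff returns f[i+1] - f[i], so each order is one sample shorter
first_diff = np.diff(f_demo)
second_diff = np.diff(f_demo, n=2)

# A held note gives a derivative close to 0; a note change gives a jump
jump_threshold = 2.0  # Hz per frame, illustrative only
# +1 maps the position of a jump in first_diff to the first frame of the new note
note_starts = np.flatnonzero(np.abs(first_diff) > jump_threshold) + 1

print("First difference:", first_diff)
print("Frames where a new note starts:", note_starts)
```

On a real voice the curve is never perfectly flat (vibrato, glides), which is why the threshold has to be tuned for each recording.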

In [None]:
"----------------DERIVATIVES (ONSETs DETECTION)-------------------------"
#1st Derivative: difference between consecutive f0 values (np.diff is vectorized, so no loop is needed)
derivative = np.diff(f_clean)
     
#2nd Derivative: difference between consecutive values of the 1st derivative
sec_derivative = np.diff(derivative)

In [None]:
#Plotting the 1st order derivative
plt.figure(figsize=(20, 5))
plt.title('1st order Derivative of the Pitch curve after removing f0 whose confidence < {}'.format(min_confidence))
plt.plot(derivative)

#Plotting the 2nd order derivative
plt.figure(figsize=(20, 5))
plt.title('2nd order Derivative of the Pitch curve after removing f0 whose confidence < {}'.format(min_confidence))
plt.plot(sec_derivative)

#### EXTRACTING ONSETS FROM DERIVATIVES OF THE PITCH CURVE

In [None]:
"----------------ONSETs DETECTION from DERIVATIVES-------------------------"
#We can extract the onsets from the 1st order derivative of the pitch curve:
#an onset is marked where the derivative changes by more than 'limit' between consecutive frames
first_der = []
onset_times_first_der = []
limit = 2
for i in range(1, len(derivative)): #start at 1 so derivative[i-1] does not wrap around to the last element
    if abs(derivative[i] - derivative[i-1]) > limit:
        first_der.append(i)
        onset_times_first_der.append(t_clean[i]) 

#We can extract the onsets from the 2nd order derivative of the pitch curve in the same way
sec_der = []
onset_times_sec_der = []
limit = 2
for i in range(1, len(sec_derivative)): #start at 1 so sec_derivative[i-1] does not wrap around to the last element
    if abs(sec_derivative[i] - sec_derivative[i-1]) > limit:
        sec_der.append(i)
        onset_times_sec_der.append(t_clean[i]) 

In [None]:
plt.figure(figsize=(20, 5))
plt.title('1st order Derivative with onsets detection')
plt.plot(derivative, label='1st order derivative') #the label gives plt.legend() an entry to show
for onset_crepe in first_der:
    plt.axvline(onset_crepe, color='magenta', linestyle='dotted')
plt.legend()

plt.figure(figsize=(20, 5))
plt.title('2nd order Derivative with onsets detection')
plt.plot(sec_derivative, label='2nd order derivative') #the label gives plt.legend() an entry to show
for onset_crepe in sec_der:
    plt.axvline(onset_crepe, color='magenta', linestyle='dotted')
plt.legend()

Table of bpm with notes duration in sec: http://www.sengpielaudio.com/calculator-bpmtempotime.htm
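
The values of that table can also be computed directly. The sketch below assumes that the beat is a quarter note; `note_duration_seconds` is a helper name we introduce here for illustration.

```python
def note_duration_seconds(bpm, note_fraction):
    """Duration in seconds of a note at a given tempo.

    note_fraction is the fraction of a whole note (1/4 = quarter note, 1/16 = sixteenth note).
    Assumes that one beat is a quarter note.
    """
    seconds_per_quarter = 60.0 / bpm
    return seconds_per_quarter * 4.0 * note_fraction


for bpm in (60, 120, 180):
    sixteenth = note_duration_seconds(bpm, 1 / 16)
    print("{} bpm -> 1/16 note lasts {:.4f} s".format(bpm, sixteenth))
```

This is useful to choose the minimum distance between onsets: it should stay below the shortest note expected in the song.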

In [None]:
#We clean the onsets by removing those that are closer than 'lim' to the previous onset
lim = 0.08 #sec (a 1/16 note at 180bpm lasts about 0.083sec so we won't lose much information setting the limit to 0.08s)
onset_times_first_der_clean = []
for i in range(len(onset_times_first_der)):
    #the first onset is always kept; index i-1 is only evaluated for i > 0
    if i == 0 or (onset_times_first_der[i] - onset_times_first_der[i-1]) > lim:
        onset_times_first_der_clean.append(onset_times_first_der[i])
#we add the start and the end of the recording as interval boundaries
onset_times_first_der_clean.insert(0,time[0])
onset_times_first_der_clean.append(time[-1])

#We clean the onsets by removing those that are closer than 'lim' to the previous onset
lim = 0.08 #sec (same limit as for the 1st order derivative)
onset_times_sec_der_clean = []
for i in range(len(onset_times_sec_der)):
    #the first onset is always kept; index i-1 is only evaluated for i > 0
    if i == 0 or (onset_times_sec_der[i] - onset_times_sec_der[i-1]) > lim:
        onset_times_sec_der_clean.append(onset_times_sec_der[i])
#we add the start and the end of the recording as interval boundaries
onset_times_sec_der_clean.insert(0,time[0])
onset_times_sec_der_clean.append(time[-1])

### <a name="librosa"></a>1.1.3. LIBROSA ONSETS DETECTION vs CREPE ONSETS DETECTION FROM PITCH CURVE

In [None]:
x, sr = librosa.load(wav_16bit)

hop_length = 250
#the audio is passed as the keyword argument y, which recent librosa versions require
onset_env = librosa.onset.onset_strength(y=x, sr=sr, hop_length=hop_length)

onset_samples = librosa.onset.onset_detect(y=x,
                                           sr=sr, units='samples', 
                                           hop_length=hop_length, 
                                           backtrack=False,
                                           pre_max=20,
                                           post_max=20,
                                           pre_avg=100,
                                           post_avg=100,
                                           delta=0.2,
                                           wait=0)
#we add the start and the end of the recording as interval boundaries
onset_boundaries = np.concatenate([[0], onset_samples, [len(x)]])
onset_times = librosa.samples_to_time(onset_boundaries, sr=sr)

In [None]:
#The x axis of the activation image is the frame index (one frame every 10 ms),
#so the onset times in seconds are multiplied by 100
plt.figure(figsize=(20, 5))
plt.title('CREPE onsets detection from 1st order derivative')
plt.xlabel('time (frames of 10 ms)')
plt.ylabel('Pitch bins')
plt.imshow(1-activation.T, origin='lower', cmap='gray') 
for onset_crepe in onset_times_first_der_clean:
    plt.axvline(onset_crepe*100, color='magenta', linestyle='dotted')

plt.figure(figsize=(20, 5))
plt.title('CREPE onsets detection from 2nd order derivative')
plt.xlabel('time (frames of 10 ms)')
plt.ylabel('Pitch bins')
plt.imshow(1-activation.T, origin='lower', cmap='gray') 
for onset_crepe in onset_times_sec_der_clean:
    plt.axvline(onset_crepe*100, color='magenta', linestyle='dotted')

plt.figure(figsize=(20, 5))
plt.title('Librosa onsets detection')
plt.xlabel('time (frames of 10 ms)')
plt.ylabel('Pitch bins')
plt.imshow(1-activation.T, origin='lower', cmap='gray') 
for onset_librosa in onset_times:
    plt.axvline(onset_librosa*100, color='blue', linestyle='dotted')

In [None]:
#Example comparing the average, the mode and the count-weighted average of f in one onset interval
f_clean_int = [int(round(i)) for i in f_clean] #convert f to integers

limit = np.searchsorted(t_clean, onset_times_first_der_clean[1])
upper_limit = np.searchsorted(t_clean, onset_times_first_der_clean[2])

#The average is
av = np.mean(f_clean_int[limit:upper_limit])
print( "The average of f in the frame interval", '[',limit,',', upper_limit,']', "is : ", av)

#The mode is
data = Counter(f_clean_int[limit:upper_limit])   # Returns all unique items and their counts
mode_value, mode_count = data.most_common(1)[0]  # Highest occurring item and its number of repetitions
print("The mode (most repeated f) is :", mode_value,'which is repeated', mode_count,'times')

mode_ex = data.most_common() #data.most_common() is a list of tuples (element, repetitions)
f_ex, count_ex = zip(*mode_ex) #we split the list of tuples in 2 lists, one will store the value of f and the other the repetitions of each f

#Weighting each distinct f by its number of repetitions gives the same value as the plain average above
mean_ex = np.average(f_ex, weights=count_ex)
print("The weighted average is :", mean_ex)

## <a name="crepe-processing"></a>1.2. CREPE PITCH PROCESSING 

Once we have a good estimate of the onsets, we can assign a single pitch to each interval between consecutive onsets.

(colors for matplotlib plots: https://i.stack.imgur.com/nCk6u.jpg)
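
The idea of choosing one pitch per onset interval can be sketched on toy data as follows. The function `interval_pitch`, the time step and the frequencies are hypothetical and only illustrate the approach (mode of the rounded frequencies, with the mean as fallback when there is no single mode).

```python
from collections import Counter

import numpy as np


def interval_pitch(times, freqs, start, end):
    """Representative integer pitch (Hz) of the frames with start <= t < end, or None if there are none."""
    lo = np.searchsorted(times, start)
    hi = np.searchsorted(times, end)
    if lo == hi:
        return None  # no frame falls inside the interval
    rounded = [int(round(f)) for f in freqs[lo:hi]]
    counts = Counter(rounded).most_common()  # list of (value, repetitions), most repeated first
    if len({c for _, c in counts}) == 1:
        # every value is repeated the same number of times: there is no single mode, use the mean
        return int(np.mean([v for v, _ in counts]))
    return counts[0][0]


# Toy data: 10 frames with a hypothetical step of 0.125 s
t_demo = np.arange(10) * 0.125
f_demo = [220.2, 219.8, 220.1, 233.0, 220.4, 330.1, 329.9, 330.0, 329.7, 350.0]

print(interval_pitch(t_demo, f_demo, 0.0, 0.625))   # first note
print(interval_pitch(t_demo, f_demo, 0.625, 1.25))  # second note
```

The mode is robust to isolated outliers such as the 233 Hz and 350 Hz frames, which would pull a plain average away from the sung note.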

In [None]:
"-------------------------------------------------------------------------"
"------------------STEP 4: PROCESS CREPE PITCH EXTRACTION-----------------"
"-------------------------------------------------------------------------"
#We will write just one f for each onset interval. This pitch will be the mode (the most repeated f) of the interval.
#If every f in the interval is repeated the same number of times there is no single mode,
#so we take the average of the distinct f values instead

mean_list = []
for onset in range(len(onset_times_first_der_clean)-1):
    limit = np.searchsorted(t_clean, onset_times_first_der_clean[onset])
    upper_limit = np.searchsorted(t_clean, onset_times_first_der_clean[onset+1])
    if limit != upper_limit:
        mode = Counter(f_clean_int[limit:upper_limit])
        mode_list = mode.most_common() #mode.most_common() is a list of tuples (element, repetitions)
        f_list, count_list = zip(*mode_list) #we split the list of tuples in 2 lists, one will store the value of f and the other the repetitions of each f
        if len(set(count_list)) == 1: #if all f in the onsets interval have the same repetitions
            mean = np.mean(f_list) #we take the average f value in the onsets interval
        else:
            mean = f_list[0] #mode in the onsets interval
        mean_list.append(mean)  
    else:
        mean_list.append(0.01) #placeholder for an interval without frames; int() below truncates it to 0
mean_list = [int(i) for i in mean_list]

In [None]:
#We can plot the initial activation pitch curve from crepe with the obtained mean after all the previous processing
plt.style.use('dark_background')

plt.figure(figsize=(20, 5))
plt.title('Raw pitch prediction curve with final processed pitches')
plt.xlabel('time (s)')
plt.ylabel('Fundamental frequency f0 (Hz)')
plt.plot(time, frequency) 
for onset_crepe in range(len(onset_times_first_der_clean)-1):
    plt.axvline(onset_times_first_der_clean[onset_crepe], color='magenta', linestyle='dotted')
    plt.hlines(y=mean_list[onset_crepe],
                        xmin=onset_times_first_der_clean[onset_crepe], 
                        xmax=onset_times_first_der_clean[onset_crepe+1], color='yellow')

plt.figure(figsize=(20, 5))
plt.title('CREPE onsets detection from 1st order derivative')
plt.xlabel('time (s)')
plt.ylabel('Fundamental frequency f0 (Hz)')
plt.plot(t_clean, f_clean) 
for onset_crepe in range(len(onset_times_first_der_clean)-1):
    plt.axvline(onset_times_first_der_clean[onset_crepe], color='magenta', linestyle='dotted')
    plt.hlines(y=mean_list[onset_crepe],
                        xmin=onset_times_first_der_clean[onset_crepe], 
                        xmax=onset_times_first_der_clean[onset_crepe+1], color='yellow')
    
plt.figure(figsize=(20, 5))
plt.title('Predicted frequencies')
plt.xlabel('time (s)')
plt.ylabel('Fundamental frequency f0 (Hz)')
for onset_crepe in range(len(onset_times_first_der_clean)-1):
    plt.hlines(y=mean_list[onset_crepe],
                        xmin=onset_times_first_der_clean[onset_crepe], 
                        xmax=onset_times_first_der_clean[onset_crepe+1], color='yellow')

#intervals without frames (f = 0) are mapped to note 0 to avoid computing log2(0)
midi_notes_list = [12*(np.log2(i) - np.log2(440.0)) + 69 if i > 0 else 0 for i in mean_list] 
plt.style.use('dark_background')
plt.figure(figsize=(20, 5))
plt.title('Predicted Pitch: Pianoroll Representation')
plt.xlabel('time (s)')
plt.ylabel('Pitch')
for onset_crepe in range(len(onset_times_first_der_clean)-1):
    plt.hlines(y=midi_notes_list[onset_crepe],
                        xmin=onset_times_first_der_clean[onset_crepe], 
                        xmax=onset_times_first_der_clean[onset_crepe+1], color='#8cffdb', linewidth=7.0)


## <a name="midi"></a>1.3. MIDI WRITING

We now have to convert the frequencies to MIDI notes.

The way to convert f0 to a MIDI note is:
\begin{equation}
\mathrm{MIDInote} = 12 \cdot (\log_2(f) - \log_2(440)) + 69
\end{equation}

<img src="https://i.pinimg.com/originals/fe/13/eb/fe13ebb344f12175dca1b0a63617bf73.gif" style="width:500px;"/>
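
The conversion formula can be checked on a few known notes with a short sketch. The helper names `hz_to_midi` and `midi_to_hz` are introduced here only for illustration.

```python
import numpy as np


def hz_to_midi(frequency_hz):
    """Fractional MIDI note number of a frequency in Hz (A4 = 440 Hz = note 69)."""
    return 12.0 * (np.log2(frequency_hz) - np.log2(440.0)) + 69.0


def midi_to_hz(midi_note):
    """Inverse conversion: MIDI note number to frequency in Hz."""
    return 440.0 * 2.0 ** ((midi_note - 69.0) / 12.0)


for f in (261.63, 440.0, 880.0):
    note = hz_to_midi(f)
    # MIDI files only store integer notes, so the fractional value is rounded
    print(f, "Hz ->", round(float(note), 2), "-> MIDI note", int(round(note)))
```

The rounding step is where small tuning deviations of the voice are absorbed: any frequency within 50 cents of a note is written as that note.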

We'll write 3 MIDI files:
* 1. CREPE_raw: MIDI file written directly from the CREPE pitch curve
* 2. CREPE_raw_clean: MIDI file written after removing the frequencies whose confidence < 0.87
* 3. CREPE_onsets: MIDI file written after the pitch processing done in this notebook
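
MIDIUtil expects the start time and the duration of each note in beats, not in seconds, so every time value has to be converted with the tempo. A minimal sketch of that conversion, with hypothetical onsets and pitches and without writing any file:

```python
def seconds_to_beats(seconds, bpm):
    """Convert a time in seconds to beats at the given tempo."""
    return seconds * bpm / 60.0


bpm = 110
onsets_s = [0.0, 0.5, 1.25, 2.0]  # hypothetical interval boundaries in seconds
pitches = [60, 62, 64]            # hypothetical MIDI note of each interval

events = []
for i, pitch in enumerate(pitches):
    start = seconds_to_beats(onsets_s[i], bpm)
    duration = seconds_to_beats(onsets_s[i + 1] - onsets_s[i], bpm)
    events.append((pitch, start, duration))
    print("note", pitch, "starts at beat", round(start, 3), "and lasts", round(duration, 3), "beats")
```

Each tuple holds the pitch, time and duration values that would be passed to `addNote`. Since the same tempo is written into the file, the notes play back at their original times in seconds whatever value of `bpm` is chosen.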

In [None]:
"-------------------------------------------------------------------------"
"------------------STEP 5: RAW CREPE PITCH CURVE TO MIDI------------------"
"-------------------------------------------------------------------------"    
midi_notes_list_raw = [12*(np.log2(i) - np.log2(440.0)) + 69 for i in frequency] 

# create your MIDI object
mf_raw = MIDIFile(numTracks = 1)     # only 1 track
track_raw = 0   # the only track

time_note_raw = 0    # start at the beginning
bpm = 110
mf_raw.addTrackName(track_raw, time_note_raw, "Sample Track")
mf_raw.addTempo(track_raw, time_note_raw, bpm)
ms_in_1beat = 60000/bpm

# add some notes
channel_raw = 0
volume_raw = 100

for i in range(len(frequency)-1):
    pitch = int(round(midi_notes_list_raw[i])) #MIDI note number
    time_note = time[i]*1000 / ms_in_1beat          # start of the note, in beats
    duration = (time[i+1]-time[i])*1000 / ms_in_1beat        # duration of the note, in beats
    mf_raw.addNote(track_raw, channel_raw, pitch, time_note, duration, volume_raw)
    
# write it to disk
with open('CREPE_raw_' + song + '.mid', 'wb') as outf:
    mf_raw.writeFile(outf)

print("CREPE MIDI file has been created")

In [None]:
"-------------------------------------------------------------------------"
"-------STEP 5: CREPE PITCH CURVE TO MIDI REMOVING <0.87 CONFIDENCE-------"
"-------------------------------------------------------------------------"    
midi_notes_list_clean = [12*(np.log2(i) - np.log2(440.0)) + 69 for i in f_clean] 

# create your MIDI object
mf_raw_clean = MIDIFile(numTracks = 1)     # only 1 track
track_raw = 0   # the only track

time_note_raw = 0    # start at the beginning
bpm = 110
mf_raw_clean.addTrackName(track_raw, time_note_raw, "Sample Track")
mf_raw_clean.addTempo(track_raw, time_note_raw, bpm)
ms_in_1beat = 60000/bpm

# add some notes
channel_raw = 0
volume_raw = 100

for i in range(len(t_clean)-1):
    pitch = int(round(midi_notes_list_clean[i])) #MIDI note number
    time_note = t_clean[i]*1000 / ms_in_1beat          # start of the note, in beats
    duration = (t_clean[i+1]-t_clean[i])*1000 / ms_in_1beat        # duration of the note, in beats
    mf_raw_clean.addNote(track_raw, channel_raw, pitch, time_note, duration, volume_raw)
    
# write it to disk
with open('CREPE_raw_clean_' + song + '.mid', 'wb') as outf:
    mf_raw_clean.writeFile(outf)

print("CREPE MIDI file has been created")

In [None]:
"-------------------------------------------------------------------------"
"------------STEP 5: CONVERT CREPE PITCH PREDICTION TO MIDI---------------"
"-------------------------------------------------------------------------"    
midi_notes_list = []
for i in mean_list:
    if i == 0:
        midi_notes = 0
    else:
        midi_notes = 12*(np.log2(i) - np.log2(440.0)) + 69
    midi_notes_list.append(midi_notes)

# create your MIDI object
mf = MIDIFile(numTracks = 1)     # only 1 track
track = 0   # the only track

time_note = 0    # start at the beginning
bpm = 110
mf.addTrackName(track, time_note, "Sample Track")
mf.addTempo(track, time_note, bpm)
ms_in_1beat = 60000/bpm

# add some notes
channel = 0
volume = 100

for i in range(len(onset_times_first_der_clean)-1):
    pitch = int(round(midi_notes_list[i])) #MIDI note number
    time_note = onset_times_first_der_clean[i]*1000 / ms_in_1beat          # start of the note, in beats
    duration = (onset_times_first_der_clean[i+1]-onset_times_first_der_clean[i])*1000 / ms_in_1beat        # duration of the note, in beats
    mf.addNote(track, channel, pitch, time_note, duration, volume)
    
# write it to disk
with open('CREPE_onsets_' + song + '.mid', 'wb') as outf:
    mf.writeFile(outf)

print("CREPE MIDI file has been created")

## <a name="results"></a>1.4. RESULTS

And now we just play the results

In [None]:
#General function to plot piano-roll representation + midi and wav comparison
import bokeh.plotting #importing the submodule explicitly makes bokeh.plotting available below
import magenta.music as mm

def midi_pianoroll(file):
    print(file)

    note_seq = mm.midi_file_to_sequence_proto(file)

    # This is a colab utility method that visualizes a NoteSequence.
    fig = mm.plot_sequence(note_seq, show_figure=False)
    bokeh.plotting.output_notebook()
    bokeh.plotting.show(fig)

    # This is a colab utility method that plays a NoteSequence.
    mm.play_sequence(note_seq,synth=mm.fluidsynth)
    return

In [None]:
#original song
ipd.Audio(wav_16bit, rate=sr)

In [None]:
"-------------------MIDI-------------------------"
#MIDI file from raw pitch curve
file = 'CREPE_raw_' + song + '.mid'
midi_pianoroll(file)

"-------------------MIDI-------------------------"
#MIDI file from the pitch curve after removing the low-confidence frames
file = 'CREPE_raw_clean_' + song + '.mid'
midi_pianoroll(file)

"-------------------MIDI-------------------------"
#MIDI file after processing (same name used when the file was written)
file = 'CREPE_onsets_' + song + '.mid'
midi_pianoroll(file)

In [None]:
plt.figure(figsize=(20, 5))
plt.title('Pianoroll Representation: raw pitch from CREPE')
plt.grid(axis='y', linewidth=0.3)
plt.xlabel('time (s)')
plt.ylabel('Pitch')
for o in range(len(time)-1):
    plt.hlines(y=midi_notes_list_raw[o],
                        xmin=time[o], 
                        xmax=time[o+1], color='#EEFF42', linewidth=7.0)

plt.figure(figsize=(20, 5))
plt.title('Pianoroll Representation: pitch after removing f whose confidence < 0.87')
plt.grid(axis='y', linewidth=0.3)
plt.xlabel('time (s)')
plt.ylabel('Pitch')
for o in range(len(t_clean)-1):
    plt.hlines(y=midi_notes_list_clean[o],
                        xmin=t_clean[o], 
                        xmax=t_clean[o+1], color='#FF0000', linewidth=7.0)
    
plt.figure(figsize=(20, 5))
plt.title('Pianoroll Representation: Processed pitch')
plt.grid(axis='y', linewidth=0.3)
plt.xlabel('time (s)')
plt.ylabel('Pitch')
for onset_crepe in range(len(onset_times_first_der_clean)-1):
    plt.hlines(y=midi_notes_list[onset_crepe],
                        xmin=onset_times_first_der_clean[onset_crepe], 
                        xmax=onset_times_first_der_clean[onset_crepe+1], color='#8cffdb', linewidth=7.0)

In [None]:
#plotting all together
plt.figure(figsize=(20, 5))
plt.title('Pianoroll Representation: raw, confidence-filtered and processed pitch')
plt.grid(axis='y', linewidth=0.3)
plt.xlabel('time (s)')
plt.ylabel('Pitch')

for o in range(len(time)-1):
    plt.hlines(y=midi_notes_list_raw[o],
                        xmin=time[o], 
                        xmax=time[o+1], color='#EEFF42', linewidth=7.0)

for o in range(len(t_clean)-1):
    plt.hlines(y=midi_notes_list_clean[o],
                        xmin=t_clean[o], 
                        xmax=t_clean[o+1], color='#FF0000', linewidth=7.0)

for onset_crepe in range(len(onset_times_first_der_clean)-1):
    plt.hlines(y=midi_notes_list[onset_crepe],
                        xmin=onset_times_first_der_clean[onset_crepe], 
                        xmax=onset_times_first_der_clean[onset_crepe+1], color='#8cffdb', linewidth=7.0)


In [None]:
#@title Download raw MIDI File (optional)

files.download('CREPE_raw_' + song + '.mid')

In [None]:
#@title Download raw clean MIDI File (optional)

files.download('CREPE_raw_clean_' + song + '.mid')

In [None]:
#@title Download Processed MIDI File (optional)

files.download('CREPE_onsets_' + song + '.mid')

## <a name="crepe-conclusions"></a>1.5. CONCLUSIONS AND FUTURE WORK

* Silences are not removed yet -> they could be detected from the frames whose f0 was discarded for having a confidence below 0.87
* The limits set on the derivatives to extract the onsets depend on the song and on voice effects -> is onsets detection necessary?
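
As a starting point for the silence removal, consecutive low-confidence frames could be grouped into candidate silence intervals. This is only a sketch of one possible approach: the function `low_confidence_intervals`, the `min_frames` parameter and the toy confidence values are assumptions, and it has not been evaluated on real recordings.

```python
import numpy as np


def low_confidence_intervals(times, confidence, min_confidence=0.87, min_frames=3):
    """Group consecutive frames with confidence < min_confidence into (start_time, end_time) intervals.

    Runs shorter than min_frames are ignored, since they are more likely glitches than silences.
    """
    low = np.asarray(confidence) < min_confidence
    # Pad with False on both sides so every run has a rising and a falling edge
    padded = np.concatenate([[False], low, [False]])
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)   # index of the first frame of each run
    ends = np.flatnonzero(edges == -1)    # index just after the last frame of each run
    return [(times[s], times[e - 1]) for s, e in zip(starts, ends) if e - s >= min_frames]


# Toy data: 10 frames, one frame every 10 ms
t_demo = np.arange(10) * 0.01
conf_demo = [0.95, 0.90, 0.30, 0.20, 0.10, 0.92, 0.50, 0.93, 0.40, 0.30]

intervals = low_confidence_intervals(t_demo, conf_demo)
print(intervals)
```

Intervals found this way could be left without notes when writing the MIDI file, instead of being merged into the neighbouring notes.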

REFERENCES

* https://github.com/marl/crepe

<a href="#top">Back to top</a>

<img src="https://4.bp.blogspot.com/-WELZsAfX1U0/Vl7UxvJNHdI/AAAAAAAAF34/9Kl1x1y0Uv4/s1600/separador.png" style="width:500px;"/>