<a href="https://colab.research.google.com/github/Juanvr/Dathoven/blob/main/notebooks/0%20-%20Dathoven%20-%20Exploring%20MIDI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Exploring MIDI

The goal of this notebook is to find a way of processing MIDI data in Python.

Candidate Python libraries for MIDI include:

- MIDIFile
- mido
- pretty_midi
- Music21

After some trial and error, I found Music21 to be the Python library best suited to this project's needs.


## PrettyMIDI

PrettyMIDI (the `pretty_midi` package) is a library developed by Colin Raffel and released under the MIT license.

In [1]:
! pip install pretty_midi



## Get MIDI Examples

In [2]:
!mkdir -p examples


In [3]:
import requests

base_url = 'https://github.com/Juanvr/Dathoven/raw/main/examples/'
for file_name in ['silent_night_easy.mid', 'Queen-We_Are_The_Champions.mid']:
    r = requests.get(base_url + file_name, allow_redirects=True)
    r.raise_for_status()  # stop early if the download failed
    # Write inside a context manager so the file is closed properly
    with open(f'examples/{file_name}', 'wb') as f:
        f.write(r.content)

## Silent Night

For this first example I'll use a very simple MIDI file: Silent Night.

In [4]:
import pretty_midi

In [5]:
pm = pretty_midi.PrettyMIDI('examples/silent_night_easy.mid')

We can play the song right here in the Jupyter notebook. First we install FluidSynth and render the MIDI file to a WAV file:

In [6]:
!apt install fluidsynth
!cp /usr/share/sounds/sf2/FluidR3_GM.sf2 ./font.sf2

Reading package lists... Done
Building dependency tree       
Reading state information... Done
fluidsynth is already the newest version (1.1.9-1).
0 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.


In [9]:
!fluidsynth -ni font.sf2 examples/silent_night_easy.mid -F output.wav -r 44100

FluidSynth version 1.1.9
Copyright (C) 2000-2018 Peter Hanappe and others.
Distributed under the LGPL license.
SoundFont(R) is a registered trademark of E-mu Systems, Inc.

Rendering audio to file 'output.wav'..


In [10]:
from IPython.display import Audio
Audio('output.wav')

We can list the instruments present in the file: 

In [11]:
pm.instruments

[Instrument(program=0, is_drum=False, name="Piano")]

This simple file has only one instrument, a piano.

## We Are the Champions - Queen

If we analyze a more complex song, we can see that MIDI offers many more instruments.

In [12]:
!fluidsynth -ni font.sf2 examples/Queen-We_Are_The_Champions.mid -F output.wav -r 44100

FluidSynth version 1.1.9
Copyright (C) 2000-2018 Peter Hanappe and others.
Distributed under the LGPL license.
SoundFont(R) is a registered trademark of E-mu Systems, Inc.

Rendering audio to file 'output.wav'..


In [13]:
Audio('output.wav')

In [14]:
pm = pretty_midi.PrettyMIDI('examples/Queen-We_Are_The_Champions.mid')

In [15]:
pm.instruments

[Instrument(program=48, is_drum=False, name=""),
 Instrument(program=48, is_drum=False, name=""),
 Instrument(program=0, is_drum=False, name=""),
 Instrument(program=33, is_drum=False, name=""),
 Instrument(program=25, is_drum=False, name=""),
 Instrument(program=0, is_drum=True, name=""),
 Instrument(program=52, is_drum=False, name=""),
 Instrument(program=29, is_drum=False, name=""),
 Instrument(program=30, is_drum=False, name="")]

pretty_midi can map each MIDI program number to its instrument name. Here we print that name together with the number of notes each instrument plays:

In [16]:
for instrument in pm.instruments:
    print(f"Instrument {pretty_midi.program_to_instrument_name(instrument.program)}, {len(instrument.notes)}")

Instrument String Ensemble 1, 164
Instrument String Ensemble 1, 164
Instrument Acoustic Grand Piano, 831
Instrument Electric Bass (finger), 192
Instrument Acoustic Guitar (steel), 52
Instrument Acoustic Grand Piano, 494
Instrument Choir Aahs, 116
Instrument Overdriven Guitar, 130
Instrument Distortion Guitar, 110
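
Note that the sixth entry is the drum track (`is_drum=True`), yet it is printed as Acoustic Grand Piano because its program number is 0. A minimal sketch of a labelling helper that treats drum tracks separately could look like this. `InstrumentInfo`, `GM_NAMES` and `describe` are hypothetical names introduced for illustration, and `GM_NAMES` only covers the programs that appear in this song; with pretty_midi available, `pretty_midi.program_to_instrument_name` would replace the dictionary lookup.

```python
from dataclasses import dataclass

# Partial General MIDI program table: only the programs used in this song.
GM_NAMES = {
    0: "Acoustic Grand Piano",
    25: "Acoustic Guitar (steel)",
    29: "Overdriven Guitar",
    30: "Distortion Guitar",
    33: "Electric Bass (finger)",
    48: "String Ensemble 1",
    52: "Choir Aahs",
}


@dataclass
class InstrumentInfo:
    """Hypothetical stand-in for the pretty_midi.Instrument fields used here."""
    program: int
    is_drum: bool


def describe(instrument):
    # On a drum track the program number does not select a melodic instrument.
    if instrument.is_drum:
        return "Drum Kit"
    return GM_NAMES.get(instrument.program, f"Program {instrument.program}")


print(describe(InstrumentInfo(program=0, is_drum=False)))
print(describe(InstrumentInfo(program=0, is_drum=True)))
```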


## Getting the Notes

For each instrument we have an array of notes: 

In [17]:
for note in pm.instruments[2].notes[:15]:
    print(note)

Note(start=4.736842, end=5.039474, pitch=36, velocity=118)
Note(start=5.684211, end=6.006579, pitch=60, velocity=127)
Note(start=6.000001, end=6.361843, pitch=63, velocity=96)
Note(start=6.315790, end=6.447369, pitch=60, velocity=101)
Note(start=5.684211, end=6.552632, pitch=67, velocity=118)
Note(start=5.052632, end=6.631579, pitch=48, velocity=114)
Note(start=5.368422, end=6.717106, pitch=55, velocity=116)
Note(start=6.631579, end=6.730264, pitch=65, velocity=98)
Note(start=6.631579, end=6.730264, pitch=58, velocity=86)
Note(start=6.631579, end=6.736843, pitch=62, velocity=96)
Note(start=6.631579, end=6.743422, pitch=43, velocity=100)
Note(start=6.947369, end=8.138159, pitch=58, velocity=107)
Note(start=6.947369, end=8.532895, pitch=43, velocity=109)
Note(start=8.210527, end=8.539474, pitch=58, velocity=101)
Note(start=6.947369, end=8.552632, pitch=55, velocity=109)


Each note is defined by its start and end times (in seconds), its pitch (a MIDI note number) and its velocity.

(Velocity is the force with which a note is played)
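
The pitch field is a MIDI note number, so a note name and a frequency can be derived from it with simple arithmetic. The sketch below assumes sharps-only naming with middle C (note 60) written as C4, and equal temperament with A4 tuned to 440 Hz. `pitch_to_name` and `pitch_to_hz` are hypothetical helpers written for illustration, not part of pretty_midi.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_to_name(pitch):
    # MIDI note 60 is middle C, written C4 in the octave convention assumed here.
    return f"{NOTE_NAMES[pitch % 12]}{pitch // 12 - 1}"


def pitch_to_hz(pitch):
    # Equal temperament with A4 (MIDI note 69) tuned to 440 Hz.
    return 440.0 * 2 ** ((pitch - 69) / 12)


for pitch in (36, 60, 67):
    print(pitch, pitch_to_name(pitch), round(pitch_to_hz(pitch), 2))
```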

If we just want a harmonic approach to music analysis, it is sensible to use only the pitched instruments. This means no drums:

In [18]:
all_notes = []
for instrument in pm.instruments:
    # Skip drum tracks: their note numbers select percussion sounds, not pitches
    if instrument.is_drum:
        continue
    all_notes.extend(instrument.notes)

In [19]:
len(all_notes)

1759

Let's put all these notes in a pandas DataFrame, together with the name of each note:

In [20]:
!pip install pandas



In [21]:
import pandas as pd

In [22]:
df = pd.DataFrame()
df["pitch"]= pd.Series(note.pitch for note in all_notes)
df["note_name"] = pd.Series(pretty_midi.note_number_to_name(note.pitch) for note in all_notes)

df.head()

| | pitch | note_name |
|---|---|---|
| 0 | 55 | G3 |
| 1 | 58 | A#3 |
| 2 | 60 | C4 |
| 3 | 60 | C4 |
| 4 | 55 | G3 |


At this point we can read a MIDI file, keep only the instruments that aren't drums and put all their notes in a pandas DataFrame.

Yay!


We can encapsulate this functionality in a Python function for easier use:

In [23]:
def get_all_notes_sorted(file_path):
    """Return every non-drum note in the MIDI file, sorted by start time."""
    pm = pretty_midi.PrettyMIDI(file_path)
    all_notes = []
    for instrument in pm.instruments:
        # Skip drum tracks: their note numbers select percussion sounds, not pitches
        if instrument.is_drum:
            continue
        all_notes.extend(instrument.notes)
    all_notes.sort(key=lambda note: note.start)
    return all_notes

In [24]:
all_notes = get_all_notes_sorted('examples/silent_night_easy.mid')
all_notes[:5]

[Note(start=0.000000, end=0.916406, pitch=67, velocity=53),
 Note(start=0.000000, end=1.800000, pitch=60, velocity=53),
 Note(start=0.900000, end=1.204687, pitch=69, velocity=56),
 Note(start=1.200000, end=1.811719, pitch=67, velocity=61),
 Note(start=1.800000, end=3.600000, pitch=60, velocity=57)]

## Getting the Chords

We would like to achieve the same thing that we did with notes, but now with chords. We want a function that gets the path of a MIDI file and returns the groups of notes that sound at the same time in that song. 

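We create a function that tells us whether a note is sounding at a given time. The note must have started and must keep sounding for more than 0.2 seconds after that time, so notes that are just about to end are ignored: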

In [25]:
def is_note_sounding(note, time):
    return note.start <= time and note.end > time + 0.2
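
To see what the 0.2-second margin does, the sketch below repeats the check on two notes taken from the start of Silent Night and exposes the margin as a parameter. `SimpleNote` is a hypothetical stand-in for `pretty_midi.Note`, and the `margin` argument is an addition made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in exposing the two pretty_midi.Note fields the check needs.
SimpleNote = namedtuple("SimpleNote", ["start", "end"])


def is_note_sounding(note, time, margin=0.2):
    # The note has started and still sounds more than `margin` seconds later.
    return note.start <= time and note.end > time + margin


long_note = SimpleNote(start=0.0, end=1.8)
short_note = SimpleNote(start=0.0, end=0.916406)

print(is_note_sounding(long_note, 0.9))    # still has 0.9 s left to sound
print(is_note_sounding(short_note, 0.9))   # ends about 0.016 s later
print(is_note_sounding(short_note, 0.9, margin=0.0))
```

With the default margin, the short note is not counted at 0.9 seconds, which keeps a note that barely overlaps the next one from being grouped with it.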

This function returns all the notes that are sounding at a given time:

In [26]:
def notes_sounding(notes, time):
    return list(filter(lambda note: is_note_sounding(note, time), notes))

Next we collect the relevant points in time of our song, which we call ticks (they are times in seconds, not MIDI ticks). We take the start time of each note:

In [27]:
ticks = []
for note in all_notes:
    ticks += [note.start]

ticks.sort()
ticks = list(dict.fromkeys(ticks))
ticks[:10]

[0.0,
 0.8999999999999999,
 1.2,
 1.7999999999999998,
 3.5999999999999996,
 4.5,
 4.8,
 5.3999999999999995,
 6.0,
 6.6]
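
The start times are floats, which is why values such as 0.8999999999999999 appear. Removing exact duplicates works when simultaneous notes share the very same float. If they differed by rounding noise, rounding before deduplicating would merge them. This is a sketch under that assumption; `unique_ticks` and its `decimals` parameter are hypothetical.

```python
def unique_ticks(start_times, decimals=6):
    # Round first so starts that differ only by float noise collapse into one tick.
    return sorted({round(t, decimals) for t in start_times})


starts = [0.0, 0.0, 0.8999999999999999, 0.9, 1.2, 1.7999999999999998]
print(unique_ticks(starts))
```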

We use these ticks to get all the chords in our song:

In [28]:
chords = []
for tick in ticks:
    chords += [notes_sounding(all_notes, tick)]
chords[:5]

[[Note(start=0.000000, end=0.916406, pitch=67, velocity=53),
  Note(start=0.000000, end=1.800000, pitch=60, velocity=53)],
 [Note(start=0.000000, end=1.800000, pitch=60, velocity=53),
  Note(start=0.900000, end=1.204687, pitch=69, velocity=56)],
 [Note(start=0.000000, end=1.800000, pitch=60, velocity=53),
  Note(start=1.200000, end=1.811719, pitch=67, velocity=61)],
 [Note(start=1.800000, end=3.600000, pitch=60, velocity=57),
  Note(start=1.800000, end=3.635156, pitch=64, velocity=54)],
 [Note(start=3.600000, end=4.516406, pitch=67, velocity=65),
  Note(start=3.600000, end=5.400000, pitch=60, velocity=64)]]

We need a function that turns these lists of notes into lists of note names:

In [29]:
def note_array_number_to_name(note_array):
    return [pretty_midi.note_number_to_name(note.pitch) for note in note_array]

In [30]:
def note_matrix_number_to_name(note_matrix):
    return list(map(note_array_number_to_name, note_matrix))

In [31]:
note_matrix_number_to_name(chords)[:20]

[['G4', 'C4'],
 ['C4', 'A4'],
 ['C4', 'G4'],
 ['C4', 'E4'],
 ['G4', 'C4'],
 ['C4', 'A4'],
 ['C4', 'G4'],
 ['C4', 'E4'],
 ['E4', 'B3'],
 ['E4', 'A3'],
 ['D5', 'G3'],
 ['G3', 'D5'],
 ['G3', 'B4'],
 ['B4', 'G3'],
 ['C5', 'C4'],
 ['C4', 'C5'],
 ['G4', 'C4'],
 ['A4', 'F3'],
 ['F3', 'A4'],
 ['C5', 'F3']]

These are the groups of notes present in Silent Night, the song we were analysing.

We can name these chords using the music21 library:

In [32]:
pip install --upgrade music21

Requirement already up-to-date: music21 in /usr/local/lib/python3.7/dist-packages (6.7.1)


In [33]:
from music21 import chord

# music21 writes flats with "-", so "E-5" is E-flat 5
cMinor = chord.Chord(["C4", "G4", "E-5"])
cMinor.pitchedCommonName

'C-minor triad'

We would like to have a list of all the chords, and when they started sounding: 

In [34]:
chords = []
for tick in ticks:
    chords.append((notes_sounding(all_notes, tick), tick))

In [35]:
pre_chords = [(note_array_number_to_name(x[0]), x[1]) for x in chords]

In [36]:
timed_chords = [(chord.Chord(x[0]).pitchedCommonName, x[1]) for x in pre_chords]
timed_chords[:6]

[('Perfect Fifth above C', 0.0),
 ('Major Sixth above C', 0.8999999999999999),
 ('Perfect Fifth above C', 1.2),
 ('Major Third above C', 1.7999999999999998),
 ('Perfect Fifth above C', 3.5999999999999996),
 ('Major Sixth above C', 4.5)]
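
Because each of these groups has only two notes, music21 reports an interval above the lower note rather than a chord name. The sketch below is a rough illustration of where such names come from, based only on the semitone distance between the two pitches. `INTERVAL_NAMES` and `name_dyad` are hypothetical, only intervals up to an octave are covered, and enharmonic spelling is ignored (a distance of six semitones is simply called a tritone), all of which music21 handles properly.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
INTERVAL_NAMES = {
    0: "Perfect Unison", 1: "Minor Second", 2: "Major Second", 3: "Minor Third",
    4: "Major Third", 5: "Perfect Fourth", 6: "Tritone", 7: "Perfect Fifth",
    8: "Minor Sixth", 9: "Major Sixth", 10: "Minor Seventh", 11: "Major Seventh",
    12: "Perfect Octave",
}


def name_dyad(pitch_a, pitch_b):
    # Order the two MIDI pitches so the interval is measured from the lower note.
    low, high = sorted((pitch_a, pitch_b))
    semitones = high - low
    if semitones not in INTERVAL_NAMES:
        raise ValueError("compound intervals are outside this sketch")
    return f"{INTERVAL_NAMES[semitones]} above {NOTE_NAMES[low % 12]}"


print(name_dyad(67, 60))  # G4 and C4
print(name_dyad(60, 69))  # C4 and A4
print(name_dyad(60, 64))  # C4 and E4
```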

In [37]:
def previous_chords(start, timed_chords):
    # Keep only the (name, time) pairs that begin strictly before `start`
    return [x for x in timed_chords if x[1] < start]

We can now build a DataFrame that lists, for each note, up to ten chords that started before it:

In [38]:
df = pd.DataFrame()
df["pitch"]= pd.Series(note.pitch for note in pm.instruments[0].notes)
df["note_name"] = pd.Series(pretty_midi.note_number_to_name(note.pitch) for note in pm.instruments[0].notes)
df["start"] = pd.Series(note.start for note in pm.instruments[0].notes)


for i in range(10): 
    df[f"prev_chord {i+1}"] = pd.Series( previous_chords(note.start, timed_chords)[-i][0] if len(previous_chords(note.start, timed_chords)) > i else ''  for note in pm.instruments[0].notes)

df.head(10)

| | pitch | note_name | start | prev_chord 1 | prev_chord 2 | prev_chord 3 | prev_chord 4 | prev_chord 5 | prev_chord 6 | prev_chord 7 | prev_chord 8 | prev_chord 9 | prev_chord 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 55 | G3 | 3.769737 | Perfect Fifth above C | Perfect Fifth above C | Major Third above C | Perfect Fifth above C | Major Sixth above C | | | | | |
| 1 | 58 | A#3 | 4.105264 | Perfect Fifth above C | Perfect Fifth above C | Major Third above C | Perfect Fifth above C | Major Sixth above C | | | | | |
| 2 | 60 | C4 | 4.414474 | Perfect Fifth above C | Perfect Fifth above C | Major Third above C | Perfect Fifth above C | Major Sixth above C | | | | | |
| 3 | 60 | C4 | 4.75 | Perfect Fifth above C | Major Sixth above C | Perfect Fifth above C | Major Third above C | Perfect Fifth above C | Major Sixth above C | | | | |
| 4 | 55 | G3 | 7.559211 | Perfect Fifth above C | Perfect Twelfth above G | Perfect Fifth above A | Perfect Fourth above B | Major Third above C | Perfect Fifth above C | Major Sixth above C | Perfect Fifth above C | Major Third above C | Perfect Fifth above C |
| 5 | 58 | A#3 | 7.953948 | Perfect Fifth above C | Perfect Twelfth above G | Perfect Fifth above A | Perfect Fourth above B | Major Third above C | Perfect Fifth above C | Major Sixth above C | Perfect Fifth above C | Major Third above C | Perfect Fifth above C |
| 6 | 60 | C4 | 8.25658 | Perfect Fifth above C | Perfect Twelfth above G | Perfect Fifth above A | Perfect Fourth above B | Major Third above C | Perfect Fifth above C | Major Sixth above C | Perfect Fifth above C | Major Third above C | Perfect Fifth above C |
| 7 | 60 | C4 | 8.56579 | Perfect Fifth above C | Perfect Twelfth above G | Perfect Twelfth above G | Perfect Fifth above A | Perfect Fourth above B | Major Third above C | Perfect Fifth above C | Major Sixth above C | Perfect Fifth above C | Major Third above C |
| 8 | 55 | G3 | 11.335527 | Perfect Fifth above C | Perfect Octave above C | Major Tenth above G | Major Tenth above G | Perfect Twelfth above G | Perfect Twelfth above G | Perfect Fifth above A | Perfect Fourth above B | Major Third above C | Perfect Fifth above C |
| 9 | 58 | A#3 | 11.69079 | Perfect Fifth above C | Perfect Octave above C | Major Tenth above G | Major Tenth above G | Perfect Twelfth above G | Perfect Twelfth above G | Perfect Fifth above A | Perfect Fourth above B | Major Third above C | Perfect Fifth above C |


In [39]:
def get_timed_chords(all_notes):
    # Derive the ticks from the notes passed in instead of relying on the global `ticks`
    ticks = sorted({note.start for note in all_notes})
    timed_chords = []
    for tick in ticks:
        names = note_array_number_to_name(notes_sounding(all_notes, tick))
        timed_chords.append((chord.Chord(names).pitchedCommonName, tick))
    return timed_chords

In [40]:
def get_notes_and_chords(midi_file_path):
    all_notes = get_all_notes_sorted(midi_file_path)
    timed_chords = get_timed_chords(all_notes)
    df = pd.DataFrame()
    df["pitch"]= pd.Series(note.pitch for note in all_notes)
    df["note_name"] = pd.Series(pretty_midi.note_number_to_name(note.pitch) for note in all_notes)
    df["start"] = pd.Series(note.start for note in all_notes)


    for i in range(10): 
        df[f"prev_chord {i+1}"] = pd.Series( previous_chords(note.start, timed_chords)[-i][0] if len(previous_chords(note.start, timed_chords)) > i else ''  for note in all_notes)
    return df

In [41]:
get_notes_and_chords('examples/silent_night_easy.mid')

| | pitch | note_name | start | prev_chord 1 | prev_chord 2 | prev_chord 3 | prev_chord 4 | prev_chord 5 | prev_chord 6 | prev_chord 7 | prev_chord 8 | prev_chord 9 | prev_chord 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 67 | G4 | 0.0 | | | | | | | | | | |
| 1 | 60 | C4 | 0.0 | | | | | | | | | | |
| 2 | 69 | A4 | 0.9 | Perfect Fifth above C | | | | | | | | | |
| 3 | 67 | G4 | 1.2 | Perfect Fifth above C | Major Sixth above C | | | | | | | | |
| 4 | 60 | C4 | 1.8 | Perfect Fifth above C | Perfect Fifth above C | Major Sixth above C | | | | | | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 69 | 55 | G3 | 37.8 | Perfect Fifth above C | Major Third above C | Perfect Fifth above C | Perfect Octave above C | Major Tenth above C | Perfect Octave above C | Major Tenth above G | Perfect Twelfth above G | Minor Fourteenth above G | Perfect Twelfth above G |
| 70 | 65 | F4 | 38.7 | Perfect Fifth above C | Perfect Octave above G | Major Third above C | Perfect Fifth above C | Perfect Octave above C | Major Tenth above C | Perfect Octave above C | Major Tenth above G | Perfect Twelfth above G | Minor Fourteenth above G |
| 71 | 62 | D4 | 39.0 | Perfect Fifth above C | Minor Seventh above G | Perfect Octave above G | Major Third above C | Perfect Fifth above C | Perfect Octave above C | Major Tenth above C | Perfect Octave above C | Major Tenth above G | Perfect Twelfth above G |
| 72 | 60 | C4 | 39.6 | Perfect Fifth above C | Perfect Fifth above G | Minor Seventh above G | Perfect Octave above G | Major Third above C | Perfect Fifth above C | Perfect Octave above C | Major Tenth above C | Perfect Octave above C | Major Tenth above G |
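
In the tables above, `prev_chord 1` reads Perfect Fifth above C for every note that has a preceding chord. The lookup uses `[-i]`, and for `i = 0` that is index 0, the first chord of the song rather than the most recent one. A minimal sketch of a lookup in which column `k` holds the k-th most recent chord follows. `last_chords` is a hypothetical helper, and the sample data copies the first entries of `timed_chords` with rounded times.

```python
def last_chords(start, timed_chords, count=10):
    # Names of the chords that begin strictly before `start`, oldest first.
    earlier = [name for name, time in timed_chords if time < start]
    # Most recent first, padded with '' so every note gets `count` columns.
    recent = earlier[::-1][:count]
    return recent + [""] * (count - len(recent))


timed_chords = [
    ("Perfect Fifth above C", 0.0),
    ("Major Sixth above C", 0.9),
    ("Perfect Fifth above C", 1.2),
    ("Major Third above C", 1.8),
]

print(last_chords(1.2, timed_chords, count=3))
```

Computing the list once per note also avoids calling `previous_chords` twice for each of the ten columns.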


## Music21 for MIDI

We explore the possibilities of Music21:

In [42]:
from music21 import converter, corpus, instrument, midi, note, chord, pitch, stream, interval

In [43]:
def get_stream_from_midi_without_drums(midi_path):
    mf = midi.MidiFile()
    mf.open(midi_path)
    mf.read()
    mf.close()

    # General MIDI reserves channel 10 for percussion, so drop its events
    for track in mf.tracks:
        track.events = [ev for ev in track.events if ev.channel != 10]

    return midi.translate.midiFileToStream(mf)
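
General MIDI reserves channel 10 for percussion, which is why the function above drops those events. The sketch below applies the same filter to stand-in data so its effect is easy to see. `Event` is a hypothetical type, and the 1-based channel numbering is an assumption about how music21 reports channels (a library that counts channels from 0 would use 9 instead).

```python
from collections import namedtuple

# Hypothetical stand-in for a MIDI event; channel is None for events without one.
Event = namedtuple("Event", ["kind", "channel"])

DRUM_CHANNEL = 10  # General MIDI percussion channel, counting from 1


def without_drums(events):
    # Events with no channel (for example delta times) are kept.
    return [ev for ev in events if ev.channel != DRUM_CHANNEL]


events = [
    Event("DELTA_TIME", None),
    Event("NOTE_ON", 1),
    Event("NOTE_ON", 10),
    Event("NOTE_OFF", 10),
    Event("NOTE_OFF", 1),
]
print(without_drums(events))
```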

In [44]:
def stream_to_array_of_notes_strings(midi_stream):
    result = []
    for element in midi_stream.flat.notes:
        if isinstance(element, note.Note):
            result.append(element.nameWithOctave)
        else:  # it's a chord: join the names of its notes
            result.append(' '.join(n.nameWithOctave for n in element.notes))
    return result

In [45]:
def from_midi_to_array_of_notes (midi_path):
    return stream_to_array_of_notes_strings(get_stream_from_midi_without_drums(midi_path))

In [46]:
from_midi_to_array_of_notes('examples/silent_night_easy.mid')

['G4',
 'C4',
 'A4',
 'G4',
 'E4 C4',
 'G4',
 'C4',
 'A4',
 'G4',
 'E4',
 'C4',
 'B3',
 'A3',
 'D5',
 'G3',
 'D5',
 'B4',
 'G3',
 'G3',
 'C5',
 'C4',
 'C5',
 'G4',
 'C4',
 'A4',
 'F3',
 'A4',
 'C5',
 'F3',
 'B4',
 'A4',
 'G4',
 'C4',
 'A4',
 'G4',
 'E4',
 'C4',
 'A4',
 'F3',
 'A4',
 'C5',
 'F3',
 'B4',
 'A4',
 'G4',
 'C4',
 'A4',
 'G4',
 'E4',
 'C4',
 'B3',
 'A3',
 'D5',
 'G3',
 'D5',
 'F5',
 'G3',
 'D5',
 'B4',
 'C5 C4',
 'E5',
 'C4',
 'C5',
 'C4',
 'G4',
 'E4',
 'G4',
 'G3',
 'F4',
 'D4',
 'C4',
 'C4']

With Music21 we have everything in one place: MIDI reading and music notation. It looks like we have a winner.