# MIDI Datasets for Training

There are many great MIDI Datasets out there, however not all are simply lists of MIDI files but may often come in various formats, including xml, etc. 


## Setup Notebook

In [1]:
#setup google drive mount (for Colab only)
#from google.colab import drive
#drive.mount('/content/drive')

## MIDI Datasets Available Online

There are many different types of MIDI datasets you could use. We'll discuss a few good ones here, because they are often used in the literature. 

### MuseData

[MuseData](https://musedata.org/) is a dataset of 100,000 MIDI files, each of which contains a single note played on a piano. The dataset is split into 100 different MIDI files

### Giant MIDI Piano
* Piano Midi Dataset: https://paperswithcode.com/dataset/giantmidi-piano

### Magenta Datasets
* Datasets used by Magenta (Google Midi-based Generative models): https://magenta.tensorflow.org/datasets/

### Lakh Music Dataset
* https://paperswithcode.com/dataset/lakh-midi-dataset
  
### Bach Doodle Dataset
* Bach Doodle Dataset: https://magenta.tensorflow.org/datasets/bach_doodle

In [None]:
data_path = '/content/drive/MyDrive/Colab Notebooks/train-data/bach-midi-dataset'

In [None]:
!tar -xvf '/content/drive/MyDrive/Colab Notebooks/train-data/bach-midi-dataset/bach-doodle.tfrecord.tar.gz' -C '/content/drive/MyDrive/Colab Notebooks/train-data/bach-midi-dataset/'

In [None]:
# load in the data
# bach dataset from magenta: https://magenta.tensorflow.org/datasets/bach-doodle#download
from torchdata.datapipes.iter import FileLister, FileOpener
datapipe1 = FileLister(data_path,"*.tfrecord-*")
print(len(list(datapipe1)))
datapipe2 = FileOpener(datapipe1, mode="b")
tfrecord_loader_dp = datapipe2.load_from_tfrecord()
example_1 = None
for example in tfrecord_loader_dp:
    example_1 = example
    break

print(example_1.keys())
print(example_1['backend'])

192


### TheoryTab - Reading from XML files and preprocessing
 

But, midi datasets don't always come in that format. Many datasets like the TheoryTab dataset come in XML format, which includes byte streams for each note which need to be parsed. Eventually we want to convert these into piano rolls for each bar of music. This example below is for MelodyRNN where the inputs are a sequence of 16 notes for each sample (i.e. 16 quarter notes per bar). 

Let's take a look at one of these XML files: 

```xml
<theorytab>
  <version>1.2</version>
  <meta>
    <artist>Daddy Yankee</artist>
    <title>Limbo</title>
    <beats_in_measure>4</beats_in_measure>
    <BPM>124</BPM>
    <key>G</key>
    <YouTubeID>6BTjG-dhf5s</YouTubeID>
    <mode>1</mode>
    <band>
      <lead>
        <member>
          <name>Piano</name>
          <velocity>0.8</velocity>
          <octaveOffset>0</octaveOffset>
          <mute>false</mute>
        </member>
      </lead>
      <lead2/>
      <lead3/>
      <lead4/>
      <harmony>
        <member>
          <name>Piano 1/4s</name>
          <velocity>0.8</velocity>
          <octaveOffset>0</octaveOffset>
          <mute>false</mute>
        </member>
      </harmony>
      <bass>
        <member>
          <name>Piano Bass Dotted</name>
          <velocity>0.8</velocity>
          <octaveOffset>0</octaveOffset>
          <mute>false</mute>
        </member>
      </bass>
      <drums/>
    </band>
    <global_start>122.99</global_start>
    <duration>19.21</duration>
    <active_start>1.92</active_start>
    <active_stop>17.29</active_stop>
  </meta>
  <data>
    <segment>
      <melody>
        <voice>
          <lyrics/>
          <notes>
            <note>
              <start_beat_abs>0</start_beat_abs>
              <start_measure>1</start_measure>
              <start_beat>1</start_beat>
              <note_length>0.25</note_length>
              <scale_degree>1</scale_degree>
              <octave>0</octave>
              <isRest>0</isRest>
            </note>
            <note>
              <start_beat_abs>0.25</start_beat_abs>
              <start_measure>1</start_measure>
              <start_beat>1.25</start_beat>
              <note_length>0.5</note_length>
              <scale_degree>1</scale_degree>
              <octave>0</octave>
              <isRest>0</isRest>
            </note>
            <note>
              <start_beat_abs>0.75</start_beat_abs>
              <start_measure>1</start_measure>
              <start_beat>1.75</start_beat>
              <note_length>0.5</note_length>
              <scale_degree>1</scale_degree>
              <octave>0</octave>
              <isRest>0</isRest>
            </note>
            ...
          </notes>
        </voice>
      </melody>
      <harmony>
        <chord>
          <sd>6</sd>
          <fb/>
          <sec/>
          <sus/>
          <pedal/>
          <alternate/>
          <borrowed/>
          <chord_duration>4</chord_duration>
          <start_measure>1</start_measure>
          <start_beat>1</start_beat>
          <start_beat_abs>0</start_beat_abs>
          <isRest>0</isRest>
        </chord>
        <chord>
          <sd>4</sd>
          <fb/>
          <sec/>
          <sus/>
          <pedal/>
          <alternate/>
          <borrowed/>
          <chord_duration>4</chord_duration>
          <start_measure>2</start_measure>
          <start_beat>1</start_beat>
          <start_beat_abs>4</start_beat_abs>
          <isRest>0</isRest>
        </chord>
        ...
      </harmony>
      <numMeasures>8</numMeasures>
    </segment>
  </data>
</theorytab>
```

### Setup XML parsing

In [None]:
!pip install xmldataset




In [None]:
import xml.etree.ElementTree as ET 
import xmldataset
import os 
from os.path import basename, dirname, join, exists, splitext
import numpy as np

## Parse and filter xml files

This is more of a "dataset cleaning" task which we can ignore for our purposes. Filtering large datasets by Key or Time Signature helps to make the dataset more uniform and create more consisten results. As datasets grow larger and more recent architecture can represent more complex arpeggios and chord progressions, this step will become less important. 

In [None]:
# Helper functions to filter the data from the XML files and remove those that are non-standard. In particular we are interested in 
# 1) Songs that do not have any complex chords that are not standard major or minor chords - has to do with figured bass notation: https://music.stackexchange.com/questions/14866/classical-music-theory-notation-for-chord-inversions-figured-bass
# 2) Songs that have standard 4 beat measures (4/4 time signature)

def get_listfile(dataset_path):

    list_file=[]

    for root, dirs, files in os.walk(dataset_path):    
        for f in files:
            if splitext(f)[0]=='chorus':                
                fp = join(root, f)
                list_file.append(fp)

    return list_file

def filter_songs_w_non_standard_chords(list_file):
    list_ = []
    for file_ in list_file:
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()
            check_list = []
            counter = 0
            None_counter = 0
            for item in root.iter(tag='fb'):
                check_list.append(item.text)
                counter +=1
                if item.text == None:
                    None_counter +=1
            for item in root.iter(tag='borrowed'):
                check_list.append(item.text)
                counter +=1
                if item.text == None:
                    None_counter +=1
            #print(check_list)
            #print(counter)
            #print(None_counter)
            if counter == None_counter :
                list_.append(file_)
        except:
            print('cannot open the file (not valid xml)')
    return list_

def filter_songs_w_4_beats(list_):
    list_of_four_beat =[]
    for file_ in list_:
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()
            beats = root.findall('.//beats_in_measure')
            num = beats[0].text
            if num == '4':
                list_of_four_beat.append(file_) 
        except:
            print('cannot open the file (not valid xml)')
    return list_of_four_beat

def split_files_by_key(list_of_four_beat):
    key_list =[]
    c_key_list = []
    d_key_list = []
    e_key_list = []
    f_key_list = []
    g_key_list = []
    a_key_list = []
    b_key_list = []
    for file_ in list_of_four_beat:
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()
            key = root.findall('.//key')
            key_list.append(key[0].text)
            if key[0].text == 'C':
                c_key_list.append(file_)
            if key[0].text == 'D':
                d_key_list.append(file_)
            if key[0].text == 'E':
                e_key_list.append(file_) 
            if key[0].text == 'F':
                f_key_list.append(file_)
            if key[0].text == 'G':
                g_key_list.append(file_) 
            if key[0].text == 'A':
                a_key_list.append(file_)  
            if key[0].text == 'B':
                b_key_list.append(file_)                            
        except:
            print('file broken')
    # print('A key: {}'.format(key_list.count('A')))
    # print('B key: {}'.format(key_list.count('B')))
    # print('C key: {}'.format(key_list.count('C')))
    # print('D key: {}'.format(key_list.count('D')))
    # print('E key: {}'.format(key_list.count('E')))
    # print('F key: {}'.format(key_list.count('F')))
    # print('G key: {}'.format(key_list.count('G')))
    return c_key_list,d_key_list,e_key_list,f_key_list,g_key_list,a_key_list,b_key_list

In [None]:
datapath_xml = '/Users/anand/code/audio/midi-data/xml-TT'
list_file = get_listfile(datapath_xml)
list_ = filter_songs_w_non_standard_chords(list_file)
print(len(list_))
list_of_four_beat = filter_songs_w_4_beats(list_)
print(list_of_four_beat)
print(len(list_of_four_beat))
c_key_list,d_key_list,e_key_list,f_key_list,g_key_list,a_key_list,b_key_list = split_files_by_key(list_of_four_beat)

# list_file = get_listfile(datapath_xml)
# list_ = filter_non_standard_chords(list_file)
# list_of_four_beat = beats_(list_)
# c_key_list,d_key_list,e_key_list,f_key_list,g_key_list,a_key_list,b_key_list = get_key(list_of_four_beat)

# note_list_all,dur_list_all = transform_note(c_key_list,d_key_list,e_key_list,f_key_list,g_key_list,a_key_list,b_key_list)
# in_range,note_list_all_c,dur_list_all_c = check_melody_range(note_list_all,dur_list_all)
# print('total normal chord: {}'.format(len(list_)))
# print('total in four: {}'.format(len(list_of_four_beat)))
# print('melody in range: {}'.format(len(note_list_all)))

cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
cannot open the file (not valid xml)
c

Converting the notes to an index between 0 and 127 to fit within the 128 pitch levels of a piano roll is the next step. Each note in the xml looks like this: 

```xml
<note>
    <start_beat_abs>0</start_beat_abs>
    <start_measure>1</start_measure>
    <start_beat>1</start_beat>
    <note_length>0.25</note_length>
    <scale_degree>1</scale_degree>
    <octave>0</octave>
    <isRest>0</isRest>
</note>
```
In order to map this to a pitch, knowing the key of the song is critical since the scale degrees are relative to the key. 

In [None]:
# Helper functions to convert .xml files to piano roll matrices

def transform_note(c_key_list,d_key_list,e_key_list,f_key_list,g_key_list,a_key_list,b_key_list):
    scale = [48,50,52,53,55,57,59,60,62,64,65,67,69,71,72,74,76,77,79,81,83,84,86,88,89,91,93]
    transfor_list_C1 = scale[0:7]
    transfor_list_C2 = scale[7:14]
    transfor_list_C3 = scale[14:21]

    transfor_list_D1 = scale[1:8]
    transfor_list_D2 = scale[8:15]
    transfor_list_D3 = scale[15:22]

    transfor_list_E1 = scale[2:9]
    transfor_list_E2 = scale[9:16]
    transfor_list_E3 = scale[16:23]

    transfor_list_F1 = scale[3:10]
    transfor_list_F2 = scale[10:17]
    transfor_list_F3 = scale[17:24]

    transfor_list_G1 = scale[4:11]
    transfor_list_G2 = scale[11:18]
    transfor_list_G3 = scale[18:25]

    transfor_list_A1 = scale[5:12]
    transfor_list_A2 = scale[12:19]
    transfor_list_A3 = scale[19:26]

    transfor_list_B1 = scale[6:13]
    transfor_list_B2 = scale[13:20]
    transfor_list_B3 = scale[20:27]

    note_c =[]  
    dur_c =[]
    for file_ in c_key_list:
        note_list = [file_]
        dur_list = [file_]  
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()

            for item in root.iter(tag='note'):
                note = item[4].text
                dur = item[3].text
                octave = item[5].text
                dur = float(dur)
                dur_list.append(dur)

                try:
                    note = int(note)
                    if octave == '-1':
                        h_idx = transfor_list_C1[note-1]
                    elif octave == '0':
                        h_idx = transfor_list_C2[note-1]
                    elif octave == '1':
                        h_idx = transfor_list_C3[note-1]        
                    note_list.append(h_idx)
                    
                except:
                    if len(note_list)==1:
                        note = 0
                        note_list.append(note)

                    else:
                        note = note_list[-1]
                        note_list.append(note)

            if note_list[1]== 0:
                note_list[1] = note_list[2]
                dur_list[1] = dur_list[2]

            note_c.append(note_list)
            dur_c.append(dur_list)

        except:
            print('c key but no melody/notes :{}'.format(file_))

    note_d = []
    dur_d = []
    for file_ in d_key_list:
        note_list = [file_]
        dur_list = [file_]  
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()

            for item in root.iter(tag='note'):
                note = item[4].text
                dur = item[3].text
                octave = item[5].text
                dur = float(dur)
                dur_list.append(dur)

                try:
                    note = int(note)
                    if octave == '-1':
                        h_idx = transfor_list_D1[note-1]
                    elif octave == '0':
                        h_idx = transfor_list_D2[note-1]
                    elif octave == '1':
                        h_idx = transfor_list_D3[note-1]        
                    note_list.append(h_idx)
                    
                except:
                    if len(note_list)==1:
                        note = 0
                        note_list.append(note)

                    else:
                        note = note_list[-1]
                        note_list.append(note)

            if note_list[1]== 0:
                note_list[1] = note_list[2]
                dur_list[1] = dur_list[2]

            note_d.append(note_list)
            dur_d.append(dur_list)

        except:
            print('d key but no melody/notes :{}'.format(file_))

    note_e = []
    dur_e = []
    for file_ in e_key_list:
        note_list = [file_]
        dur_list = [file_]  
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()

            for item in root.iter(tag='note'):
                note = item[4].text
                dur = item[3].text
                octave = item[5].text
                dur = float(dur)
                dur_list.append(dur)

                try:
                    note = int(note)
                    if octave == '-1':
                        h_idx = transfor_list_E1[note-1]
                    elif octave == '0':
                        h_idx = transfor_list_E2[note-1]
                    elif octave == '1':
                        h_idx = transfor_list_E3[note-1]        
                    note_list.append(h_idx)
                    
                except:
                    if len(note_list)==1:
                        note = 0
                        note_list.append(note)

                    else:
                        note = note_list[-1]
                        note_list.append(note)

            if note_list[1]== 0:
                note_list[1] = note_list[2]
                dur_list[1] = dur_list[2]

            note_e.append(note_list)
            dur_e.append(dur_list)

        except:
            print('e key but no melody/notes :{}'.format(file_))

    note_f = []
    dur_f = []
    for file_ in e_key_list:
        note_list = [file_]
        dur_list = [file_]  
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()

            for item in root.iter(tag='note'):
                note = item[4].text
                dur = item[3].text
                octave = item[5].text
                dur = float(dur)
                dur_list.append(dur)

                try:
                    note = int(note)
                    if octave == '-1':
                        h_idx = transfor_list_F1[note-1]
                    elif octave == '0':
                        h_idx = transfor_list_F2[note-1]
                    elif octave == '1':
                        h_idx = transfor_list_F3[note-1]        
                    note_list.append(h_idx)
                    
                except:
                    if len(note_list)==1:
                        note = 0
                        note_list.append(note)

                    else:
                        note = note_list[-1]
                        note_list.append(note)

            if note_list[1]== 0:
                note_list[1] = note_list[2]
                dur_list[1] = dur_list[2]

            note_f.append(note_list)
            dur_f.append(dur_list)

        except:
            print('f key but no melody/notes :{}'.format(file_))


    note_g = []
    dur_g = []
    for file_ in a_key_list:
        note_list = [file_]
        dur_list = [file_]  
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()

            for item in root.iter(tag='note'):
                note = item[4].text
                dur = item[3].text
                octave = item[5].text
                dur = float(dur)
                dur_list.append(dur)

                try:
                    note = int(note)
                    if octave == '-1':
                        h_idx = transfor_list_G1[note-1]
                    elif octave == '0':
                        h_idx = transfor_list_G2[note-1]
                    elif octave == '1':
                        h_idx = transfor_list_G3[note-1]        
                    note_list.append(h_idx)
                    
                except:
                    if len(note_list)==1:
                        note = 0
                        note_list.append(note)

                    else:
                        note = note_list[-1]
                        note_list.append(note)

            if note_list[1]== 0:
                note_list[1] = note_list[2]
                dur_list[1] = dur_list[2]

            note_g.append(note_list)
            dur_g.append(dur_list)

        except:
            print('g key but no melody/notes :{}'.format(file_))

    note_a = []
    dur_a = []
    for file_ in a_key_list:
        note_list = [file_]
        dur_list = [file_]  
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()

            for item in root.iter(tag='note'):
                note = item[4].text
                dur = item[3].text
                octave = item[5].text
                dur = float(dur)
                dur_list.append(dur)

                try:
                    note = int(note)
                    if octave == '-1':
                        h_idx = transfor_list_A1[note-1]
                    elif octave == '0':
                        h_idx = transfor_list_A2[note-1]
                    elif octave == '1':
                        h_idx = transfor_list_A3[note-1]        
                    note_list.append(h_idx)
                    
                except:
                    if len(note_list)==1:
                        note = 0
                        note_list.append(note)

                    else:
                        note = note_list[-1]
                        note_list.append(note)

            if note_list[1]== 0:
                note_list[1] = note_list[2]
                dur_list[1] = dur_list[2]

            note_a.append(note_list)
            dur_a.append(dur_list)

        except:
            print('e key but no melody/notes :{}'.format(file_))


    note_b = []
    dur_b = []
    for file_ in a_key_list:
        note_list = [file_]
        dur_list = [file_]  
        try:
            chorus_file = ET.parse(file_)
            root = chorus_file.getroot()

            for item in root.iter(tag='note'):
                note = item[4].text
                dur = item[3].text
                octave = item[5].text
                dur = float(dur)
                dur_list.append(dur)

                try:
                    note = int(note)
                    if octave == '-1':
                        h_idx = transfor_list_A1[note-1]
                    elif octave == '0':
                        h_idx = transfor_list_A2[note-1]
                    elif octave == '1':
                        h_idx = transfor_list_A3[note-1]        
                    note_list.append(h_idx)
                    
                except:
                    if len(note_list)==1:
                        note = 0
                        note_list.append(note)

                    else:
                        note = note_list[-1]
                        note_list.append(note)

            if note_list[1]== 0:
                note_list[1] = note_list[2]
                dur_list[1] = dur_list[2]

            note_b.append(note_list)
            dur_b.append(dur_list)

        except:
            print('b key but no melody/notes :{}'.format(file_))
   

    note_list_all = note_c + note_d + note_e + note_f + note_g + note_a + note_b
    dur_list_all = dur_c + dur_d + dur_e  + dur_f + dur_g + dur_a  + dur_b

    return note_list_all,dur_list_all

Next is to filter these notes down to a reasonable range for the model. For example, if we want to only look at notes between pitches 48 and 84 as is done for the basic version of MelodyRNN (https://github.com/magenta/magenta/blob/main/magenta/models/melody_rnn/README.md), we can filter out the rest with the `filter_notes_to_melody_range` function and specify those as starting and ending pitches.  

In [None]:
def filter_notes_to_melody_range(note_list_all,dur_list_all, start_pitch, end_pitch):
    in_range=0
    note_list_all_c = []
    dur_list_all_c = []
    
    for i in range(len(note_list_all)):
        song = note_list_all[i]
        if len(song[1:]) ==0:
            # ipdb.set_trace()
            pass
        elif min(song[1:])>= start_pitch and max(song[1:])<= end_pitch:
            in_range +=1
            note_list_all_c.append(song)
            dur_list_all_c.append(dur_list_all[i])
    np.save('dur_list_all_c.npy',dur_list_all_c)
    np.save('note_list_all_c.npy',note_list_all_c)

    return in_range,note_list_all_c,dur_list_all_c

Using the notes and the duration of each note, we can create a piano roll for each bar of music by iterating through each note and setting the appropriate index in the piano roll to 1.

The final matrix will be a set of samples generated from all songs in the dataset. Each sample will be a 2D matrix with 128 rows (one for each pitch) and 16 columns (one for each time step). `data_x` and `prev_x`

In [None]:
def get_sample(cur_song, cur_dur,n_ratio, dim_pitch, dim_bar):

    cur_bar =np.zeros((1,dim_pitch,dim_bar),dtype=int)
    idx = 1
    sd = 0
    ed = 0
    song_sample=[]
    
    while idx < len(cur_song):
        cur_pitch = cur_song[idx]-1
        ed = int(ed + cur_dur[idx]*n_ratio)
        # print('pitch: {}, sd:{}, ed:{}'.format(cur_pitch, sd, ed))
        if ed <dim_bar:
            cur_bar[0,cur_pitch,sd:ed]=1
            sd = ed
            idx = idx +1
        elif ed >= dim_bar:
            cur_bar[0,cur_pitch,sd:]=1
            song_sample.append(cur_bar)
            cur_bar =np.zeros((1,dim_pitch,dim_bar),dtype=int)
            sd = 0
            ed = 0
            # print(cur_bar)
            # print(song_sample)
        # if idx == len(cur_song)-1 and np.sum(cur_bar)!=0:
        #     song_sample.append(cur_bar)
    return song_sample

def build_matrix(note_list_all_c,dur_list_all_c):
    data_x = []           
    prev_x = []
    zero_counter = 0
    for i in range(len(note_list_all_c)):
        song = note_list_all_c[i]
        dur = dur_list_all_c[i]
        song_sample = get_sample(song,dur,4,128,128)
        np_sample = np.asarray(song_sample)
        if len(np_sample) == 0:
            zero_counter +=1
        if len(np_sample) != 0:
            np_sample =np_sample[0]
            np_sample = np_sample.reshape(1,1,128,128)

            if np.sum(np_sample) != 0:
                place = np_sample.shape[3]
                new=[]
                for i in range(0,place,16):
                    new.append(np_sample[0][:,:,i:i+16])
                new = np.asarray(new)  # (2,1,128,128) will become (16,1,128,16)
                new_prev = np.zeros(new.shape,dtype=int)
                new_prev[1:, :, :, :] = new[0:new.shape[0]-1, :, :, :]            
                data_x.append(new)
                prev_x.append(new_prev)  

    data_x = np.vstack(data_x)
    prev_x = np.vstack(prev_x)


    return data_x,prev_x,zero_counter

In [None]:
note_list_all,dur_list_all = transform_note(c_key_list,d_key_list,e_key_list,f_key_list,g_key_list,a_key_list,b_key_list)

# for i in range(len(note_list_all)):
#     print(len(note_list_all[i]))

# currently all different sizes of notes
import ipdb

def collate_and_matrix(note_list_all,dur_list_all):
    in_range=0
    note_list_all_c = []
    dur_list_all_c = []
    
    # filter notes within melody range (see filter_notes_to_melody_range function)
    for i in range(len(note_list_all)):
        song = note_list_all[i]
        if len(song[1:]) ==0:
            # ipdb.set_trace()
            pass
        elif min(song[1:])>= 60 and max(song[1:])<= 83:
            in_range +=1
            note_list_all_c.append(song)
            dur_list_all_c.append(dur_list_all[i])
    # np.save('dur_list_all_c.npy',dur_list_all_c)
    # np.save('note_list_all_c.npy',note_list_all_c)

    data_x = []           
    prev_x = []
    zero_counter = 0
    for i in range(len(note_list_all_c)):
        song = note_list_all_c[i]
        dur = dur_list_all_c[i]
        song_sample = get_sample(song,dur,4,128,128)
        # convert song sample to piano roll to vector representation
        np_sample = np.asarray(song_sample)
        print(np_sample.shape)
        print(len(np_sample))
        if len(np_sample) == 0:
            zero_counter +=1
        if len(np_sample) != 0:
            np_sample =np_sample[0]
            np_sample = np_sample.reshape(1,1,128,128)
            print(np_sample[0][0])
            if np.sum(np_sample) != 0:
                place = np_sample.shape[3]
                new=[]
                # split into 16 chunks - i.e. 4 counts of 4 beats
                for i in range(0,place,16):
                    new.append(np_sample[0][:,:,i:i+16])
                new = np.asarray(new)  # (2,1,128,128) will become (16,1,128,16)
                new_prev = np.zeros(new.shape,dtype=int)
                new_prev[1:, :, :, :] = new[0:new.shape[0]-1, :, :, :]            
                data_x.append(new)
                # prev x will have the prev chunk at the same index as data_x - that is the prev bar for ind=0 in data_x is at ind=0 in prev_x
                prev_x.append(new_prev)
        if i==5:
            break
    data_x = np.vstack(data_x)
    prev_x = np.vstack(prev_x)
    return data_x,prev_x,zero_counter

data_x, prev_x, zero_counter = collate_and_matrix(note_list_all,dur_list_all)

c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/t/the-police/so-lonely/chorus.xml
c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/s/sha-na-na/those-magic-changes---grease/chorus.xml
c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/m/melodys-echo-chamber/shirim/chorus.xml
c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/m/muse/citizen-erased/chorus.xml
c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/m/minae-fuji/mega-man-4---ring-man/chorus.xml
c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/j/jack-johnson/shot-reverse-shot/chorus.xml
c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/c/clean-bandit/solo-feat-demi-lovato/chorus.xml
c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/c/cat-stevens/wild-world/chorus.xml
c key but no melody/notes :/Users/anand/code/audio/midi-data/xml-TT/xml/x/xyconstant/white

In [None]:
print(data_x.shape)
print(4816/16)
print(data_x[1000,0,:,:].shape)
print(list(data_x[1500,0,:,12]).index(1))
print(prev_x.shape)
print(note_list_all)
print(dur_list_all)

(4816, 1, 128, 16)
301.0
(128, 16)
70
(4816, 1, 128, 16)
[['/Users/anand/code/audio/midi-data/xml-TT/xml/r/rascal-flatts/bless-the-broken-road/chorus.xml', 0, 0, 64, 65, 67, 65, 64, 60, 64, 65, 67, 62, 62, 64, 64, 64, 65, 67, 65, 64, 60, 60, 62, 64, 65, 64, 62, 64, 65, 67], ['/Users/anand/code/audio/midi-data/xml-TT/xml/r/rae-sremmurd/swang/chorus.xml', 67, 67, 67, 71, 67, 65, 64, 60, 60, 64, 65, 64, 64, 64, 64, 60, 64, 62, 62, 67, 67, 71, 67, 65, 64, 60, 60, 64, 65, 64, 64, 64, 60, 64, 62], ['/Users/anand/code/audio/midi-data/xml-TT/xml/r/ritchie-valen/la-bamba/chorus.xml', 0, 0, 0, 55, 57, 59, 60, 64, 67, 65, 65, 69, 67, 55, 59, 62, 65, 65, 64, 62, 60, 64, 67, 65, 65, 69, 62, 67, 65, 65, 65, 65, 65, 64, 60, 60, 60, 65, 65, 65, 65, 65, 64, 60, 60, 60, 60, 62, 59, 62, 65, 64, 62, 64, 62, 60], ['/Users/anand/code/audio/midi-data/xml-TT/xml/r/rudimental/these-days/chorus.xml', 57, 57, 57, 60, 60, 55, 55, 55, 62, 62, 60, 64, 62, 60, 60, 65, 65, 65, 67, 67, 62, 60, 62, 60, 62, 60, 57, 55, 

In [None]:


if is_get_matrix == 1:
    note_list_all_c = np.load('note_list_all_c.npy')
    dur_list_all_c = np.load('dur_list_all_c.npy')

    data_x, prev_x,zero_counter = build_matrix(note_list_all_c,dur_list_all_c)
    np.save('data_x.npy',data_x)
    np.save('prev_x.npy',prev_x)

    print('final tab num: {}'.format(len(note_list_all_c)))
    print('songs not long enough: {}'.format(zero_counter))
    print('sample shape: {}, prev sample shape: {}'.format(data_x.shape, prev_x.shape))

# Extra: Converting Audio to MIDI

Ideally we have a clean MIDI dataset to work with already, but as we'll see in our next post - that isn't always the case. If we are building our own or curating our own MIDI files, that can be done using Ableton for example by playing instruments and extracting MIDI. However, most of us have more raw audio files and it is easiest to record instruments and convert to MIDI. 

Conversion of Audio to MIDI is an imperfect process - there are many tools that have been created over many years to address this. In particular, most tools are not very good, unless you have a simple 1 stem (track) audio file. 

The wisdom of the crowd helps here - [Reddit Post](https://www.reddit.com/r/learnpython/comments/12kiu31/recommended_python_library_for_converting_audio/). The ones that worked best for me were these

* Magenta (from Google) and is neural network based: https://piano-scribe.glitch.me/
* Basic Pitch (from Spotify): https://github.com/spotify/basic-pitch
* Samplab: https://samplab.com/features#audio2midi
* Melodyne: https://producersociety.com/export-midi-from-melodyne-tutorial/


The goal of conversion is to create robust MIDI datasets that we can use to train a simple model to predict the next notes. Of course, when building our training sets we'll look for the following in priority order, given the poor nature of conversion tools: 
* open source MIDI datasets (topic of our next post)
* clean MIDI files we can generate
* audio files we can convert to MIDI (using the tools above)

**Note: As you'll see below, these tools are NOT great for anything more than 1 stem or even very complex single step audio files (think piano pieces with complex arpeggios, triplets, 32nd notes, etc.).**


In [None]:
# Code converting audio files (.wav, .mp3, etc) to .mid files