# Generating a MIDI music file from a time series: a mostly useless but quite cool routine
*Rémy Lapere - eScience course - 8/11/2022*

The key library is miditime. All you then need is a list with one (time, pitch, velocity, duration) entry per note, where the pitch is derived from the values of the time series.

In this example we will compose an original piece of music: *Evolution of annual mean temperature in the Arctic in C minor*, from NorESM2-LM CMIP6 data under scenario ssp585.

Main steps:
- define the minimum and maximum pitch
- rescale the data to this range
- apply a key to obtain a nicer harmony
- manipulate the data to avoid repetitions and generate info on time, velocity and duration
- generate a MIDI file
- open that MIDI file in your favorite audio software

For a more comprehensive reference see *de Mora et al., 2020* - https://gc.copernicus.org/articles/3/263/2020/
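
To make the (time, pitch, velocity, duration) layout concrete, here is a minimal sketch in plain Python that builds such a list by hand and estimates how long it would play. It assumes that time and duration are counted in beats and that the tempo is given in beats per minute; toy_notes and length_in_seconds are illustrative names, not part of miditime.

```python
# Each entry is [time, pitch, velocity, duration]
toy_notes = [
    [0, 60, 100, 1],   # middle C at beat 0, lasting 1 beat
    [1, 63, 100, 1],   # E flat at beat 1
    [2, 67, 110, 2],   # G at beat 2, slightly louder, lasting 2 beats
]

def length_in_seconds(notes, tempo_bpm):
    """Playing time of a note list, assuming beats and a tempo in bpm."""
    last_beat = max(time + duration for time, _, _, duration in notes)
    return last_beat * 60.0 / tempo_bpm

print(length_in_seconds(toy_notes, 160))
```

With one note per year, the tempo therefore sets how many years of data are played per minute.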

In [5]:
import xarray as xr
xr.set_options(display_style='html')
import intake
import cftime
import numpy as np
import pandas as pd
from miditime.miditime import MIDITime

Define the minimum and maximum notes of the song as MIDI note numbers (middle C is 60; anything below 25 is very low and anything above 100 is very high). One unit corresponds to one semitone.

In [7]:
minpitch, maxpitch = 24, 96
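
To get a feel for what these numbers mean, the sketch below translates a MIDI note number into a note name. It assumes the common convention in which middle C (60) is labeled C4 (some software calls it C3); NOTE_NAMES and note_name are illustrative names.

```python
NOTE_NAMES = ['C', 'C#', 'D', 'Eb', 'E', 'F', 'F#', 'G', 'Ab', 'A', 'Bb', 'B']

def note_name(pitch):
    """Name of a MIDI note number, with middle C (60) labeled C4."""
    octave = pitch // 12 - 1          # 12 semitones per octave
    return NOTE_NAMES[pitch % 12] + str(octave)

for pitch in (24, 60, 96):
    print(pitch, note_name(pitch))
```

With the bounds chosen here the song spans six octaves, and both bounds are a C.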

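Load data from Pangeo for the historical and ssp585 experiments. Note that the search below requests 'pr' (precipitation) from the 'day' table; to sonify surface temperature as announced above, request variable_id=['tas'] instead and read .tas rather than .pr in ext_pole.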

In [8]:
cat_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(cat_url)
cat = col.search(variable_id=['pr'], member_id=['r1i1p1f1'], source_id=['NorESM2-LM'], experiment_id=['historical','ssp585'], table_id=['day'])

In [9]:
dset_dict = cat.to_dataset_dict(zarr_kwargs={'use_cftime':True})


--> The keys in the returned dictionary of datasets are constructed as follows:
	'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'


In [10]:
dataset_list = list(dset_dict.keys())

In [11]:
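# pick each experiment by name rather than by position, since the order of the keys is not guaranteed
AR_hist = next(ds for k, ds in dset_dict.items() if 'historical' in k) # historical experiment
AR_fut5 = next(ds for k, ds in dset_dict.items() if 'ssp585' in k) # ssp585 experiment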

Extract the area and period of interest, and rescale the time series to the range of authorized pitches

In [13]:
def ext_pole():
    """
    function to extract the area/period of interest (annual means north of 60N)
    and concatenate both experiments into a single numpy array
    """
    lmi,lma = 60,90

    AR_hist_yr_ARC = AR_hist.sel(lat=slice(lmi,lma))
    AR_hist_yr_ARC = AR_hist_yr_ARC.reduce(np.mean,dim=('lat','lon'))
    AR_hist_yr_ARC = AR_hist_yr_ARC.resample(time='1Y').mean()
    histar = AR_hist_yr_ARC.pr.values

    arc585 = AR_fut5.sel(lat=slice(lmi,lma))
    arc585 = arc585.reduce(np.mean,dim=('lat','lon'))
    arc585 = arc585.resample(time='1Y').mean()
    arc585 = arc585.pr.values
    
    out585 = np.append(histar,arc585)
    return out585
    
n585 = ext_pole()

mini = np.min(n585)
maxi = np.max(n585)

# normalize data to [0,1]
n585_ = (n585-mini)/(maxi-mini)

# apply a mapping to the range of authorized min/max pitch
n585_ = minpitch+n585_*(maxpitch-minpitch)

# if you want to associate increasing temperature with lower pitch notes
#n585_ = minpitch-n585_+maxpitch

# make it an integer type because MIDI only handles semitones
n585_ = n585_.astype(int)

# store notes into a data set along with time steps
df_ = pd.DataFrame({'val':n585_,'step':range(len(n585_))})
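
As a worked example of the rescaling above, the sketch below applies the same linear mapping to four made-up values in plain Python, so it runs without downloading any data. The function name to_pitch and its invert option are illustrative; invert reproduces the commented-out line that associates higher values with lower notes.

```python
def to_pitch(values, lo=24, hi=96, invert=False):
    """Linearly map values onto integer MIDI pitches in [lo, hi]."""
    vmin, vmax = min(values), max(values)
    pitches = []
    for v in values:
        frac = (v - vmin) / (vmax - vmin)           # normalize to [0, 1]
        if invert:
            frac = 1.0 - frac                       # higher value -> lower pitch
        pitches.append(int(lo + frac * (hi - lo)))  # truncate to a semitone
    return pitches

toy = [-12.0, -11.0, -8.0, -4.0]   # made-up values, not model output
print(to_pitch(toy))
print(to_pitch(toy, invert=True))
```

The smallest value always lands on the lowest authorized pitch and the largest on the highest, so a single outlier can compress the rest of the series into a narrow band of notes.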

- So far we have a list of notes and time steps... but not all the notes can work together in harmony.

- In the next part we interpolate the notes to the nearest "authorized" note according to the chosen key.

- Here the example key is C minor, restricted to the notes of its tonic triad, i.e. authorized notes are: C, Eb, G.

- Our 'zero' note is 24, which corresponds to C, 3 octaves (1 octave is 12 semitones) below middle C.

- Therefore the authorized notes are those whose distance from the zero note is 0, 3 or 7 semitones modulo 12, i.e. {0,3,7} mod 12.
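
The points above can be sketched in a few lines of plain Python: list every authorized pitch in the range, then snap an arbitrary pitch to the closest one. The names MINPITCH, TRIAD, allowed and snap are illustrative, and sending a tie to the lower note is an assumption of this sketch.

```python
MINPITCH, MAXPITCH = 24, 96
TRIAD = (0, 3, 7)   # C, E flat, G as semitone offsets from the zero note

# every pitch of the range whose offset from the zero note is in the triad
allowed = [p for p in range(MINPITCH, MAXPITCH + 1)
           if (p - MINPITCH) % 12 in TRIAD]

def snap(pitch):
    """Closest authorized pitch; on a tie the lower one wins."""
    return min(allowed, key=lambda a: (abs(a - pitch), a))

print(allowed[:4])
print([snap(p) for p in (25, 29, 33, 34)])
```

Because the authorized notes are at most 5 semitones apart, snapping never moves a note by more than 2 semitones.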

In [14]:
def _to_chords_(df, key):
    """
    function to map the original notes to the nearest note allowed by the key
    (key = semitone offsets counted from minpitch, e.g. [0,3,7])
    """
    notes_ = np.arange(minpitch,maxpitch+1,1)
    # a note is authorized if its offset from minpitch, modulo one octave, is in the key
    auth = np.isin(np.mod(notes_-minpitch,12),key)
    auth_notes = notes_[auth]
    notin = df.val.values
    # distance from every input note to every authorized note;
    # argmin keeps the first minimum, so a tie goes to the lower note
    dist = np.abs(auth_notes[None,:]-notin[:,None])
    df['val'] = auth_notes[np.argmin(dist,axis=1)]
    return df

In [15]:
def extract_sdt(indata,kkeys):
    """
    function to aggregate consecutive identical notes
    and add info on time, duration and velocity
    """
    didif = [indata['step'].values[0]]
    indata = _to_chords_(indata,kkeys)
    for i in np.arange(1,len(indata.val.values)):
        if indata['val'].values[i]==indata['val'].values[i-1]:
            didif = np.append(didif,indata['step'].values[i-1])
        else:
            didif = np.append(didif,indata['step'].values[i])
    indata['dif'] = didif
    steps = indata.groupby(['dif','val'],as_index=False).count().step.values
    vals = (indata.groupby(['dif','val'],as_index=False).mean().val.values).astype(int)
    new_df = pd.DataFrame({'note':vals,
                       'steps':np.cumsum(steps)-np.min(np.cumsum(steps)),
                       'duration':np.append(steps[1:],2),
                      'force':np.repeat(127,len(steps))})
    for j in np.arange(1,len(new_df.note.values)):
        if new_df.note.values[j] == new_df.note.values[j-1]:
            # .loc avoids chained assignment, which newer pandas versions may silently ignore
            new_df.loc[j,'steps'] = new_df.steps.values[j-1]
    dur = new_df.groupby(['steps'],as_index=False).sum().duration.values
    new_df = new_df.drop_duplicates(['steps','note'])
    new_df.duration = dur
    new_df['force'] = np.linspace(100,126,len(dur)).astype(int)
    new_df = new_df[['steps','note','force','duration']]
    return new_df

ext_mus = extract_sdt(df_,[0,3,7])
# save to csv
ext_mus.to_csv('arc_tas_music_585')
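
The idea of merging repeated notes can be illustrated with a much simpler stand-in. The sketch below uses itertools.groupby from the standard library to collapse each run of identical pitches into one longer note; it is not a transcription of extract_sdt (in particular the velocity is constant here rather than ramping up), and merge_repeats is an illustrative name.

```python
from itertools import groupby

def merge_repeats(pitches, velocity=100):
    """Collapse runs of identical pitches into [time, pitch, velocity, duration]."""
    notes, time = [], 0
    for pitch, run in groupby(pitches):
        duration = len(list(run))      # number of consecutive time steps
        notes.append([time, pitch, velocity, duration])
        time += duration               # the next note starts where this one ends
    return notes

print(merge_repeats([36, 36, 31, 43, 43, 43, 39]))
```

Each run becomes a single held note instead of the same note struck several times in a row.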

In [16]:
def to_midi(infile,nm):
    """
    function to convert the data to MIDI file
    """
    mymidi = MIDITime(160, nm+'.mid') # 160 is the tempo
    music = np.array(pd.read_csv(infile,skiprows=1,header=None,index_col=0)).tolist()
    # Add a track with those notes
    mymidi.add_track(music)
    # Output the .mid file
    mymidi.save_midi()
    
# the exported MIDI file can be played with a dedicated audio player (e.g. GarageBand)
to_midi('arc_tas_music_585','arc_tas_music_585')

36 0 1 100
31 1 1 100
43 2 1 100
51 3 1 100
31 4 1 100
36 5 1 100
39 6 1 100
36 7 1 100
39 8 1 101
31 9 1 101
39 10 1 101
43 11 2 101
36 13 1 101
39 14 1 101
43 15 1 101
36 16 2 101
31 18 1 102
27 19 1 102
39 20 1 102
31 21 2 102
36 23 2 102
39 25 1 102
43 26 1 102
48 27 1 103
39 28 2 103
36 30 1 103
31 31 1 103
36 32 2 103
39 34 2 103
36 36 1 103
39 37 2 103
36 39 1 104
43 40 1 104
39 41 1 104
31 42 2 104
43 44 2 104
48 46 1 104
39 47 1 104
48 48 1 104
39 49 1 105
43 50 1 105
48 51 1 105
43 52 2 105
36 54 1 105
39 55 1 105
31 56 2 105
39 58 2 106
36 60 1 106
39 61 2 106
31 63 1 106
24 64 2 106
36 66 1 106
39 67 1 106
43 68 1 106
36 69 1 107
39 70 1 107
36 71 1 107
43 72 1 107
51 73 1 107
39 74 1 107
36 75 2 107
39 77 1 108
48 78 1 108
36 79 2 108
39 81 1 108
36 82 1 108
39 83 2 108
36 85 1 108
43 86 1 108
39 87 1 109
36 88 1 109
48 89 1 109
43 90 1 109
31 91 2 109
39 93 1 109
36 94 1 109
43 95 1 109
48 96 2 110
43 98 1 110
36 99 1 110
39 100 1 110
36 101 1 110
48 102 2 110
39 104 1 11