<a href="https://colab.research.google.com/github/curtiscu/LYIT/blob/master/MidiTimingTools_class.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MIDI Timing Tools

# Setup env


In [0]:
# print all cell output
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"


## Google drive access

In [2]:
# mount google drive

from google.colab import drive
drive.mount('/content/drive', force_remount=True)


Mounted at /content/drive


In [3]:
# test, peek at data
! ls -al '/content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer1/eval_session/'

# test, modules from local  'E:\Google Drive\LYIT\Dissertation\modules'
! ls -al '/content/drive/My Drive/LYIT/Dissertation/modules/'

total 35
-rw------- 1 root root 2589 Apr 27 12:01 10_soul-groove10_102_beat_4-4.mid
-rw------- 1 root root 4793 Apr 27 12:01 1_funk-groove1_138_beat_4-4.mid
-rw------- 1 root root 3243 Apr 27 12:01 2_funk-groove2_105_beat_4-4.mid
-rw------- 1 root root 4466 Apr 27 12:01 3_soul-groove3_86_beat_4-4.mid
-rw------- 1 root root 2551 Apr 27 12:01 4_soul-groove4_80_beat_4-4.mid
-rw------- 1 root root 3798 Apr 27 12:01 5_funk-groove5_84_beat_4-4.mid
-rw------- 1 root root 3760 Apr 27 12:01 6_hiphop-groove6_87_beat_4-4.mid
-rw------- 1 root root 1894 Apr 27 12:01 7_pop-groove7_138_beat_4-4.mid
-rw------- 1 root root 2437 Apr 27 12:01 8_rock-groove8_65_beat_4-4.mid
-rw------- 1 root root 3448 Apr 27 12:01 9_soul-groove9_105_beat_4-4.mid
total 15
-rw------- 1 root root 11226 May 20 23:38 data_prep.py
drwx------ 2 root root  4096 May 10 13:31 __pycache__


## Auto reload module

I'm now editing a module locally and saving it to Google Drive, which syncs it to the cloud and makes it available to the Colab environment. Changes take time to propagate, and a repeated `import` does not reload a module that is already loaded, so I'm trying the following...

Note that the code below is not very reliable: it seems to work occasionally, after some delay, but I haven't been able to work out the pattern.

If in a hurry, force the changes to load by restarting the runtime.

In [0]:
# tool to auto reload modules.
%load_ext autoreload

# config to auto-reload all modules, handy to make 
# writing and testing modules much easier.
%autoreload 2
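
When autoreload does not pick up a change, a module can also be reloaded explicitly with the standard library's `importlib.reload`. The sketch below is illustrative only: it writes a throwaway module called `demo_mod` (a hypothetical name, not part of this project) to a temporary directory, edits it on disk, and reloads it. In this notebook the equivalent call would be `importlib.reload(data_prep)`, assuming the Drive copy has already synced.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway module used only for this demonstration (hypothetical name).
tmp_dir = tempfile.mkdtemp()
module_path = Path(tmp_dir) / "demo_mod.py"
module_path.write_text("VALUE = 1\n")

sys.dont_write_bytecode = True  # keep the demo free of cached bytecode
sys.path.append(tmp_dir)
import demo_mod

first_value = demo_mod.VALUE

# Simulate editing the module on disk. A plain `import demo_mod` would
# return the already-loaded module, so reload it explicitly instead.
module_path.write_text("VALUE = 22\n")
importlib.invalidate_caches()
demo_mod = importlib.reload(demo_mod)
second_value = demo_mod.VALUE

print(first_value, second_value)
```

Note that `importlib.reload` re-executes the module but does not update objects already created from the old class definitions, so those need to be re-created afterwards.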

## Imports and accessing lib functions

In [5]:
# install required libs
!pip install mido



In [6]:
# imports
import pandas as pd
import math

# import my modules
import sys
sys.path.append('/content/drive/My Drive/LYIT/Dissertation/modules/')
import data_prep



LOADING - data_prep.py module name is: data_prep


In [7]:
# testing auto reload of modules 
data_prep.test_function_call('bling')

test function called worked! :)  bling


## Pandas display options

In [0]:
def set_pandas_display_options() -> None:
    # Ref: https://stackoverflow.com/a/52432757/
    display = pd.options.display

    display.max_columns = 1000
    display.max_rows = 200
    display.max_colwidth = 1000
    display.width = None
    # display.precision = 2  # set as needed

set_pandas_display_options()
#pd.reset_option('all')


## Test creating object from custom module

In [0]:
gmt = data_prep.GrooveMidiTools

In [10]:
file_name = '/content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid'
midi_file = data_prep.MIDI_File_Wrapper(file_name, gmt.mappings)

FILE: /content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid
    tracks: [<midi track 'Base Midi' 1037 messages>]
    time sig: <meta message time_signature numerator=4 denominator=4 clocks_per_click=24 notated_32nd_notes_per_beat=8 time=0>
    tempo: <meta message set_tempo tempo=434783 time=0>
    last note_on: 30634
    good instruments: 4, {36.0: 'Bass Drum 1', 38.0: 'Acoustic Snare', 42.0: 'Closed Hi Hat', 51.0: 'Ride Cymbal 1'}


... the above verifies that I can create objects from my custom module, great!


In [11]:
midi_file.last_hit()

30634

# Create MidiTimingTools class

Worker class for standard calculations on MIDI timing data. It makes it easy to work out beats, bars, the number of bars in a file, beat positions, etc.

In [0]:
import math
import pandas as pd

class MidiTimingTools:


  # all parameters required
  def __init__(self, label, file_ticks_per_beat, us_per_beat, time_sig_numerator, time_sig_denominator, last_note_on):
    self.label = label                                # pretty label, for handy reference
    self.file_ticks_per_beat = file_ticks_per_beat    # from MIDI file header
    self.time_sig_numerator = time_sig_numerator      #   "
    self.time_sig_denominator = time_sig_denominator  #   "
    self.us_per_beat = us_per_beat                    # from MIDI file meta message
    self.last_note_on = last_note_on                  # from MIDI file data


  # For call to str(). Prints readable form, tests all 
  # function calls to build debug string output. 
  def __str__(self): 
    return str("LABEL: {} \n  Ticks p/beat: {} \n  BPM: {} \n  time sig: {}/ {} \n  bars in file: {} \n  beats in file: {} \n  ticks in file: {} \n  bins: {} \n  beats: {}"
    .format(self.label, 
            self.file_ticks_per_beat,
            self.bpm(),
            self.time_sig_numerator,
            self.time_sig_denominator,
            self.bars_in_file(),
            self.beats_in_file(), 
            self.ticks_in_file(), 
            self.get_bins(), 
            self.get_beats()))

  def bpm(self):
    return (60 * 1000000) / self.us_per_beat

  # ts = time signature
  def ts_ticks_per_beat(self):
    return self.file_ticks_per_beat * ( 4/ self.time_sig_denominator )

  def ticks_per_bar(self):
    return self.ts_ticks_per_beat() * self.time_sig_numerator

  def ticks_per_8(self):
    return self.file_ticks_per_beat/ 2
    
  def ticks_per_16(self):
    return self.file_ticks_per_beat / 4

  # calculates total bars, round up for whole bars
  def bars_in_file(self):
    return math.ceil(self.last_note_on / self.ticks_per_bar()) 

  def ticks_in_file(self):
    return int(self.bars_in_file() * self.ticks_per_bar()) # total ticks to render (file_range)

  def beats_in_file(self):
    return self.bars_in_file() * self.time_sig_numerator

  # bucket size for quantizing, set to 1/16 notes
  def bin_size(self):
    return int(self.ticks_per_16()) 

  def get_bins(self):   # my_bins
    my_bin_size = self.bin_size()
    file_range = self.ticks_in_file()
    return range(0 - (int(my_bin_size/ 2)), file_range + my_bin_size, my_bin_size)

  def get_beats(self):
    my_bin_size = self.bin_size()
    file_range = self.ticks_in_file()
    return range(0, file_range + my_bin_size, my_bin_size) 

  # NOTE - don't think I actually need this at all
  def calculated_bins(self, cumulative_ticks_series):
    return pd.cut(cumulative_ticks_series, bins=self.get_bins(), right=False)

  # takes a series of cumulative ticks since the start of the file
  # (one value per MIDI event) and returns a categorical series giving,
  # for each event, the centre of the 1/16-note bin it falls into
  def assigned_beat_location(self, cumulative_ticks_column):
    #return pd.cut(cumulative_ticks_column.values, bins=self.get_bins(), right=False, labels=self.get_beats())
    return pd.cut(cumulative_ticks_column, bins=self.get_bins(), right=False, labels=self.get_beats())


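  # returns, for each event, its signed distance in ticks from the centre
  # of its assigned 1/16-note bin (negative = early, positive = late)
  def get_offsets(self, cumulative_ticks_column):
    my_beats = self.assigned_beat_location(cumulative_ticks_column)
    # map each category code back to its bin centre (in ticks)
    tmp_dict = dict(enumerate(my_beats.cat.categories))
    beat_centers = my_beats.cat.codes.map(tmp_dict)
    offsets = cumulative_ticks_column - beat_centers
    return offsets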



## Test MidiTimingTools class

In [13]:
# will use previously loaded file ...
print("file: {}".format(file_name))
print("MIDI file: {}".format(midi_file))

file: /content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid
MIDI file: file: <midi file '/content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid' type 0, 1 tracks, 1037 messages>


In [0]:
f = midi_file
mtt = MidiTimingTools(file_name, f.ticks(), f.tempo_us(), f.ts_num(), f.ts_denom(), f.last_hit())

In [15]:
print(mtt)

LABEL: /content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid 
  Ticks p/beat: 480 
  BPM: 137.99987580011177 
  time sig: 4/ 4 
  bars in file: 16 
  beats in file: 64 
  ticks in file: 30720 
  bins: range(-60, 30840, 120) 
  beats: range(0, 30840, 120)
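
The printed values can be checked by hand from the inputs (480 ticks per beat, tempo of 434783 µs per beat, 4/4, last note_on at tick 30634). The standalone arithmetic below mirrors the class methods; the variable names are mine, chosen for illustration.

```python
import math

ticks_per_beat = 480
us_per_beat = 434783
numerator, denominator = 4, 4
last_note_on = 30634

bpm = 60_000_000 / us_per_beat                                   # just under 138
ticks_per_bar = ticks_per_beat * (4 / denominator) * numerator   # 480 * 4
bars = math.ceil(last_note_on / ticks_per_bar)                   # round up to whole bars
ticks_in_file = int(bars * ticks_per_bar)
beats = bars * numerator
bin_size = int(ticks_per_beat / 4)                               # one 1/16 note
bins = range(0 - int(bin_size / 2), ticks_in_file + bin_size, bin_size)

print(bpm, ticks_per_bar, bars, beats, ticks_in_file, bin_size, bins)
```

The bins start half a bin before tick 0, so each 1/16-note grid position sits in the middle of its bin.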


In [16]:
f_df = f.df_midi_data
f_df[f.cum_ticks_col]

0           0
1           0
2           0
3           0
4           0
        ...  
1032    30590
1033    30622
1034    30634
1035    30745
1036    30745
Name: total_ticks, Length: 1037, dtype: int64

In [0]:
# this should be most of what I need to grab a column of
# calculated beat offsets that can be used for everything else
# NOTE: might also need a 'beat_center' column .. ?  ##TODO##
beats_col = mtt.get_offsets(f_df[f.cum_ticks_col])
f_df['beat_offset'] = beats_col 
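
To make the offset calculation concrete, here is a small standalone sketch of the same `pd.cut` approach on a handful of tick values taken from this file. It does not use the class; `edges`, `centres` and the other names are my own, and the bin size and file length are the values printed by `mtt` above. The last two lines cross-check the result with plain integer arithmetic, which I'd expect to agree because the bins are half-open on the right.

```python
import pandas as pd

bin_size = 120          # one 1/16 note at 480 ticks per beat
ticks_in_file = 30720   # 16 bars of 4/4

# Bin edges sit half a bin either side of each 1/16-note grid position.
edges = list(range(-(bin_size // 2), ticks_in_file + bin_size, bin_size))
centres = list(range(0, ticks_in_file + bin_size, bin_size))

ticks = pd.Series([5, 9, 15, 30113, 30230, 30256])

# Label each tick with the centre of the bin [centre - 60, centre + 60) it falls in.
assigned = pd.cut(ticks, bins=edges, right=False, labels=centres)
offsets = ticks - assigned.astype(int)

# Cross-check: snapping to the grid with floor division gives the same centres.
snapped = ((ticks + bin_size // 2) // bin_size) * bin_size
same = (snapped == assigned.astype(int)).all()

print(offsets.tolist(), same)
```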

## Validating output

In [18]:
f_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1037 entries, 0 to 1036
Data columns (total 8 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   msg_type       1037 non-null   string 
 1   delta_ticks    1037 non-null   int64  
 2   total_ticks    1037 non-null   int64  
 3   total_seconds  1037 non-null   float64
 4   note           746 non-null    float64
 5   velocity       746 non-null    float64
 6   raw_data       1037 non-null   string 
 7   beat_offset    1037 non-null   int64  
dtypes: float64(3), int64(3), string(2)
memory usage: 64.9 KB


I ran this on the same file used in the workbook 'BeatTrackingTests_1.ipynb', compared many rows from head()/tail(), and verified that the offsets match everywhere in the following ...

In [19]:
f_df.head(20)
f_df.tail(20)

| | msg_type | delta_ticks | total_ticks | total_seconds | note | velocity | raw_data | beat_offset |
|---|---|---|---|---|---|---|---|---|
| 0 | track_name | 0 | 0 | 0.0 | | | `{'type': 'track_name', 'name': 'Base Midi', 'time': 0}` | 0 |
| 1 | instrument_name | 0 | 0 | 0.0 | | | `{'type': 'instrument_name', 'name': 'Brooklyn', 'time': 0}` | 0 |
| 2 | time_signature | 0 | 0 | 0.0 | | | `{'type': 'time_signature', 'numerator': 4, 'denominator': 4, 'clocks_per_click': 24, 'notated_32nd_notes_per_beat': 8, 'time': 0}` | 0 |
| 3 | key_signature | 0 | 0 | 0.0 | | | `{'type': 'key_signature', 'key': 'C', 'time': 0}` | 0 |
| 4 | smpte_offset | 0 | 0 | 0.0 | | | `{'type': 'smpte_offset', 'frame_rate': 24, 'hours': 33, 'minutes': 1, 'seconds': 15, 'frames': 16, 'sub_frames': 24, 'time': 0}` | 0 |
| 5 | set_tempo | 0 | 0 | 0.0 | | | `{'type': 'set_tempo', 'tempo': 434783, 'time': 0}` | 0 |
| 6 | control_change | 4 | 4 | 0.003623 | | | `{'type': 'control_change', 'time': 4, 'control': 4, 'value': 77, 'channel': 9}` | 4 |
| 7 | note_on | 1 | 5 | 0.004529 | 42.0 | 55.0 | `{'type': 'note_on', 'time': 1, 'note': 44, 'velocity': 55, 'channel': 9}` | 5 |
| 8 | note_on | 4 | 9 | 0.008152 | 36.0 | 39.0 | `{'type': 'note_on', 'time': 4, 'note': 36, 'velocity': 39, 'channel': 9}` | 9 |
| 9 | note_on | 6 | 15 | 0.013587 | 51.0 | 67.0 | `{'type': 'note_on', 'time': 6, 'note': 51, 'velocity': 67, 'channel': 9}` | 15 |

| | msg_type | delta_ticks | total_ticks | total_seconds | note | velocity | raw_data | beat_offset |
|---|---|---|---|---|---|---|---|---|
| 1017 | note_off | 12 | 30113 | 27.276293 | 36.0 | 64.0 | `{'type': 'note_off', 'time': 12, 'note': 36, 'velocity': 64, 'channel': 9}` | -7 |
| 1018 | note_off | 2 | 30115 | 27.278104 | 38.0 | 64.0 | `{'type': 'note_off', 'time': 2, 'note': 38, 'velocity': 64, 'channel': 9}` | -5 |
| 1019 | note_on | 6 | 30121 | 27.283539 | 36.0 | 33.0 | `{'type': 'note_on', 'time': 6, 'note': 36, 'velocity': 33, 'channel': 9}` | 1 |
| 1020 | note_off | 17 | 30138 | 27.298938 | 51.0 | 64.0 | `{'type': 'note_off', 'time': 17, 'note': 51, 'velocity': 64, 'channel': 9}` | 18 |
| 1021 | control_change | 92 | 30230 | 27.382271 | | | `{'type': 'control_change', 'time': 92, 'control': 4, 'value': 77, 'channel': 9}` | -10 |
| 1022 | note_on | 0 | 30230 | 27.382271 | 42.0 | 42.0 | `{'type': 'note_on', 'time': 0, 'note': 44, 'velocity': 42, 'channel': 9}` | -10 |
| 1023 | note_off | 3 | 30233 | 27.384988 | 36.0 | 64.0 | `{'type': 'note_off', 'time': 3, 'note': 36, 'velocity': 64, 'channel': 9}` | -7 |
| 1024 | note_on | 23 | 30256 | 27.405822 | 51.0 | 96.0 | `{'type': 'note_on', 'time': 23, 'note': 51, 'velocity': 96, 'channel': 9}` | 16 |
| 1025 | note_on | 7 | 30263 | 27.412162 | 38.0 | 109.0 | `{'type': 'note_on', 'time': 7, 'note': 40, 'velocity': 109, 'channel': 9}` | 23 |
| 1026 | control_change | 2 | 30265 | 27.413974 | | | `{'type': 'control_change', 'time': 2, 'control': 4, 'value': 90, 'channel': 9}` | 25 |


As another test, I ran 'describe()' on the calculated offsets column for the same input file and verified that the output below matches the output from 'BeatTrackingTests_1.ipynb' ...

In [20]:
beats_col.describe()

count    1037.000000
mean        2.274831
std        24.107999
min       -60.000000
25%       -13.000000
50%         5.000000
75%        19.000000
max        59.000000
dtype: float64
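
The min of -60 and max of 59 are what the half-open bins would lead me to expect: with a bin size of 120, each bin covers [centre - 60, centre + 60), so an offset can reach -60 but never +60. The sketch below checks this with plain modular arithmetic over every tick up to 30745 (the largest `total_ticks` value shown earlier). It is a restatement of the binning in arithmetic form, not the `pd.cut` call itself, and the names are mine.

```python
bin_size = 120
half = bin_size // 2

# Offset of every possible tick from the centre of its half-open bin.
offsets = [((t + half) % bin_size) - half for t in range(0, 30746)]

lowest, highest = min(offsets), max(offsets)
print(lowest, highest)
```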