<a href="https://colab.research.google.com/github/curtiscu/LYIT/blob/master/Visualizations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Gathering visualization ideas

# Setup env


In [0]:
# print all cell output
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"


## Google drive access

In [2]:
# mount google drive

from google.colab import drive
drive.mount('/content/drive', force_remount=True)


Mounted at /content/drive


In [3]:
# test, peek at data
! ls -al '/content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer1/eval_session/'

# test, modules from local  'E:\Google Drive\LYIT\Dissertation\modules'
! ls -al '/content/drive/My Drive/LYIT/Dissertation/modules/'

total 35
-rw------- 1 root root 2589 Apr 27 12:01 10_soul-groove10_102_beat_4-4.mid
-rw------- 1 root root 4793 Apr 27 12:01 1_funk-groove1_138_beat_4-4.mid
-rw------- 1 root root 3243 Apr 27 12:01 2_funk-groove2_105_beat_4-4.mid
-rw------- 1 root root 4466 Apr 27 12:01 3_soul-groove3_86_beat_4-4.mid
-rw------- 1 root root 2551 Apr 27 12:01 4_soul-groove4_80_beat_4-4.mid
-rw------- 1 root root 3798 Apr 27 12:01 5_funk-groove5_84_beat_4-4.mid
-rw------- 1 root root 3760 Apr 27 12:01 6_hiphop-groove6_87_beat_4-4.mid
-rw------- 1 root root 1894 Apr 27 12:01 7_pop-groove7_138_beat_4-4.mid
-rw------- 1 root root 2437 Apr 27 12:01 8_rock-groove8_65_beat_4-4.mid
-rw------- 1 root root 3448 Apr 27 12:01 9_soul-groove9_105_beat_4-4.mid
total 20
-rw------- 1 root root 15416 May 22 19:20 data_prep.py
drwx------ 2 root root  4096 May 22 19:21 __pycache__


## Auto reload module

I'm now editing a module saved to Google Drive, which syncs to the cloud automatically and becomes available to the Colab env. Changes need time to propagate, and a repeated import doesn't 'reimport' an already loaded module to pick up changes, so I'm trying the following...

Note the code below is not very reliable: it seems to work occasionally, after some time, but I haven't been able to work out the pattern to it.

If in a hurry, brute force loading of changes by restarting the runtime.
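
When autoreload doesn't pick up a change, an explicit reload is a middle ground before restarting the runtime. The sketch below uses the standard library's `importlib.reload` on a throwaway module written to a temp directory. The module name `demo_reload_mod` and the helper `force_reload` are made up for illustration; the same call could be pointed at `data_prep`. It won't help if Drive hasn't synced the new file yet.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def force_reload(module):
    """Re-run an already imported module's source and return the refreshed module."""
    importlib.invalidate_caches()  # drop cached directory listings first
    return importlib.reload(module)


# Demo on a throwaway module (hypothetical name, not part of the project).
tmp_dir = tempfile.mkdtemp()
module_path = Path(tmp_dir) / "demo_reload_mod.py"
module_path.write_text("VALUE = 1\n")

sys.path.append(tmp_dir)
demo = importlib.import_module("demo_reload_mod")
before = demo.VALUE

# Simulate editing the file on disk; the file size changes too, so a
# cached .pyc cannot be mistaken for up to date.
module_path.write_text("VALUE = 200\n")
demo = force_reload(demo)
after = demo.VALUE

print(before, after)
```

Names bound with `from module import name` keep pointing at the old objects after a reload, so accessing things as `data_prep.something` is the safer habit here.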

In [0]:
# tool to auto reload modules.
%load_ext autoreload

# config to auto-reload all modules, handy to make 
# writing and testing modules much easier.
%autoreload 2

## Imports and accessing lib functions

In [5]:
# install required libs
!pip install mido



In [6]:
# imports
import pandas as pd
import math

# import my modules
import sys
sys.path.append('/content/drive/My Drive/LYIT/Dissertation/modules/')
import data_prep



LOADING - data_prep.py module name is: data_prep


In [7]:
# testing auto reload of modules 
data_prep.test_function_call('bling')

test function called worked! :)  bling


## Pandas display options

In [0]:
def set_pandas_display_options() -> None:
    # Ref: https://stackoverflow.com/a/52432757/
    display = pd.options.display

    display.max_columns = 1000
    display.max_rows = 200
    display.max_colwidth = 1000
    display.width = None
    # display.precision = 2  # set as needed

set_pandas_display_options()
#pd.reset_option('all')


## Test creating object from custom module

In [0]:
gmt = data_prep.GrooveMidiTools

In [10]:
file_name = '/content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid'
midi_file = data_prep.MIDI_File_Wrapper(file_name, gmt.mappings)

FILE: /content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid
    tracks: [<midi track 'Base Midi' 1037 messages>]
    time sig: <meta message time_signature numerator=4 denominator=4 clocks_per_click=24 notated_32nd_notes_per_beat=8 time=0>
    tempo: <meta message set_tempo tempo=434783 time=0>
    last note_on: 30634
    good instruments: 4, {36.0: 'Bass Drum 1', 38.0: 'Acoustic Snare', 42.0: 'Closed Hi Hat', 51.0: 'Ride Cymbal 1'}


... the above verifies I'm able to create custom objects from custom code, great!
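
The `set_tempo` value in the output above is in microseconds per quarter note, which is how MIDI files store tempo. A small sketch of the conversion to BPM, as a sanity check against the tempo in the file name; the helper name is my own, and mido also ships a `tempo2bpm` function for this.

```python
MICROSECONDS_PER_MINUTE = 60_000_000


def tempo_to_bpm(tempo_us_per_beat):
    """Convert a MIDI set_tempo value (microseconds per quarter note) to BPM."""
    return MICROSECONDS_PER_MINUTE / tempo_us_per_beat


bpm = tempo_to_bpm(434783)  # tempo value printed for this file
print(round(bpm))
```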


# Visualization class

Don't think I'm going to be using this, it's getting too complex and I'm getting nowhere...

In [0]:

import matplotlib
import matplotlib.pyplot as plt
import numpy as np

class DataVis:
  '''
  Class to collect some visualizations created so far

  Attributes:
      < stuff from __init__ >

  '''

  def __init__(self):
    pass

  # might want to just use something like this method to configure
  # things then call the chart I want.
  def setup(self):
    pass

  # uses broken_barh
  def score_plot(self, ts_num, ts_denom, start_tick, end_tick, s_notes, s_cum_ticks, s_offset_ticks, instruments=None):
    ''' 
    Print horizontal bar chart, somewhat resembles music score layout

    s_notes (MIDI note numbers), s_cum_ticks (cumulative tick position), s_offset_ticks (offset of MIDI event) should
    be series of aligned values: a row in one series relates to the same row in the other series. These could be pulled
    from the same DataFrame holding (in order) the list of MIDI events loaded from a file.

    The expectation is that only 'note_on' events are passed in.

    Parameters:
      ts_num (int): time signature numerator
      ts_denom (int): time signature denominator
      start_tick (int): position to start printing from
      end_tick (int): tick to print up to (exclusive)
      s_notes (Series): Series of notes to print
      s_cum_ticks (Series): Series of cumulative tick counts to print
      s_offset_ticks (Series): Series of offsets to print
      instruments (array): array of instruments to display, can be a subset of those passed in which will filter
        the printed events. array order indicates chart order y-axis from low to high. empty array prints all.

    '''

    if s_notes.size != s_cum_ticks.size or s_cum_ticks.size != s_offset_ticks.size:
      # report the sizes, not the Series themselves
      raise ValueError('ERROR! s_notes ({}), s_cum_ticks ({}), and s_offset_ticks ({}) must all be the same size!'.format(s_notes.size, s_cum_ticks.size, s_offset_ticks.size))

    frame_data = {'note':s_notes, 'tick':s_cum_ticks, 'offset':s_offset_ticks}

    data_df = pd.DataFrame(frame_data)

    # filter to tick range: start_tick inclusive, end_tick exclusive
    time_limited_df = data_df[data_df.tick >= start_tick]
    time_limited_df = time_limited_df[time_limited_df.tick < end_tick]

    # None or an empty list means 'print every instrument in range'
    if instruments is None or len(instruments) == 0:
      instruments = time_limited_df.note.unique()


    for i in instruments:
      instrument_hits = []
      # boolean mask selects rows for this instrument, 'tick' selects the column
      for i_time in time_limited_df.loc[time_limited_df.note == i, 'tick']:
        instrument_hits.append((i_time, 10))

      print('instrument {}: {}'.format(i, instrument_hits))
    


  def time_offsets(self):
    pass






## Scratchpad for horizontal broken_barh

```


#############################################
## Setup steps
#############################################

# number bars to render
ticks_to_render = ticks_per_bar * 2

print('will render bars: {}, total ticks: {}'.format(int(ticks_to_render/ ticks_per_bar), ticks_to_render))

# filter to the bars being rendered
limited_df = df[df.total_ticks < ticks_to_render]

# filter to only note_ons
df_ons_only = limited_df[limited_df[midi_file.type_col] == 'note_on']

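# filter to selected instruments (per the mapping printed earlier: 36 = Bass Drum 1,
# 38 = Acoustic Snare, 51 = Ride Cymbal 1; the closed hi-hat would be 42)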
kic_sn_hats_df = df_ons_only[df_ons_only[midi_file.note_col].isin([36, 38, 51] )] 

# check just the specified notes...
kic_sn_hats_df.note.unique()

def get_hit_times_tuple():
  
  hh = [] # hi hats
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 51, midi_file.cum_ticks_col]:
    hh.append((t, 10))
  print('hat times: {}'.format(hh))

  sn = [] # snare drum
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 38, midi_file.cum_ticks_col]:
    sn.append((t, 10))
  print('snare times: {}'.format(sn))

  k = [] # kick/ bass drum
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 36, midi_file.cum_ticks_col]:
    k.append((t, 10))
  print('kick times: {}'.format(k))

  return hh, k, sn

hats, kicks, snares = get_hit_times_tuple()  

#############################################
## Show plot
#############################################

fig, ax = plt.subplots()
fig.set_size_inches(24, 4, forward=True)

ax.broken_barh(hats, (30, 9), facecolors='tab:green')

# do snares
ax.broken_barh(snares, (20, 9), facecolors='tab:blue')

# do kicks
ax.broken_barh(kicks, (10, 9), 
               facecolors='tab:red')


ax.set_ylim(5, 45)
ax.set_xlim(0, ticks_to_render) 
ax.set_xlabel('ticks since start')
ax.set_yticks([15, 25, 35])
ax.set_yticklabels(['Kick', 'Snare', 'Hats'])
ax.set_xticks(range(0, int(ticks_to_render), int(ticks_per_16)))
ax.grid(True)

plt.show()
```




### Testing - Setup timing info

In [42]:
file_name = '/content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid'
midi_file = data_prep.MIDI_File_Wrapper(file_name, gmt.mappings)
f = midi_file


FILE: /content/drive/My Drive/groove-v1.0.0-midionly/groove/drummer5/eval_session/1_funk-groove1_138_beat_4-4.mid
    tracks: [<midi track 'Base Midi' 1037 messages>]
    time sig: <meta message time_signature numerator=4 denominator=4 clocks_per_click=24 notated_32nd_notes_per_beat=8 time=0>
    tempo: <meta message set_tempo tempo=434783 time=0>
    last note_on: 30634
    good instruments: 4, {36.0: 'Bass Drum 1', 38.0: 'Acoustic Snare', 42.0: 'Closed Hi Hat', 51.0: 'Ride Cymbal 1'}


In [0]:
# capture timing data in df

mtt = data_prep.MidiTimingTools(file_name, f.ticks(), f.tempo_us(), f.ts_num(), f.ts_denom(), f.last_hit())
# print(mtt)

f_df = f.df_midi_data
f_df = f_df[f_df['msg_type'] == 'note_on'].copy() # we only care about 'note_on' events
beats_col, offsets_col = mtt.get_offsets(f_df[f.cum_ticks_col])
f_df['offset'] = offsets_col
f_df['beat'] = beats_col
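
For reference, a sketch of one way such beat and offset columns could be derived: snap each hit to the nearest grid line and keep the signed distance. This is an assumption about what `get_offsets` does, not a copy of it. The function name is hypothetical, and the 120-tick sixteenth-note grid is also assumed (it follows from 1920 ticks per bar in 4/4).

```python
import numpy as np


def nearest_grid_offsets(ticks, grid_ticks):
    """Return (grid index, signed offset) for each hit.

    A negative offset means the hit lands ahead of the grid line,
    a positive one means it lands behind.
    """
    ticks = np.asarray(ticks)
    grid_index = np.rint(ticks / grid_ticks).astype(int)
    offsets = ticks - grid_index * grid_ticks
    return grid_index, offsets


# Three hit times in ticks; 120 ticks per sixteenth is an assumed grid.
grid_index, offsets = nearest_grid_offsets([26894, 27020, 27091], 120)
print(grid_index.tolist(), offsets.tolist())
```

With this rule an offset never exceeds half a grid step in either direction, which keeps the values comparable across instruments.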

### Testing - specify/ configure what to print
The next cell is all you need to edit: put in what bar to start at and how many bars to print, hit the go button, and the cells after it should do the rest...

In [0]:
# specify what bar to start chart
bar_to_start=15

# specify how many bars to include
bars_to_render=2

# instrument filter
# if you leave empty, all instruments will be
# rendered, otherwise filtered to this list, will
# be ordered from bottom to top y-axis.

instruments_to_render=[]  # good empty list
# instruments_to_render=[38, 36, 42]  # good list
# instruments_to_render=[33, 32, 49, 999]  # broken list



### Testing - do the required calculations and print

In [46]:
# time filtering the df here...

ts_num = f.ts_num()
ts_den = f.ts_denom()
ticks_per_bar=mtt.ticks_per_bar()

# workout ticks
start_tick, end_tick = mtt.get_tick_range(bar_to_start, bars_to_render)
print('start tick: {}, end tick: {}'.format(start_tick, end_tick))
print('total to print, ticks: {}, bars: {}'.format(end_tick - start_tick, (end_tick - start_tick) /ticks_per_bar))

# filter our events to render
time_filtered_df = f_df[f_df.total_ticks >= start_tick]  # chop off early ones
time_filtered_df = time_filtered_df[time_filtered_df.total_ticks < end_tick] # chop off later ones

start tick: 26820.0, end tick: 30660.0
total to print, ticks: 3840.0, bars: 2.0
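
The numbers above are consistent with a window that starts half a sixteenth note before the bar line, presumably so that hits played slightly early still land in the chart. The sketch below reproduces them under that reading. It is a guess at the behaviour of `get_tick_range`, not its actual code; the function name, the 1-based bar numbering and the 60-tick lead-in are all assumptions.

```python
def tick_range(bar_to_start, bars_to_render, ticks_per_bar, lead_in_ticks=0):
    """Return (start_tick, end_tick) for a run of bars, numbered from 1.

    lead_in_ticks shifts the whole window earlier so that hits played
    just ahead of the first bar line are included.
    """
    start = (bar_to_start - 1) * ticks_per_bar - lead_in_ticks
    end = start + bars_to_render * ticks_per_bar
    return start, end


# 1920 ticks per bar and a 60-tick lead-in match the output printed above.
print(tick_range(15, 2, 1920, lead_in_ticks=60))
```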


In [47]:
names = data_prep.MidiTools.getInstruments(instruments_to_render)
print('provided instrument filter: {}, names: {}'.format(instruments_to_render, names ))

instr_time_filtered_df = None 

# instrument filtering the df here ...
if len(instruments_to_render) == 0:
  # keep every row that has a note number; notna() is the pandas way to test for missing values
  instr_time_filtered_df = time_filtered_df[time_filtered_df['note'].notna()]
else:
  instr_time_filtered_df = time_filtered_df[time_filtered_df['note'].isin(instruments_to_render)]


final_instruments_to_render = instr_time_filtered_df.note.unique()
names = data_prep.MidiTools.getInstruments(final_instruments_to_render)
print('final instruments: {}, names: {}'.format(final_instruments_to_render, names ))


provided instrument filter: [], names: []
final instruments: [36. 42. 51. 38.], names: ['Bass Drum 1', 'Closed Hi Hat', 'Ride Cymbal 1', 'Acoustic Snare']


### Build data structure required by broken_barh

In [60]:
bag_of_instruments = {}
for i in final_instruments_to_render:
  instrument_hits = instr_time_filtered_df.loc[instr_time_filtered_df['note'] == i, 'total_ticks']
  instrument_hit_duples = []
  for i_time in instrument_hits:
    instrument_hit_duples.append((i_time, 10))

  bag_of_instruments[i] = instrument_hit_duples


print('bag_of_instruments: {}'.format(bag_of_instruments))

bag_of_instruments: {36.0: [(26894, 10), (27123, 10), (28095, 10), (28232, 10), (28811, 10), (29032, 10), (30001, 10), (30121, 10)], 42.0: [(26897, 10), (27353, 10), (27863, 10), (28286, 10), (28808, 10), (29281, 10), (29799, 10), (30230, 10)], 51.0: [(26903, 10), (27135, 10), (27394, 10), (27621, 10), (27867, 10), (28112, 10), (28334, 10), (28584, 10), (28819, 10), (29059, 10), (29315, 10), (29538, 10), (29781, 10), (30026, 10), (30256, 10), (30511, 10)], 38.0: [(27020, 10), (27091, 10), (27138, 10), (27629, 10), (27967, 10), (28064, 10), (28341, 10), (28442, 10), (28692, 10), (28958, 10), (29015, 10), (29542, 10), (29877, 10), (30004, 10), (30263, 10), (30634, 10)]}
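
The same dictionary can be built in one pass with `groupby`. A sketch on a tiny hand-made frame: the function name is hypothetical and the column names mirror the ones used above. Keys are cast to int here for tidier printing, whereas the cell above keeps the float note numbers.

```python
import pandas as pd


def build_hit_spans(df, note_col="note", tick_col="total_ticks", width=10):
    """Map each note number to the (start, width) tuples broken_barh expects."""
    return {
        int(note): [(int(t), width) for t in group[tick_col]]
        # sort=False keeps instruments in order of first appearance
        for note, group in df.groupby(note_col, sort=False)
    }


# A few rows taken from the output above.
demo_df = pd.DataFrame({
    "note": [36.0, 42.0, 51.0, 38.0, 36.0],
    "total_ticks": [26894, 26897, 26903, 27020, 27123],
})
spans = build_hit_spans(demo_df)
print(spans)
```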


### work out some numbers for plotting

In [75]:

num_instruments = len(bag_of_instruments.keys())

dict_keys([36.0, 42.0, 51.0, 38.0])

In [0]:
from itertools import cycle
cycol = cycle('bgrcmyk')  # use this for colours; white ('w') left out, it is invisible on a white background


In [0]:

#############################################
## Show plot
#############################################
 
fig, ax = plt.subplots()
fig.set_size_inches(12*bars_to_render, 1+num_instruments, forward=True)
 
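# loop for each instrument, stacking rows from bottom to top
y_axis = 0
y_ticks = []
for i in bag_of_instruments.keys():
  y_axis += 10
  # next(cycol) pulls one colour per instrument out of the cycle
  ax.broken_barh(bag_of_instruments[i], (y_axis, 9), facecolors=next(cycol))
  y_ticks.append(y_axis + 5)


# ticks in one 16th note, worked out from the time signature
ticks_per_16 = ticks_per_bar * ts_den / (ts_num * 16)

ax.set_ylim(5, (num_instruments*10)+10)
ax.set_xlim(start_tick, end_tick) 
ax.set_xlabel('ticks since start')
ax.set_yticks(y_ticks)
ax.set_yticklabels(names)  # same order as bag_of_instruments
ax.set_xticks(range(int(start_tick), int(end_tick), int(ticks_per_16)))
ax.grid(True)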
ax.grid(True)
 
plt.show()
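
A self-contained version of the plotting loop, for trying it outside the notebook. It is a sketch: the function name and colour list are my own choices, and the Agg backend is selected only so it runs without a display.

```python
from itertools import cycle

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt


def plot_hits(spans, labels, x_range):
    """Draw one broken_barh row per instrument, bottom to top."""
    colours = cycle(["tab:red", "tab:blue", "tab:green", "tab:orange"])
    fig, ax = plt.subplots(figsize=(12, 1 + len(spans)))
    y_ticks = []
    for row, hits in enumerate(spans.values()):
        y_low = 10 + 10 * row
        ax.broken_barh(hits, (y_low, 9), facecolors=next(colours))
        y_ticks.append(y_low + 5)  # label roughly at the middle of the row
    ax.set_ylim(5, 10 * len(spans) + 10)
    ax.set_xlim(*x_range)
    ax.set_yticks(y_ticks)
    ax.set_yticklabels(labels)
    ax.set_xlabel("ticks since start")
    ax.grid(True)
    return fig, ax


# Two instruments with a couple of hits each, as (start_tick, width) tuples.
spans = {36: [(26894, 10), (27123, 10)], 38: [(27020, 10), (27091, 10)]}
fig, ax = plot_hits(spans, ["Bass Drum 1", "Acoustic Snare"], (26820, 28740))
```

Deriving the y ticks inside the loop keeps the labels in step with however many instruments are passed in.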


## Scratchpad for offset chart


```


#############################################
## Setup
#############################################

def get_hit_times_array():
  
  hh = [] # hi hats
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 51, midi_file.cum_ticks_col]:
    hh.append(t)
  print('hat times: {} - {}'.format(len(hh), hh))

  sn = [] # snare drum
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 38, midi_file.cum_ticks_col]:
    sn.append(t)
  print('snare times: {} - {}'.format(len(sn), sn))

  k = [] # kick/ bass drum
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 36, midi_file.cum_ticks_col]:
    k.append(t)
  print('kick times: {} - {}'.format(len(k), k))

  return np.array(hh), np.array(k), np.array(sn)

def get_offset_times_array():
  hats = [] # hi-hats
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 51, 'beat_offset']:
    hats.append(t)
  print('hat offsets: {} - {}'.format(len(hats), hats))

  snares = [] # snare offsets
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 38, 'beat_offset']:
    snares.append(t)
  print('snare offset: {} - {}'.format(len(snares), snares))

  kicks = [] # kick offsets
  for t in kic_sn_hats_df.loc[kic_sn_hats_df[midi_file.note_col] == 36, 'beat_offset']:
    kicks.append(t)
  print('kick offset: {} - {}'.format(len(kicks), kicks))

  return np.array(hats), np.array(kicks), np.array(snares)

# gather times of note strikes in the performance
hh_hits, kick_hits, sn_hits = get_hit_times_array()  

# gather list of strike offsets, i.e. how far the strikes were ahead/ behind the beat
hh_offsets, kick_offsets, sn_offsets = get_offset_times_array()

#############################################
## Show plot
#############################################

x_ticks = [*range(0, int(ticks_to_render), int(ticks_per_16))]
print(x_ticks)
len(x_ticks)

fig, ax = plt.subplots()

fig.set_size_inches(24, 6, forward=True)

ax.plot(hh_hits, hh_offsets, '-o', ms= 10, label='hats offsets')
ax.plot(kick_hits, kick_offsets, '-o', ms= 10, label='kick offsets')
ax.plot(sn_hits, sn_offsets, '-o', ms= 10,label='snare offsets')



ax.set(xlabel='time (ticks)', ylabel='offset (ticks)',
       title='kick, sn, hats timing offset data')

#Add horizontal and vertical lines
plt.axhline(0, color='black', linestyle='dotted', linewidth=4)  #horizontal line

# for xc in x_ticks:
#  plt.axvline(x=xc, color='black', linestyle='dotted')

ax.set_xticks(range(0, int(ticks_to_render), int(ticks_per_16)))

#ax.grid(axis='y')
ax.grid()
ax.legend()

#fig.savefig("test.png")
plt.show()


```
