# RMS/Peak matrix

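In this notebook we will build a peak-versus-RMS level matrix for a piece of music: a two-dimensional histogram that counts, over short analysis windows, how often each pair of peak level and RMS level (in dB) occurs.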

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import IPython
from IPython.display import Image
import scipy.signal as sp
from scipy.io import wavfile

# use a grayscale colormap for all plots
plt.gray()

Let's load an audio clip; we assume PCM mono format with 16 bits per sample. Change the code accordingly for other formats. Audio is normalized over the $[-1, 1]$ interval.

In [None]:
SF, s = wavfile.read('beethoven.wav')
# normalize amplitude: 16-bit samples span [-32768, 32767]
s = s / 32768.0

IPython.display.Audio(s, rate=SF)
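
For other sample formats, the normalization constant can be derived from the array's dtype instead of being hard-coded. The following is a minimal sketch, not part of the original notebook: `normalize_pcm` is a hypothetical helper, and it assumes that integer data is plain PCM and that floating-point data is already within $[-1, 1]$.

```python
import numpy as np

def normalize_pcm(x):
    """Scale PCM samples to floats in [-1, 1) based on the array dtype (hypothetical helper)."""
    x = np.asarray(x)
    if x.dtype == np.uint8:
        # 8-bit WAV data is unsigned and centered on 128
        return (x.astype(np.float64) - 128.0) / 128.0
    if np.issubdtype(x.dtype, np.integer):
        # signed PCM: divide by the magnitude of the most negative value
        return x.astype(np.float64) / -float(np.iinfo(x.dtype).min)
    # floating-point data is assumed to be normalized already
    return x.astype(np.float64)

# 16-bit example: -32768 maps to -1.0 and 16384 maps to 0.5
print(normalize_pcm(np.array([-32768, 0, 16384], dtype=np.int16)))
```

A stereo clip would also need to be reduced to one channel before the analysis, for instance by averaging the two columns.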

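Helper function to convert an amplitude to an integer dB value. The reference value is 1 because of the audio normalization, so the result is the level in dB below full scale: 0 for a full-scale value and larger numbers for quieter ones. Values are truncated to integers and clamped to the range 0 to 120 so that they can be used as matrix indices.

In [None]:
def idB(val):
    # silence (or an invalid non-positive value) maps to the lowest level
    if val <= 0:
        return 120
    # dB below full scale, clamped to [0, 120] and truncated to an integer
    return int(np.clip(-20 * np.log10(val), 0, 120))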
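
As a quick check of the dB convention, the sketch below converts a few amplitudes to levels below full scale. The amplitude values are arbitrary examples chosen for illustration; halving the amplitude adds roughly 6 dB.

```python
import numpy as np

# example amplitudes relative to a full-scale value of 1
amplitudes = np.array([1.0, 0.5, 0.25, 0.05])

# level in dB below full scale (positive numbers mean quieter)
levels = -20 * np.log10(amplitudes)

# truncate to integers, as int() does in the helper
truncated = levels.astype(int)

for a, lv, t in zip(amplitudes, levels, truncated):
    print(f"amplitude {a:5.2f} -> {lv:6.2f} dB -> index {t}")
```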

The following parameters define the granularity of the analysis:

In [None]:
WIN_LEN_MS = 40 # length of the analysis window in milliseconds
WIN_OVERLAP = 0 # window overlap (fraction of the window length, 0 <= overlap < 1)
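
To see how these two parameters translate into samples, here is a small sketch of the arithmetic. The helper name `window_grid`, the 44.1 kHz sampling rate and the 10-second clip length are assumptions made for the example; the function counts only windows that fit entirely inside the clip.

```python
def window_grid(sf, win_len_ms, overlap, num_samples):
    """Return (window length, hop size, number of full windows), all in samples."""
    win_len = int((win_len_ms * sf) / 1000.0)
    win_ovr = int(win_len * overlap)
    hop = win_len - win_ovr  # distance between consecutive window starts
    count = len(range(0, num_samples - win_len + 1, hop))
    return win_len, hop, count

# assumed values: 44.1 kHz audio, 10 seconds long
sf, num_samples = 44100, 441000
print(window_grid(sf, 40, 0.0, num_samples))  # no overlap
print(window_grid(sf, 40, 0.5, num_samples))  # 50% overlap
```

With no overlap each sample contributes to exactly one window; a larger overlap yields more data points for the matrix at the cost of correlated measurements.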

Let's analyze the audio:

In [None]:
# convert ms to samples
win_len = int((WIN_LEN_MS * SF) / 1000.0)
win_ovr = int(win_len * WIN_OVERLAP)

# initialize matrix
res = np.zeros((121, 121))

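# scan audio: one (peak, RMS) pair per analysis window
for n in range(0, len(s) - win_len + 1, win_len - win_ovr):
    w = s[n:n+win_len]
    pm = idB(np.max(np.abs(w)))
    rms = idB(np.sqrt(np.mean(np.square(w))))
    # rows index the peak level, columns the RMS level
    res[pm, rms] += 1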
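
A useful sanity check for this kind of analysis is a pure sine wave, whose RMS value is $1/\sqrt{2}$ times its peak, about 3 dB lower. The sketch below is not part of the original notebook: `peak_rms_db` is a hypothetical standalone helper, and the 1 kHz tone at 44.1 kHz is an assumed test signal covering one 40 ms window.

```python
import numpy as np

def peak_rms_db(w):
    """Integer peak and RMS levels of window w, in dB below full scale (hypothetical helper)."""
    peak = np.max(np.abs(w))
    rms = np.sqrt(np.mean(np.square(w)))
    return int(-20 * np.log10(peak)), int(-20 * np.log10(rms))

sf = 44100                       # assumed sampling rate
n = np.arange(sf * 40 // 1000)   # sample indices for one 40 ms window
sine = np.sin(2 * np.pi * 1000 * n / sf)

print(peak_rms_db(sine))         # full-scale sine
print(peak_rms_db(0.5 * sine))   # same tone, about 6 dB quieter
```

Both tones fall 3 dB away from the diagonal of the matrix; scaling the amplitude only moves the point along a line parallel to the diagonal.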

In [None]:
# normalize matrix range so that max is black and min is white
m = np.max(res)
res = 1 - res / m

In [None]:
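plt.matshow(res)
plt.gca().invert_xaxis()
plt.xlabel('RMS level (dB below full scale)')
plt.ylabel('peak level (dB below full scale)')
# reference lines: peak equal to RMS, and 10 dB offsets on either side
plt.plot([0,120], [0, 120], linewidth=0.2)
plt.plot([10,120], [0, 110], linewidth=0.2)
plt.plot([0,110], [10, 120], linewidth=0.2)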