# 21M.387 Fundamentals of Music Processing
## Problem Set 2: Simpler Classifier

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
from ipywidgets import interact
import sys
sys.path.append("..")
import fmplib as fmp

plt.rcParams['figure.figsize'] = (12, 4)
fmp.documentation_button()

## Exercise 1

The following is a continuous-time signal $x_c(t)$.

![](data/ex1a.png)

If $x_c(t)$ is sampled with a sampling period of $T = 0.04$ seconds, creating a discrete-time (DT) signal $x[n]$, what values of $n$ correspond to:  
a)  the first peak value  
b)  the first zero-crossing  


In [None]:
# answer

The following DT signal $x[n]$ was sampled from a continuous signal $x_c(t)$ at a sampling rate of $F_s = 150$ Hz.

![](data/ex1b.png)

When (in seconds) did the following occur in $x_c(t)$:  
a)  the second peak value  
b)  the second zero-crossing  
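
Both parts of this exercise rely on the relation $t = nT = n / F_s$. A minimal sketch of the two conversions, using made-up values rather than the ones in the figures (the helper names `time_to_index` and `index_to_time` are hypothetical, not part of `fmplib`):

```python
def time_to_index(t, T):
    """Sample index n for a time t (seconds), given sampling period T (seconds)."""
    return round(t / T)

def index_to_time(n, fs):
    """Time in seconds of sample index n, given sampling rate fs (Hz)."""
    return n / fs

# Hypothetical values, chosen only for illustration.
n = time_to_index(0.25, 0.01)   # an event at 0.25 s, sampled every 0.01 s
t = index_to_time(30, 200)      # sample 30 at a 200 Hz sampling rate
print(n, t)
```

Rounding is used because an event in $x_c(t)$ does not generally fall exactly on a sample.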


In [None]:
# answer

## Exercise 2

You have an audio signal $x[n]$, sampled at a sampling rate $F_s$. You create an energy feature signal $E^x[n]$ using a centered window length of $N$ and a hop size of $H$. Consider the following parameter choices:  
1. $F_s = 22050, N = 1000, H = 500$
2. $F_s = 44100, N = 1024, H = 441$
3. $F_s = 8000,  N = 2048, H = 256$

For each of these parameter sets, calculate:
- the feature rate $F_f$ of the energy signal  
- the sampling period of the energy signal  
- the time (in seconds) corresponding to $E^x[300]$.
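
The three quantities above can be sketched as follows, assuming the usual centered-window convention in which frame $m$ is centered on audio sample $mH$. The function name `feature_timing` and the parameter values are hypothetical and deliberately differ from the three cases listed.

```python
def feature_timing(fs, hop_size, frame_index):
    """Return (feature rate in Hz, feature period in s, time of a frame in s).

    Assumes a centered window, so frame m sits at audio sample m * hop_size.
    """
    feature_rate = fs / hop_size
    feature_period = hop_size / fs
    frame_time = frame_index * hop_size / fs
    return feature_rate, feature_period, frame_time

# Hypothetical parameters, for illustration only.
rate, period, t = feature_timing(fs=16000, hop_size=160, frame_index=50)
print(rate, period, t)
```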

In [None]:
# answers

## Exercise 3

1. Describe what happens to $E^x[n]$, relatively speaking, when you increase or decrease the hop size $H$.
2. Describe what happens to $E^x[n]$, relatively speaking, when you increase or decrease the window length $N$.
3. What happens if $H > N$?


Answers:

## Exercise 4a

Write the function `find_peaks(x, thresh)`, as described in lecture, where:  

Inputs:
- `x` is the input signal
- `thresh` is a threshold value $\rho \in [0,1]$  

Output:
- an array of indices $P$ corresponding to the peak locations

Peaks are defined as:
$P = (p_1, p_2, \cdots, p_L)$ where $p_n$ is a peak if:
$$ x[p_n] > x[p_n - 1] \wedge x[p_n] > x[p_n + 1] $$

Additionally, filter out any peaks that are smaller than $ \max(x[n]) \cdot \rho $.
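
To make the definition concrete, here is a small loop-based illustration on a toy array. It is only a sketch of the two conditions, not the implementation from lecture; the helper `is_peak` and the values of `x` and `rho` are made up for this example.

```python
import numpy as np

x = np.array([0.0, 0.2, 0.1, 0.9, 0.3, 0.5, 0.4])  # toy signal
rho = 0.5                                           # toy threshold

def is_peak(x, p):
    """True if x[p] is strictly greater than both neighbors (interior p only)."""
    return x[p] > x[p - 1] and x[p] > x[p + 1]

# Height below which a local maximum is discarded.
min_height = rho * np.max(x)

# Check the peak condition at every interior index.
local_maxima = [p for p in range(1, len(x) - 1) if is_peak(x, p)]

# Keep only the peaks that are not smaller than the threshold height.
kept = [p for p in local_maxima if x[p] >= min_height]
print(local_maxima, kept)
```

Note that the first and last samples have only one neighbor, so this sketch never reports them as peaks.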

In [None]:
def find_peaks(x, thresh):
    pass

Test your code below and try different threshold values.

In [None]:
x4 = np.load('data/ex4.npy')
peaks = find_peaks(x4, 0)

plt.figure()
plt.plot(x4)
plt.plot(peaks, x4[peaks], 'ro');

## Exercise 4b

- What value of `thresh` filters out the "garbage" peaks near 0?
- What value of `thresh` returns just the highest peaks?

In [None]:
# answer:


## Exercise 5

The code below finds the number of zero crossings in a signal. Run the example and see how long it takes to run this code.

In [None]:
def num_zerocrossings_slow(x) :
    count = 0
    for i in range(len(x)-1) :
        if (x[i] > 0 and x[i+1] <= 0) or (x[i] < 0 and x[i+1] >=0):
            count += 1
    return count

In [None]:
x5 = fmp.load_wav('audio/kick_snare.wav', 1.4, 1.45)
plt.plot(x5)
plt.grid()
plt.show()

print( num_zerocrossings_slow(x5) )

In [None]:
%timeit -n200 num_zerocrossings_slow(x5)

Write a more optimized version of this function below. How much faster is your optimized version?

Hint 1: You can avoid the Python `for` loop entirely.  
Hint 2: You can think of a zero-crossing as the signal changing sign (from positive to negative or negative to positive). What mathematical operation and inequality test can be used to determine a sign change?
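
As a small illustration of the sign-change idea on individual pairs of neighboring samples (the values are made up, and this is one possible test rather than the required one): the product of two numbers is negative exactly when they have opposite signs.

```python
# Each tuple is a hypothetical pair of adjacent samples (x[i], x[i+1]).
pairs = [(0.3, -0.2), (-0.1, 0.4), (0.3, 0.5), (-0.2, -0.6), (0.3, 0.0)]

# A negative product means the two samples have opposite signs.
flags = [a * b < 0 for a, b in pairs]

for (a, b), changed in zip(pairs, flags):
    print(a, b, changed)
```

The last pair is worth a look: a sample that is exactly zero makes the product zero, so a strict `< 0` test does not flag it, whereas `num_zerocrossings_slow` does count a move from a positive value to zero. Decide how your version should treat that case.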

In [None]:
def num_zerocrossings(x) :
    pass

## Exercise 6

Write the function `calc_zc(x, win_len, hop_size)` to return a zero-crossing feature. It should take as inputs:
- the signal $x$
- the window length $N$
- the hop size $H$

The output should be a _normalized_ zero-crossing feature.  
Make sure that your function uses a _centered window_ by using the zero-padding trick.
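
A minimal sketch of the zero-padding idea on a toy signal, assuming the trick means padding `win_len // 2` zeros on each side so that frame $m$ is centered on original sample $mH$. The frame-count convention used here is also an assumption and may differ from the one in `fmplib`.

```python
import numpy as np

x = np.arange(1, 11, dtype=float)  # toy signal: 1, 2, ..., 10
win_len, hop_size = 4, 2           # toy parameters

# Pad half a window of zeros on each side (np.pad defaults to constant zeros).
half = win_len // 2
x_padded = np.pad(x, (half, half))

# Frame m starts at m * hop_size in the padded signal, which places its
# center at sample m * hop_size of the original signal.
num_frames = (len(x_padded) - win_len) // hop_size + 1
frames = [x_padded[m * hop_size : m * hop_size + win_len] for m in range(num_frames)]

print(frames[0], frames[1])
```

Without the padding, frame 0 would cover samples 0 to `win_len - 1` and would be centered half a window late.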

In [None]:
def calc_zc(x, win_len, hop_size) :
    pass

Now listen to the test signal below and plot the zc feature signal generated by your function.

Use $N = 1000$ and $H = 500$.


In [None]:
x6 = fmp.load_wav("audio/zc_test.wav")
fs = 22050

## Exercise 7a

Listen to the following piece of audio. It has 4 different types of drum sounds (Kick Drum, Low Tom, High Tom, Snare).

In [None]:
x7 = fmp.load_wav("audio/drum_hits.wav")
fs = 22050
ipd.Audio(x7, rate=fs)

Write the function `calc_onsets(x)` that returns the locations of the onsets in the piece of audio. You can use existing functions for calculating energy, ENF, and finding peaks from lab2 and `fmplib`. Tune the parameters until you get all the onsets without any missing or extra values. Return the list in units of the audio sampling rate $F_s = 22050$ (so, for example, if the first drum hit happened exactly at time = 0.5s, then `calc_onsets(x)[0] == 11025`).

Then, plot the waveform and the location of the onsets on the same plot to make sure you got it right.
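
Peaks found in a feature signal are indexed in feature frames, so they need to be scaled by the hop size to land in audio samples. A short sketch of that conversion, assuming a centered window; the hop size and peak indices below are hypothetical.

```python
import numpy as np

fs = 22050
hop_size = 441                           # hypothetical hop size
peak_frames = np.array([25, 50, 100])    # hypothetical peak indices, in frames

# Frame m of a centered-window feature corresponds to audio sample m * hop_size.
onset_samples = peak_frames * hop_size
onset_seconds = onset_samples / fs
print(onset_samples, onset_seconds)
```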

In [None]:
def calc_onsets(x):
    pass

## Exercise 7b

For the first 4 onsets, plot the waveform of the drum hit, starting from the onset location and lasting 200 milliseconds.
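
A sketch of how a 200 ms segment can be sliced out of a signal, using a silent stand-in signal and a made-up onset location rather than the real drum audio:

```python
import numpy as np

fs = 22050
num_samples = round(0.2 * fs)     # 200 ms expressed in samples

x = np.zeros(fs)                  # stand-in: one second of silence
onset = 11025                     # hypothetical onset location, in samples

# Slice from the onset for 200 ms.
segment = x[onset : onset + num_samples]
print(num_samples, len(segment))
```

If an onset lies closer than 200 ms to the end of the signal, the slice is simply shorter.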

In [None]:
# make plots

## Exercise 8

Now that you can see what the waveforms look like, create a classifier that distinguishes between the four different drum types. Use zero-crossing to help you in this task. However, note that the zero-crossing feature is unreliable during periods of near-silence. Make your classifier avoid this problem for more accurate results.

Write the function `classify_onsets(x, onsets)`:

Inputs:  
- `x`: the original signal
- `onsets`: the onsets that you calculated in Exercise 7a

Outputs:
- a list of integers $(0 \le i < 4)$ that identify the drum type for each onset: {0: kick, 1: low-tom, 2: high-tom, 3: snare}

In [None]:
def classify_onsets(x, onsets):
    pass

In [None]:
onsets = calc_onsets(x7)  # onsets from Exercise 7a
onset_types = classify_onsets(x7, onsets)
print(onset_types)

## Exercise 9

Now that you have the locations and types of the drums, synthesize an audio track of these drum sounds at the proper locations by writing the function `synthesize_drums(onsets, types)`.

Inputs:
- `onsets`: the locations of onsets, as returned by `calc_onsets()`
- `types`: an array of the types of onsets corresponding to `onsets`, as returned by `classify_onsets()`
- Use the `*_snd` sounds below as global variables to generate the synthetic sounds.

Output:
- an audio waveform that can be played with `ipd.Audio()`.

Then, 
- Play that synthesized drum track.
- Compare this to the original track. How did you do?
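
One way to place short sounds at given sample locations is to add them into a zero-filled output buffer. The sketch below uses a toy two-sample "click" and made-up positions; the buffer length and the clipping at the end of the buffer are assumptions for this example.

```python
import numpy as np

out = np.zeros(10)                # toy output buffer
click = np.array([1.0, 0.5])      # toy sound, two samples long

for start in [2, 3, 9]:           # hypothetical onset locations, in samples
    # Clip the sound so it never runs past the end of the buffer.
    end = min(start + len(click), len(out))
    # Adding (rather than assigning) lets overlapping sounds mix.
    out[start:end] += click[: end - start]

print(out)
```

Using `+=` matters when one drum hit is still ringing as the next one starts.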

In [None]:
kick_snd = fmp.load_wav('audio/kick.wav')
low_snd = fmp.load_wav('audio/low.wav')
high_snd = fmp.load_wav('audio/high.wav')
snare_snd = fmp.load_wav('audio/snare.wav')

def synthesize_drums(onsets, types):
    pass

Now use `synthesize_drums()` to create the same drum pattern, but played twice as fast by modifying the locations of the onsets.

In [None]:
# answer:
