# 21M.387 Fundamentals of Music Processing
## Lab3

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
from ipywidgets import interact
import sys
sys.path.append("..")
import fmplib as fmp
from fmplib.pyqueue import connect_to_queue

plt.rcParams['figure.figsize'] = (12, 4)
fmp.documentation_button()

## Python Review

You should be comfortable with numpy vectors at this point. The following review may be helpful for numpy matrices.

You can create a matrix explicitly as shown below. You can also ask a matrix for its dimensions with `shape` (this is a property, not a function).

In [None]:
a = np.array( ((1,2,3,4), (5,6,7,8), (9,10,11,12)) )
print(a)
print('shape is', a.shape)

The dimensions of a matrix are always: Rows x Columns.

You can do matrix indexing and slicing very similarly to vector slicing.  
A colon by itself (`:`) selects an entire row or column.

In [None]:
print('a[1,2] = ')
print(a[1,2], '\n')

print('a[0:2,1:3] =')
print(a[0:2,1:3], '\n')

print('a[2, :] =')
print(a[2, :], '\n')

print('a[:, 2] =')
print(a[:, 2])
print('Note this presents itself as a vector, even though it is a column')
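
The note about column slices can be checked directly. The sketch below uses the same matrix `a` as above; the names `col_1d` and `col_2d` are only illustrative. It shows that an integer index drops the axis, while a length-1 slice keeps it.

```python
import numpy as np

a = np.array(((1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12)))

col_1d = a[:, 2]      # integer index drops the column axis: shape (3,)
col_2d = a[:, 2:3]    # length-1 slice keeps the column axis: shape (3, 1)

print(col_1d.shape, col_2d.shape)
```

Use the slice form when later code expects a true column (for example, when broadcasting against a matrix).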

You can set matrix elements by slicing as well. For example:

In [None]:
a = np.random.random((4,6))
print(a, '\n')

# set the corners to specific values
a[0,0] = 1
a[0,-1] = 2
a[-1,-1] = 3
a[-1,0] = 4
print(a, '\n')

# set an entire row or column
a[1,:] = 0
a[:,2] = 0

print(a, '\n')

## Exercise 1

- Load the small bit of audio below and listen to it.
- Plot it.
- Identify the approximate start location of the first note ($n_1$). Actually, make $n_1$ a bit later than the exact start, so that you are past the note's transient.
- Create $x$: a window of length $N=1024$ starting at $n_1$ of the audio.
- Plot $x$.
- Create $x_w$: $x$ multiplied by a Hann window of the same length (see `np.hanning()`).
- Plot $x_w$.
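
As a rough illustration of the slicing and windowing steps, here is a minimal sketch on a synthetic signal. The 440 Hz sine and the start sample `n1 = 2000` are assumptions standing in for the piano recording and for the value you read off your own plot.

```python
import numpy as np

fs = 22050          # sample rate in Hz, matching the lab's audio
N = 1024            # window length
n1 = 2000           # hypothetical start sample, normally chosen from a plot

# Stand-in signal: one second of a 440 Hz sine instead of the piano recording.
t = np.arange(fs) / fs
snd = np.sin(2 * np.pi * 440 * t)

x = snd[n1:n1 + N]            # N samples starting at n1
xw = x * np.hanning(N)        # Hann window tapers both ends toward zero

print(len(x), xw[0], xw[-1])
```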

In [None]:
snd1 = fmp.load_wav("audio/piano_arpeg.wav")
fs = 22050
ipd.Audio(snd1, rate=fs)

In [None]:
connect_to_queue()

## Exercise 2

- Create $\lvert X \rvert$, the magnitude of the first half of the DFT of $x_w$, using the `np.fft.rfft` function.
- Plot it.
- Find the top 4 peaks of this signal (using `fmp.find_peaks`). You may need to play around with the optional `thresh` parameter to get only the highest 4 peaks of $\lvert X \rvert$.
- Plot the peaks.
- Print out the bin numbers for these top 4 peaks.
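
The sketch below illustrates the magnitude-spectrum and peak-picking steps on a synthetic two-sine signal. It does not use `fmp.find_peaks`; `simple_peaks` is a hypothetical stand-in written only for this example, and its `thresh` argument (a fraction of the maximum) is an assumption that may not match how the library's parameter behaves.

```python
import numpy as np

fs = 22050
N = 1024

# Stand-in for x_w: two windowed sines at 500 Hz and 1500 Hz.
n = np.arange(N)
x = np.sin(2 * np.pi * 500 * n / fs) + 0.5 * np.sin(2 * np.pi * 1500 * n / fs)
xw = x * np.hanning(N)

# rfft returns only the non-negative frequencies: N // 2 + 1 bins.
X_mag = np.abs(np.fft.rfft(xw))

def simple_peaks(mag, thresh):
    """Hypothetical helper: local maxima taller than thresh * max(mag)."""
    interior = mag[1:-1]
    is_peak = (
        (interior > mag[:-2])
        & (interior >= mag[2:])
        & (interior > thresh * mag.max())
    )
    return np.nonzero(is_peak)[0] + 1   # +1 undoes the offset of the interior view

peaks = simple_peaks(X_mag, thresh=0.2)
print(peaks)
```

Raising `thresh` in this sketch discards smaller peaks; lowering it lets window sidelobes through.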

In [None]:
connect_to_queue()

## Exercise 3a

- Write the function `bin_to_freq` which returns the frequency of a given FFT bin ($k$). This function should work with inputs that are scalars or vectors. (Hint: what other inputs does this function need?)
- Print the frequencies of the peaks of your DFT.
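
One possible shape for this conversion is sketched below, assuming the extra inputs are the sample rate `fs` and the DFT length `N`, so that bin $k$ sits at $k \cdot f_s / N$ Hz. The name `bin_to_freq_sketch` is hypothetical and is not the signature the lab prescribes.

```python
import numpy as np

def bin_to_freq_sketch(k, fs, N):
    """Frequency in Hz of DFT bin k for sample rate fs and DFT length N."""
    # np.asarray lets the same code accept scalars, lists, and arrays.
    return np.asarray(k) * fs / N

print(bin_to_freq_sketch(1, 22050, 1024))
print(bin_to_freq_sketch(np.array([0, 256, 512]), 22050, 1024))
```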

In [None]:
def bin_to_freq(k):
    pass

# print peak frequencies

## Exercise 3b

- Write the function `freq_to_pitch` which returns the MIDI pitch (as a floating point value) from a given frequency.
- Print the MIDI pitches of the peaks of your DFT.

Since that first note is a C4 played on piano, the first 4 pitches should correspond roughly to the first 4 harmonics: C4, C5, G5, C6.

How accurate are the pitches from your DFT analysis?
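
A common way to express this mapping is $p = 69 + 12 \log_2(f / 440)$, which assumes the standard tuning reference A4 = MIDI 69 = 440 Hz. The sketch below uses the hypothetical name `freq_to_pitch_sketch` and is only one way to write it.

```python
import numpy as np

def freq_to_pitch_sketch(f):
    """Fractional MIDI pitch for frequency f in Hz, assuming A4 = 69 = 440 Hz."""
    return 69 + 12 * np.log2(np.asarray(f) / 440.0)

# Ideal harmonics of C4 (about 261.63 Hz) for comparison with DFT results.
c4 = 440.0 * 2 ** ((60 - 69) / 12)
print(freq_to_pitch_sketch(c4 * np.array([1, 2, 3, 4])))
```

Note that the third harmonic lands slightly above MIDI 79 (G5), because a pure 3:1 frequency ratio is not an exact number of equal-tempered semitones.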

In [None]:
def freq_to_pitch(f):
    pass

# print midi pitches

In [None]:
connect_to_queue()

## Exercise 4

The frequency resolution of the DFT with the current $N$ is not that good. You can see that the pitches are not very accurate.

One way to improve the accuracy is to increase $N$. In this exercise, let's increase $N$ by zero-padding the windowed signal $x_w$.

- Create a function `zpad` that zero-pads a vector to new length $N_{zp}$. You can use `np.concatenate` or `np.pad`.
- Repeat this process (Exercises 2 and 3) to arrive at a set of MIDI pitch values for $x_w$, while trying out increasing values of $N_{zp}$. It is generally good practice (but not mandatory) that $N_{zp}$ be a power of 2.
- Hint: it is probably easiest to write a function with $x$ and $N_{zp}$ as inputs that returns or prints out the MIDI pitches.

Do you observe the accuracy getting better?  
Do you get to a point where increasing $N_{zp}$ stops improving the accuracy of the results?
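
To make the idea concrete, the sketch below zero-pads a windowed synthetic sine (a stand-in for C4 at about 261.63 Hz, not the piano audio) and reports the frequency of the largest bin for several padded lengths. The helper name `zpad_sketch` is hypothetical; it uses `np.pad`, whose default mode appends constant zeros.

```python
import numpy as np

def zpad_sketch(x, nzp):
    """Hypothetical helper: append zeros to x so the result has length nzp."""
    if nzp < len(x):
        raise ValueError("nzp must be at least len(x)")
    return np.pad(x, (0, nzp - len(x)))

fs = 22050
N = 1024
n = np.arange(N)
xw = np.sin(2 * np.pi * 261.63 * n / fs) * np.hanning(N)   # stand-in for C4

for nzp in (1024, 4096, 16384):
    mag = np.abs(np.fft.rfft(zpad_sketch(xw, nzp)))
    k = np.argmax(mag)              # bin of the largest peak
    print(nzp, k * fs / nzp)        # estimated peak frequency in Hz
```

Zero-padding samples the same underlying spectrum on a finer grid, so the peak location is read off more precisely even though no new audio has been added.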


In [None]:
def zpad(x, nzp):
    pass

# run Ex 2-3 with different values of nzp

In [None]:
connect_to_queue()

## Exercise 5

Now load this bit of audio and listen to it.

In [None]:
snd2 = fmp.load_wav("audio/piano_diad.wav")
fs = 22050
ipd.Audio(snd2, rate=fs)

There are two notes played at the same time. Get $N=1024$ samples from the start of this two-note chord. Use the technique of Exercise 4 to find the pitches in this audio recording. As above, increase the zero-padding to achieve more accuracy.

The pitches played by the piano are B2 and C#3 (MIDI 47 and 49). Yet these results don't seem very good. Why is that?

In [None]:
connect_to_queue()

## Exercise 6

Let's increase $N$ in a different way: by using a larger initial window (grabbing a larger portion of the audio) instead of zero-padding.

Start with the same `snd2` of Exercise 5, but this time, slice off a larger and larger portion of the audio to create the initial DFT. Don't forget to apply the Hann window. Then, find the pitches corresponding to the DFT peaks.

- For what value of $N$ can you start to see two pitches? 
- For what value of $N$ do the pitches become accurate?
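
One way to reason about these questions is to compare the spacing between the two fundamentals with the DFT bin spacing $f_s / N$. The sketch below does only that arithmetic; `pitch_to_freq` is a hypothetical helper assuming A4 = MIDI 69 = 440 Hz, and it says nothing about the exact $N$ at which your peak picker will separate the notes.

```python
import numpy as np

fs = 22050

def pitch_to_freq(p):
    """Frequency in Hz of MIDI pitch p, assuming A4 = MIDI 69 = 440 Hz."""
    return 440.0 * 2.0 ** ((p - 69) / 12)

f_lo, f_hi = pitch_to_freq(47), pitch_to_freq(49)   # B2 and C#3
gap = f_hi - f_lo                                   # spacing between the fundamentals in Hz

for N in (1024, 2048, 4096, 8192):
    bin_hz = fs / N                                 # DFT bin spacing in Hz
    # Print N, the bin spacing, and the gap measured in bins.
    print(N, round(bin_hz, 2), round(gap / bin_hz, 2))
```

The Hann window's main lobe spans about 4 bins from null to null, so two components only a fraction of a bin apart tend to merge into a single peak.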


In [None]:
connect_to_queue('checkoff')