<a href="https://colab.research.google.com/github/youngmoo/ECES-435/blob/main/Class2-2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **ECES-435: Class 2.2**
Today: All about the DFT and STFT.



# *Optional*: install `ipympl` for Matplotlib

* This module is needed to enable *interactive* Matplotlib figures
* This installs packages that will require a restart of the runtime, so run this cell first.

In [None]:
!pip install ipympl   # Also installs a more recent version of Matplotlib (v3.5.3)

# *Optional stuff*

## Enable interactive Matplotlib figures

It's annoying that Colab's Matplotlib doesn't have this by default, but with a little work, we can enable basic figure interaction:
* Pan
* Zoom in/out
* Display of cursor coordinates

In [None]:
from google.colab import output
output.enable_custom_widget_manager()
%matplotlib widget

## My plotting style defaults
Feel free to adjust these to your liking.

In [None]:
from matplotlib import rc

rc('figure', figsize=(12,4))
rc('figure', facecolor='#aaaaaa')     # Better figure background for dark mode

rc('font', family='Liberation Serif') # Nicer font
rc('font', size=20)                   # Larger font size for labels

# The usual setup
Let's start by importing the "usual" modules we'll be using...

In [None]:
import numpy as np                # Load the NumPy module, using the abbreviation 'np'.
import matplotlib.pyplot as plt   # Load the Matplotlib module, abbreviated as 'plt'
import IPython.display as ipd     # Load the Interactive Python display module, abbreviated as 'ipd'
import soundfile as sf            # Load the 'soundfile' module for reading and writing sound files

# Define a variable that's the directory path to our shared Google Drive folder
path = '/content/drive/MyDrive/eces435-work/class2.2/'

In [None]:
# ENTER YOUR USERNAME
username = 'tk421'

from google.colab import drive
drive.mount('/content/drive')

# Helper functions


## `myPlot()` function
A quick time-domain signal plot function with my default figure settings and a time x-axis (in seconds).
* Required arguments:
  * `sig` Input signal (first argument)
* Optional arguments:
  * `N=#` Number of samples to plot (default: length of signal)
  * `fs=#` Sample rate of signal (default: 44100 Hz)
  * `fig_size=(W,H)` Change figure dimensions (width, height)
  * `x_ax=True/False` Show x-axis (default: True)
  * `y_ax=True/False` Show y-axis (default: True)
  * `lw=#` Change linewidth of signal (default: 1)
  * `fmt='...'` Plot format string (default: none)

In [None]:
def myPlot(sig, N=0, fs=44100, fig_size=(16,4), x_ax=True, y_ax=True, lw=1, fmt=''):
  if N==0:
    N = len(sig)

  fig = plt.figure(figsize=fig_size)
  t = np.arange(N)/fs

  plt.plot( t[:N], sig[:N], fmt, linewidth=lw)

  plt.xlabel('Time (sec)')
  ax = plt.gca()    # gca(): "Get current axes", the Axes object that's currently active
  
  if x_ax == False:
    ax.xaxis.set_visible(False)
  if y_ax == False:
    ax.yaxis.set_visible(False)

  fig.tight_layout()
  plt.ion()

  # Returning the figure causes issues with interactive matplotlib
  #return fig
  # For saving the figure, use the interactive button, instead.
  # For further customization and command-line saving, more changes are required.

# From last class: Can we determine *when* a frequency happens?


## Let's reload the music...

In [None]:
filepath = path + 'aha.wav'
[aha, fs] = sf.read(filepath)
print('sampling rate:' , fs)

ipd.Audio(aha,rate=fs)

Using the entire music sample doesn't tell us much about *when* those frequencies occur (or how they change over time).
* Instead, what if we take a short clip (20 ms) of the music? Let's call this an analysis *frame*.
* Then, we know any frequencies present in the frame occur within that short amount of time (20 ms).

How many samples give us 20 ms?

In [None]:
dur = 0.02
dur * fs    # Number of samples?

Create a frame that we'll call `clip`.
* Have a listen (it's *really* short!)

In [None]:
frameSize = int(fs * dur)   # We want to use this as an array index, so we cast it to int (otherwise you get an error)
clip = aha[fs : fs + frameSize]   # Let's clip a 0.02 sec segment, starting at 1 sec

myPlot(clip)
ipd.Audio(clip, rate=fs)

## Now, what's our "fundamental" analysis frequency?

In [None]:
T0 = frameSize / fs   # Should be 20 ms
print("Period:",T0,"seconds")
f0 = 1/T0
print("Fundamental:",f0,"Hz")
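
As a quick check of the relationship above, here's a minimal sketch that computes the frame length, the fundamental analysis frequency, and the first few analysis frequencies. It assumes a sample rate of 44100 Hz (the notebook reads the real `fs` from the file), and the `_demo` names are just for illustration.

```python
import numpy as np

fs_demo = 44100                           # Assumed sample rate (Hz)
dur_demo = 0.02                           # Frame duration (sec)

N_demo = int(round(fs_demo * dur_demo))   # Samples per frame
f0_demo = fs_demo / N_demo                # Fundamental analysis frequency (Hz) = 1 / frame duration
f_k_demo = f0_demo * np.arange(5)         # First five analysis frequencies: k * f0

print("Frame length:", N_demo, "samples")
print("Fundamental:", f0_demo, "Hz")
print("Analysis frequencies:", f_k_demo, "Hz")
```

A longer frame gives a lower fundamental (finer frequency spacing), but tells us less precisely *when* those frequencies occur.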

## A literal implementation of the Fourier Transform
* The Fourier Transform is simply a Fourier Series with the following constraints...
  * Our fundamental analysis frequency, $f_0$ is determined by the number of samples we give it (our "period").
  * The Fourier Series coefficients tell us how much of each harmonic of $f_0$ is present in the frame, so $f_k = k \cdot f_0$.


  * $X[\omega_k] = \sum_{n=0}^{N-1} x[n] e^{-j \omega_k n} =\sum_{n=0}^{N-1} x[n] \left(\cos[\omega_k n] - j \sin[\omega_k n]\right)$
    * $\omega_k = 2 \pi f_k / f_s$
    * $N$ is the period (in samples), then…
  * $a_k = X_{Re}[\omega_k] = \sum_{n=0}^{N-1} x[n] \cos\left[\frac{2\pi f_k n}{f_s}\right]$
  * $b_k = -X_{Im}[\omega_k] = \sum_{n=0}^{N-1} x[n] \sin\left[\frac{2\pi f_k n}{f_s}\right]$
  * Magnitude $c_k = |X[\omega_k]| = \sqrt{a_k^2 + b_k^2}$
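
The sums above can be sketched directly in NumPy. This is a minimal example on a synthetic test tone, not the class solution: the helper `fourier_coeffs` and the `_demo` variables are illustrative names, and the 8000 Hz sample rate is an assumption. It also checks that $a_k$ and $b_k$ match the real part and the negated imaginary part of NumPy's FFT.

```python
import numpy as np

def fourier_coeffs(x, fs, K):
    """Return (a_k, b_k, c_k) for k = 0..K-1 using the cos/sin sums (illustrative helper)."""
    N = len(x)
    n = np.arange(N)
    f_k = np.arange(K) * fs / N                  # Harmonics of f0 = fs / N
    phase = 2 * np.pi * np.outer(f_k, n) / fs    # omega_k * n, shape (K, N)
    a_k = np.cos(phase) @ x                      # Sum over n of x[n] * cos(omega_k n)
    b_k = np.sin(phase) @ x                      # Sum over n of x[n] * sin(omega_k n)
    c_k = np.sqrt(a_k**2 + b_k**2)               # Magnitude
    return a_k, b_k, c_k

fs_demo = 8000                                   # Assumed sample rate (Hz)
N_demo = 160                                     # 20 ms frame, so f0 = 50 Hz
n_demo = np.arange(N_demo)
x_demo = 0.5 * np.cos(2 * np.pi * 200 * n_demo / fs_demo)   # 200 Hz tone = harmonic k = 4

a_demo, b_demo, c_demo = fourier_coeffs(x_demo, fs_demo, K=16)
X_fft_demo = np.fft.fft(x_demo)[:16]

print("Strongest harmonic index:", np.argmax(c_demo))
print("a_k matches Re{X}:", np.allclose(a_demo, X_fft_demo.real))
print("b_k matches -Im{X}:", np.allclose(b_demo, -X_fft_demo.imag))
```

Because the tone sits exactly on a harmonic of $f_0$, all of its energy lands in a single coefficient. Try a frequency between harmonics (say 225 Hz) to see it spread across several.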

Write a function that takes a signal `x`, `fs`, and `K` (the number of Fourier coefficients to compute) as inputs and returns the Fourier coefficients `a_k`, `b_k`, and `c_k` as outputs.


In [None]:
def myFourierTransform(x, fs, K=16):
  a_k = np.zeros(K)   # array for cos() weights
  b_k = np.zeros(K)   # array for sin() weights
  c_k = np.zeros(K)   # array for magnitude weights: sqrt(a**2 + b**2)

  fig = plt.figure(figsize = (20, 8))
  y_lim = 0.5

  n = np.arange(len(x))
  f0 = fs / len(x)    # Fundamental analysis frequency (Hz), set by the frame length

  for k in range(K):
    f_k = f0*k 
    sin_k = np.sin(2*np.pi*f_k*n / fs)
    cos_k = np.cos(2*np.pi*f_k*n / fs)

    a_k[k] = np.sum(x*cos_k)
    b_k[k] = np.sum(x*sin_k)
    c_k[k] = np.sqrt(a_k[k]**2 + b_k[k]**2)

    # Everything else in the loop is to make nice looking plots
    plt.subplot(3,K,k+1)      # Subplots are indexed starting at 1, a la MATLAB ?!?!?
    plt.plot(n,x*cos_k-0.1,'g')
    plt.fill_between(n,x*cos_k-0.1,-0.1,facecolor='g',alpha=0.5)
    plt.ylim(-y_lim,y_lim)
    plt.axis('off')
    
    plt.subplot(3,K,K + k+1)
    plt.plot(n,x,'c',n,0.1*cos_k+0.4 ,'g',n,0.1*sin_k-0.49,'r')
    plt.ylim(-0.6,0.6)
    plt.axis('off')

    plt.subplot(3,K,2*K + k+1)
    plt.plot(n,x*sin_k+0.1,'r')
    plt.fill_between(n,x*sin_k+0.1,0.1,facecolor='r',alpha=0.5)
    plt.ylim(-y_lim,y_lim)
    plt.axis('off')

  plt.show()

  # Plot the a and b arrays
  fig1 = plt.figure(figsize=(20,4))
  plt.bar(np.arange(K)-0.15, a_k, width=0.3,color='g')
  plt.bar(np.arange(K)+0.15, b_k, width=0.3,color='r')
  plt.show()

  # Plot the c array
  fig2 = plt.figure(figsize=(20,4))
  plt.bar(np.arange(K)*f0, c_k, width=30, color='b')
  plt.show()

  # Output a, b, and c
  return a_k, b_k, c_k

## Let's use our Fourier Transform!

In [None]:
a_k, b_k, c_k = myFourierTransform(clip, fs)

Print your `c_k` output for later comparison.

In [None]:
c_k

# The (Discrete) Fourier Transform
* $X[k] = \sum_{n=0}^{N-1} x[n] e^{-j 2 \pi f_k n / f_s} = \sum_{n=0}^{N-1} x[n] e^{-j 2 \pi k n / N}$, since $f_k = k \cdot f_s / N$

Aside: Can you prove that using $\cos[2\pi f_k n / f_s] - j \sin[2\pi f_k n / f_s]$ is the same as $e^{-j2\pi f_k n / f_s}$ ?
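
A numerical sanity check is not a proof, but it's a quick way to convince yourself before working through the algebra. This short sketch compares both sides of Euler's formula at a handful of test angles (the variable names are illustrative).

```python
import numpy as np

theta = np.linspace(-2 * np.pi, 2 * np.pi, 9)   # A few test angles (radians)
lhs = np.cos(theta) - 1j * np.sin(theta)        # cos(theta) - j*sin(theta)
rhs = np.exp(-1j * theta)                       # e^{-j*theta}

print("Both sides agree:", np.allclose(lhs, rhs))
```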

In [None]:
def myDFT(x, fs, K=16):
  X_k = np.zeros(K) * 1j    # Initialize the output as an array of complex numbers
  n = np.arange(len(x))
  f0 = fs / len(x)          # Fundamental analysis frequency (Hz), set by the frame length

  for k in range(K):
    f_k = f0*k
    X_k[k] =  # Compute the Fourier output for f_k
              # Try using np.exp() instead of np.sin() and np.cos()

  return X_k

In [None]:
%%time
C1 = myDFT(clip, fs)

In [None]:
np.abs(C1)

## Dude, isn't there an *easier* way to compute the Fourier transform?
Yes, the Fast Fourier Transform (FFT) is a *much* more efficient algorithm for computing the Discrete Fourier Transform (DFT). Lots of people say 'FFT' when they actually mean 'DFT'.

In [None]:
%%time
C2 = np.fft.fft(clip)     # FFT: 'Fast Fourier Transform'

In [None]:
np.abs(C2[:16])

## What does the full DFT look like?

* Plot the real and imaginary outputs of the DFT.
* Add the magnitude of the DFT, $|X[k]|$.
* Separately, plot the magnitude in decibels (dB):
  * $20 \log_{10} |X[k]|$
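
Here's a hedged sketch of those quantities on a synthetic tone, not on `clip`, so the exercise below is still yours to do. The 8000 Hz sample rate and the `_demo` names are assumptions for illustration, and the small offset inside the log is there only to avoid `log(0)`.

```python
import numpy as np

fs_demo = 8000                                   # Assumed sample rate (Hz)
N_demo = 160                                     # Frame length, so bins are 50 Hz apart
n_demo = np.arange(N_demo)
x_demo = 0.5 * np.sin(2 * np.pi * 400 * n_demo / fs_demo)   # 400 Hz tone (bin 8)

X_demo = np.fft.fft(x_demo)
X_re = X_demo.real                               # Real part (cosine content)
X_im = X_demo.imag                               # Imaginary part (negated sine content)
X_mag = np.abs(X_demo)                           # Magnitude |X[k]|
X_dB = 20 * np.log10(X_mag + 1.0e-15)            # Magnitude in dB

f_demo = np.arange(N_demo) * fs_demo / N_demo    # Frequency of each bin (Hz)
k_peak = np.argmax(X_mag[:N_demo // 2])          # Search only below fs/2

print("Peak at", f_demo[k_peak], "Hz:", X_dB[k_peak], "dB")
```

For a real-valued signal, the bins above `fs/2` mirror the ones below it, which is why limiting the x-axis to `(0, fs/2)` loses nothing.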

In [None]:
N = len(clip)
fig0 = plt.figure(figsize=(20,4))
f = np.arange(N)*fs/N             # Frequency array, corresponding to Fourier frequencies (spaced at 50 Hz)

plt.plot(f,   # Fill in this statement

#plt.xlim(0,fs/2)   # Try uncommenting this line


When you're ready, save your DFT to the class folder.

In [None]:
fig0.savefig(path + 'DFT/' + username + '-DFT.png')

## I *need* more frequency resolution!

Our frequency values are spaced at the "fundamental" analysis frequency, which is set by the length of our frame (20 ms) and is currently 50 Hz.
* What happens if you zero-pad (add zeros to the end of the signal)?
* Try different amounts of zero-padding
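
One thing worth checking for yourself: zero-padding samples the *same* underlying spectrum more densely, rather than adding new information. In this minimal sketch (random test signal, illustrative names), padding to twice the length leaves every original DFT value in place at the even-numbered bins and fills in new values between them.

```python
import numpy as np

rng = np.random.default_rng(0)
N_demo = 64
x_demo = rng.standard_normal(N_demo)               # Arbitrary test frame
x_pad = np.append(x_demo, np.zeros(N_demo))        # Zero-pad to length 2N

X_demo = np.fft.fft(x_demo)                        # N frequency samples
X_pad = np.fft.fft(x_pad)                          # 2N frequency samples

print("Lengths:", len(X_demo), len(X_pad))
print("Even bins equal the original DFT:", np.allclose(X_pad[::2], X_demo))
```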

In [None]:
N_z = 2048    # Zero-padded length of the frame
clip_z = np.append(clip, np.zeros(N_z - N)) # Zero-padded signal
f_z = np.arange(N_z) * fs / N_z   # Frequency vector (extended to zero-padded length)

C_z =  # Fill this in: the DFT of zero-padded signal

fig1 = plt.figure(figsize=(20,4))
plt.plot(clip_z)

fig2 = plt.figure(figsize=(20,4))
plt.plot(f_z, 20*np.log10(np.abs(C_z)))
plt.xlabel('Frequency (Hz)')
plt.xlim(0,fs/2)

## Periodic extension of a zero-padded frame

In [None]:
clip_z_rep = np.tile(clip_z,15)

ipd.Audio(clip_z_rep,rate=fs)

## Analysis windows
Let's apply a window function to *taper* the edges of the analysis frame to reduce sidelobes in the frequency output. Some window functions (all built into NumPy):
* `hanning` (Hann window)
* `bartlett` (Triangle window)
* `hamming` (Raised cosine variation)
* `blackman` (High sidelobe reduction)
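
To see the sidelobe reduction in numbers, here's a rough sketch comparing no window (rectangular) against a Hann window for a tone that falls *between* DFT bins. The sample rate, tone frequency, choice of "far away" bins, and the helper `leakage_dB` are all assumptions made for this illustration.

```python
import numpy as np

def leakage_dB(x, far_bins):
    """Largest magnitude in far_bins, in dB relative to the spectral peak (illustrative helper)."""
    mag = np.abs(np.fft.fft(x))
    return 20 * np.log10(mag[far_bins].max() / mag.max())

fs_demo = 8000                                   # Assumed sample rate (Hz)
N_demo = 160                                     # Bins are 50 Hz apart
n_demo = np.arange(N_demo)
x_demo = np.sin(2 * np.pi * 425 * n_demo / fs_demo)   # 425 Hz: halfway between bins 8 and 9

far = slice(30, 80)                              # Bins from 1500 Hz up to fs/2, far from the tone
leak_rect = leakage_dB(x_demo, far)                      # No window
leak_hann = leakage_dB(x_demo * np.hanning(N_demo), far) # Hann window

print("Rectangular leakage:", round(leak_rect, 1), "dB")
print("Hann leakage:       ", round(leak_hann, 1), "dB")
```

The tradeoff is a wider main lobe: the windowed peak is spread over more bins, so two closely spaced tones are harder to tell apart.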

In [None]:
clip_w = clip[:N] * np.hanning(N)       # Actually a 'Hann' function
#clip_w = clip[:N] * np.bartlett(N)     # Fancy name for a triangle
#clip_w = clip[:N] * np.hamming(N)      # Another sinusoidal window
#clip_w = clip[:N] * np.blackman(N)     # Another window with different tradeoffs

clip_wz = clip_z.copy()   # The whole frame (with zero-padding); copy so that clip_z itself isn't modified
clip_wz[:N] = clip_w  # The windowed part of the frame

C_wz = np.fft.fft(clip_wz)

fig1 = plt.figure(figsize=(16,4))
plt.plot(clip_wz)

fig2 = plt.figure(figsize=(16,8))
#plt.plot(f, np.abs(X))
plt.plot(f_z, 20*np.log10(np.abs(C_wz)))

#plt.xlim(0,fs/2)
plt.xlim(0,10000)

In [None]:
fig2.savefig(path + 'winDFT/' + username + '-winDFT.png')

## Increase frequency resolution (again)
* `fft(..., n=N_fft)`: This automatically zero-pads to the requested length (`N_fft`), adding more frequency samples.
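
A quick way to confirm what the `n` argument does is to compare it against padding by hand. This sketch uses a random test signal and illustrative variable names.

```python
import numpy as np

rng = np.random.default_rng(1)
x_demo = rng.standard_normal(100)                  # Arbitrary 100-sample frame
N_fft_demo = 256                                   # Requested DFT length

X_auto = np.fft.fft(x_demo, n=N_fft_demo)          # Let fft() do the zero-padding
x_manual = np.append(x_demo, np.zeros(N_fft_demo - len(x_demo)))
X_manual = np.fft.fft(x_manual)                    # Zero-pad by hand, then transform

print("Output length:", len(X_auto))
print("Same result:", np.allclose(X_auto, X_manual))
```

Note that if `n` is *smaller* than the signal length, `np.fft.fft` crops the input instead of padding it.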

In [None]:
N_fft = 8192
C_w = np.fft.fft(clip_w)
C_wz = np.fft.fft(clip_w, n=N_fft)

fig = plt.figure(figsize=(16,4))
#ax = plt.axes(xlim=(-20,5020), ylim=(0, 90))

# Plot the zero-padded "high resolution" DFT (length N_fft)
plt.plot(np.arange(N_fft)*fs/N_fft, 20*np.log10(np.abs(C_wz)) )

# Plot our original DFT samples ("low res") as big orange dots
plt.plot(np.arange(N)*fs/N,20*np.log10(np.abs(C_w)),'.',markersize=15)
plt.xlim(0,5000)

#fig.savefig(path + 'Fourier Transform-windowed.png', dpi=200, transparent=True)

## DFT of a long signal

In [None]:
AHA =           # DFT (FFT) of the full 'aha' signal
N = len(aha)    # Number of samples in the 'aha' signal

fig1 = plt.figure()
n = np.arange(N)          # Sample index for signal
plt.plot(n/fs, aha)       # Plot the whole signal
plt.xlabel('Time (sec)')

fig2 = plt.figure()
plt.plot(np.arange(N)*fs/N, 20*np.log10(np.abs(AHA)))   # Plot the magnitude DFT
plt.xlim(0, fs/2)
plt.xlabel('Frequency (Hz)')

# Let's make a (DFT) movie!
The DFT of the entire music sample isn't all that helpful. We would like to know how the frequencies change over time.
* Instead, we take a series of short *frames* and plot each corresponding DFT.
* We advance the frame to overlap with the previous frame (by 50% or more), so we don't miss anything.
* We can compile each DFT frame into an animation and watch it as it evolves.
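
The framing steps above can be sketched without any plotting. This is a minimal illustration under stated assumptions, not the movie code below: the helper `frame_dfts`, the test signal, and the parameter values are made up for the example, and it uses `np.fft.rfft`, which returns only the non-negative frequencies of a real signal.

```python
import numpy as np

def frame_dfts(x, frame_size, hop, n_fft):
    """Hann-windowed DFT magnitudes (dB) of overlapping frames, one row per frame (illustrative helper)."""
    win = np.hanning(frame_size)
    n_frames = 1 + (len(x) - frame_size) // hop        # Only frames that fit entirely in x
    out = np.empty((n_frames, n_fft // 2 + 1))
    for i in range(n_frames):
        start = i * hop                                # Advance by one hop per frame
        frame = x[start:start + frame_size] * win      # Windowed analysis frame
        X = np.fft.rfft(frame, n=n_fft)                # Zero-padded DFT (0 to fs/2)
        out[i] = 20 * np.log10(np.abs(X) + 1.0e-15)    # dB, with offset to avoid log(0)
    return out

fs_demo = 8000                                         # Assumed sample rate (Hz)
t_demo = np.arange(fs_demo) / fs_demo                  # One second of time samples
# Test signal: 500 Hz for the first half second, 1500 Hz for the second half
x_demo = np.where(t_demo < 0.5,
                  np.sin(2 * np.pi * 500 * t_demo),
                  np.sin(2 * np.pi * 1500 * t_demo))

frame_size_demo, hop_demo, n_fft_demo = 160, 80, 512   # 20 ms frames, 50% overlap
S_demo = frame_dfts(x_demo, frame_size_demo, hop_demo, n_fft_demo)
f_axis_demo = np.arange(n_fft_demo // 2 + 1) * fs_demo / n_fft_demo

print("Frames x frequency bins:", S_demo.shape)
print("Peak in first frame:", f_axis_demo[np.argmax(S_demo[0])], "Hz")
print("Peak in last frame: ", f_axis_demo[np.argmax(S_demo[-1])], "Hz")
```

Each row is one "movie frame". Stacking the rows side by side as an image is exactly the idea behind the spectrogram.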


In [None]:
# Module required for animation
from matplotlib import animation
# Note: below is the part which makes it work on Colab
rc('animation', html='jshtml')

# Function to compile DFT frames into an animation
def fftMovie(input, num_frames=60, frame_rate=30, sample_rate=44100):

  # Compute and plot the DFT (called for each frame):
  def nextFrameFFT(f_num):
    win = np.hanning(frame_size)        # For now, we always use a Hanning (Hann) window    
    n_hop = int(sample_rate/frame_rate)
    n1 = int(n_hop * f_num)
    n2 = int(n_hop * f_num + frame_size)
    x_n = input[n1:n2]                  # Current frame of the input

    X_n = ...  # Fill this in: the FFT of the current frame (windowed)
    
    N = len(X_n)
    f = np.arange(N) * sample_rate / N
    X_mag = np.abs(X_n) + 1.0e-15   # Add a very small offset to avoid log(0) errors
    X_dB = 20*np.log10(X_mag)       # Freq. magnitude in dB

    fftLine.set_data(f, X_dB)    

    return fftLine,

  frame_size = int(sample_rate*0.02)    # 20 ms analysis frame (uses the argument, not the global fs)
  N_fft = 2048
  f = np.arange(N_fft) * sample_rate / N_fft

  # First set up the figure, the axis, and the plot element we want to animate
  fftFig = plt.figure(figsize=(14,6))
  ax = plt.axes(xlim=(0,10000),ylim=(-100,50))
  fftLine, = ax.plot([], [])

  fftFig.tight_layout()
  plt.close()   # Don't output the final figure separately

  frame_period_in_ms = 1000 / frame_rate

  fftAnim = animation.FuncAnimation(fftFig, nextFrameFFT, frames=num_frames, interval=frame_period_in_ms, blit=True)

  return fftAnim

In [None]:
movie_dur = 5   # Movie length in seconds
fftMov = fftMovie(aha, num_frames=30*movie_dur, sample_rate=fs)
fftMov

## For now, let's magically add audio to the animation

We won't go into the details of how this works right now.

In [None]:
def animWithSound(anim_frames, audio_data, sample_rate=44100):
  # This is just a hack to create unique filenames (based on the current timestamp)
  dt_suffix = str( int( np.datetime64('now').astype(np.timedelta64) / np.timedelta64(1, 's') ) ) # Current date/time in seconds
  anim_filename = 'temp_anim_' + dt_suffix + '.mp4'
  audio_filename = 'temp_audio_' + dt_suffix + '.wav'
  output_filename = 'temp+sound_' + dt_suffix + '.mp4'
  
  anim_frames.save(anim_filename)
  sf.write(audio_filename, audio_data, sample_rate)
  !ffmpeg -i $anim_filename -i $audio_filename -map 0 -map 1:a -c:v copy -shortest $output_filename -hide_banner -loglevel error
  return output_filename  # Return the filename of the temp output

In [None]:
fftMov_file = animWithSound(fftMov, aha[:fs*movie_dur], fs)
ipd.Video(fftMov_file, embed=True)

# A different view of the same data: *spectrogram*

In [None]:
from scipy import signal

f1, t1, Sxx = signal.spectrogram(aha, fs, window='hann', nperseg=882, noverlap=441, nfft=1024)
#f1, t1, Sxx = signal.spectrogram(aha, fs, window='bartlett', nperseg=882, noverlap=441, nfft=1024)

fig = plt.figure(figsize=(16,6))

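# Sxx is a power spectral density (the default mode), so dB is 10*log10; the small offset avoids log(0)
plt.pcolormesh(t1, f1, 10*np.log10(Sxx + 1.0e-15)) #, shading='gouraud')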
plt.ylabel('Frequency (Hz)')
plt.xlabel('Time (sec)')

#plt.xlim(0,4)
#plt.ylim(0,5000)

plt.show()
ipd.Audio(aha,rate=fs)