# Lab Sheet 6 (COM3502-4502-6502 Speech Processing)

This lab sheet is part of the lecture COM[3502](http://www.dcs.shef.ac.uk/intranet/teaching/public/modules/level3/com3502.html "Open web page for COM3502 module")-[4502](http://www.dcs.shef.ac.uk/intranet/teaching/public/modules/level4/com4502.html "Open web page for COM4502 module")-[6502](http://www.dcs.shef.ac.uk/intranet/teaching/public/modules/msc/com6502.html "Open web page for COM6502 module") Speech Processing at the [University of Sheffield](https://www.sheffield.ac.uk/ "Open web page of The University of Sheffield"), Dept. of [Computer Science](https://www.sheffield.ac.uk/dcs "Open web page of Department of Computer Science, University of Sheffield").

It is probably easiest to open this Jupyter Notebook with [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb#recent=true "Open in Google Colab") since GitHub's Viewer does not always show all details correctly. <a href="https://colab.research.google.com/github/sap-shef/SpeechProcesssingLab/blob/main/Lab-Sheets/Lab-Sheet-6.ipynb"><img align="right" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open Notebook in Google Colab" title="Open and Execute the Notebook directly in Google Colaboratory"></a>

Please put questions, comments and correction suggestions in the [Blackboard](https://vle.shef.ac.uk) discussion board or send an email to [s.goetze@sheffield.ac.uk](mailto:s.goetze@sheffield.ac.uk).

In [None]:
# Let's do the usual necessary and nice-to-have imports
%matplotlib inline
import matplotlib.pyplot as plt  # plotting
import seaborn as sns; sns.set() # styling ((un-)comment if you want)
import numpy as np               # math

## The Chirp Signal, a.k.a. Sweep Signal

The function 
\begin{equation}
x_{\mathrm{chirp}}(t)=\mathrm{sin}\left(\pi t^2\right) \tag{1}
\end{equation}
defines the so-called **linear-frequency [chirp](https://en.wikipedia.org/wiki/Chirp "Click here for additional information on the Chirp signal on Wikipedia")** or simply linear chirp. It is a signal that sweeps through different frequencies over time.

The instantaneous frequency $f(t)$ of this chirp signal increases exactly linearly with time, i.e. $f(t)=f_0+ a t$ with a constant $a=\frac{f_1-f_0}{t_1-t_0}$. More precisely, the instantaneous frequency $f(t)$ of the chirp signal at time $t$ is the derivative of the sinusoid's argument divided by $2\pi$. For Eq. (1) the argument is $\pi t^2$ with derivative $2\pi t$, thus $f(t) = t$ (i.e. $f_0=0$ and $a=1$), which corresponds to the instantaneous angular frequency $\omega(t) = 2\pi t$.
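
The relation between the phase and the instantaneous frequency can be checked numerically. The following is a minimal sketch, not part of the lab template: the helper name `instantaneous_frequency` and the values of `fs_demo` and `t_demo` are assumptions chosen for illustration. It differentiates the argument $\pi t^2$ of Eq. (1) numerically and divides the result by $2\pi$.

```python
import numpy as np

def instantaneous_frequency(phase, t):
    """Numerical derivative of the phase (in rad) divided by 2*pi, i.e. a frequency in Hz."""
    return np.gradient(phase, t) / (2 * np.pi)

fs_demo = 1000                          # sampling frequency in Hz (assumed value)
t_demo = np.arange(0, 2, 1 / fs_demo)   # two seconds of time samples
phase = np.pi * t_demo**2               # argument of the sine in Eq. (1)
f_inst = instantaneous_frequency(phase, t_demo)

# away from the two edge samples the estimate follows f(t) = t
print(f_inst[500], t_demo[500])
```

The first and last sample are excluded from the comparison because `np.gradient` uses one-sided differences there.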



<br>
<a id='task_1'></a>
<div style="border: 2px solid #999; padding: 10px; background: #eee;">
    
**Task 1: Create and analyse a Chirp Signal**
    
<ul>
<li> Implement a function <code>generate_chirp</code> that outputs a sampled chirp signal <code>x</code> for the time interval $[t_0,t_1]$ in seconds with $0\leq t_0<t_1$.
</li>    
<li>Compute the DFT of <code>x</code> for various input parameters $t_0$ and $t_1$. Plot the chirp signal as well as the resulting magnitude Fourier transform.
</li>
</ul>
</div>

In [None]:
fs=8000         # sampling frequency in Hz
#t0=???         # start time t_0 in seconds
#t1=???         # end time t_1 in seconds
#length=t1-t0   # signal length in seconds (uncomment once t0 and t1 are set)

# create chirp signal
# ...

# plot chirp signal
# ...

# plot spectrum of chirp signal
# ...

### Spectrogram vs. Spectrum of Chirp Signal

In [None]:
import scipy.signal as sig

fs=8000   # sampling frequency in Hz
length=10 # signal length in seconds

t = np.linspace(0, length, length*fs, endpoint=False) # time vector with sample spacing 1/fs

f0 = 10         # frequency in Hz at time t_0=0
f1 = 6          # frequency in Hz at time t_1
t1 = 0.5*length # time t_1 in seconds at which f1 is reached
chp1 = sig.chirp(t, f0=f0, f1=f1, t1=t1, method='linear')

# plot time domain chirp
plt.figure(figsize=(12,12))
plt.subplot(2,2,1)
plt.plot(t, chp1)
plt.title('Linear Chirp, $f(t=0)='+str(f0)+'$ Hz, $f(t='+str(t1)+')='+str(f1)+'$ Hz')
plt.xlabel('$t$ in sec')

# plot spectrogram of chirp
plt.subplot(2,2,2)
L = 8192        # DFT length (we need a relatively large number here for high frequency resolution)
overlap = 4096  # also large overlap (half a segment) to get some time resolution between segments
plt.specgram(chp1,NFFT=L, Fs=fs, noverlap=overlap)
plt.colorbar()  # add a colorbar to the spectrogram
plt.ylim(0,50)
plt.grid(False)
plt.title('Spectrogram of Linear Chirp, $f(t=0)='+str(f0)+'$ Hz, $f(t='+str(t1)+')='+str(f1)+'$ Hz')
plt.ylabel('$f$ in Hz')
plt.xlabel('$t$ in sec')

f0 = 100    # frequency in Hz at time t_0=0
f1 = 3500   # frequency in Hz at time t_1
t1 = length # time t_1 in seconds at which f1 is reached
chp2 = sig.chirp(t, f0=f0, f1=f1, t1=t1, method='logarithmic')

# plot the first samples of the time domain chirp
plt.subplot(2,2,3)
plt.plot(t[1:160], chp2[1:160])
plt.title('Logarithmic Chirp, $f(t=0)='+str(f0)+'$ Hz, $f(t='+str(t1)+')='+str(f1)+'$ Hz')
plt.xlabel('$t$ in sec')

# plot spectrogram of chirp
plt.subplot(2,2,4)
L = 1024        # DFT length (shorter than above to obtain a better time resolution)
overlap = 256   # overlap between consecutive segments in samples
plt.specgram(chp2, NFFT=L, Fs=fs, noverlap=overlap)
plt.colorbar()  # add a colorbar to the spectrogram
plt.ylim(0,fs/2) # this chirp reaches 3500 Hz, so show everything up to the Nyquist frequency
plt.grid(False)
plt.ylabel('$f$ in Hz')
plt.xlabel('$t$ in sec')
None # to suppress last output

In [None]:
## create 5 second chirp signal with the following parameters.

# chirp between 10 and 30 Hz
sf = 1000 # sampling frequency
dt = 1/sf
time = np.arange(0,5,dt)
Nyquist = sf/2 

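# Define a chirp manually (as in a Udemy course)
f = (10,30) # start and end frequencies in Hz
# ff only sweeps up to the mean frequency: the instantaneous frequency of
# sin(2*pi*ff*t) is d(ff*t)/dt, which then runs from f[0] to f[1]
ff = np.linspace(f[0], np.mean(f), time.size)
signal1 = np.sin(2*np.pi*ff*time)

# chirp between 2 and 20 Hz (scipy expects the end time as t1)
from scipy.signal import chirp
signal2 = chirp(time, f0=2, f1=20, t1=5)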

# Fourier Transform of the first signal
fsignal1 = np.fft.fft(signal1)/signal1.size
Nsamples = int(np.floor(signal1.size/2))
hz = np.linspace(0, Nyquist, Nsamples +1)
amp1 = 2*np.abs(fsignal1)

# Fourier Transform of the second signal
fsignal2 = np.fft.fft(signal2)/signal2.size
amp2 = 2*np.abs(fsignal2)

fig, ax = plt.subplots(2,2, figsize = (16,4))

ax[0,0].plot(time, signal1, lw = 1, color='C0')
ax[0,0].set_ylim(-1.5,1.5)
ax[0,0].set_xlabel('Time (sec)')
ax[0,0].set_title('Chirp 10 to 30 Hz')

ax[0,1].plot(time, signal2, lw =1, color='C1')
ax[0,1].set_ylim(-1.5,1.5)
ax[0,1].set_xlabel('Time (sec)')
ax[0,1].set_title('Chirp 2 to 20 Hz')
    
ax[1,0].plot(hz, amp1[:hz.size], color = 'C0', lw=1)
ax[1,0].set_xlim(xmin = 0, xmax =50)
ax[1,0].set_xlabel('Frequency (Hz)')

ax[1,1].plot(hz, amp2[:hz.size], color = 'C1', lw=1)
ax[1,1].set_xlim(xmin = 0, xmax =50)
ax[1,1].set_xlabel('Frequency (Hz)');

## Short-Time Fourier Transform

The discrete Fourier transform (DFT) is not very well suited for the analysis of non-stationary signals (such as a speech or music signal) since it is applied to the entire signal at once. The resulting spectrum is therefore an average over the whole signal duration and does not show when a particular frequency component occurs, which might not be very insightful.
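
This loss of temporal information can be illustrated with a short sketch. It is an illustrative assumption and not part of the lab template (the names `up` and `down` and all parameter values are made up): a rising chirp and the same chirp played backwards are clearly different signals, yet the magnitudes of their DFTs coincide.

```python
import numpy as np

fs_demo = 1000                          # sampling frequency in Hz (assumed value)
t_demo = np.arange(0, 2, 1 / fs_demo)   # two seconds of time samples
up = np.sin(np.pi * 100 * t_demo**2)    # linear chirp rising from 0 Hz to 200 Hz
down = up[::-1]                         # the same chirp played backwards (falling)

mag_up = np.abs(np.fft.rfft(up))        # magnitude spectrum of the rising chirp
mag_down = np.abs(np.fft.rfft(down))    # magnitude spectrum of the falling chirp

# first flag: are the time signals equal? second flag: are the magnitude spectra equal?
print(np.allclose(up, down), np.allclose(mag_up, mag_down))
```

For a real-valued signal, time reversal only changes the phase of the DFT, so a single magnitude spectrum cannot distinguish a rising from a falling sweep.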

To gain more insight in changes of the spectrum over time we split a long signal into segments and compute the DFT for each of these segments. This is known as the [short-time Fourier transform](https://en.wikipedia.org/wiki/Short-time_Fourier_transform) (STFT).

The STFT $x[n,\ell]$ of a time domain signal $x[k]$ is defined as

\begin{equation}
x[n, \ell] = \sum_{k = \ell L_{\mathrm{hop}}}^{\ell L_{\mathrm{hop}}+L_{\mathrm{DFT}}-1} x[k] \, w[k-\ell L_{\mathrm{hop}}] \;  \mathrm{e}^{\,-\mathrm{j}\,\frac{2 \pi}{L_{\mathrm{DFT}}} k n},  \quad 0 \leq \ell \leq ???
\end{equation}

where $w[k]$ denotes a window function of length $L_w$ (often of the same length as the DFT, i.e. $L_w=L_{\mathrm{DFT}}$) which is normalized such that $\sum_{k=0}^{L_w-1} w[k] = 1$. Starting from $k=\ell L_{\mathrm{hop}}$, the signal $x[k]$ is windowed by $w[k]$ to a segment of length $L_w$. This windowed segment is then transformed by a DFT of length $L_{\mathrm{DFT}}$.
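
The segmentation described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `stft_frames`, the Hann window, the test tone and the convention of keeping only segments that fit completely into the signal are all choices made for this example. Note also that `np.fft.fft` indexes each segment from zero, so the result differs from the definition above by a linear phase factor, which does not affect the magnitude.

```python
import numpy as np

def stft_frames(x, L_dft, L_hop):
    """Windowed DFT of overlapping segments, returned with shape (frames, L_dft)."""
    w = np.hanning(L_dft)                      # Hann window (assumed choice), L_w = L_dft
    w = w / np.sum(w)                          # normalise so that the window sums to 1
    n_frames = 1 + (len(x) - L_dft) // L_hop   # only segments that fit completely (assumption)
    X = np.empty((n_frames, L_dft), dtype=complex)
    for l in range(n_frames):
        segment = x[l * L_hop : l * L_hop + L_dft]   # segment starts at k = l * L_hop
        X[l] = np.fft.fft(segment * w)               # DFT of the windowed segment
    return X

fs_demo = 8000                                 # sampling frequency in Hz (assumed value)
t_demo = np.arange(fs_demo) / fs_demo          # one second of time samples
x_demo = np.sin(2 * np.pi * 1000 * t_demo)     # 1 kHz test tone
X_demo = stft_frames(x_demo, L_dft=512, L_hop=256)
print(X_demo.shape)
```

For a stationary tone such as `x_demo`, every frame shows its maximum at the same frequency index; for a chirp the maximum would move from frame to frame.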

The STFT can be used for spectral analysis of signals or processing of non-stationary signals. The resulting spectrum $x[n,\ell]$ depends on the frequency index $n$ and the block index $\ell$ which is directly related to the time index $k$ by $k=\ell L_{\mathrm{hop}}$. The spectral domain is therefore also termed the [time-frequency domain](https://en.wikipedia.org/wiki/Time%E2%80%93frequency_representation) and techniques using the STFT are referred to as time-frequency processing.

The properties of the STFT depend on

* the DFT length $L_{\mathrm{DFT}}$ of the segments,
* the hop size $L_{\mathrm{hop}}$, which determines the overlap between the segments, and
* the window function $w[k]$.

The DFT length $L_{\mathrm{DFT}}$ of the segments and the window function influence the spectral and temporal resolution of the STFT. The block index $\ell$ advances the analysis position by the hop size $L_{\mathrm{hop}}$, which can be chosen freely and determines the overlap between two consecutive segments. For instance, for $L_w=L_{\mathrm{DFT}}$ the segments underlying the spectra $x[n, \ell]$ and $x[n, \ell+1]$ have $L_{\mathrm{DFT}}-L_{\mathrm{hop}}$ overlapping samples. The overlap is sometimes given as a percentage of the segment length $L_{\mathrm{DFT}}$.
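
To get a feeling for these parameters, the following sketch computes a few resolution figures. The helper `stft_resolution` and the chosen parameter values are hypothetical and only serve as an illustration. It assumes a window length $L_w=L_{\mathrm{DFT}}$, so that consecutive segments share $L_{\mathrm{DFT}}-L_{\mathrm{hop}}$ samples.

```python
def stft_resolution(fs, L_dft, L_hop):
    """Basic time-frequency resolution figures of an STFT (assumes L_w = L_dft)."""
    return {
        "delta_f_hz": fs / L_dft,                 # spacing of the DFT frequency bins
        "segment_s": L_dft / fs,                  # duration of one segment
        "hop_s": L_hop / fs,                      # time step between consecutive segments
        "overlap_samples": L_dft - L_hop,         # samples shared by neighbouring segments
        "overlap_percent": 100 * (L_dft - L_hop) / L_dft,
    }

# a longer DFT gives finer frequency bins but longer (less time-specific) segments
for L_dft in (256, 1024, 8192):
    print(L_dft, stft_resolution(fs=8000, L_dft=L_dft, L_hop=L_dft // 4))
```

The loop makes the trade-off visible: increasing $L_{\mathrm{DFT}}$ reduces the bin spacing in Hz, but each spectrum then summarises a longer stretch of the signal.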

## The Spectrogram

The magnitude $|x[n, \ell]|$ of the STFT is known as the [spectrogram](https://en.wikipedia.org/wiki/Spectrogram) of a signal. It is frequently used to analyze signals in the time-frequency domain, for instance by a [spectrum analyzer](https://en.wikipedia.org/wiki/Spectrum_analyzer).
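
As a sketch of how such a spectrogram can be obtained without plotting, the example below uses `scipy.signal.stft` on a linear chirp and picks the strongest frequency bin in every frame. The variable names, the Hann window and the segment and overlap lengths are assumptions made for this illustration. Note that `scipy.signal.stft` applies its own scaling and zero-pads the signal at both ends by default, so frames at the edges are less reliable.

```python
import numpy as np
import scipy.signal as sig

fs_demo = 8000                                  # sampling frequency in Hz (assumed value)
t_demo = np.arange(2 * fs_demo) / fs_demo       # two seconds of time samples
x_demo = sig.chirp(t_demo, f0=100, f1=3500, t1=2, method='linear')

# STFT with 512-sample Hann-windowed segments and 75 % overlap
f_bins, t_frames, Zxx = sig.stft(x_demo, fs=fs_demo, window='hann',
                                 nperseg=512, noverlap=384)
S = np.abs(Zxx)                                 # magnitude spectrogram (frequency bins x frames)

# frequency of the strongest bin in each frame
f_peak = f_bins[np.argmax(S, axis=0)]
print(S.shape, f_peak[2], f_peak[-3])
```

For this chirp the peak frequency should rise roughly linearly from frame to frame, following $f(t) = 100 + 1700\,t$ in Hz, which is exactly the information a single DFT of the whole signal hides.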