In [1]:
%matplotlib inline

# 592B, Class 7.1 (10/12). Spectral domain methods and representations: the cepstrum 

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile 
import scipy.signal as signal
from scipy import fftpack

from ipywidgets import interactive
from IPython.display import Audio, display

## The cepstrum

For this section, we will be using [python_speech_features](https://github.com/jameslyons/python_speech_features) by [James Lyons](https://maxwell.ict.griffith.edu.au/spl/staff/j_lyons/home.htm). Install it using:

```
pip install python_speech_features
```
We can examine the code at the [github repository](https://github.com/jameslyons/python_speech_features) and via the [documentation](https://python-speech-features.readthedocs.io/en/latest/).

James Lyons also has some nice write-ups on the [cepstrum](http://www.practicalcryptography.com/miscellaneous/machine-learning/tutorial-cepstrum-and-lpccs/) and [MFCCs](http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/).

In [50]:
import python_speech_features as sf

We'll also be working with the Hmong `mu` sound clip, `hmong_mu.wav`:

In [54]:
(rate,sig) = wavfile.read("hmong_mu.wav")
display(Audio(data=sig, rate=rate))

We'll now work through the MFCC tutorial alongside the [code for `python_speech_features.base.mfcc`](https://python-speech-features.readthedocs.io/en/latest/). This function calls several helper functions that we need to understand in order to follow the computation. The basic steps are described in the tutorial [here](http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/). Let's work through them one at a time.

### Step 1. Frame the signal into short frames.

For this step, we'll need to understand `python_speech_features.sigproc.framesig`.
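Before reading the library source, it may help to see framing written out with plain NumPy. The following is a minimal sketch, not the library's implementation: the helper name `frame_signal`, the synthetic 440 Hz tone, and the 25 ms / 10 ms frame length and step are all assumptions chosen for illustration. It splits a signal into overlapping frames and zero-pads the tail so that the last frame is complete.

```python
import numpy as np

def frame_signal(x, frame_len, frame_step):
    """Hypothetical helper: split a 1-D signal into overlapping frames.

    The end of the signal is zero-padded so the last frame is full length.
    """
    n = len(x)
    if n <= frame_len:
        num_frames = 1
    else:
        num_frames = 1 + int(np.ceil((n - frame_len) / frame_step))
    pad_len = (num_frames - 1) * frame_step + frame_len
    padded = np.concatenate([x, np.zeros(pad_len - n)])
    # Row i of idx holds the sample indices belonging to frame i.
    idx = (np.arange(frame_len)[None, :]
           + frame_step * np.arange(num_frames)[:, None])
    return padded[idx]

# Synthetic stand-in for a speech signal: 1 s of a 440 Hz tone at 16 kHz.
rate = 16000
t = np.arange(rate) / rate
x = np.sin(2 * np.pi * 440 * t)

frame_len = int(round(0.025 * rate))   # 25 ms -> 400 samples
frame_step = int(round(0.010 * rate))  # 10 ms -> 160 samples
frames = frame_signal(x, frame_len, frame_step)
print(frames.shape)
```

Each row of `frames` is one frame, and consecutive rows overlap by `frame_len - frame_step` samples.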

### Step 2. For each frame calculate the periodogram estimate of the power spectrum.

For this step, we'll need to understand `python_speech_features.sigproc.powspec`.
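As a rough illustration of what a periodogram estimate looks like in code, here is a minimal NumPy sketch. The function name `power_spectrum`, the FFT size of 512, and the synthetic 1000 Hz frame are assumptions for this example. The idea is to take the real FFT of each frame, square its magnitude, and divide by the FFT length.

```python
import numpy as np

def power_spectrum(frames, nfft):
    """Hypothetical helper: periodogram estimate for each row of `frames`."""
    # rfft keeps only the non-negative frequencies: nfft // 2 + 1 bins.
    spectrum = np.fft.rfft(frames, n=nfft, axis=1)
    return (np.abs(spectrum) ** 2) / nfft

rate = 16000
nfft = 512

# Three identical 400-sample frames containing a 1000 Hz sinusoid.
n = np.arange(400)
frame = np.sin(2 * np.pi * 1000 * n / rate)
frames = np.tile(frame, (3, 1))

pspec = power_spectrum(frames, nfft)
freqs = np.fft.rfftfreq(nfft, d=1 / rate)   # bin centre frequencies in Hz
peak = freqs[np.argmax(pspec[0])]
print(pspec.shape, peak)
```

Because the frame (400 samples) is shorter than `nfft`, `rfft` zero-pads it to 512 samples before transforming.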

### Step 3. Apply the mel filterbank to the power spectra, sum the energy in each filter.

For this step, we'll need to understand `python_speech_features.base.fbank`.
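To make the filterbank step concrete, here is a sketch of triangular mel filters built with NumPy. It assumes the common mel formula $m = 2595 \log_{10}(1 + f/700)$; the helper names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`), the choice of 26 filters, and the random stand-in power spectra are all assumptions made for illustration rather than a copy of the library code.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(nfilt, nfft, rate, lowfreq=0.0, highfreq=None):
    """Hypothetical helper: (nfilt, nfft // 2 + 1) matrix of triangular filters."""
    highfreq = highfreq or rate / 2
    # nfilt + 2 points equally spaced in mel, converted back to FFT bin indices.
    mel_points = np.linspace(hz_to_mel(lowfreq), hz_to_mel(highfreq), nfilt + 2)
    bins = np.floor((nfft + 1) * mel_to_hz(mel_points) / rate).astype(int)
    fb = np.zeros((nfilt, nfft // 2 + 1))
    for j in range(nfilt):
        left, centre, right = bins[j], bins[j + 1], bins[j + 2]
        for i in range(left, centre):      # rising edge
            fb[j, i] = (i - left) / (centre - left)
        for i in range(centre, right):     # falling edge, equal to 1 at the centre
            fb[j, i] = (right - i) / (right - centre)
    return fb

rate, nfft, nfilt = 16000, 512, 26
fb = mel_filterbank(nfilt, nfft, rate)

# Stand-in power spectra: 5 frames of non-negative random values.
rng = np.random.default_rng(0)
pspec = rng.random((5, nfft // 2 + 1))

# Summing the energy in each filter is a matrix product.
energies = pspec @ fb.T
print(fb.shape, energies.shape)
```

Each row of `fb` is one triangular filter, so `energies[i, j]` is the weighted sum of frame `i`'s power spectrum under filter `j`.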

### Step 4. Take the logarithm of all filterbank energies.

For this step, we'll need to understand `python_speech_features.base.logfbank`.
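The log step is a single call, but one practical detail is worth seeing: a filter with zero energy would give `log(0) = -inf`. The sketch below uses made-up energies and shows one way to guard against that, by replacing exact zeros with machine epsilon before taking the log. This is an illustrative choice and may differ from what the library does.

```python
import numpy as np

# Made-up filterbank energies spanning six orders of magnitude, plus a zero.
energies = np.array([[1e-3, 1.0, 1e3, 0.0]])

# Replace exact zeros with machine epsilon so the log stays finite.
safe = np.where(energies == 0, np.finfo(float).eps, energies)
log_energies = np.log(safe)
print(log_energies)
```

The log compresses the large dynamic range of the energies, so values that differ by a factor of a million end up only about 14 units apart.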

### Step 5. Take the DCT of the log filterbank energies.

For this step, we'll need to take a look at `scipy.fftpack.dct` (available through the `fftpack` module we imported above).
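To see what the DCT computes, the sketch below applies an orthonormal type-II DCT along the filter axis and checks it against the formula written out by hand. The random stand-in for the log filterbank energies and the choice of `type=2` with `norm='ortho'` are assumptions made for illustration.

```python
import numpy as np
from scipy import fftpack

# Stand-in log filterbank energies: 5 frames x 26 filters.
rng = np.random.default_rng(0)
log_energies = rng.normal(size=(5, 26))

# Orthonormal type-II DCT along the filter axis.
ceps = fftpack.dct(log_energies, type=2, axis=1, norm='ortho')

# The same transform by hand:
#   c[k] = s[k] * sum_n x[n] * cos(pi * k * (2n + 1) / (2N))
# with s[0] = sqrt(1/N) and s[k] = sqrt(2/N) for k > 0.
N = log_energies.shape[1]
n = np.arange(N)
k = n[:, None]
basis = np.cos(np.pi * k * (2 * n + 1) / (2 * N))   # row k is the k-th cosine
scale = np.full(N, np.sqrt(2.0 / N))
scale[0] = np.sqrt(1.0 / N)
manual = (log_energies @ basis.T) * scale

print(np.allclose(ceps, manual))
```

Low-index coefficients capture the smooth overall shape of the log spectrum, while high-index coefficients capture rapid variation across filters.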

### Step 6. Keep DCT coefficients 2-13, discard the rest.

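For this step, we'll need to look at how `python_speech_features.base.mfcc` truncates the DCT output to its first `numcep` coefficients, and to understand `python_speech_features.base.lifter`, which then reweights the coefficients that were kept.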
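The truncation and liftering can be sketched in a few lines. The example below assumes a sinusoidal lifter of the form $1 + \frac{L}{2}\sin(\pi n / L)$ with $L = 22$, and keeps the first 13 columns of the DCT output. The function name `lifter_sketch` and the all-ones input are hypothetical, chosen so that the output shows the lifter weights directly.

```python
import numpy as np

def lifter_sketch(ceps, L=22):
    """Hypothetical helper: apply a sinusoidal lifter to each row of `ceps`."""
    if L <= 0:
        return ceps                      # no liftering
    n = np.arange(ceps.shape[1])
    lift = 1 + (L / 2.0) * np.sin(np.pi * n / L)
    return lift * ceps

# All-ones "cepstra" so the result equals the lifter weights themselves.
ceps = np.ones((4, 26))

numcep = 13
kept = ceps[:, :numcep]                  # keep the first 13 columns, drop the rest
liftered = lifter_sketch(kept)
print(liftered[0])
```

The weights grow from 1 at index 0 to 12 at index 11, so the lifter boosts the higher-order coefficients that would otherwise be small in magnitude.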