## Cepstral Liftering
+ ###### Author: Dirk Van Compernolle   
+ ###### Modification History:   03/03/2023
+ ###### Requires:  pyspch>=0.7


#### Cepstral Features

A cepstrogram is similar in many ways to a spectrogram. The only difference is that each 'column' in the display is a cepstral slice rather than a spectral slice.

The cepstrum is obtained by taking the inverse DFT of the log spectral magnitude. The axis along which a cepstrum is displayed therefore has the units of time. To avoid confusion, and to stress that the cepstrum is derived from the spectrum, we call this axis *quefrency*.
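As a minimal sketch of this definition in plain NumPy (not the pyspch implementation; the helper name `real_cepstrum`, the 512-point FFT and the random test frame are assumptions chosen for illustration), the cepstrum of a single frame could be computed like this:

```python
import numpy as np

def real_cepstrum(frame, n_fft=512):
    """Real cepstrum of one frame: inverse DFT of the log magnitude spectrum."""
    spectrum = np.fft.rfft(frame * np.hamming(len(frame)), n=n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-10)   # small floor avoids log(0)
    return np.fft.irfft(log_mag, n=n_fft)        # index n is a lag of n samples

sample_rate = 8000
rng = np.random.default_rng(0)
frame = rng.standard_normal(240)                 # 30 ms of noise as a stand-in for speech
cep = real_cepstrum(frame)
quefrency = np.arange(cep.size) / sample_rate    # quefrency axis in seconds
```

Because the log magnitude spectrum is real, the resulting cepstrum is real and symmetric, so only its first half carries independent information.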

One of the great advantages of using the cepstrum is that we can represent most of the content with a small number of cepstral
coefficients obtained by truncating the cepstrum, i.e. we only keep the lowest L coefficients of the cepstrum computed from the N-dim spectral magnitude vector,
with L << N. Transforming this truncated cepstrum back to the spectral domain yields a smoothed spectrum.
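Why truncation amounts to smoothing can be written out explicitly. The note below assumes the usual DFT-based real cepstrum; the symbols $S[k]$, $c[n]$ and $\hat S[k]$ are introduced here for illustration only, and $N$ denotes the DFT length. With

$$
c[n] = \frac{1}{N}\sum_{k=0}^{N-1} \log|S[k]| \, e^{j 2\pi k n / N},
$$

the cepstrum is real and symmetric, $c[n] = c[N-n]$. Keeping only the coefficients $c[0], \dots, c[L-1]$ (and their mirror images) and transforming back gives

$$
\log|\hat S[k]| = c[0] + 2\sum_{n=1}^{L-1} c[n] \cos\left(\frac{2\pi k n}{N}\right), \qquad L \le N/2 .
$$

This is a sum of only $L$ slowly varying cosines along the frequency index $k$, so rapid variations across frequency, such as pitch harmonics, cannot be represented and are smoothed away.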



#### Cepstral Liftering

The effect of cepstral truncation is well illustrated in this cepstral liftering demo.

The low order cepstral coefficients relate to the spectral envelope, while we often observe a prominent peak in the slice at the quefrency corresponding to the signal's pitch period. In the cepstrogram, the series of these pitch peaks appears as a smooth contour. Furthermore, there seems to be little or no information in the higher order quefrencies except the pitch. This observation suggests an interesting operation. We can divide the cepstrum into two parts, the low quefrency components and the high quefrency components, and reconstruct a spectrogram from either part by applying a DFT to the selected coefficients.
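The pitch peak can be reproduced on a synthetic frame. The sketch below is an illustration in plain NumPy, not part of pyspch: the impulse-train excitation, the single 500 Hz resonance and the search range of 20 to 200 samples are all assumptions chosen to keep the example small.

```python
import numpy as np

sample_rate = 8000
period = 80                        # pitch period in samples, i.e. 100 Hz
n_fft = 1024

# Synthetic voiced frame: an impulse train passed through one decaying resonance
excitation = np.zeros(512)
excitation[::period] = 1.0
n = np.arange(100)
resonance = 0.95 ** n * np.cos(2 * np.pi * 500 * n / sample_rate)
frame = np.convolve(excitation, resonance)[:512] * np.hamming(512)

# Cepstrum: inverse DFT of the log magnitude spectrum
log_mag = np.log(np.abs(np.fft.rfft(frame, n=n_fft)) + 1e-10)
cep = np.fft.irfft(log_mag, n=n_fft)

# Look for the largest value above the envelope region (here 2.5 ms to 25 ms)
lo, hi = 20, 200
peak = lo + int(np.argmax(cep[lo:hi]))
print(f"peak at {peak} samples, f0 of about {sample_rate / peak:.0f} Hz")
```

The position of the peak, expressed in samples, is the pitch period; dividing the sample rate by it gives a pitch estimate.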

The retained low order quefrency components are often called the 'liftered' cepstrum (the cepstral equivalent of low-pass filtering); after reconstruction they result in a smooth spectrogram or *envelope spectrogram*. The high order components constitute the cepstral residue and may be reconstructed to a pitch spectrogram or cepstral *residue spectrogram*. You can experiment with the two resulting spectrograms by interactively liftering the cepstrogram in different places.
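The split into envelope and residue can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the code behind `Sps.cep_lifter`: the function name `lifter_split` is hypothetical, and the toy log spectrum is built from two cosines so that the result can be checked exactly.

```python
import numpy as np

def lifter_split(log_spec, n_lifter):
    """Split a log magnitude spectrum (rfft bins) into envelope and residue."""
    n_fft = 2 * (len(log_spec) - 1)
    cep = np.fft.irfft(log_spec, n=n_fft)
    # The real cepstrum is symmetric: index n and n_fft - n share one quefrency
    n = np.arange(n_fft)
    keep = np.minimum(n, n_fft - n) < n_lifter
    low = np.where(keep, cep, 0.0)        # liftered (low quefrency) cepstrum
    high = cep - low                      # cepstral residue
    envelope = np.fft.rfft(low).real      # back to the log spectral domain
    residue = np.fft.rfft(high).real
    return envelope, residue

# Toy log spectrum: slow envelope (quefrency 3) plus harmonic ripple (quefrency 80)
n_fft = 512
k = np.arange(n_fft // 2 + 1)
slow = 2.0 * np.cos(2 * np.pi * 3 * k / n_fft)
ripple = 0.5 * np.cos(2 * np.pi * 80 * k / n_fft)
envelope, residue = lifter_split(slow + ripple, n_lifter=30)
```

In this toy case a liftering point of 30 separates the two components exactly, and envelope plus residue always add up to the original log spectrum. Real speech is less tidy, which is what the demo lets you explore.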


#### The Demo

In the GUI below, you see plots of waveform, spectrogram, cepstrogram, envelope spectrogram and residue spectrogram.

You can select the liftering point with a slider. Move the slider to observe more or less smoothing by the liftering operation.


#### Things to explore

- what happens if the liftering point is moved close to the minimum?
- what happens if the liftering point is moved close to the maximum?
- which representation is best suited as a feature representation for a speech recognizer?
- what we call here the "residue spectrogram" is sometimes called the "pitch spectrogram"; explain why.

## Mel Cepstral Coefficients

Cepstral liftering can also be applied starting from the mel spectrum. For the low resolution mel spectrum the effect is somewhat less obvious. For the high resolution mel spectrum, the pitch smoothing effect of truncation in the cepstrum becomes obvious again.
In the demo below, just check the box labeled 'mel'.

- can you get a spectral envelope of the high resolution mel spectrum by using cepstral truncation?
- can you get a clean pitch spectrogram from the high mel cepstral coefficients?
- check the female voice 'coding/f2.wav'. (Mel) cepstral smoothing is not straightforward here: in many situations you may observe some pitch leakage into the spectral envelope. Please explain why things are easier with 'misc/friendly', an average male voice.


In [1]:
#!pip install git+https://github.com/compi1234/pyspch.git
try:
    import pyspch
except ModuleNotFoundError:
    # pyspch is not installed: show how to install it, then re-raise the error
    print(
        """
        To enable this notebook on platforms such as Google Colab,
        install the pyspch package and its dependencies by running the following code:

        !pip install git+https://github.com/compi1234/pyspch.git
        """
    )
    raise

In [2]:
%matplotlib inline
import os,sys, math, copy
import numpy as np
import librosa

import pyspch.sp as Sps
import pyspch.core as Spch
import pyspch.display as Spd

import matplotlib.pyplot as plt
import matplotlib as mpl
import ipywidgets as widgets
from ipywidgets import interact, interactive, Layout, HBox, VBox
from IPython.display import Audio, display

mpl.rcParams['figure.figsize'] = [12.0, 8.0]
mpl.rcParams['font.size'] = 12
mpl.rcParams['legend.fontsize'] = 'large'
mpl.rcParams['figure.titlesize'] = 'large'

### Load a Waveform
- Choose between male (e.g. misc/friendly) and female (e.g. coding/f2) samples
- The next cell already computes a default spectrogram
- Come back to this cell if you want to change the loaded waveform

In [3]:
dir = 'https://homes.esat.kuleuven.be/~spchlab/data/'
# female examples -----------
#name = 'misc/bad_bead_booed'
#name = 'misc/expansionist'
name = 'coding/f2'
# male examples ----------
#name ='misc/b_8k'
#name = 'misc/friendly'

# plain string concatenation: this is a URL, and the base already ends with '/'
wavfname = dir + name + ".wav"
wavdata, sr = Spch.load(wavfname)
# reload and downsample data with sample rates above 8 kHz
if sr > 8000:
    wavdata, sr = Spch.load(wavfname,sample_rate=8000)
wavlength = wavdata.size/float(sr)
shift = 0.01
spgdata = Sps.spectrogram(wavdata,sample_rate=sr,f_shift=shift)

### Define all the GUI Elements

In [4]:
# assumes a wavefile preloaded in *wavdata* and spectrogram precomputed in *spgdata*
#
def cepstral_lifter_plot(lifter=12,mel_flag=False):
    
    #n_mels = 20 if sr == 8000 else 24
    n_mels = 80

    #spg = Sps.spectrogram(wavdata,sample_rate=sr,f_shift=shift)
    if mel_flag:
        cep = Sps.melcepstrum(S=spgdata,n_cep=n_mels,n_mels=n_mels)
    else:
        cep = Sps.cepstrum(S=spgdata)
    spg_env, spg_res =  Sps.cep_lifter(cep,n_lifter=lifter,n_spec=cep.shape[0])
    # we plot the mean normalized cepstrum for visual clarity
    cep_n = Sps.mean_norm(cep[1:,:])
    #
    fig = Spd.PlotSpgFtrs(wavdata=wavdata,spgdata=spgdata,img_ftrs=[cep_n,spg_env,spg_res],row_heights=[1,1,1,1,1],sample_rate=sr,figsize=(14,8))
    fig.axes[1].set_ylabel("")
    fig.axes[1].set_title("SPECTROGRAM")
    fig.axes[2].set_title("CEPSTROGRAM")
    xlim = fig.axes[2].get_xlim()
    fig.axes[2].plot(xlim,[lifter,lifter],lw=3)
    fig.axes[3].set_title("ENVELOPE SPECTROGRAM")
    fig.axes[4].set_title("RESIDUE SPECTROGRAM")
    display(fig)
    
### GUI #####################
#############################
wg_lifter=widgets.IntSlider(value=12,min=1,max=128,step=1,
                            continuous_update=False,description="Lifter",
                            orientation = 'horizontal') #,width='50%', style={'description_width':'20%'})
wg_mel = widgets.Checkbox(value=False,description='mel') #,width='50%',Indent=False
output = widgets.interactive_output(cepstral_lifter_plot,{'lifter':wg_lifter,'mel_flag':wg_mel})
 
wg_instructions = widgets.HTML(
    "<b>CEPSTRAL LIFTERING</b><br> \
    Move the Liftering Slider to control the split between envelope and residue<br>\
    Check the 'mel'-box to work with mel-spectrograms")

ui = VBox([ wg_instructions, HBox([wg_lifter,wg_mel],layout=Spd.box_layout(width='100%',border='',align_items='flex-start'))],
          layout=Spd.box_layout(width='100%',padding='10px'))

In [5]:
# Run the GUI
display(VBox([ui, output]),Audio(data=wavdata,rate=sr))
