# Spectrograms for Human Speech Analysis
### Finding Pitch and Formants

+ ###### Author: Dirk Van Compernolle   
+ ###### Modification History:   1/1/2020, 11/02/2023
+ ###### Requires:  pyspch>=0.7, verified on v0.8

In [1]:
# uncomment the pip install command to install pyspch -- it is required!
#
#!pip install git+https://github.com/compi1234/pyspch.git
#
try:
    import pyspch
except ModuleNotFoundError:
    # pyspch is missing: print install instructions, then re-raise the original error
    print(
    """
    To enable this notebook on platforms such as Google Colab,
    install the pyspch package and its dependencies by running the following code:

    !pip install git+https://github.com/compi1234/pyspch.git
    """
    )
    raise

In [2]:
# Do the imports #
##################
#
%matplotlib inline
import os,sys 
import numpy as np
import pandas as pd
from IPython.display import display, Audio, HTML
#   
import pyspch.sp as Sps
import pyspch.core as Spch
import pyspch.display as Spd

import matplotlib.pyplot as plt



# make notebook cells stretch over the full screen
display(HTML(data="""
<style>
    div#notebook-container    { width: 95%; }
    div#menubar-container     { width: 65%; }
    div#maintoolbar-container { width: 99%; }
</style>
"""))

## Spectrogram analysis for pitch and formants


#### 1. Setup
+ use Spd.iSpectrogram2(), especially to see the spectral slice at the cursor position
+ use Spd.iSpectrogram(), especially if you want to zoom in and listen to shorter segments
+ you can switch between both as you like
+ use default parameters unless otherwise stated
+ suggested files to use: female2.wav (female, narrowband), male1.wav (male, narrowband), female1.wav (female, wideband)
    
#### 2. Find pitch and formants in time and/or frequency domain
+ for 'female2': analyze the speech (vowels) at time=1.8sec and time=3.15sec
+ find pitch in three ways: 
    - find the period in the time waveform
    - find the harmonic distance in a spectral slice (e.g. count how many harmonics fit in a 1kHz range)
    - find the harmonic distance in a spectrogram
+ could you determine gender from the obtained pitch values ?
+ do your pitch estimation methods give consistent estimates ?
+ what is (roughly) the pitch range that you observe for this single speaker / single sentence ?
+ is pitch a clear indication of gender ? See [Pitch vs. Gender Histograms](../lab03_source_filter/PitchStatistics.html)
+ find vowel identity by finding first and second formant and then looking up in formant tables
    - think "spectral envelope" when trying to identify formants
+ repeat the above for the male voice
+ some of the above tasks are indeed (very) difficult with the views that you have. In your opinion, what is lacking and what causes the problem ? (The information must be there, since you recognize the sounds easily when you listen.)
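
The first two pitch-finding methods above (period in the waveform, harmonic distance in a spectral slice) can be sketched on a synthetic signal. This is a minimal illustration with numpy only, not the way pyspch computes anything: `synth_vowel`, `pitch_from_period` and `pitch_from_harmonics` are hypothetical helper names, and the pitch search range, the peak threshold and the 1.1 kHz band limit are assumptions chosen for this toy example.

```python
import numpy as np

def synth_vowel(f0=200.0, sr=8000, dur=0.05, n_harm=15):
    """Hypothetical test signal: harmonics of f0 with amplitudes decaying as 1/k."""
    t = np.arange(int(sr * dur)) / sr
    return sum(np.cos(2 * np.pi * k * f0 * t) / k for k in range(1, n_harm + 1))

def pitch_from_period(x, sr, fmin=60.0, fmax=400.0):
    """Time domain: the lag of the strongest autocorrelation peak is the period."""
    x = x - np.mean(x)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags 0, 1, 2, ...
    lo, hi = int(sr / fmax), int(sr / fmin)              # assumed pitch range
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return sr / lag

def pitch_from_harmonics(x, sr, f_limit=1100.0, n_fft=4096, rel_thr=0.1):
    """Frequency domain: median distance between spectral peaks below f_limit."""
    mag = np.abs(np.fft.rfft((x - np.mean(x)) * np.hanning(len(x)), n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    inner = mag[1:-1]
    # local maxima that exceed a threshold relative to the largest peak
    is_peak = (inner > mag[:-2]) & (inner >= mag[2:]) & (inner > rel_thr * mag.max())
    peaks = freqs[1:-1][is_peak]
    peaks = peaks[peaks <= f_limit]
    if len(peaks) < 2:
        return float("nan")                              # not enough harmonics found
    return float(np.median(np.diff(peaks)))

x = synth_vowel(f0=200.0, sr=8000)
print("pitch from period    :", pitch_from_period(x, 8000))
print("pitch from harmonics :", pitch_from_harmonics(x, 8000))
```

On real speech both estimates are less clean than on this synthetic signal, so it is worth comparing them with what you read off the interactive views.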


#### 3. Pitch and formants in the mel spectrum
+ make sure you are up to date on the mel-scale
+ add the mel spectrogram (and mel spectrum slice) to your view
+ is finding pitch easier in a high resolution mel spectrum than in the Fourier spectrum ?
   - for narrowband speech (8kHz sampling) we suggest: 20 mel bands for a critical filterbank, 64 mel bands for a high resolution filterbank
   - for wideband speech (16kHz sampling) we suggest: 24 mel bands for a critical filterbank, 80 mel bands for a high resolution filterbank
+ does your answer to the question above change if you lower the number of mel bands to 24 ?
+ in which representation is finding formants easiest ?
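
To reason about the questions above, it can help to compare the spacing between mel band centers with the harmonic distance you measured earlier. The sketch below assumes the common formula mel = 2595 log10(1 + f/700); pyspch may use a different mel variant or filter layout, and `hz_to_mel`, `mel_to_hz` and `mel_centers` are hypothetical helper names used only for this illustration.

```python
import numpy as np

def hz_to_mel(f):
    """Assumed mel formula (one common variant): 2595 * log10(1 + f/700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_centers(n_bands, sr, fmin=0.0):
    """Center frequencies (Hz) of n_bands filters equally spaced in mel up to sr/2."""
    edges = np.linspace(hz_to_mel(fmin), hz_to_mel(sr / 2.0), n_bands + 2)
    return mel_to_hz(edges[1:-1])

# narrowband speech (8kHz sampling): critical vs. high resolution filterbank
for n_bands in (20, 64):
    spacing = np.diff(mel_centers(n_bands, sr=8000))
    print(f"{n_bands} bands: center spacing from {spacing.min():.0f} Hz "
          f"to {spacing.max():.0f} Hz")
```

Harmonics can only show up as separate peaks where the band spacing is clearly smaller than the pitch, so compare the printed spacings with the pitch values of your speakers.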

#### 4. Pitch and formants in the cepstrum / mel cepstrum

In [3]:
Spd.iSpectrogram2(figwidth=12,fname='female2.wav')     


In [4]:
Spd.iSpectrogram(figwidth=12,fname='female2.wav',style='vertical') 
