florakth/DT2119

DT2119 Speech and Speaker Recognition

Lab1: Feature extraction

• compute MFCC features step by step
• examine the features
• evaluate the correlation between features
• compare utterances with Dynamic Time Warping
• illustrate the discriminative power of the features with respect to words
• perform hierarchical clustering of utterances
• train and analyze a Gaussian Mixture Model of the feature vectors
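
The utterance comparison step can be illustrated with a minimal Dynamic Time Warping sketch. This is not the lab's own code: the names `dtw` and `euclidean` are hypothetical, and the sketch assumes a Euclidean local distance, the basic step pattern (diagonal, vertical, horizontal), and no length normalization of the final distance.

```python
import math


def euclidean(a, b):
    """Local distance between two feature vectors (assumed Euclidean)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(x, y, dist=euclidean):
    """Return (global distance, warping path) between two sequences of frames.

    x and y are lists of feature vectors, for example one MFCC vector per frame.
    The path is a list of (frame index in x, frame index in y) pairs.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    # acc[i][j] holds the cheapest accumulated cost of aligning x[:i] with y[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(x[i - 1], y[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return acc[n][m], path


# Toy one-dimensional "utterances": the second repeats the middle frame.
short = [[0.0], [1.0], [2.0]]
long_ = [[0.0], [1.0], [1.0], [2.0]]
distance, path = dtw(short, long_)
print(distance, path)
```

A matrix of pairwise `dtw` distances between utterances is the kind of input that hierarchical clustering operates on.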

Lab2: Hidden Markov Models with Gaussian Emissions

• combine phonetic HMMs into word HMMs using a lexicon
• implement the forward-backward algorithm and use it to compute the log likelihood of spoken utterances given a Gaussian HMM
• perform isolated word recognition
• implement the Viterbi algorithm and use it to compute the Viterbi path and likelihood
• compare and comment on the Viterbi and Forward likelihoods
• implement the Baum-Welch algorithm to update the parameters of the emission probability distributions
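
The relation between the Forward and Viterbi likelihoods can be sketched in the log domain as follows. This is an illustration, not the lab's implementation: the function names are hypothetical, the per-frame emission log likelihoods are taken as given (in the lab they would come from the Gaussian emission densities), and the model is assumed to be allowed to end in any state rather than in a dedicated non-emitting final state.

```python
import math


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward(log_emlik, log_pi, log_a):
    """Log likelihood of the observations, summed over all state paths.

    log_emlik[t][j] is log p(x_t | state j), log_pi[j] the log prior of
    state j, and log_a[i][j] the log transition probability from i to j.
    """
    states = range(len(log_pi))
    alpha = [log_pi[j] + log_emlik[0][j] for j in states]
    for t in range(1, len(log_emlik)):
        alpha = [
            logsumexp([alpha[i] + log_a[i][j] for i in states]) + log_emlik[t][j]
            for j in states
        ]
    return logsumexp(alpha)


def viterbi(log_emlik, log_pi, log_a):
    """Log likelihood of the single best state path, and that path."""
    states = range(len(log_pi))
    delta = [log_pi[j] + log_emlik[0][j] for j in states]
    backpointers = []
    for t in range(1, len(log_emlik)):
        new_delta, pointers = [], []
        for j in states:
            best_i = max(states, key=lambda i: delta[i] + log_a[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] + log_a[best_i][j] + log_emlik[t][j])
        delta = new_delta
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda j: delta[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return delta[last], path


# Toy two-state left-to-right model with made-up emission likelihoods.
log_pi = [safe_log(p) for p in (1.0, 0.0)]
log_a = [[safe_log(p) for p in row] for row in ((0.5, 0.5), (0.0, 1.0))]
log_emlik = [[safe_log(p) for p in row] for row in ((0.9, 0.1), (0.5, 0.5), (0.1, 0.9))]

log_fwd = forward(log_emlik, log_pi, log_a)
log_vit, best_path = viterbi(log_emlik, log_pi, log_a)
fwd_prob, vit_prob = math.exp(log_fwd), math.exp(log_vit)
print(fwd_prob, vit_prob, best_path)
```

Because the Forward likelihood sums over every state path while Viterbi keeps only the best one, the Viterbi log likelihood can never exceed the Forward log likelihood.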

Lab3: Phoneme Recognition with Deep Neural Network

Train and test a phone recogniser based on digit speech material from the TIDIGITS database:

• using predefined Gaussian-emission HMM phonetic models, create time-aligned phonetic transcriptions of the TIDIGITS database
• define appropriate DNN models for phoneme recognition using Keras
• train and evaluate the DNN models on a frame-by-frame recognition score
• repeat the training by varying model parameters and input features
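
A frame-by-frame recognition score can be read as the fraction of frames whose most probable class matches the aligned target. The sketch below assumes exactly that definition. The name `frame_accuracy` and the toy posteriors are hypothetical, and in practice the posteriors would be the per-frame outputs of the trained network.

```python
def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def frame_accuracy(posteriors, targets):
    """Fraction of frames whose most probable class equals the target class.

    posteriors: one list of class scores per frame (for example DNN outputs).
    targets: one integer class index per frame, taken from the forced alignment.
    """
    if len(posteriors) != len(targets):
        raise ValueError("posteriors and targets must have the same number of frames")
    if not targets:
        raise ValueError("at least one frame is required")
    correct = sum(1 for scores, target in zip(posteriors, targets) if argmax(scores) == target)
    return correct / len(targets)


# Four toy frames over three classes; the third frame is misclassified.
toy_posteriors = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
]
toy_targets = [0, 1, 1, 2]
accuracy = frame_accuracy(toy_posteriors, toy_targets)
print(accuracy)
```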

Project: Automatic music genre classification using deep learning technologies

Music genre classification based on CNN and LSTM networks.
