mfcc

Here are 105 public repositories matching this topic...

Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step for any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification. That is, we aim to develop two-class classifiers, which can…

  • Updated Mar 3, 2023
  • Python
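
The framing step mentioned in the description above can be sketched as follows. This is a minimal illustration, not code from the listed repository: the function name `frame_signal`, the 25 ms frame length, the 10 ms hop, and the Hamming window are all assumptions chosen because they are common defaults in speech front ends.

```python
import math

def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D sequence into overlapping, Hamming-windowed frames.

    Hypothetical helper for illustration; returns a list of frames,
    each a list of `frame_len` samples.
    """
    if frame_len < 2 or hop_len < 1:
        raise ValueError("frame_len must be >= 2 and hop_len >= 1")
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    # Hamming window: tapers frame edges to reduce spectral leakage.
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    # Only complete frames are kept; trailing samples are dropped.
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    return [
        [signal[i * hop_len + n] * window[n] for n in range(frame_len)]
        for i in range(n_frames)
    ]

# Assumed example: 1 s of a 440 Hz tone at 16 kHz, 25 ms frames, 10 ms hop.
sr = 16000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
frames = frame_signal(tone, frame_len=400, hop_len=160)
```

Each resulting frame would then typically be passed to a feature extractor such as an MFCC computation; in practice an array library would likely replace the nested lists.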

Lyrics-to-audio alignment system, based on machine learning algorithms: hidden Markov models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.

  • Updated Mar 9, 2020
  • Python
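
To illustrate what Viterbi forced alignment means in general, here is a minimal sketch under simplifying assumptions. It is not the listed repository's implementation: the function name `forced_align` and the input layout are hypothetical, all transitions are treated as equally likely, and the note-duration awareness described above is not modeled. Given per-frame log-likelihoods for a known left-to-right sequence of units, it finds the best monotonic assignment of frames to units.

```python
def forced_align(log_lik):
    """Viterbi forced alignment over a fixed left-to-right unit sequence.

    log_lik[t][s] is the (assumed, precomputed) log-likelihood of frame t
    under the s-th unit of the known transcript. At each frame the path
    either stays in the current unit or advances to the next one.
    Returns one unit index per frame.
    """
    T, S = len(log_lik), len(log_lik[0])
    if T < S:
        raise ValueError("need at least one frame per unit")
    neg_inf = float("-inf")
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_lik[0][0]  # the path must start in the first unit
    for t in range(1, T):
        for s in range(S):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else neg_inf
            if move > stay:
                score[t][s] = move + log_lik[t][s]
                back[t][s] = s - 1
            else:
                score[t][s] = stay + log_lik[t][s]
                back[t][s] = s
    # Backtrack from the last unit at the last frame.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Toy input: frames 0-1 favor unit 0, frames 2-4 favor unit 1.
toy = [[0, -5], [0, -5], [-5, 0], [-5, 0], [-5, 0]]
alignment = forced_align(toy)
```

In a real system the log-likelihoods would come from an acoustic model (the description mentions an MLP), and a duration-aware variant would add duration terms to the stay and move scores.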

In this work, we propose two postprocessing approaches that apply convolutional neural networks (CNNs) in either the time domain or the cepstral domain to enhance coded speech without any modification of the codecs. The time-domain approach is end-to-end, while the cepstral-domain approach uses analysis-synthesis with cepstral d…

  • Updated Mar 8, 2020
  • Python
