Machine learning, in numpy
Updated Oct 29, 2023 - Python
Building and training a speech emotion recognizer that predicts human emotions using Python, scikit-learn, and Keras
🔉 spafe: Simplified Python Audio Features Extraction
Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification. That is, to develop two-class classifiers, which can…
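The framing step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's code: the function name `frame_signal` and its parameters are hypothetical, and the sketch assumes no padding, so trailing samples that do not fill a whole frame are dropped.

```python
def frame_signal(samples, frame_length, hop_length):
    """Split a sequence of samples into overlapping frames.

    Hypothetical helper for illustration. No padding is applied, so
    samples after the last complete frame are discarded.
    """
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")
    frames = []
    # Each frame starts hop_length samples after the previous one.
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frames.append(list(samples[start:start + frame_length]))
    return frames


# Toy usage: 10 samples, frames of 4 samples, hop of 2 samples.
frames = frame_signal(list(range(10)), frame_length=4, hop_length=2)
print(len(frames), frames[0], frames[-1])
```

With a 16 kHz recording, a 25 ms frame and a 10 ms hop would correspond to `frame_length=400` and `hop_length=160`; those values are common choices used here as an assumption, not settings taken from the project.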
Audio feature extraction and classification
🔉 👦 👧 Voice-based gender recognition using Mel-frequency cepstral coefficients (MFCC) and Gaussian mixture models (GMM)
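A rough sketch of the decision rule commonly used with MFCC and GMM pairs: score the utterance's feature frames under one mixture model per class and pick the class with the higher total log-likelihood. Everything below is an assumption for illustration. The function names, the diagonal-covariance choice, and the toy model parameters are hypothetical, and model training (typically EM) is omitted.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of feature frames under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        component_logs = []
        for w, mu, var in zip(weights, means, variances):
            # log(weight) + log N(x; mu, diag(var))
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            component_logs.append(log_p)
        # Log-sum-exp over components for numerical stability.
        peak = max(component_logs)
        total += peak + math.log(sum(math.exp(v - peak) for v in component_logs))
    return total


def classify(frames, models):
    """Return the label whose GMM gives the frames the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, *models[label]))


# Toy two-dimensional "features" and hand-written models (not trained, not real MFCCs).
toy_models = {
    "class_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "class_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
print(classify([[0.2, 0.1], [0.9, 1.1]], toy_models))
```

In practice the frames would be MFCC vectors and each model would be fitted on recordings from one class; the same rule extends to more than two classes by adding entries to the dictionary.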
Synchronize your subtitles using machine learning
A program for automatic speaker identification using deep learning techniques.
A simple audio feature extraction library
Lyrics-to-audio alignment system. Based on machine learning algorithms: hidden Markov models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
People speak a language with an accent, and a particular accent reflects a person's linguistic background. The model identifies the accent from an audio recording. Its output could be used to determine accents and to help students learning English reduce or improve their accents through training.
🔉 👦 👧 👩 👨 Speaker identification using voice MFCCs and GMM
An implementation of Power-Normalized Cepstral Coefficients (PNCC)
Speech recognition of the digits 0-9 based on DTW and MFCC features. DTW, MFCC, speech recognition, Chinese and English data, endpoint detection, Digital Voice Recognition.
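Template matching with DTW can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: `dtw_distance` and `recognize` are hypothetical names, the local cost is the Euclidean distance between feature vectors, and the toy one-dimensional sequences stand in for real MFCC frames.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stay on the same frame of seq_b
                cost[i][j - 1],      # stay on the same frame of seq_a
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the template with the smallest DTW cost to the query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Toy templates: one short feature sequence per label (not real MFCCs).
toy_templates = {
    "0": [[0.0], [1.0], [2.0]],
    "1": [[5.0], [4.0], [3.0]],
}
print(recognize([[0.0], [1.0], [1.0], [2.0]], toy_templates))
```

Because DTW allows frames to repeat, the query above matches template "0" with zero cost even though it is one frame longer, which is why the method tolerates differences in speaking rate.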
In this work we propose two postprocessing approaches applying convolutional neural networks (CNNs) either in the time domain or the cepstral domain to enhance the coded speech without any modification of the codecs. The time domain approach follows an end-to-end fashion, while the cepstral domain approach uses analysis-synthesis with cepstral d…
Implement a GRU/LSTM model using Keras and train it to classify languages using MFCC features
Deep Learning model for lexical stress detection in spoken English