Speech-Sentiment-Analysis

This project tackles the challenge of identifying emotions from voice recordings. Emotions are inherently subjective and are typically inferred from visual cues such as facial expressions and body language, which makes recognition from voice alone especially difficult. The goal is to build a model that can reliably classify the emotional tone of vocal expressions.

Datasets

  1. RAVDESS
  2. TESS
  3. SAVEE
  4. CREMA-D

Models

  1. A CNN that analyzes the Mel spectrograms of the audio files (see the feature-extraction sketch after this list).
  2. A CNN that operates on the Mel Frequency Cepstral Coefficients (MFCCs) of the audio files.
  3. A convolutional recurrent neural network (CRNN) that also works with MFCCs.
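
Both feature types are straightforward to compute with an audio library such as librosa. A minimal sketch, assuming librosa is the extraction tool and using illustrative parameter values (the notebook is the authoritative source for the actual settings):

```python
import librosa
import numpy as np

def extract_features(path, sr=22050, n_mels=128, n_mfcc=40):
    """Compute the two feature types the models consume.

    Parameter values here (sample rate, n_mels, n_mfcc) are
    illustrative, not taken from the repository.
    """
    y, sr = librosa.load(path, sr=sr)

    # Log-scaled Mel spectrogram: input to the first CNN
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)

    # MFCCs: input to the second CNN and the CRNN
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

    return log_mel, mfcc
```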

Project Structure

  1. Gathering Data
  2. Data Organization and Cleaning
  3. Data Exploration, Preparation, and Visualization
  4. Data Preprocessing
  5. Model Implementation (a minimal model sketch follows below)

All of these steps are detailed in the speech_emotion_recognition.ipynb Jupyter notebook.
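
As a rough illustration of step 5, here is a minimal Keras sketch of what an MFCC-based CNN classifier could look like. The input shape, layer sizes, and eight-class output are assumptions made for the sake of a runnable example, not values taken from the notebook:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_mfcc_cnn(input_shape=(40, 174, 1), num_classes=8):
    """A small 2D CNN over MFCC 'images' (coefficients x time frames x channel)."""
    model = keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(2),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(2),
        layers.Dropout(0.3),  # regularization against overfitting
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

A CRNN variant would typically replace the Flatten/Dense stage with a recurrent layer (for example, an LSTM over the time axis) to model the temporal structure of the MFCC sequence.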

Insights

The Mel spectrogram CNN performed well overall but struggled to differentiate some emotions. The CNN using MFCCs was the most successful, suggesting that MFCCs are a more discriminative representation for emotion recognition in audio. The CRNN with MFCCs also showed good results but was prone to overfitting and did not surpass the MFCC CNN.

Evaluation Metrics

The models were assessed using precision, recall, and F1 score, which give a more nuanced picture of performance than accuracy alone, particularly when emotion classes are imbalanced. The MFCC CNN emerged as the top performer, achieving the highest scores on all three metrics.
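
Per-class precision, recall, and F1 can be computed with scikit-learn's classification_report; the labels below are hypothetical stand-ins for the notebook's actual test-set predictions:

```python
import numpy as np
from sklearn.metrics import classification_report

# Hypothetical ground truth and predictions standing in for the test split
emotions = ["angry", "happy", "neutral", "sad"]
y_true = np.array([0, 1, 2, 3, 0, 1, 2, 3])
y_pred = np.array([0, 1, 2, 2, 0, 3, 2, 3])

# Prints per-class precision/recall/F1 plus macro and weighted averages
print(classification_report(y_true, y_pred, target_names=emotions))
```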