The project focuses on identifying the speaker accent to be US or not US using binary classification. This project uses various Machine Learning classification methods like Logistic Regression, KNN, Binary Tree and Random Forests. Using the listed methods, evaluated the performance on the baseline models. To increase the accuracy and to prevent …

ShreyaKulkarnii/Machine-Learning-Project


#PROJECT TITLE

"Classification algorithm for Speaker Accent Recognition Data Set (2020)"

#Project Objective

The purpose of this project is to classify speaker accents as US or non-US, using speech from speakers of six different languages and several classification algorithms.

#PROJECT DESCRIPTION

In this project, we trained supervised machine learning classification models and evaluated their testing performance to find the best classifier.
The models trained are:

  1. Logistic Regression
  2. K-Nearest Neighbors
  3. Decision Tree
  4. Random Forest
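
As a minimal sketch, the four models listed above could be fit and compared with scikit-learn as shown here. The data is synthetic (`make_classification`) and only stands in for the accent features; the variable names, split ratio, and hyperparameters are illustrative assumptions, not the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the accent features (12 numeric columns, binary label).
X, y = make_classification(n_samples=300, n_features=12, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline versions of the four classifiers; hyperparameters are illustrative.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() returns mean accuracy on the held-out split.
    test_accuracy[name] = model.score(X_test, y_test)

for name, acc in test_accuracy.items():
    print(f"{name}: {acc:.3f}")
```

Keeping the models in one dictionary means every classifier is trained and scored on exactly the same split, which keeps the comparison fair.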

In this project we performed the following operations:

  1. Dataset Visualization
  2. Dataset Cleaning
  3. Feature Extraction
  4. Model Development
  5. Fine tuning
  6. Performance Evaluation
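
The fine-tuning step could look roughly like the sketch below. The README does not say which tuning method was used, so cross-validated grid search (`GridSearchCV`) over the number of neighbors of a KNN model is an assumption chosen for illustration, as are the synthetic data and the parameter grid.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the accent features.
X, y = make_classification(n_samples=300, n_features=12, random_state=0)

# Scale features before KNN, since it relies on distances between samples.
pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier())

# make_pipeline names each step after its lowercased class name.
param_grid = {"kneighborsclassifier__n_neighbors": [3, 5, 7, 9]}

# 5-fold cross-validation over the grid, selecting by accuracy.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Putting the scaler inside the pipeline means it is refit on each training fold, so the validation folds do not leak into preprocessing.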

For every algorithm, we evaluated and compared accuracy, ROC-AUC, and precision. Based on the testing accuracy, we inferred that the Random Forest classification algorithm scored highest (1) among the classification algorithms used in this project.
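
To make the three metrics concrete, here is a small plain-Python sketch of what each one computes. The function names and toy labels are hypothetical and are not taken from the notebook, which most likely relies on the equivalent `sklearn.metrics` functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of the samples predicted positive, the fraction that truly are positive."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0  # convention chosen here when nothing is predicted positive
    return sum(t == positive for t in predicted_pos) / len(predicted_pos)


def roc_auc(y_true, y_score, positive=1):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == positive]
    neg = [s for t, s in zip(y_true, y_score) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = US accent, 0 = non-US accent (made-up values).
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
y_pred = [int(s >= 0.5) for s in y_score]  # threshold the scores at 0.5

print(accuracy(y_true, y_pred), precision(y_true, y_pred), roc_auc(y_true, y_score))
```

Accuracy and precision depend on the chosen threshold, while ROC-AUC is computed from the raw scores and summarizes ranking quality across all thresholds.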

#LIBRARIES USED

The following libraries were imported from the Anaconda distribution and used in the project.

  1. pyplot (from Matplotlib)
  2. sns (seaborn alias)
  3. pandas
  4. NumPy
  5. seaborn
  6. Matplotlib
  7. scikit-learn (sklearn)

#GETTING STARTED

  1. Import the CSV file "accent-mfcc-data-1.csv" from the project folder.
  2. Read the CSV and store the dataset in the variable "SAR_dt".
  3. Run the .ipynb notebook; all cells should run in one pass without interruption.
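
The first two steps could be sketched as follows. The helper `load_accent_data` is hypothetical, and the label column name `language` with the value `"US"` is an assumption about the CSV layout; adjust both if the file differs.

```python
import pandas as pd


def load_accent_data(csv_path="accent-mfcc-data-1.csv", label_column="language"):
    """Read the CSV and derive a binary US / non-US target.

    Assumes a label column named `label_column` whose value is "US" for
    US-accented speakers; every other value is treated as non-US.
    """
    SAR_dt = pd.read_csv(csv_path)
    X = SAR_dt.drop(columns=[label_column])          # feature columns only
    y = (SAR_dt[label_column] == "US").astype(int)   # 1 = US, 0 = non-US
    return SAR_dt, X, y
```

Calling `SAR_dt, X, y = load_accent_data()` from the project folder would then give the full table plus the features and binary target used by the models.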

#References

https://github.com/lakshanakolur/Accent-Recognition-ML/tree/master/Code

https://github.com/stephenjkaplan/speech-accent-classifier/blob/master/notebooks/Speech%20Accent%20Classifier%20MVP%20-%20American%20vs%20Non-American%20Accents.ipynb

https://github.com/stephenjkaplan/speech-accent-classifier/blob/master/notebooks/analysis_utilities.py

https://www.ritchieng.com/machine-learning-evaluate-classification-model/

https://towardsdatascience.com/logistic-regression-model-tuning-with-scikit-learn-part-1-425142e01af5

https://github.com/MadhavShashi/Human-Activity-Recognition-Using-Smartphones-Sensor-DataSet/blob/master/1.HumanActivityRecognition_EDA.ipynb

https://www.pluralsight.com/guides/cleaning-up-data-from-outliers

https://www.semanticscholar.org/paper/A-Comparison-of-Classifiers-in-Performing-Speaker-Ma-Fokoue/666a2cb9589c0d2b46cd91f89e3d470d85aa3e1d

https://www.sciencedirect.com/topics/computer-science/cepstral-coefficient#:~:text=In%20practice%2C%20the%20first%208,may%20be%20beneficial%20%5B130%5D.
