Comment-Classifier

A machine learning system that takes a comment as input and classifies it as offensive or non-offensive (neutral). To measure its effectiveness, three classification algorithms were trained and compared: Naive Bayes, SVM and Random Forest.

Libraries

NumPy
Pandas
scikit-learn

Input

A data set of 6182 comments, extracted from multiple online platforms such as YouTube and Twitter.

Procedure

Preprocess and clean the text.
Split the data into training and test sets.
Train on the training data using k-fold cross-validation, with the following classification algorithms:
1. Naive Bayes
2. SVM
3. Random Forest
Use the test data to evaluate the algorithms.
Experiment with TF-IDF (term frequency-inverse document frequency) and POS (part-of-speech) features.
Improvements
1. Lemmatization
2. Stopwords
3. Bigrams
4. Laplace Smoothing
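
The procedure above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names (build_models, compare_models), the toy comments, and choices such as LinearSVC, MultinomialNB, the English stop-word list and the default k are all assumptions. Lemmatization and POS features are left out because they need an extra NLP library that this README does not name.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data, only so the sketch runs end to end.
# 1 = offensive, 0 = neutral. The real data set has 6182 comments.
TOY_TEXTS = [
    "you are an idiot", "what a stupid video", "shut up you moron",
    "this channel is garbage", "nobody likes you loser", "you are so dumb",
    "stupid idiot go away", "what a pathetic loser", "dumb garbage content",
    "you moron this is trash",
    "great video thanks", "I learned a lot today", "nice explanation",
    "thanks for sharing this", "really helpful tutorial", "love this channel",
    "great content keep going", "very clear explanation thanks",
    "helpful video nice work", "I love the tutorial",
]
TOY_LABELS = [1] * 10 + [0] * 10


def build_models():
    """Return one TF-IDF + classifier pipeline per algorithm (assumed setup)."""
    def vectorizer():
        # TF-IDF over unigrams and bigrams, with English stop words removed.
        return TfidfVectorizer(lowercase=True, stop_words="english",
                               ngram_range=(1, 2))

    return {
        # alpha=1.0 is Laplace (add-one) smoothing.
        "Naive Bayes": make_pipeline(vectorizer(), MultinomialNB(alpha=1.0)),
        "SVM": make_pipeline(vectorizer(), LinearSVC()),
        "Random Forest": make_pipeline(
            vectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=0)),
    }


def compare_models(texts, labels, k=5, test_size=0.2, seed=0):
    """Cross-validate each model on the training split, then score on the test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, stratify=labels, random_state=seed)

    results = {}
    for name, model in build_models().items():
        # k-fold cross-validation on the training data only.
        cv_scores = cross_val_score(model, X_train, y_train, cv=k)
        # Refit on the full training split and evaluate on the held-out test data.
        model.fit(X_train, y_train)
        results[name] = {
            "cv_accuracy": cv_scores.mean(),
            "test_accuracy": model.score(X_test, y_test),
        }
    return results


if __name__ == "__main__":
    for name, scores in compare_models(TOY_TEXTS, TOY_LABELS, k=2).items():
        print(name, scores)
```

Putting the vectorizer inside each pipeline means TF-IDF is refit on every cross-validation fold, so no vocabulary or document-frequency information leaks from the validation fold into training. Scores on the toy data carry no meaning; the real comparison needs the full data set.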

Name	Name	Last commit message	Last commit date
Latest commit mansstiv Update commentClassifier.ipynb Nov 5, 2021 b575941 · Nov 5, 2021 History 12 Commits
README.md	README.md	Update README.md	Mar 10, 2021
commentClassifier.ipynb	commentClassifier.ipynb	Update commentClassifier.ipynb	Nov 5, 2021
instructions.pdf	instructions.pdf	Add files via upload	Mar 6, 2021
