- Installation
- Repository Structure
- Project Motivation
- File Descriptions
- Models Used
- Instructions To Run
- Results
- Acknowledgement, Author and Licensing
The code should run with no issues using Python 3.x. Using Jupyter Notebook from Anaconda is recommended. You may also use other data-visualization tools, such as Tableau, for reference. The required libraries, with appropriate versions, are listed in requirements.txt.
For years now the comment section of YouTube has been plagued with random spam, and YouTube doesn't seem to be doing anything about it. Here we introduce a revamped comment section that makes good comments float to the top and spam comments sink to the bottom.
data - This data file, included in the repository, contains all the data. It holds comments classified into three categories - Non-offensive, Hate-Speech and Abusive. The data was collected by scraping the YouTube API; the categories have been assigned manually.
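A minimal sketch of reading the labelled data with the standard library. The column names `comment` and `category` are assumptions for illustration; adjust them to match the actual file:

```python
import csv

def load_comments(path):
    """Read labelled comments; assumes columns named 'comment' and 'category'."""
    texts, labels = [], []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            texts.append(row["comment"])
            # One of: Non-offensive, Hate-Speech, Abusive (assigned manually).
            labels.append(row["category"])
    return texts, labels
```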
YouTube API - Scraping Comments
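As a rough sketch of how top-level comments can be pulled from the YouTube Data API v3 `commentThreads` endpoint (the helper names below are placeholders, not the repository's actual scraper, and a real API key and video ID are required to fetch live data):

```python
import json
import urllib.parse
import urllib.request

# Public endpoint of the YouTube Data API v3 for top-level comment threads.
API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"

def fetch_comment_page(video_id, api_key, page_token=None):
    """Fetch one page (up to 100) of top-level comments for a video."""
    params = {"part": "snippet", "videoId": video_id,
              "key": api_key, "maxResults": 100}
    if page_token:
        params["pageToken"] = page_token
    url = API_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def extract_comments(response):
    """Pull the display text of each top-level comment out of an API response."""
    return [item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
            for item in response.get("items", [])]
```

To collect more than one page, keep calling `fetch_comment_page` with the `nextPageToken` value from the previous response until it is absent.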
- Logistic regression:
- Support Vector Machine:
- Support Vector Machine with Linear Kernel:
- Support Vector Machine using RBF Kernel:
- Support Vector Machine using Polynomial Kernel:
- Decision Tree Classifier:
- K-Nearest Neighbour Classifier:
- Extra Tree Classifier:
- Random Forest Classifier:
- Model Parameter Optimization using GridSearchCV:
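The models listed above can be sketched with scikit-learn roughly as follows. The toy comments, labels, and parameter grid are illustrative assumptions, not the project's actual data or settings:

```python
# Minimal sketch: train each listed classifier on TF-IDF features and tune
# one of them with GridSearchCV. Toy data only (0 = Non-offensive,
# 1 = Hate-Speech, 2 = Abusive).
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

comments = [
    "great video thanks", "loved this tutorial", "very helpful content",
    "hateful insult example one", "hateful insult example two", "hateful insult example three",
    "spam link click here", "buy followers cheap now", "free money visit my page",
]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

# TF-IDF features over the raw comment text.
X = TfidfVectorizer().fit_transform(comments)

# One instance of each classifier from the list above.
models = {
    "logistic": LogisticRegression(max_iter=1000),
    "svm_linear": SVC(kernel="linear"),
    "svm_rbf": SVC(kernel="rbf"),
    "svm_poly": SVC(kernel="poly"),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=3),
    "extra_trees": ExtraTreesClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X, labels)

# Hyper-parameter optimization with GridSearchCV, shown here for the RBF SVM.
grid = GridSearchCV(SVC(kernel="rbf"),
                    {"C": [0.1, 1, 10], "gamma": ["scale", 0.1]},
                    cv=2)
grid.fit(X, labels)
print(grid.best_params_)
```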
First, install the dependencies:

```shell
pip install -r requirements.txt
```

Then run the app with Streamlit:

```shell
streamlit run main.py
```
The interactive web app is hosted on Streamlit and can be found here:
Credit for this project goes to:
- Dr. Ankit Bhurane for guiding us in this project
- Dr. Andrew Ng for his insightful course on Coursera
The code can be freely used by any individual or organization for their needs.