GitHub - smrutijethwani/Covid-Vaccine-Data: Sentiment Analysis about the vaccine data

Project Overview

In this project, we pre-processed a dataset of 100K tweets in Python and implemented six classification algorithms: Logistic Regression, Random Forest, Decision Tree, KNN, Linear SVC, and AdaBoost. We tuned the hyperparameters of each algorithm and compared their accuracies. We also created a new dataset by scraping tweets from Twitter with the Python package snscrape. Finally, we visualized the data at each step using Sci-kit, Seaborn, Matplotlib, and Plotly. Comparing the predictions of the classification algorithms, we achieved about 92% accuracy.
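
The workflow described above (clean the tweets, then tune and compare the six classifiers) might look roughly like the following. This is a minimal sketch, not the code from Project.ipynb: the `clean_tweet` rules, the TF-IDF features, the hyperparameter grids, and the toy tweets and labels are all assumptions made for illustration, and the scores it produces say nothing about the 92% figure reported for the real data.

```python
import re

from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters only (also drops '#')
    return " ".join(text.split())              # collapse repeated whitespace


# Hypothetical toy data standing in for the real tweets (1 = positive, 0 = negative).
TWEETS = [
    "Got my vaccine today, feeling great! https://t.co/abc",
    "So thankful for the vaccine rollout @WHO",
    "Vaccine appointment was quick and easy",
    "Happy to be fully vaccinated #grateful",
    "Great news, the vaccine works well",
    "Relieved and safe after my second dose",
    "Terrible side effects after the vaccine",
    "I do not trust this vaccine at all",
    "The rollout has been a slow, awful mess",
    "Worried and scared about the vaccine",
    "Bad reaction, felt sick for days",
    "Angry about the long vaccine queue",
]
LABELS = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Each entry maps a display name to (estimator, small illustrative grid).
# The grids are assumptions; the tuned ranges in the project are not stated.
CANDIDATES = {
    "Logistic Regression": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1.0, 10.0]}),
    "Random Forest": (RandomForestClassifier(random_state=0), {"clf__n_estimators": [20, 50]}),
    "Decision Tree": (DecisionTreeClassifier(random_state=0), {"clf__max_depth": [None, 5]}),
    "KNN": (KNeighborsClassifier(), {"clf__n_neighbors": [1, 3]}),
    "Linear SVC": (LinearSVC(), {"clf__C": [0.1, 1.0]}),
    "AdaBoost": (AdaBoostClassifier(random_state=0), {"clf__n_estimators": [10, 25]}),
}


def compare_classifiers(texts, labels, cv=2):
    """Grid-search each candidate and return {name: (best CV accuracy, best params)}."""
    cleaned = [clean_tweet(t) for t in texts]
    results = {}
    for name, (estimator, grid) in CANDIDATES.items():
        pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", estimator)])
        search = GridSearchCV(pipeline, grid, cv=cv, scoring="accuracy")
        search.fit(cleaned, labels)
        results[name] = (search.best_score_, search.best_params_)
    return results


if __name__ == "__main__":
    for name, (score, params) in compare_classifiers(TWEETS, LABELS).items():
        print(f"{name}: accuracy={score:.2f}, params={params}")
```

Putting the vectorizer inside the pipeline means it is refit on each training fold, so the cross-validated accuracy is not inflated by vocabulary learned from held-out tweets.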

I learned a lot about the challenges and opportunities of working with large datasets. I can now answer questions such as how to identify and address data quality issues, how to choose the right classification algorithm for the task at hand, and how to interpret the results of an analysis.

Repository files (7 commits):
Project.ipynb
Project.pdf
README.md
Tweets_lem.csv.zip

Please download the other dataset from here: https://www.kaggle.com/code/pratik0894/covid-vaccine-sentiment-analysis
