Sentiment Analysis with Finetuned Models

Introduction

In the wake of the COVID-19 pandemic, our world has witnessed unprecedented challenges and changes in various aspects of life. Alongside the rapid spread of the virus, the internet and social media platforms have become critical sources of information and communication for people across the globe. These platforms have not only provided an avenue for disseminating vital updates but have also become a space for expressing emotions, opinions, and sentiments related to the pandemic.

The development and distribution of COVID-19 vaccines have been monumental milestones in our battle against the global pandemic. These vaccines have provided hope and a pathway to recovery, offering protection against the severe effects of the virus. As vaccination efforts continue to progress worldwide, it becomes increasingly important to gauge public sentiment and understand the prevailing attitudes towards COVID-19 vaccines.

The objective of this challenge is to develop a machine learning model that classifies a Twitter post about vaccinations as positive, neutral, or negative. The model will be deployed with Streamlit in a Docker container.

Dataset

Tweets have been classified as pro-vaccine (1), neutral (0) or anti-vaccine (-1). The tweets have had usernames and web addresses removed.

Variable definition:

tweet_id: Unique identifier of the tweet

safe_tweet: Text contained in the tweet. Some sensitive information, such as usernames and URLs, has been removed

label: Sentiment of the tweet (-1 for negative, 0 for neutral, 1 for positive)

agreement: The tweets were labeled by three people. Agreement indicates the percentage of the three reviewers that agreed on the given label. You may use this column in your training, but agreement data will not be shared for the test set.
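The column layout above can be read with a few lines of Python. The following is a minimal sketch, not the notebook's actual loading code: the sample rows are invented for illustration, the `agreement` values are assumed to be stored as fractions (1/3, 2/3, 1), and `load_rows` and `min_agreement` are hypothetical names.

```python
import csv
import io

# Hypothetical rows in the Train.csv layout described above (not real data).
SAMPLE_CSV = """tweet_id,safe_tweet,label,agreement
A1,Vaccines save lives <user>,1,1.0
B2,Got my appointment today <url>,0,0.666667
C3,I do not trust this shot,-1,0.333333
"""

VALID_LABELS = {-1, 0, 1}


def load_rows(handle, min_agreement=0.0):
    """Parse rows, cast label and agreement, and drop rows below min_agreement."""
    rows = []
    for raw in csv.DictReader(handle):
        label = int(float(raw["label"]))
        agreement = float(raw["agreement"])
        if label not in VALID_LABELS:
            continue  # skip rows whose label is not -1, 0, or 1
        if agreement >= min_agreement:
            rows.append(
                {
                    "tweet_id": raw["tweet_id"],
                    "text": raw["safe_tweet"],
                    "label": label,
                    "agreement": agreement,
                }
            )
    return rows


all_rows = load_rows(io.StringIO(SAMPLE_CSV))
# Keep only tweets where at least two of the three reviewers agreed.
confident_rows = load_rows(io.StringIO(SAMPLE_CSV), min_agreement=0.6)
print(len(all_rows), len(confident_rows))
```

Filtering on agreement is one option; another is to keep every row and use agreement as a sample weight during training. Because agreement is not available for the test set, it can only shape training, not inference.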

Files available for download are:

Train.csv - Labeled tweets on which to train your model

Test.csv - Tweets that you must classify using your trained model

SampleSubmission.csv - An example of what your submission file should look like. The order of the rows does not matter, but the IDs must be correct. Values in the 'label' column should range between -1 and 1.

NLP_Primer_twitter_challenge.ipynb - A starter notebook to help you make your first submission on this challenge.
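As a rough illustration of the submission constraints, the sketch below writes ID and label pairs and clips every label into the range -1 to 1. The ID column name `tweet_id` is an assumption carried over from the training data (check SampleSubmission.csv for the real header), and `write_submission` is a hypothetical helper.

```python
import csv
import io


def write_submission(handle, predictions):
    """Write (tweet_id, score) pairs, clipping each score to [-1, 1]."""
    writer = csv.writer(handle)
    # Assumed header; the real one should match SampleSubmission.csv.
    writer.writerow(["tweet_id", "label"])
    for tweet_id, score in predictions:
        clipped = max(-1.0, min(1.0, float(score)))
        writer.writerow([tweet_id, clipped])


# An in-memory buffer stands in for an open file so the sketch has no side effects.
buffer = io.StringIO()
write_submission(buffer, [("A1", 0.8), ("B2", -1.7), ("C3", 0.0)])
submission_text = buffer.getvalue()
print(submission_text)
```

To write a real file, pass a handle opened with `open(path, "w", newline="")` in place of the buffer.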

Setup

Fork this repo and run the notebook on Google Colab. The Hugging Face models are deep-learning based, so training them needs substantial GPU compute. Use Colab, another cloud GPU provider, or a local machine with an NVIDIA GPU.

Note that Google Colab sessions have time limits and may disconnect after a period of inactivity. However, you can save your progress and re-establish the connection to the GPU when needed.
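Before starting a long training run, or after reconnecting a Colab session, it can help to confirm that a GPU is actually visible. This is a small sketch that assumes PyTorch is the backend in use; `pick_device` is a hypothetical helper, and it falls back to the CPU when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" when PyTorch can see an NVIDIA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is no GPU backend to query
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training would run on: {device}")
```

If this prints `cpu` on Colab, the runtime type is probably not set to a GPU.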

Hugging Face is an open-source platform and provider of machine learning technologies. You can install their package to access pre-built models, use them directly or fine-tune them (retrain them on your dataset, leveraging the knowledge from the initial training), and then host your trained models on the platform so that you can use them later on other devices and apps.

Go to the website and sign in to access all the features of the platform.
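One practical detail when fine-tuning a classifier on this dataset: classification heads are trained on class indices starting at 0, while the labels here are -1, 0, and 1. The sketch below shows one possible mapping. The specific index order and the names `LABEL_TO_ID`, `ID_TO_NAME`, `encode_labels`, and `decode_ids` are assumptions for illustration, not necessarily what the notebook does.

```python
# Dataset label -> class index expected by a classification head (assumed order).
LABEL_TO_ID = {-1: 0, 0: 1, 1: 2}
# Inverse mapping, used to turn predicted indices back into dataset labels.
ID_TO_LABEL = {index: label for label, index in LABEL_TO_ID.items()}
# Human-readable names, e.g. for display in an app.
ID_TO_NAME = {0: "negative", 1: "neutral", 2: "positive"}


def encode_labels(labels):
    """Convert dataset labels (-1, 0, 1) into class indices (0, 1, 2)."""
    return [LABEL_TO_ID[int(label)] for label in labels]


def decode_ids(class_ids):
    """Convert predicted class indices back into dataset labels."""
    return [ID_TO_LABEL[int(class_id)] for class_id in class_ids]


encoded = encode_labels([-1, 0, 1, 1])
decoded = decode_ids(encoded)
print(encoded, decoded, [ID_TO_NAME[i] for i in encoded])
```

A mapping like `ID_TO_NAME` can typically be passed to the model configuration as `id2label` when loading a pre-trained model for sequence classification, so that hosted models report readable class names.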

Read more about Text classification with Hugging Face

Evaluation

The evaluation metric for this challenge is the Root Mean Squared Error.
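Root Mean Squared Error is the square root of the mean of the squared differences between predicted and true labels. The sketch below computes it with the standard library only. The `expected_score` helper is an assumption rather than part of the challenge: it shows one way to turn three class probabilities into a single value between -1 and 1, which tends to suit RMSE better than a hard class choice.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def expected_score(p_negative, p_neutral, p_positive):
    """Expected label under the class probabilities: (-1)*p_neg + 0*p_neu + 1*p_pos."""
    return p_positive - p_negative


# Two correct predictions and one that is off by 2 (true -1, predicted 1).
error = rmse([1, 0, -1], [1, 0, 1])
soft_prediction = expected_score(0.2, 0.3, 0.5)
print(round(error, 4), round(soft_prediction, 4))
```

In the example, the squared errors are 0, 0, and 4, so the RMSE is the square root of 4/3.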

Screenshots

Streamlit App

Gradio App

https://huggingface.co/spaces/AlbieCofie/sentiment-classification-Gradio-App

Resources

👏 Support

If you found this article helpful, please give it a clap or a star on GitHub!

Author

Alberta Cofie

AlbieCofie/sentimentanalysis
