Twitter Seniment Analysis Using XML-T

Mohammad H. Forouhesh

Metodata Inc ®

April 25, 2022

This model is fine-tuned on the checkpoints of XLM-RoBERTa model, which is trained on ~198M multilingual tweets from May ’18 to March ‘20, described and evaluated in the reference paper. It was first released in this repository. Consequently, it outperforms models trained on only one language in downstream tasks when used on new data as shown below. This solution draw its aspirations mainly from of RoBERTa-large (Liu et al. 2019) and partly used these papers:

J. Hartmann, et al. More than a Feeling: Accuracy and Application of Sentiment Analysis

Can, et al. Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data.

Problem Definition:

The aim is to train a classifier that can predict the sentiment of Persian tweets. Previous attempts to tackle this task involves labeling a sample of the whole dataset and then training a classifier to generalize the results, the classification phase fall into two categories based on the level of human supervision:

rule-based and hand-crafted feature extraction, However, such methods are not scalable to an overwhelming number of combinations in reviews. In fact, any variation in the dataset, demands a new round of feature engineering. Besides, this method performs poorly on the test sets with accuracy at about 60%, which in turn, hints again at the generalizability problem of hand-crafted based algorithms.
Using multilingual Bert based pre-trained models for the labelled sample. But these techniques demands a huge amount of unbiased and fare labelled data. As our experiments suggested, adaptive learning techniques using only pre-trained Bert based models labelled tweets results in poor quality sentiments.

In Addition, we did some experiments on using an English sentiment analysis corpus to train a model and then trough a multilingual word embedding embed Persian data and use the trained model. This model does not performs well.

Proposed Solution:

To tackle the obstacles mentioned above, the following solution is proposed as a means to reduce the need for both feature engineering and eschew reliance on labeled data. In short, we fine-tuned hugging-face model cardiffnlp/twitter-xlm-roberta-base for the downstream task of sentiment analysis on Persian tweets using a balanced labeled dataset. Then, exploiting the idea of proxy learning and by using this fine-tuned embedding model, we implement an LSTM classification head and trained it on a large labeled corpus of Persian Tweets using adaptive learning. In addition to maintainability and speed-ups, our model shows competitive results, as discussed bellow. This section can be split into three broad sections:

Fine-Tuning Dataset

It contains 4,500 tweets extracted using the twitter api . These tweets have been annotated (sad, meh, happy) and they can be used to detect sentiment. This dataset is balanced and each class consists of equal number of tweets.
text: The text of a tweet target: The sentiment of a tweet (sad, meh, happy).

A sample of the dataset is as follows:

Fine tuning XML-T

To make a domain specific sentiment analysis embedding, we fine-tuned cardiffnlp/twitter-xlm-roberta-base by using the state-of-the-art approach (adjusting the whole model to the downstream task via a linear classification head). The model converged after 4 days on an M1 MacBook Pro, and the hyper-parameters were:

Learning rate = 1e-4

# Train epochs = 35

Warm-up steps = 500

Weight decay = 0.01

Adaptive Training and Proxy Learning

The main task of this pipeline is tweet classification, with the main difference being the use of an adapted technique. In short, we freeze the fine-tuned LM and then only train an LSTM classification head on its checkpoints, thus reduce the memory usage and increase speed. In this way, we introduce a Proxy Learning approach in which we first adjust our embedding model for the task of sentiment classification, and then solve a proxy problem of classical deep learning with an LSTM classifier. The dataset that we train LSTM on it contains 450,000 balanced tweets that where labeled using translation methods to-and-from English corpus. The hyper-parameters of LSTM were:

Learning rate = 2e-5

# Train epochs = 350

Warm-up steps = 0

Hidden Layers = 100

Bias = True

Embedding dim = 384

Output dim = 3

Optimizer = Adam

Loss function = Cross Entropy

Inference & Testing

The results of the error analysis for different phases are as follows, note that these results are analysis of errors with regard to different metrics for test set:

Fine-tuning Phase

	Precision	Recall	F1-score	support
sad	0.91290	0.88938	0.9009	300
meh	0.93083	0.94859	0.9396	300
happy	0.88145	0.89971	0.8904	300

accuracy			0.9103	900
Macro average	0.88145	0.92415	0.9022	900
Weighted average	0.88145	0.92415	0.9022	900

Adaptive Proxy Learning

	Precision	Recall	F1-score	support
sad	0.83126	0.81190	0.8214	30,000
meh	0.84919	0.87111	0.86	30,000
happy	0.79981	0.82223	0.8108	30,000

accuracy			0.8307	90,000
Macro average	0.8268	0.83508	0.8309	90,000
Weighted average	0.8268	0.83508	0.8309	90,000

Results and Analysis

In this section, we seek to answer three research questions (RQ) when dealing with noisy, short, and unsolicited reviews: RQ1. Does fine-tuning XLM-RoBERTa on Persian political tweets, improves the results in a statistically significant way? And RQ2. Does Adaptive Proxy Learning reduce memory cost and yet do not underperform significantly?

Based on the above tables, we can answer RQ1, and RQ2. As can be seen, the performance of the model on the sample we labeled outperforms previous approaches. As we can see, this method significantly improve our results.

Name		Name	Last commit message	Last commit date
Latest commit History 1,166 Commits
twitter_sentiment		twitter_sentiment
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
Sentiment Analysis Report.pdf		Sentiment Analysis Report.pdf
main.py		main.py
persian.txt		persian.txt
requirements.txt		requirements.txt
twitter-sentiment-bert.log		twitter-sentiment-bert.log

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

twitter_sentiment

twitter_sentiment

.gitattributes

.gitattributes

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

Sentiment Analysis Report.pdf

Sentiment Analysis Report.pdf

main.py

main.py

persian.txt

persian.txt

requirements.txt

requirements.txt

twitter-sentiment-bert.log

twitter-sentiment-bert.log

Repository files navigation

Twitter Seniment Analysis Using XML-T

Problem Definition:

Proposed Solution:

Fine-Tuning Dataset

Fine tuning XML-T

Adaptive Training and Proxy Learning

Inference & Testing

Fine-tuning Phase

Adaptive Proxy Learning

Results and Analysis

About

Releases

Packages

Languages

License

MohammadForouhesh/twitter-sentiment-bert

Folders and files

Latest commit

History

Repository files navigation

Twitter Seniment Analysis Using XML-T

Problem Definition:

Proposed Solution:

Fine-Tuning Dataset

Fine tuning XML-T

Adaptive Training and Proxy Learning

Inference & Testing

Fine-tuning Phase

Adaptive Proxy Learning

Results and Analysis

About

Resources

License

Stars

Watchers

Forks

Languages