I will perform basic sentiment analysis on collected COVID-19-related Twitter data, using the TextBlob sentiment analysis tool.
This tool uses a lexicon (i.e. a dictionary of words and their respective sentiment scores) and a rule-based approach to classify text as negative, neutral or positive.
However, because sentiment is calculated from the polarity score of each word in the text, classification errors can occur. Such errors may arise because the tool does not understand language traits such as sarcasm, or variations in how the same word is used.
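
To make the word-by-word idea concrete, here is a minimal sketch of a lexicon-based scorer. It is not TextBlob's implementation: the lexicon `TOY_LEXICON`, its scores and the helper `toy_polarity` are all made up for illustration. It simply averages the scores of the words it recognises, which is enough to show how a sarcastic sentence can come out as positive.

```python
import re

# Hypothetical toy lexicon (word -> polarity in [-1, 1]); NOT TextBlob's real lexicon.
TOY_LEXICON = {"great": 0.8, "good": 0.7, "love": 0.5, "bad": -0.7, "terrible": -1.0}

def toy_polarity(text):
    """Average the lexicon scores of the known words in `text` (0.0 if none are known)."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

# A plainly negative sentence gets a negative score
print(toy_polarity("The service was terrible"))

# A sarcastic complaint scores as positive, because only 'great' and 'love' are counted
print(toy_polarity("Oh great, another power cut. I love it."))
```

The second sentence is a complaint, yet the scorer only sees two positive words, so the result is above zero. Real lexicon tools add rules on top of this (for negation and intensifiers, for example), but the underlying limitation is the same.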

In [None]:

# Install the vaderSentiment package before importing it
!pip install vaderSentiment

import pandas as pd
import csv
import re
import numpy as np
import plotly.express as px
from plotly.offline import init_notebook_mode

from textblob import TextBlob
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting vaderSentiment
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl (125 kB)
Installing collected packages: vaderSentiment
Successfully installed vaderSentiment-3.3.2


We now read our partially cleaned Twitter data into a pandas dataframe.

In [None]:
# Read in the data and put it into a dataframe
df = pd.read_csv("/content/Datafram_CSV.csv")

# Have a quick look at the dataframe
df

Unnamed: 0.1,Unnamed: 0,tweet_id_str,date_time,location,tweet_text
0,0,1529612798241959943,2022-05-25 23:58:04,Kenya,Water melon for Ksh18.00 @ kg in Bura tana. Dm...
1,1,1529612784828305408,2022-05-25 23:58:01,,"true democracy is practised in china, people h..."
2,2,1529611298098618368,2022-05-25 23:52:06,Ethiopia,The Amhara regional state government has engag...
3,3,1529610794933305345,2022-05-25 23:50:06,Kenya,"pigs for Ksh6,000.00 @ piglet in Nakuru. high ..."
4,4,1529609511404818433,2022-05-25 23:45:00,Ethiopia,‘Amhara security forces launched a campaign to...
...,...,...,...,...,...
995,995,1529321937398386693,2022-05-25 04:42:17,Kenya,Kenya Kwanza fringe parties protest UDA's six-...
996,996,1529321879781003264,2022-05-25 04:42:04,Kenya,onions for Ksh40.00 @ kg in Karatina. \nOrder ...
997,997,1529320692038176768,2022-05-25 04:37:21,Kenya,Kenya needs an economy recovery plan. We need ...
998,998,1529320434805645319,2022-05-25 04:36:19,"Eldoret, Kenya","A diligent, honest and transparent leader who ..."


In [None]:
#Create a function to get the subjectivity
def getSubjectivity(text):
    return TextBlob(text).sentiment.subjectivity

#Create a function to get the polarity
def getPolarity(text):
    return TextBlob(text).sentiment.polarity

Let's try out these TextBlob tools by entering any text you'd like - try different examples, i.e. text which you think has positive or negative sentiment and see if TextBlob can get it right! Does punctuation make a difference? How does it handle sarcasm or common text slang such as 'lol'?

In [None]:
your_text = 'this is so cool'
getPolarity(your_text), getSubjectivity(your_text)

(0.35, 0.65)

In [None]:

df['subjectivity'] = df['tweet_text'].apply(getSubjectivity)
df['polarity'] = df['tweet_text'].apply(getPolarity)

df

Unnamed: 0.1,Unnamed: 0,tweet_id_str,date_time,location,tweet_text,subjectivity,polarity
0,0,1529612798241959943,2022-05-25 23:58:04,Kenya,Water melon for Ksh18.00 @ kg in Bura tana. Dm...,0.500000,0.500000
1,1,1529612784828305408,2022-05-25 23:58:01,,"true democracy is practised in china, people h...",0.330556,0.133333
2,2,1529611298098618368,2022-05-25 23:52:06,Ethiopia,The Amhara regional state government has engag...,0.000000,0.000000
3,3,1529610794933305345,2022-05-25 23:50:06,Kenya,"pigs for Ksh6,000.00 @ piglet in Nakuru. high ...",0.520000,0.330000
4,4,1529609511404818433,2022-05-25 23:45:00,Ethiopia,‘Amhara security forces launched a campaign to...,0.000000,-0.100000
...,...,...,...,...,...,...,...
995,995,1529321937398386693,2022-05-25 04:42:17,Kenya,Kenya Kwanza fringe parties protest UDA's six-...,0.000000,0.000000
996,996,1529321879781003264,2022-05-25 04:42:04,Kenya,onions for Ksh40.00 @ kg in Karatina. \nOrder ...,0.000000,0.000000
997,997,1529320692038176768,2022-05-25 04:37:21,Kenya,Kenya needs an economy recovery plan. We need ...,0.300000,1.000000
998,998,1529320434805645319,2022-05-25 04:36:19,"Eldoret, Kenya","A diligent, honest and transparent leader who ...",0.550000,0.400000


Next, let's create a function to add a sentiment label to each tweet, based on its polarity score.

In [None]:
# Create a function to label positive, neutral and negative tweets

def get_sentiment_label(score):
    if score < 0:
        return 'Negative'
    elif score == 0:
        return 'Neutral'
    else:
        return 'Positive'  
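
Note that comparing against exactly zero means a tweet with a polarity of 0.01 is labelled 'Positive'. As a possible alternative (not what this notebook uses), the sketch below treats scores close to zero as neutral. The function name `get_sentiment_label_banded` and the default `threshold` of 0.05 are assumptions chosen for illustration, not values recommended by TextBlob.

```python
def get_sentiment_label_banded(score, threshold=0.05):
    """Label a polarity score, treating anything within +/- threshold of zero as neutral.

    `threshold` is a hypothetical cut-off; tune it for your own data.
    """
    if score <= -threshold:
        return 'Negative'
    if score >= threshold:
        return 'Positive'
    return 'Neutral'

# A barely positive score is now neutral; clear scores keep their labels
print(get_sentiment_label_banded(0.01))
print(get_sentiment_label_banded(-0.1))
print(get_sentiment_label_banded(0.5))
```

A wider band gives more 'Neutral' labels and fewer borderline calls, so the choice depends on how much weight you want to give weak signals.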

In [None]:
# Apply the get_sentiment_label function to the polarity column
# and add the sentiment results as a new column in our dataframe

df['TBsentiment'] = df['polarity'].apply(get_sentiment_label)
df

Unnamed: 0.1,Unnamed: 0,tweet_id_str,date_time,location,tweet_text,subjectivity,polarity,TBsentiment
0,0,1529612798241959943,2022-05-25 23:58:04,Kenya,Water melon for Ksh18.00 @ kg in Bura tana. Dm...,0.500000,0.500000,Positive
1,1,1529612784828305408,2022-05-25 23:58:01,,"true democracy is practised in china, people h...",0.330556,0.133333,Positive
2,2,1529611298098618368,2022-05-25 23:52:06,Ethiopia,The Amhara regional state government has engag...,0.000000,0.000000,Neutral
3,3,1529610794933305345,2022-05-25 23:50:06,Kenya,"pigs for Ksh6,000.00 @ piglet in Nakuru. high ...",0.520000,0.330000,Positive
4,4,1529609511404818433,2022-05-25 23:45:00,Ethiopia,‘Amhara security forces launched a campaign to...,0.000000,-0.100000,Negative
...,...,...,...,...,...,...,...,...
995,995,1529321937398386693,2022-05-25 04:42:17,Kenya,Kenya Kwanza fringe parties protest UDA's six-...,0.000000,0.000000,Neutral
996,996,1529321879781003264,2022-05-25 04:42:04,Kenya,onions for Ksh40.00 @ kg in Karatina. \nOrder ...,0.000000,0.000000,Neutral
997,997,1529320692038176768,2022-05-25 04:37:21,Kenya,Kenya needs an economy recovery plan. We need ...,0.300000,1.000000,Positive
998,998,1529320434805645319,2022-05-25 04:36:19,"Eldoret, Kenya","A diligent, honest and transparent leader who ...",0.550000,0.400000,Positive


We can have a quick look at the sentiment distribution of the tweets as follows:

In [None]:
df['TBsentiment'].value_counts()

Neutral     414
Positive    393
Negative    193
Name: TBsentiment, dtype: int64
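
Raw counts are easier to compare as shares of the total. The sketch below does this in plain Python using the three counts printed above; on the dataframe itself, `df['TBsentiment'].value_counts(normalize=True)` should give the equivalent proportions directly.

```python
# Counts copied from the value_counts() output above
counts = {"Neutral": 414, "Positive": 393, "Negative": 193}

total = sum(counts.values())

# Percentage of tweets in each sentiment class, rounded to one decimal place
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
print(total)
print(shares)
```

With these counts, roughly four in ten tweets are labelled neutral and about one in five negative.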

Let's have a closer look at the tweets which TextBlob has classified as the most positive and most negative, to check the accuracy of the assigned sentiment. To do this, we will first sort all the tweets by polarity in descending order, i.e. from the most positive tweets to the most negative.

In [None]:
#We sort the tweets by their polarity value and put the sorted tweets into a new dataframe 

sorted_df = df.sort_values(by=['polarity'], ascending=False)

Next, let's print out the 15 most negative tweets, which are now the last 15 tweets in the sorted dataframe. Do these tweets really have a negative sentiment?

In [None]:
#Print out the text from the last 15 tweets in the sorted dataframe

for i, tweet in enumerate(sorted_df.tail(15)['tweet_text']):
    print(i+1, tweet, '\n')

1 broiler chicken for Ksh300.00 @ per Kg in Nairobi. location- Dagoreti-Nairobi.
Order : https://t.co/rZJ6bMjgFx
Use App : https://t.co/L7r2SzlhJl https://t.co/T2HcZOK9AC 

2 There are no virgins in kenya because UHURUTO economy has fucked us all😪😪 

3 Kienyeji Chicken for Ksh800.00 @ Bird in Nairobi. Quality Premium Meat
Order : https://t.co/NiFukn0RJT
Use App : https://t.co/L7r2Sz4eHl
AD: https://t.co/DdPensGWLk https://t.co/TK3Nfhl6Hk 

4 Kienyenji Chicken for Ksh500.00 @ Piece in Thika. John
Order : https://t.co/w6UFu1QxZW
Use App : https://t.co/L7r2Sz4eHl https://t.co/QlOx7HbTqC 

5 Chicken Feeders for Ksh400.00 @ Per Feeder in Imara Daima. 
Order : https://t.co/oZrn2uopVZ
Use App : https://t.co/L7r2Sz4eHl https://t.co/ONwUrQHWcv 

6 broiller chicken for Ksh500.00 @ bird in meru. 
Order : https://t.co/iqjI0SIOrm
Use App : https://t.co/L7r2Sz4eHl https://t.co/YkvFiYYSOZ 

7 Chicken Layer Battery Cages for Ksh29,999.00 @ Per Equipment in Nairobi. 
Order : https://t.co/zwpxsIK2iP
Use

In [None]:
init_notebook_mode(connected=True)
