<a href="https://colab.research.google.com/github/Korede2001/Capstone/blob/main/Nigerian_ISP_Sentiment_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Nigerian ISP Sentiment Analysis Pipeline (18/06/2021)
-First Build-

**`Objectives:`**
- Set up a working sentiment analysis pipeline using the basic functionality that off-the-shelf sentiment analysis libraries provide
- Build familiarity with the Twitter API
- Diagnose issues with publicly available sentiment analyzers when working with Nigerian data.

**`Key Findings:`**
- Accounting for Pidgin English and slang is essential to classifying tweets correctly.
- We need a way to pinpoint the target/subject of sentiment-bearing words in a tweet, to avoid misattributing positive/negative words.
- How do we handle the connotations of sentences, e.g. "Get IPNX!"? This will be an interesting problem to tackle, and I expect such connotative sentences to come up very frequently.

### 1. Library Importation

In [3]:
#Import relevant libraries
import tweepy
import pandas as pd
import numpy as np
import textblob

### 2. Setting up & connecting to the API

In [None]:
#Import the twitter credentials stored in a separate file
%run ./twitter_credentials 
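
The contents of `twitter_credentials` are not shown here. As one possible alternative to a separate credentials file, the four values could be read from environment variables. The sketch below is only an illustration: `load_credentials`, the `TWITTER_` prefix and the environment variable names are assumptions, not something the notebook or tweepy defines.

```python
import os

# Names match the variables used in the authentication cell of this notebook.
REQUIRED_KEYS = ("api_key", "api_secret_key", "access_token", "access_token_secret")


def load_credentials(environ=os.environ, prefix="TWITTER_"):
    """Collect the four credentials from a mapping such as os.environ.

    Hypothetical helper: looks up e.g. TWITTER_API_KEY for "api_key" and
    raises KeyError listing every variable that is missing or empty.
    """
    credentials = {}
    missing = []
    for key in REQUIRED_KEYS:
        env_name = prefix + key.upper()
        value = environ.get(env_name)
        if value:
            credentials[key] = value
        else:
            missing.append(env_name)
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return credentials


# Demonstration with a stand-in mapping instead of the real environment
example_env = {
    "TWITTER_API_KEY": "demo-key",
    "TWITTER_API_SECRET_KEY": "demo-secret",
    "TWITTER_ACCESS_TOKEN": "demo-token",
    "TWITTER_ACCESS_TOKEN_SECRET": "demo-token-secret",
}
creds = load_credentials(example_env)
print(sorted(creds))
```

The returned dictionary could then supply the arguments that the authentication cell passes to tweepy.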

In [4]:
#Create the authentication object
auth = tweepy.OAuthHandler(api_key,api_secret_key)

#Set the access token and access token secret
auth.set_access_token(access_token,access_token_secret)

#Create the API object
api = tweepy.API(auth)

#wait_on_rate_limit = True

### 2. Extracting the Tweets
#### Specifying the tweet search parameters

Note: The code in this section is commented out to avoid rerunning the queries and using up my limited tweet quota.

In [17]:
#Specify Lagos state geocode
#lagos_geocode = "6.5244,3.3792,500km"

#Specify the development environment (needed to access the full archive)
#dev_env = 'prod'


##SPECTRANET ISP

#Tweets containing 'spectranet' and exclude tweets from the official ISP Twitter handle
#These will be narrowed down to tweets in Lagos using the geocode
#spectranet_query = 'spectranet -from:spectranet_NG'

##IPNX ISP

#Tweets containing 'IPNX' and exclude tweets from the official ISP Twitter handles
#place argument narrows to Lagos, Nigeria
#IPNX_query = 'IPNX -from:ipNXTweet -from:IpnxSupport place:Lagos'
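
Both query strings above follow the same pattern: a keyword, one `-from:` exclusion per official handle, and an optional `place:` filter. A minimal sketch of a helper that assembles them is shown here; `build_query` is a hypothetical name, not part of tweepy.

```python
def build_query(keyword, exclude_handles=(), place=None):
    """Build a search query string in the same shape as the queries above.

    Hypothetical helper, not a tweepy function.
    """
    parts = [keyword]
    # Exclude tweets sent from each official handle
    parts.extend(f"-from:{handle}" for handle in exclude_handles)
    # Optionally narrow the results to a named place
    if place is not None:
        parts.append(f"place:{place}")
    return " ".join(parts)


spectranet_query = build_query("spectranet", ["spectranet_NG"])
IPNX_query = build_query("IPNX", ["ipNXTweet", "IpnxSupport"], place="Lagos")
print(spectranet_query)
print(IPNX_query)
```

Adding another ISP would then only need its keyword and its list of official handles.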


#### Fetching the tweets

Full-archive search (which lets me retrieve tweets from previous years) actually works! The only catch is that I am limited to 5k tweets per month, but that limit can definitely be worked with.

In [18]:
#Get tweets for Spectranet ISP. Defaults to 15 tweets
#spectranet_tweets = api.search(q = spectranet_query, geocode = lagos_geocode)

#Full archive search for IPNX tweets (This actually works!)
#IPNX_tweets = api.search_full_archive(dev_env, IPNX_query, fromDate = '202001010000', toDate='202012312359')
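
The `fromDate` and `toDate` strings in the commented-out call use a `YYYYMMDDHHmm` layout. As a small sketch, they can be generated from `datetime` objects instead of being typed by hand; `to_archive_timestamp` is a hypothetical helper name.

```python
from datetime import datetime


def to_archive_timestamp(moment):
    """Format a datetime as YYYYMMDDHHmm (hypothetical helper)."""
    return moment.strftime("%Y%m%d%H%M")


# Reproduce the two date strings used in the full-archive call above
from_date = to_archive_timestamp(datetime(2020, 1, 1, 0, 0))
to_date = to_archive_timestamp(datetime(2020, 12, 31, 23, 59))
print(from_date, to_date)
```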

#### Sample texts from the tweets

**Spectranet:**

for num,tweet in enumerate(spectranet_tweets[:5]):
    print(num+1, '-' ,tweet.text + '\n')

**IPNX:**

In [None]:
for num,tweet in enumerate(IPNX_tweets[:6]):
    print(num+1, '-' ,tweet.text + '\n')

1 - @adefola09 They said they don’t have coverage at my side oo... thinking of getting ipnx

2 - @Olufems Yes. IPNX

3 - @fkabudu ipNX has never disgraced me, if they’re available in your area check them out.

4 - Here we go, Public School Students In Oyo Now Have 24hrs Access To Internet! @seyiamakinde @thecableng @ipNXTweet… https://t.co/pe8eAhU3bD

5 - @tundealuko Still wondering how IPNX said we used 330gb in 7 days .... average of 35-50gb per day

6 - Tizeti , pay for 1 month enjoy 5 days disconnected for 7days then reconnected for 3 then disconnected for 4days the… https://t.co/dhUEd71g4x



#### Compiling the tweets

#Getting the relevant properties from the tweets 
tweets = [{'Time':tweet.created_at, 'Subject':'Spectranet', 'Text':tweet.text,
          'Coordinates':tweet.coordinates, 'Place': tweet.place, 'Source':tweet.source
          } for tweet in spectranet_tweets]

#Add the IPNX tweets to tweets list
tweets.extend([{'Time':tweet.created_at, 'Subject':'IPNX', 'Text':tweet.text,
                'Coordinates':tweet.coordinates, 'Place': tweet.place, 'Source':tweet.source
          } for tweet in IPNX_tweets])

df = pd.DataFrame(tweets)
#Write the index too, so that read_csv(..., index_col=0) below restores it
df.to_csv('isp_tweets.csv')
df.head()

In [None]:
#Convert to CSV to save current tweets obtained from the API
#df.to_csv('./isp_tweets.csv')

#df = pd.read_csv('isp_tweets.csv')

In [9]:
df = pd.read_csv('isp_tweets.csv', index_col=0)

In [10]:
df[-6:-1]

Unnamed: 0,Time,Subject,Text,Coordinates,Place,Source
38,2020-06-12 14:27:48,IPNX,@FurtherMaf F*ck Tizeti IPNX Way!!!!!!,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone
39,2020-06-11 00:42:40,IPNX,"@SheriphSkills Ipnx, the best. I don’t even us...",,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone
40,2020-06-09 23:23:13,IPNX,@kelonline Ipnx oh,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android
41,2020-05-07 15:28:07,IPNX,Ipnx no dey fall hand https://t.co/vnFbqwYAup,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone
42,2020-03-28 20:36:39,IPNX,@MissIFY_ Ipnx maybe.,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone


In [20]:
#Make a copy of the dataframe to perform cleaning on
df2 = df.iloc[:28,:].copy()

### 3. Data Cleaning & Preprocessing

In [21]:
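#regex=True is passed explicitly because newer pandas versions treat the
#pattern as a literal string by default

#Remove links
df2['Text'] = df2['Text'].str.replace(r'https?:/+\S+','',regex=True)

#Remove the @ symbol and newlines
df2['Text'] = df2['Text'].str.replace(r'@|\n','',regex=True)

#Remove hashtags
df2['Text'] = df2['Text'].str.replace(r'#\w+','',regex=True)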

for tweet in df2['Text']:
  print(tweet,'\n')

Spectranet_NG stop tormenting my life with mails and text messages from you. abeg now! You have one job, fix your…  

Spectranet_NG Una No get level o...😑 

wohley_jnr iSlimfit Spectranet_NG Well it’s only a matter of time before they join. VPN Dey, we move 🚀 

Spectranet finally joined the ban 😂😂 but VPN to the rescue 🚀 

eesawa chico_shamsz Actually I’m using wifi(Spectranet). It’s like they only banned it via network providers, bec…  

Glo don receive am ooo. I no dey fit access twitter direct again. Na Spectranet I dey use direct. God knows if dem…  

Spectranet needs to understand its no turning back for me. lol 

Who can get me spectranet please? 

How did my Spectranet reset itself and how can I fix this without finding my LAN cable ? 🤦 

FBN_help Spectranet_NG Done 

Spectranet_NG Sent you a DM. Kindly reply. Thanks. 

MTNNG GloWorld AirtelNigeria Spectranet_NG  

adefola09 They said they don’t have coverage at my side oo... thinking of getting ipnx 

Olufems Yes. IPNX 

fkabud
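
The three cleaning steps can also be tried on a single string without a DataFrame. The sketch below applies the same three patterns with Python's `re` module; `clean_tweet` is a hypothetical helper, and like the notebook cell it does not strip leftover whitespace.

```python
import re

# Same patterns as the cleaning cell above
LINK_PATTERN = re.compile(r"https?:/+\S+")
AT_OR_NEWLINE_PATTERN = re.compile(r"@|\n")
HASHTAG_PATTERN = re.compile(r"#\w+")


def clean_tweet(text):
    """Remove links, '@' symbols, newlines and hashtags from one tweet."""
    text = LINK_PATTERN.sub("", text)
    text = AT_OR_NEWLINE_PATTERN.sub("", text)
    text = HASHTAG_PATTERN.sub("", text)
    return text


sample = "Ipnx no dey fall hand https://t.co/vnFbqwYAup"
cleaned = clean_tweet(sample)
print(repr(cleaned))
```

Note that only the `@` symbol is removed, so handles such as `Spectranet_NG` remain in the text as ordinary words.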



### 3. Sentiment Analysis

In [22]:
#We are not interested in how objective/subjective the sentence is so we focus
#on polarity
df2["Polarity"] = df2['Text'].apply(lambda x: textblob.TextBlob(x).sentiment.polarity)

In [23]:
df2

Unnamed: 0,Time,Subject,Text,Coordinates,Place,Source,Polarity
0,2021-06-12 19:01:33,Spectranet,Spectranet_NG stop tormenting my life with mai...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Mac,0.0
1,2021-06-05 20:20:39,Spectranet,Spectranet_NG Una No get level o...😑,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0
2,2021-06-05 11:13:33,Spectranet,wohley_jnr iSlimfit Spectranet_NG Well it’s on...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0
3,2021-06-05 11:12:06,Spectranet,Spectranet finally joined the ban 😂😂 but VPN t...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0
4,2021-06-05 09:46:44,Spectranet,eesawa chico_shamsz Actually I’m using wifi(Sp...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0
5,2021-06-05 05:10:10,Spectranet,Glo don receive am ooo. I no dey fit access tw...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.2
6,2021-06-04 21:02:41,Spectranet,Spectranet needs to understand its no turning ...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Mac,0.4
7,2021-06-01 15:00:10,Spectranet,Who can get me spectranet please?,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0
8,2021-06-01 11:40:36,Spectranet,How did my Spectranet reset itself and how can...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0
9,2021-06-01 10:37:56,Spectranet,FBN_help Spectranet_NG Done,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0


In [24]:
def getSentiment(score):

  """
  Return the sentiment (positive, neutral or negative) based on the polarity score
  """

  if score < 0:
    return 'Negative'
  elif score == 0:
    return 'Neutral'
  else:
    return 'Positive'

df2['Sentiment'] = df2['Polarity'].apply(getSentiment)

In [28]:
df2

Unnamed: 0,Time,Subject,Text,Coordinates,Place,Source,Polarity,Sentiment
0,2021-06-12 19:01:33,Spectranet,Spectranet_NG stop tormenting my life with mai...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Mac,0.0,Neutral
1,2021-06-05 20:20:39,Spectranet,Spectranet_NG Una No get level o...😑,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral
2,2021-06-05 11:13:33,Spectranet,wohley_jnr iSlimfit Spectranet_NG Well it’s on...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral
3,2021-06-05 11:12:06,Spectranet,Spectranet finally joined the ban 😂😂 but VPN t...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral
4,2021-06-05 09:46:44,Spectranet,eesawa chico_shamsz Actually I’m using wifi(Sp...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral
5,2021-06-05 05:10:10,Spectranet,Glo don receive am ooo. I no dey fit access tw...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.2,Positive
6,2021-06-04 21:02:41,Spectranet,Spectranet needs to understand its no turning ...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Mac,0.4,Positive
7,2021-06-01 15:00:10,Spectranet,Who can get me spectranet please?,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0,Neutral
8,2021-06-01 11:40:36,Spectranet,How did my Spectranet reset itself and how can...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0,Neutral
9,2021-06-01 10:37:56,Spectranet,FBN_help Spectranet_NG Done,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral


### 4. Validating Sentiment Results Manually

I went through each tweet and assigned a sentiment label myself. Comparing these manual labels with TextBlob's output should help diagnose where the sentiment analyzer struggles and suggest areas for improvement.

In [29]:
my_sentiment_opinions = ['Negative','Negative','Neutral','Neutral','Neutral',
                         'Neutral','Neutral','Positive','Neutral','Neutral',
                         'Neutral','Neutral','Neutral','Neutral','Positive',
                         'Neutral','Negative','Neutral','Positive','Positive',
                         'Positive','Positive','Positive','Positive','Negative',
                         'Positive','Neutral','Neutral']

In [30]:
for opin,text in zip(my_sentiment_opinions,df2['Text']):
  print(opin,':',text)

Negative : Spectranet_NG stop tormenting my life with mails and text messages from you. abeg now! You have one job, fix your… 
Negative : Spectranet_NG Una No get level o...😑
Neutral : wohley_jnr iSlimfit Spectranet_NG Well it’s only a matter of time before they join. VPN Dey, we move 🚀
Neutral : Spectranet finally joined the ban 😂😂 but VPN to the rescue 🚀
Neutral : eesawa chico_shamsz Actually I’m using wifi(Spectranet). It’s like they only banned it via network providers, bec… 
Neutral : Glo don receive am ooo. I no dey fit access twitter direct again. Na Spectranet I dey use direct. God knows if dem… 
Neutral : Spectranet needs to understand its no turning back for me. lol
Positive : Who can get me spectranet please?
Neutral : How did my Spectranet reset itself and how can I fix this without finding my LAN cable ? 🤦
Neutral : FBN_help Spectranet_NG Done
Neutral : Spectranet_NG Sent you a DM. Kindly reply. Thanks.
Neutral : MTNNG GloWorld AirtelNigeria Spectranet_NG 
Neutral : adefol

In [31]:
#Add my sentiment opinions to the dataframe. This will be used for comparison
df2['My_Sentiment'] = my_sentiment_opinions

In [32]:
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.metrics import classification_report, confusion_matrix, ConfusionMatrixDisplay

#### Looking in-depth at classifications

In [34]:
print(classification_report(df2['My_Sentiment'],df2['Sentiment']))

              precision    recall  f1-score   support

    Negative       1.00      0.25      0.40         4
     Neutral       0.56      0.67      0.61        15
    Positive       0.44      0.44      0.44         9

    accuracy                           0.54        28
   macro avg       0.67      0.45      0.48        28
weighted avg       0.58      0.54      0.52        28
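
To make the report easier to read, the sketch below recomputes its per-class figures from a confusion matrix. The counts are an inference: they are chosen to be consistent with the support, precision and recall values printed above and were not output by the notebook itself. `per_class_metrics` is a hypothetical helper, not a scikit-learn function.

```python
LABELS = ["Negative", "Neutral", "Positive"]

# Outer keys: manual labels. Inner keys: TextBlob labels.
# ASSUMPTION: counts inferred from the classification report, not notebook output.
CONFUSION = {
    "Negative": {"Negative": 1, "Neutral": 3, "Positive": 0},
    "Neutral": {"Negative": 0, "Neutral": 10, "Positive": 5},
    "Positive": {"Negative": 0, "Neutral": 5, "Positive": 4},
}


def per_class_metrics(confusion, label):
    """Return (precision, recall, f1) for one label."""
    true_positive = confusion[label][label]
    # Everything TextBlob assigned to this label, whatever the manual label
    predicted_total = sum(confusion[actual][label] for actual in confusion)
    # Everything manually assigned to this label
    actual_total = sum(confusion[label].values())
    precision = true_positive / predicted_total if predicted_total else 0.0
    recall = true_positive / actual_total if actual_total else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


total = sum(sum(row.values()) for row in CONFUSION.values())
accuracy = sum(CONFUSION[label][label] for label in LABELS) / total

for label in LABELS:
    precision, recall, f1 = per_class_metrics(CONFUSION, label)
    print(f"{label:>8}  precision={precision:.2f}  recall={recall:.2f}  f1={f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Read this way, the perfect Negative precision comes from a single predicted Negative tweet, while three of the four manually labelled Negative tweets were scored as Neutral.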



#### Looking at a dataframe of the subsets
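
The cell that builds `sentiment_subsets` is not shown in this notebook export (the cell numbers jump from 34 to 107). Judging by the tables that follow, each subset groups the tweets by the manual label. A minimal sketch of one way such a structure could be assembled is given here, using plain dictionaries in place of DataFrame rows; `build_sentiment_subsets` and the exact layout are assumptions.

```python
def build_sentiment_subsets(rows, label_key="My_Sentiment"):
    """Group rows by manual label into {'negative': {'subset': [...]}, ...}.

    Hypothetical reconstruction; the original cell is not available.
    """
    subsets = {}
    for row in rows:
        key = row[label_key].lower()
        subsets.setdefault(key, {"subset": []})["subset"].append(row)
    return subsets


# A few rows taken from the tables in this notebook
rows = [
    {"Text": "kelonline Ipnx oh", "Sentiment": "Neutral", "My_Sentiment": "Negative"},
    {"Text": "Who can get me spectranet please?", "Sentiment": "Neutral", "My_Sentiment": "Positive"},
    {"Text": "FBN_help Spectranet_NG Done", "Sentiment": "Neutral", "My_Sentiment": "Neutral"},
    {"Text": "Ipnx no dey fall hand", "Sentiment": "Neutral", "My_Sentiment": "Positive"},
]
sentiment_subsets = build_sentiment_subsets(rows)
for row in sentiment_subsets["positive"]["subset"]:
    print(row["Text"])
```

With the DataFrame itself, the equivalent would presumably be a boolean filter on the `My_Sentiment` column for each label.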

**Negative Tweets**

In [107]:
sentiment_subsets['negative']['subset']

Unnamed: 0,Time,Subject,Text,Coordinates,Place,Source,Polarity,Sentiment,My_Sentiment
0,2021-06-12 19:01:33,Spectranet,Spectranet_NG stop tormenting my life with mai...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Mac,0.0,Neutral,Negative
1,2021-06-05 20:20:39,Spectranet,Spectranet_NG Una No get level o...😑,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Negative
16,2020-09-11 11:59:40,IPNX,tundealuko Still wondering how IPNX said we us...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,-0.15,Negative,Negative
24,2020-06-09 23:23:13,IPNX,kelonline Ipnx oh,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0,Neutral,Negative


TextBlob appears to fail here because of the language. These tweets are written in Nigerian slang and Pidgin English, so it is unsurprising that a sentiment analyzer built solely on English struggles to evaluate them accurately: three of the four tweets I labelled Negative were scored 0.0 and classified as Neutral.

**Positive Tweets**

In [108]:
sentiment_subsets['positive']['subset']

Unnamed: 0,Time,Subject,Text,Coordinates,Place,Source,Polarity,Sentiment,My_Sentiment
7,2021-06-01 15:00:10,Spectranet,Who can get me spectranet please?,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0,Neutral,Positive
14,2020-11-11 14:30:40,IPNX,"fkabudu ipNX has never disgraced me, if they’r...",,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.4,Positive,Positive
18,2020-08-13 17:01:33,IPNX,IPNX finally in this estate. Ope o,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Positive
19,2020-06-26 13:24:04,IPNX,Lol glo nah war na... ipnx is the best,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.9,Positive,Positive
20,2020-06-25 21:37:17,IPNX,itstopsss Ah... Use ipnx,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0,Neutral,Positive
21,2020-06-21 21:08:02,IPNX,MrAdeWest Ah sucks. They’re really good. Or tr...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.2,Positive,Positive
22,2020-06-12 14:27:48,IPNX,FurtherMaf F*ck Tizeti IPNX Way!!!!!!,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Positive
23,2020-06-11 00:42:40,IPNX,"SheriphSkills Ipnx, the best. I don’t even use...",,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,1.0,Positive,Positive
25,2020-05-07 15:28:07,IPNX,Ipnx no dey fall hand,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Positive


Here TextBlob misclassifies because it does not account for the connotations of sentences that, taken literally, carry little or no sentiment. For example, "Who can get me spectranet please?" implies that Spectranet is good and desirable. TextBlob also misclassifies because of Pidgin English, e.g. "Ipnx no dey fall hand", which translates to "IPNX does not disappoint".
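
One possible direction for the Pidgin problem is a small phrase lexicon that overrides the English-only score when a known phrase appears. The sketch below is only an illustration: the two phrases come from tweets in this notebook, but the scores of +1.0 and -1.0, the function names and the override rule are all assumptions. `fallback_polarity` stands in for the polarity TextBlob would return.

```python
# Hypothetical, hand-written phrase list. A real lexicon would need far
# more coverage, and the scores here are assumed, not measured.
PIDGIN_PHRASES = {
    "no dey fall hand": 1.0,   # "does not disappoint"
    "no get level": -1.0,      # from a tweet labelled Negative in the manual review
}


def pidgin_polarity(text, phrases=PIDGIN_PHRASES):
    """Return the mean score of matched phrases, or None if none match."""
    lowered = text.lower()
    scores = [score for phrase, score in phrases.items() if phrase in lowered]
    if not scores:
        return None
    return sum(scores) / len(scores)


def combined_polarity(text, fallback_polarity):
    """Prefer the Pidgin lexicon score; otherwise keep the fallback score."""
    override = pidgin_polarity(text)
    return fallback_polarity if override is None else override


print(combined_polarity("Ipnx no dey fall hand", 0.0))
print(combined_polarity("Spectranet_NG Una No get level o...", 0.0))
print(combined_polarity("Who can get me spectranet please?", 0.0))
```

The last tweet is left at its fallback score, since a phrase lexicon does nothing for the connotation problem described above.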

**Neutral Tweets**

In [109]:
sentiment_subsets['neutral']['subset']

Unnamed: 0,Time,Subject,Text,Coordinates,Place,Source,Polarity,Sentiment,My_Sentiment
2,2021-06-05 11:13:33,Spectranet,wohley_jnr iSlimfit Spectranet_NG Well it’s on...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Neutral
3,2021-06-05 11:12:06,Spectranet,Spectranet finally joined the ban 😂😂 but VPN t...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Neutral
4,2021-06-05 09:46:44,Spectranet,eesawa chico_shamsz Actually I’m using wifi(Sp...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Neutral
5,2021-06-05 05:10:10,Spectranet,Glo don receive am ooo. I no dey fit access tw...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.2,Positive,Neutral
6,2021-06-04 21:02:41,Spectranet,Spectranet needs to understand its no turning ...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Mac,0.4,Positive,Neutral
8,2021-06-01 11:40:36,Spectranet,How did my Spectranet reset itself and how can...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0,Neutral,Neutral
9,2021-06-01 10:37:56,Spectranet,FBN_help Spectranet_NG Done,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Neutral
10,2021-05-29 12:06:29,Spectranet,Spectranet_NG Sent you a DM. Kindly reply. Tha...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.4,Positive,Neutral
11,2021-05-23 09:08:05,Spectranet,MTNNG GloWorld AirtelNigeria Spectranet_NG,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for Android,0.0,Neutral,Neutral
12,2020-11-20 01:13:42,Spectranet,adefola09 They said they don’t have coverage a...,,Place(_api=<tweepy.api.API object at 0x7fe0862...,Twitter for iPhone,0.0,Neutral,Neutral


Here TextBlob misclassifies because of misattribution. Some sentences contain positive words, e.g. 'happy' in "Happy mother's day" or 'kindly' in "Kindly reply", that do not relate to the ISP but still raise the polarity score. Again, it also misclassifies because of Pidgin English.
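
A crude way to reduce misattribution would be to score only the sentences that actually mention the ISP. The sketch below shows that filtering step on its own. It is a simple sentence-level heuristic and not a full target-aware sentiment method, and `sentences_mentioning` is a hypothetical helper.

```python
import re

# Split after ., ! or ? when followed by whitespace
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def sentences_mentioning(text, target):
    """Return only the sentences that mention the target (case-insensitive)."""
    sentences = [s for s in SENTENCE_SPLIT.split(text) if s]
    return [s for s in sentences if target.lower() in s.lower()]


tweet = "Spectranet_NG Sent you a DM. Kindly reply. Thanks."
relevant = sentences_mentioning(tweet, "spectranet")
print(relevant)
```

Here "Kindly reply." and "Thanks." are dropped before scoring, so their positive words would no longer be credited to Spectranet. Tweets that never name the ISP in the sentence carrying the sentiment would still need a different approach.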

## Conclusion

The sentiment analyzer can definitely be improved upon. Here are some important issues to look into:

- Accounting for Pidgin English and slang is essential to classifying tweets correctly.

- We need a way to pinpoint the target/subject of sentiment-bearing words in a tweet, to avoid misattributing positive/negative words.

- How do we handle the connotations of sentences, e.g. "Get IPNX!"? This will be an interesting problem to tackle, and I expect such connotative sentences to come up very frequently.