## Import Libraries

All libraries required to create a model capable of classifying tweets by category are imported.  These are described in the comments below.

In [77]:
## Pickle allows Python objects to be saved for later use, and retrieved
import pickle

## Joblib loads the saved sklearn model (sklearn.externals.joblib was removed in scikit-learn 0.23)
import joblib

## Pandas required to manipulate data into user-friendly data structure
import pandas as pd

## NumPy required to calculate summary statistics (e.g. mean sentiment)
import numpy as np

## Set Pandas Display Options

Pandas display settings are chosen to ensure that the full contents of each column can be seen.

In [61]:
## Set width of pandas dataframe to ensure entire Tweet is displayed
pd.set_option('display.max_colwidth', 3000)
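
If the wider column setting is only wanted for a single display, pandas also provides `pd.option_context`, which restores the previous value when the block ends. The following is a minimal sketch; `demo_df` is a made-up dataframe used purely for illustration and is not part of this workbook.

```python
import pandas as pd

# Hypothetical one-row dataframe holding a long string (illustration only)
demo_df = pd.DataFrame({'tweet': ['x' * 200]})

# Record the current setting so the change can be checked afterwards
previous_width = pd.get_option('display.max_colwidth')

# The wider setting applies only inside the "with" block
with pd.option_context('display.max_colwidth', 3000):
    inside_width = pd.get_option('display.max_colwidth')
    print(demo_df)

# On exit, the setting returns to its previous value
restored_width = pd.get_option('display.max_colwidth')
```

A global `pd.set_option` call, as used in this workbook, is simpler when every dataframe should be shown in full.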

## Create Colour Class to Format Text

A class is created to enable text to be printed in bold and with an underline.  This is used later in this workbook.

In [106]:
## Define class to enable text to be emboldened and underlined
class colour:
   BOLD = '\033[1m'
   UNDERLINE = '\033[4m'
   END = '\033[0m'
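
As a brief illustration of how these codes are combined, the sketch below wraps a heading in the bold and underline codes and then resets the formatting. The `emphasise` helper is hypothetical and is not used elsewhere in this workbook; the class is repeated so that the example runs on its own.

```python
class colour:
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'
    END = '\033[0m'


def emphasise(text):
    """Wrap text in the bold and underline codes, then reset formatting."""
    return colour.BOLD + colour.UNDERLINE + text + colour.END


heading = emphasise('Proportion of Tweets by Category')

# In a terminal or notebook that supports ANSI codes this prints in bold, underlined text
print(heading)
```

Without the trailing `colour.END`, any text printed afterwards would keep the bold and underline styling.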

## Load Saved Model

The logistic regression model, built during **[Step 4 - Model Data](https://github.com/isobeldaley/categorising-tweets/blob/master/Step%204%20-%20Model%20Data.ipynb)**, is loaded using joblib.

In [62]:
## Load logistic regression model using joblib
model = joblib.load('model.pkl')

## Load Test Data

Next, to understand how this model performs when classifying tweets, the test data (obtained during **[Step 4 - Model Data](https://github.com/isobeldaley/categorising-tweets/blob/master/Step%204%20-%20Model%20Data.ipynb)**) is loaded using Pickle.

In [63]:
## Load X_test, y_test and X_test vectorized using tf-idf vectorization in step 4
## (each file is opened in a "with" block so that it is closed after loading)
with open("X_test.pkl", "rb") as f:
    X_test = pickle.load(f)
with open("y_test.pkl", "rb") as f:
    y_test = pickle.load(f)
with open("tf_idf_X_test.pkl", "rb") as f:
    tf_idf_X_test = pickle.load(f)
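
Since several pickled objects are loaded in the same way, the loading step could be wrapped in a small helper. The sketch below is one possible approach; `load_pickle` and the demo file name are hypothetical, and the round trip uses a throwaway list rather than the real test data.

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Load a single pickled object and close the file afterwards."""
    with open(path, 'rb') as handle:
        return pickle.load(handle)


# Round trip with a temporary file (illustration only)
demo_labels = ['contract', 'device', 'network']
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo_labels.pkl')
    with open(demo_path, 'wb') as handle:
        pickle.dump(demo_labels, handle)
    restored_labels = load_pickle(demo_path)

print(restored_labels)
```

Pickle files can execute arbitrary code when loaded, so only files from a trusted source (such as those saved in the earlier steps of this project) should be opened this way.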

In addition, the table containing the original tweet and sentiment rating is loaded (this was created in **[Step 2 - Scrub Data](https://github.com/isobeldaley/categorising-tweets/blob/master/Step%202%20-%20Scrub%20Data.ipynb)**).

In [64]:
## Load table containing original tweet and sentiment rating
df = pd.read_pickle('cleaned_labelled_tweets')

## Select a Random Sample of Tweets

To understand how well this model works, a sample of 10 tweets is selected from the test data.  Predicted categories are then generated for these tweets.

In [65]:
## Select a sample of 10 tweets from the test dataset (NB: What is shown below is the cleaned/lemmatized tweet)
X_test[215:225]

9460                                         customer warned charged 35.65 turning airplane mode appox 4mins whilst visiting u last week absolute disgrace justify charge expect retain loyal customer overcharged
10443    ordered new phone yesterday received email confirming order say full swing order number.. paid next day delivery still heard nothing check ive tried call customer service attempt live chat nothing work
6881                                                                                                                                            messing around yet wifi tv box lit like xmas tree fml thank god 4g
5787                                                                                                                                                                                            yes ive moved voxi
778                                                                                                                                                         
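
The slice `[215:225]` takes ten consecutive rows of the test data. If a genuinely random draw were preferred, one option is `Series.sample` with a fixed `random_state`, so that the same tweets are selected on every run. The sketch below assumes the test tweets are held in a pandas Series (as the output above suggests); `X_demo` and its contents are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for X_test: cleaned tweets keyed by their original row label
X_demo = pd.Series(
    ['tweet a', 'tweet b', 'tweet c', 'tweet d', 'tweet e', 'tweet f'],
    index=[9460, 10443, 6881, 5787, 778, 9793],
)

# Draw 3 tweets at random; random_state makes the draw repeatable
sample = X_demo.sample(n=3, random_state=42)

# Row positions of the sampled tweets within X_demo, which could be used to
# pick the matching rows from a positionally aligned matrix (e.g. tf-idf vectors)
positions = X_demo.index.get_indexer(sample.index)

print(sample)
print(positions)
```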

To provide a complete comparison, the original tweet and its TextBlob sentiment rating are extracted:

In [66]:
## Select original tweet and sentiment for each of the tweets above
df = df.loc[[9460,10443,6881,5787,778,9793,5627,929,5779,5614],['original_tweet','sentiment']]
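
The row labels above are typed out by hand, so they would need updating if a different sample were chosen. A possible alternative is to take the labels directly from the sampled tweets. The sketch below assumes the sample is a pandas Series that shares its index labels with the table of original tweets; `X_demo` and `df_demo` are made-up stand-ins.

```python
import pandas as pd

# Hypothetical stand-ins for X_test and the table of original tweets
X_demo = pd.Series(['a', 'b', 'c', 'd'], index=[9460, 10443, 6881, 5787])
df_demo = pd.DataFrame(
    {'original_tweet': ['A', 'B', 'C', 'D'],
     'sentiment': [0.1, 0.0, -0.2, 0.5],
     'other_column': [1, 2, 3, 4]},
    index=[5787, 6881, 10443, 9460],
)

# Labels of the rows in a positional slice of the test data
selected_labels = X_demo.iloc[1:3].index

# Look up the same labels in the table, keeping only the two columns of interest
selected = df_demo.loc[selected_labels, ['original_tweet', 'sentiment']]

print(selected)
```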

## Process Tweets Through Model

Next, the model is used to generate predictions for each of the selected test tweets.

In [67]:
## Isolate the vectorized test tweets for which predictions will be made
tweets_to_process = tf_idf_X_test[215:225]

In [68]:
## Generate predictions for each of the test tweets using the model
predictions = model.predict(tweets_to_process)

## Compare Predicted Category & Sentiment to Original Tweet

Before comparing the original tweet with the predicted category and sentiment, the results are combined in a single dataframe.

In [71]:
## Combine predicted category/sentiment and original tweet in single dataframe
results_df = pd.DataFrame({'Tweet': df['original_tweet'], 'Actual Category': y_test[215:225], 
                   'Predicted Category': predictions, 
                   'Predicted Sentiment':df['sentiment']})

Then, the results are printed for comparison.

In [72]:
## Display results 
results_df

| Index | Tweet | Actual Category | Predicted Category | Predicted Sentiment |
|---|---|---|---|---|
| 9460 | O2 customers be warned!! I was charged £35.65 for turning airplane mode off for appox 4mins whilst visiting the US last week!! 😡 @O2 This is an absolute disgrace - how can you justify these charges and expect to retain loyal customers? #o2 #overcharged | contract | customer service | 0.177778 |
| 10443 | @O2 ordered a new phone yesterday received an email confirming the order to say it’s in full swing with an order number.. i paid for next day delivery and still have heard nothing can you check this for me? I’ve tried to call customer services and attempt live chat nothing works! | device | device | 0.164205 |
| 6881 | @virginmedia messing around yet again. No wifi. TV box lit up like a Xmas tree. Fml. Thank god for @ThreeUK 4G | network | network | 0.0 |
| 5787 | @EE Yes, I’ve moved to Voxi. | other | other | 0.0 |
| 778 | Would @VodafoneUK treat customers the same? https://t.co/2xIaJudZbu | other | customer service | 0.0 |
| 9793 | @O2 hey, I’m struggling with your signal. I continuously have 4G yet never can get on apps, download or send messages. Have to connect to WiFi to work. | network | network | 0.0 |
| 5627 | @DannyStradomsky @EE Ahh ok dan!! Thought so 👍🏻 | other | other | 0.78125 |
| 929 | @VodafoneUK Vodafone has appalling customer service. Am trying to unlock my phone with code U sent me which ISN'T WORKING. I have tried all routes to make direct contact and it is impossible! Your rip off rates and 'can't do' approach delivers the worst/most frustrating customer experience. | customer service | customer service | -0.370833 |
| 5779 | @William31567 @EE @netflix @actionfrauduk @CumbrianRambler @glocky9 @ChrisJCoates @Catstycam @theJeremyVine @PaulKingstonITV @WalksBritain @walkingbookscom @BBC_Cumbria @Mounta1n_Mike I googled the number at the bottom of the email there are some slimy so-and-so’s https://t.co/4mKr2ibhMh | other | other | 0.0 |
| 5614 | @EE Sorted now. It was 16 hours ago I messaged u | customer service | customer service | 0.0 |


The performance of the model appears to be reasonable.  On 8 out of 10 occasions the category is correctly predicted. 

There are two tweets that the model labelled "customer service" where the human annotator chose "contract" and "other" respectively.  However, on closer reading of these tweets, it could be argued that either or both could have been labelled "customer service".
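
The figures quoted above can be checked with a short calculation. The sketch below copies the actual and predicted categories from the results table into plain lists (in the same row order) and computes the share of matching labels; it is a worked check rather than part of the original workbook.

```python
# Actual and predicted categories, copied row by row from the results table
actual = ['contract', 'device', 'network', 'other', 'other',
          'network', 'other', 'customer service', 'other', 'customer service']
predicted = ['customer service', 'device', 'network', 'other', 'customer service',
             'network', 'other', 'customer service', 'other', 'customer service']

# Proportion of tweets where the predicted category equals the actual category
matches = [a == p for a, p in zip(actual, predicted)]
accuracy = sum(matches) / len(matches)

# (actual, predicted) pairs for the tweets the model got wrong
mismatches = [(a, p) for a, p in zip(actual, predicted) if a != p]

print(accuracy)    # 0.8
print(mismatches)  # [('contract', 'customer service'), ('other', 'customer service')]
```

With only 10 tweets, this figure is an illustration rather than a reliable estimate of model accuracy.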

Performance of the sentiment rating is more difficult to judge.  The first tweet is clearly negative, and yet has a mildly positive sentiment rating of 0.18.  Meanwhile, the majority of tweets (6 out of 10) have been given a neutral rating of 0.0.  More work may be needed to fine-tune this element of the project.

## On-the-Pulse Measure of Customer Satisfaction

At the outset, it was stated that an objective of this project was to provide mobile networks with an on-the-pulse measure of customer satisfaction.  To understand how this would work, measures are provided for the demo data above.

### Subject

First, a dataframe is created showing the proportion of tweets relating to each subject category.

In [99]:
## Create a dataframe to show the proportion of tweets relating to each category
categories = pd.DataFrame({'Proportion of Tweets (%)':results_df['Predicted Category'].value_counts(normalize=True)})
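## Round before converting to int, as astype(int) alone truncates (e.g. 28.999... would become 28)
categories['Proportion of Tweets (%)'] = (categories['Proportion of Tweets (%)']*100).round().astype(int)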

In [104]:
## Display the proportion of tweets relating to each category
print(colour.BOLD+colour.UNDERLINE +'Proportion of Tweets by Category'+colour.END)
categories

[1m[4mProportion of Tweets by Category[0m


| Predicted Category | Proportion of Tweets (%) |
|---|---|
| customer service | 40 |
| other | 30 |
| network | 20 |
| device | 10 |


### Sentiment

Next, the mean sentiment for all tweets is assessed:

In [107]:
## Print the mean sentiment for all tweets
print(colour.BOLD+colour.UNDERLINE +'Mean Sentiment of Tweets'+colour.END)
round(np.mean(results_df['Predicted Sentiment']),2)

[1m[4mMean Sentiment of Tweets[0m


0.08
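
The two measures could also be combined, for example by looking at mean sentiment within each predicted category. The sketch below rebuilds the relevant columns of the results table by hand (`demo_results` is a stand-in for `results_df`) and groups by category. With so few tweets per category, the resulting values are illustrative only.

```python
import pandas as pd

# Predicted category and sentiment, copied row by row from the results table
demo_results = pd.DataFrame({
    'Predicted Category': ['customer service', 'device', 'network', 'other',
                           'customer service', 'network', 'other',
                           'customer service', 'other', 'customer service'],
    'Predicted Sentiment': [0.177778, 0.164205, 0.0, 0.0, 0.0,
                            0.0, 0.78125, -0.370833, 0.0, 0.0],
})

# Mean sentiment and number of tweets for each predicted category
sentiment_by_category = (
    demo_results.groupby('Predicted Category')['Predicted Sentiment']
    .agg(['mean', 'count'])
)

print(sentiment_by_category.round(2))
```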

## Recommendations

Having demonstrated its potential, we recommend that this model is used in the following ways:

1. To pre-categorize tweets.  This will enable all questions/issues to be directed to the correct person/department and handled efficiently.

2. To provide a measure of the proportion of tweets relating to each subject category.  This will help mobile network operators to quickly identify and address specific issues (e.g. persistent problems with customer service, or poor network coverage).

3. To provide an on-the-pulse measure of customer satisfaction, by considering the distribution of sentiment ratings.

This approach will enable more effective management of customer communications via social media, and rapid assessment of customer sentiment/issues.
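
To illustrate what "considering the distribution of sentiment ratings" might look like in practice, the sketch below groups the sentiment scores from the results table earlier in this workbook into negative, neutral and positive bands. The cut-off of 0.05 and the `label_sentiment` helper are assumptions made for this example, not values used elsewhere in the project.

```python
from collections import Counter

# Sentiment scores copied from the results table earlier in this workbook
scores = [0.177778, 0.164205, 0.0, 0.0, 0.0, 0.0, 0.78125, -0.370833, 0.0, 0.0]

# Hypothetical cut-off: scores within +/- 0.05 of zero are treated as neutral
NEUTRAL_BAND = 0.05


def label_sentiment(score, band=NEUTRAL_BAND):
    """Assign a score to a negative, neutral or positive band."""
    if score > band:
        return 'positive'
    if score < -band:
        return 'negative'
    return 'neutral'


# Count the tweets in each band and convert the counts to shares
distribution = Counter(label_sentiment(score) for score in scores)
shares = {label: count / len(scores) for label, count in distribution.items()}

print(distribution)
print(shares)
```

A distribution of this kind shows how many tweets sit at each end of the scale, which a single mean value can hide.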


## Future Work

There are a number of areas that would merit further investigation in the future.  These are detailed below.


### Labelling & Categorisation

For this project, we had access to 4,377 labelled tweets.  It is highly likely that model performance could be improved if the labelled dataset were significantly expanded.  Given the appropriate time and resources, it is recommended that this is done prior to deploying the model.

Moreover, it was noted that a large proportion of tweets were categorised as "Other".  This suggests that more detailed categorisation may be needed.  

### Tweets & Replies

This model considered tweets individually.  However, tweets are often part of a longer conversation between a service agent and a customer.  It may be useful to broaden the model to consider original tweets alongside any replies.  This would help operators understand how many tweets and how much time were required to resolve an issue/question, which is a crucial component of customer service.
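
As a rough indication of how such conversation-level measures might be calculated, the sketch below groups tweets by conversation and reports the number of tweets and the time between the first and last message. The records are entirely made up, and the `conversation_id` and `sent_at` fields are hypothetical; the data used in this project does not include them.

```python
from collections import defaultdict
from datetime import datetime

# Made-up records for illustration; field names are assumptions
tweets = [
    {'conversation_id': 'c1', 'sent_at': datetime(2019, 1, 1, 9, 0)},
    {'conversation_id': 'c1', 'sent_at': datetime(2019, 1, 1, 9, 30)},
    {'conversation_id': 'c1', 'sent_at': datetime(2019, 1, 1, 11, 0)},
    {'conversation_id': 'c2', 'sent_at': datetime(2019, 1, 2, 14, 0)},
]

# Collect the timestamps belonging to each conversation
timestamps = defaultdict(list)
for tweet in tweets:
    timestamps[tweet['conversation_id']].append(tweet['sent_at'])

# Number of tweets and elapsed time (first to last message) per conversation
summary = {
    conv_id: {'tweets': len(times), 'elapsed': max(times) - min(times)}
    for conv_id, times in timestamps.items()
}

for conv_id, stats in summary.items():
    print(conv_id, stats['tweets'], stats['elapsed'])
```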

### Sentiment Analysis

Finally, it was noted that the sentiment analysis was often inaccurate.  An alternative means of modelling sentiment should be considered to improve performance of this element of the model. 