# Twitter Data Analysis
This notebook performs an initial analysis of downloaded tweet data.

In [32]:
# import necessary libraries
#%matplotlib inline
import json
import pandas as pd
import pprint
import re
from textblob import TextBlob
from datetime import datetime
from IPython.display import display


## Step 1: Read Downloaded Twitter Files
Read tweets from downloaded files into a list called **tweets_data**

In [2]:
# a list of files
file_paths = ['./data/search_data_07.txt','./data/search_data_06.txt','./data/search_data_05.txt','./data/search_data_04.txt','./data/search_data_03.txt',
              './data/search_data_02.txt', './data/search_data_01.txt']
# initialize data set
tweets_data = []
# loop on each file
for fp in file_paths:
    # open a file to read
    with open(fp,"r") as tweet_file:
        # read tweets into tweets_data
        for line in tweet_file:
            if len(line) > 10:
                try:
                    tweet = json.loads(line)
                    tweets_data.append(tweet)
                except ValueError as err:
                    print(err)
                    break

# print the number of tweets read
print("{} tweets have been read successfully.".format(len(tweets_data)))

33115 tweets have been read successfully.
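
A hedged aside on the loop above: it stops reading a file at the first line that fails to parse. If the intent were instead to skip a bad line and keep reading, a minimal sketch could look like the following. The helper name `parse_json_lines` and the sample lines are hypothetical and not part of the original notebook.

```python
import io
import json


def parse_json_lines(lines, min_length=10):
    """Parse an iterable of JSON lines, skipping short or malformed ones.

    Returns (records, n_skipped). Hypothetical helper for illustration only.
    """
    records, n_skipped = [], 0
    for line in lines:
        # same length guard as the loop above: ignore blank or very short lines
        if len(line) <= min_length:
            continue
        try:
            records.append(json.loads(line))
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError
            n_skipped += 1
    return records, n_skipped


# made-up sample: two valid records around one truncated line
sample = io.StringIO(
    '{"id": 1, "text": "first tweet"}\n'
    '\n'
    '{"id": 2, "text": "truncated\n'
    '{"id": 3, "text": "third tweet"}\n'
)
records, n_skipped = parse_json_lines(sample)
print(len(records), "parsed,", n_skipped, "skipped")
```

Counting the skipped lines, rather than only printing the error, makes it easier to judge afterwards whether a file was badly damaged.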


The following cell displays one example of the loaded tweets: the first tweet in the list.

In [50]:
%%html 
<iframe border=0 frameborder=0 height=250 width=550 src="http://twitframe.com/show?url=https://twitter.com/USPSHelp/status/894922531827720193"></iframe>

The cell below pretty-prints the underlying data of that tweet.

In [52]:
pprint.pprint(tweets_data[0],depth = 10, indent=2)

{ 'contributors': None,
  'coordinates': None,
  'created_at': 'Tue Aug 08 14:05:33 +0000 2017',
  'entities': { 'hashtags': [],
                'symbols': [],
                'urls': [],
                'user_mentions': [ { 'id': 34010328,
                                     'id_str': '34010328',
                                     'indices': [0, 16],
                                     'name': 'KRYONCÈ👸🏾BABY!',
                                     'screen_name': 'LashezNLipstick'},
                                   { 'id': 386507775,
                                     'id_str': '386507775',
                                     'indices': [17, 22],
                                     'name': 'U.S. Postal Service',
                                     'screen_name': 'USPS'}]},
  'favorite_count': 0,
  'favorited': False,
  'geo': None,
  'id': 894922531827720193,
  'id_str': '894922531827720193',
  'in_reply_to_screen_name': 'LashezNLipstick',
  'in_reply_to_status_id': 89470314

## Step 2: Retrieve Features

Because each tweet contains a large amount of information, we picked only the features that are useful for an initial analysis. These features are retrieved and loaded into a pandas dataframe.
* **id** - unique identifier of a tweet
* **created_at** - time stamp of when a tweet was posted
* **text** - content of a tweet
* **country** - country from which the tweet was posted (when the tweet carries place information)
* **retweet_count** - number of times a tweet was retweeted
* **favorite_count** - number of times a tweet was liked
* **userId** - id of the user who posted the tweet
* **retweeted_status** - indicator of whether a tweet is itself a retweet: the id of the original tweet if it is, otherwise 0
* **userName** - screen name of the user who posted the tweet
* **followers_count** - number of followers of the user who posted the tweet

Note: the `created_at` time stamp looks like `'Fri Aug 04 20:37:50 +0000 2017'`. Only the date part is extracted from it, and the field is stored as `created_date` instead.
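
As a small illustration of the note above, the sketch below parses the example time stamp with the same format string used in the next cell. The constant name `TWITTER_TIME_FORMAT` is introduced here only for readability and is not part of the notebook.

```python
from datetime import datetime, date

# format string matching the `created_at` layout shown in the note above
TWITTER_TIME_FORMAT = '%a %b %d %H:%M:%S %z %Y'

raw = 'Fri Aug 04 20:37:50 +0000 2017'
parsed = datetime.strptime(raw, TWITTER_TIME_FORMAT)  # timezone-aware datetime
created_date = parsed.date()                          # drops time and offset
print(created_date)  # 2017-08-04
```

Because `.date()` discards the UTC offset, the resulting dates are UTC calendar days, not the local days of the tweet authors.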

In [53]:
df = pd.DataFrame()
df['id'] = list(map(lambda t : t['id'], tweets_data))
df['created_date'] = list(map(lambda t : datetime.strptime(t['created_at'], '%a %b %d %H:%M:%S %z %Y').date(), tweets_data))
df['text'] = list(map(lambda t : t['text'], tweets_data))
df['country'] = list(map(lambda t : t['place']['country'] if t['place'] is not None else 'None', tweets_data))
df['retweet_count'] = list(map(lambda t : t['retweet_count'], tweets_data))
df['favorite_count'] = list(map(lambda t : t['favorite_count'], tweets_data))
df['retweeted_status'] = list(map(lambda t : t['retweeted_status']['id'] if t.get('retweeted_status') is not None else 0, tweets_data))
df['userId'] = list(map(lambda t : t['user']['id'], tweets_data))
df['userName'] = list(map(lambda t : t['user']['screen_name'], tweets_data))
df['followers_count'] = list(map(lambda t : t['user']['followers_count'], tweets_data))

#show table sample
df.head()

| | id | created_date | text | country | retweet_count | favorite_count | retweeted_status | userId | userName | followers_count |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 894922531827720193 | 2017-08-08 | @LashezNLipstick @USPS Please see my DMs. ^JJS | | 0 | 0 | 0 | 2872571512 | USPSHelp | 40316 |
| 1 | 894922466858086400 | 2017-08-08 | #Ship #UPS! We will insure your #high #value #... | | 0 | 0 | 0 | 1127797604 | theupsstore6290 | 139 |
| 2 | 894921717230878720 | 2017-08-08 | From Homeless to the Army to Senior Recruiter:... | | 0 | 0 | 0 | 2805675691 | angies_jobs | 175 |
| 3 | 894921697727361027 | 2017-08-08 | Newborns in Crisis: @FedEx Operations Manager ... | | 0 | 0 | 0 | 2805675691 | angies_jobs | 175 |
| 4 | 894921608267038720 | 2017-08-08 | https://t.co/8hVYFf00TP \0/ Success \n@airchin... | | 0 | 0 | 0 | 2844932766 | Delphinusdelph | 917 |
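
The feature extraction above indexes nested fields directly, which works as long as every tweet has `user`, `place`, and the count fields. As a hedged sketch, the same flattening can be written defensively with `dict.get`. The helper `extract_features` is hypothetical, the sample tweet is trimmed to the fields it needs (its `place` value of `None` is an assumption), and `created_date` is omitted for brevity.

```python
def extract_features(tweet):
    """Flatten one tweet dict into the fields used in this notebook.

    Hypothetical helper: missing or None sub-objects fall back to defaults.
    """
    place = tweet.get('place') or {}
    retweeted = tweet.get('retweeted_status') or {}
    user = tweet.get('user') or {}
    return {
        'id': tweet.get('id'),
        'text': tweet.get('text'),
        'country': place.get('country', 'None'),
        'retweet_count': tweet.get('retweet_count', 0),
        'favorite_count': tweet.get('favorite_count', 0),
        # id of the original tweet when this is a retweet, otherwise 0
        'retweeted_status': retweeted.get('id', 0),
        'userId': user.get('id'),
        'userName': user.get('screen_name'),
        'followers_count': user.get('followers_count'),
    }


# trimmed sample based on the first row of the table above
sample_tweet = {
    'id': 894922531827720193,
    'text': '@LashezNLipstick @USPS Please see my DMs. ^JJS',
    'place': None,
    'retweet_count': 0,
    'favorite_count': 0,
    'user': {'id': 2872571512, 'screen_name': 'USPSHelp', 'followers_count': 40316},
}
row = extract_features(sample_tweet)
print(row['userName'], row['country'], row['retweeted_status'])
```

A list of such dicts can be passed straight to `pd.DataFrame(...)`, which builds all columns in one pass instead of one `map` per column.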


## Step 3: Do Analysis

We analyzed the data set to answer the following questions:

### Q1: How many tweets mentioned each of FedEx, UPS, DHL, and USPS every day?

To answer this question, we need to determine which company (or companies) each tweet relates to.

We define a function named `word_in_text(word, text)` that checks whether `word` can be found in `text`.

In [70]:
def word_in_text(word, text):
    '''Return 1 if any comma-separated term in `word` occurs in `text`, else 0.

    Matching is case-insensitive. Each term is used as a regex pattern and
    matched as a substring, so 'ups' also matches inside words like 'upset'.
    '''
    word = word.lower()
    text = text.lower()
    if any(re.search(w.strip(), text) for w in word.split(',')):
        return 1
    return 0
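
Since `word_in_text` matches substrings, a short term such as `UPS` can also fire on unrelated words. The sketch below shows one possible stricter variant that requires whole-word matches. The name `whole_word_in_text` is hypothetical, and this is not a drop-in replacement: it would change the counts reported later in this notebook.

```python
import re


def whole_word_in_text(word, text):
    """Return 1 if any comma-separated term appears in text as a whole word, else 0."""
    terms = [w.strip() for w in word.split(',') if w.strip()]
    for term in terms:
        # re.escape treats the term literally; \b anchors it at word boundaries
        pattern = r'\b' + re.escape(term) + r'\b'
        if re.search(pattern, text, flags=re.IGNORECASE):
            return 1
    return 0


print(whole_word_in_text('UPS', '#Ship #UPS! We will insure your package'))  # 1
print(whole_word_in_text('UPS', 'I am upset about my groups order'))         # 0
print(whole_word_in_text('FedEx, FDX', 'Shares of #FDX rose today'))         # 1
```

The trade-off is that whole-word matching no longer catches a company name embedded in a longer handle or hashtag, so which variant is preferable depends on how such tweets should be counted.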

Add four new columns, **`FedEx`**, **`UPS`**, **`DHL`**, and **`USPS`**, to the dataframe to flag whether each tweet relates to each of the companies.

In [55]:
df['FedEx'] = df['text'].apply(lambda t: word_in_text('FedEx', t))
df['UPS'] = df['text'].apply(lambda t: word_in_text('UPS', t))
df['DHL'] = df['text'].apply(lambda t: word_in_text('DHL', t))
df['USPS'] = df['text'].apply(lambda t: word_in_text('USPS', t))
#show the new dataframe
df.head()

| | id | created_date | text | country | retweet_count | favorite_count | retweeted_status | userId | userName | followers_count | FedEx | UPS | DHL | USPS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 894922531827720193 | 2017-08-08 | @LashezNLipstick @USPS Please see my DMs. ^JJS | | 0 | 0 | 0 | 2872571512 | USPSHelp | 40316 | 0 | 0 | 0 | 1 |
| 1 | 894922466858086400 | 2017-08-08 | #Ship #UPS! We will insure your #high #value #... | | 0 | 0 | 0 | 1127797604 | theupsstore6290 | 139 | 0 | 1 | 0 | 0 |
| 2 | 894921717230878720 | 2017-08-08 | From Homeless to the Army to Senior Recruiter:... | | 0 | 0 | 0 | 2805675691 | angies_jobs | 175 | 1 | 0 | 0 | 0 |
| 3 | 894921697727361027 | 2017-08-08 | Newborns in Crisis: @FedEx Operations Manager ... | | 0 | 0 | 0 | 2805675691 | angies_jobs | 175 | 1 | 0 | 0 | 0 |
| 4 | 894921608267038720 | 2017-08-08 | https://t.co/8hVYFf00TP \0/ Success \n@airchin... | | 0 | 0 | 0 | 2844932766 | Delphinusdelph | 917 | 1 | 0 | 0 | 0 |


The steps below give the number of tweets for each company and the total number of tweets on each day.
First, a new dataframe is built by selecting the fields **`'created_date', 'FedEx', 'UPS', 'DHL', 'USPS'`** and adding a new field **`Count`**, set to 1 for every row, to calculate the total number of tweets. Then a **`groupby - sum`** operation on **`created_date`** produces the daily totals.

In [56]:
# build a new dataframe
df_q01 = df.loc[:,['created_date', 'FedEx', 'UPS', 'DHL', 'USPS']]
df_q01['Count'] = 1
# get the result
twt_by_date = df_q01.groupby('created_date').sum()
twt_by_date

| created_date | FedEx | UPS | DHL | USPS | Count |
|---|---|---|---|---|---|
| 2017-07-21 | 664 | 944 | 38 | 551 | 2131 |
| 2017-07-22 | 496 | 962 | 46 | 582 | 1988 |
| 2017-07-23 | 452 | 349 | 16 | 335 | 1088 |
| 2017-07-24 | 567 | 762 | 33 | 508 | 1820 |
| 2017-07-25 | 562 | 1005 | 67 | 557 | 2081 |
| 2017-07-26 | 634 | 985 | 80 | 747 | 2342 |
| 2017-07-27 | 596 | 771 | 59 | 685 | 2062 |
| 2017-07-28 | 606 | 801 | 61 | 600 | 1967 |
| 2017-07-29 | 347 | 928 | 20 | 472 | 1736 |
| 2017-07-30 | 439 | 374 | 23 | 389 | 1216 |
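
To make the groupby-sum step concrete, here is a minimal pure-Python sketch of the same aggregation on a handful of made-up rows. It is only an illustration of what the pandas call computes, not how pandas implements it.

```python
from collections import defaultdict

# made-up rows: (created_date, FedEx, UPS, DHL, USPS) flags for five tweets
rows = [
    ('2017-07-21', 1, 0, 0, 0),
    ('2017-07-21', 0, 1, 0, 1),
    ('2017-07-21', 1, 1, 0, 0),
    ('2017-07-22', 0, 0, 1, 0),
    ('2017-07-22', 0, 0, 0, 0),
]
columns = ['FedEx', 'UPS', 'DHL', 'USPS', 'Count']

# one running total per day, every column starting at 0
totals = defaultdict(lambda: dict.fromkeys(columns, 0))
for created_date, *flags in rows:
    day = totals[created_date]
    for name, flag in zip(columns, flags):  # zip stops after the four flags
        day[name] += flag
    day['Count'] += 1  # every tweet counts once toward the daily total

for created_date in sorted(totals):
    print(created_date, totals[created_date])
```

A tweet that mentions two companies is counted in both company columns, and a tweet that mentions none is counted only in `Count`, so the company columns do not have to add up to `Count`. The first row of the table above shows this: 664 + 944 + 38 + 551 is larger than 2131.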


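With the table above, it is easy to get the **overall tweet distribution among the four companies**. The figures below are plotted with **Bokeh**, an interactive Python visualization library that targets modern web browsers. The figures can be embedded in a web page for dynamic visualization. Note that the `bokeh.charts` module used here was later removed from Bokeh and split out into the separate `bkcharts` package, so these cells need an older Bokeh release.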

In [7]:
twt_by_company = twt_by_date[['FedEx', 'UPS', 'DHL', 'USPS']].sum().rename_axis('Company').reset_index(name='Count')
twt_by_company= twt_by_company.sort_values(by='Count', ascending= False)
print(twt_by_company)

# import bokeh library
from bokeh.charts import Bar, output_notebook, show
from bokeh.charts.attributes import CatAttr
# output to notebook
output_notebook()

# plot grouped barcharts
p = Bar(data=twt_by_company, label=CatAttr(columns=['Company'], sort=False),  values='Count',  legend=None,
        plot_width=600, plot_height=300, title='Company Tweet Counts')
show(p)

  Company  Count
1     UPS  12910
0   FedEx  11092
3    USPS   9470
2     DHL    801


The code block below plots **daily total and company tweet counts**: the total tweet count per day is shown as a dashed line, and grouped bars show the tweet counts of the four companies on each day.

In [9]:
# import bokeh library
from bokeh.charts import Bar, Line, output_notebook, show
from bokeh.models.ranges import Range1d
from bokeh.models import ColumnDataSource
from bokeh.models.glyphs import Line as Line_glyph
from bokeh.models.markers import Square

# output to notebook
output_notebook()

# prepare plotting data
tot_twt_by_date = twt_by_date['Count'].reset_index()
tot_twt_by_date['created_date']=tot_twt_by_date['created_date'].apply(lambda x: str(x))

cmp_twt_by_date = twt_by_date[['FedEx', 'UPS', 'DHL', 'USPS']].sort_values(by=twt_by_date.index[0], axis=1, ascending = False)
cmp_twt_by_date =cmp_twt_by_date.stack().rename_axis(['created_date','Companies']).reset_index(name='Count')
cmp_twt_by_date['created_date']=cmp_twt_by_date['created_date'].apply(lambda x: str(x))

# plot grouped barcharts
p = Bar(data=cmp_twt_by_date, label = 'created_date',  values='Count', group = 'Companies', legend='top_right',
        color=['#ffcd00', '#4B1388', '#644117','#004A87' ],
        plot_width=900, plot_height=450, title='Daily Total and Company Tweet Counts')

# build a columndatasource for line
source = ColumnDataSource(tot_twt_by_date)
# create a line glyph object which references columns from source data
line = Line_glyph(x='created_date', y='Count', line_dash='dashed',line_color="#F46D43",line_alpha=0.6, line_width=2)
# add the glyph to the chart
p.add_glyph(source, line)

square = Square(x='created_date', y='Count', size=5, line_color="#F46D43", fill_color="white")
p.add_glyph(source, square)

# reset y range
p.y_range = Range1d(0, 2500)

# show plot
show(p)

Next, take a closer look at the daily volume of FedEx tweets.

In [58]:
# import bokeh library
from bokeh.plotting import figure, show, output_notebook
# Range1d is used below to set the y range
from bokeh.models import BoxAnnotation, DatetimeTickFormatter, Range1d
from math import pi
# output to notebook
output_notebook()

# prepare plotting data
fedex_daily_volume = twt_by_date['FedEx'].reset_index().rename(columns={'FedEx':'Count'})
x = fedex_daily_volume['created_date']
y = fedex_daily_volume['Count']


des = y.describe()
mean = des['mean']
std = des['std']
UL = mean +  1.9*std
LL = mean - 1.9*std

# build the figure
p = figure(plot_width=600, plot_height=400, 
           x_axis_type='datetime', 
           title='FedEx Tweet Daily Volume',
           x_axis_label ='Date',
          y_axis_label='Tweet Counts')

# add tweet count line 
p.line(x=x, y=y,line_color='#4B1388',line_alpha=0.6, line_width=2)
# add circle markers
p.circle(x=x, y=y,line_color='#4B1388',size=10, color="white", line_width=2)

# build control lines
low_box = BoxAnnotation(bottom= LL, top=LL )
mid_box = BoxAnnotation(bottom=LL, top=UL, fill_alpha=0.1, fill_color='green')
mean_line = BoxAnnotation(bottom=mean, top=mean, line_dash='dashed',line_color="black",line_width=2 )
high_box = BoxAnnotation(bottom=UL, top=UL )
# add control lines
p.add_layout(low_box)
p.add_layout(mid_box)
p.add_layout(high_box)
p.add_layout(mean_line)

# format datetime
p.xaxis.formatter=DatetimeTickFormatter(
        days=["%Y-%m-%d"]
    )
# set orientation
p.xaxis.major_label_orientation = pi/4
p.y_range = Range1d(0, des['max']+400)

# set grid 
p.xgrid[0].grid_line_alpha=0.3
p.ygrid[0].grid_line_alpha=0.3

# show results
show(p)
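
The shaded band in the chart above spans the mean plus or minus 1.9 standard deviations of the daily counts. As a hedged sketch of that calculation without pandas or Bokeh, the helpers below (`control_limits` and `out_of_band`, both hypothetical names) compute the band and list the values outside it. The sample uses only the ten FedEx counts shown in the table earlier, so its limits differ from those of the chart, which is computed over the full date range.

```python
from statistics import mean, stdev


def control_limits(counts, k=1.9):
    """Return (lower, center, upper) for a band of k sample standard deviations."""
    center = mean(counts)
    spread = stdev(counts)  # sample standard deviation (n - 1), as in describe()
    return center - k * spread, center, center + k * spread


def out_of_band(counts, k=1.9):
    """Return (index, value) pairs that fall outside the band."""
    lower, _, upper = control_limits(counts, k)
    return [(i, c) for i, c in enumerate(counts) if c < lower or c > upper]


# FedEx daily counts for 2017-07-21 through 2017-07-30, from the table earlier
fedex_counts = [664, 496, 452, 567, 562, 634, 596, 606, 347, 439]
lower, center, upper = control_limits(fedex_counts)
print("LL={:.1f}, mean={:.1f}, UL={:.1f}".format(lower, center, upper))
print(out_of_band(fedex_counts))

# made-up series with one obvious spike
print(out_of_band([10, 10, 10, 10, 10, 10, 10, 10, 10, 100]))
```

Listing the out-of-band days programmatically complements the visual check, which becomes hard to do by eye once the series grows longer.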


Some potential observations can be made:
1. With more than one thousand tweets per day in total, reading them all manually is impractical
2. The total tweet volume goes up and down in a cyclical pattern (fewer on weekends and more in the middle of the week)
3. Each individual company's tweet volume follows a similar cyclical pattern

However, on August 06 (Sunday) and August 07 (Monday), the volume of FedEx-related tweets almost doubled. Let's take a look at the tweets from those two days!

**NOTE: The analysis below is based only on FedEx tweets from those two days, but the same process and approach are applicable to other tweet datasets.**

### Q2: What happened on Aug 06, 2017 and Aug 07, 2017? What topics did people discuss about FedEx on those two days?
To answer this question, we can look at the most popular original tweets (those with the highest retweet counts) during the two days.

In [59]:
# select FedEx tweets from Aug 06 and Aug 07, 2017
is_aug0607 = df['created_date'].map(lambda x: (str(x)=='2017-08-06') | (str(x)=='2017-08-07'))
aug0607_FedEx_Tweets = df[is_aug0607 & (df['FedEx'] ==1)]
# keep original tweets only (retweeted_status == 0 means the tweet is not a retweet)
aug0607_FedEx_Original_Tweets =  aug0607_FedEx_Tweets[aug0607_FedEx_Tweets['retweeted_status'] == 0] 
# sort by retweet_count 
sorted_by_retweet_count = aug0607_FedEx_Original_Tweets.sort_values(by='retweet_count',ascending=False)
# print out the top 10 tweets
i = 0
for idx, row in sorted_by_retweet_count.head(10).iterrows():
    print("Tweet({0}) with author '{1}' (has {2} followers), date '{3}' and had been retweeted {4} times, has text: \n \t{5}".format(i+1, row['userName'], row['followers_count'], row['created_date'], row['retweet_count'], row['text']))
    print()
    i = i+1
    

Tweet(1) with author 'ShujaRabbani' (has 299505 followers), date '2017-08-06' and had been retweeted 611 times, has text: 
 	Hi @FedEx Dubai, your delivery is an absolute nightmare! Your drivers are lost &amp; confused! One delivery has been bouncing around for 5 days!

Tweet(2) with author 'h0t_p0ppy' (has 14126 followers), date '2017-08-06' and had been retweeted 16 times, has text: 
 	"im leaving on a jet plane" and i am NEVER going home @FedEx You are complicit in dolphin slavery 
#OpSeaWorld https://t.co/ZeunaDfiMT

Tweet(3) with author 'h0t_p0ppy' (has 14126 followers), date '2017-08-06' and had been retweeted 15 times, has text: 
 	#OpFunKill
Its Time for companies to be accountable for their actions. 
#FedEx profit from #extinction #FDX #NYSE https://t.co/Zp3n2SeamA

Tweet(4) with author 'h0t_p0ppy' (has 14126 followers), date '2017-08-06' and had been retweeted 12 times, has text: 
 	.@FedEx could save sharks in one simple yet powerful way: banning shark fin shipments~ Yet the

We can also calculate the most frequent keywords in these tweets using the code block below.

In [60]:
import string
from collections import Counter 
from nltk.tokenize import TweetTokenizer 
from nltk.corpus import stopwords

# define a pre-processing function
def process(text, tokenizer=TweetTokenizer(), stopwords=[]):
    """Process the text of a tweet:
    - Lowercase
    - Tokenize
    - Stopword removal
    - Digits removal
    
    Return: list of strings
    """ 
    text = text.lower()
    tokens = tokenizer.tokenize(text)
    return [tok for tok in tokens if tok not in stopwords and not tok.isdigit()] 

# define the tokenizer
tweet_tokenizer = TweetTokenizer() 
punct = list(string.punctuation)
# define stop words list
stopword_list = stopwords.words('english') + punct + ['rt','via', '...','…','hi','#fedex','#fdx','@fedex', '@fedexhelp','one']

# initialize the counter dict
tf = Counter()
# loop through all FedEx tweets (retweets included) from Aug 06 and Aug 07, 2017
for _, row in aug0607_FedEx_Tweets.iterrows():
    tokens = process(row['text'], tokenizer=tweet_tokenizer, stopwords=stopword_list)
    tf.update(tokens) 

# print result
print("Keyword : Count")
print("----------------")
for tag, count in tf.most_common(30): 
    print("{} : {}".format(tag, count))

Keyword : Count
----------------
delivery : 1250
lost : 611
drivers : 611
absolute : 603
confused : 602
nightmare : 601
bouncing : 601
@shujarabbani : 600
dubai : 600
@angelciraq214 : 296
@repdonbeyer : 296
@hillaryclinton : 296
@barackobama : 296
@va8thcddems : 296
@lowkell : 296
@timkaine : 296
@scooterocket : 281
tracking : 164
#opseaworld : 129
package : 105
stop : 91
#opfunkill : 86
time : 84
today : 80
system : 73
fedex : 69
get : 59
@dennyhamlin : 58
going : 57
#boycottfedex : 53
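
The keyword loop above runs over all FedEx tweets, retweets included, so the words of a heavily retweeted tweet are counted once per retweet. The sketch below illustrates that effect on a made-up mini corpus. It uses a simplified regex tokenizer as a stand-in for NLTK's `TweetTokenizer` and a tiny hand-written stopword set, both of which are assumptions made to keep the example dependency-free.

```python
import re
from collections import Counter

# simplified stand-in for TweetTokenizer: keeps @mentions, #hashtags and words
TOKEN_PATTERN = re.compile(r"[@#]?\w+")
# tiny made-up stopword set; the notebook uses NLTK's English list plus extras
STOPWORDS = {'rt', 'hi', 'your', 'is', 'an', 'my', 'the', 'a', '@fedex'}


def simple_tokens(text):
    """Lowercase, tokenize, and drop stopwords and pure digits."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [t for t in tokens if t not in STOPWORDS and not t.isdigit()]


# made-up mini corpus: one original tweet, two retweets of it, one other tweet
tweets = [
    "Hi @FedEx your delivery is an absolute nightmare!",
    "RT @someuser: Hi @FedEx your delivery is an absolute nightmare!",
    "RT @someuser: Hi @FedEx your delivery is an absolute nightmare!",
    "@FedEx where is my package? Tracking shows delivery for 5 days",
]

tf = Counter()
for text in tweets:
    tf.update(simple_tokens(text))

print(tf.most_common(3))
```

Here `absolute` and `nightmare` each reach a count of 3 purely through retweets, which mirrors how a single viral tweet can dominate the keyword list. Counting over original tweets only would measure distinct topics instead of reach.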


In [72]:
from bokeh.plotting import figure, show, output_notebook

output_notebook()

# prepare data
top_key_words = tf.most_common(30)
keyword_list = [kw[0] for kw in top_key_words[::-1]]
cout_list = [kw[1] for kw in top_key_words[::-1]]


p = figure(plot_width=400, plot_height=400, y_range = keyword_list, y_axis_label="Key Word", x_axis_label='Frequency')
p.hbar(y=keyword_list, height=0.5, left=0, right=cout_list, color="navy")

show(p)



### Q3: Among these tweets, which are positive and which are negative?
The following steps apply `textblob`, a sentiment analysis library built on top of NLTK, to classify the sentiment of the tweet text.

In [73]:
# Step 1: prepare dataset
aug0607_FedEx_Tweets = aug0607_FedEx_Tweets.reset_index()

# Step 2: clean up text for sentiment analysis
def clean_text(text):
    '''
    Utility function to clean tweet text by removing @mentions, links and
    special characters using simple regex statements.
    '''
    # raw string so that \w, \/ and \S reach the regex engine unchanged
    return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", text).split())
    
aug0607_FedEx_Tweets['Cleaned_Text'] = aug0607_FedEx_Tweets['text'].apply(lambda t: clean_text(t))

# Step 3: define and apply sentiment analysis
def get_tweet_sentiment(tweet):
    '''
    Utility function to classify sentiment of passed tweet text
    using textblob's sentiment method
    '''
    # create TextBlob object of passed tweet text
    analysis = TextBlob(clean_text(tweet))
    polarity = analysis.sentiment.polarity
    # map the polarity score to one of five sentiment labels
    if polarity > 0.4:
        return 'strong positive'
    elif polarity > 0.1:
        return 'weak positive'
    elif polarity > -0.1:
        return 'neutral'
    elif polarity > -0.4:
        return 'weak negative'
    else:
        return 'strong negative'
aug0607_FedEx_Tweets['sentiment'] = aug0607_FedEx_Tweets['Cleaned_Text'].apply(lambda t: get_tweet_sentiment(t))
aug0607_FedEx_Tweets.head(5)

| | index | id | created_date | text | country | retweet_count | favorite_count | retweeted_status | userId | userName | followers_count | FedEx | UPS | DHL | USPS | Cleaned_Text | sentiment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 812 | 894708162212876289 | 2017-08-07 | @AmazonHelp Thank you! I called @fedex and the... | | 0 | 0 | 0 | 35654011 | Ghelardini | 968 | 1 | 0 | 0 | 0 | Thank you I called and their system is down so... | weak negative |
| 1 | 815 | 894707533759303680 | 2017-08-07 | Apparently the @FedEx driver couldn't pick up ... | | 0 | 0 | 0 | 701509362074984449 | toridor324_ | 45 | 1 | 0 | 0 | 0 | Apparently the driver couldn t pick up my pack... | neutral |
| 2 | 818 | 894707177037889536 | 2017-08-07 | RT @jclaiborne311: @RyanAsh60 @CloydRivers @UP... | | 7 | 0 | 824791231930707968 | 760010266251780096 | MARGARETFlana18 | 1002 | 1 | 1 | 0 | 1 | RT I ve worked at UPS since I graduated HS and... | neutral |
| 3 | 819 | 894707082783596544 | 2017-08-07 | RT @RyanAsh60: @CloydRivers Sorry @UPS and @US... | | 56 | 0 | 824773244691501056 | 760010266251780096 | MARGARETFlana18 | 1002 | 1 | 1 | 0 | 1 | RT Sorry and but the next thing I ship will be... | weak negative |
| 4 | 838 | 894705128367939584 | 2017-08-07 | @Scooterocket @AngelCIraq214 @FedEx @RepDonBey... | | 0 | 0 | 0 | 4715692053 | 2016MikeWebbVA8 | 1434 | 1 | 0 | 0 | 0 | | neutral |
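
The five sentiment labels come from fixed cut-offs on the polarity score. The sketch below isolates that bucketing step as a pure function so the boundary behavior is easy to inspect. The function name `bucket_polarity` is hypothetical; the thresholds are the ones used in `get_tweet_sentiment` above.

```python
def bucket_polarity(polarity):
    """Map a polarity score in [-1, 1] to one of five sentiment labels."""
    if polarity > 0.4:
        return 'strong positive'
    elif polarity > 0.1:
        return 'weak positive'
    elif polarity > -0.1:
        return 'neutral'
    elif polarity > -0.4:
        return 'weak negative'
    return 'strong negative'


for score in (0.8, 0.4, 0.1, 0.0, -0.1, -0.4, -0.9):
    print("{:+.1f} -> {}".format(score, bucket_polarity(score)))
```

Note that the boundaries are not symmetric: a score of exactly 0.4 is labelled weak positive, while exactly -0.4 is labelled strong negative. This matters little for continuous scores but is worth knowing when comparing the positive and negative shares.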


In [74]:
# picking strong positive tweets from FedEx tweets
strng_pos_tweets = aug0607_FedEx_Tweets[aug0607_FedEx_Tweets['sentiment'] == 'strong positive']
# percentage of strong positive tweets
print("{0} Strong Positive tweets counts for {1:.2f}% of total {2} tweets".format(strng_pos_tweets.shape[0], \
        100*strng_pos_tweets.shape[0]/aug0607_FedEx_Tweets.shape[0], aug0607_FedEx_Tweets.shape[0]))

# picking weak positive tweets from FedEx tweets
weak_pos_tweets = aug0607_FedEx_Tweets[aug0607_FedEx_Tweets['sentiment'] == 'weak positive']
# percentage of weak positive tweets
print("{0} Weak Positive tweets counts for {1:.2f}% of total {2} tweets".format(weak_pos_tweets.shape[0], \
        100*weak_pos_tweets.shape[0]/aug0607_FedEx_Tweets.shape[0], aug0607_FedEx_Tweets.shape[0]))

# picking neutral tweets from FedEx tweets
neu_tweets = aug0607_FedEx_Tweets[aug0607_FedEx_Tweets['sentiment'] == 'neutral']
# percentage of neutral tweets
print("{0} Neutral tweets counts for {1:.2f}% of total {2} tweets".format(neu_tweets.shape[0], \
        100*neu_tweets.shape[0]/aug0607_FedEx_Tweets.shape[0], aug0607_FedEx_Tweets.shape[0]))

# picking weak negative tweets from FedEx tweets
weak_neg_tweets = aug0607_FedEx_Tweets[aug0607_FedEx_Tweets['sentiment'] == 'weak negative']
# percentage of weak negative tweets
print("{0} Weak Negative tweets counts for {1:.2f}% of total {2} tweets".format(weak_neg_tweets.shape[0], \
        100*weak_neg_tweets.shape[0]/aug0607_FedEx_Tweets.shape[0], aug0607_FedEx_Tweets.shape[0]))

# picking strong negative tweets from FedEx tweets
strng_neg_tweets = aug0607_FedEx_Tweets[aug0607_FedEx_Tweets['sentiment'] == 'strong negative']
# percentage of strong negative tweets
print("{0} Strong Negative tweets counts for {1:.2f}% of total {2} tweets".format(strng_neg_tweets.shape[0], \
        100*strng_neg_tweets.shape[0]/aug0607_FedEx_Tweets.shape[0], aug0607_FedEx_Tweets.shape[0]))


74 Strong Positive tweets counts for 3.60% of total 2055 tweets
179 Weak Positive tweets counts for 8.71% of total 2055 tweets
938 Neutral tweets counts for 45.64% of total 2055 tweets
782 Weak Negative tweets counts for 38.05% of total 2055 tweets
82 Strong Negative tweets counts for 3.99% of total 2055 tweets
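
The cell above repeats the same filter-and-print pattern five times. As a hedged alternative, the shares can be computed in one pass with a counter. The helper `sentiment_shares` is hypothetical, and the label list is rebuilt from the counts printed above purely to have something to run on.

```python
from collections import Counter

LABELS = ['strong positive', 'weak positive', 'neutral',
          'weak negative', 'strong negative']


def sentiment_shares(labels):
    """Return {label: (count, percent)} for every label in LABELS."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        name: (counts[name], 100 * counts[name] / total if total else 0.0)
        for name in LABELS
    }


# rebuild a label list from the counts printed above
observed = {'strong positive': 74, 'weak positive': 179, 'neutral': 938,
            'weak negative': 782, 'strong negative': 82}
labels = [name for name, n in observed.items() for _ in range(n)]

shares = sentiment_shares(labels)
for name, (n, pct) in shares.items():
    print("{} {} tweets: {:.2f}% of {}".format(n, name, pct, len(labels)))
```

On the dataframe itself, `aug0607_FedEx_Tweets['sentiment'].value_counts(normalize=True)` should give the same shares as fractions in a single call.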


In [75]:
# print the first 5 strong positive tweets
print("Some Strong Positive tweet examples:\n")
for idx, row in strng_pos_tweets.head(5).iterrows():
    print("Tweet " + str(idx) + ":")
    print(row['text'])

Some Strong Positive tweet examples:

Tweet 39:
Jay Cutler is more accurate with the ball than @FedEx is with their delivery and customer service.
Tweet 51:
RT @ZubinErnestt: Directly from #FedEx office karachi :) celebrating the joy! Representing #Christ at MarketPlace (keep sending in... https…
Tweet 68:
RT @MCrapisi1985: @dennyhamlin @FedEx @ToyotaRacing @RaceSonoma #2 for sure
Tweet 91:
Survived my birthday weekend....
#GameOfThrones made it amazing granted #Fedex made my Monday feel like a hangover… https://t.co/IaD4p7SsVN
Tweet 93:
Did you know that @Intuit customers get special discounts on @FedEx shipping? Learn more here &gt;&gt;… https://t.co/LOtQDQqobs


In [16]:
# print the first 5 weak positive tweets
print("Some Weak Positive tweet examples:\n")
for idx, row in weak_pos_tweets.head(5).iterrows():
    print("Tweet " + str(idx) + ":")
    print(row['text'])

Some Weak Positive tweet examples:

Tweet 11:
RT @FedEx: We are addressing some functionality issues and are working to resolve as soon as possible.  We sincerely regret any inconvenien…
Tweet 13:
Newborns in Crisis: @FedEx Operations Manager Fosters More Than 75 Critical Need Babies https://t.co/8mNEdJ7mT6… https://t.co/569DOCmBVp
Tweet 23:
@FedEx Awesome your outage had me waste a whole day waiting at home.
Tweet 25:
Decline of #CustomerService - #FedEx take heed. Better train your reps! And show #professionalism too. Oh, and maybe give a straight answer?
Tweet 29:
#fedex you really messed up today. I and everyone else better get a refund. I’ll request senders use DHL and UPS next time.


In [17]:
# print the first 5 neutral tweets
print("Some Neutral tweet examples:\n")
for idx, row in neu_tweets.head(5).iterrows():
    print("Tweet " + str(idx) + ":")
    print(row['text'])

Some Neutral tweet examples:

Tweet 0:
@AmazonHelp Thank you! I called @fedex and their system is down so I can't reschedule my delivery that was supposed to be here days ago :(
Tweet 1:
Apparently the @FedEx driver couldn't pick up my package today because of an incorrect address, yet my address was 100% correct? @FedExHelp
Tweet 2:
RT @jclaiborne311: @RyanAsh60 @CloydRivers @UPS @USPS @FedEx I've worked at UPS since I graduated HS and I've got tons of respect for this…
Tweet 4:
@Scooterocket @AngelCIraq214 @FedEx @RepDonBeyer @HillaryClinton @BarackObama @VA8thCDDems @lowkell @timkaine… https://t.co/FIDqPb4gpg
Tweet 5:
@SBellDC @FedEx @FedExHelp They promise express, not all in one piece! Jeez you are demanding!!!!


In [18]:
# print the first 5 weak negative tweets
print("Some Weak Negative tweet examples:\n")
for idx, row in weak_neg_tweets.head(5).iterrows():
    print("Tweet " + str(idx) + ":")
    print(row['text'])

Some Weak Negative tweet examples:

Tweet 3:
RT @RyanAsh60: @CloydRivers Sorry @UPS and @USPS but the next thing I ship will be through @FedEx because of this guy.
Tweet 7:
@FedEx you guys are the worst. Not only did the driver not come to my home and lied  I just got a call saying the new time may not happen
Tweet 10:
@FedEx why has FedEx delivered my belongings to the wrong address twice. Then lied about coming to the door. I ship and receive firearms.
Tweet 19:
@FedEx talk about a completely broken system. My package could have been delivered Friday and today but NOOO you wa… https://t.co/PZkf4dUiOT
Tweet 27:
RT @JeniferBrucker: @ShipSticks @FedEx VERY unhappy with service.My husband's new customized golf clubs are lost?The driver is coincidental…


In [19]:
# print the first 5 strong negative tweets
print("Some Strong Negative tweet examples:\n")
for idx, row in strng_neg_tweets.head(5).iterrows():
    print("Tweet " + str(idx) + ":")
    print(row['text'])

Some Strong Negative tweet examples:

Tweet 32:
Never do any shipping with @FedEx they are worthless. If I had customer service like them I would be fired
Tweet 120:
@FedEx is the worst company on planet earth. I avoid them like the plague https://t.co/4iRGzmWSsQ
Tweet 152:
@Kayeri I am so tired of their crap I stopped buying from places that ship @FedEx.
Tweet 153:
@Kayeri Waiting on @FedEx is a lost cause. They are the worst of the worst.
Tweet 172:
@FedEx GET IT TOGETHER FEDEX!!!!!! So annoying.  When will this be resolved?

