# Sentiment and Emotion Analysis

Notebook 4 of 4

We analyse the posts from each subreddit, along with some of the major topics in each, to understand each community's sentiment and emotion. To do so, we use pre-trained Hugging Face models for sentiment analysis and emotion analysis.
 
The topics for each subreddit we will explore are:
- Dunkin Donuts
 1. dunkin donuts 
 2. cold brew 
 3. butter pecan
 4. local dunkin
 5. reward (free drinks, free beverages, point) 
 6. service (mobile order, drive through, staff) 
 7. barista
 
- Starbucks
 1. dress code 
 2. pumpkin spice
 3. cold brew
 4. fall launch
 5. reward (free drinks, free beverages, point) 
 6. service (mobile order, drive through, staff) 
 7. barista
 
Dunkin Donuts and Starbucks share some common beverages, and each has its own signature products. We analysed the three most popular products for each subreddit, based on how frequently the words appear in that subreddit. Local stores and upcoming product launches are also hot topics in both subreddits. The reward system and service are areas worth reviewing. Surprisingly, baristas are mentioned far more often in the Starbucks subreddit than in the Dunkin one.


## Import Clean Data

In [1]:
# import libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from nltk.tokenize import word_tokenize, RegexpTokenizer
from transformers import pipeline

In [2]:
combined_df = pd.read_csv('./datasets/combined_cleaned.csv')
combined_df.shape

(4997, 6)

In [3]:
combined_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4997 entries, 0 to 4996
Data columns (total 6 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   Unnamed: 0      4997 non-null   int64 
 1   subreddit       4997 non-null   int64 
 2   selftext        3279 non-null   object
 3   title           4996 non-null   object
 4   created_utc     4997 non-null   int64 
 5   title_selftext  4997 non-null   object
dtypes: int64(3), object(3)
memory usage: 234.4+ KB


In [4]:
combined_df.head(3)

Unnamed: 0.1,Unnamed: 0,subreddit,selftext,title,created_utc,title_selftext
0,0,1,,my coworker placing the hash browns like army ...,1663204910,my coworker placing the hash browns like army ...
1,1,1,,whats the deal with these?,1663196066,whats the deal with these? nan
2,2,1,I know I asked about this before but I'm just ...,working for dunkin,1663193081,working for dunkin i know i asked about this b...


In [5]:
# drop unnecessary columns
combined_df = combined_df.drop(columns=['Unnamed: 0', 'selftext', 'title_selftext'])

In [6]:
# check for null values
combined_df.isnull().sum()

subreddit      0
title          1
created_utc    0
dtype: int64

In [7]:
# drop null
combined_df = combined_df.dropna()

In [8]:
# check for empty values
combined_df[combined_df['title'] == ''].sum()

subreddit      0.0
title          0.0
created_utc    0.0
dtype: float64

There are no missing values in the dataset now.

### Tokenize words and join back into a sentence to remove unwanted characters

In [9]:
tokenizer = RegexpTokenizer(r'\w+')

In [10]:
combined_df['tokenized'] = combined_df['title'].apply(lambda x: tokenizer.tokenize(x.lower()))
combined_df.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized
0,1,my coworker placing the hash browns like army ...,1663204910,"[my, coworker, placing, the, hash, browns, lik..."
1,1,whats the deal with these?,1663196066,"[whats, the, deal, with, these]"
2,1,working for dunkin,1663193081,"[working, for, dunkin]"


In [11]:
combined_df['title'] = combined_df['tokenized'].apply(lambda x: " ".join(x))
combined_df.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized
0,1,my coworker placing the hash browns like army ...,1663204910,"[my, coworker, placing, the, hash, browns, lik..."
1,1,whats the deal with these,1663196066,"[whats, the, deal, with, these]"
2,1,working for dunkin,1663193081,"[working, for, dunkin]"
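
As a rough illustration of what the two cells above do to a title, the sketch below reproduces the step with the standard-library `re` module. It assumes that `RegexpTokenizer(r'\w+')` behaves like `re.findall(r'\w+', text)` for this pattern; the helper name `clean_title` and the sample titles are made up for the example.

```python
import re

def clean_title(title: str) -> str:
    """Lowercase a title, keep only runs of word characters, and re-join with spaces."""
    tokens = re.findall(r"\w+", title.lower())
    return " ".join(tokens)

# Hypothetical titles, not taken from the dataset
print(clean_title("What's the deal with these?"))  # what s the deal with these
print(clean_title("Drive-thru line!!!"))           # drive thru line
print(repr(clean_title("???")))                    # ''
```

Two side effects are worth keeping in mind: apostrophes split contractions into separate tokens, and a title made up only of punctuation or symbols becomes an empty string.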


### Separate into Starbucks and Dunkin Donuts datasets for analysis

In [12]:
ddonuts_text_df = combined_df[combined_df['subreddit'] == 1]
ddonuts_text_df.shape

(2498, 4)

In [13]:
ddonuts_text_df.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized
0,1,my coworker placing the hash browns like army ...,1663204910,"[my, coworker, placing, the, hash, browns, lik..."
1,1,whats the deal with these,1663196066,"[whats, the, deal, with, these]"
2,1,working for dunkin,1663193081,"[working, for, dunkin]"


In [14]:
sbucks_text_df = combined_df[combined_df['subreddit'] == 0]
sbucks_text_df.shape

(2498, 4)

In [15]:
sbucks_text_df.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized
2498,0,interview tips,1663212467,"[interview, tips]"
2499,0,we had horses come through the drive thru rece...,1663212017,"[we, had, horses, come, through, the, drive, t..."
2500,0,having horses in the drive thru makes everythi...,1663211903,"[having, horses, in, the, drive, thru, makes, ..."


### Create separate dataframe for each of the subtopics for analysis

In [96]:
searchfor = ['dunkin donut', 'dunkin doughnut']
dunkin_donuts = ddonuts_text_df[ddonuts_text_df['title'].str.contains('|'.join(searchfor))]
dunkin_donuts = dunkin_donuts.copy()
print(f'Dataframe of dunkin_donuts has shape {dunkin_donuts.shape}')
dunkin_c_brew = ddonuts_text_df[ddonuts_text_df['title'].str.contains('cold brew')]
dunkin_c_brew = dunkin_c_brew.copy()
print(f'Dataframe of dunkin_c_brew has shape {dunkin_c_brew.shape}')
dunkin_i_coffee = ddonuts_text_df[ddonuts_text_df['title'].str.contains('ice coffee')]
dunkin_i_coffee = dunkin_i_coffee.copy()
print(f'Dataframe of dunkin_i_coffee has shape {dunkin_i_coffee.shape}')
dunkin_b_pecan = ddonuts_text_df[ddonuts_text_df['title'].str.contains('butter pecan')]
dunkin_b_pecan = dunkin_b_pecan.copy()
print(f'Dataframe of dunkin_b_pecan has shape {dunkin_b_pecan.shape}')
dunkin_local = ddonuts_text_df[ddonuts_text_df['title'].str.contains('local')]
dunkin_local = dunkin_local.copy()
print(f'Dataframe of dunkin_local has shape {dunkin_local.shape}')
searchfor_1 = ['free drink', 'free bev', 'free beverag', 'rewards', 'reward']
dunkin_reward = ddonuts_text_df[ddonuts_text_df['title'].str.contains('|'.join(searchfor_1))]
dunkin_reward = dunkin_reward.copy()
print(f'Dataframe of dunkin_reward has shape {dunkin_reward.shape}')
searchfor_2 = ['mobile order', 'mobile orders', 'drive thru', 'drive through','app', 'workers', 'staff']
dunkin_service = ddonuts_text_df[ddonuts_text_df['title'].str.contains('|'.join(searchfor_2))]
dunkin_service = dunkin_service.copy()
print(f'Dataframe of dunkin_service has shape {dunkin_service.shape}')
searchfor_3 = ['barista', 'baristas']
dunkin_barista = ddonuts_text_df[ddonuts_text_df['title'].str.contains('|'.join(searchfor_3))]
dunkin_barista = dunkin_barista.copy()
print(f'Dataframe of dunkin_barista has shape {dunkin_barista.shape}')

Dataframe of dunkin_donuts has shape (60, 4)
Dataframe of dunkin_c_brew has shape (90, 4)
Dataframe of dunkin_i_coffee has shape (5, 4)
Dataframe of dunkin_b_pecan has shape (36, 4)
Dataframe of dunkin_local has shape (23, 4)
Dataframe of dunkin_reward has shape (58, 4)
Dataframe of dunkin_service has shape (233, 4)
Dataframe of dunkin_barista has shape (2, 4)
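
One caveat with the substring patterns above: `str.contains` treats the joined keywords as a regular expression and matches anywhere in the title, so a short keyword such as `app` also matches `apple`. A minimal sketch of a stricter alternative using word boundaries is given here with the standard-library `re` module. The helper name `build_pattern` and the sample titles are hypothetical, and adopting this approach would change the counts reported in this notebook.

```python
import re

def build_pattern(keywords, whole_words=True):
    """Join keywords into one alternation, optionally anchored on word boundaries."""
    body = "|".join(re.escape(k) for k in keywords)
    return re.compile(rf"\b(?:{body})\b" if whole_words else body)

# With word boundaries, plural forms must be listed explicitly
keywords = ["mobile order", "mobile orders", "app", "staff"]
loose = build_pattern(keywords, whole_words=False)   # substring matching, as in the cell above
strict = build_pattern(keywords)                     # whole-word matching

# Hypothetical titles
titles = ["is the apple cider coming back", "the app keeps crashing", "mobile orders are late"]
for t in titles:
    print(f"{t!r}: loose={bool(loose.search(t))}, strict={bool(strict.search(t))}")
```

Only the first title differs between the two patterns: the loose pattern matches it through `apple`, while the strict one does not.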


In [95]:
sbucks_dress = sbucks_text_df[sbucks_text_df['title'].str.contains('dress')]
sbucks_dress = sbucks_dress.copy()
print(f'Dataframe of sbucks_dress has shape {sbucks_dress.shape}')
sbucks_p_spice = sbucks_text_df[sbucks_text_df['title'].str.contains('pumpkin spice')]
sbucks_p_spice = sbucks_p_spice.copy()
print(f'Dataframe of sbucks_p_spice has shape {sbucks_p_spice.shape}')
sbucks_c_brew = sbucks_text_df[sbucks_text_df['title'].str.contains('cold brew')]
sbucks_c_brew = sbucks_c_brew.copy()
print(f'Dataframe of sbucks_c_brew has shape {sbucks_c_brew.shape}')
sbucks_i_coffee = sbucks_text_df[sbucks_text_df['title'].str.contains('ice coffee')]
sbucks_i_coffee = sbucks_i_coffee.copy()
print(f'Dataframe of sbuck_i_coffee has shape {sbucks_i_coffee.shape}')
sbucks_a_crisp = sbucks_text_df[sbucks_text_df['title'].str.contains('apple crisp')]
sbucks_a_crisp = sbucks_a_crisp.copy() 
print(f'Dataframe of sbucks_a_crisp has shape {sbucks_a_crisp.shape}')
searchfor_4 = ['fall launch', 'fall drink', 'fall refresher', 'fall drinks']
sbucks_f_launch = sbucks_text_df[sbucks_text_df['title'].str.contains('|'.join(searchfor_4))]
sbucks_f_launch = sbucks_f_launch.copy()
print(f'Dataframe of sbucks_f_launch has shape {sbucks_f_launch.shape}')
searchfor_5 = ['free drink', 'free bev', 'free beverag', 'rewards', 'reward']
sbucks_reward = sbucks_text_df[sbucks_text_df['title'].str.contains('|'.join(searchfor_5))]
sbucks_reward = sbucks_reward.copy()
print(f'Dataframe of sbucks_reward has shape {sbucks_reward.shape}')
searchfor_6 = ['mobile order', 'mobile orders', 'drive thru', 'drive through', 'app', 'workers', 'staff']
sbucks_service = sbucks_text_df[sbucks_text_df['title'].str.contains('|'.join(searchfor_6))]
sbucks_service = sbucks_service.copy()
print(f'Dataframe of sbucks_service has shape {sbucks_service.shape}')
searchfor_7 = ['barista', 'baristas']
sbucks_barista = sbucks_text_df[sbucks_text_df['title'].str.contains('|'.join(searchfor_7))]
sbucks_barista = sbucks_barista.copy()
print(f'Dataframe of sbucks_barista has shape {sbucks_barista.shape}')

Dataframe of sbucks_dress has shape (36, 4)
Dataframe of sbucks_p_spice has shape (43, 4)
Dataframe of sbucks_c_brew has shape (46, 4)
Dataframe of sbuck_i_coffee has shape (0, 4)
Dataframe of sbucks_a_crisp has shape (15, 4)
Dataframe of sbucks_f_launch has shape (18, 4)
Dataframe of sbucks_reward has shape (13, 4)
Dataframe of sbucks_service has shape (219, 4)
Dataframe of sbucks_barista has shape (119, 4)
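
The two cells above repeat the same filter, copy, and print steps for every topic, which makes copy-and-paste slips easy to miss. One possible way to organise this is a single mapping from topic name to keywords. The sketch below shows the idea on plain Python lists with `re`; the `TOPICS` mapping, the `filter_by_topic` helper, and the shortened sample titles are illustrative assumptions. In the notebook, the same loop would filter `sbucks_text_df['title']` with `str.contains`.

```python
import re

# Hypothetical topic table; keywords copied from a few of the lists above
TOPICS = {
    "p_spice": ["pumpkin spice"],
    "reward": ["free drink", "free bev", "reward"],
    "barista": ["barista"],
}

def filter_by_topic(titles, topics):
    """Return {topic: [titles that contain any of the topic's keywords]}."""
    result = {}
    for name, keywords in topics.items():
        pattern = re.compile("|".join(re.escape(k) for k in keywords))
        result[name] = [t for t in titles if pattern.search(t)]
    return result

# Shortened stand-ins for titles
sample_titles = [
    "happy pumpkin spice season",
    "delicious free drink",
    "question for baristas",
    "interview tips",
]
matches = filter_by_topic(sample_titles, TOPICS)
for name, matched in matches.items():
    print(f"{name}: {len(matched)} title(s)")
```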


## Sentiment Analysis

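Model used: `cardiffnlp/twitter-roberta-base-sentiment`, a RoBERTa-base model for sentiment analysis. According to its model card, it was trained on about 58M tweets and fine-tuned for sentiment analysis with the TweetEval benchmark. It returns one of three labels: `LABEL_0` (negative), `LABEL_1` (neutral), or `LABEL_2` (positive). [source](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment)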

In [19]:
from nltk.sentiment import SentimentIntensityAnalyzer
import operator

In [20]:
from transformers import AutoModelForSequenceClassification
from transformers import TFAutoModelForSequenceClassification
from transformers import AutoTokenizer, AutoConfig

In [21]:
model_name = "cardiffnlp/twitter-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

In [22]:
senti_classifier = pipeline("sentiment-analysis", 
                            model=model,
                            tokenizer = tokenizer)
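
For context on the `score` values reported by the classifier: for a multi-class model like this one, the pipeline is understood to apply a softmax to the model's raw logits and report the label with the highest probability. A minimal sketch of that step in plain Python follows; the logits are made up and are not outputs of the model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

labels = ["LABEL_0", "LABEL_1", "LABEL_2"]
logits = [-1.2, 0.3, 2.1]  # hypothetical logits for one title

probs = softmax(logits)
best = max(range(len(labels)), key=lambda i: probs[i])
print({"label": labels[best], "score": round(probs[best], 3)})
```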

In [23]:
# import pickle
# import sys

In [24]:
# p = pickle.dumps(senti_classifier)
# print(sys.getsizeof(p))

500046474


In [23]:
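# define function to extract sentiments
def sentiments(dataset):
    """Add the raw classifier output and a readable sentiment label for each title (modifies `dataset` in place)."""
    # raw pipeline output: a list holding one {'label', 'score'} dict per title
    dataset['sentiment'] = dataset['title'].apply(senti_classifier)
    # keep only the predicted label, then map it to a readable name
    dataset['sentiments'] = dataset['sentiment'].apply(lambda x: x[0]['label'])
    dataset['sentiments'] = dataset['sentiments'].map({'LABEL_0': 'negative', 'LABEL_1': 'neutral', 'LABEL_2': 'positive'})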
    return dataset
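
To make the label handling in `sentiments` concrete: the classifier returns a list with one dictionary per input, and the function keeps only the `label` and maps it to a readable name. The sketch below mimics that on hand-written results shaped like the pipeline output. The scores are invented, `to_sentiment` is a hypothetical helper, and the 0.6 cut-off is an arbitrary value chosen only to show how low-confidence predictions could be flagged.

```python
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2": "positive"}

def to_sentiment(result):
    """Map one pipeline result, e.g. [{'label': 'LABEL_2', 'score': 0.65}], to (name, score)."""
    top = result[0]
    return LABEL_NAMES[top["label"]], top["score"]

# Invented results in the same shape as the classifier output
mock_results = [
    [{"label": "LABEL_2", "score": 0.65}],
    [{"label": "LABEL_1", "score": 0.69}],
    [{"label": "LABEL_0", "score": 0.50}],
]
for res in mock_results:
    name, score = to_sentiment(res)
    flag = " (low confidence)" if score < 0.6 else ""
    print(f"{name}: {score:.2f}{flag}")
```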

## Dunkin Sentiments

### Overall

In [24]:
dunkin_sample_senti = sentiments(ddonuts_text_df.sample(2000))

In [25]:
dunkin_sample_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
746,1,free beverage reward on mobile monday,1658150232,"[free, beverage, reward, on, mobile, monday]","[{'label': 'LABEL_2', 'score': 0.6509920954704...",positive
868,1,question for workers,1657243071,"[question, for, workers]","[{'label': 'LABEL_1', 'score': 0.6887745857238...",neutral
648,1,a regular who pronounces macchiato mach iato,1658882786,"[a, regular, who, pronounces, macchiato, mach,...","[{'label': 'LABEL_1', 'score': 0.8214698433876...",neutral


In [26]:
dunkin_sample_senti['sentiments'].value_counts()

neutral     1376
negative     413
positive     211
Name: sentiments, dtype: int64

In [27]:
dunkin_sentiments = pd.DataFrame(dunkin_sample_senti['sentiments'].value_counts())
dunkin_sentiments

Unnamed: 0,sentiments
neutral,1376
negative,413
positive,211
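
Because the overall figures are computed on a random sample of 2,000 titles drawn without a fixed seed, they can vary slightly from run to run. In pandas, `DataFrame.sample` accepts a `random_state` argument for this purpose. The idea is sketched below with the standard-library `random` module on a stand-in list; the seed value and the data are arbitrary.

```python
import random

titles = [f"title {i}" for i in range(2498)]  # stand-in for the 2,498 Dunkin titles

def seeded_sample(items, k, seed):
    """Draw k items without replacement using a dedicated, seeded generator."""
    return random.Random(seed).sample(items, k)

first = seeded_sample(titles, 2000, seed=42)
second = seeded_sample(titles, 2000, seed=42)
print(first == second)   # True: same seed, same sample
print(len(set(first)))   # 2000: no title is drawn twice
```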


### Dunkin Donuts

In [32]:
ddonuts_senti = sentiments(dunkin_donuts)

In [33]:
ddonuts_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
18,1,2 dunkin donuts card first one to save pic and...,1663030303,"[2, dunkin, donuts, card, first, one, to, save...","[{'label': 'LABEL_1', 'score': 0.7781171798706...",neutral
33,1,dunkin donuts interview,1662960022,"[dunkin, donuts, interview]","[{'label': 'LABEL_1', 'score': 0.8438280820846...",neutral
35,1,dunkin donuts t mobile tuesday 2 codes,1662954206,"[dunkin, donuts, t, mobile, tuesday, 2, codes]","[{'label': 'LABEL_1', 'score': 0.8879836797714...",neutral


In [34]:
ddonuts_senti['sentiments'].value_counts()

neutral     44
negative    11
positive     5
Name: sentiments, dtype: int64

In [35]:
ddonuts_sentiments = pd.DataFrame(ddonuts_senti['sentiments'].value_counts())
ddonuts_sentiments

Unnamed: 0,sentiments
neutral,44
negative,11
positive,5


### Dunkin Cold Brew

In [28]:
dunkin_c_brew_senti = sentiments(dunkin_c_brew)

In [29]:
dunkin_c_brew_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
41,1,this pumpkin cream cold brew is amazing weary_...,1662915853,"[this, pumpkin, cream, cold, brew, is, amazing...","[{'label': 'LABEL_2', 'score': 0.9880346655845...",positive
52,1,anyone know what s the next coffee promo after...,1662825134,"[anyone, know, what, s, the, next, coffee, pro...","[{'label': 'LABEL_1', 'score': 0.9409533739089...",neutral
59,1,is it weird that my small town dunkin for week...,1662739875,"[is, it, weird, that, my, small, town, dunkin,...","[{'label': 'LABEL_0', 'score': 0.5979610681533...",negative


In [30]:
dunkin_c_brew_senti['sentiments'].value_counts()

neutral     63
negative    15
positive    12
Name: sentiments, dtype: int64

In [31]:
dunkin_c_brew_sentiments = pd.DataFrame(dunkin_c_brew_senti['sentiments'].value_counts())
dunkin_c_brew_sentiments

Unnamed: 0,sentiments
neutral,63
negative,15
positive,12


### Dunkin Iced Coffee

In [36]:
dunkin_i_coffee_senti = sentiments(dunkin_i_coffee)

In [37]:
dunkin_i_coffee_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
361,1,did ice coffee prices increase,1660759244,"[did, ice, coffee, prices, increase]","[{'label': 'LABEL_1', 'score': 0.6943466067314...",neutral
463,1,why is this the same price as an ice coffee i ...,1660242331,"[why, is, this, the, same, price, as, an, ice,...","[{'label': 'LABEL_0', 'score': 0.4981566965579...",negative
677,1,psa the 48 oz bottles of unsweetened ice coffe...,1658664957,"[psa, the, 48, oz, bottles, of, unsweetened, i...","[{'label': 'LABEL_1', 'score': 0.5948571562767...",neutral


In [38]:
dunkin_i_coffee_senti['sentiments'].value_counts()

neutral     3
negative    1
positive    1
Name: sentiments, dtype: int64

In [39]:
dunkin_i_coffee_sentiments = pd.DataFrame(dunkin_i_coffee_senti['sentiments'].value_counts())
dunkin_i_coffee_sentiments

Unnamed: 0,sentiments
neutral,3
negative,1
positive,1


### Dunkin Butter Pecan

In [40]:
dunkin_b_pecan_senti = sentiments(dunkin_b_pecan)

In [41]:
dunkin_b_pecan_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
322,1,just thought i might make my last dunkin run o...,1660929445,"[just, thought, i, might, make, my, last, dunk...","[{'label': 'LABEL_1', 'score': 0.5366223454475...",neutral
360,1,please sell me your butter pecan,1660766067,"[please, sell, me, your, butter, pecan]","[{'label': 'LABEL_1', 'score': 0.7656268477439...",neutral
442,1,my favorite drink is med cold brew 1 butter pe...,1660384707,"[my, favorite, drink, is, med, cold, brew, 1, ...","[{'label': 'LABEL_2', 'score': 0.9485169649124...",positive


In [42]:
dunkin_b_pecan_senti['sentiments'].value_counts()

neutral     26
positive     7
negative     3
Name: sentiments, dtype: int64

In [43]:
dunkin_b_pecan_sentiments = pd.DataFrame(dunkin_b_pecan_senti['sentiments'].value_counts())
dunkin_b_pecan_sentiments

Unnamed: 0,sentiments
neutral,26
positive,7
negative,3


### Dunkin Local

In [48]:
dunkin_local_senti = sentiments(dunkin_local)

In [49]:
dunkin_local_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
34,1,can someone explain to me why my local dunkin ...,1662958423,"[can, someone, explain, to, me, why, my, local...","[{'label': 'LABEL_0', 'score': 0.9047766923904...",negative
268,1,local dunkin scrapes metal pan on ground at dr...,1661276540,"[local, dunkin, scrapes, metal, pan, on, groun...","[{'label': 'LABEL_1', 'score': 0.7235899567604...",neutral
278,1,i ve been going to dunk ever since the establi...,1661183920,"[i, ve, been, going, to, dunk, ever, since, th...","[{'label': 'LABEL_1', 'score': 0.5064250230789...",neutral


In [50]:
dunkin_local_senti['sentiments'].value_counts()

neutral     12
negative     8
positive     3
Name: sentiments, dtype: int64

In [51]:
dunkin_local_sentiments = pd.DataFrame(dunkin_local_senti['sentiments'].value_counts())
dunkin_local_sentiments

Unnamed: 0,sentiments
neutral,12
negative,8
positive,3


### Dunkin Reward

In [52]:
dunkin_reward_senti = sentiments(dunkin_reward)

In [53]:
dunkin_reward_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
11,1,dunkin rewards update,1663153599,"[dunkin, rewards, update]","[{'label': 'LABEL_1', 'score': 0.8598471879959...",neutral
55,1,birthday reward sucks,1662808190,"[birthday, reward, sucks]","[{'label': 'LABEL_0', 'score': 0.9674400687217...",negative
100,1,free drink expires today 9 4,1662334297,"[free, drink, expires, today, 9, 4]","[{'label': 'LABEL_1', 'score': 0.7958027720451...",neutral


In [54]:
dunkin_reward_senti['sentiments'].value_counts()

neutral     35
positive    14
negative     9
Name: sentiments, dtype: int64

In [55]:
dunkin_reward_sentiments = pd.DataFrame(dunkin_reward_senti['sentiments'].value_counts())
dunkin_reward_sentiments

Unnamed: 0,sentiments
neutral,35
positive,14
negative,9


### Dunkin Service

In [56]:
dunkin_service_senti = sentiments(dunkin_service)

In [57]:
dunkin_service_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
7,1,does anyone know if the apple cider is coming ...,1663163942,"[does, anyone, know, if, the, apple, cider, is...","[{'label': 'LABEL_1', 'score': 0.9185714125633...",neutral
39,1,ben affleck and henry cavill have signed exten...,1662929899,"[ben, affleck, and, henry, cavill, have, signe...","[{'label': 'LABEL_1', 'score': 0.9103999137878...",neutral
40,1,does anybody know if you can add the employee ...,1662918465,"[does, anybody, know, if, you, can, add, the, ...","[{'label': 'LABEL_1', 'score': 0.9097266197204...",neutral


In [58]:
dunkin_service_senti['sentiments'].value_counts()

neutral     151
negative     58
positive     24
Name: sentiments, dtype: int64

In [59]:
dunkin_service_sentiments = pd.DataFrame(dunkin_service_senti['sentiments'].value_counts())
dunkin_service_sentiments

Unnamed: 0,sentiments
neutral,151
negative,58
positive,24


### Dunkin Barista

In [60]:
dunkin_barista_senti = sentiments(dunkin_barista)

In [61]:
dunkin_barista_senti.head()

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
1056,1,question from a starbys barista,1655913312,"[question, from, a, starbys, barista]","[{'label': 'LABEL_1', 'score': 0.8456373810768...",neutral
2158,1,would the barista making my coffee everyday li...,1646751101,"[would, the, barista, making, my, coffee, ever...","[{'label': 'LABEL_1', 'score': 0.6669984459877...",neutral


In [62]:
dunkin_barista_senti['sentiments'].value_counts()

neutral    2
Name: sentiments, dtype: int64

In [63]:
dunkin_barista_sentiments = pd.DataFrame(dunkin_barista_senti['sentiments'].value_counts())
dunkin_barista_sentiments

Unnamed: 0,sentiments
neutral,2


### Summary

In [64]:
dunkin_cols = ["Overall", "Dunkin_Donuts", "Cold_Brew", "Iced_Coffee", "Butter_Pecan", "Dunkin_Local", "Dunkin_Reward", "Dunkin_Service", "Dunkin_Barista"]
dunkin_sentiments = pd.concat([dunkin_sentiments, ddonuts_sentiments, dunkin_c_brew_sentiments, dunkin_i_coffee_sentiments, dunkin_b_pecan_sentiments, dunkin_local_sentiments, dunkin_reward_sentiments, dunkin_service_sentiments, dunkin_barista_sentiments], axis=1, ignore_index=True)
dunkin_sentiments.columns = dunkin_cols

In [65]:
dunkin_sentiments

Unnamed: 0,Overall,Dunkin_Donuts,Cold_Brew,Iced_Coffee,Butter_Pecan,Dunkin_Local,Dunkin_Reward,Dunkin_Service,Dunkin_Barista
neutral,1376,44,63,3,26,12,35,151,2.0
negative,413,11,15,1,3,8,9,58,
positive,211,5,12,1,7,3,14,24,


In [66]:
dunkin_sentiments_pct = round(dunkin_sentiments/dunkin_sentiments.sum(), 2)
dunkin_sentiments_pct

Unnamed: 0,Overall,Dunkin_Donuts,Cold_Brew,Iced_Coffee,Butter_Pecan,Dunkin_Local,Dunkin_Reward,Dunkin_Service,Dunkin_Barista
neutral,0.69,0.73,0.7,0.6,0.72,0.52,0.6,0.65,1.0
negative,0.21,0.18,0.17,0.2,0.08,0.35,0.16,0.25,
positive,0.11,0.08,0.13,0.2,0.19,0.13,0.24,0.1,


Points to note:
- Most posts are neutral, most likely because they are questions and discussions about the topics.
- Rewards (free beverages, free drinks, reward) has the highest percentage of positive posts.
- "Dunkin Donuts" and cold brew have more negative posts than positive ones (possibly bad reviews).
- Dunkin service (mobile order, app, and staff) tends to attract more negative posts than positive ones.
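
Each share in the percentage table is a count divided by its column total. The `NaN` entries for Dunkin_Barista arise because `value_counts` only lists labels that actually occur, and both barista titles were neutral, so those cells correspond to zero posts. A small plain-Python sketch of the calculation follows, using the Dunkin_Reward and Dunkin_Barista counts from the summary table and filling missing labels with zero (the `shares` helper is illustrative).

```python
LABELS = ("neutral", "negative", "positive")

def shares(counts):
    """Return each label's share of the total, treating absent labels as zero."""
    total = sum(counts.values())
    return {label: round(counts.get(label, 0) / total, 2) for label in LABELS}

reward = {"neutral": 35, "positive": 14, "negative": 9}   # Dunkin_Reward counts
barista = {"neutral": 2}                                  # Dunkin_Barista counts

print(shares(reward))    # {'neutral': 0.6, 'negative': 0.16, 'positive': 0.24}
print(shares(barista))   # {'neutral': 1.0, 'negative': 0.0, 'positive': 0.0}
```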

## Starbucks Sentiments

### Overall

In [67]:
sbucks_sample_senti = sentiments(sbucks_text_df.sample(2000))

In [68]:
sbucks_sample_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
4440,0,just customer things,1661702150,"[just, customer, things]","[{'label': 'LABEL_1', 'score': 0.6978900432586...",neutral
3286,0,i would like to introduce everyone to one of m...,1662517473,"[i, would, like, to, introduce, everyone, to, ...","[{'label': 'LABEL_2', 'score': 0.9830620288848...",positive
4925,0,tip question for employees,1661265219,"[tip, question, for, employees]","[{'label': 'LABEL_1', 'score': 0.8111510872840...",neutral


In [69]:
sbucks_sample_senti['sentiments'].value_counts()

neutral     1415
negative     411
positive     174
Name: sentiments, dtype: int64

In [70]:
sbucks_sentiments = pd.DataFrame(sbucks_sample_senti['sentiments'].value_counts())
sbucks_sentiments

Unnamed: 0,sentiments
neutral,1415
negative,411
positive,174


### Dress Code

In [78]:
sbucks_dress_senti = sentiments(sbucks_dress)

In [79]:
sbucks_dress_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
2551,0,lax new ceo attends the one young world confer...,1663181231,"[lax, new, ceo, attends, the, one, young, worl...","[{'label': 'LABEL_2', 'score': 0.9124915599822...",positive
2740,0,dress code question,1663017656,"[dress, code, question]","[{'label': 'LABEL_1', 'score': 0.7115855813026...",neutral
2807,0,dresscode hats,1662946355,"[dresscode, hats]","[{'label': 'LABEL_1', 'score': 0.7062446475028...",neutral


In [80]:
sbucks_dress_senti['sentiments'].value_counts()

neutral     31
negative     4
positive     1
Name: sentiments, dtype: int64

In [81]:
sbucks_dress_sentiments = pd.DataFrame(sbucks_dress_senti['sentiments'].value_counts())
sbucks_dress_sentiments

Unnamed: 0,sentiments
neutral,31
negative,4
positive,1


### Pumpkin Spice

In [75]:
sbucks_p_spice_senti = sentiments(sbucks_p_spice)

In [76]:
sbucks_p_spice_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
2636,0,happy pumpkin spice season jack o lantern,1663109346,"[happy, pumpkin, spice, season, jack, o, lantern]","[{'label': 'LABEL_2', 'score': 0.9534860253334...",positive
2698,0,is this kj s correct i know syrups have heaps ...,1663047373,"[is, this, kj, s, correct, i, know, syrups, ha...","[{'label': 'LABEL_1', 'score': 0.5993539094924...",neutral
2906,0,your favorite custom off menu drinks using pum...,1662852146,"[your, favorite, custom, off, menu, drinks, us...","[{'label': 'LABEL_1', 'score': 0.8353580832481...",neutral


In [82]:
sbucks_p_spice_senti['sentiments'].value_counts()

neutral     35
positive     5
negative     3
Name: sentiments, dtype: int64

In [83]:
sbucks_p_spice_sentiments = pd.DataFrame(sbucks_p_spice_senti['sentiments'].value_counts())
sbucks_p_spice_sentiments

Unnamed: 0,sentiments
neutral,35
positive,5
negative,3


### Cold Brew

In [84]:
sbucks_c_brew_senti = sentiments(sbucks_c_brew)

In [85]:
sbucks_c_brew_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
2508,0,cold brew and brewed coffee pls,1663208386,"[cold, brew, and, brewed, coffee, pls]","[{'label': 'LABEL_1', 'score': 0.6978935599327...",neutral
2652,0,what happened to starbucks cold brew,1663097976,"[what, happened, to, starbucks, cold, brew]","[{'label': 'LABEL_1', 'score': 0.7445796132087...",neutral
2658,0,this was 9 a tall cold brew with no add ins at...,1663095561,"[this, was, 9, a, tall, cold, brew, with, no, ...","[{'label': 'LABEL_2', 'score': 0.9462835788726...",positive


In [88]:
sbucks_c_brew_senti['sentiments'].value_counts()

neutral     39
negative     4
positive     3
Name: sentiments, dtype: int64

In [89]:
sbucks_c_brew_sentiments = pd.DataFrame(sbucks_c_brew_senti['sentiments'].value_counts())
sbucks_c_brew_sentiments

Unnamed: 0,sentiments
neutral,39
negative,4
positive,3


### Apple Crisp

In [97]:
sbucks_a_crisp_senti = sentiments(sbucks_a_crisp)

In [98]:
sbucks_a_crisp_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
2720,0,why does the apple crisp taste like straight c...,1663031317,"[why, does, the, apple, crisp, taste, like, st...","[{'label': 'LABEL_0', 'score': 0.9693011045455...",negative
2735,0,apple crisp cold brew,1663023008,"[apple, crisp, cold, brew]","[{'label': 'LABEL_1', 'score': 0.7091082930564...",neutral
2967,0,omg the iced apple crisp mach,1662786469,"[omg, the, iced, apple, crisp, mach]","[{'label': 'LABEL_1', 'score': 0.7784996032714...",neutral


In [99]:
sbucks_a_crisp_senti['sentiments'].value_counts()

neutral     12
negative     2
positive     1
Name: sentiments, dtype: int64

In [100]:
sbucks_a_crisp_sentiments = pd.DataFrame(sbucks_a_crisp_senti['sentiments'].value_counts())
sbucks_a_crisp_sentiments

Unnamed: 0,sentiments
neutral,12
negative,2
positive,1


### Fall Launch

In [105]:
sbucks_f_launch_senti = sentiments(sbucks_f_launch)

In [106]:
sbucks_f_launch_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
3113,0,fall drink that s not too sweet,1662682169,"[fall, drink, that, s, not, too, sweet]","[{'label': 'LABEL_0', 'score': 0.4885034263134...",negative
3124,0,fall drink recs without coffee,1662676345,"[fall, drink, recs, without, coffee]","[{'label': 'LABEL_1', 'score': 0.8270258307456...",neutral
3355,0,fall drinks central ohio,1662473112,"[fall, drinks, central, ohio]","[{'label': 'LABEL_1', 'score': 0.8739259243011...",neutral


In [107]:
sbucks_f_launch_senti['sentiments'].value_counts()

neutral     14
positive     3
negative     1
Name: sentiments, dtype: int64

In [108]:
sbucks_f_launch_sentiments = pd.DataFrame(sbucks_f_launch_senti['sentiments'].value_counts())
sbucks_f_launch_sentiments

Unnamed: 0,sentiments
neutral,14
positive,3
negative,1


### Reward

In [101]:
sbucks_reward_senti = sentiments(sbucks_reward)

In [102]:
sbucks_reward_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
2649,0,why do some starbucks stores not offer rewards...,1663099313,"[why, do, some, starbucks, stores, not, offer,...","[{'label': 'LABEL_0', 'score': 0.5769333243370...",negative
2813,0,delicious free drink,1662939213,"[delicious, free, drink]","[{'label': 'LABEL_2', 'score': 0.9467070102691...",positive
2870,0,free drink recommendation,1662890001,"[free, drink, recommendation]","[{'label': 'LABEL_1', 'score': 0.7021298408508...",neutral


In [103]:
sbucks_reward_senti['sentiments'].value_counts()

neutral     8
positive    3
negative    2
Name: sentiments, dtype: int64

In [104]:
sbucks_reward_sentiments = pd.DataFrame(sbucks_reward_senti['sentiments'].value_counts())
sbucks_reward_sentiments

Unnamed: 0,sentiments
neutral,8
positive,3
negative,2


### Service

In [109]:
sbucks_service_senti = sentiments(sbucks_service)

In [110]:
sbucks_service_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
2499,0,we had horses come through the drive thru rece...,1663212017,"[we, had, horses, come, through, the, drive, t...","[{'label': 'LABEL_1', 'score': 0.8244970440864...",neutral
2500,0,having horses in the drive thru makes everythi...,1663211903,"[having, horses, in, the, drive, thru, makes, ...","[{'label': 'LABEL_2', 'score': 0.9039303064346...",positive
2524,0,does the apple drizzle or brown sugar go good ...,1663198156,"[does, the, apple, drizzle, or, brown, sugar, ...","[{'label': 'LABEL_2', 'score': 0.7906030416488...",positive


In [111]:
sbucks_service_senti['sentiments'].value_counts()

neutral     141
negative     41
positive     37
Name: sentiments, dtype: int64

In [112]:
sbucks_service_sentiments = pd.DataFrame(sbucks_service_senti['sentiments'].value_counts())
sbucks_service_sentiments

Unnamed: 0,sentiments
neutral,141
negative,41
positive,37


### Barista

In [113]:
sbucks_barista_senti = sentiments(sbucks_barista)

In [114]:
sbucks_barista_senti.head(3)

Unnamed: 0,subreddit,title,created_utc,tokenized,sentiment,sentiments
2511,0,baristas what made you smile at work today,1663206743,"[baristas, what, made, you, smile, at, work, t...","[{'label': 'LABEL_2', 'score': 0.8832364678382...",positive
2520,0,question for baristas,1663200077,"[question, for, baristas]","[{'label': 'LABEL_1', 'score': 0.7694525718688...",neutral
2550,0,it s not the baristas fault,1663182394,"[it, s, not, the, baristas, fault]","[{'label': 'LABEL_1', 'score': 0.5979842543601...",neutral


In [115]:
sbucks_barista_senti['sentiments'].value_counts()

neutral     83
negative    20
positive    16
Name: sentiments, dtype: int64

In [116]:
sbucks_barista_sentiments = pd.DataFrame(sbucks_barista_senti['sentiments'].value_counts())
sbucks_barista_sentiments

Unnamed: 0,sentiments
neutral,83
negative,20
positive,16


In [None]:
# pip install --ignore-installed --upgrade tensorflow-gpu