# Analyzing Product Sentiment Assignment

In this module, we focused on classifiers, applying them to analyzing product sentiment, and understanding the types of errors a classifier makes. We also built an IPython notebook for analyzing the sentiment of real product reviews.

In this assignment, we are going to explore this application further: training a sentiment analysis model using a set of key polarizing words, examining the weights learned for each of these words, and comparing the results of this simpler classifier with those of the one using all of the words. These techniques will be a core component in your capstone project.

In the IPython notebook above, we used the word counts for all words in the reviews to train the sentiment classifier model. Now, we are going to follow a similar path, but only use this subset of the words:

In [1]:
selected_words = ['awesome', 'great', 'fantastic', 'amazing', 'love',
                  'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']

Often, ML practitioners will throw out words they consider “unimportant” before training their model. This procedure can often be helpful in terms of accuracy. Here, we are going to throw out all words except for the very few above. Using so few words in our model will hurt our accuracy, but help us interpret what our classifier is doing. 

In [2]:
import graphlab
import pandas as pd

In [3]:
# Load the dataset

products = graphlab.SFrame('amazon_baby.gl/')


In [4]:
# Create the word_count column

products['word_count'] = graphlab.text_analytics.count_words(products['review'])
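To make the contents of the word_count column concrete, here is a minimal plain-Python sketch of word counting. It assumes lowercasing and splitting on whitespace only, which is consistent with keys such as '(which' in the outputs further down, but GraphLab's own tokenizer may differ in its details. The function name count_words_sketch is hypothetical.

```python
from collections import Counter

def count_words_sketch(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return dict(Counter(text.lower().split()))

example = "Great value. great price, and I love it"
counts = count_words_sketch(example)
print(counts)
```

Note that punctuation stays attached to its token under this assumption, so 'value.' and 'value' would be counted as different words.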

Define the sentiment:

In [5]:
# ignore all 3* reviews
products = products[products['rating'] != 3]

# positive sentiment = 4* or 5* reviews
products['sentiment'] = products['rating'] >= 4
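The same labeling rule can be sketched on a plain list of ratings. This is only an illustration of the logic, not GraphLab code, and label_sentiment is a hypothetical helper.

```python
def label_sentiment(ratings):
    """Drop 3-star ratings; label 4-5 stars as 1 and 1-2 stars as 0."""
    return [(r, int(r >= 4)) for r in ratings if r != 3]

labeled = label_sentiment([5, 3, 1, 4, 2])
print(labeled)
```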

In [6]:
products.head(2)

name,review,rating,word_count,sentiment
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,"{'and': 3, 'love': 1, 'it': 2, 'highly': 1, ...",1
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,"{'and': 2, 'quilt': 1, 'it': 1, 'comfortable': ...",1


In [7]:
len(products)

166752

## 1. Use .apply() to build a new feature with the counts for each of the selected_words

Our first goal is to create a column products['awesome'] where each row contains the number of times the word ‘awesome’ showed up in the review for the corresponding product, and 0 if the word didn’t show up. One way to do this is to look at each row’s ‘word_count’ column and follow this logic:

In [8]:
def awesome_count(dictionary):
    if 'awesome' in dictionary:
        return dictionary['awesome']
    else:
        return 0
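An equivalent, more compact way to express the same lookup is dict.get with a default of 0. The sketch below generalizes it to any word; word_count_for is a hypothetical name and the dictionary is a small made-up example.

```python
def word_count_for(word, dictionary):
    """Return how often `word` occurs, or 0 when it is absent."""
    return dictionary.get(word, 0)

review_counts = {'and': 3, 'love': 1, 'it': 2}
love_count = word_count_for('love', review_counts)
awesome_count_value = word_count_for('awesome', review_counts)
print(love_count)
print(awesome_count_value)
```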

In [9]:
products['awesome'] = products['word_count'].apply(awesome_count)

In [10]:
products.head(2)

name,review,rating,word_count,sentiment,awesome
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,"{'and': 3, 'love': 1, 'it': 2, 'highly': 1, ...",1,0
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,"{'and': 2, 'quilt': 1, 'it': 1, 'comfortable': ...",1,0


Repeat this process for the other 10 words in selected_words.

In [11]:
for word in selected_words:
    # Bind the current word as a default argument so each lambda keeps its own word
    products[word] = products['word_count'].apply(lambda x, w=word: x.get(w, 0))
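One thing to keep in mind with lambdas created in a loop: a lambda reads the loop variable when it is called, not when it is defined. Whether that matters in the cell above depends on when GraphLab evaluates the function, which is not verified here; the outputs below suggest it worked. The plain-Python sketch shows the general behavior and how a default argument makes each function independent of later iterations.

```python
words = ['great', 'love']

# Late binding: each lambda looks up `word` only when it is called,
# so after the loop both lambdas see the last value, 'love'.
late = [lambda d: d.get(word, 0) for word in words]

# Early binding: the default argument freezes the current value of `word`.
early = [lambda d, w=word: d.get(w, 0) for word in words]

counts = {'great': 2, 'love': 5}
late_results = [f(counts) for f in late]
early_results = [f(counts) for f in early]
print(late_results)
print(early_results)
```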

In [12]:
products.head(2)

name,review,rating,word_count,sentiment,awesome,great
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,"{'and': 3, 'love': 1, 'it': 2, 'highly': 1, ...",1,0,0
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,"{'and': 2, 'quilt': 1, 'it': 1, 'comfortable': ...",1,0,0

fantastic,amazing,love,horrible,bad,terrible,awful,wow,hate
0,0,1,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0


Out of the selected_words, which one is most used in the dataset? Which one is least used?

In [13]:
# Build a dictionary mapping each word to the sum of its column

d = {}
for word in selected_words:
    d[word] = products[word].sum()
    
# Create a DataFrame and fill it from the dictionary

In [14]:
df = pd.DataFrame()

df['Word'] = d.keys()
df['Count'] = d.values()

In [15]:
df_sorted = df.sort_values('Count', ascending=False)

In [16]:
df_sorted

Unnamed: 0,Word,Count
4,great,42420
1,love,40277
2,bad,3197
3,awesome,2002
6,amazing,1305
9,hate,1057
0,fantastic,873
5,terrible,673
7,horrible,659
8,awful,345


In [18]:
print "Most used: " + df_sorted.iloc[0]['Word']
print "Least used: " + df_sorted.iloc[-1]['Word']

Most used: great
Least used: wow
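If only the most and least used words are needed, the sort can be skipped: max and min with a key function work directly on the dictionary. The counts below are toy values for illustration only, not the dataset's totals.

```python
# Toy counts for illustration only; these are not the dataset's totals.
toy_counts = {'great': 5, 'love': 3, 'wow': 1}

# max/min iterate over the keys and compare them by their counts.
most_used = max(toy_counts, key=toy_counts.get)
least_used = min(toy_counts, key=toy_counts.get)
print("Most used: " + most_used)
print("Least used: " + least_used)
```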


## 2. Create a new sentiment analysis model using only the selected_words as features

Use the same train/test split as in the IPython Notebook from lecture:

In [19]:
train_data,test_data = products.random_split(.8, seed=0)
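The role of the seed can be illustrated with a small plain-Python sketch: each row goes to the first part with the given probability, and fixing the seed makes the split reproducible. This is not GraphLab's actual algorithm, so it will not reproduce the same rows; random_split_sketch is a hypothetical helper.

```python
import random

def random_split_sketch(rows, fraction, seed):
    """Send each row to the first list with probability `fraction`."""
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        (first if rng.random() < fraction else second).append(row)
    return first, second

train_rows, test_rows = random_split_sketch(list(range(1000)), 0.8, seed=0)
print(len(train_rows))
print(len(test_rows))
```

Because each row is assigned independently, the split is only approximately 80/20.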

Train a logistic regression classifier (use graphlab.logistic_classifier.create) using just the selected_words. Hint: you can pass features=selected_words in the .create() call to specify that the features used are exactly the new columns you just created:

In [20]:
selected_words_model = graphlab.logistic_classifier.create(train_data,
                                                     target='sentiment',
                                                     features=selected_words,
                                                     validation_set=test_data)

A trained model such as selected_words_model exposes its learned weights in the ‘coefficients’ field. Sort these coefficients by the ‘value’ column using .sort(). Out of the 11 words in selected_words, which one got the most positive weight? Which one got the most negative weight? Do these values make sense to you?

In [27]:
coefs_sorted = selected_words_model['coefficients'].sort(sort_columns='value', 
                                          ascending=False)

In [29]:
coefs_sorted.head(1)

name,index,class,value,stderr
love,,1,1.39989834302,0.0287147460124


In [30]:
coefs_sorted.tail(1)

name,index,class,value,stderr
terrible,,1,-2.09049998487,0.0967241912229
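To get a feel for what these weights mean, recall that in logistic regression each additional occurrence of a word multiplies the odds of the positive class by exp(weight), holding the other features fixed. The sketch below applies that to the two coefficient values shown above; it assumes the model is a standard logistic regression on the raw counts.

```python
import math

# Coefficient values copied from the outputs above.
coefficients = {'love': 1.39989834302, 'terrible': -2.09049998487}

# Each extra occurrence multiplies the odds of a positive label by exp(weight).
odds_multipliers = {word: math.exp(weight) for word, weight in coefficients.items()}

for word in sorted(odds_multipliers):
    print(word + ": " + str(round(odds_multipliers[word], 3)))
```

A multiplier above 1 pushes the prediction toward positive sentiment, and a multiplier below 1 pushes it toward negative sentiment.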


## 3. Comparing the accuracy of different sentiment analysis models

What is the accuracy of the selected_words_model on the test_data? What was the accuracy of the sentiment_model that we learned using all the word counts in the IPython Notebook above from the lectures? What is the accuracy of the majority class classifier on this task? How do the learned models compare with the baseline approach where we are just predicting the majority class?

In [33]:
# Accuracy of the model trained on selected_words

selected_words_model.evaluate(test_data)['accuracy']

0.8431419649291376
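Accuracy, in its standard definition, is the fraction of predictions that match the true labels. A minimal sketch on made-up labels follows; the accuracy helper is hypothetical and is not the code GraphLab runs internally.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return float(matches) / len(true_labels)

# Made-up labels: three of the four predictions are correct.
example_accuracy = accuracy([1, 1, 0, 1], [1, 0, 0, 1])
print(example_accuracy)
```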

In [36]:
# Model from the lecture (using all word counts as features)
sentiment_model = graphlab.logistic_classifier.create(train_data,
                                                     target='sentiment',
                                                     features=['word_count'],
                                                     validation_set=test_data)

In [37]:
sentiment_model.evaluate(test_data)['accuracy']

0.916256305548883

In [43]:
# Accuracy of a model that always predicts the majority class
# (class counts are taken over the full dataset, not only test_data)

n1 = len(products[products['sentiment'] == 1])
n0 = len(products[products['sentiment'] == 0])

print "| 1: " + str(n1) + "| 0: " + str(n0) + " |"


| 1: 140259| 0: 26493 |


In [53]:
print float(n1)/float((n0+n1))

0.841123344847
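The same baseline can be written as a small reusable function: always predict the most common label and measure how often that is right. The sketch below rebuilds the label list from the class counts printed above; majority_class_accuracy is a hypothetical helper.

```python
from collections import Counter

def majority_class_accuracy(labels):
    """Accuracy of always predicting the most common label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return float(most_common_count) / len(labels)

# Class counts taken from the output above (1: 140259, 0: 26493).
labels = [1] * 140259 + [0] * 26493
baseline = majority_class_accuracy(labels)
print(baseline)
```

Seen against this baseline, the 0.843 accuracy of the selected_words_model is only a marginal improvement, while the 0.916 of the sentiment_model is a substantial one. Keep in mind that the baseline here is computed over the full dataset rather than over test_data alone.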


## 4. Interpreting the difference in performance between the models

To understand why the model with all word counts performs better than the one with only the selected_words, we will now examine the reviews for a particular product.

We will investigate a product named ‘Baby Trend Diaper Champ’. (This is a trash can for soiled baby diapers, which keeps the smell contained.) Just like we did for the reviews for the giraffe toy in the IPython Notebook in the lecture video, before we start our analysis you should select all reviews where the product name is ‘Baby Trend Diaper Champ’. Let’s call this table diaper_champ_reviews.

In [69]:
diaper_champ_reviews = products[products['name'] == 'Baby Trend Diaper Champ']

Use the sentiment_model to predict the sentiment of each review in diaper_champ_reviews and sort the results according to their ‘predicted_sentiment’.

In [70]:
diaper_champ_reviews['predicted_sentiment'] = sentiment_model.predict(diaper_champ_reviews, 
                                                                      output_type = 'probability')

In [71]:
diaper_champ_reviews_sorted = diaper_champ_reviews.sort('predicted_sentiment', ascending = False)

What is the ‘predicted_sentiment’ for the most positive review for ‘Baby Trend Diaper Champ’ according to the sentiment_model from the IPython Notebook from lecture?

In [72]:
diaper_champ_reviews_sorted.head(1)

name,review,rating,word_count,sentiment,awesome
Baby Trend Diaper Champ,Baby Luke can turn a clean diaper to a dirty ...,5.0,"{'all': 1, 'less': 1, ""friend's"": 1, '(which': ...",1,0

great,fantastic,amazing,love,horrible,bad,terrible,awful,wow,hate,predicted_sentiment
0,0,0,0,0,0,0,0,0,0,0.999999937267


Now use the selected_words_model you learned using just the selected_words to predict the sentiment of the most positive review you found above.

In [73]:
diaper_champ_reviews['predicted_sentiment_select'] = selected_words_model.predict(diaper_champ_reviews, 
                                                                      output_type = 'probability')

In [75]:
diaper_champ_reviews_sorted = diaper_champ_reviews.sort('predicted_sentiment', ascending = False)

In [76]:
diaper_champ_reviews_sorted.head(1)

name,review,rating,word_count,sentiment,awesome
Baby Trend Diaper Champ,Baby Luke can turn a clean diaper to a dirty ...,5.0,"{'all': 1, 'less': 1, ""friend's"": 1, '(which': ...",1,0

great,fantastic,amazing,love,horrible,bad,terrible,awful,wow,hate,predicted_sentiment
0,0,0,0,0,0,0,0,0,0,0.999999937267

predicted_sentiment_select
0.796940851291


Why is the predicted_sentiment for the most positive review found using the model with all word counts (sentiment_model) much more positive than the one using only the selected_words (selected_words_model)? Hint: examine the text of this review, the extracted word counts for all words, and the word counts for each of the selected_words, and you will see what each model used to make its prediction.

In [77]:
diaper_champ_reviews_sorted.head(1)['review']

dtype: str
Rows: 1
['Baby Luke can turn a clean diaper to a dirty diaper in 3 seconds flat. The diaper champ turns the smelly diaper into "what diaper smell" in less time than that. I hesitated and wondered what I REALLY needed for the nursery. This is one of the best purchases we made. The champ, the baby bjorn, fluerville diaper bag, and graco pack and play bassinet all vie for the best baby purchase.Great product, easy to use, economical, effective, absolutly fabulous.UpdateI knew that I loved the champ, and useing the diaper genie at a friend's house REALLY reinforced that!! There is no comparison, the chanp is easy and smell free, the genie was difficult to use one handed (which is absolutly vital if you have a little one on a changing pad) and there was a deffinite odor eminating from the genieplus we found that the quick tie garbage bags where the ties are integrated into the bag work really well because there isn't any added bulk around the sealing edge of the champ.']

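We see that the review contains none of the words we selected as features: all of its selected_words counts are 0. “Great” appears only fused to the preceding word (“purchase.Great”), and “loved” is not counted as “love”. The selected_words_model therefore has nothing to go on beyond its intercept, whereas the sentiment_model can use every word in the review.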
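The 0.797 prediction can be read through the logistic function. Assuming the model is a standard logistic regression with score = intercept + sum of (weight × count), a review whose selected_words counts are all 0 gets a score equal to the intercept alone. The intercept is not shown in the outputs above, so the sketch below infers it from the reported probability rather than reading it from the model.

```python
import math

def sigmoid(score):
    """Map a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-score))

def logit(probability):
    """Inverse of sigmoid: the log-odds behind a probability."""
    return math.log(probability / (1.0 - probability))

# Probability reported above for the review with no selected words.
p_selected = 0.796940851291

# With every feature at 0 the score is just the intercept, so the
# intercept implied by that prediction is logit(p_selected).
implied_intercept = logit(p_selected)
print(implied_intercept)
print(sigmoid(implied_intercept))
```

Under this assumption, any review that contains none of the selected words receives the same predicted probability, regardless of what else it says.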