In [1]:
# Modules

import turicreate as tc

In [2]:
# Load data
data = tc.SFrame('amazon_baby.sframe')

In the Jupyter notebook above, we used the word counts for all words in the reviews to train the sentiment classifier model. Now, we are going to follow a similar path, but only use this subset of the words:

In [3]:
selected_words = ['awesome', 'great', 'fantastic', 'amazing',
                  'love', 'horrible', 'bad', 'terrible', 'awful',
                  'wow', 'hate']

**Use .apply() to build a new feature with the counts for each of the selected_words:** In the notebook above, we created a column ‘word_count’ with the word counts for each review. Our first task is to add one new column to the products SFrame (loaded as `data` in this notebook) for each word in selected_words, holding the count of that word in each review. In the process, we will see how the method .apply() can be combined with a Python function to create new columns (features) in our data.

Our first goal is to create a column products[‘awesome’] where each row contains the number of times the word ‘awesome’ showed up in the review for the corresponding product, and 0 if the word didn’t show up. One way to do this is to look at each row of the ‘word_count’ column and follow this logic:

+ If ‘awesome’ shows up in the word counts for a particular product (row of the products SFrame), then we know how often ‘awesome’ appeared in the review.
+ If ‘awesome’ doesn’t appear in the word counts, then it didn’t appear in the review, and we should set the count for ‘awesome’ to 0 in this review.
+ First, you will use a Python function to define the logic above. You will write a function called awesome_count which takes in the word counts and returns the number of times ‘awesome’ appears in the review.
+ Next, you will use .apply() to run awesome_count on each row of products[‘word_count’] and create a new column called ‘awesome’ with the resulting counts.
+ Repeat this process for the other 10 words in selected_words. (Here, we described a simple procedure to obtain the counts for each selected_word. There are other more efficient ways of doing this, and we encourage you to explore this further.)
+ Using the .sum() method on each of the new columns you created, answer the following questions: Out of the selected_words, which one is most used in the dataset? Which one is least used? Save these results to answer the quiz at the end.
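
The counting logic described above can be sketched in plain Python. This is only an illustration: a list of dictionaries stands in for the ‘word_count’ column, and the three sample reviews are made up rather than taken from the dataset. With an SFrame, the same function would be passed to .apply() on the ‘word_count’ column, as the steps above describe.

```python
def awesome_count(word_counts):
    """Return how often 'awesome' occurs in one review's word counts, or 0 if absent."""
    if 'awesome' in word_counts:
        return word_counts['awesome']
    return 0


# Hypothetical word counts for three reviews (illustrative data, not from the dataset).
word_count_column = [
    {'this': 1, 'is': 1, 'awesome': 2},
    {'not': 1, 'great': 1},
    {},
]

# Plain-Python stand-in for applying the function to every row of the column.
awesome_column = [awesome_count(wc) for wc in word_count_column]
print(awesome_column)

# Summing the new column gives the total number of occurrences of the word.
print(sum(awesome_column))
```

The explicit `if`/`return` pair mirrors the two cases in the list; `word_counts.get('awesome', 0)` is an equivalent one-line form.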

In [4]:
data['word_count'] = tc.text_analytics.count_words(data['review'])

In [5]:
for word in selected_words:
    # Bind `word` as a default argument so each lambda keeps its own word
    data[word] = data['word_count'].apply(lambda counts, word=word: counts.get(word, 0))

In [6]:
sum_values = {}
for word in selected_words:
    sum_values[word] = data[word].sum()

In [7]:
print(f'Most used word:  {max(sum_values, key=lambda k: sum_values[k])}')
print(f'Least used word: {min(sum_values, key=lambda k: sum_values[k])}')

Most used word:  great
Least used word: wow


**Create a new sentiment analysis model using only the selected_words as features:** In the Jupyter Notebook above, we used word counts for all words as features for our sentiment classifier. Now, you are just going to use the selected_words:
+ Use the same train/test split as in the Jupyter Notebook from lecture
+ Train a logistic regression classifier (use turicreate.logistic_classifier.create) using just the selected_words. Hint: pass features=selected_words in the .create() call so that the features used are exactly the new columns you just created.
+ You will now examine the weights the learned classifier assigned to each of the 11 words in selected_words and gain intuition as to what the ML algorithm did for your data using these features. In Turi Create, a learned model, such as the selected_words_model, has a field 'coefficients', which lets you look at the learned coefficients.
+ Using this approach, sort the learned coefficients according to the ‘value’ column using .sort(). Out of the 11 words in selected_words, which one got the most positive weight? Which one got the most negative weight? Do these values make sense to you? Save these results to answer the quiz at the end.

In [8]:
# Derive the sentiment label from the review's star rating
# Drop 3-star reviews, whose sentiment is undefined
data = data[data['rating'] != 3]
# Positive sentiment: 4- and 5-star reviews
data['sentiment'] = data['rating'] >= 4

In [9]:
train_data, test_data = data.random_split(0.8, seed=0)

In [10]:
selected_words_model = tc.logistic_classifier.create(train_data,
                                                     features=selected_words,
                                                     target='sentiment',
                                                     validation_set=test_data)

In [11]:
selected_words_model.coefficients.sort('value', ascending=False) \
    .print_rows(num_rows=selected_words_model.coefficients.shape[0],
                num_columns=selected_words_model.coefficients.shape[1])

+-------------+-------+-------+-----------------------+----------------------+
|     name    | index | class |         value         |        stderr        |
+-------------+-------+-------+-----------------------+----------------------+
|     love    |  None |   1   |   1.3592688669225153  | 0.028068300152099435 |
| (intercept) |  None |   1   |   1.3365913848877558  | 0.008929969787656753 |
|   awesome   |  None |   1   |   1.133534666034145   | 0.08399643983187526  |
|   amazing   |  None |   1   |   1.1000933113660283  | 0.09954776260465983  |
|  fantastic  |  None |   1   |   0.8858047568814295  | 0.11167591293399656  |
|    great    |  None |   1   |   0.8630655001196618  | 0.018955052444377323 |
|     wow     |  None |   1   | -0.009538236067678897 |  0.1604641122471166  |
|     bad     |  None |   1   |  -0.9914778800650565  | 0.03848428664699063  |
|     hate    |  None |   1   |  -1.3484407222463124  | 0.07715698604297333  |
|    awful    |  None |   1   |  -2.0529082040313513

In [12]:
data['predicted_sentiment'] = selected_words_model.predict(data,
                                                           output_type='probability')

**Comparing the accuracy of different sentiment analysis models:** Using the method `.evaluate(test_data)`, answer the following: What is the accuracy of the selected_words_model on the test_data? What was the accuracy of the sentiment_model that we learned using all the word counts in the Jupyter Notebook from the lectures? What is the accuracy of the majority class classifier on this task? How do the learned models compare with the baseline approach where we just predict the majority class? Save these results to answer the quiz at the end.

In [13]:
selected_words_model.evaluate(test_data)

{'accuracy': 0.8463848186404036,
 'auc': 0.6936022046674926,
 'confusion_matrix': Columns:
 	target_label	int
 	predicted_label	int
 	count	int
 
 Rows: 4
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |      1       |        0        |  159  |
 |      0       |        0        |  371  |
 |      0       |        1        |  4957 |
 |      1       |        1        | 27817 |
 +--------------+-----------------+-------+
 [4 rows x 3 columns],
 'f1_score': 0.9157860082304526,
 'log_loss': 0.3962265467087378,
 'precision': 0.8487520595594068,
 'recall': 0.9943165570488991,
 'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 
 Rows: 100001
 
 Data:
 +-----------+--------------------+-----+-------+------+
 | threshold |        fpr         | tpr |   p   |  n   |
 +-----------+--------------------+-----+-------+------+
 |    0.0    |        1.0         | 1.0 | 27976 | 532

In [14]:
# Majority classifier
neg_num = test_data[test_data['rating'] == 1].num_rows() \
              + test_data[test_data['rating'] == 2].num_rows()
zero_num = test_data[test_data['rating'] == 3].num_rows()
pos_num = test_data[test_data['rating'] == 4].num_rows() \
              + test_data[test_data['rating'] == 5].num_rows()
total = test_data.num_rows()

In [20]:
print(f'Number of negative reviews   = {neg_num}')
print(f'Number of undefined opinions = {zero_num}')
print(f'Number of positive reviews   = {pos_num}')
print(f'Total of reviews             = {total}')
print(f'Accuracy of the majority     = {pos_num / total}')

Number of negative reviews   = 5328
Number of undefined opinions = 0
Number of positive reviews   = 27976
Total of reviews             = 33304
Accuracy of the majority     = 0.8400192169108815
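
As a cross-check, the metrics reported by `.evaluate(test_data)` can be recomputed from the four confusion-matrix counts printed above. The sketch below is plain Python with those counts copied in by hand; the variable names are illustrative and not part of Turi Create.

```python
# Counts copied from the confusion matrix printed by evaluate(test_data).
tp = 27817  # target 1, predicted 1
fn = 159    # target 1, predicted 0
fp = 4957   # target 0, predicted 1
tn = 371    # target 0, predicted 0

total = tp + fn + fp + tn

accuracy = (tp + tn) / total   # fraction of all reviews classified correctly
precision = tp / (tp + fp)     # fraction of predicted positives that are positive
recall = tp / (tp + fn)        # fraction of true positives that were found

# Majority-class baseline: always predict the positive class.
majority_accuracy = (tp + fn) / total

print(f'accuracy          = {accuracy:.4f}')
print(f'precision         = {precision:.4f}')
print(f'recall            = {recall:.4f}')
print(f'majority accuracy = {majority_accuracy:.4f}')
print(f'gain over majority baseline = {accuracy - majority_accuracy:.4f}')
```

The confusion matrix shows where the small gain comes from: the model labels almost every review as positive, so it recovers only 371 of the 5,328 negative reviews.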


**Interpreting the difference in performance between the models:** To understand why the model with all word counts performs better than the one with only the selected_words, we will now examine the reviews for a particular product.
+ We will investigate a product named ‘Baby Trend Diaper Champ’. (This is a trash can for soiled baby diapers, which keeps the smell contained.)
+ Just like we did for the reviews for the giraffe toy in the Jupyter Notebook in the lecture video, before we start our analysis you should select all reviews where the product name is ‘Baby Trend Diaper Champ’. Let’s call this table diaper_champ_reviews.
+ Again, just as in the video, use the sentiment_model to predict the sentiment of each review in diaper_champ_reviews and sort the results according to their ‘predicted_sentiment’.
+ What is the ‘predicted_sentiment’ for the most positive review for ‘Baby Trend Diaper Champ’ according to the sentiment_model from the Jupyter Notebook from lecture? Save this result to answer the quiz at the end.
+ Now use the selected_words_model you learned using just the selected_words to predict the sentiment of the most positive review you found above. Save this result to answer the quiz at the end.
+ Why is the predicted_sentiment for the most positive review found using the model with all word counts (sentiment_model) much more positive than the one using only the selected_words (selected_words_model)? Hint: examine the text of this review, the extracted word counts for all words, and the word counts for each of the selected_words, and you will see what each model used to make its prediction. Save this result to answer the quiz at the end.

In [16]:
diaper_champ_reviews = data[data['name'] == 'Baby Trend Diaper Champ']

In [17]:
diaper_champ_reviews = diaper_champ_reviews.sort('predicted_sentiment',
                                                 ascending=False)

In [18]:
diaper_champ_reviews['predicted_sentiment'][0]

0.9981253623335122

In [19]:
diaper_champ_reviews['review'][0]

'I LOVE LOVE LOVE this product! It is SO much easier to use than the Diaper Genie, (you need a PHD in poopy to figure out how to use the darn thing!) and it even takes the same bags as my kitchen trash can, shich is super convenient, and cost efficient as I can buy them in bulk.The only reason for not rating it a 5 star was that I did have one small problem with it. The foam gasket in the barrell which keeps the poopy smell inside the unit ripped somehow, and it got VERY stinky. HOWEVER, I contacted the manufacturer though their website, and received an email back the same day stating that this was unusual, and that replacement gaskets were on their way to me. They arrived inside of a week and after replacing, it works great again! (They even sent me extras should it happen again)I HIGHLY reccomend this diaper pail over ANY competitors, you will not be sorry!'

In [49]:
# Locate the review that sentiment_model ranked most positive (it starts with 'Ok ')
for idx, rev in enumerate(diaper_champ_reviews):
    if rev['review'].startswith('Ok '):
        print(rev['review'])
        print(f'idx = {idx}')

Ok - newsflash.  Diapers are just smelly.  We've had this pail for 2.5 years now.  It was our first and primary one.  There were no major smell problems until after one year, when our son started eating solids.  Also, we change the bag twice weekly as 3 days is about the max for smell-containment.  Around 20-22 months we started shopping for a container that would be less smelly and didn't find one as good.  (We have a cheaper one upstairs which broke immediately and always stunk!)  We finally just put the Diaper Champ in the attic a few months ago and use the cheap one with the flip-up lid - mainly since the cheapo fits inside the cabinet and we didn't notice a big difference in smell-control.  (The most helpful action is to tie the dirty diapers inside a small plastic bag before putting them in the pail.)A couple of our friends have this pail and were pleased until the children started eating solid food and things got stinkier - but that's pretty much the consensus according to many 

In [50]:
diaper_champ_reviews['predicted_sentiment'][240]

0.7919288370624453
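
The two probabilities printed above can be reproduced by hand from the selected_words_model coefficients, which shows what the model actually used. The sketch below is plain Python and rests on assumptions: the word counts for the top review ('love' three times, 'great' once) were counted by hand from its text, assuming the word counter lowercases words, and the review starting with 'Ok ' is assumed to contain none of the selected words (its printed text is truncated).

```python
import math


def sigmoid(score):
    """Logistic function: maps a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-score))


# Coefficients copied from the selected_words_model table printed earlier.
intercept = 1.3365913848877558
weights = {
    'love': 1.3592688669225153,
    'great': 0.8630655001196618,
}

# Assumption: the 'Ok ...' review contains none of the selected words,
# so its score is just the intercept.
p_no_selected_words = sigmoid(intercept)

# Assumption: hand-counted occurrences in the 'I LOVE LOVE LOVE ...' review.
top_review_counts = {'love': 3, 'great': 1}
top_score = intercept + sum(weights[w] * c for w, c in top_review_counts.items())
p_top_review = sigmoid(top_score)

print(f'no selected words: {p_no_selected_words:.6f}')
print(f'top review:        {p_top_review:.6f}')
```

Under these assumptions, a review with no selected words receives the same probability regardless of its content, because only the intercept contributes to its score.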