# Predicting sentiment from product reviews

# Fire up GraphLab Create

In [39]:
import graphlab
from functools import partial
from collections import OrderedDict

# Read some product review data

Loading reviews for a set of baby products. 

In [42]:
products = graphlab.SFrame('amazon_baby.gl/')

# Let's explore this data together

The data includes the product name, the review text, and the rating of the review.

In [43]:
products.head()

name,review,rating
Planetwise Flannel Wipes,"These flannel wipes are OK, but in my opinion ...",3.0
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0
Stop Pacifier Sucking without tears with ...,This is a product well worth the purchase. I ...,5.0
Stop Pacifier Sucking without tears with ...,All of my kids have cried non-stop when I tried to ...,5.0
Stop Pacifier Sucking without tears with ...,"When the Binky Fairy came to our house, we didn't ...",5.0
A Tale of Baby's Days with Peter Rabbit ...,"Lovely book, it's bound tightly so you may no ...",4.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",Perfect for new parents. We were able to keep ...,5.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",A friend of mine pinned this product on Pinte ...,5.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",This has been an easy way for my nanny to record ...,4.0


# Build the word count vector for each review

In [44]:
products['word_count'] = graphlab.text_analytics.count_words(products['review'])
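
As a rough illustration of what the new column holds, the sketch below builds a similar dictionary in plain Python. It assumes lowercasing and splitting on whitespace only, which is consistent with keys such as 'parents!!' and 'feeding,' in the output that follows. `simple_count_words` is a hypothetical helper, not GraphLab's implementation.

```python
from collections import Counter

def simple_count_words(text):
    """Return a {token: count} dict, lowercasing and splitting on whitespace."""
    return dict(Counter(text.lower().split()))

counts = simple_count_words("Love it, love it. Highly recommended")
# Punctuation stays attached, so 'it,' and 'it.' are separate keys
print(counts)
```

Because punctuation is not stripped in this sketch, a word followed by a comma or period is counted under a different key than the bare word.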

In [45]:
products.head()

name,review,rating,word_count
Planetwise Flannel Wipes,"These flannel wipes are OK, but in my opinion ...",3.0,"{'and': 5L, 'stink': 1L, 'because': 1L, 'order ..."
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,"{'and': 3L, 'love': 1L, 'it': 2L, 'highly': 1L, ..."
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,"{'and': 2L, 'quilt': 1L, 'it': 1L, 'comfortable': ..."
Stop Pacifier Sucking without tears with ...,This is a product well worth the purchase. I ...,5.0,"{'ingenious': 1L, 'and': 3L, 'love': 2L, ..."
Stop Pacifier Sucking without tears with ...,All of my kids have cried non-stop when I tried to ...,5.0,"{'and': 2L, 'parents!!': 1L, 'all': 2L, 'puppe ..."
Stop Pacifier Sucking without tears with ...,"When the Binky Fairy came to our house, we didn't ...",5.0,"{'and': 2L, 'cute': 1L, 'help': 2L, 'doll': 1L, ..."
A Tale of Baby's Days with Peter Rabbit ...,"Lovely book, it's bound tightly so you may no ...",4.0,"{'shop': 1L, 'be': 1L, 'is': 1L, 'it': 1L, ' ..."
"Baby Tracker&reg; - Daily Childcare Journal, ...",Perfect for new parents. We were able to keep ...,5.0,"{'feeding,': 1L, 'and': 2L, 'all': 1L, 'right': ..."
"Baby Tracker&reg; - Daily Childcare Journal, ...",A friend of mine pinned this product on Pinte ...,5.0,"{'and': 1L, 'help': 1L, 'give': 1L, 'is': 1L, ..."
"Baby Tracker&reg; - Daily Childcare Journal, ...",This has been an easy way for my nanny to record ...,4.0,"{'journal.': 1L, 'all': 1L, 'standarad': 1L, ..."


In [46]:
def positive_count(word, word_count):
    if word in word_count:
        return word_count[word]
    else:
        return 0

In [47]:
selected_words = ['awesome', 'great', 'fantastic', 'amazing', 'love', 'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']

In [48]:
# Use .apply() to build a new feature with the counts for each of the selected_words
products['awesome'] = products['word_count'].apply(partial(positive_count, 'awesome'))
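
The .apply() call above hands each review's word-count dictionary to a one-argument function, and partial() is what turns the two-argument lookup into one. A minimal sketch on plain dictionaries, where `count_of` and the toy `rows` are made up for illustration:

```python
from functools import partial

def count_of(word, word_count):
    # dict.get returns the default (0) when the word is absent
    return word_count.get(word, 0)

# Fix the first argument, leaving a function of word_count only
count_awesome = partial(count_of, 'awesome')

rows = [{'awesome': 1, 'toy': 2}, {'gate': 1}]  # toy word-count dicts
print([count_awesome(wc) for wc in rows])  # [1, 0]
```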

In [49]:
awesome_products = products[products['awesome'] > 0]

In [50]:
awesome_products

name,review,rating,word_count,awesome
Pedal Farm Tractor,I bought this for my son when he was 3 years old. ...,5.0,"{'and': 3L, 'this': 2L, 'old': 1L, 'purchased': ...",1
Thomas &amp; Friends - 3 Piece Dinnerware Set- ...,This dining ware set is awesome for the Thomas ...,5.0,"{'and': 1L, 'thomas': 1L, 'set': 1L, 'awesome': ...",1
Munchkin Mozart Magic Cube ...,The Mozart magic cube is an AWESOME toy for my ...,5.0,"{'and': 3L, 'old': 2L, 'classic': 1L, 'enough': ...",1
Munchkin Mozart Magic Cube ...,Our daughter got this toy for her first birthday. ...,4.0,"{'and': 4L, 'grandfather!': 1L, ...",1
Evenflo Top of Stair Gate,"Awesome gate. It is sturdy, so its very well ...",5.0,"{'and': 1L, 'the': 4L, 'childproof': 1L, ...",1
Animal Planet's Big Tub of Dinosaurs ...,This is an awesome complete set of dinos ...,5.0,"{'we': 1L, 'set': 1L, 'price.': 1L, ...",1
"Graco TotBloc Pack 'N Play with Carry Bag, ...",I ordered this because my 23 lb 30 inch long 7 ...,1.0,"{'month': 1L, 'sleep': 1L, 'still': 1L, 'its': ...",1
Philips AVENT Isis On The Go Set ...,I based my decision to purchase this pump based ...,2.0,"{'all': 3L, ""don't"": 1L, 'baby': 1L, 'ounces': ...",1
Philips AVENT Isis On The Go Set ...,I loved this pump. I had my first child this past ...,5.0,"{'feed': 2L, 'and': 5L, 'inexpensive.': 1L, ...",1
The First Years Nature Sensations Lullaby Pl ...,Our son had problems falling asleep and ...,5.0,"{'all': 1L, 'just': 1L, 'saver': 1L, 'toy.and': ...",1


In [53]:
# selected_words = ['awesome', 'great', 'fantastic', 'amazing', 'love', 'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']
for word in selected_words:
    products[word] = products['word_count'].apply(partial(positive_count, word))

In [54]:
products.head()

name,review,rating,word_count,awesome,great
Planetwise Flannel Wipes,"These flannel wipes are OK, but in my opinion ...",3.0,"{'and': 5L, 'stink': 1L, 'because': 1L, 'order ...",0,0.0
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,"{'and': 3L, 'love': 1L, 'it': 2L, 'highly': 1L, ...",0,0.0
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,"{'and': 2L, 'quilt': 1L, 'it': 1L, 'comfortable': ...",0,0.0
Stop Pacifier Sucking without tears with ...,This is a product well worth the purchase. I ...,5.0,"{'ingenious': 1L, 'and': 3L, 'love': 2L, ...",0,0.0
Stop Pacifier Sucking without tears with ...,All of my kids have cried non-stop when I tried to ...,5.0,"{'and': 2L, 'parents!!': 1L, 'all': 2L, 'puppe ...",0,1.0
Stop Pacifier Sucking without tears with ...,"When the Binky Fairy came to our house, we didn't ...",5.0,"{'and': 2L, 'cute': 1L, 'help': 2L, 'doll': 1L, ...",0,1.0
A Tale of Baby's Days with Peter Rabbit ...,"Lovely book, it's bound tightly so you may no ...",4.0,"{'shop': 1L, 'be': 1L, 'is': 1L, 'it': 1L, ' ...",0,0.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",Perfect for new parents. We were able to keep ...,5.0,"{'feeding,': 1L, 'and': 2L, 'all': 1L, 'right': ...",0,0.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",A friend of mine pinned this product on Pinte ...,5.0,"{'and': 1L, 'help': 1L, 'give': 1L, 'is': 1L, ...",0,0.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",This has been an easy way for my nanny to record ...,4.0,"{'journal.': 1L, 'all': 1L, 'standarad': 1L, ...",0,0.0

fantastic,amazing,love,horrible,bad,terrible,awful,wow,hate
0.0,0.0,0.0,0,0.0,0,0,0,0
0.0,0.0,1.0,0,0.0,0,0,0,0
0.0,0.0,0.0,0,0.0,0,0,0,0
0.0,0.0,2.0,0,0.0,0,0,0,0
0.0,0.0,0.0,0,0.0,0,0,0,0
0.0,0.0,0.0,0,0.0,0,0,0,0
0.0,0.0,0.0,0,0.0,0,0,0,0
0.0,0.0,0.0,0,0.0,0,0,0,0
0.0,0.0,0.0,0,0.0,0,0,0,0
0.0,0.0,0.0,0,0.0,0,0,0,0


In [55]:
# selected_words = ['awesome', 'great', 'fantastic', 'amazing', 'love', 'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']
sum_data = dict()
for word in selected_words:
    sum_data[word] = products[word].sum()

In [56]:
# word_count sum sorted by value
sorted_sum_data = OrderedDict(sorted(sum_data.items(), key=lambda t: t[1]))
sorted_sum_data

OrderedDict([('wow', 144L), ('awful', 383L), ('horrible', 734L), ('terrible', 748L), ('fantastic', 932.0), ('hate', 1220L), ('amazing', 1363.0), ('awesome', 2090L), ('bad', 3724.0), ('love', 42065.0), ('great', 45206.0)])
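
To read off the most and least used words directly, the totals can also be sorted in descending order. A small sketch using a handful of the totals above, copied by hand:

```python
# A few of the totals shown above, copied by hand for illustration
totals = {'wow': 144, 'hate': 1220, 'love': 42065, 'great': 45206}

# Sort (word, total) pairs by total, largest first
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
most_used = ranked[0][0]
least_used = ranked[-1][0]
print("%s %s" % (most_used, least_used))  # great wow
```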

In [57]:
graphlab.canvas.set_target('ipynb')

In [58]:
products['rating'].show(view='Categorical')

## Define what's a positive and a negative sentiment

We will ignore all reviews with rating = 3, since they tend to have a neutral sentiment. Reviews with a rating of 4 or higher will be considered positive, while those with a rating of 2 or lower will be considered negative.

In [59]:
#ignore all 3* reviews
products = products[products['rating'] != 3]

In [60]:
#positive sentiment = 4* or 5* reviews
products['sentiment'] = products['rating'] >=4
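
The same labeling rule, applied to plain Python numbers rather than an SFrame column, might look like this. `label_sentiment` is a hypothetical helper used only for illustration.

```python
def label_sentiment(rating):
    """Return 1 for positive (>= 4), 0 for negative (<= 2), None for neutral (3)."""
    if rating == 3:
        return None
    return 1 if rating >= 4 else 0

ratings = [3.0, 5.0, 4.0, 1.0, 2.0]  # toy ratings
# Drop the neutral reviews, then keep the 0/1 label for the rest
labels = [label_sentiment(r) for r in ratings if r != 3]
print(labels)  # [1, 1, 0, 0]
```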

In [61]:
products.head()

name,review,rating,word_count,awesome,great
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,"{'and': 3L, 'love': 1L, 'it': 2L, 'highly': 1L, ...",0,0.0
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,"{'and': 2L, 'quilt': 1L, 'it': 1L, 'comfortable': ...",0,0.0
Stop Pacifier Sucking without tears with ...,This is a product well worth the purchase. I ...,5.0,"{'ingenious': 1L, 'and': 3L, 'love': 2L, ...",0,0.0
Stop Pacifier Sucking without tears with ...,All of my kids have cried non-stop when I tried to ...,5.0,"{'and': 2L, 'parents!!': 1L, 'all': 2L, 'puppe ...",0,1.0
Stop Pacifier Sucking without tears with ...,"When the Binky Fairy came to our house, we didn't ...",5.0,"{'and': 2L, 'cute': 1L, 'help': 2L, 'doll': 1L, ...",0,1.0
A Tale of Baby's Days with Peter Rabbit ...,"Lovely book, it's bound tightly so you may no ...",4.0,"{'shop': 1L, 'be': 1L, 'is': 1L, 'it': 1L, ' ...",0,0.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",Perfect for new parents. We were able to keep ...,5.0,"{'feeding,': 1L, 'and': 2L, 'all': 1L, 'right': ...",0,0.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",A friend of mine pinned this product on Pinte ...,5.0,"{'and': 1L, 'help': 1L, 'give': 1L, 'is': 1L, ...",0,0.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",This has been an easy way for my nanny to record ...,4.0,"{'journal.': 1L, 'all': 1L, 'standarad': 1L, ...",0,0.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",I love this journal and our nanny uses it ...,4.0,"{'all': 1L, 'forget': 1L, 'just': 1L, ""daughter ...",0,0.0

fantastic,amazing,love,horrible,bad,terrible,awful,wow,hate,sentiment
0.0,0.0,1.0,0,0.0,0,0,0,0,1
0.0,0.0,0.0,0,0.0,0,0,0,0,1
0.0,0.0,2.0,0,0.0,0,0,0,0,1
0.0,0.0,0.0,0,0.0,0,0,0,0,1
0.0,0.0,0.0,0,0.0,0,0,0,0,1
0.0,0.0,0.0,0,0.0,0,0,0,0,1
0.0,0.0,0.0,0,0.0,0,0,0,0,1
0.0,0.0,0.0,0,0.0,0,0,0,0,1
0.0,0.0,0.0,0,0.0,0,0,0,0,1
0.0,0.0,2.0,0,0.0,0,0,0,0,1


## Split the data into test and training

In [62]:
train_data,test_data = products.random_split(.8, seed=0)
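
For intuition, a seeded random split can be sketched in plain Python as below. This is an assumption about the general idea (each row goes to the first set with probability 0.8), not GraphLab's actual algorithm, and `split_rows` is a hypothetical name.

```python
import random

def split_rows(rows, fraction, seed=0):
    """Send each row to the first list with probability `fraction`."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    first, second = [], []
    for row in rows:
        if rng.random() < fraction:
            first.append(row)
        else:
            second.append(row)
    return first, second

train, test = split_rows(list(range(1000)), 0.8, seed=0)
print(len(train) + len(test))  # 1000
```

The split sizes are only approximately 80/20, but fixing the seed makes the same rows land in the same set on every run.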

In [63]:
# Baseline: accuracy of always predicting the majority (positive) class on test_data
float(len(test_data[test_data['sentiment'] > 0])) / len(test_data)

0.8400192169108815
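
The cell above relies on the positive class being the majority. A slightly more general sketch, which finds the majority label first, is shown here on toy labels; `majority_class_accuracy` is a hypothetical helper.

```python
from collections import Counter

def majority_class_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return float(most_common_count) / len(labels)

toy_labels = [1, 1, 1, 1, 0]  # toy data, not the review labels
print(majority_class_accuracy(toy_labels))  # 0.8
```

Any learned model should beat this number to be worth using.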

# Build a selected words sentiment classifier

In [64]:
selected_words_model = graphlab.logistic_classifier.create(train_data,
                                                     target='sentiment',
                                                     features=selected_words,
                                                     validation_set=test_data)

In [65]:
# Sort the learned coefficients by the 'value' column using .sort().
# Q. Out of the 11 words in selected_words, which one got the most positive weight? Which one got the most negative weight?
selected_words_model['coefficients'].sort('value').print_rows(num_rows=12, num_columns=4)

+-------------+-------+-------+------------------+
|     name    | index | class |      value       |
+-------------+-------+-------+------------------+
|   terrible  |  None |   1   |  -2.09049998487  |
|   horrible  |  None |   1   |  -1.99651800559  |
|    awful    |  None |   1   |  -1.76469955631  |
|     hate    |  None |   1   |  -1.40916406276  |
|     bad     |  None |   1   | -0.985827369929  |
|     wow     |  None |   1   | -0.0541450123333 |
|    great    |  None |   1   |  0.883937894898  |
|  fantastic  |  None |   1   |  0.891303090304  |
|   amazing   |  None |   1   |  0.892802422508  |
|   awesome   |  None |   1   |  1.05800888878   |
| (intercept) |  None |   1   |  1.36728315229   |
|     love    |  None |   1   |  1.39989834302   |
+-------------+-------+-------+------------------+
[12 rows x 4 columns]
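
One way to read these values, assuming a standard logistic regression, is that exp(coefficient) is the factor by which the odds of a positive label change for each extra occurrence of the word. The sketch below uses two coefficients copied from the table.

```python
import math

# Two coefficients copied from the table above
coefficients = {'love': 1.39989834302, 'terrible': -2.09049998487}

# exp(coefficient): multiplicative change in the odds of a positive
# label per extra occurrence of the word
odds_factors = dict((w, math.exp(c)) for w, c in coefficients.items())
print(odds_factors)
```

Under that reading, each additional 'love' multiplies the odds of a positive review by roughly 4, while each 'terrible' shrinks them to a fraction of their previous value. The '(intercept)' row is not a word, so it is left out when ranking words by weight.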



# Evaluate the selected words sentiment model

In [66]:
selected_words_model.evaluate(test_data, metric='roc_curve')

{'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 
 Rows: 1001
 
 Data:
 +------------------+-------------------+-------------------+-------+------+
 |    threshold     |        fpr        |        tpr        |   p   |  n   |
 +------------------+-------------------+-------------------+-------+------+
 |       0.0        | 0.000187687687688 | 3.57334286225e-05 | 27985 | 5328 |
 | 0.0010000000475  |   0.999812312312  |   0.999964266571  | 27985 | 5328 |
 | 0.00200000009499 |   0.999624624625  |   0.999964266571  | 27985 | 5328 |
 | 0.00300000002608 |   0.999624624625  |   0.999964266571  | 27985 | 5328 |
 | 0.00400000018999 |   0.999436936937  |   0.999964266571  | 27985 | 5328 |
 | 0.00499999988824 |   0.999436936937  |   0.999964266571  | 27985 | 5328 |
 | 0.00600000005215 |   0.999249249249  |   0.999964266571  | 27985 | 5328 |
 | 0.00700000021607 |   0.999249249249  |   0.999964266571  | 27985 | 5328 |
 | 0.00800000037998 |   0.999249249249  |   0.999

In [67]:
selected_words_model.show(view='Evaluation')
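
Each row of a ROC table pairs a threshold with a false positive rate (fpr) and a true positive rate (tpr). A minimal sketch of computing one such point from predicted probabilities is shown below. The `>=` comparison is an assumption (GraphLab's exact convention may differ), and `roc_point` and the toy data are made up for illustration.

```python
def roc_point(probabilities, labels, threshold):
    """Return (fpr, tpr) when predicting positive for probability >= threshold."""
    tp = fp = 0
    for p, y in zip(probabilities, labels):
        if p >= threshold:
            if y == 1:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    return float(fp) / negatives, float(tp) / positives

probs = [0.9, 0.8, 0.4, 0.3]  # toy predictions
truth = [1, 0, 1, 0]          # toy labels
print(roc_point(probs, truth, 0.5))  # (0.5, 0.5)
```

Sweeping the threshold from 0 to 1 and plotting the resulting points traces out the ROC curve.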

# Build a word_count sentiment classifier

In [69]:
sentiment_model = graphlab.logistic_classifier.create(train_data,
                                                     target='sentiment',
                                                     features=['word_count'],
                                                     validation_set=test_data)

# Evaluate the word_count sentiment model

In [70]:
sentiment_model.evaluate(test_data, metric='roc_curve')

{'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 
 Rows: 1001
 
 Data:
 +------------------+----------------+------------------+-------+------+
 |    threshold     |      fpr       |       tpr        |   p   |  n   |
 +------------------+----------------+------------------+-------+------+
 |       0.0        | 0.222159624413 | 0.00424863436752 | 28009 | 5325 |
 | 0.0010000000475  | 0.777840375587 |  0.995751365632  | 28009 | 5325 |
 | 0.00200000009499 | 0.738028169014 |  0.994608875719  | 28009 | 5325 |
 | 0.00300000002608 | 0.715305164319 |  0.994001927952  | 28009 | 5325 |
 | 0.00400000018999 | 0.699906103286 |  0.993502088614  | 28009 | 5325 |
 | 0.00499999988824 | 0.688638497653 |  0.993145060516  | 28009 | 5325 |
 | 0.00600000005215 | 0.678873239437 |  0.992645221179  | 28009 | 5325 |
 | 0.00700000021607 | 0.668356807512 |  0.992288193081  | 28009 | 5325 |
 | 0.00800000037998 | 0.658028169014 |  0.992038273412  | 28009 | 5325 |
 | 0.00899999961257 

In [71]:
sentiment_model.show(view='Evaluation')

Q. What is the accuracy of the selected_words_model on the test_data? What was the accuracy of the sentiment_model that we learned using all the word counts in the IPython Notebook above from the lectures? What is the accuracy of the majority class classifier on this task? How do the learned models compare with the baseline approach where we are just predicting the majority class? Save these results to answer the quiz at the end.

# Interpreting the difference in performance between the models

To understand why the model with all word counts performs better than the one with only the selected_words, we will now examine the reviews for a particular product, 'Baby Trend Diaper Champ'.

In [72]:
diaper_champ_reviews = products[products['name'] == 'Baby Trend Diaper Champ']

## Applying the word_count sentiment model to understand sentiment for Diaper Champ

In [73]:
diaper_champ_reviews['predicted_sentiment'] = sentiment_model.predict(diaper_champ_reviews, output_type='probability')

## Sort the reviews based on the predicted sentiment and explore

In [74]:
diaper_champ_reviews = diaper_champ_reviews.sort('predicted_sentiment', ascending=False)

In [75]:
diaper_champ_reviews.head()

name,review,rating,word_count,awesome,great,fantastic
Baby Trend Diaper Champ,Baby Luke can turn a clean diaper to a dirty ...,5.0,"{'all': 1L, 'less': 1L, ""friend's"": 1L, '(whi ...",0,0.0,0.0
Baby Trend Diaper Champ,I LOOOVE this diaper pail! Its the easies ...,5.0,"{'just': 1L, 'over': 1L, 'rweek': 1L, 'sooo': 1L, ...",0,0.0,0.0
Baby Trend Diaper Champ,We researched all of the different types of di ...,4.0,"{'all': 2L, 'just': 4L, ""don't"": 2L, 'one,': 1L, ...",0,0.0,0.0
Baby Trend Diaper Champ,My baby is now 8 months and the can has been ...,5.0,"{""don't"": 1L, 'when': 1L, 'over': 1L, 'soon': 1L, ...",0,2.0,0.0
Baby Trend Diaper Champ,"This is absolutely, by far, the best diaper ...",5.0,"{'just': 3L, 'money': 1L, 'not': 2L, 'mechanism': ...",0,0.0,0.0
Baby Trend Diaper Champ,Diaper Champ or Diaper Genie? That was my ...,5.0,"{'all': 1L, 'bags.': 1L, 'son,': 1L, '(i': 1L, ...",0,0.0,0.0
Baby Trend Diaper Champ,Wow! This is fabulous. It was a toss-up between ...,5.0,"{'and': 4L, '""genie"".': 1L, 'since': 1L, ...",0,0.0,0.0
Baby Trend Diaper Champ,I originally put this item on my baby registry ...,5.0,"{'lysol': 1L, 'all': 2L, 'bags.': 1L, 'feedback': ...",0,0.0,0.0
Baby Trend Diaper Champ,Two girlfriends and two family members put me ...,5.0,"{'just': 1L, 'when': 1L, 'both': 1L, 'results': ...",0,0.0,0.0
Baby Trend Diaper Champ,I am one of those super- critical shoppers who ...,5.0,"{'taller': 1L, 'bags.': 1L, 'just': 1L, ""don't"": ...",0,0.0,0.0

amazing,love,horrible,bad,terrible,awful,wow,hate,sentiment,predicted_sentiment
0.0,0.0,0,0.0,0,0,0,0,1,0.999999937267
0.0,1.0,0,0.0,0,0,0,0,1,0.999999917406
0.0,0.0,0,1.0,0,0,0,0,1,0.999999899509
0.0,0.0,0,1.0,0,0,0,0,1,0.999999836182
0.0,2.0,0,0.0,0,0,0,0,1,0.999999824745
0.0,0.0,0,0.0,0,0,0,0,1,0.999999759315
0.0,0.0,0,0.0,0,0,0,0,1,0.999999692111
0.0,0.0,0,0.0,0,0,0,0,1,0.999999642488
0.0,0.0,1,0.0,0,0,0,0,1,0.999999604504
0.0,1.0,0,0.0,0,0,0,0,1,0.999999486804


Q. What is the ‘predicted_sentiment’ for the most positive review for ‘Baby Trend Diaper Champ’ according to the sentiment_model?

## Applying the selected words sentiment model to predict sentiment for most positive review for Diaper Champ

In [76]:
predicted_sentiment_for_most_positive_review = selected_words_model.predict(diaper_champ_reviews[0:1], output_type='probability')

In [77]:
predicted_sentiment_for_most_positive_review

dtype: float
Rows: 1
[0.796940851290671]

#### Q. Why is the predicted_sentiment for the most positive review found using the model with all word counts (sentiment_model) much more positive than the one using only the selected_words (selected_words_model)?

In [78]:
diaper_champ_reviews[0]['review']

'Baby Luke can turn a clean diaper to a dirty diaper in 3 seconds flat. The diaper champ turns the smelly diaper into "what diaper smell" in less time than that. I hesitated and wondered what I REALLY needed for the nursery. This is one of the best purchases we made. The champ, the baby bjorn, fluerville diaper bag, and graco pack and play bassinet all vie for the best baby purchase.Great product, easy to use, economical, effective, absolutly fabulous.UpdateI knew that I loved the champ, and useing the diaper genie at a friend\'s house REALLY reinforced that!! There is no comparison, the chanp is easy and smell free, the genie was difficult to use one handed (which is absolutly vital if you have a little one on a changing pad) and there was a deffinite odor eminating from the genieplus we found that the quick tie garbage bags where the ties are integrated into the bag work really well because there isn\'t any added bulk around the sealing edge of the champ.'

In [79]:
diaper_champ_reviews[0]['word_count']

{'"what': 1L,
 '(which': 1L,
 '3': 1L,
 'a': 6L,
 'absolutly': 2L,
 'added': 1L,
 'all': 1L,
 'and': 6L,
 'any': 1L,
 'are': 1L,
 'around': 1L,
 'at': 1L,
 'baby': 3L,
 'bag': 1L,
 'bag,': 1L,
 'bags': 1L,
 'bassinet': 1L,
 'because': 1L,
 'best': 2L,
 'bjorn,': 1L,
 'bulk': 1L,
 'can': 1L,
 'champ': 1L,
 'champ,': 2L,
 'champ.': 1L,
 'changing': 1L,
 'chanp': 1L,
 'clean': 1L,
 'comparison,': 1L,
 'deffinite': 1L,
 'diaper': 7L,
 'difficult': 1L,
 'dirty': 1L,
 'easy': 2L,
 'economical,': 1L,
 'edge': 1L,
 'effective,': 1L,
 'eminating': 1L,
 'fabulous.updatei': 1L,
 'flat.': 1L,
 'fluerville': 1L,
 'for': 2L,
 'found': 1L,
 'free,': 1L,
 "friend's": 1L,
 'from': 1L,
 'garbage': 1L,
 'genie': 2L,
 'genieplus': 1L,
 'graco': 1L,
 'handed': 1L,
 'have': 1L,
 'hesitated': 1L,
 'house': 1L,
 'i': 3L,
 'if': 1L,
 'in': 2L,
 'integrated': 1L,
 'into': 2L,
 'is': 4L,
 "isn't": 1L,
 'knew': 1L,
 'less': 1L,
 'little': 1L,
 'loved': 1L,
 'luke': 1L,
 'made.': 1L,
 'needed': 1L,
 'no': 1L,
 'nu

In [80]:
for word in selected_words:
    print(diaper_champ_reviews[0][word])

0
0.0
0.0
0.0
0.0
0
0.0
0
0
0
0


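#### None of the selected_words appears as a token in the most positive review ("loved" does not match 'love', and "Great" only occurs run together as "purchase.Great"), so the selected_words_model has no evidence to work with and its prediction is far less confident than that of the word_count model.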
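
Assuming the classifier is a standard logistic regression, this can be checked by hand: with every selected-word count at zero, the score reduces to the intercept, and the predicted probability is the sigmoid of that value. The sketch below uses the intercept and a few weights copied from the coefficients table printed earlier; `sigmoid` is a helper defined here.

```python
import math

def sigmoid(score):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

# Values copied from the coefficients table printed earlier
intercept = 1.36728315229
weights = {'love': 1.39989834302, 'awesome': 1.05800888878,
           'terrible': -2.09049998487}  # subset; the rest are omitted

# Counts of the selected words in the most positive review: all zero
counts = {}

score = intercept + sum(w * counts.get(word, 0) for word, w in weights.items())
probability = sigmoid(score)
print(probability)
```

The result is approximately 0.797, which agrees with the prediction of the selected_words_model for this review. In other words, the model falls back to its default guess whenever none of the selected words is present.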