# Building A Logistic Sentiment Classifier Based On Selected Words

This notebook trains a logistic regression model on the counts of a small set of hand-picked words to classify the sentiment of product reviews.

In [1]:
import turicreate as tc

In [2]:
products = tc.SFrame('./amazon_baby.sframe')


In [3]:
products.head()

name,review,rating
Planetwise Flannel Wipes,"These flannel wipes are OK, but in my opinion ...",3.0
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0
Stop Pacifier Sucking without tears with ...,This is a product well worth the purchase. I ...,5.0
Stop Pacifier Sucking without tears with ...,All of my kids have cried non-stop when I tried to ...,5.0
Stop Pacifier Sucking without tears with ...,"When the Binky Fairy came to our house, we didn't ...",5.0
A Tale of Baby's Days with Peter Rabbit ...,"Lovely book, it's bound tightly so you may no ...",4.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",Perfect for new parents. We were able to keep ...,5.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",A friend of mine pinned this product on Pinte ...,5.0
"Baby Tracker&reg; - Daily Childcare Journal, ...",This has been an easy way for my nanny to record ...,4.0


In [4]:
products['rating'].show()

# Cleaning Data

Remove all reviews with a borderline rating (3 stars) and keep the rest of the data.

In [5]:
products = products[products['rating'] != 3]

Create the derived column sentiment, which contains 1 for good reviews (rating of 4 or higher) and 0 for bad reviews.

In [6]:
products['sentiment'] = products['rating'] >= 4
products.head()

name,review,rating,sentiment
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,1
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,1
Stop Pacifier Sucking without tears with ...,This is a product well worth the purchase. I ...,5.0,1
Stop Pacifier Sucking without tears with ...,All of my kids have cried non-stop when I tried to ...,5.0,1
Stop Pacifier Sucking without tears with ...,"When the Binky Fairy came to our house, we didn't ...",5.0,1
A Tale of Baby's Days with Peter Rabbit ...,"Lovely book, it's bound tightly so you may no ...",4.0,1
"Baby Tracker&reg; - Daily Childcare Journal, ...",Perfect for new parents. We were able to keep ...,5.0,1
"Baby Tracker&reg; - Daily Childcare Journal, ...",A friend of mine pinned this product on Pinte ...,5.0,1
"Baby Tracker&reg; - Daily Childcare Journal, ...",This has been an easy way for my nanny to record ...,4.0,1
"Baby Tracker&reg; - Daily Childcare Journal, ...",I love this journal and our nanny uses it ...,4.0,1


In [7]:
products.sort('sentiment').head()

name,review,rating,sentiment
Whoozit Crib Activity Mirror ...,I purchased this item and besides receiving the ...,1.0,0
MOBI Digital Ultra Thermometer ...,This has not been accurate since I opened ...,1.0,0
MOBI Digital Ultra Thermometer ...,This thermometer will give a reading that ...,1.0,0
MOBI Digital Ultra Thermometer ...,"It seemed to work well at first, but then I began ...",1.0,0
MOBI Digital Ultra Thermometer ...,I was very excited to receive this thermome ...,1.0,0
MOBI Digital Ultra Thermometer ...,Never felt like this product worked right. ...,1.0,0
MOBI Digital Ultra Thermometer ...,We were very happy with this thermometer for the ...,2.0,0
Momo Baby 2-Pack Wide Neck Fast Flow Silicone ...,these nipples aren't as good as born free ...,2.0,0
MOBI Digital Ultra Thermometer ...,Inaccurate readings! A big waste of money! ...,1.0,0
MOBI Digital Ultra Thermometer ...,Basically the worst thermometer I ever ...,1.0,0


# Creating The Word Count Feature Column

In [8]:
products['word_count'] = tc.text_analytics.count_words(products['review'])
products.head()

name,review,rating,sentiment,word_count
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,1,"{'recommend': 1.0, 'highly': 1.0, ..."
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,1,"{'quilt': 1.0, 'of': 1.0, 'the': 1.0, 'than': 1.0, ..."
Stop Pacifier Sucking without tears with ...,This is a product well worth the purchase. I ...,5.0,1,"{'tool': 1.0, 'clever': 1.0, 'approach': 2.0, ..."
Stop Pacifier Sucking without tears with ...,All of my kids have cried non-stop when I tried to ...,5.0,1,"{'rock': 1.0, 'many': 1.0, 'headaches': 1.0, ..."
Stop Pacifier Sucking without tears with ...,"When the Binky Fairy came to our house, we didn't ...",5.0,1,"{'thumb': 1.0, 'or': 1.0, 'break': 1.0, 'trying': ..."
A Tale of Baby's Days with Peter Rabbit ...,"Lovely book, it's bound tightly so you may no ...",4.0,1,"{'for': 1.0, 'barnes': 1.0, 'at': 1.0, 'is': ..."
"Baby Tracker&reg; - Daily Childcare Journal, ...",Perfect for new parents. We were able to keep ...,5.0,1,"{'right': 1.0, 'because': 1.0, 'questions': 1.0, ..."
"Baby Tracker&reg; - Daily Childcare Journal, ...",A friend of mine pinned this product on Pinte ...,5.0,1,"{'like': 1.0, 'and': 1.0, 'changes': 1.0, 'the': ..."
"Baby Tracker&reg; - Daily Childcare Journal, ...",This has been an easy way for my nanny to record ...,4.0,1,"{'in': 1.0, 'pages': 1.0, 'out': 1.0, 'run': 1.0, ..."
"Baby Tracker&reg; - Daily Childcare Journal, ...",I love this journal and our nanny uses it ...,4.0,1,"{'tracker': 1.0, 'now': 1.0, 'its': 1.0, 'sti ..."


Now filter the word count column down to a short list of selected sentiment words, discarding all other words.

In [9]:
selected_words = ['awesome', 'great', 'fantastic', 'amazing',
                  'love', 'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']


In [10]:
products['word_count'].apply(lambda word_count: {word:word_count[word] for word in selected_words if word in word_count})

dtype: dict
Rows: 166752
[{'love': 1.0}, {}, {'love': 2.0}, {'great': 1.0, 'love': 1.0}, {'great': 1.0}, {}, {}, {'fantastic': 1.0}, {}, {'love': 2.0}, {}, {}, {}, {}, {'love': 1.0}, {}, {'amazing': 1.0}, {}, {'great': 2.0}, {}, {'love': 1.0}, {}, {}, {}, {}, {'great': 1.0}, {'great': 1.0, 'love': 1.0}, {}, {'love': 1.0}, {}, {'great': 1.0}, {'love': 1.0}, {}, {}, {}, {'love': 1.0}, {}, {'great': 2.0}, {'great': 1.0, 'love': 1.0}, {}, {}, {}, {'great': 1.0}, {'love': 2.0}, {'great': 3.0}, {}, {'fantastic': 1.0}, {}, {'great': 1.0, 'love': 1.0}, {}, {}, {}, {}, {'love': 1.0}, {}, {'great': 1.0}, {'great': 1.0, 'love': 1.0}, {}, {}, {'great': 1.0}, {}, {'great': 1.0}, {}, {}, {}, {'fantastic': 1.0}, {}, {'love': 1.0}, {'great': 1.0, 'love': 1.0}, {'great': 1.0}, {}, {}, {'great': 2.0}, {}, {}, {}, {'great': 2.0}, {}, {}, {}, {}, {}, {}, {}, {}, {'love': 1.0}, {'great': 1.0}, {'awesome': 1.0}, {}, {}, {'great': 1.0}, {'love': 1.0, 'terrible': 1.0}, {'great': 2.0, 'love': 3.0}, {'great': 1

In [11]:
# Add one integer column per selected word holding its count in each review (0 if absent).
for curr_word in selected_words:
    products[curr_word] = products['word_count'].apply(lambda word_count: word_count.get(curr_word, 0), dtype=int)
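
The per-word columns above can be illustrated without Turi Create. The following is a minimal sketch in plain Python, assuming a simple regex tokenizer that lowercases the text and keeps runs of letters and apostrophes; that tokenizer and the helper name count_selected_words are assumptions for illustration and may not match how tc.text_analytics.count_words splits text.

```python
import re
from collections import Counter

selected_words = ['awesome', 'great', 'fantastic', 'amazing',
                  'love', 'horrible', 'bad', 'terrible', 'awful', 'wow', 'hate']

def count_selected_words(review, vocabulary):
    """Return {word: count} for every word in vocabulary (0 if absent)."""
    # Hypothetical tokenizer: lowercase, then keep runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", review.lower())
    counts = Counter(tokens)
    # Counter returns 0 for missing keys, so every selected word gets a value.
    return {word: counts[word] for word in vocabulary}

example = "Love love love Sophie! Great toy, not bad at all."
features = count_selected_words(example, selected_words)
print(features)
```

Each review ends up as a fixed-length vector of eleven counts, which is the only information the classifier trained below gets to see.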


In [12]:
products

name,review,rating,sentiment,word_count,awesome
Planetwise Wipe Pouch,it came early and was not disappointed. i love ...,5.0,1,"{'recommend': 1.0, 'highly': 1.0, ...",0
Annas Dream Full Quilt with 2 Shams ...,Very soft and comfortable and warmer than it ...,5.0,1,"{'quilt': 1.0, 'of': 1.0, 'the': 1.0, 'than': 1.0, ...",0
Stop Pacifier Sucking without tears with ...,This is a product well worth the purchase. I ...,5.0,1,"{'tool': 1.0, 'clever': 1.0, 'approach': 2.0, ...",0
Stop Pacifier Sucking without tears with ...,All of my kids have cried non-stop when I tried to ...,5.0,1,"{'rock': 1.0, 'many': 1.0, 'headaches': 1.0, ...",0
Stop Pacifier Sucking without tears with ...,"When the Binky Fairy came to our house, we didn't ...",5.0,1,"{'thumb': 1.0, 'or': 1.0, 'break': 1.0, 'trying': ...",0
A Tale of Baby's Days with Peter Rabbit ...,"Lovely book, it's bound tightly so you may no ...",4.0,1,"{'for': 1.0, 'barnes': 1.0, 'at': 1.0, 'is': ...",0
"Baby Tracker&reg; - Daily Childcare Journal, ...",Perfect for new parents. We were able to keep ...,5.0,1,"{'right': 1.0, 'because': 1.0, 'questions': 1.0, ...",0
"Baby Tracker&reg; - Daily Childcare Journal, ...",A friend of mine pinned this product on Pinte ...,5.0,1,"{'like': 1.0, 'and': 1.0, 'changes': 1.0, 'the': ...",0
"Baby Tracker&reg; - Daily Childcare Journal, ...",This has been an easy way for my nanny to record ...,4.0,1,"{'in': 1.0, 'pages': 1.0, 'out': 1.0, 'run': 1.0, ...",0
"Baby Tracker&reg; - Daily Childcare Journal, ...",I love this journal and our nanny uses it ...,4.0,1,"{'tracker': 1.0, 'now': 1.0, 'its': 1.0, 'sti ...",0

great,fantastic,amazing,love,horrible,bad,terrible,awful,wow,hate
0,0,0,1,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0
0,0,0,2,0,0,0,0,0,0
1,0,0,1,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0
0,1,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0
0,0,0,2,0,0,0,0,0,0


In [13]:
for word in selected_words:
    print('{}: {}'.format(word, products[word].sum()))

awesome: 3892
great: 55791
fantastic: 1664
amazing: 2628
love: 41994
horrible: 1110
bad: 4183
terrible: 1146
awful: 687
wow: 425
hate: 1107


In [14]:
train_data, test_data = products.random_split(0.8, seed=0)

In [15]:
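# Note: test_data is also passed as the validation set, so the validation metrics reported during training are computed on the test data.
selected_words_model = tc.logistic_classifier.create(train_data, target='sentiment', features=selected_words, validation_set=test_data)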

In [16]:
selected_words_model.coefficients

name,index,class,value,stderr
(intercept),,1,1.3365913848877558,0.0089299697876567
awesome,,1,1.133534666034145,0.0839964398318752
great,,1,0.8630655001196618,0.0189550524443773
fantastic,,1,0.8858047568814295,0.1116759129339965
amazing,,1,1.1000933113660285,0.0995477626046598
love,,1,1.3592688669225153,0.0280683001520994
horrible,,1,-2.251335236759093,0.0802024938878844
bad,,1,-0.9914778800650564,0.0384842866469906
terrible,,1,-2.223661436085127,0.0773173620378575
awful,,1,-2.0529082040313518,0.1009973543525925
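
In a logistic regression model, the predicted probability of the positive class is the sigmoid of the intercept plus the weighted feature values. The sketch below recomputes a few probabilities by hand from the coefficients printed above. The helper names sigmoid and predict_probability are illustrative and not part of Turi Create, and the coefficients for wow and hate are left out because the printed table stops after ten rows.

```python
import math

# Values copied from the coefficients table above.
intercept = 1.3365913848877558
coefficients = {
    'awesome': 1.133534666034145,
    'great': 0.8630655001196618,
    'fantastic': 0.8858047568814295,
    'amazing': 1.1000933113660285,
    'love': 1.3592688669225153,
    'horrible': -2.251335236759093,
    'bad': -0.9914778800650564,
    'terrible': -2.223661436085127,
    'awful': -2.0529082040313518,
}

def sigmoid(score):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

def predict_probability(word_counts):
    """Probability of positive sentiment for a {word: count} dictionary."""
    score = intercept + sum(coefficients[word] * count
                            for word, count in word_counts.items())
    return sigmoid(score)

no_selected_words = predict_probability({})
one_horrible = predict_probability({'horrible': 1})
one_bad = predict_probability({'bad': 1})
five_great = predict_probability({'great': 5})

print(no_selected_words, one_horrible, one_bad, five_great)
```

These hand-computed values agree with the predicted_sentiment values printed later in the notebook: about 0.792 for a review containing none of the selected words, 0.286 for a single "horrible", 0.585 for a single "bad", and 0.9965 for five occurrences of "great". Because the intercept is positive, a review with no selected words is classified as positive by default.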


In [17]:
selected_words_model.evaluate(test_data)

{'accuracy': 0.8463848186404036,
 'auc': 0.6935096220934976,
 'confusion_matrix': Columns:
 	target_label	int
 	predicted_label	int
 	count	int
 
 Rows: 4
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |      1       |        0        |  159  |
 |      0       |        0        |  371  |
 |      0       |        1        |  4957 |
 |      1       |        1        | 27817 |
 +--------------+-----------------+-------+
 [4 rows x 3 columns],
 'f1_score': 0.9157860082304526,
 'log_loss': 0.3962265467087378,
 'precision': 0.8487520595594068,
 'recall': 0.9943165570488991,
 'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 
 Rows: 1001
 
 Data:
 +-----------+--------------------+-----+-------+------+
 | threshold |        fpr         | tpr |   p   |  n   |
 +-----------+--------------------+-----+-------+------+
 |    0.0    |        1.0         | 1.0 | 27976 | 5328 
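
The headline metrics in the output above follow directly from the four confusion matrix counts. The sketch below recomputes them in plain Python; the variable names are chosen for illustration, and the majority-class baseline is an added comparison that is not part of the evaluate() output.

```python
# Counts copied from the confusion matrix above (positive class = 1).
true_positive = 27817   # target 1, predicted 1
false_negative = 159    # target 1, predicted 0
false_positive = 4957   # target 0, predicted 1
true_negative = 371     # target 0, predicted 0

total = true_positive + false_negative + false_positive + true_negative

accuracy = (true_positive + true_negative) / total
precision = true_positive / (true_positive + false_positive)
recall = true_positive / (true_positive + false_negative)
f1_score = 2 * precision * recall / (precision + recall)

# Accuracy of a classifier that labels every review as positive.
majority_baseline = (true_positive + false_negative) / total

print('accuracy: ', accuracy)
print('precision:', precision)
print('recall:   ', recall)
print('f1_score: ', f1_score)
print('baseline: ', majority_baseline)
```

The recomputed accuracy (about 0.846) is only slightly above the roughly 0.840 obtained by always predicting positive, and most of the 5328 negative test reviews (4957) are predicted as positive. This is consistent with the modest AUC of about 0.69.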

In [18]:
products['predicted_sentiment'] = selected_words_model.predict(products, output_type='probability')


In [19]:
# A helper function that prints selected fields of a product review row:
# the rating, the actual label with the predicted probability, and the review text.
def display_review(review):
    print('Rating:', review['rating'])
    print('Sentiment (act, prd): ({}, {})'.format(review['sentiment'], review['predicted_sentiment']))
    print('Review:', review['review'])

# Investigating Reviews Of Product 1 (Giraffe Teether)

In [20]:
giraffe_reviews = products[products['name'] == 'Vulli Sophie the Giraffe Teether'].sort('predicted_sentiment', ascending=False)

In [21]:
display_review(giraffe_reviews[0])

Rating: 5.0
Sentiment (act, prd): (1, 0.9965015088752052)
Review: Great feel, great squeek, great quality, great story...Sophie is just great all around. My little man loves her...even though in public I do feel a little odd asking my son &#34;here honey baby, do you want your Sophie doll&#34;? Hubs wanted to rename her to a boy name....but that would ruin Sophie's legacy. My son played with her up to about a year old..I'll be saving her forever in my keepsake box.


In [22]:
display_review(giraffe_reviews[1])

Rating: 5.0
Sentiment (act, prd): (1, 0.9955677154227358)
Review: Sophie is one of my daughter's favorite toys, and is wonderful as she begins teething.  Love love love Sophie!


In [23]:
display_review(giraffe_reviews[-1])

Rating: 5.0
Sentiment (act, prd): (1, 0.041550580911746814)
Review: When I first heard about this teether, I thought it was just a stupid expensive yuppie thing that is overpriced and appeals only to people so much money they don't know what to do with it.  I was dead wrong.  My daughter "tried" her cousin's Sophie when she was 7 months old and in a horrible bout of teething, and she didn't want to give it back.  I went out and purchased a Sophie for her the very next day.  This is the only teething toy that ever gave her any relief during teething, and she had a terrible time cutting teeth.  The quality of the toy reflects the price.  She dropped her Sophie at the zoo without me noticing one day, and I had to buy her another one. It's that good.  I think Sophie is a perfect baby shower gift, as it is tough for us new parents to justify spending so much money on a teether.  Buy it for a pregnant mom! Or if you can spare the cash, definitely go ahead and buy it for your baby.  I don't t

In [24]:
display_review(giraffe_reviews[-2])

Rating: 2.0
Sentiment (act, prd): (0, 0.15344997223968979)
Review: I received two of these at my baby shower. I thought they were cute and then I opened one and gave it to my baby. IT SQUEAKS!!!!! It makes a high-pitched, dog-toy squeak that is obnoxious. That being said, the baby loves chewing on it and it is easy for her to hold. But that noise - it is awful. It is loud and draws attention. I will not take it with us to restaurants or even in the car.  It is so bad I have considered &#34;losing&#34; Sophie. I would never give this to another parent.


# Investigating Reviews Of Product 2 (Baby Trend Diaper Champ)

In [25]:
diaper_champ_reviews = products[products['name'] == 'Baby Trend Diaper Champ'].sort('predicted_sentiment', ascending=False)

In [26]:
diaper_champ_reviews[diaper_champ_reviews['review'] == "I read a review below that can explain exactly what we experienced. We've had it for 16 months and it has worked wonderful for us. No smells, change it out once a week, easy to clean. Then a diaper snagged this foam material in the head part, so I pulled the rest of the foam out. Big mistake!!! Now it can no loner retain the stinkiness and we're looking for a replacement. Be careful of overloading and never take out that foam piece that is cushioned between pieces. I have figured out that it is key to keeping the stink out."]


name,review,rating,sentiment,word_count,awesome,great
Baby Trend Diaper Champ,I read a review below that can explain exactly ...,4.0,1,"{'key': 1.0, 'have': 1.0, 'pieces': 1.0, 'betwe ...",0,0

fantastic,amazing,love,horrible,bad,terrible,awful,wow,hate,predicted_sentiment
0,0,0,0,0,0,0,0,0,0.7919288370624453


In [27]:
display_review(diaper_champ_reviews[0])

Rating: 4.0
Sentiment (act, prd): (1, 0.9981253623335122)
Review: I LOVE LOVE LOVE this product! It is SO much easier to use than the Diaper Genie, (you need a PHD in poopy to figure out how to use the darn thing!) and it even takes the same bags as my kitchen trash can, shich is super convenient, and cost efficient as I can buy them in bulk.The only reason for not rating it a 5 star was that I did have one small problem with it. The foam gasket in the barrell which keeps the poopy smell inside the unit ripped somehow, and it got VERY stinky. HOWEVER, I contacted the manufacturer though their website, and received an email back the same day stating that this was unusual, and that replacement gaskets were on their way to me. They arrived inside of a week and after replacing, it works great again! (They even sent me extras should it happen again)I HIGHLY reccomend this diaper pail over ANY competitors, you will not be sorry!


In [28]:
display_review(diaper_champ_reviews[1])

Rating: 5.0
Sentiment (act, prd): (1, 0.9955677154227358)
Review: I received my Diaper Champ at my baby shower for the birth of my first son 11 months ago. I use it faithfully every day and love the ease and convenience of only having to change the bag once a week! I love that you can use regular kitchen-size trash bags and don't need to purchase any special expensive bags. One thing you might want to be careful of, however...make sure you do not throw loose baby wipes into the Diaper Champ or else the flip mechanism can become jammed and after time will not seal properly due to having to pull out wipes that are stuck. I love my diaper champ so much, I have asked for a second one for my upcoming baby shower for my second son.


In [29]:
display_review(diaper_champ_reviews[-1])


Rating: 1.0
Sentiment (act, prd): (0, 0.2860300801255359)
Review: ......all I can say is the smell is horrible.....1 star..... Please don't buy this one!


In [30]:
display_review(diaper_champ_reviews[-2])


Rating: 5.0
Sentiment (act, prd): (1, 0.2860300801255359)
Review: For my first born I purchased the Diaper Genie. It worked out well for the first 6 months. As the diaper sizes got larger the harder it was to get the diaper to fit. It would squish out the sides, making a horrible mess! So I purchased the wide mouth size. More $$! The larger diapers fit a little better, but the monthly cost of special Diaper Genie bags started adding up.With the birth of our second child, I refused to use the Diaper Genie. SO in a search for something better than a smelly garbage can in our baby's room, we found the Diaper Champ.The Diaper Champ out performed the Diaper Genie, not once was the diaper too large to fit in the top of the unit and it was cost effective because it uses your standard trash bags!On trash day, we would open the top pull out the old bag, tie it up, throw in a new one, and out to the corner for trash pickup.Ten years and 3 more kids later, this diaper pail is still the Champ in o

In [31]:
diaper_champ_reviews.tail()

name,review,rating,sentiment,word_count,awesome
Baby Trend Diaper Champ,We heard bad stories about the diaper genie ...,5.0,1,"{'in': 1.0, 'especially': 1.0, 'easy': 1.0, ...",0
Baby Trend Diaper Champ,I cannot believe that anyone has had good luck ...,1.0,0,"{'bad': 1.0, 'shocked': 1.0, 'pretty': 1.0, 'm': ...",0
Baby Trend Diaper Champ,I am a second time mom who used the rival Di ...,5.0,1,"{'other': 1.0, 'your': 1.0, 'buy': 1.0, ...",0
Baby Trend Diaper Champ,Two girlfriends and two family members put me ...,5.0,1,"{'winter': 1.0, 'outside': 1.0, 'day': ...",0
Baby Trend Diaper Champ,I've read all of the reviews of those of you ...,2.0,0,"{'never': 1.0, 'day': 1.0, 'buy': 1.0, ...",0
Baby Trend Diaper Champ,The Diaper Champ is TERRIBLE at keeping the ...,1.0,0,"{'just': 1.0, 'dirty': 1.0, 'enjoy': 1.0, ...",0
Baby Trend Diaper Champ,My 8 year old yellow lab was able to get the top ...,1.0,0,"{'prevent': 1.0, 'literature': 1.0, ...",0
Baby Trend Diaper Champ,I registered for this product after reading ...,2.0,0,"{'lift': 1.0, 'nails': 1.0, 'three': 1.0, ...",0
Baby Trend Diaper Champ,For my first born I purchased the Diaper ...,5.0,1,"{'house': 1.0, 'still': 1.0, 'is': 1.0, 'this': ...",0
Baby Trend Diaper Champ,......all I can say is the smell is ...,1.0,0,"{'buy': 1.0, 'this': 1.0, 't': 1.0, 'don': 1.0, ...",0

great,fantastic,amazing,love,horrible,bad,terrible,awful,wow,hate,predicted_sentiment
0,0,0,0,0,1,0,0,0,0,0.5854321171706491
0,0,0,0,0,1,0,0,0,0,0.5854321171706491
0,0,0,0,0,1,0,0,0,0,0.5854321171706491
0,0,1,0,1,0,0,0,1,0,0.5438399411170777
0,0,0,0,0,2,0,0,0,0,0.3438092840908523
0,0,0,0,0,0,1,0,0,0,0.2917148358682929
0,0,0,0,1,0,0,0,0,0,0.2860300801255359
0,0,0,0,1,0,0,0,0,0,0.2860300801255359
0,0,0,0,1,0,0,0,0,0,0.2860300801255359
0,0,0,0,1,0,0,0,0,0,0.2860300801255359
