# Creating a Dictionary-based Sentiment Analyzer

In [17]:
import pandas as pd
import nltk
from IPython.display import display
pd.set_option('display.max_columns', None)

### Step 1: Loading the small_corpus.csv file created in the "creating_dataset" milestone

In [4]:
reviews = pd.read_csv("../data/small_corpus.csv")

In [5]:
reviews.head()

Unnamed: 0,overall,verified,reviewTime,reviewerID,asin,reviewerName,reviewText,summary,unixReviewTime,vote,style,image
0,1.0,True,"11 30, 2015",A3AC92K59QLYR8,B00503E8S2,ben,Game freezes over and over its unplayable,it just doesn't work,1448841600,,{'Format:': ' Video Game'},
1,1.0,False,"05 19, 2012",A334LHR8DWARY8,B00178630A,Xenocide,I have no problem with needing to be online to...,The only real way to show Blizzard our feeling...,1337385600,23.0,{'Format:': ' Computer Game'},
2,1.0,True,"10 19, 2014",A28982ODE7ZGVP,B001AWIP7M,Eric Frykberg,NOT GOOD,One Star,1413676800,,{'Format:': ' Video Game'},
3,1.0,True,"09 6, 2015",A19E85RLQCAMI1,B00NASF4MS,Joe,Really not worth the money to buy this game on...,Really not worth the money to buy this game on...,1441497600,2.0,{'Format:': ' Video Game'},
4,1.0,False,"05 28, 2008",AEMQKS13WC4D2,B00140P9BA,Craig,They need to eliminate the Securom. I purchase...,Securom can ruin a great game,1211932800,55.0,{'Format:': ' DVD-ROM'},


### Step 2: Tokenizing the sentences and words of the reviews
Here, we test two NLTK word tokenizers on the reviews, the Treebank tokenizer and the casual tokenizer, and then decide which one is better suited to the task.

### Treebank Word Tokenizer

In [10]:
from nltk.tokenize import TreebankWordTokenizer

In [11]:
tb_tokenizer = TreebankWordTokenizer()

In [14]:
reviews["tb_tokens"] = reviews['reviewText'].apply(lambda rev: tb_tokenizer.tokenize(str(rev)))
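
One caveat about `str(rev)`: pandas reads a missing `reviewText` as `NaN`, and `str(NaN)` is the string `'nan'`, which is then tokenized as if it were a word. A minimal sketch of one way to avoid that, assuming an empty review should yield an empty token list. The two-row frame `toy` is hypothetical and stands in for the real corpus:

```python
import pandas as pd
from nltk.tokenize import TreebankWordTokenizer

tb_tokenizer = TreebankWordTokenizer()

# Hypothetical toy data: one real review and one missing review (NaN).
toy = pd.DataFrame({"reviewText": ["NOT GOOD", float("nan")]})

# str() converts the missing value into the literal word 'nan'.
naive = toy["reviewText"].apply(lambda rev: tb_tokenizer.tokenize(str(rev)))

# Replacing missing values with an empty string first gives an empty token list.
cleaned = toy["reviewText"].fillna("").apply(lambda rev: tb_tokenizer.tokenize(rev))

print(naive.tolist())
print(cleaned.tolist())
```

Whether this matters for `small_corpus.csv` depends on whether it contains any empty reviews, which `reviews['reviewText'].isna().sum()` would show.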

In [21]:
pd.set_option('display.max_colwidth', None)

In [23]:
reviews[['reviewText','tb_tokens']].head()

Unnamed: 0,reviewText,tb_tokens
0,Game freezes over and over its unplayable,"[Game, freezes, over, and, over, its, unplayable]"
1,"I have no problem with needing to be online to play and have had no problems with stability since the first 2 days. This review ignores those issues.\n\nThe game itself is fun and addictive, but so is Din's Curse (which was programmed by one guy) and is more re-playble than Diablo 3. Hopefully Torchlight 2 will take the mantle of this genere from Blizzard. One reason all these games are fun is because they are all similar in atmosphere, gameplay and loot system. As a stand alone title, D3 would be 4 stars, maybe- but as a successor to Diablo 2, with all the man-power and resources Blizzard had including the beautiful skill tree system from D2... If I could go lower than 1, I would.\n\nWhy was Diablo 2 addictive? One big reason was farming gear- loading magic find amd finding a group for Hell runs, kind of like a drug. But why? For the millions of regular players it was in order to gear up their next specialized build. Everything was centered around unique builds that become somewhat cookie cutter if good. Players perfectly balancing point distributions in attributes and skills to make that personalized zealot or blizzard sorc. The quest for constant gear for new builds was what drove the desire to aquire it. Every major nerf patch was like a ladder reset (which D3 lacks). Even if they add another 5 classes, the problem of no variation or ability to make a custom character means no constant demand for anything but a few top level items.\n\nThis seems intentional to keep server use lighter between expansions as there really is little reason to play unless your into farming for only a handful of desired endgame items. Everything else can be gotten off the AH for little gold. Unless they introduce raids that give special gear or something that would make that game into an MMO, none of the desire to keep playing is there unless it's more of a social hang-out.\n\nSo, where did all the money and time go? The skills are fun, as are the runes- but also very simplistic. 
The cut-screens are nicely done but, in this genere, most people watch them once or twice at most, they are basically useless. I'd hate to think all the resources went there when the cut-screens are movie quality where you see real looking people crying and giving decent lines, but then we shift back to real game-play and the next line from the smith who just killed his wife after she vomited and changed into an undead creature is a comically gruff and bemused... ""If you see my idiot apprentice out there, tell him to get back...""\n\nThe same companion dialogue reapeated over and over and often when fighting so it's annoying. It's all just randomized for the most part. So, your talking to your companions in a casual way about something trivial while getting killed by a boss. Poor execution of useless content. It happens often as well.\n\nIt is just sad to see no ability to customize a build, choose your dialogue or do much outside of what this game present to you. You essentially walking through how the developers envisioned you to progress- from what you say to what skills you want to master.\n\nIt's really sad to see how they dumbed this game down to the point where there is no ability to customize any aspect of your gameplay.","[I, have, no, problem, with, needing, to, be, online, to, play, and, have, had, no, problems, with, stability, since, the, first, 2, days., This, review, ignores, those, issues., The, game, itself, is, fun, and, addictive, ,, but, so, is, Din, 's, Curse, (, which, was, programmed, by, one, guy, ), and, is, more, re-playble, than, Diablo, 3., Hopefully, Torchlight, 2, will, take, the, mantle, of, this, genere, from, Blizzard., One, reason, all, these, games, are, fun, is, because, they, are, all, similar, in, atmosphere, ,, gameplay, and, loot, system., As, a, stand, alone, title, ,, D3, would, be, 4, stars, ...]"
2,NOT GOOD,"[NOT, GOOD]"
3,Really not worth the money to buy this game on PS4 unless it becomes $10.... don't make the mistake I made.,"[Really, not, worth, the, money, to, buy, this, game, on, PS4, unless, it, becomes, $, 10, ..., ., do, n't, make, the, mistake, I, made, .]"
4,"They need to eliminate the Securom. I purchased Mass Effect as a digital download hoping that the faulty disc protection software would not be on that version, however it seems the Securom is on all versions. Now every time I log on to play, it's hit or miss- sometimes an error pops up stating ""a required security module could not be activated"", and sometimes it works. It's like pulling a handle on a slot machine to see if Securom will allow you play or not. Ridiculous for a game I spent $50 on. There's a whole thread about this issue on the official forums. Don't have this issue with other games that use less intrusive copy protection methods.","[They, need, to, eliminate, the, Securom., I, purchased, Mass, Effect, as, a, digital, download, hoping, that, the, faulty, disc, protection, software, would, not, be, on, that, version, ,, however, it, seems, the, Securom, is, on, all, versions., Now, every, time, I, log, on, to, play, ,, it, 's, hit, or, miss-, sometimes, an, error, pops, up, stating, ``, a, required, security, module, could, not, be, activated, '', ,, and, sometimes, it, works., It, 's, like, pulling, a, handle, on, a, slot, machine, to, see, if, Securom, will, allow, you, play, or, not., Ridiculous, for, a, game, I, spent, $, 50, ...]"
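
Note that tokens such as `days.` and `issues.` keep their period in the output above. The Treebank tokenizer assumes its input is a single sentence, so it only separates a period at the very end of the string. A rough sketch of splitting each review into sentences first, using a naive regular expression as a stand-in for a proper sentence tokenizer. The helper `tokenize_by_sentence` is hypothetical, and the regex will mis-split abbreviations such as `Ms.`:

```python
import re
from nltk.tokenize import TreebankWordTokenizer

tb_tokenizer = TreebankWordTokenizer()

def tokenize_by_sentence(text):
    """Split text into rough sentences, then Treebank-tokenize each one."""
    # Naive assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    tokens = []
    for sentence in sentences:
        tokens.extend(tb_tokenizer.tokenize(sentence))
    return tokens

text = "It works. It is fun."
whole = tb_tokenizer.tokenize(text)        # the inner period stays attached
per_sentence = tokenize_by_sentence(text)  # every sentence-final period is split off

print(whole)
print(per_sentence)
```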


### Casual Tokenizer

In [24]:
from nltk.tokenize.casual import casual_tokenize

In [25]:
reviews['casual_tokens'] = reviews['reviewText'].apply(lambda rev: casual_tokenize(str(rev)))

In [27]:
reviews[['reviewText','casual_tokens','tb_tokens']].sample(5)

Unnamed: 0,reviewText,casual_tokens,tb_tokens
1793,Good,[Good],[Good]
2337,If you want something to do not bad. I payed $5 for it but I would not have payed more. It will eat time but they thing is it has much of the same battles over and over. You should just play a little here and there cause it gets pretty bland.,"[If, you, want, something, to, do, not, bad, ., I, payed, $, 5, for, it, but, I, would, not, have, payed, more, ., It, will, eat, time, but, they, thing, is, it, has, much, of, the, same, battles, over, and, over, ., You, should, just, play, a, little, here, and, there, cause, it, gets, pretty, bland, .]","[If, you, want, something, to, do, not, bad., I, payed, $, 5, for, it, but, I, would, not, have, payed, more., It, will, eat, time, but, they, thing, is, it, has, much, of, the, same, battles, over, and, over., You, should, just, play, a, little, here, and, there, cause, it, gets, pretty, bland, .]"
2273,"I wouldn't recommend getting Sims 3 online because once you download it, you have to update it and that requires the disk. You can, however get the expansion packs online. They work great and have no problems on my part.","[I, wouldn't, recommend, getting, Sims, 3, online, because, once, you, download, it, ,, you, have, to, update, it, and, that, requires, the, disk, ., You, can, ,, however, get, the, expansion, packs, online, ., They, work, great, and, have, no, problems, on, my, part, .]","[I, would, n't, recommend, getting, Sims, 3, online, because, once, you, download, it, ,, you, have, to, update, it, and, that, requires, the, disk., You, can, ,, however, get, the, expansion, packs, online., They, work, great, and, have, no, problems, on, my, part, .]"
4362,"It's like Ms. Pac Man, but for guys","[It's, like, Ms, ., Pac, Man, ,, but, for, guys]","[It, 's, like, Ms., Pac, Man, ,, but, for, guys]"
4016,"When I first bought Civ III, I was amazed at how in depth, challenging, and great the game was. I didn't think it could get any better! Eventually, though, I realized I had a desire to test my skills against real opponents - and I realized that without multiplayer, that couldn't be done.\nSo there was a void. And now it has been filled.\nNot only have we been provided with awesome multiplayer options, including everyone's favorite Hotseat Mode, it tacked on a whole pile of new features as well. There are new game modes, like simultaneous and turnless games. There are new civilizations like the Koreans and (my favorite) the Carthaginians. New improvements like outposts and airfields are added to give your military that extra edge, and new units give your civilization even more possibilities. There's even a scenario editor to create scenarios of any type from almost any time.\nTo say the least, this expansion is well worth every penny and a must-have for the serious Civ gamer. To say the most, this is hands down one of the greatest additions to an already great series I've ever seen. 
Three cheers for Sid Meier and crew for Civ III: Play the World!","[When, I, first, bought, Civ, III, ,, I, was, amazed, at, how, in, depth, ,, challenging, ,, and, great, the, game, was, ., I, didn't, think, it, could, get, any, better, !, Eventually, ,, though, ,, I, realized, I, had, a, desire, to, test, my, skills, against, real, opponents, -, and, I, realized, that, without, multiplayer, ,, that, couldn't, be, done, ., So, there, was, a, void, ., And, now, it, has, been, filled, ., Not, only, have, we, been, provided, with, awesome, multiplayer, options, ,, including, everyone's, favorite, Hotseat, Mode, ,, it, tacked, on, a, whole, pile, of, new, ...]","[When, I, first, bought, Civ, III, ,, I, was, amazed, at, how, in, depth, ,, challenging, ,, and, great, the, game, was., I, did, n't, think, it, could, get, any, better, !, Eventually, ,, though, ,, I, realized, I, had, a, desire, to, test, my, skills, against, real, opponents, -, and, I, realized, that, without, multiplayer, ,, that, could, n't, be, done., So, there, was, a, void., And, now, it, has, been, filled., Not, only, have, we, been, provided, with, awesome, multiplayer, options, ,, including, everyone, 's, favorite, Hotseat, Mode, ,, it, tacked, on, a, whole, pile, of, new, features, ...]"
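
To go beyond eyeballing a sample, one option is to count how often the two tokenizers disagree. A minimal sketch on a hypothetical three-row frame (`toy`, with texts copied from the reviews shown above). On the real data the same comparison could be run directly on `reviews['tb_tokens']` and `reviews['casual_tokens']`:

```python
import pandas as pd
from nltk.tokenize import TreebankWordTokenizer
from nltk.tokenize.casual import casual_tokenize

tb_tokenizer = TreebankWordTokenizer()

# Hypothetical toy frame; the texts are taken from the sample output above.
toy = pd.DataFrame(
    {"reviewText": ["Good", "NOT GOOD", "It's like Ms. Pac Man, but for guys"]}
)
toy["tb_tokens"] = toy["reviewText"].apply(lambda rev: tb_tokenizer.tokenize(str(rev)))
toy["casual_tokens"] = toy["reviewText"].apply(lambda rev: casual_tokenize(str(rev)))

# Compare the two token lists row by row.
differs = [tb != cas for tb, cas in zip(toy["tb_tokens"], toy["casual_tokens"])]
share = sum(differs) / len(differs)

print(differs)
print(f"Share of reviews tokenized differently: {share:.2f}")
```

In the samples above the disagreements come from contractions (`It's` versus `It`, `'s`) and periods (`Ms`, `.` versus `Ms.`), so which tokenizer is preferable likely depends on how the sentiment dictionary spells its entries.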
