Let's experiment with an existing library for classification: [TextBlob](https://textblob.readthedocs.org/)

Training
-----

In [1]:
from textblob.classifiers import NaiveBayesClassifier

In [10]:
train = [
    ('I love this sandwich.', 'pos'),
    ('This is an amazing place!', 'pos'),
    ('I feel very good about these beers.', 'pos'),
    ('This is my best work.', 'pos'),
    ("What an awesome view", 'pos'),
    ('I do not like this restaurant', 'neg'),
    ('I am tired of this stuff.', 'neg'),
    ("I can't deal with this", 'neg'),
    ('He is my sworn enemy!', 'neg'),
    ('My boss is horrible.', 'neg')
]
test = [
    ('The beer was good.', 'pos'),
    ('I do not enjoy my job', 'neg'),
    ("I ain't feeling dandy today.", 'neg'),
    ("I feel amazing!", 'pos'),
    ('Gary is a friend of mine.', 'pos'),
    ("I can't believe I'm doing this.", 'neg')
]

In [11]:
cl = NaiveBayesClassifier(train)

In [15]:
print(f"{cl.accuracy(test):.2f}")

0.83


In [14]:
for review, label in test:
    print(review)
    print("  observed:  ", label)
    print("  predicted: ", cl.classify(review), end="\n\n")

The beer was good.
  observed:   pos
  predicted:  pos

I do not enjoy my job
  observed:   neg
  predicted:  neg

I ain't feeling dandy today.
  observed:   neg
  predicted:  neg

I feel amazing!
  observed:   pos
  predicted:  pos

Gary is a friend of mine.
  observed:   pos
  predicted:  neg

I can't believe I'm doing this.
  observed:   neg
  predicted:  neg
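
The per-review listing above accounts for the accuracy figure: accuracy is the fraction of test items whose predicted label matches the observed one. A minimal sketch of that arithmetic, with the label pairs copied by hand from the output above (the `results` list is an illustration, not something TextBlob returns):

```python
# (observed, predicted) pairs copied from the output above, in order
results = [
    ('pos', 'pos'),  # The beer was good.
    ('neg', 'neg'),  # I do not enjoy my job
    ('neg', 'neg'),  # I ain't feeling dandy today.
    ('pos', 'pos'),  # I feel amazing!
    ('pos', 'neg'),  # Gary is a friend of mine. (the one miss)
    ('neg', 'neg'),  # I can't believe I'm doing this.
]

# Count the pairs where the prediction agrees with the observed label
correct = sum(observed == predicted for observed, predicted in results)
accuracy = correct / len(results)
print(f"{correct}/{len(results)} = {accuracy:.2f}")
```

With only six test sentences, a single misclassification moves the score by about 17 points, so this number is a rough indication at best.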



In [16]:
# Let's see its ability to generalize
cl.classify("Their burgers are amazing")  #=> "pos"

'pos'

In [17]:
cl.classify("I don't like their pizza.")  #=> "neg"

'neg'

In [18]:
cl.show_informative_features()

Most Informative Features
          contains(this) = True              neg : pos    =      2.3 : 1.0
          contains(this) = False             pos : neg    =      1.8 : 1.0
            contains(an) = False             neg : pos    =      1.6 : 1.0
          contains(This) = False             neg : pos    =      1.6 : 1.0
             contains(I) = True              neg : pos    =      1.4 : 1.0
             contains(I) = False             pos : neg    =      1.4 : 1.0
          contains(deal) = False             pos : neg    =      1.2 : 1.0
          contains(love) = False             neg : pos    =      1.2 : 1.0
         contains(tired) = False             pos : neg    =      1.2 : 1.0
          contains(very) = False             neg : pos    =      1.2 : 1.0
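
Each row above is a likelihood ratio: how much more probable a feature value is under one label than under the other. The sketch below recomputes the two `contains(this)` rows from the training sentences. It assumes add-0.5 smoothing over the two feature values (True/False) and uses a simple regex tokenizer in place of the library's own; the helper names `tokens` and `smoothed_prob` are made up for this illustration. Under those assumptions it reproduces the printed ratios.

```python
import re

train = [
    ('I love this sandwich.', 'pos'),
    ('This is an amazing place!', 'pos'),
    ('I feel very good about these beers.', 'pos'),
    ('This is my best work.', 'pos'),
    ("What an awesome view", 'pos'),
    ('I do not like this restaurant', 'neg'),
    ('I am tired of this stuff.', 'neg'),
    ("I can't deal with this", 'neg'),
    ('He is my sworn enemy!', 'neg'),
    ('My boss is horrible.', 'neg'),
]

def tokens(text):
    """Case-sensitive word set (a simplification of the real tokenizer)."""
    return set(re.findall(r"[A-Za-z']+", text))

def smoothed_prob(word, present, label, gamma=0.5):
    """P(contains(word) == present | label) with add-gamma smoothing (assumed)."""
    docs = [tokens(text) for text, lab in train if lab == label]
    count = sum((word in doc) == present for doc in docs)
    # Two possible feature values (True/False), hence 2 * gamma in the denominator
    return (count + gamma) / (len(docs) + 2 * gamma)

# contains(this) = True: neg relative to pos
ratio_true = smoothed_prob("this", True, "neg") / smoothed_prob("this", True, "pos")
# contains(this) = False: pos relative to neg
ratio_false = smoothed_prob("this", False, "pos") / smoothed_prob("this", False, "neg")

print(f"contains(this) = True   neg : pos = {ratio_true:.1f} : 1.0")
print(f"contains(this) = False  pos : neg = {ratio_false:.1f} : 1.0")
```

Note that `this` and `This` are counted as different features, which is why both appear in the table. Lowercase `this` occurs in three of the five negative sentences but only one of the positive ones, so the classifier has latched onto a quirk of this tiny training set rather than anything about sentiment.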


TextBlob for sentiment
------

[The docs for TextBlob](https://textblob.readthedocs.org/en/dev/quickstart.html#sentiment-analysis)

[Here is a breakdown of the tool](http://planspace.org/20150607-textblob_sentiment/)

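It works out of the box, with no training step on our side: by default, TextBlob scores text against a built-in sentiment lexicon.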

In [2]:
from textblob import TextBlob

In [3]:
# Roger Ebert hatin' http://www.rogerebert.com/reviews/north-1994

testimonial = TextBlob("""I have no idea why Rob Reiner, or anyone else, wanted to make this story into a movie, and close examination of the film itself is no help. 
"North" is one of the most unpleasant, contrived, artificial, cloying experiences I've had at the movies. 
To call it manipulative would be inaccurate; it has an ambition to manipulate, but fails""")

In [4]:
testimonial.sentiment

Sentiment(polarity=-0.35, subjectivity=0.7)

In [5]:
testimonial.sentiment.polarity

-0.35
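
Polarity is a float between -1.0 (most negative) and 1.0 (most positive). To compare it with the 'pos'/'neg' labels used by the classifier earlier, one option is to threshold the score. The helper below is a hypothetical sketch: `polarity_to_label` and its `cutoff` of 0.0 are assumptions made for illustration, not part of TextBlob.

```python
def polarity_to_label(polarity, cutoff=0.0):
    """Map a polarity score in [-1.0, 1.0] to 'pos' or 'neg' (hypothetical helper)."""
    # Scores at or below the cutoff are treated as negative
    return 'pos' if polarity > cutoff else 'neg'

# Polarity of the Ebert review, copied from the output above
ebert_label = polarity_to_label(-0.35)
print(ebert_label)
```

A cutoff of 0.0 lumps neutral text in with negative text; a real application might reserve a band around zero for a third, neutral label.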

In [6]:
TextBlob("Hummus is Greek Food").sentiment.subjectivity # Subjectivity = 0.0, completely objective

0.0

In [7]:
TextBlob("Hummus is the most awesome dip to put on a pita.").sentiment.subjectivity # Subjectivity = 0.75, very subjective

0.75