# Voting System

The goal of this project is to create a voting system for bivariant sentiment analysis of any type of short reviews. To achieve this we are going to combine Naive Bayes algorithm from `nltk` and similar algorithms from `scikit-learn`. This combination should increase the accuracy and reliability of the confidence percentages. The training and testing will be done on the short reviews from https://pythonprogramming.net/.

Note: We will also use `pickle` to save the trained classifiers and sets to reduce the running time.

In [120]:
import nltk
import random
import pickle
import os

from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.naive_bayes import MultinomialNB, BernoulliNB
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.svm import SVC, LinearSVC, NuSVC

from nltk.classify import ClassifierI
from statistics import mode
from nltk.tokenize import word_tokenize


def pickle_object(classifier, file_path):
    with open(file_path, 'wb') as f:
        # Take contents of trained classifier and put it to the new file   
        pickle.dump(classifier, f)

def unpickle_object(file_path):
    if not os.path.isfile(file_path):
        return None
    with open(file_path, 'rb') as f:
        # Get trained classifier to work with it
        return pickle.load(f)



# Load sets from serialised files
training_sets = unpickle_object('training_sets.pickle')
testing_sets = unpickle_object('testing_sets.pickle')

# Load classifiers from serialised files
naive_bayes_classifier = unpickle_object('naivebayes.pickle')
MultinomialNB_classifier = unpickle_object('multinomialnb.pickle')
BernoulliNB_classifier = unpickle_object('bernoullinb.pickle')
LogisticRegression_classifier = unpickle_object('logistic_regression.pickle')
SGDClassifier_classifier = unpickle_object('sgd_classifier.pickle')
SVC_classifier = unpickle_object('svc_classifier.pickle')
LinearSVC_classifier = unpickle_object('linear_svc_classifier.pickle')
NuSVC_classifier = unpickle_object('nu_svc_classifier.pickle')


# Upload and prepare training and testing sets if not done yet
if training_sets is None or testing_sets is None:
    short_pos = open('positive.txt', 'r').read()
    short_neg = open('negative.txt', 'r').read()


    # List speach parts that will be analysed by classifiers
    allowed_word_types = ['JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS']
    
    # Create list of reviews with labels
    documents = []

    # Add each positive review on a separate line in documents
    for review in short_pos.split('\n'):
        documents.append((review, 'pos'))

   # Same for negative reviews
    for review in short_neg.split('\n'):
        documents.append((review, 'neg'))

   # Randomise reviews order to mix positives and negatives 
    random.shuffle(documents)


    # Create list of most common words of allowed speach parts
    all_words = []
    # Tokenise each word in positive reviews
    words = word_tokenize(short_pos)
    # Add parts of speach tags to each tokenised word
    pos = nltk.pos_tag(words)

    # Collect to new list only those tagged words from positive reviews that are in allowed list
    for w in pos:
        if w[1] in allowed_word_types:
            all_words.append(w[0].lower())

    # Tokenise each word in negative reviews
    words = word_tokenize(short_neg)
    # Add parts of speach tags to each tokenised word
    neg = nltk.pos_tag(words)
    
    # Collect to list with positives only those tagged words from negative reviews that are in allowed list
    for w in pos:
        if w[1] in allowed_word_types:
            all_words.append(w[0].lower())

    # Count number of occurances for each word and sort in desc. order
    all_words = nltk.FreqDist(all_words)

    # Take 3000 most common from all available words
    most_common = list(all_words.keys())[:3000]



    def find_features(document):
        words = word_tokenize(document)
        features = {}
        for word in most_common:
            features[word] = word in words
        return features

    # Create a classification list in which each word marked to tell whether or not it is also in the list of most common words
    feature_sets = [(find_features(rev), category) for (rev, category) in documents]

    # Create training set from the list of marked words
    training_sets = feature_sets[:10500]
    # Create testing set from the list of marked words that are not in training set
    testing_sets = feature_sets[10500:]

    # Serialise shuffled training set if not done
    pickle_object(training_sets, 'training_sets.pickle')
    # Serialise shuffled testing set if not done
    pickle_object(testing_sets, 'testing_sets.pickle')


# Train and serialise all classifiers if not done
if naive_bayes_classifier is None:
    # Train classifier on training sets
    naive_bayes_classifier = nltk.NaiveBayesClassifier.train(training_sets)
    # Serialise classifier
    pickle_object(naive_bayes_classifier, 'naivebayes.pickle')

if MultinomialNB_classifier is None:
    MultinomialNB_classifier = SklearnClassifier(MultinomialNB())
    MultinomialNB_classifier.train(training_sets)
    pickle_object(MultinomialNB_classifier, 'multinomialnb.pickle')

if BernoulliNB_classifier is None:
    BernoulliNB_classifier = SklearnClassifier(BernoulliNB())
    BernoulliNB_classifier.train(training_sets)
    pickle_object(BernoulliNB_classifier, 'bernoullinb.pickle')

if LogisticRegression_classifier is None:
    LogisticRegression_classifier = SklearnClassifier(LogisticRegression())
    LogisticRegression_classifier.train(training_sets)
    pickle_object(LogisticRegression_classifier, 'logistic_regression.pickle')

if SGDClassifier_classifier is None:
    SGDClassifier_classifier = SklearnClassifier(SGDClassifier())
    SGDClassifier_classifier.train(training_sets)
    pickle_object(SGDClassifier_classifier, 'sgd_classifier.pickle')

if SVC_classifier is None:
    SVC_classifier = SklearnClassifier(SVC())
    SVC_classifier.train(training_sets)
    pickle_object(SVC_classifier, 'svc_classifier.pickle')

if LinearSVC_classifier is None:
    LinearSVC_classifier = SklearnClassifier(LinearSVC())
    LinearSVC_classifier.train(training_sets)
    pickle_object(LinearSVC_classifier, 'linear_svc_classifier.pickle')

if NuSVC_classifier is None:
    NuSVC_classifier = SklearnClassifier(NuSVC())
    NuSVC_classifier.train(training_sets)
    pickle_object(NuSVC_classifier, 'nu_svc_classifier.pickle')

We've uploaded processed and serialised training and testing datasets, as well as trained and serialised classifiers. Let's check their accuracy percentages at this point. 

In [121]:
# Get Naive Bayes classifier from nltk accuracy on testing sets
nltk.classify.accuracy(naive_bayes_classifier, testing_sets)*100

71.34146341463415

In [122]:
# Get accuracy on testing sets of MultinomialNB_classifier from sklearn 
nltk.classify.accuracy(MultinomialNB_classifier, testing_sets)*100

71.95121951219512

In [123]:
nltk.classify.accuracy(BernoulliNB_classifier, testing_sets)*100

71.95121951219512

In [124]:
nltk.classify.accuracy(LogisticRegression_classifier, testing_sets)*100

71.34146341463415

In [125]:
nltk.classify.accuracy(SGDClassifier_classifier, testing_sets)*100

73.78048780487805

In [126]:
nltk.classify.accuracy(SVC_classifier, testing_sets)*100

70.1219512195122

In [127]:
nltk.classify.accuracy(LinearSVC_classifier, testing_sets)*100

68.90243902439023

In [128]:
nltk.classify.accuracy(NuSVC_classifier, testing_sets)*100

70.1219512195122

On a selected part of shuffled dataset all algorithms are doing fairly well. Now let's create a voting system for all classifiers.

In [129]:
class VoteClassifier(ClassifierI):
    def __init__(self, *classifiers):
        self._classifiers = classifiers
    # Determine whether review is positive or negative
    def classify(self, features):
        votes = []
        for c in self._classifiers:
            v = c.classify(features)
            votes.append(v)
        return mode(votes)
    # Give confidence percentages
    def confidence(self, features):
        votes = []
        for c in self._classifiers:
            v = c.classify(features)
            votes.append(v)
        chosen_votes = votes.count(mode(votes))
        conf = chosen_votes / len(votes)
        return conf

voted_classifier = VoteClassifier(
    naive_bayes_classifier,
    MultinomialNB_classifier,
    BernoulliNB_classifier,
    LogisticRegression_classifier, 
    SGDClassifier_classifier,
    SVC_classifier,
    LinearSVC_classifier,
    NuSVC_classifier
    )


Let's check accuracy of the voting system.

In [130]:
nltk.classify.accuracy(voted_classifier, testing_sets)*100

71.34146341463415

Now we are going to run through our voting system several reviews to get their particular classification and confidence percentages. 

In [131]:
for x in range(20,40):
    print(f'Classification of the review #{x}:', voted_classifier.classify(testing_sets[x][0]))
    print('Confidence in %:', voted_classifier.confidence(testing_sets[x][0])*100)
    print('--------')

Classification of the review #20: pos
Confidence in %: 100.0
--------
Classification of the review #21: neg
Confidence in %: 100.0
--------
Classification of the review #22: neg
Confidence in %: 100.0
--------
Classification of the review #23: neg
Confidence in %: 100.0
--------
Classification of the review #24: pos
Confidence in %: 100.0
--------
Classification of the review #25: neg
Confidence in %: 100.0
--------
Classification of the review #26: neg
Confidence in %: 50.0
--------
Classification of the review #27: neg
Confidence in %: 100.0
--------
Classification of the review #28: pos
Confidence in %: 100.0
--------
Classification of the review #29: pos
Confidence in %: 100.0
--------
Classification of the review #30: pos
Confidence in %: 100.0
--------
Classification of the review #31: pos
Confidence in %: 100.0
--------
Classification of the review #32: pos
Confidence in %: 100.0
--------
Classification of the review #33: neg
Confidence in %: 100.0
--------
Classification of the

Now let's perform sentiment analysis of any free text using our voting system.  

In [132]:
def sentiment(text):
    features = find_features(text)
    return voted_classifier.classify(features), voted_classifier.confidence(features)

In [133]:
sentiment('This movie is awesome!')

('pos', 1.0)

In [134]:
sentiment('This movie is not so awesome!')

('neg', 0.875)

In [135]:
sentiment('This movie is not awesome!')

('pos', 0.75)

In [136]:
sentiment('This movie is ok')

('pos', 1.0)

In [137]:
sentiment('This movie is not ok')

('pos', 0.875)

In [138]:
sentiment('I think, my dress makes me look fat, but on the high heels I look better in it.')

('neg', 1.0)

In [139]:
sentiment('My dog has huge fangs, I think it has wolves in her close ancestors.')

('neg', 0.5)

### Conclusion:

- We created a voting system based on many similar algorithms which gave us more reliable classification results. The final accuracy of the voting system is about `70%`. 

- We also compared confidence of the voting system on exclusively positive and negative reviews and found out that positive have significantly lower confidence percentages - only `69%`, while negative had `81%` of confidence.
 
- On the level of individual reviews voting system gave us different results from `50%` of confidence to `100%`. Though, at first glance `50%` and lower seems to be not too comon of a result, for more information further research is required. 

- Sentiment analysis of free text gives more or less adequate results, though it has obvious flaws when sentences are more ambiguous in their belonging to a positive or negative connotation or contain adverb "not" in various positions. Nevetheless all these flaws are typical for bivariant classifiers, so to work successfully with such classifiers one should account for them.