# Naive Bayes Sentiment Analysis Report

In [None]:
# remove_cell
import pandas as pd
import numpy as np
from naive_bayes import create_dataframe, clean_review, run_naive_bayes

In [None]:
# remove_output
path = '../../data/txt_sentoken'
df = create_dataframe(path)
df['CleanReview'] = df['Review'].apply(clean_review)

ratio = 0.9
train_num = int(1000 * ratio)
test_num = 1000 - train_num
df_train = pd.concat([df.iloc[:train_num,:],df.iloc[1000:train_num + 1000,:]]).reset_index()
df_test = pd.concat([df.iloc[1000 - test_num:1000,:],df.iloc[-test_num:,:]]).reset_index()
df_train.to_pickle('train.pkl')
df_test.to_pickle('test.pkl')

print(f'Training data: {df_train.shape}, Testing data: {df_test.shape}')
nb = run_naive_bayes('train.pkl', 'test.pkl')

Training data: (1800, 4), Testing data: (200, 4)
Reading in the train and test data
-
Creating classifier
-
Training the classifier
-
Making predictions
-
Evaluating predictions
-
All done! You achieved 84.50% accuracy!




In [None]:
# remove_cell
correct_predictions = nb.test_reviews[nb.predictions == nb.labels].apply(lambda x: ' '.join(x))
error_predictions = nb.test_reviews[nb.predictions != nb.labels].apply(lambda x: ' '.join(x))

In [None]:
# remove_cell
def helper(word, sent):
    # Score a word by its likelihood under the predicted class, but only if
    # the word is more likely under that class than under the other one.
    p_pos = nb.prob_word_given_positive[word]
    p_neg = nb.prob_word_given_negative[word]
    if sent == 'positive' and p_pos > p_neg:
        return p_pos
    elif sent != 'positive' and p_pos < p_neg:
        return p_neg
    else:
        return 0

In [None]:
# remove_cell
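def keywords(i, top_n=50):
    # Rank the review's tokens by how strongly they support the predicted class.
    # Duplicates are kept, so a frequent word can fill several slots.
    review = nb.test_reviews[i]
    predicted = nb.predictions[i]
    ranked = sorted(review, key=lambda word: helper(word, predicted), reverse=True)
    top_keywords = list(enumerate(ranked[:top_n], start=1))
    return ' '.join(str(pair) for pair in top_keywords)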

In [None]:
# remove_cell
def summary(i):
    print(f'The review {len(" ".join(nb.test_reviews[i]))} characters')
    print(f'and {len(nb.test_reviews[i])} tokens.')
    print(f'The top 50 most influential words include:')
    print(f'{keywords(i)}')

## Analysis of Incorrect Predictions
At the bottom of this report there are generated summaries of 20 reviews: 5 misclassified positive reviews, 5 misclassified negative reviews, 5 correctly classified positive reviews and 5 correctly classified negative reviews.

Each summary reports the number of characters and tokens in the review, along with its top 50 keywords: words that have high probabilities and are also probabilistically biased toward the predicted class. For example, if a review was predicted as 'positive', a word with P(word|positive) > P(word|negative) is treated as a keyword.
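
The selection rule described above can be sketched on toy data. The probabilities below are invented for illustration and are not taken from the trained model, and `score_word` and `top_keywords` are hypothetical names that mirror the rule rather than the notebook's own helpers.

```python
# Toy conditional probabilities (made up for illustration only).
p_word_given_pos = {"great": 0.030, "but": 0.020, "bad": 0.004, "story": 0.015}
p_word_given_neg = {"great": 0.010, "but": 0.025, "bad": 0.020, "story": 0.012}


def score_word(word, predicted):
    """Return P(word | predicted class) if the word favours that class, else 0."""
    p_pos = p_word_given_pos.get(word, 0.0)
    p_neg = p_word_given_neg.get(word, 0.0)
    if predicted == "positive":
        return p_pos if p_pos > p_neg else 0.0
    return p_neg if p_neg > p_pos else 0.0


def top_keywords(tokens, predicted, n=50):
    """Rank tokens by score, keeping duplicates, and return the first n."""
    ranked = sorted(tokens, key=lambda w: score_word(w, predicted), reverse=True)
    return ranked[:n]


tokens = ["great", "story", "but", "bad", "but"]
print(top_keywords(tokens, "negative", n=3))  # ['but', 'but', 'bad']
```

Because tokens are ranked without de-duplication, a frequent word such as "but" can occupy several of the top slots, which is consistent with the repeated entries in the generated summaries.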

The next three reviews are positive but have been classified as negative.

### Review 9
The review contains words such as "flaws", "holes", and "problems", which may have caused the classifier to interpret it as negative.
The review says that the film has "enough holes to ruin the tale" and describes the "pivotal clue" as "obscure". These criticisms may have contributed to the misclassification.
The review also says that "the comparison between the two films is inevitable" and that "no doubt the original is better". These phrases contain words like "inevitable" and "doubt", which may have swayed the classifier toward negative.

In the summary below, we see that "but", "not", "it's", "like", "out", "up", "just" and "no" are some of the negatively leaning words that occur frequently in the review.

### Review 11
Below are some phrases that may have led the model to classify the review as negative.
- "utterly useless"
- "rather pointless"
- "stupid maybe but once i'm not disappointed"

These negative statements may have confused the model despite the review's overall positive theme.

### Review 15
Review 15 contains several examples of negation and contrast. These would confuse the Naive Bayes model I developed because it does not handle negation; it just uses a bag of words.
- "Despite its flaws, the film still manages entertaining." (the contrast word "despite" subordinates the negative "flaws" to the positive "entertaining")
- "While the plot may seem ridiculous, the cleverness of the script and the talent of the cast keep it alive." (the contrast word "while" subordinates the negative "ridiculous" to the positive "cleverness of the script" and "talent of the cast")
- "The movie isn't likely to make a bundle like its predecessors, but that's what works." (the negation "isn't", combined with "but", turns the negative "isn't likely to make a bundle" into the positive "what works")
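
The effect of discarding word order can be shown with a minimal sketch. The two sentences are made up, and `bag_of_words` is a hypothetical stand-in that splits on whitespace; it is not the notebook's `clean_review` tokenizer.

```python
from collections import Counter


def bag_of_words(text):
    """Lower-case, split on whitespace, and count tokens; word order is discarded."""
    return Counter(text.lower().split())


a = bag_of_words("not good , just bad")
b = bag_of_words("not bad , just good")
print(a == b)  # True: opposite meanings, identical features
```

Since both sentences map to the same token counts, any classifier that sees only these counts must assign them the same label.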

The following two reviews are negative but have been classified as positive.

### Review 167
This review was predicted as positive despite being a negative review overall. The misclassification is puzzling at first because the review is overwhelmingly negative. Looking at the keywords, we see:

- "all-star cast", "beautiful teenage girl", "adventurous" (all very positive language)
- "works more as bad laugh all melodrama" ("works" and "laugh" read as positive here)
- "the most hilarious scene", "most laughable" (even though these are negative comments, they use positive language)

Overall, however, this review is very negative. It is surprising to see it misclassified given the negative language used throughout, such as "innately disgusting" and "non-thrilling action scenes".

### Review 178
This review uses positive language such as "lavish sets" and "fancy costumes" as well as "luscious cinematography". These adjectives are very positive but are drowned out by the negative perspective on the movie.

This is an example of a review that the Naive Bayes model struggles with: the positive vocabulary sits inside complex phrasing that carries a negative overall judgement.
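
A rough sketch of how positive-sounding words can outvote a negative one under the Naive Bayes independence assumption is given below. The likelihoods are invented for illustration and are not the trained model's values; `log_odds` is a hypothetical helper.

```python
import math

# Made-up per-word likelihoods; not taken from the trained classifier.
p_pos = {"hilarious": 0.008, "beautiful": 0.009, "lavish": 0.004, "disgusting": 0.001}
p_neg = {"hilarious": 0.004, "beautiful": 0.005, "lavish": 0.002, "disgusting": 0.004}


def log_odds(tokens, prior_pos=0.5):
    """Sum log P(w|pos)/P(w|neg) over tokens; a result > 0 favours 'positive'."""
    total = math.log(prior_pos / (1.0 - prior_pos))
    for w in tokens:
        if w in p_pos and w in p_neg:  # skip words with no toy estimate
            total += math.log(p_pos[w] / p_neg[w])
    return total


review = ["hilarious", "beautiful", "lavish", "disgusting"]
print("positive" if log_odds(review) > 0 else "negative")  # positive
```

Each word contributes independently of its context, so three mildly positive words outweigh one strongly negative word here even if the reviewer used them sarcastically.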

## Analysis of Correct Predictions
I have analysed 5 reviews: 3 that are positive and 2 that are negative. In each review, I point out details that may have influenced the model to make the correct decision.

### Review 0
Based on the keywords given in the summary, "story" and "character" appear to be positively associated words that occur often in this review. These words are not inherently good or bad, but perhaps they are a focus of positive reviews.
Below is a list of words and phrases from the review that have positive connotations:
- magical
- extremely good
- memorable
- spectacular
- astonishing
- successful

Already there is a stark contrast between correctly and incorrectly classified reviews: the language here is much clearer and more decisive.

### Review 1
Review 1 is a little hit and miss. Its overall sentiment is positive, but some of the language is uncertain.
- "very good special effects" (good and special are clear positive words)
- "never fails" (a negation on fails using never, this is a phrase that may confuse the model)
- "fair bit of action particularly towards end" (here is an example of uncertain language with positive sentiment)
- "definitely watchable" (this is a strange expression, definitely has such certainty while watchable indicates a level of mediocrity)

Overall, I think Review 1 leans strongly positive. It makes sense that the model classified it correctly, considering the number of positive phrases.

### Review 2
Review 2 is a mixed review, with plenty of both positive and negative language. Here is some of the positive language:
- "magnificent bombast of mr downey jr's performance" (magnificent and bombast are powerfully positive)
- "captivatingly dynamic blusterous stealthy piece work" (this expression is quite mixed, often blusterous is negative but has a positive leaning in this context)
- "never less than wildly entertaining and insightful" (never and less may confuse the model but entertaining and insightful may push to the positive side)

Some examples of negative comments include:
- "several sequences fall flat" (fall flat is a phrase that won't be recognised by the model but it may place negative sentiment on the word 'flat')
- "unconvincing fashion", "uncompelling weepy finale" (unconvincing and uncompelling are undesirable)
- "film's thematic narrative rather dubious" (there is a contrast between "thematic" and "dubious" that our model may not understand; instead, "thematic" and "dubious" may cancel one another out)

Overall, this review is positive, but the mixed comments made it a challenging case for the model.

The following are two example negative reviews correctly classified.

### Review 195
This review is the most negative I have seen in the dataset. It is strongly derogatory toward the movie and uses a variety of negative words and phrases that sway the model toward classifying the review correctly.
- "clearly worst" (definitive language)
- "wasn't anything i'd lose sleep over" (an example of negation)
- "mtv-influenced tripe" (strongly negative word use however mtv-influenced may not be common in the corpus)

The positive language used in this review was negligible.

This review is an example of where the Naive Bayes model works very effectively: the language is clear (bar the few examples mentioned above) and there is little nuance.

### Review 196
Review 196 is very negative, but it does contain language that would normally be positive and is used here in a negative context.
- "bizarre enjoyment quality all its own" - enjoyment is typically positive, but here it's just laughing at the film's failures.
- "slick modern masterpiece Deliverance" - masterpiece is positive, but it's not talking about this film.

Despite this, the model makes the right decision because of the volume of other negative language used throughout.


## General Comments and Conclusion
As a whole, the Naive Bayes sentiment classifier achieved 84.50% accuracy on the 200 held-out test reviews, which is a strong result for such a simple model. There are places where it works well and others where it does not.
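
For reference, accuracy here is the fraction of test reviews whose predicted label matches the true label. The sketch below uses toy lists rather than the notebook's `nb.predictions` and `nb.labels`, and `accuracy` is a hypothetical helper, not a function from `naive_bayes`.

```python
def accuracy(predictions, labels):
    """Fraction of positions where the prediction equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy example (not the notebook's data): 3 of 4 predictions are right.
preds = ["positive", "negative", "negative", "positive"]
labels = ["positive", "negative", "positive", "positive"]
print(f"{accuracy(preds, labels):.2%}")  # 75.00%

# With 200 test reviews, 84.50% accuracy corresponds to 169 correct predictions.
print(round(0.845 * 200))  # 169
```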

In summary, the performance of the Naive Bayes classifier depends on several factors, including the data (is the language definitive or ambiguous?), the vocabulary, and the amount of training data (are there enough examples for reliable probabilities?). While this model can work well in some cases, it may make mistakes when the sentiment is not determined solely by individual words, or when there are many rare or ambiguous words that can cause confusion.
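
The point about rare words can be illustrated with a small sketch of add-alpha (Laplace) smoothing, a common way to estimate P(word|class). This report does not show whether the classifier in `naive_bayes` uses this scheme, so treat it as an assumption; the corpora are made up and `smoothed_prob` is a hypothetical helper.

```python
from collections import Counter


def smoothed_prob(word, class_counts, vocab_size, alpha=1.0):
    """Estimate P(word | class) with add-alpha (Laplace) smoothing."""
    total = sum(class_counts.values())
    return (class_counts[word] + alpha) / (total + alpha * vocab_size)


# Tiny made-up corpora, one token list per class.
pos_counts = Counter("good good great fun tripe".split())
neg_counts = Counter("bad bad dull good".split())
vocab = set(pos_counts) | set(neg_counts)

for word in ("good", "tripe"):
    p_pos = smoothed_prob(word, pos_counts, len(vocab))
    p_neg = smoothed_prob(word, neg_counts, len(vocab))
    print(word, round(p_pos, 3), round(p_neg, 3))
```

In this toy corpus a single stray occurrence of "tripe" in a positive review is enough to make the word look positive, which is the kind of unreliable estimate that more training data would correct.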

## Automated Summaries

In [None]:
for i in list(error_predictions.index[:5]) + list(error_predictions.index[-5:]) + \
        list(correct_predictions.index[:5]) + list(correct_predictions.index[-5:]):
    print('-' * 100)
    print(f'Summary of Review {i}')
    print('-' * 100)
    print(f'Review: {i}, Predicted: {nb.predictions[i]}, Label: {nb.labels[i]}')
    summary(i)  # summary() prints its lines directly and returns None
    print()
    print(' '.join(nb.test_reviews[i]))

----------------------------------------------------------------------------------------------------
Summary of Review 9
----------------------------------------------------------------------------------------------------
Review: 9, Predicted: negative, Label: positive
The review 2477 characters
and 357 tokens.
The top 50 most influential words include:
(1, 'but') (2, 'but') (3, 'but') (4, 'but') (5, 'but') (6, 'but') (7, 'not') (8, 'not') (9, 'was') (10, 'all') (11, "it's") (12, "it's") (13, "it's") (14, "it's") (15, "it's") (16, 'like') (17, 'out') (18, 'if') (19, 'up') (20, 'up') (21, 'just') (22, 'just') (23, 'no') (24, 'no') (25, 'no') (26, 'no') (27, 'even') (28, 'only') (29, 'good') (30, 'time') (31, 'time') (32, 'can') (33, 'can') (34, 'bad') (35, 'two') (36, 'two') (37, 'two') (38, 'characters') (39, 'had') (40, 'had') (41, 'because') (42, 'how') (43, "don't") (44, 'scene') (45, "doesn't") (46, "there's") (47, 'better') (48, 'something') (49, 'big') (50, 'should')

gere w

In [None]:
# remove_cell
!jupyter nbconvert --to html --no-input Analysis.ipynb --output=Analysis.html --TagRemovePreprocessor.remove_cell_tags="{'remove_cell'}" --TagRemovePreprocessor.remove_all_outputs_tags="{'remove_output'}"

[NbConvertApp] Converting notebook Analysis.ipynb to html
[NbConvertApp] Writing 675170 bytes to Analysis.html
