# Part 1: Identifying Misclassified Reviews

### Importing and Concatenating Reviews

In [1]:
import numpy as np
import os
import random
import math

In [2]:
pos_directory = '../Part_3/review_polarity/txt_sentoken/pos'
positive_reviews_list = []

for filename in os.listdir(pos_directory):
    file_path = os.path.join(pos_directory, filename)
    review = []
    # Use a context manager so each file is closed after it has been read
    with open(file_path, "r") as file:
        for line in file.readlines():
            review.append(line.rstrip())
    positive_reviews_list.append(" ".join(review))

In [3]:
neg_directory = '../Part_3/review_polarity/txt_sentoken/neg'
negative_reviews_list = []

for filename in os.listdir(neg_directory):
    file_path = os.path.join(neg_directory, filename)
    review = []
    # Use a context manager so each file is closed after it has been read
    with open(file_path, "r") as file:
        for line in file.readlines():
            review.append(line.rstrip())
    negative_reviews_list.append(" ".join(review))
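The two loading cells differ only in the directory they read, so they could share one helper. The sketch below is an illustration, not the notebook's code: `load_reviews` is a hypothetical name, and it assumes every file in the directory is one review. It also sorts the filenames, because `os.listdir` returns entries in arbitrary order and the later `[:900]` split depends on that order.

```python
import os

def load_reviews(directory):
    """Return one string per file in `directory`, with its lines joined by spaces."""
    reviews = []
    # os.listdir gives no ordering guarantee, so sort for a reproducible split
    for filename in sorted(os.listdir(directory)):
        file_path = os.path.join(directory, filename)
        with open(file_path, "r") as file:
            reviews.append(" ".join(line.rstrip() for line in file))
    return reviews
```

With this helper, `positive_reviews_list = load_reviews(pos_directory)` would replace the loop above, although sorting may change which reviews land in the test split.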

### Creating Model Functions

In [4]:
def create_frequency_dictionary(list_of_reviews):
    review_dictionary = {}
    for review in list_of_reviews:
        words = review.split()
        for word in words:
            if word in review_dictionary:
                review_dictionary[word] += 1
            else:
                review_dictionary[word] = 1
    return review_dictionary

In [5]:
def result_for_class(review, frequency_dictionary):
    punctuation = ['.', ',', ':', '"', '&', '?', '-', '(', ')', "'", '/']
    # Total token count for the class, computed once rather than once per word
    total_count = sum(frequency_dictionary.values())
    # Start from the log prior: both classes are treated as equally likely
    class_result = math.log(0.5)
    for word in review.split():
        if word in punctuation:
            continue
        elif word not in frequency_dictionary:
            class_result += math.log(1 / (total_count + 1))
        else:
            class_result += math.log((frequency_dictionary[word] + 1) / (total_count + 1))
    return class_result
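The scoring function above adds 1 to each word count but only 1 to the denominator. Textbook add-one (Laplace) smoothing adds the vocabulary size to the denominator instead, so that the smoothed probabilities sum to 1. The sketch below shows that variant for comparison only. It is not the method used in this notebook, the names `smoothed_log_score` and `vocabulary_size` are assumptions, and using it would change every number reported later.

```python
import math

# Same tokens that result_for_class skips
PUNCTUATION = {'.', ',', ':', '"', '&', '?', '-', '(', ')', "'", '/'}

def smoothed_log_score(review, frequency_dictionary, vocabulary_size, prior=0.5):
    """Log prior plus the sum of add-one smoothed log word probabilities."""
    # Denominator is total tokens plus vocabulary size (standard Laplace smoothing)
    denominator = sum(frequency_dictionary.values()) + vocabulary_size
    score = math.log(prior)
    for word in review.split():
        if word in PUNCTUATION:
            continue
        count = frequency_dictionary.get(word, 0)
        score += math.log((count + 1) / denominator)
    return score
```

Here `vocabulary_size` would typically be the number of distinct words across both training dictionaries, for example `len(set(positive_frequency_dictionary) | set(negative_frequency_dictionary))`.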

### Splitting into Training and Testing Data and Assigning Frequency Dictionary

In [6]:
negative_training_data = negative_reviews_list[:900]
positive_training_data = positive_reviews_list[:900]
negative_test_data = negative_reviews_list[900:]
positive_test_data = positive_reviews_list[900:]

In [7]:
negative_frequency_dictionary = create_frequency_dictionary(negative_training_data)
positive_frequency_dictionary = create_frequency_dictionary(positive_training_data)

### Creating Lists for Misclassified Reviews

In [8]:
positive_reviews_misclassified = []

for review in positive_test_data:
    positive_result = result_for_class(review, positive_frequency_dictionary)
    negative_result = result_for_class(review, negative_frequency_dictionary)
    if positive_result < negative_result:
        positive_reviews_misclassified.append(review)

In [9]:
negative_reviews_misclassified = []

for review in negative_test_data:
    positive_result = result_for_class(review, positive_frequency_dictionary)
    negative_result = result_for_class(review, negative_frequency_dictionary)
    if positive_result > negative_result:
        negative_reviews_misclassified.append(review)
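The two loops above differ only in the direction of the comparison. A possible refactoring is sketched below, with the scoring function passed in as a parameter so the block stands alone. `classify` and `misclassified` are hypothetical names. Note one behavioural difference: ties are assigned to the negative class here (an arbitrary choice), whereas the loops above never count a tie as a misclassification.

```python
def classify(review, score, positive_dict, negative_dict):
    """Return 'pos' or 'neg'; a tie goes to 'neg' (an arbitrary choice)."""
    positive_result = score(review, positive_dict)
    negative_result = score(review, negative_dict)
    return "pos" if positive_result > negative_result else "neg"

def misclassified(reviews, true_label, score, positive_dict, negative_dict):
    """Return the reviews whose predicted label differs from `true_label`."""
    return [
        review for review in reviews
        if classify(review, score, positive_dict, negative_dict) != true_label
    ]
```

In this notebook the call would look like `misclassified(positive_test_data, "pos", result_for_class, positive_frequency_dictionary, negative_frequency_dictionary)`.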

# Part 2: Analysing Misclassification of Positive Reviews

### Deciding which 5 reviews to analyse

In [53]:
positive_reviews_sample_ids = random.sample(list(np.arange(0,len(positive_reviews_misclassified))), 5)
positive_reviews_sample_ids

[10, 12, 2, 18, 3]

The analysis.md file only needs 5 misclassified reviews in total (positive and negative combined). As there are 33 misclassified reviews and two thirds of them are positive reviews, we will take 3 positive reviews and 2 negative reviews to use in analysis.md.

In [31]:
random.sample([10, 12, 2, 18, 3], 3)

[2, 10, 12]
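The samples above come from an unseeded generator, so re-running the cells would select different reviews. A minimal sketch of a reproducible alternative follows. The helper name `sample_ids` and the seed value are assumptions, and a seeded run would not reproduce the IDs shown above.

```python
import random

def sample_ids(population_size, k, seed=0):
    """Draw k distinct indices from range(population_size), reproducibly."""
    rng = random.Random(seed)  # private generator, leaves the global state alone
    return rng.sample(range(population_size), k)
```

For example, `sample_ids(len(positive_reviews_misclassified), 5)` returns the same five indices on every run.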

## Part 2.1: Analysing Positive Review #12

To begin the analysis, we will investigate the tokens with the highest count in the review (and therefore the most influence) and see if these words appear more in the positive or negative dictionary.

In [33]:
review12_dict = create_frequency_dictionary([positive_reviews_misclassified[12]])

In [40]:
# sorted(review12_dict.items(), key=lambda x: x[1], reverse=True)[:20]

The top 20 most frequent words in this first review are:

1. ('.', 46) -> punctuation

2. ('the', 36) -> determiner
 
3. (',', 23) -> punctuation
 
4. ('to', 19) -> particle
 
5. ('and', 16) -> conjunction
 
6. ('in', 14) -> preposition
 
7. ('a', 13) -> determiner
 
8. ('of', 11) -> preposition
 
9. ('"', 10) -> punctuation
 
10. ('(', 9) -> punctuation
 
11. (')', 9) -> punctuation
 
12. ('that', 9) -> determiner/conjunction
 
13. ('is', 9) -> verb
 
14. ('film', 8) -> noun
 
15. ('as', 8) -> conjunction/preposition
 
16. ('car', 7) -> noun
 
17. ('his', 7) -> pronoun
 
18. ('just', 7) -> adjective/adverb
 
19. ('memphis', 6) -> noun
 
20. ('have', 6) -> verb

This is expected, as punctuation and determiners/particles/prepositions etc. are more common than nouns or adjectives. 
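Because punctuation and function words dominate the raw counts, it can help to filter them out before ranking. The sketch below uses `collections.Counter` from the standard library. `top_tokens` and its `ignore` parameter are hypothetical, and which tokens to ignore is left to the caller.

```python
from collections import Counter

def top_tokens(review, n=20, ignore=frozenset()):
    """Return the n most frequent whitespace-separated tokens not in `ignore`."""
    counts = Counter(word for word in review.split() if word not in ignore)
    return counts.most_common(n)
```

Calling `top_tokens(positive_reviews_misclassified[12], 20, ignore={'.', ',', 'the', 'to', 'and'})` would surface content words sooner than the unfiltered list.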

We will have a look at a few of the most common nouns, verbs, adjectives and adverbs.

* ('is', 9)
* ('film', 8)
* ('car', 7)
* ('just', 7)
* ('memphis', 6)
* ('have', 6)

In [62]:
common_words = ['is', 'film', 'car', 'just', 'memphis', 'have']

for word in common_words:
    if word in positive_frequency_dictionary:
        print(word + " appears in the positive dictionary " + str(positive_frequency_dictionary[word]) + " times")
    else:
        print(word + " appears in the positive dictionary 0 times")
    if word in negative_frequency_dictionary:
        print(word + " appears in the negative dictionary " + str(negative_frequency_dictionary[word]) + " times")
    else:
        print(word + " appears in the negative dictionary 0 times")

is appears in the positive dictionary 12549 times
is appears in the negative dictionary 9952 times
film appears in the positive dictionary 4376 times
film appears in the negative dictionary 3598 times
car appears in the positive dictionary 112 times
car appears in the negative dictionary 165 times
just appears in the positive dictionary 1197 times
just appears in the negative dictionary 1390 times
memphis appears in the positive dictionary 0 times
memphis appears in the negative dictionary 17 times
have appears in the positive dictionary 1992 times
have appears in the negative dictionary 2408 times


4 out of the 6 words appear more often in the negative reviews than in the positive reviews. This could be a possible cause of its misclassification.

It's still very difficult to understand how much these counts affect the overall classification result.

Rather than identifying the counts for each word in each review set (positive and negative), we will look at ratios to identify words in the review that appear proportionately more in the negative or positive training set.

In [66]:
def find_ratio_positive_review(word, positive_dict, negative_dict):
    
    # This function returns the ratio of the word's count in the negative dictionary
    # to its count in the positive dictionary
    # A result less than 1 means that the word appears more in the positive dictionary
    # A result more than 1 means that the word appears more in the negative dictionary
    
    if word not in negative_dict and word not in positive_dict:
        return 1 # because the word appears in both dictionaries 0 times (equal ratio)
    if word not in negative_dict:
        return 0 # because the word appears only in the positive dictionary (0 / count)
    elif word not in positive_dict:
        # the word appears only in the negative dictionary, so count / 0 is undefined;
        # return count + 1 instead
        return negative_dict[word] + 1
    
    return negative_dict[word] / positive_dict[word]    

In [54]:
find_ratio_positive_review('happy', positive_frequency_dictionary, negative_frequency_dictionary)

0.5614035087719298

This result of ~0.56 means that the word 'happy' appears in the negative dictionary 56% as often as it appears in the positive dictionary, i.e. it is considerably more common in positive reviews than in negative ones.
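The raw ratio needs special cases for missing words and does not account for the two dictionaries having different total token counts. One alternative is a smoothed log-ratio, sketched below. This is an assumption-laden illustration rather than the notebook's method: `log_ratio` is a hypothetical name, and the totals are passed in by the caller. It uses the same "+1" smoothing as `result_for_class`, so a positive value means the word pushes a review towards the negative class and a negative value towards the positive class.

```python
import math

def log_ratio(word, positive_dict, negative_dict, positive_total, negative_total):
    """Log of (smoothed negative rate / smoothed positive rate) for one word."""
    negative_rate = (negative_dict.get(word, 0) + 1) / (negative_total + 1)
    positive_rate = (positive_dict.get(word, 0) + 1) / (positive_total + 1)
    return math.log(negative_rate / positive_rate)
```

The totals would be `sum(positive_frequency_dictionary.values())` and `sum(negative_frequency_dictionary.values())`. A word absent from both dictionaries scores 0 only when the two totals are equal.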

In [55]:
score_dict = {}

for word in review12_dict.keys():
    score_dict[word] = find_ratio_positive_review(word, positive_frequency_dictionary, negative_frequency_dictionary)

In [56]:
sorted(score_dict.items(), key=lambda x: x[1], reverse=True)

[('bruckheimer', 19),
 ('memphis', 18),
 ('jolie', 15.0),
 ('angelina', 7.0),
 ('shiny', 6.0),
 ('hottest', 5),
 ('mindset', 3.5),
 ('jumping', 3.142857142857143),
 ('dominic', 3.0),
 ('improvise', 3.0),
 ('cage', 2.9166666666666665),
 ('bad', 2.833846153846154),
 ('remake', 2.5454545454545454),
 ('seconds', 2.4210526315789473),
 ('none', 2.3278688524590163),
 ('sena', 2),
 ('kalifornia', 2.0),
 ('nicolas', 2.0),
 ('overlord', 2),
 ('moby', 2),
 ('assorted', 2.0),
 ('lindo', 2.0),
 ('chases', 2.0),
 ('ramp', 2),
 ('filmmakers', 1.9482758620689655),
 ('armageddon', 1.9473684210526316),
 ('unfortunately', 1.8956521739130434),
 ('christopher', 1.8571428571428572),
 ('cars', 1.8421052631578947),
 ('loud', 1.78125),
 ('halfway', 1.7777777777777777),
 ('monkey', 1.736842105263158),
 ('speed', 1.7037037037037037),
 ('partner', 1.631578947368421),
 ('postman', 1.5714285714285714),
 ('oh', 1.5277777777777777),
 ('plot', 1.5265225933202358),
 ('sixty', 1.5),
 ("anyone's", 1.5),
 ('car', 1.473214

Out of the 334 unique words in the review, it appears there are: 

* 135 words that scored above 1
* 10 words that are scored exactly 1
* 189 words that scored less than 1

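This means that more of the unique words are associated with positive reviews than with negative reviews, so this count alone does not explain why the review was classified as negative. Note that the count treats every unique word equally, whereas the classifier adds a log-probability for every token, so a few frequent or strongly negative words can outweigh many mildly positive ones.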
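The cell that produced the three counts above is not shown. A minimal sketch of how such a tally could be computed from `score_dict` follows, where `bucket_counts` is a hypothetical helper name.

```python
def bucket_counts(score_dict):
    """Return (above, equal, below): how many scores are >1, ==1 and <1."""
    above = sum(1 for score in score_dict.values() if score > 1)
    equal = sum(1 for score in score_dict.values() if score == 1)
    below = sum(1 for score in score_dict.values() if score < 1)
    return above, equal, below
```

The three numbers always add up to `len(score_dict)`, which gives a quick consistency check against the number of unique words.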

In [37]:
print(positive_reviews_misclassified[12])

director dominic sena ( who made the highly underrated kalifornia ) and producer jerry bruckheimer ( the rock , armageddon ) bring us a slick and entertaining remake of the 1974 film of the same name that absolutely no one has ever seen . nicolas cage plays memphis , a retired car thief who's " pulled back in " to the business by an evil car thief overlord ( christopher eccleston ) determined to kill memphis' kid brother ( giovanni ribisi ) . memphis is ordered to steal 50 cars in four days time or his brother will meet an unfortunate demise , all while having to elude the detectives hot on his trail and a rival car thief who feels the job should have been given to him and his gang . memphis sets out to put his old crew back together , but discovers that most of them have retired as well . gone in sixty seconds does things right from the opening credits . in that sequence , we get a rockin' little tune from moby , along with some simple back story told only with photographs and assorte

### <b> Personal Analysis:</b>

After reading this review, it's very obvious that this is a positive review. 

A possible cause of its misclassification is that the writer includes things that they did not like about the movie. They wrote that the story line could have been improved and that it's very difficult to make car chases interesting anymore. 

As the writer includes both positive and negative sentiment, it would be difficult for a sentiment analysis system to identify which is the true sentiment. 

The review also includes a description of the film. It seems that the film has some dark aspects such as theft and gangs. This could have swayed the sentiment classification.

## Part 2.2: Analysing Positive Review #10

In [58]:
review10_dict = create_frequency_dictionary([positive_reviews_misclassified[10]])
sorted(review10_dict.items(), key=lambda x: x[1], reverse=True)[10:30]

[('it', 9),
 ('for', 8),
 ('but', 7),
 ('movies', 6),
 ("it's", 6),
 ('on', 6),
 ('in', 6),
 ('earth', 6),
 ('(', 5),
 (')', 5),
 ('mst3k', 5),
 ('film', 5),
 ('island', 5),
 ('watch', 4),
 ('most', 4),
 ('fun', 4),
 ('man', 4),
 ('by', 4),
 ('will', 4),
 ('some', 4)]

The words above are the most common words (excluding the first ten, as they were all punctuation/determiners) in the review. 

It doesn't seem to show any words with negative associations, so it's difficult to tell why this review was classified as negative.

In [48]:
score_dict = {}

for word in review10_dict.keys():
    score_dict[word] = find_ratio_positive_review(word, positive_frequency_dictionary, negative_frequency_dictionary)
    
sorted(score_dict.items(), key=lambda x: x[1], reverse=True)

[('worst', 5.204545454545454),
 ('torturous', 5.0),
 ('unfunny', 5.0),
 ('poking', 4.0),
 ('bad', 2.833846153846154),
 ('cheezy', 2.5),
 ('70', 2.5),
 ('worse', 2.4923076923076923),
 ('sake', 2.4166666666666665),
 ('catch', 2.2666666666666666),
 ('spacecraft', 2.25),
 ('horrible', 2.1481481481481484),
 ('annoying', 2.076923076923077),
 ('disappointment', 2.0526315789473686),
 ('horrendous', 2),
 ('3000', 2),
 ('indulging', 2),
 ('retched', 2),
 ('1954', 2.0),
 ('irrelevant', 2.0),
 ('tv-series', 2),
 ('functioning', 2.0),
 ('anyway', 2.0),
 ('asinine', 2),
 ('mad', 1.92),
 ('attempt', 1.8987341772151898),
 ('ripped', 1.875),
 ('brain', 1.8333333333333333),
 ('why', 1.7894736842105263),
 ('plots', 1.7857142857142858),
 ('intends', 1.6666666666666667),
 ('post', 1.6666666666666667),
 ('plain', 1.6538461538461537),
 ('buddies', 1.588235294117647),
 ('save', 1.5806451612903225),
 ('explain', 1.5294117647058822),
 ('straight', 1.5084745762711864),
 ('maniacal', 1.5),
 ('exploit', 1.5),
 ('m

Out of the 337 unique words in the review, it appears there are:

* 135 words that scored above 1
* 5 words that are scored exactly 1
* 197 words that scored less than 1

This means that there are more words that are associated with positive reviews than negative reviews. This does not explain why this review is classified as a negative review.

The above output shows the proportion of words in the review that are more negatively/positively associated.
The higher the score, the more negative the association. The lower the score, the more positive the association.

* The most <b>negatively associated words</b> that appear in this review are 'worst', 'torturous', 'unfunny', 'poking', 'bad' and 'cheezy'.

* The most <b>positively associated words</b> that appear in this review (excluding 0-scores) are 'clayton', 'hilarious', 'cruelty', 'mst3k', 'wisely', 'overall' and 'observations'.

The comparison between these words shows that the negative words in the review are more strongly associated with negative emotions than the positive words are with positive emotions, if they carry any positive emotion at all. 
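The two word lists above can be read off the sorted output by hand. A sketch of how they might be extracted programmatically is given below. `most_extreme` is a hypothetical helper, and it drops 0-scores in the same way the positive list above does.

```python
def most_extreme(score_dict, k=6):
    """Return (most_negative, most_positive) word lists, ignoring scores of 0."""
    nonzero = {word: score for word, score in score_dict.items() if score > 0}
    ranked = sorted(nonzero.items(), key=lambda item: item[1])  # ascending
    most_positive = [word for word, _ in ranked[:k]]
    most_negative = [word for word, _ in reversed(ranked[-k:])]
    return most_negative, most_positive
```

Low scores come first in `most_positive` and high scores first in `most_negative`, matching the convention that a higher ratio means a more negative association.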

In [57]:
print(positive_reviews_misclassified[10])



### <b> Personal Analysis:</b>

It's easy to understand why the model has misclassified this review after reading it. 

The review is about a movie/series where they show people badly-rated movies as part of an experiment. This review goes on to detail one specific episode, where there was a particularly bad movie. Some phrases in this review include:

* "the film-within-the-film naturally has to be as horrible as possible"
* "it's a bad enough flick to bring about some hysterical cruelty"
* "it is a movie to be seen on home video , late at night when your brain is not functioning to full capacity anyway"

There were also some phrases that were used with respect to the show itself, mst3k, that would lead a trained model to believe it was a negative review, such as:

* "but retched examples of film-making chosen specifically by dr . forrester in an attempt to break mike's will to live"
* "the torture becomes somewhat of a honorary party for all that is wrong in the world of cheezy cinema"
* "the plots behind the movies ripped apart are really quite irrelevant"

While there is some mention of the series being called 'hilarious' and 'fun', it is expected that the above sentences would outweigh these comments and would cause the model to incorrectly classify this review as negative.

## Part 2.3: Analysing Positive Review #2

In [61]:
review2_dict = create_frequency_dictionary([positive_reviews_misclassified[2]])
sorted(review2_dict.items(), key=lambda x: x[1], reverse=True)[10:30]

[('"', 8),
 ('jackal', 7),
 ('that', 7),
 ('(', 7),
 (')', 7),
 ('as', 6),
 ('it', 6),
 ('are', 6),
 ('but', 6),
 ('willis', 5),
 ("it's", 5),
 ('his', 5),
 ('gere', 4),
 ('with', 4),
 ('no', 4),
 ('has', 4),
 ('poitier', 3),
 ('other', 3),
 ('by', 3),
 ('their', 3)]

The most common words in this review are not a great indicator of the sentiment present in the review.

In [62]:
score_dict = {}

for word in review2_dict.keys():
    score_dict[word] = find_ratio_positive_review(word, positive_frequency_dictionary, negative_frequency_dictionary)
    
sorted(score_dict.items(), key=lambda x: x[1], reverse=True)

[('preston', 11.0),
 ('$70', 8.0),
 ('presumably', 5.0),
 ('hitman', 4.25),
 ('declan', 4),
 ('watchable', 3.3333333333333335),
 ('raid', 3.0),
 ('ira', 3.0),
 ('venora', 3.0),
 ('irishman', 3),
 ('jackal', 2.857142857142857),
 ('bad', 2.833846153846154),
 ('blame', 2.4347826086956523),
 ('villains', 2.2777777777777777),
 ('compliment', 2.25),
 ('poor', 2.2131147540983607),
 ('borders', 2.1666666666666665),
 ('diane', 2.142857142857143),
 ('superior', 2.074074074074074),
 ('mulqueen', 2),
 ('valentina', 2),
 ('koslova', 2),
 ('donning', 2.0),
 ('mobbed', 2),
 ('lest', 2.0),
 ('misunderstand', 2),
 ('redman', 1.75),
 ('somewhere', 1.7317073170731707),
 ('attempts', 1.7162162162162162),
 ('psychic', 1.625),
 ('holes', 1.6),
 ('pick', 1.5925925925925926),
 ('villain', 1.5789473684210527),
 ('gere', 1.5714285714285714),
 ("let's", 1.5681818181818181),
 ("they're", 1.509933774834437),
 ('poitier', 1.5),
 ('copyright', 1.5),
 ('operative', 1.5),
 ('tracks', 1.4666666666666666),
 ('club', 1.4

Out of the 320 unique words in the review, it appears there are:

* 117 words that scored above 1
* 13 words that are scored exactly 1
* 190 words that scored less than 1

This means that there are more words that are associated with positive reviews than negative reviews. This does not explain why this review is classified as a negative review.

Both the most positive and negative words seem to have a weak association to their corresponding sentiment.

* The most <b>negatively associated words</b> that appear in this review are 'preston', '$70', 'presumably', 'hitman', 'declan' and 'watchable'.

* The most <b>positively associated words</b> that appear in this review (excluding 0-scores) are 'dread', 'slip', 'moscow', 'mathilda', 'organized', 'pivotal' and 'flaws'.

This may cause some confusion in the classifier and could be the cause of misclassification.

In [63]:
print(positive_reviews_misclassified[2])

gere , willis , poitier chase each other around the world the jackal a film review by michael redman copyright 1997 by michael redman when the soviet union imploded , the western countries lost their shadow . with the united states friendly with the russians , we no longer had an entity to blame for the world's problems . this showed up in hollywood films as the communist government was no longer the easy bad guy . it's time to rejoice because we've found our new villain . now it's no longer the russian government who sends killers out into foreign lands , it's the russian _mafia_ . a perfect solution , it combines the dread of organized crime and the still-present uneasiness with the former eastern block countries . best of all , the villains are still foreigners : fear of the other always plays best . so it is a crime lord in moscow that sends legendary hitman the jackal ( bruce willis ) to assassinate a highly placed us government figure in retaliation for the death of his brother d

### <b> Personal Analysis:</b>

The majority of this review is spent giving a description of the plot of the movie. The movie has a dark plot and so contains some negative words such as 'dread', 'fear' and 'bad'.

The end of the review is where the writer introduces their opinion, which is both negative and positive. The writer goes on to list the flaws of the movie, such as:
* there is no doubt that the original is the better movie
* " the jackal " has enough holes in it to ruin the tale
* holes ? let's see ? a pivotal clue for mulqueen is so obscure that he must possess psychic powers to pick it up
* the jackal is an incredibly poor shot
* [...] ( and lest you misunderstand , that's not a compliment . )

After listing the flaws of the movie, the writer concludes the review saying that despite these flaws, it's a good movie. 

Sentences/tokens with negative sentiment appear far more frequently than those with positive sentiment, which explains the misclassification.
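To check a claim like this numerically, each word's contribution to the gap between the negative and positive scores can be computed directly. The sketch below assumes the same "+1" smoothing as `result_for_class`, so that the contributions of all non-skipped words sum to `negative_result - positive_result` (the equal priors cancel). The function name `top_contributions` and the `skip` parameter are assumptions.

```python
import math
from collections import Counter

def top_contributions(review, positive_dict, negative_dict, n=10, skip=frozenset()):
    """Rank words by how far they push the review towards the negative class."""
    positive_total = sum(positive_dict.values()) + 1
    negative_total = sum(negative_dict.values()) + 1
    contributions = {}
    for word, count in Counter(review.split()).items():
        if word in skip:
            continue
        negative_log_p = math.log((negative_dict.get(word, 0) + 1) / negative_total)
        positive_log_p = math.log((positive_dict.get(word, 0) + 1) / positive_total)
        # Each occurrence adds the same amount, so multiply by the count
        contributions[word] = count * (negative_log_p - positive_log_p)
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

Passing the punctuation list from `result_for_class` as `skip` keeps the ranking consistent with what the classifier actually scores.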

## Part 2.4: Analysing Positive Review #18

In [67]:
review18_dict = create_frequency_dictionary([positive_reviews_misclassified[18]])
sorted(review18_dict.items(), key=lambda x: x[1], reverse=True)[10:30]

[('of', 15),
 ('she', 12),
 ('syd', 11),
 ('lucy', 11),
 ('are', 9),
 ('with', 9),
 ('who', 9),
 ('that', 8),
 ('has', 8),
 ('film', 7),
 ('by', 7),
 ('they', 7),
 ('as', 7),
 ('but', 6),
 ('be', 6),
 ('an', 5),
 ('which', 5),
 ('(', 5),
 (')', 5),
 ('high', 4)]

Once again, the most common words in the review do not offer much insight into the sentiment behind the review.

In [68]:
score_dict = {}

for word in review18_dict.keys():
    score_dict[word] = find_ratio_positive_review(word, positive_frequency_dictionary, negative_frequency_dictionary)
    
sorted(score_dict.items(), key=lambda x: x[1], reverse=True)

[('wh', 8),
 ('screenwriting', 6.5),
 ('syd', 6),
 ('gabriel', 5.0),
 ('worn-out', 4),
 ('clarkson', 4),
 ('pit', 4.0),
 ('disappears', 3.4285714285714284),
 ('hopeless', 3.3333333333333335),
 ('lucy', 3.25),
 ('radha', 3),
 ('editors', 3.0),
 ('bad', 2.833846153846154),
 ('throwaway', 2.75),
 ('lisa', 2.3333333333333335),
 ('nowhere', 2.142857142857143),
 ('pathetic', 2.107142857142857),
 ('superior', 2.074074074074074),
 ("cholodenko's", 2),
 ('judging', 2.0),
 ('negatively', 2),
 ('inform', 2.0),
 ('greta', 2.0),
 ('conciousness', 2),
 ("lucy's", 2),
 ("ricci's", 2.0),
 ('gratuitous', 2.0),
 ('insecurities', 2.0),
 ('christina', 1.9),
 ('stereotypical', 1.7692307692307692),
 ('demons', 1.75),
 ('bath', 1.6666666666666667),
 ('lesbianism', 1.6666666666666667),
 ('nudity', 1.6666666666666667),
 ('stuck', 1.6551724137931034),
 ('ricci', 1.6363636363636365),
 ('headed', 1.6),
 ('ally', 1.5714285714285714),
 ('seriously', 1.55),
 ('ten', 1.5490196078431373),
 ('ideas', 1.5192307692307692

Out of the 384 unique words in the review, it appears there are:

* 117 words that scored above 1
* 24 words that are scored exactly 1
* 243 words that scored less than 1

This means that there are more words that are associated with positive reviews than negative reviews. This does not explain why this review is classified as a negative review.

In [84]:
print(positive_reviews_misclassified[18])

lisa cholodenko's " high art , " is an intelligent , quiet drama . its strongest quality , aside from the top-notch central performances , is the perceptive way in which the film , also written by cholodenko , observes its characters . they are all flawed people , some more troubled than others , but they are not judged . judging the characters in this picture would be a creative misstep on the filmmakers' parts , because no one , no matter how bad off they are , deserve to be negatively judged if they are involved in some serious problems that they cannot break free of . syd ( radha mitchell ) , a 24-year-old woman living with her longtime boyfriend james ( gabriel mann ) , has recently been awarded an ideal job at the high-profile photography magazine , " frame . " she very much enjoys where her career is headed , but is often not taken very seriously by her managers , who are always giving her petty jobs to do , when she knows she could be doing more important things . one night , w

### <b> Personal Analysis:</b> 

It is difficult to justify why this review has been misclassified. A possible reason for this could be that the description of the film itself is about very unhappy people, and while describing this film the writer is using very negative words/phrases such as:
* flawed
* troubled
* negatively judged
* serious problems
* worn-out
* unhappy
* pathetic 
* stereotypical

The positive reaction to the film is portrayed in the vocabulary of the review, with words/phrases like:
* intelligent
* refreshing
* flawless portrayal
* full of riches
* beautifully and originally done
* strongest, and best, role to date

After reading this review, it's clear that a classification model would see a roughly equal mix of positive and negative sentiment in it. It's likely that the probabilities the model assigned to the positive and negative classes were very close.
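How close the two classes were can be checked from the two log scores. Under the naive Bayes model, the posterior probability of the positive class is the logistic function of the difference between the scores. The sketch below is an illustration with a hypothetical helper name. Naive Bayes posteriors on long documents tend to be extreme, so the raw margin is often the more informative number.

```python
import math

def positive_probability(positive_result, negative_result):
    """Posterior P(positive | review) from the two class log scores."""
    margin = positive_result - negative_result
    # Numerically stable logistic function: never exponentiates a large positive value
    if margin >= 0:
        return 1 / (1 + math.exp(-margin))
    exp_margin = math.exp(margin)
    return exp_margin / (1 + exp_margin)
```

For a review in this notebook, the two arguments would be `result_for_class(review, positive_frequency_dictionary)` and `result_for_class(review, negative_frequency_dictionary)`.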

## Part 2.5: Analysing Positive Review #3

In [69]:
review3_dict = create_frequency_dictionary([positive_reviews_misclassified[3]])
score_dict = {}

for word in review3_dict.keys():
    score_dict[word] = find_ratio_positive_review(word, positive_frequency_dictionary, negative_frequency_dictionary)
    
sorted(score_dict.items(), key=lambda x: x[1], reverse=True)

[('azaria', 9.5),
 ('wasted', 6.642857142857143),
 ('unlikeable', 6.0),
 ('ridiculous', 5.1),
 ('stupid', 5.081081081081081),
 ('raja', 5),
 ('mess', 4.461538461538462),
 ('pointless', 4.3125),
 ('reubens', 4.0),
 ('cleverness', 4.0),
 ("they'd", 3.6666666666666665),
 ('batman', 3.5128205128205128),
 ('greg', 3.4444444444444446),
 ('superman', 3.2),
 ('all-star', 3.0),
 ('casanova', 3.0),
 ('kel', 3),
 ("it'", 3.0),
 ('bundle', 3.0),
 ('thats', 3.0),
 ('bad', 2.833846153846154),
 ('invisible', 2.7),
 ('janeane', 2.4),
 ('carrot', 2.3333333333333335),
 ('none', 2.3278688524590163),
 ('disgusting', 2.1666666666666665),
 ('superior', 2.074074074074074),
 ('flaming', 2.0),
 ('moronic', 2.0),
 ('robin', 1.9180327868852458),
 ('super', 1.8333333333333333),
 ('maybe', 1.8148148148148149),
 ('useless', 1.7777777777777777),
 ('overblown', 1.7142857142857142),
 ('stood', 1.7142857142857142),
 ('forgets', 1.7142857142857142),
 ('happily', 1.6363636363636365),
 ('villain', 1.5789473684210527),
 ('

Out of the 328 unique words in the review, it appears there are:

* 125 words that scored above 1
* 19 words that are scored exactly 1
* 184 words that scored less than 1

This means that there are more words that are associated with positive reviews than negative reviews. This does not explain why this review is classified as a negative review.

In [86]:
print(positive_reviews_misclassified[3])

usually when a blockbuster comes out , it's loaded with effects , stars , bad scripts , and plenty of action . mystery men may contain an all-star cast , and efects , but the clever script and characters are what really works , which is rare to see this year . the film is based upon the comic book series " the flaming carrot " by bob burden , in which 3 wanna be super heroes try and fight crime , only to be out done by the real hero of champion city , captain amazing ( greg kinnear ) . things go a little haywire , when the sinister casanova frankenstein ( geoffrey rush ) is released into the city , where he captures captain amazing , and plans to wreak havoc upon champion city . well , the trio decide to take matters in their own hands , by saving the city , but first they need some assitance . this is where the film takes a turn for the better . in the beggining , there were only 3 wanna be heroes .  " blue raja ( hank azaria ) , " mr . furious ( ben stiller ) and " the shoveller ( wi

### <b> Personal Analysis:</b> 

Overall, this review can be read as a very positive review. The writer seems to enjoy most aspects of the film, although, particularly at the end of the review, they describe some of its faults. These are words/phrases that would be considered very negative, such as:
* ridiculous
* disgusting or revolting
* utterly useless
* so limited
* wasted
* unlikeable
* pointless
* miserable

While the movie review remains mostly positive, these words/phrases are possibly the cause of the misclassification.

# Part 3: Analysing Misclassification of Negative Reviews

### Deciding which 5 reviews to analyse

In [62]:
negative_reviews_sample_ids = random.sample(list(np.arange(0,len(negative_reviews_misclassified))), 5)
negative_reviews_sample_ids

[3, 6, 8, 4, 1]

In [63]:
random.sample([3, 6, 8, 4, 1], 2)

[3, 1]

## Part 3.1: Analysing Negative Review #1

In [64]:
review1_dict = create_frequency_dictionary([negative_reviews_misclassified[1]])
sorted(review1_dict.items(), key=lambda x: x[1], reverse=True)[10:30]

[('for', 4),
 ('that', 4),
 ('they', 4),
 ('his', 4),
 ('walken', 3),
 ('has', 3),
 ('(', 3),
 (')', 3),
 ('film', 3),
 ("o'fallon", 3),
 ('in', 3),
 ('an', 3),
 ('on', 3),
 ('crosses', 3),
 ('over', 3),
 ('as', 2),
 ('mobster', 2),
 ('by', 2),
 ('four', 2),
 ('rich', 2)]

It appears that, similar to the positive reviews, viewing the most common words in the negative reviews does not offer much insight into the sentiment.

In [67]:
score_dict = {}

for word in review1_dict.keys():
    score_dict[word] = find_ratio_positive_review(word, positive_frequency_dictionary, negative_frequency_dictionary)
    
sorted(score_dict.items(), key=lambda x: x[1], reverse=True)

[('mess', 4.461538461538462),
 ('flannery', 4),
 ('tangents', 4),
 ('terrible', 3.6296296296296298),
 ('finger', 3.5),
 ('bratty', 3),
 ('serpentine', 3),
 ('tuned', 3.0),
 ("woman's", 2.5714285714285716),
 ('double', 2.3157894736842106),
 ('--and', 2),
 ('triple', 2.0),
 ('peer', 2.0),
 ('dennis', 1.793103448275862),
 ('leary', 1.7142857142857142),
 ('material', 1.6835443037974684),
 ("what's", 1.6296296296296295),
 ('unrelated', 1.6),
 ('save', 1.5806451612903225),
 ('walken', 1.5555555555555556),
 ('plot', 1.5265225933202358),
 ('amusingly', 1.5),
 ('endless', 1.4736842105263157),
 ('earth', 1.4639175257731958),
 ('wondering', 1.4),
 ('?', 1.365598885793872),
 ("i'm", 1.3559322033898304),
 ("isn't", 1.350609756097561),
 ('guys', 1.297029702970297),
 ('movie', 1.2906755470980018),
 ('idea', 1.2876712328767124),
 ('favourite', 1.2857142857142858),
 ('someone', 1.2733333333333334),
 ('t', 1.2727272727272727),
 ('wrong', 1.2702702702702702),
 ('if', 1.2671480144404332),
 ('any', 1.26527

Out of the 243 unique words in the review, it appears there are:

* 82 words that scored above 1
* 18 words that are scored exactly 1
* 143 words that scored less than 1

This means that there are more words that are associated with positive reviews than negative reviews. This could explain the misclassification of this review.

In [68]:
print(negative_reviews_misclassified[1])

walken stars as a mobster who is kidnapped and held for ransom by four bratty rich kids . it seems that a woman has also been kidnapped--she is the sister of one of them ( e . t . 's henry thomas ) and the girlfriend of another ( flannery ) --and the asking price is $2 million , which said snots are unable to cough up alone . they even cut off walken's finger to show they mean business , because they are desperate to save the woman's life . suicide kings is a terrible film . walken aside , there isn't a single appealing cast member . o'fallon creates characters that are functional types without any resonance . in an amusingly unironic scene , walken plays poker with the foursome and describes each of their personalities to a tee--it's as if he was reading the summary sheet for a casting director . the plot is another issue entirely . o'fallon is someone whom i'm betting has seen reservoir dogs and the usual suspects too many times , for not only does his story veer off on bizarre tange

### <b> Personal Analysis:</b> 

Judging by the vocabulary used in this review, it is surprising that this review has been classified as positive. The following phrases/words are generally negative in nature:

* bratty
* snots
* is a terrible film
* there isn't a single appealing cast member
* amusingly unironic
* the plot is another issue
* veer off on bizarre tangents
* the central plot itself is a serpentine mess
* i had completely tuned out

There aren't many positively associated words in this review, so it's difficult to justify its misclassification.

## Part 3.2: Analysing Negative Review #3

In [70]:
review3_dict = create_frequency_dictionary([negative_reviews_misclassified[3]])
# sorted(review3_dict.items(), key=lambda x: x[1], reverse=True)[10:30]

In [71]:
review3_dict = create_frequency_dictionary([negative_reviews_misclassified[3]])
score_dict = {}

for word in review3_dict.keys():
    score_dict[word] = find_ratio_positive_review(word, positive_frequency_dictionary, negative_frequency_dictionary)
    
sorted(score_dict.items(), key=lambda x: x[1], reverse=True)

[('skip', 4.285714285714286),
 ('bad', 2.833846153846154),
 ('jungle', 2.6),
 ('lisa', 2.3333333333333335),
 ('stereotypes', 2.3333333333333335),
 ('murky', 2.25),
 ('explained', 2.2222222222222223),
 ('unless', 2.2),
 ('tired', 2.0),
 ('rack', 2.0),
 ('overpaid', 2.0),
 ('subplots', 1.8888888888888888),
 ('neither', 1.8125),
 ('halfway', 1.7777777777777777),
 ('fair', 1.6),
 ('nor', 1.5384615384615385),
 ('plot', 1.5265225933202358),
 ('expected', 1.489795918367347),
 ('kills', 1.4642857142857142),
 ('die-hard', 1.4),
 ('johnny', 1.3666666666666667),
 ('blazing', 1.3333333333333333),
 ('escapes', 1.3125),
 ('action', 1.3113207547169812),
 ('falls', 1.3043478260869565),
 ('movie', 1.2906755470980018),
 ('fan', 1.2833333333333334),
 ('enemies', 1.2727272727272727),
 ('if', 1.2671480144404332),
 ('services', 1.25),
 ('paul', 1.244186046511628),
 ('anyone', 1.2335329341317365),
 ('africa', 1.2142857142857142),
 ('bright', 1.2105263157894737),
 ('have', 1.2088353413654618),
 ('self', 1.208

Out of the 223 unique words in the review, it appears there are:

* 65 words that scored above 1
* 30 words that are scored exactly 1
* 128 words that scored less than 1

This means that there are more words that are associated with positive reviews than negative reviews. This could explain the misclassification of this review.

In [72]:
print(negative_reviews_misclassified[3])

it's a sad state of affairs when the back box blurb is more exciting than the movie contained within it . such is the case for the 1990 paul mayersberg film _the last samurai_ . though the blurb alludes to " a jungle filled with political intrigue , uneasy alliances , and murderous enemies at every turn , " the story of the movie is actually quite simple ( and prosaic ) : a middle-aged japanese businessman named endo ( played by john fujioka ) and his assistant , both of whom have samurai aspirations , travel to africa in search of his ancestor , who went to bring buddhism to africa . he hires the services of down-at-the-heels vietnam veteran pilot johnny congo ( the redoubtable lance henriksen ) and his girlfriend ( arabella holzbog ) , and travels to the camp of an arms-merchant-cum-safari-host- cum-islamic-missionary ( john saxon ) and his wife ( lisa eilbacher ) . they are all kidnapped by an african revolutionary guerilla with witch-doctor aspirations to conceal a pre-arranged arm

<b><u>Personal Analysis:</u></b>

Similar to negative review #1, this review also contains a lot of negative words. However, it is slightly easier to understand why this review was misclassified as there are words/phrases that the model would identify as positive:

* exciting
* intrigue
* good
* enjoyable
* fan

These tokens may have impacted the performance of the model and caused the model to classify this as a positive review.