## Part 1

I began by loading the training split of the IMDB dataset and converting it into a list of review texts and a parallel list of sentiment labels (0 = negative, 1 = positive).

In [120]:
from datasets import load_dataset

imdb_dataset = load_dataset("imdb")['train']
train_data = []
train_data_labels = []
for item in imdb_dataset:
  train_data.append(item['text'])
  train_data_labels.append(item['label'])
print(train_data[-1])
print(train_data_labels[-1])

The story centers around Barry McKenzie who must go to England if he wishes to claim his inheritance. Being about the grossest Aussie shearer ever to set foot outside this great Nation of ours there is something of a culture clash and much fun and games ensue. The songs of Barry McKenzie(Barry Crocker) are highlights.
1


I used TfidfVectorizer to extract the feature list using the parameters that I configured during Assignment 1.

In [121]:
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(analyzer='word',max_features=1000,lowercase=True,stop_words='english', ngram_range=(1, 2),)
features = vectorizer.fit_transform(train_data).toarray()
print(features.shape)
print(vectorizer.get_feature_names_out())

(25000, 1000)
['10' '15' '20' '30' '50' '80' '90' 'able' 'absolutely' 'accent' 'act'
 'acted' 'acting' 'action' 'actor' 'actors' 'actress' 'actual' 'actually'
 'add' 'admit' 'adult' 'adventure' 'age' 'ago' 'agree' 'air' 'amazing'
 'america' 'american' 'amusing' 'animated' 'animation' 'annoying' 'anti'
 'apart' 'apparently' 'appear' 'appears' 'appreciate' 'aren' 'art' 'aside'
 'ask' 'atmosphere' 'attempt' 'attempts' 'attention' 'audience'
 'audiences' 'average' 'avoid' 'away' 'awesome' 'awful' 'baby'
 'background' 'bad' 'bad movie' 'badly' 'band' 'barely' 'based' 'basic'
 'basically' 'battle' 'beautiful' 'beauty' 'begin' 'beginning' 'begins'
 'believable' 'believe' 'ben' 'best' 'better' 'big' 'biggest' 'bit'
 'bizarre' 'black' 'blood' 'body' 'book' 'books' 'bored' 'boring' 'box'
 'boy' 'boys' 'br' 'br 10' 'br br' 'br don' 'br film' 'br movie'
 'br story' 'brain' 'break' 'brilliant' 'bring' 'brings' 'british'
 'brother' 'brothers' 'brought' 'budget' 'bunch' 'business' 'buy' 'called'
 'ca
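
For reference, the weights stored in `features` can be approximated by hand. The sketch below is a simplified illustration rather than the vectorizer's actual code: it assumes whitespace tokenization, no stop-word removal and no bigrams, and uses a made-up three-document corpus (`toy_docs`). It applies the smoothed inverse document frequency and L2 row normalisation that scikit-learn uses by default.

```python
import math
from collections import Counter

def tfidf_rows(docs):
    """Hand-rolled TF-IDF with smoothed idf and L2-normalised rows."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed idf: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        raw = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0
        rows.append([v / norm for v in raw])
    return vocab, rows

# Toy corpus, invented purely for illustration
toy_docs = ["great movie great cast", "bad movie", "great fun"]
vocab, rows = tfidf_rows(toy_docs)
for term, weight in zip(vocab, rows[1]):
    print(f"{term}: {weight:.3f}")
```

In the second toy document, "bad" receives a larger weight than "movie" because it appears in fewer documents, which is the behaviour that makes rarer, more distinctive words stand out.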

After reviewing the feature list, I noticed a number of terms beginning with "br", such as "br", "br br" and "br film". After researching online, I found that this was likely due to HTML line-break tags in the reviews being tokenized as ordinary words. I cleaned the data by stripping these tags with the Python package BeautifulSoup, so that they no longer take up slots in the feature list.

In [122]:
from bs4 import BeautifulSoup
from sklearn.feature_extraction.text import TfidfVectorizer

def clean_html(text):
    return BeautifulSoup(text, "html.parser").get_text()

# Clean the training data by removing HTML tags
train_data = [clean_html(review) for review in train_data]

vectorizer = TfidfVectorizer(analyzer='word',max_features=1000,lowercase=True,stop_words='english', ngram_range=(1, 2),)
features = vectorizer.fit_transform(train_data).toarray()
print(features.shape)
print(vectorizer.get_feature_names_out())




(25000, 1000)
['10' '15' '20' '30' '50' '80' '90' 'able' 'absolutely' 'accent' 'act'
 'acted' 'acting' 'action' 'actor' 'actors' 'actress' 'actual' 'actually'
 'add' 'admit' 'adult' 'adventure' 'age' 'ago' 'agree' 'air' 'amazing'
 'america' 'american' 'amusing' 'animated' 'animation' 'annoying' 'anti'
 'apart' 'apparently' 'appear' 'appears' 'appreciate' 'aren' 'art' 'aside'
 'ask' 'atmosphere' 'attempt' 'attempts' 'attention' 'audience'
 'audiences' 'average' 'avoid' 'away' 'awesome' 'awful' 'baby'
 'background' 'bad' 'bad movie' 'badly' 'band' 'barely' 'based' 'basic'
 'basically' 'battle' 'beautiful' 'beauty' 'begin' 'beginning' 'begins'
 'believable' 'believe' 'ben' 'best' 'better' 'big' 'biggest' 'bit'
 'bizarre' 'black' 'blood' 'body' 'book' 'books' 'bored' 'boring' 'box'
 'boy' 'boys' 'brain' 'break' 'brilliant' 'bring' 'brings' 'british'
 'brother' 'brothers' 'brought' 'budget' 'bunch' 'business' 'buy' 'called'
 'came' 'camera' 'car' 'care' 'career' 'cartoon' 'case' 'cast' 'cas

The new feature list no longer contains the "br" terms left over from the HTML tags.
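
To illustrate why the tags showed up as features, here is a minimal standard-library sketch. It is not the BeautifulSoup code used above: `TagStripper` and `strip_tags` are hypothetical helpers, and the regular expression is the same pattern as the vectorizer's default `token_pattern`.

```python
import re
from html.parser import HTMLParser

class TagStripper(HTMLParser):
    """Collects only the text nodes of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_tags(text):
    parser = TagStripper()
    parser.feed(text)
    parser.close()
    # Join with a space so words on either side of a tag are not glued together
    return " ".join(parser.parts)

# Words of two or more characters, as in the vectorizer's default token pattern
TOKEN = re.compile(r"\b\w\w+\b")

def tokens(text):
    return TOKEN.findall(text.lower())

raw = "Great film.<br /><br />Bad ending."  # made-up review fragment
print(tokens(raw))              # the tag name "br" is picked up as a word
print(tokens(strip_tags(raw)))  # only the real words remain
```

The sketch joins text fragments with a space. It may be worth checking whether `get_text()` without a separator ever fuses the words on either side of a tag; BeautifulSoup's `get_text(separator=" ")` is one way to avoid that.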

Next, I split the data 90/10 into training and validation sets.

I then created and trained my Multinomial Naive Bayes model.




In [123]:
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(features,train_data_labels,train_size=0.9,random_state=123)
from sklearn.naive_bayes import MultinomialNB
model = MultinomialNB()
model = model.fit(X=X_train,y=y_train)
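
As a rough picture of what fitting does, the sketch below computes smoothed class-conditional feature probabilities by hand. It is an illustration under stated assumptions, not scikit-learn's implementation: `fit_feature_log_prob` is a hypothetical helper, the data is made up, and `alpha=1.0` matches the default additive smoothing of `MultinomialNB`.

```python
import math

def fit_feature_log_prob(X, y, alpha=1.0):
    """Return log P(feature | class) for each class, with additive smoothing."""
    classes = sorted(set(y))
    n_features = len(X[0])
    log_prob = []
    for c in classes:
        # Sum each feature's weight over the rows that belong to class c
        totals = [0.0] * n_features
        for row, label in zip(X, y):
            if label == c:
                for j, value in enumerate(row):
                    totals[j] += value
        denom = sum(totals) + alpha * n_features
        log_prob.append([math.log((t + alpha) / denom) for t in totals])
    return classes, log_prob

# Toy data (made up): columns stand for the features "bad", "great", "movie"
X_toy = [[2, 0, 1], [1, 0, 1], [0, 3, 1], [0, 1, 1]]
y_toy = [0, 0, 1, 1]
classes, log_prob = fit_feature_log_prob(X_toy, y_toy)
for c, row in zip(classes, log_prob):
    print(c, [round(math.exp(v), 3) for v in row])
```

The smoothing term keeps a feature that never appears in one class (here "great" in class 0) from receiving a probability of exactly zero.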

I determined the 50 most important features of this model using class-conditional probability differences.

The log probability of each feature given the negative or positive class is stored in the model's `feature_log_prob_` attribute. I converted these to ordinary probabilities using NumPy's `exp` function.

Next, I gave each feature an importance value: the absolute difference between its probability under the positive class and its probability under the negative class. The features with the largest gap are the ones that matter most to the model, because if a feature is far more likely to appear in one class than the other, then any review containing that feature is pushed strongly toward that class.

I then used NumPy's `argsort` to sort the features by importance and built an ordered list of the top 50.



In [124]:
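import numpy as np

# Class-conditional log-probabilities: row 0 = negative, row 1 = positive
feature_log_prob = model.feature_log_prob_
feature_names = vectorizer.get_feature_names_out()

# Convert log-probabilities back to probabilities
feature_probabilities = np.exp(feature_log_prob)
# Importance = absolute gap between the positive and negative class probabilities
feature_importance = np.abs(feature_probabilities[1] - feature_probabilities[0])

# Feature indices sorted from the largest gap to the smallest
top_features_indices = np.argsort(feature_importance)[::-1]
top_features = [(feature_names[i], feature_importance[i]) for i in top_features_indices[:50]]

print("Top 50 most important features:")
for i, (feature, importance) in enumerate(top_features):
    print(f"{i + 1}. {feature}: {importance:.4f}")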




Top 50 most important features:
1. bad: 0.0055
2. great: 0.0041
3. worst: 0.0034
4. movie: 0.0028
5. best: 0.0026
6. love: 0.0025
7. waste: 0.0023
8. just: 0.0023
9. awful: 0.0022
10. excellent: 0.0021
11. terrible: 0.0019
12. wonderful: 0.0019
13. boring: 0.0019
14. plot: 0.0019
15. stupid: 0.0018
16. minutes: 0.0018
17. life: 0.0018
18. worse: 0.0017
19. don: 0.0017
20. acting: 0.0017
21. poor: 0.0016
22. horrible: 0.0016
23. script: 0.0016
24. beautiful: 0.0015
25. money: 0.0015
26. story: 0.0014
27. loved: 0.0014
28. amazing: 0.0014
29. perfect: 0.0014
30. thing: 0.0013
31. crap: 0.0013
32. film: 0.0013
33. world: 0.0013
34. performance: 0.0013
35. family: 0.0013
36. make: 0.0013
37. favorite: 0.0013
38. young: 0.0012
39. like: 0.0012
40. supposed: 0.0012
41. better: 0.0012
42. series: 0.0012
43. brilliant: 0.0011
44. years: 0.0011
45. highly: 0.0011
46. fun: 0.0011
47. avoid: 0.0011
48. instead: 0.0011
49. ridiculous: 0.0011
50. poorly: 0.0011
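
The ranking above shows the size of each gap but not its direction. The sketch below repeats the same calculation on a tiny made-up table and also records which class each feature favours. All names and numbers here (`toy_names`, `toy_log_prob`, `rank_by_probability_gap`) are hypothetical and chosen only to mirror the layout of `feature_log_prob_`.

```python
import math

# Hypothetical log-probabilities laid out like feature_log_prob_:
# row 0 = negative class, row 1 = positive class
toy_names = ["bad", "great", "movie", "plot"]
toy_log_prob = [
    [math.log(0.40), math.log(0.10), math.log(0.35), math.log(0.15)],
    [math.log(0.10), math.log(0.45), math.log(0.33), math.log(0.12)],
]

def rank_by_probability_gap(names, log_prob, top_n):
    """Rank features by |P(f | positive) - P(f | negative)|, largest first."""
    neg = [math.exp(v) for v in log_prob[0]]
    pos = [math.exp(v) for v in log_prob[1]]
    gaps = [abs(p - n) for p, n in zip(pos, neg)]
    order = sorted(range(len(names)), key=lambda i: gaps[i], reverse=True)
    return [
        (names[i], gaps[i], "positive" if pos[i] > neg[i] else "negative")
        for i in order[:top_n]
    ]

for name, gap, direction in rank_by_probability_gap(toy_names, toy_log_prob, top_n=3):
    print(f"{name}: {gap:.2f} ({direction})")
```

Because this measure is an absolute difference, very frequent words can rank highly even when their relative preference for one class is small, which may explain why general terms such as "movie" and "just" appear in the list.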


## Part 2

I created my "easy" test set by modifying real, existing reviews from the IMDB website. Generally, I took the opening few sentences/paragraphs from each review. Below are the 6 "easy" reviews along with a brief explanation as to why they were chosen.

### Easy Reviews:

**Positive**  
1. **"Wow, this was fantastic! As I was watching it, I asked myself, 'Is this the best animated movie I've ever seen?' I think the answer is 'yes.'"**

   Enthusiastic and exclamatory language such as "Wow," "fantastic," and "yes" makes it an obviously positive review. These are also very common words likely to be found in many other positive reviews.

2. **"Yeah, I must admit, I love this movie. Which is nothing to be ashamed of; great movie, great directing, great set, great scale, great canvas, great story."**

   The repeated use of "great," which the model values highly as a positive feature, makes this a straightforward classification.

3. **"This is the finest movie I have ever seen of the drama kind; it has everything to make an excellent movie. All the actors play an outstanding role."**

   This is a genuine endorsement of the movie, shown through phrases like "finest movie" and "outstanding role." There is no attempt at sarcasm, so the model is very unlikely to be incorrect.

**Negative**  
1. **"I have no idea how anyone could like this dull, uninspiring movie. It was very, very predictable. The leading actress had no talent."**

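   Strong criticisms like "dull" and "uninspiring" make this clearly negative. Because the feature list includes bigrams, a phrase like "no talent" could in principle count as negative evidence too, although "no" is on scikit-learn's English stop-word list, so that exact bigram is filtered out with the settings used here.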

2. **"This movie is probably one of the worst movies I have ever seen. Don't waste your time watching this. I almost turned this movie off watching it."**

   There is no ambiguity in this review. Hyperbolic negative language like "worst" and "waste" makes it easy to classify.

3. **"If you want a quick and easy way to punish your kids, take them to see this film. This overlong and boring movie will put them to sleep."**

   Although there is slight sarcasm in this review, the negative connotations behind "punish," "boring," and "overlong" remove any uncertainty.

---

I wrote my own "adversarial" test set. My aim was to create reviews that the model would classify incorrectly due to sarcasm, ambiguity, or unfamiliarity with the language.

### Adversarial Reviews:

**Positive**  
1. **"I was dreading watching this movie as I had been told the acting was awful and the plot was confusing. However, having watched it myself, I completely disagree. Those reviews were absolute nonsense."**

   Initially, negative language is used to describe the writer's expectations (e.g. "dreading," "awful"). However, even when they express their own positive sentiment toward the film, it is through criticizing other negative reviews. The further use of "disagree" and "nonsense" compounds the false negative sentiment the model is likely to detect.

2. **"I wanted to turn this film off from the moment it began. It was a horrible and unsettling experience, exactly how a horror movie should be. Despite the awful fear I felt, I couldn’t look away."**

   As horror movies aim to evoke emotions like fear and anxiety, certain words such as "unsettling" and "awful" lose their usual negative sentiments. The model is unlikely to pick up on these becoming desirable traits in certain genres.

3. **"I usually hate the lead actor but his performance wasn’t a complete disaster this time around. I was shocked that he didn’t totally ruin the film and it was actually quite enjoyable."**

   "Hate" and "disaster" are typically associated with negative sentiments, but the writer's admission that the film exceeded their expectations and was "quite enjoyable" creates ambiguity and uncertainty for the model.

**Negative**  
1. **"For a horror film, it had me laughing out loud from start to finish—truly a unique experience for the genre."**

   This review describes a horror movie that was hilariously bad. Of course, it is not a horror film's intention to make the audience laugh. The model would not be familiar with the alternative use of some prominent features, considering "laughing" and "unique" are typically found in positive reviews.

2. **"What a delightful waste of time! I love sitting through a three-hour film that goes nowhere. I’m looking forward to the sequel already!"**

   I would not expect my model to detect the sarcasm in this review. Positive-sounding language such as "love" and "delightful" is used to express distaste for the film.

3. **"Almost everything about my day at the cinema was brilliant; the popcorn was lovely, the seats were fantastic, and the tickets were great value. The film itself though, the less said the better."**

   Here, positive language is used to describe elements not relating explicitly to the film itself. Later, there is an abrupt shift to negative sentiments about the movie, but the use of a euphemism that the model is not familiar with means it is unlikely to be picked up on.





In [135]:
test_reviews = [
    # Positive Reviews
    "Wow, this was fantastic! As I was watching it, I asked myself, 'Is this the best animated movie I've ever seen?' I think the answer is 'yes.'",
    "Yeah, I must admit, I love this movie. Which is nothing to be ashamed of; great movie, great directing, great set, great scale, great canvas, great story.",
    "This is the finest movie I have ever seen of the drama kind; it has everything to make an excellent movie. All the actors play an outstanding role.",

    # Negative Reviews
    "I have no idea how anyone could like this dull, uninspiring movie. It was very, very predictable. The leading actress had no talent.",
    "This movie is probably one of the worst movies I have ever seen. Don't waste your time watching this. I almost turned this movie off watching it.",
    "If you want a quick and easy way to punish your kids, take them to see this film. This overlong and boring movie will put them to sleep."
]

test_labels = [1, 1, 1, 0, 0, 0]


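from sklearn.metrics import accuracy_score  # needed for the summary line at the end of this cell

# Vectorize with the already-fitted vectorizer (transform only, no refitting)
X_test = vectorizer.transform(test_reviews).toarray()
test_predictions = model.predict(X_test)
total = 0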

for i, review in enumerate(test_reviews):
    sentiment = "Positive" if test_predictions[i] == 1 else "Negative"
    actual_sentiment = "Positive" if test_labels[i] == 1 else "Negative"

    if sentiment == actual_sentiment:
        result = "Correct"
        total += 1
    else:
        result = "Wrong"
    
    print(f"Review: {review}\nPredicted Sentiment: {sentiment}\nActual Sentiment: {actual_sentiment}\nResult: {result}\n")

    

print(f"Total Results: {total}/{len(test_reviews)}\nAccuracy Score: {accuracy_score(test_labels, test_predictions):.4f}")








Review: Wow, this was fantastic! As I was watching it, I asked myself, 'Is this the best animated movie I've ever seen?' I think the answer is 'yes.'
Predicted Sentiment: Positive
Actual Sentiment: Positive
Result: Correct

Review: Yeah, I must admit, I love this movie. Which is nothing to be ashamed of; great movie, great directing, great set, great scale, great canvas, great story.
Predicted Sentiment: Positive
Actual Sentiment: Positive
Result: Correct

Review: This is the finest movie I have ever seen of the drama kind; it has everything to make an excellent movie. All the actors play an outstanding role.
Predicted Sentiment: Positive
Actual Sentiment: Positive
Result: Correct

Review: I have no idea how anyone could like this dull, uninspiring movie. It was very, very predictable. The leading actress had no talent.
Predicted Sentiment: Negative
Actual Sentiment: Negative
Result: Correct

Review: This movie is probably one of the worst movies I have ever seen. Don't waste your time

As expected the model was very successful at classifying the easy reviews due to their straightforward, familiar language and absence of sarcasm or ambiguity.

In [134]:
adversarial_reviews = [

    # Negative Reviews
    "For a horror film, it had me laughing out loud from start to finish—truly a unique experience for the genre.",
    "What a delightful waste of time! I love sitting through a three hour film that goes nowhere. I’m looking forward to the sequel already!]",
    "Almost everything about my day at the cinema was brilliant; the popcorn was lovely, the seats were fantastic and the tickets were great value. The film itself though, the less said the better.",
    
    # Positive Reviews
    "I was dreading watching this movie as I had been told the acting was awful and the plot was confusing. However, having watched it myself, I completely disagree. Those reviews were absolute nonsense.",
    "I wanted to turn this film off from the moment it began. It was a horrible and unsettling experience, exactly how a horror movie should be. Despite the awful fear I felt, I couldn’t look away.",
    "I usually hate the lead actor but his performance wasn’t a complete disaster this time around. I was shocked that he didn’t totally ruin the film and it was actually quite enjoyable."
]

test_labels = [0, 0, 0, 1, 1, 1] 


X_test = vectorizer.transform(adversarial_reviews).toarray()
test_predictions = model.predict(X_test)
total = 0

for i, review in enumerate(adversarial_reviews):
    sentiment = "Positive" if test_predictions[i] == 1 else "Negative"
    actual_sentiment = "Positive" if test_labels[i] == 1 else "Negative"

    if sentiment == actual_sentiment:
        result = "Correct"
        total += 1
    else:
        result = "Wrong"
    
    print(f"Review: {review}\nPredicted Sentiment: {sentiment}\nActual Sentiment: {actual_sentiment}\nResult: {result}\n")

    

print(f"Total Results: {total}/{len(adversarial_reviews)}\nAccuracy Score: {accuracy_score(test_labels, test_predictions):.4f}")


Review: For a horror film, it had me laughing out loud from start to finish—truly a unique experience for the genre.
Predicted Sentiment: Positive
Actual Sentiment: Negative
Result: Wrong

Review: What a delightful waste of time! I love sitting through a three hour film that goes nowhere. I’m looking forward to the sequel already!]
Predicted Sentiment: Negative
Actual Sentiment: Negative
Result: Correct

Review: Almost everything about my day at the cinema was brilliant; the popcorn was lovely, the seats were fantastic and the tickets were great value. The film itself though, the less said the better.
Predicted Sentiment: Positive
Actual Sentiment: Negative
Result: Wrong

Review: I was dreading watching this movie as I had been told the acting was awful and the plot was confusing. However, having watched it myself, I completely disagree. Those reviews were absolute nonsense.
Predicted Sentiment: Negative
Actual Sentiment: Positive
Result: Wrong

Review: I wanted to turn this film off f

The model found the adversarial reviews much harder to classify. It struggled to assign the correct sentiment to language it had not seen before, or to familiar words used in a different way, such as sarcasm and irony.

Surprisingly, it did classify one of the adversarial reviews correctly. This was likely because, despite the sarcastic tone of the review, the model gave more weight to the negative sentiment of words like "waste" and "nowhere" than to the positive sentiment of "love" and "delightful".
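
One way to picture how a single strongly negative word can outweigh several mildly positive ones is to add up per-word log-odds. The sketch below is a simplification with hypothetical probabilities (the real values live in `model.feature_log_prob_`). It assumes equal class priors and a weight of 1 for every word, whereas the actual model scales each term by its TF-IDF value.

```python
import math

# Hypothetical P(word | class) values, chosen only for illustration
p_word = {
    "delightful": {"neg": 0.0005, "pos": 0.0010},
    "love": {"neg": 0.0020, "pos": 0.0040},
    "waste": {"neg": 0.0030, "pos": 0.0004},
}

def log_odds_contributions(words, table):
    """Per-word log(P(w | pos) / P(w | neg)); words missing from the table are ignored."""
    return {
        w: math.log(table[w]["pos"] / table[w]["neg"])
        for w in words
        if w in table
    }

# "nowhere" is deliberately absent from the table, like an out-of-vocabulary word
review_words = ["delightful", "waste", "love", "nowhere"]
contributions = log_odds_contributions(review_words, p_word)
total = sum(contributions.values())

for word, value in contributions.items():
    print(f"{word}: {value:+.2f}")
print("Predicted:", "positive" if total > 0 else "negative")
```

With these invented numbers, the two positive words each contribute a small positive amount while "waste" contributes a larger negative one, so the total falls on the negative side. Words outside the 1,000-feature vocabulary contribute nothing at all.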