## Sentiment Analysis

In this exercise we perform a sentiment analysis on the IMDb dataset. The code below assumes that the data files are placed in the same folder as this notebook. We load the reviews into a pandas DataFrame and print the beginning of the first few reviews.

In [71]:
import numpy as np
import pandas as pd

reviews = pd.read_csv('reviews.txt', header=None)
labels = pd.read_csv('labels.txt', header=None)
Y = (labels=='positive').astype(np.int_)

print(type(reviews))
print(reviews.head())
print(labels.head())

<class 'pandas.core.frame.DataFrame'>
                                                   0
0  bromwell high is a cartoon comedy . it ran at ...
1  story of a man who has unnatural feelings for ...
2  homelessness  or houselessness as george carli...
3  airport    starts as a brand new luxury    pla...
4  brilliant over  acting by lesley ann warren . ...
          0
0  positive
1  negative
2  positive
3  negative
4  positive


**(a)** Split the reviews and labels into training, validation and test sets. The training and validation sets will be used to train your model and tune hyperparameters; the test set will be saved for testing. Use the `CountVectorizer` from `sklearn.feature_extraction.text` to create a Bag-of-Words representation of the reviews. Only use the 10,000 most frequent words (use the `max_features` parameter of `CountVectorizer`).

In [72]:
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer(max_features=10000)
revs_array = reviews[0].values
bag = vectorizer.fit_transform(revs_array)

**(b)** Explore the representation of the reviews. How is a single word represented? How about a whole review?

In [73]:
bag
# Inspecting the output below, which is the data-type of the representation of the reviews,
# we see that it is a sparse matrix. Let's take a closer look below.

<25000x10000 sparse matrix of type '<class 'numpy.int64'>'
	with 3156666 stored elements in Compressed Sparse Row format>

In [74]:

# convert to a DataFrame for visualization
df = pd.DataFrame(bag.toarray(), columns=vectorizer.get_feature_names_out())

print(df.tail())

       aaron  abandon  abandoned  abc  abilities  ability  able  aboard  \
24995      0        0          0    0          0        0     0       0   
24996      0        0          0    0          0        0     1       0   
24997      0        0          0    0          0        0     0       0   
24998      0        0          0    0          0        0     0       0   
24999      0        0          0    0          0        0     0       0   

       abominable  abomination  ...  zhang  zizek  zodiac  zombi  zombie  \
24995           0            0  ...      0      0       0      0       0   
24996           0            0  ...      0      0       0      0       0   
24997           0            0  ...      0      0       0      0       2   
24998           0            0  ...      0      0       0      0       0   
24999           0            0  ...      0      0       0      0       0   

       zombies  zone  zoom  zorro  zu  
24995        0     0     0      0   0  
24996       

We see that in the bag-of-words representation each word of the vocabulary (restricted to the 10,000 most frequent ones) is a column and each review is a row. The integer under a word's column is the number of times that word occurs in the review. A single word is therefore represented by one column, and a whole review by a row of 10,000 counts. This is why the matrix is so sparse: there are 10,000 words in our "vocabulary" and each review probably uses only a very small percentage of them.

We see in the tail of the reviews above that review number 24997 contains the word "zombie" twice, for example.

In [75]:
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

X = bag
Y = to_categorical(Y, 2)
# splitting data into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
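
The cell above makes a train/test split, and the validation set is carved out later through `validation_split` in `model.fit`. As an alternative, the sketch below shows one way to make the three-way split explicit with two calls to `train_test_split`. The stand-in arrays and all `*_demo` names are hypothetical, and the 60/20/20 proportions are an assumption, not something the exercise prescribes.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data (hypothetical): 100 samples, 5 features, balanced binary labels
X_demo = np.arange(500).reshape(100, 5)
y_demo = np.array([0, 1] * 50)

# First hold out 20% of the data as the test set
X_rest, X_test_demo, y_rest, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Then take 25% of the remainder (20% of the total) as the validation set
X_train_demo, X_val_demo, y_train_demo, y_val_demo = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42, stratify=y_rest
)

print(len(X_train_demo), len(X_val_demo), len(X_test_demo))  # 60 20 20
```

With an explicit validation set, Keras' `fit` accepts `validation_data=(X_val, Y_val)` instead of `validation_split`. Note that `validation_split` takes the last fraction of the samples passed to `fit`, before any shuffling.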

**(c)** Train a neural network with a single hidden layer on the dataset, tuning the relevant hyperparameters to optimize accuracy. 

In [108]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import optimizers
from numpy.random import seed, randint

seed(0)
tf.random.set_seed(0)

model = Sequential() #initialize neural network
model.add(Dense(units = 30, activation = 'sigmoid', input_dim = 10000)) #add the first hidden layer
model.add(Dense(units = 2, activation = 'softmax')) #output layer

sgd = optimizers.SGD(learning_rate = 0.03)
model.compile(loss = 'categorical_crossentropy', optimizer = sgd, metrics = ['accuracy'])

history = model.fit(X_train, Y_train, epochs = 20, batch_size = 50, validation_split = 0.2, verbose = 1)

Epoch 1/20
240/240 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.6118 - loss: 0.6500 - val_accuracy: 0.7423 - val_loss: 0.5583
Epoch 2/20
240/240 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7490 - loss: 0.5434 - val_accuracy: 0.7890 - val_loss: 0.4900
Epoch 3/20
240/240 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7926 - loss: 0.4818 - val_accuracy: 0.8097 - val_loss: 0.4445
Epoch 4/20
240/240 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8179 - loss: 0.4384 - val_accuracy: 0.8283 - val_loss: 0.4142
Epoch 5/20
240/240 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8334 - loss: 0.4063 - val_accuracy: 0.8337 - val_loss: 0.3926
Epoch 6/20
240/240 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8452 - loss: 0.3810 - val_accuracy: 0.8440 - val_loss: 0.3761
Epoch 7/20
240/240

I experimented with the learning rate, activation function, number of epochs and batch size until I achieved 91% accuracy on the training set and 87% accuracy on the validation set.

At first I only looked at the training accuracy and realized that, with enough epochs, it would eventually reach 100%. So I capped the number of epochs at the point where training accuracy reached 100%. However, I then started keeping an eye on the validation accuracy and realized that it plateaued far sooner, which made me reduce the number of epochs even further. 87% validation accuracy was the absolute maximum I managed to reach across all experiments.

I then realized that if my learning rate was too high, there would be considerable fluctuations (both up and down) across my epochs. Considering that I had so few epochs, I decided to reduce the learning rate so that I would have a higher chance of increasing the accuracy with each epoch. This worked out well for the hyperparameters I ended up going with.
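
The manual rule described here (stop once the validation metric has plateaued) can be written down as a "patience" rule. The sketch below is library-free. The function name `best_epoch_with_patience` is hypothetical, and only the first six loss values come from the training log above. The last three are invented to illustrate a plateau.

```python
def best_epoch_with_patience(val_losses, patience=2):
    """Return (best epoch, epoch at which training would stop), both 1-based."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # Validation loss improved: remember this epoch
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here
            return best_epoch, epoch
    return best_epoch, len(val_losses)


# First six values: val_loss from the log above.
# Last three values: made up for illustration only.
val_losses = [0.5583, 0.4900, 0.4445, 0.4142, 0.3926, 0.3761,
              0.3790, 0.3805, 0.3820]

print(best_epoch_with_patience(val_losses, patience=2))  # (6, 8)
```

Keras provides the same idea as a callback, `tf.keras.callbacks.EarlyStopping` (with `monitor`, `patience` and `restore_best_weights` arguments), which can be passed to `model.fit` through `callbacks=[...]`.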

**(d)** Test your sentiment-classifier on the test set.

In [110]:
loss, accuracy = model.evaluate(X_test, Y_test)

print(f'Test Loss: {loss}')
print(f'Test Accuracy: {accuracy}')

157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 583us/step - accuracy: 0.8599 - loss: 0.3350
Test Loss: 0.3363731801509857
Test Accuracy: 0.8611999750137329


It seems that the test accuracy is very close to the validation accuracy, which is good!

It suggests that the model generalizes well to unseen data and that the train/validation/test split was reasonable, since there was no considerable fluctuation between the test and validation accuracy results.

Since I couldn't bring my validation accuracy any higher, I also didn't expect the test accuracy to go higher than what we got in the previous step.

I am happy with these results.

**(e)** Use the classifier to classify a few sentences you write yourselves. 

In [152]:
new_text = [
    "This movie was terrible! The acting was awful.",
    "Amazing movie!",
    "Cate Blanchett",
    "I have seen more boobs in this movie than in my life",
    "I love the authenticity of the actors",
    "a lot of violence",
    "too much violence",
    "i hated this movie",
    "Adam Sandler is ugly and acts bad",
    "I actually cried",
    "beautiful animation",
    "This studio only does shitty crappy movies",
    "terrible",
    "This studio only does terrible movies",
    "I hate this movie",
    "I hated this movie",
    "I hated hated hated hated this movie"]


def print_models_preds(newtextarray):
    new_text_vectorized = vectorizer.transform(newtextarray)

    predictions = model.predict(new_text_vectorized)

    for i, pred in enumerate(predictions):
        print(pred)
        sentiment = "positive" if pred[0] < 0.5 else "negative"
        print(f"Original Text: {newtextarray[i]}")
        print(f"Predicted Sentiment: {sentiment}\n")
    
print_models_preds(new_text)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 26ms/step
[0.86603284 0.13396718]
Original Text: This movie was terrible! The acting was awful.
Predicted Sentiment: negative

[0.28333795 0.71666205]
Original Text: Amazing movie!
Predicted Sentiment: positive

[0.43251127 0.56748873]
Original Text: Cate Blanchett
Predicted Sentiment: positive

[0.29950377 0.7004962 ]
Original Text: I have seen more boobs in this movie than in my life
Predicted Sentiment: positive

[0.3316189 0.6683811]
Original Text: I love the authenticity of the actors
Predicted Sentiment: positive

[0.3597052  0.64029485]
Original Text: a lot of violence
Predicted Sentiment: positive

[0.5069899  0.49301013]
Original Text: too much violence
Predicted Sentiment: negative

[0.46559173 0.5344083 ]
Original Text: i hated this movie
Predicted Sentiment: positive

[0.64204276 0.35795733]
Original Text: Adam Sandler is ugly and acts bad
Predicted Sentiment: negative

[0.39344728 0.6065528 ]
Original Text: I ac

By printing out the predicted sentiment alongside the model's confidence for each of these reviews, we can conclude that the bag-of-words method might not be the best choice for this exercise, perhaps because it removes emphasis from the words that actually matter for the sentiment of the review. An average person would immediately judge "I hated this movie" and "This studio only does shitty crappy movies" to be negative reviews, but the model, despite being a bit "on edge" (only around 50 to 60% sure), still classified them as positive.
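
As a small sketch of how the label and the "confidence" can be read from the softmax output, the snippet below takes three of the probability rows printed above and picks the larger entry with `argmax`. Column 0 is the negative class and column 1 the positive class, because the labels were encoded as `labels == 'positive'`. The variable names here are my own.

```python
import numpy as np

# Softmax rows copied from the output above
# (column 0 = negative, column 1 = positive)
probs = np.array([
    [0.86603284, 0.13396718],  # "This movie was terrible! The acting was awful."
    [0.46559173, 0.5344083],   # "i hated this movie"
    [0.5069899, 0.49301013],   # "too much violence"
])
class_names = np.array(["negative", "positive"])

predicted = class_names[probs.argmax(axis=1)]  # most probable class per row
confidence = probs.max(axis=1)                 # probability of that class

for label, conf in zip(predicted, confidence):
    print(f"{label} ({conf:.0%})")
```

For two classes this gives the same labels as the `pred[0] < 0.5` test used in `print_models_preds`, and it shows how close to 50% some of the decisions are.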

From testing with my own sentences, I also realized that the number of times a word appears in a review greatly influences the output of this model ("I hated this movie" is classified as positive, but "I hated hated hated hated this movie" is classified as negative).
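
The effect comes straight from the count features: repeating a word increases its entry in the input vector. The sketch below shows this on the two sentences. It also shows the `binary=True` option of `CountVectorizer`, which records only presence or absence. Whether that would help the classifier here is untested and just a suggestion.

```python
from sklearn.feature_extraction.text import CountVectorizer

texts = ["I hated this movie", "I hated hated hated hated this movie"]

# Default behaviour: each entry is the number of occurrences
count_vec = CountVectorizer()
counts = count_vec.fit_transform(texts).toarray()
hated_counts = counts[:, count_vec.vocabulary_["hated"]]
print(hated_counts)  # [1 4]

# binary=True: each entry is 1 if the word occurs at all, else 0
binary_vec = CountVectorizer(binary=True)
binary = binary_vec.fit_transform(texts).toarray()
hated_binary = binary[:, binary_vec.vocabulary_["hated"]]
print(hated_binary)  # [1 1]
```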

One way to make this model more robust would be to remove neutral words from the bag of words (for example "I", "movie", "this", "is"), that is, the words that are needed to form a sentence but carry no sentiment. The remaining words would then carry far more meaning for the model, perhaps making it perform a lot better (because the sentence "I hated this movie" would become just "hated" and be far more likely to be classified as negative).

I would also like to note that the model may not have performed so well on my hand-written reviews because they are drastically shorter than the ones we see on IMDb, so this type of data might be a bit foreign to our model ahahahah.

Having observed this, I want to try it with actual IMDb reviews.

Below, I have added three positive and three negative reviews (10 stars and 1 star) of Everything Everywhere All at Once.

In [154]:
imdb_text = [
    "I have trouble turning off my brain. Anxieties, worries, mundane to-dos, even positive things, sometimes feel like they're swirling around in a chaotic funnel cloud and I would like nothing more than to sit in physical and mental silence. Everything Everywhere All At Once felt like the inside of my head. In a world of non-stop, 24/7 news, most of it bad, how is a person like me, who has trouble filtering out things that affect me directly from all of the other things that are just out there happening in general and over which I have no control, supposed to cope? One answer is to decide that nothing matters anyway and give up caring. But that means deciding that my wife doesn't matter. And that my kids don't matter. And that art, and nature, and things that bring joy to my life, don't matter. Another way is to decide that some things, ok maybe most things, don't matter, but that there are things that do, and those are the things that make it all worth it. I get to decide what those things are. The first approach is nihilistic. The second approach is empowering. This film explores both approaches, and I was a sobbing mess at the end. I will say there were times that I was a bit exhausted by this movie. It throws a lot on the screen and at the viewer, and occasionally it can't keep up with its ambitions. But this was mostly a home run. Michell Yeoh does terrific work in this, but the MVP is Ke Huy Quan (Short Round from the Indiana Jones movies).",
    "Be kind, especially when you don't know what's going on. If only we could recognize that those who combat us in life do so out of hidden pain, and fight them with kindness. If only we could accept our kids for who they are, and say supportive things from the heart instead of trying to mold them. If only we could be content with the life we have, and set aside the idea of the countless other lives we might have led, had we made different decisions along the way. If only we could see that the flipside to life being meaningless and everything ultimately being sucked into the abyss is the freedom that comes from that, that we can do anything with the time we've got. Like the first word in its title, this film feels like everything. While watching it I thought Stephanie Hsu was everything too, but then again, so was Michelle Yeoh and Jamie Lee Curtis. The homage to Wong Kar-Wai with Ke Huy Quan making his enlightened speech is extraordinary, and the fact that James Hong was still getting it done at age 93 was wonderful. I also loved how both mother and daughter, troubled as they were with their upbringing, each found comfort in a kind and patient partner, and Tallie Medel's character was a nice touch. The film threatens to go off the rails with its madcap multiverse hopping in the first part, but it's all a setup for the second part, which is incredibly powerful. It was impressive that it managed to be so entertaining along the way, with its fast-paced action and the wild ride it takes us on. It's one that rewards a rewatch to, as its full of little details and references. Just a great film, lots of fun, and from the heart.",
    "Everything Everywhere All At Once is even crazier than the trailer would lead one to believe. It's bursting with so many original, weird concepts and ideas that no serious storyteller in their right mind would ever dare to put in their movie. But Daniels did. And it works with spectacularly effective results. These filmmakers choose not only to put every bizarre idea they could think of into their movie, but they ensure that every oddity adds something to the thought-provoking, emotionally resonant themes that pervade the story. So many angles can be explored surrounding this story of emotional connection and the things that distract from it. As Evelyn connects with her alternate selves and alternate relatives, there is both envy of and pity toward them that is explored. It's such a cathartic experience to watch this woman discover what she truly wants from life and loves about her life. All of the beautiful themes and heartwarming character moments are just the tip of the iceberg here, though. This is one of the funniest movies I have ever seen. The Daniels lean into the absurdity of the world that they have created, always reaching for the most outlandish possibilities and never playing it safe. It makes for some truly outstanding visual gags that will never leave my mind., and it makes for the most unique viewing experience I have ever had in my life. This is a masterpiece of originality. There is not, nor will there ever be, a film quite like this one. And I cannot wait to watch it again and again and again.",
    "This movie was supposed to be the best ever, but it just freaked me out. The movie is all over the place and the depth it could have had is ruined by randomness, absurd humor and pretenciousness. In one of the scenes the characters suddenly have sausages instead of fingers and they make a pretensious referance to A space odyssey where the pre human apes fight with sausage fingers. Maybe that's just humourous and well meant, but I could hardly watch. It's impossible to tell if this movie is a comedy or an action movie or a deep drama and if such confusion is your thing, then maybe you would love it. I could barely stand it.",
    "I cannot understand why this movie got high reviews. It is overly strange but not in a good way. It's strange in a way that makes absolutely no sense. There's no good acting or effects. The story is almost non-existing. There is no cohesion and no characters that makes sense at all. It may have some famous characters or a famous director, I'm not sure. I recognized the main character only. It's like someone did a school project while doing drugs and ended with this weird but bad result. It's too far fetched even from an artistic point of view. I cannot recommend it no matter what preferences you may have.",
    "I watch and re-watch a lot of movies per year as a hobby (no money involved), and today I have 9988 reviews in IMDb. Last month, I received an email from IMDb listing Everything Everywhere All at Once as one of the Top-10 movies of 2022. In IMDb, it is informed that this flick is nominated for 10 BAFTA Awards, 240 wins and 351 nominations. I can only understand that this is a heard behavior to the promotion of studio, using professional critics and press to promote such a garbage. I cannot envision a normal being, without financial interest or being manipulated by critics, to enjoy this crap. It seems to be a bad trip of the writers turned into a movie by insane productors. In the end, this film is 2h 19 min of complete waste of time. My vote is one (awful).",
]

print_models_preds(imdb_text)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 26ms/step
[0.1804981 0.8195018]
Original Text: I have trouble turning off my brain. Anxieties, worries, mundane to-dos, even positive things, sometimes feel like they're swirling around in a chaotic funnel cloud and I would like nothing more than to sit in physical and mental silence. Everything Everywhere All At Once felt like the inside of my head. In a world of non-stop, 24/7 news, most of it bad, how is a person like me, who has trouble filtering out things that affect me directly from all of the other things that are just out there happening in general and over which I have no control, supposed to cope? One answer is to decide that nothing matters anyway and give up caring. But that means deciding that my wife doesn't matter. And that my kids don't matter. And that art, and nature, and things that bring joy to my life, don't matter. Another way is to decide that some things, ok maybe most things, don't matter, but that 

The model does a great job of classifying the positive reviews with extreme confidence, but it is far less confident on the negative reviews, even misclassifying the first negative one.

I also noticed, just by looking at them, that the positive reviews are far longer than the negative ones. Could it be that, because of the bag-of-words representation, the model leans towards a positive prediction for longer reviews, since those contain more words and positivity therefore ends up associated with far more words than negativity?
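
One way to probe this length hypothesis is to look at how many vocabulary tokens each review contributes, and to see what the features look like once length is factored out. The sketch below does this on two made-up reviews with `sklearn.preprocessing.normalize`. L1 normalization is my own suggestion, not something the notebook tried, and a model would need to be retrained on normalized inputs before drawing conclusions.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import normalize

# Made-up reviews of very different length (hypothetical)
texts = [
    "great great film with great acting and a great story",
    "bad film",
]

len_vec = CountVectorizer()
len_counts = len_vec.fit_transform(texts).astype(np.float64)

# Number of vocabulary tokens per review (row sums of the count matrix)
lengths = np.asarray(len_counts.sum(axis=1)).ravel()
print(lengths)  # [9. 2.]

# L1-normalize each row so that it sums to 1: counts become relative frequencies
freqs = normalize(len_counts, norm="l1")
row_sums = np.asarray(freqs.sum(axis=1)).ravel()
print(row_sums)  # [1. 1.]
```

With raw counts, the long review feeds much larger input values into the network than the short one. After normalization both rows have the same total weight.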

Using real reviews made the analysis of this model even more interesting!!!

I couldn't resist also adding 5-star reviews for the same movie... I thought their sentiment would be so-so, but they are actually quite negative! Some of them felt even more negative than the one-star ones. Let's have a look.

In [155]:
five_star_text = [
    "This one is a meh film, someone's alternate idea of multi-verse sci-fi and how do we script things to get Michelle Yeoh doing martial arts again ? A long, long way to push a theme about LGBT acceptance. The anal plugs and dildos were just odd and a strange inclusion. Some of the characters were outright annoying such as the daughter and the various minions. The hubby was OK as he smoothly switched personalities. Good old James Hong still with us, always recall him as hey that guy.... Jamie Lee Curtis is fine as the IRS Auditor hamming it up. And Yeoh going through the paces, looks like a bit of a payday for her. All in all certainly not one of the better films of the year.",
    "The onslaught of Visual Metaphors gets old, and one is looking for the film to just get it over with and end. So disappointing - doesn't live up to expectations nor the hype. First half has some engaging moments, but it could have wrapped up scenes without explaining what we already knew. I don't need a primer to explain the content and themes of a film thank you very much.This film could have been so much better, but it fails to rely upon the intricate visuals (so painstakingly laid out), deciding instead we needed explanation of the filmmakers' intent. Such a hedge essentially ruined this film - it is awkward and runs about 45 minutes too long because of it. Nice effort visually - my rating is frankly higher than I feel it should be, but the visuals were often witty, so...5.Overall, I had to force myself to see entire film, and it was painful. I feel cheated out of what could have been. Perhaps one Director would have made a difference. Don't know. Won't ever know. It's a self-indulgent mess.",
    "Another example of a incoherent and weak storytelling, weird shifts in tempo and bad exposition after people become even more confused.Besides two fight scenes I was bored and annoyed to death by the film. There was only one scene I enjoyed very much about two stones sitting on a ledge of a canyon.",
    # Trailing comma added below: without it, Python concatenates this string with the next one
    "I had high hopes for this film. I love Michelle Y. She is a brilliant and talented actor. However this film is a big let down. Her acting is excellent as usual. The plot is all over the place. It is random to the point of feeling like some sort of bizarre circus show from an alternative Universe.I am completely baffled by the rating and can only concur that it's high because of the pretty high standard of visual effects and constant barrage of random action.I Felt like I was eating 10 portions of sticky toffee pudding and needed to vomit by the end but I was to delirious to find a place to release the contents of my stomach and therefore had to hold it in. Not a great feeling!This film had huge potential. The cast is great. It ended up feeling lazy and confusing. Maybe that was the intended effect? Maybe this is a film that is feeding into the random Instagram/Tik Tok scrolling generation?For those with huge attention deficit this film might be a blessing.Not for me I'm afraid. A big waste of talent.",
    "You know that feeling when someone starts telling you an anecdote and you think its going to be funny, but they just go on and on until eventually you switch off to, what they're saying... that's how this film was for me. I was laughing out loud for the first 20 minutes, enjoying the bizarreness of it all, then I was smiling, then just watching... after an hour I'd picked up my iPad and was checking emails, glancing up at the TV screen, where the nonsense was still trying to make me laugh. But it was just more nonsense. When The End came up, it was a relief. I know there was 50 mins+ left, but I really didn't care. Gave up and watched something else."
]

print_models_preds(five_star_text)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 31ms/step
[0.53321844 0.46678147]
Original Text: This one is a meh film, someone's alternate idea of multi-verse sci-fi and how do we script things to get Michelle Yeoh doing martial arts again ? A long, long way to push a theme about LGBT acceptance. The anal plugs and dildos were just odd and a strange inclusion. Some of the characters were outright annoying such as the daughter and the various minions. The hubby was OK as he smoothly switched personalities. Good old James Hong still with us, always recall him as hey that guy.... Jamie Lee Curtis is fine as the IRS Auditor hamming it up. And Yeoh going through the paces, looks like a bit of a payday for her. All in all certainly not one of the better films of the year.
Predicted Sentiment: negative

[0.97885555 0.02114445]
Original Text: The onslaught of Visual Metaphors gets old, and one is looking for the film to just get it over with and end. So disappointing - doesn't 

Our model caught all of them! So it does perform pretty well on real reviews from IMDb. Maybe we should feed it YouTube comments too next time, so that it does better with short-form commentary ahahahah.

This assignment was super interesting :).