# Recurrent Neural Network

### Load data

In [2]:
pip install pandas

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [3]:
import pandas as pd
import re
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense
import numpy as np

### Visualize data

In [4]:
data = pd.read_csv('IMDB Dataset.csv')

# Keeping only the necessary columns
data = data[['review', 'sentiment']]
data.head(10)

| | review | sentiment |
|---|---|---|
| 0 | One of the other reviewers has mentioned that ... | positive |
| 1 | A wonderful little production. <br /><br />The... | positive |
| 2 | I thought this was a wonderful way to spend ti... | positive |
| 3 | Basically there's a family where a little boy ... | negative |
| 4 | Petter Mattei's "Love in the Time of Money" is... | positive |
| 5 | Probably my all-time favorite movie, a story o... | positive |
| 6 | I sure would like to see a resurrection of a u... | positive |
| 7 | This show was an amazing, fresh & innovative i... | negative |
| 8 | Encouraged by the positive comments about this... | negative |
| 9 | If you like original gut wrenching laughter yo... | positive |


### Format data

In [5]:

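# Drop any neutral rows. The sample above shows only 'positive' and 'negative',
# so this filter is expected to be a no-op here. copy() avoids a
# SettingWithCopyWarning on the column assignments below.
data = data[data['sentiment'] != "Neutral"].copy()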

# Handle NaN values in 'review' column
data['review'] = data['review'].fillna('')

# Convert 'review' to lowercase
data['review'] = data['review'].str.lower()

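# Remove everything except letters, digits and whitespace
# (raw string; the range is A-Z, since A-z would also keep [ \ ] ^ _ `)
data['review'] = data['review'].apply(lambda x: re.sub(r'[^a-zA-Z0-9\s]', '', x))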

# Tokenization
max_features = 2000
tokenizer = Tokenizer(num_words=max_features, split=' ')
tokenizer.fit_on_texts(data['review'].values)
X = tokenizer.texts_to_sequences(data['review'].values)
X = pad_sequences(X)
y = data['sentiment']
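
The cleaning step can be checked in isolation with the standard library alone. The sketch below is illustrative: `clean_review` is a hypothetical helper name that does not appear in the notebook, and replacing `<br />` tags with a space before filtering is an added assumption (without it, the filter strips the angle brackets and slash and leaves the literal token `br` behind). It also shows why a character range written `A-z` rather than `A-Z` lets characters such as `_` and `^` through.

```python
import re

# Hypothetical helper for illustration; the notebook applies these steps inline.
def clean_review(text: str) -> str:
    text = text.lower()
    # Assumption: turn HTML line breaks into a space so "br" is not kept as a word.
    text = re.sub(r'<br\s*/?>', ' ', text)
    # Keep only letters, digits and whitespace.
    return re.sub(r'[^a-zA-Z0-9\s]', '', text)

# The range A-z spans ASCII 65..122, which includes [ \ ] ^ _ and the backtick,
# so those characters survive a filter written with A-z.
kept_by_wide_range = re.sub(r'[^a-zA-z0-9\s]', '', 'a_b^c')
kept_by_strict_range = re.sub(r'[^a-zA-Z0-9\s]', '', 'a_b^c')

print(repr(clean_review("Good<br />Movie!")))
print(repr(kept_by_wide_range), repr(kept_by_strict_range))
```

Whether to drop the `<br />` tags is a judgment call: it changes the vocabulary the tokenizer sees, so any model trained before the change would need retraining.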

### Training set

In [6]:
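# One-hot encode the sentiment labels. get_dummies sorts the columns,
# so column 0 is 'negative' and column 1 is 'positive'.
# Cast to float32 because recent pandas versions return boolean dummies.
Y = pd.get_dummies(data['sentiment']).values.astype('float32')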

# Split data into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33, random_state=42)
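
`pd.get_dummies` orders its output columns by sorted category value, which is why index 1 is read as "positive" when the prediction is decoded later. The sketch below mimics that ordering in plain Python (it is not pandas itself, and the four labels are made up) to make the label-to-column mapping explicit.

```python
# Made-up labels for illustration only.
labels = ["positive", "negative", "negative", "positive"]

# Sorted unique labels play the role of the one-hot columns.
columns = sorted(set(labels))
index = {name: i for i, name in enumerate(columns)}

# One row per label, with a 1 in the column that matches the label.
one_hot = [[1 if index[label] == i else 0 for i in range(len(columns))]
           for label in labels]

print(columns)
print(one_hot)
```

Keeping `columns` around makes the decoding step less fragile than hard-coding the index, since a third class or a renamed label would shift the positions.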


### Design a model

In [7]:
model = Sequential()
model.add(Embedding(max_features, 128, input_length=X.shape[1]))
model.add(LSTM(128))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
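
To get a feel for the size of this model, the trainable parameters can be counted by hand. This is a sketch under the assumption of a standard LSTM layer with biases enabled and no extra options; the variable names are illustrative, and `model.summary()` should report matching numbers.

```python
vocab_size = 2000   # max_features
embed_dim = 128     # Embedding output size
lstm_units = 128
n_classes = 2

# One embedding vector per vocabulary entry.
embedding_params = vocab_size * embed_dim

# An LSTM has 4 gates, each with input weights, recurrent weights and a bias.
lstm_params = 4 * ((embed_dim + lstm_units) * lstm_units + lstm_units)

# Dense layer: weight matrix plus one bias per class.
dense_params = lstm_units * n_classes + n_classes

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
```

Most of the parameters sit in the embedding table, so `max_features` and the embedding width are the first knobs to turn if the model needs to shrink.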

### Training

In [8]:
batch_size = 64
model.fit(X_train, Y_train, epochs=10, batch_size=batch_size, verbose=2)

Epoch 1/10
105/105 - 727s - loss: 0.6128 - accuracy: 0.6391 - 727s/epoch - 7s/step
Epoch 2/10
105/105 - 754s - loss: 0.3829 - accuracy: 0.8316 - 754s/epoch - 7s/step
Epoch 3/10
105/105 - 776s - loss: 0.2779 - accuracy: 0.8918 - 776s/epoch - 7s/step
Epoch 4/10
105/105 - 800s - loss: 0.2204 - accuracy: 0.9158 - 800s/epoch - 8s/step
Epoch 5/10
105/105 - 838s - loss: 0.1932 - accuracy: 0.9275 - 838s/epoch - 8s/step
Epoch 6/10
105/105 - 4861s - loss: 0.1532 - accuracy: 0.9393 - 4861s/epoch - 46s/step
Epoch 7/10
105/105 - 1173s - loss: 0.1281 - accuracy: 0.9537 - 1173s/epoch - 11s/step
Epoch 8/10
105/105 - 1147s - loss: 0.1104 - accuracy: 0.9612 - 1147s/epoch - 11s/step
Epoch 9/10
105/105 - 859s - loss: 0.1407 - accuracy: 0.9485 - 859s/epoch - 8s/step
Epoch 10/10
105/105 - 884s - loss: 0.1280 - accuracy: 0.9566 - 884s/epoch - 8s/step


<keras.src.callbacks.History at 0x1ed517ceeb0>
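
The `105/105` counter in the log counts batches, not samples. A small sketch, with hypothetical helper names, shows how a batch count maps back to a range of sample counts. Applied to the logs in this notebook it suggests roughly 6,700 training rows and 3,300 test rows, that is, about 10,000 reviews in total; the source never states the row count, so treat this as an inference.

```python
import math

def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(n_samples / batch_size)

def sample_range(steps: int, batch_size: int) -> tuple:
    # Smallest and largest sample counts that yield exactly `steps` batches.
    return (steps - 1) * batch_size + 1, steps * batch_size

train_range = sample_range(105, 64)  # training log: 105 steps per epoch
test_range = sample_range(52, 64)    # evaluation log: 52 steps

print(train_range, test_range)
print(steps_per_epoch(6700, 64), steps_per_epoch(3300, 64))
```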

### Evaluation on the test set

In [9]:
score, acc = model.evaluate(X_test, Y_test, verbose=2, batch_size=batch_size)
print("Test Score:", score)
print("Test Accuracy:", acc)

52/52 - 24s - loss: 0.5770 - accuracy: 0.8333 - 24s/epoch - 457ms/step
Test Score: 0.5770147442817688
Test Accuracy: 0.8333333134651184
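
The first value returned by `model.evaluate` is the loss the model was compiled with (categorical cross-entropy here), followed by the metrics, so `score` is a loss where lower is better. The sketch below computes that loss for a single sample in plain Python; the probabilities are made up for illustration, and `eps` is an assumed guard against `log(0)`.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # -sum(y_true * log(y_pred)) for one sample; eps guards against log(0).
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# One-hot target in [negative, positive] order: the true class is positive.
target = [0, 1]

confident_right = categorical_crossentropy(target, [0.1, 0.9])
confident_wrong = categorical_crossentropy(target, [0.9, 0.1])

print(round(confident_right, 4), round(confident_wrong, 4))
```

A confidently wrong prediction costs far more than a confidently right one saves, which is one plausible reason a model can show a fairly high test loss alongside decent test accuracy.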


### Formatting Test Example

In [22]:
#NEGATIVE
#text = "What happened? What we have here is basically a solid and plausible premise and with a decent and talented cast, but somewhere the movie loses it. Actually, it never really got going. There was a little excitement when we find out that Angie is not really pregnant, then find out that she is after all, but that was it. Steve Martin, who is a very talented person and usually brings a lot to a movie, was dreadful and his entire character was not even close to being important to this movie, other than to make it longer. I really would have liked to see more interactions between the main characters, Kate and Angie, and maybe try not for a pure comedy, which unfortunately it was not, but maybe a drama with comedic elements. I think if the movie did this it could have been very funny since both actresses are quite funny in their own ways and sitting here I can think of numerous scenarios that would have been a riot."
#POSITIVE
#text="I haven't read the Anne Rice novel that this movie was based on, but who knows, maybe reading the book is cheaper than renting QUEEN OF THE DAMNED and is probably better for your health. It isn't that this movie is necessarily bad for your health, but a book can be very relaxing and certainly exercises the active part of your brain more so than this movie. You can count the number of pages by Anne Rice that I've read on one hand, but after seeing this movie and Interview with a Vampire, I get the feeling that she writes really good novels. The plots for both movies hint at a whole sea of deep and interwoven vampire history.<br /><br />Still, Stuart Townsend's voice-over narration gets a heck of a lot more annoying than Brad Pitt's vampire narrative ever did, and you can tell that QUEEN OF THE DAMNED's limited production resources barely give enough flesh to the Anne Rice storyline. While Interview decided to go with lace and elegance, QUEEN relies on low budget special effects that try really hard to be taken seriously. One can see that the original novel had potential as a movie and that the production team focused its attention in the wrong places. The costumes and rock & roll stage could have been replaced with more blood and an eerier soundtrack.<br /><br />However, I'll give credit where credit is due. The soundtrack is excellent. Korn and Disturbed had me down with the sickness bobbing my noggin like Butthead.<br /><br />The film opens with a very cool Goth-rock zoom & splice montage, but after the first ten minutes or so, the directing degenerates quickly. It's as if the movie was so long that the director realized that there wasn't enough time and enough money to do an Anne Rice novel justice. What results are some mediocre vampire scenes and plenty of cheesy special effects. Unfortunately, QUEEN OF THE DAMNED fails to do the genre justice just as its John Carpenter counterparts fail to impress. Where are the yellow contacts? Where's the pale blue make-up? Scene after scene, I shook my head reminiscing about the days of Salem's Lot and Fright Night when low budget was done right.<br /><br />There are redeeming qualities though that save this movie from being garbage. Props to Aaliyah, and may her soul forever rest in peace. She might have become a renowned actress, had her life not been taken from us so prematurely, for she did give this movie a decent performance with plenty of nice belly dancing. Did I mention that the soundtrack was good? Let's see, what else can I say? It wasn't too long. The Anne Rice novel could have easily been a three hour movie if an ambitious director like Francis Ford Coppola got his hands on it. There are a few twists and turns here and there in the plot. But all in all it was a legitimate rock and roll addition to the slew of second-rate vampire movies out there. The director of this movie went on to direct a new Battlestar Galactica mini series if that tells you anything.<br /><br />JY<br /><br />Jimboduck-dot-com"
#NEGATIVE
#text= "I watched mask in the 80's and it's currently showing on Fox Kids in the UK (very late at night). I remember thinking that it was kinda cool back in the day and had a couple of the toys too but watching it now bores me to tears. I never realised before of how tedious and bland this cartoon show really was. It's just plain awful! It is no where near in the same league as The Transformers, He-man or Thundercats and was very quickly forgot by nearly everyone once it stopped being made. I only watch it on Fox Kids because Ulysses 31 comes on straight after it (that's if mask doesn't put me to sleep first). One of the lesser 80's cartoons that i hope to completely forget about again once it finishes airing on Fox Kids!"
#POSITIVE
text= "These slasher pics are past their sell by date, but this one is good fun.<br /><br />The valentine cards themselves are witty, and well thought out.<br /><br />The film has one Peach of a line... 'He's no Angel....' when he in fact IS Angel!!! Watching Buffy reruns will never be the same!<br /><br />The cast is a sizzling display of young talent, but the story does not give them enough real depth. Denise Richards on the DVD extras seemed to think the girls on set bonded well together and this would give the feeling that you empathised with their characters. Sorry but NO!<br /><br />The direction is very good, managing to show very little actual gore, and relying on your imaginations implied threat. Much can be said also for the similar manner in which Miss Richards and Heigel do not remove their clothes...:-(<br /><br />Essentially, the main directorial plus, lies within the 'borrowing' of various other ideas from previous slasher flicks. Psycho's shower scene is tributed, along with Halloween's 'masking'. <br /><br />Murdering someone hiding in a bodybag though is a pretty original one as far as I know!!!<br /><br />Light viewing, not very scary but a few good jump moments. If it was a choice between The Hole and this though, choose The Hole. Slasher movies have had their day, and this is just another slasher. A very good slasher, but nothing groundbreaking!!!"

text = text.lower()

text = re.sub(r'[^a-zA-Z0-9\s]', '', text)

test_sequence = tokenizer.texts_to_sequences([text])

test_sequence = pad_sequences(test_sequence, maxlen=X.shape[1])

prediction = model.predict(test_sequence)

# Column order from get_dummies: index 0 = negative, index 1 = positive
sentiment = "Positive" if np.argmax(prediction[0]) == 1 else "Negative"

print('Predicted Sentiment:', sentiment)

Predicted Sentiment: Positive
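
The `pad_sequences(..., maxlen=X.shape[1])` call matters because the model expects every input to have the training length. By default Keras pads and truncates at the start of the sequence (`padding='pre'`, `truncating='pre'`). The sketch below is a plain-Python mimic of that default for a single sequence, assuming `maxlen` is at least 1; `pad_pre` is a hypothetical name, not a Keras function.

```python
def pad_pre(sequence, maxlen, value=0):
    # Keep the last `maxlen` items (pre-truncation), then left-pad with `value`.
    trimmed = sequence[-maxlen:]
    return [value] * (maxlen - len(trimmed)) + trimmed

short_review = pad_pre([5, 8, 2], 6)
long_review = pad_pre([1, 2, 3, 4, 5, 6, 7, 8], 6)

print(short_review)
print(long_review)
```

One consequence is that a review longer than the training length loses its opening words rather than its ending, which may or may not suit reviews that state their verdict up front.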
