The code below loads the IMDB review dataset from Keras, pads each review to a fixed length of 300 tokens, builds a GRU classifier on top of an embedding layer, trains it for 5 epochs, and evaluates it on the test set. In the run shown here the model reaches a test accuracy of about 0.84. Because the test set is also passed as `validation_data`, the validation accuracy reported during training is measured on the same data. Better results will likely require tuning the hyperparameters or trying other architectures.





In [14]:
import numpy as np
from tensorflow import keras
from tensorflow.keras.datasets import imdb
from tensorflow.keras.layers import Dense, GRU, Embedding
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Set hyperparameters
max_features = 30000  # Vocabulary size: keep only the 30,000 most frequent words
maxlen = 300  # Pad or truncate every review to 300 tokens
embedding_size = 128  # Embedding dimension

# Load data
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

# Pad sequences 
x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)

# Build GRU model
model = keras.Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(GRU(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train model (the test set doubles as validation data here)
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=5, batch_size=128)

# Evaluate model
loss, accuracy = model.evaluate(x_test, y_test, batch_size=128)
print('Test loss:', loss)
print('Test accuracy:', accuracy)




Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test loss: 0.4540673792362213
Test accuracy: 0.8378400206565857
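
The padding step is easy to misread, so here is a minimal pure-Python sketch of what `pad_sequences` does with its default settings (`padding='pre'`, `truncating='pre'`, `value=0`): short reviews get zeros on the left, and long reviews keep only their last `maxlen` tokens. The helper name `pad_pre` and the example token ids are made up for illustration; this is not the Keras implementation.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad with `value` and keep only the last `maxlen` tokens of each sequence."""
    if maxlen <= 0:
        raise ValueError("maxlen must be positive")
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]  # 'pre' truncation drops tokens from the front
        padded.append([value] * (maxlen - len(kept)) + kept)  # 'pre' padding adds zeros on the left
    return padded


short_review = [11, 12, 13]             # hypothetical token ids
long_review = [21, 22, 23, 24, 25, 26]  # hypothetical token ids
result = pad_pre([short_review, long_review], maxlen=5)
print(result)
```

With `maxlen = 300`, any review longer than 300 tokens therefore loses its opening words rather than its ending.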


In [11]:
# Load the word -> rank mapping once rather than on every call
word_index = imdb.get_word_index()

# Define function to predict sentiment of new review
def predict_sentiment(new_review):
    # Tokenize, then map words to the ids produced by imdb.load_data defaults:
    # 1 = start marker, 2 = out-of-vocabulary, word ranks shifted by index_from=3
    sequence = keras.preprocessing.text.text_to_word_sequence(new_review)
    ids = [word_index[word] + 3 if word in word_index else 2 for word in sequence]
    test_data = [[1] + [i if i < max_features else 2 for i in ids]]
    test_data = pad_sequences(test_data, maxlen=maxlen)
    prediction = model.predict(test_data)[0][0]
    if prediction >= 0.5:  # threshold = 0.5
        return 'The sentiment of the input text is Positive'
    else:
        return 'The sentiment of the input text is Negative'

# Test the function
reviews = ["This movie was great!", "This movie was horrible and terrible.", "This movie was really good. I enjoyed it a lot."]
for review in reviews:
    sentiment = predict_sentiment(review)
    print(sentiment)




The sentiment of the input text is Positive
The sentiment of the input text is Negative
The sentiment of the input text is Positive
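
When encoding new text for this model, it helps to remember how `imdb.load_data` numbers tokens by default: index 0 is padding, 1 marks the start of a review, 2 stands for out-of-vocabulary words, and every rank from `imdb.get_word_index()` is shifted up by 3 (`index_from=3`). The sketch below illustrates that convention on a toy vocabulary; `encode_review` and `toy_word_index` are hypothetical names, and the ranks are invented for the example.

```python
# Reserved ids used by imdb.load_data with its default arguments
START, OOV = 1, 2
INDEX_FROM = 3  # word ranks are shifted up by this amount


def encode_review(tokens, word_index, num_words):
    """Map tokens to integer ids following the default IMDB numbering."""
    ids = [START]
    for token in tokens:
        rank = word_index.get(token)
        idx = rank + INDEX_FROM if rank is not None else OOV
        # Ids at or above the vocabulary size are replaced by the OOV id
        ids.append(idx if idx < num_words else OOV)
    return ids


toy_word_index = {"the": 1, "movie": 2, "great": 3, "was": 4}  # hypothetical ranks
encoded = encode_review(["the", "movie", "was", "superb"], toy_word_index, num_words=7)
print(encoded)
```

Here "was" falls outside the 7-id toy vocabulary after the shift and "superb" is unknown, so both map to the out-of-vocabulary id.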


![alt text](https://i.pinimg.com/originals/44/b2/1c/44b21cbdd640cbb1ae57470151787aac.png)
