<a href="https://colab.research.google.com/github/rhiosutoyo/Teaching-Deep-Learning-and-Its-Applications/blob/main/02_sentiment_analysis_using_imdb_dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Sentiment Analysis using IMDB Dataset
This example uses the IMDB dataset, which is a commonly used dataset for binary sentiment classification (positive or negative). The example utilize the Keras library with TensorFlow.

This code will build, train, and evaluate a simple LSTM-based model for sentiment analysis on the IMDB dataset. You can further tune the hyperparameters and experiment with different architectures to improve performance.

#Import Libraries

In [1]:
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

#Load and Preprocess Data

In [2]:
# Load the IMDB dataset
max_features = 10000  # Number of words to consider as features
maxlen = 300  # Cut texts after this number of words

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

# Pad sequences to ensure uniform input length
x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz


#Build the Model

In [3]:
model = Sequential([
    Embedding(max_features, 128, input_length=maxlen),
    LSTM(128, dropout=0.2, recurrent_dropout=0.2),
    Dense(1, activation='sigmoid')
])

#Compile the Model

In [4]:
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

#Train the Model

In [5]:
batch_size = 32
epochs = 5

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_data=(x_test, y_test))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


#Evaluate the Model

In [6]:
score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print(f'Test score: {score}')
print(f'Test accuracy: {acc}')

Test score: 0.4358404278755188
Test accuracy: 0.8654800057411194


#Make Predictions

In [9]:
# Make predictions on the test data
predictions = model.predict(x_test)
# Convert predictions to binary labels (0 or 1)
predicted_labels = (predictions > 0.5).astype("int32")



#Make Predictions (for single sentence)

In [17]:
from tensorflow.keras.preprocessing.text import Tokenizer

# Function to preprocess and predict the sentiment of a single sentence
def predict_sentiment(review, tokenizer, model, maxlen=maxlen):
    # Tokenize the review
    tokens = tokenizer.texts_to_sequences([review])
    # Pad the sequence
    tokens_pad = pad_sequences(tokens, maxlen=maxlen)
    # Predict the sentiment
    prediction = model.predict(tokens_pad)
    sentiment = 'positive' if prediction > 0.5 else 'negative'
    return sentiment, prediction[0][0]

# Prepare a tokenizer using the training data
word_index = imdb.get_word_index()
reverse_word_index = {value: key for key, value in word_index.items()}
tokenizer = Tokenizer(num_words=max_features) # Now Tokenizer is defined
tokenizer.word_index = word_index

## Positive Review

In [18]:
# Example review
pos_review = "This movie was fantastic! The acting was great and the story was compelling."

# Predict the sentiment
sentiment, score = predict_sentiment(pos_review, tokenizer, model)
print(f'Review: {pos_review}')
print(f'Sentiment: {sentiment}, Score: {score}')

Review: This movie was fantastic! The acting was great and the story was compelling.
Sentiment: positive, Score: 0.845703661441803


## Negative Review

In [19]:
# Example review
neg_review = "I was really disappointed with this film. The plot was predictable and boring, and the acting felt forced."

# Predict the sentiment
sentiment, score = predict_sentiment(neg_review, tokenizer, model)
print(f'Review: {neg_review}')
print(f'Sentiment: {sentiment}, Score: {score}')

Review: I was really disappointed with this film. The plot was predictable and boring, and the acting felt forced.
Sentiment: negative, Score: 0.164226695895195
