
# The goal of this notebook is to build an artificial neural network (ANN) model using Keras that determines whether movie reviews are positive or negative.

**Initial setup: Library Imports**

In [1]:
import numpy as np
from tensorflow import keras
# Import layers and Sequential from tensorflow.keras so every Keras object comes from the same package
from tensorflow.keras import layers
from tensorflow.keras import Sequential

**Getting the IMDB review dataset from keras.datasets**



In [2]:
review_dataset = keras.datasets.imdb

**Loading the data into variables**

In [3]:
(training_x, training_y),(testing_x, testing_y) = review_dataset.load_data(num_words=10000)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
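Each loaded review is a list of integer word ids rather than text. As a rough sketch of how such a list maps back to words, the helper below inverts a word index and applies the offset of 3 that `load_data` uses by default (ids 0, 1 and 2 are reserved for padding, start-of-sequence and unknown words). The name `decode_review` and the tiny `toy_word_index` are hypothetical; the real vocabulary would come from `review_dataset.get_word_index()`.

```python
def decode_review(sequence, word_index, index_from=3):
    """Map a list of integer word ids back to a space-separated string."""
    # Invert the word -> rank mapping and shift by the reserved-id offset.
    id_to_word = {rank + index_from: word for word, rank in word_index.items()}
    # Reserved or unknown ids have no entry, so show them as "?".
    return " ".join(id_to_word.get(i, "?") for i in sequence)

# Hypothetical miniature vocabulary, for illustration only.
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}
decoded = decode_review([1, 4, 5, 6, 7, 2], toy_word_index)
print(decoded)
```

With the real dataset, the call would look like `decode_review(training_x[0], review_dataset.get_word_index())`.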




**Preparing the data using text vectorization**

In [4]:
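def vectorize_text(sequences, dimension=10000):
    """Multi-hot encode each sequence of word ids into a vector of length `dimension`."""
    # Start from a zero matrix of shape (len(sequences), dimension)
    result = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        # Set a 1 at every word id that occurs in review i
        result[i, sequence] = 1
    return result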
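To make the encoding concrete, here is a small worked example that repeats the function above and runs it on two made-up sequences with a vocabulary of only 5 ids. The values in `toy_sequences` are illustrative and not taken from the dataset. Note that a repeated id still produces a single 1, so word counts and word order are discarded.

```python
import numpy as np

def vectorize_text(sequences, dimension=10000):
    # Same logic as the notebook cell: one row per review, one column per word id.
    result = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        result[i, sequence] = 1
    return result

# Two toy "reviews"; id 2 appears twice in the first one.
toy_sequences = [[0, 2, 2], [1, 3]]
encoded = vectorize_text(toy_sequences, dimension=5)
print(encoded)
# [[1. 0. 1. 0. 0.]
#  [0. 1. 0. 1. 0.]]
```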

**Vectorizing the features and converting the labels**

In [5]:
testing_features = vectorize_text(testing_x)
training_features = vectorize_text(training_x)

testing_targets = np.asarray(testing_y).astype('float32')
training_targets = np.asarray(training_y).astype('float32')

#testing_features.shape
#testing_targets.shape

**Building the model using Keras's Sequential model class**

In [6]:
model = Sequential()

# First hidden layer; input_shape matches the 10,000-dimensional review vectors
model.add(layers.Dense(50, activation="relu", input_shape=(10000,)))

# Remaining hidden layers, with dropout between them to reduce overfitting
model.add(layers.Dropout(0.3))
model.add(layers.Dense(50, activation="relu"))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(50, activation="relu"))

# Output layer: a single sigmoid unit giving the probability of a positive review
model.add(layers.Dense(1, activation="sigmoid"))
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 50)                500050    
_________________________________________________________________
dropout (Dropout)            (None, 50)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 50)                2550      
_________________________________________________________________
dropout_1 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 50)                2550      
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 51        
=================================================================
Total params: 505,201
Trainable params: 505,201
Non-trainable params: 0
_________________________________________________________________
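The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit, and dropout layers have no parameters. The short sketch below redoes that arithmetic; `dense_params` is a hypothetical helper written for this check, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    # One weight for every (input, unit) pair, plus one bias per unit.
    return n_inputs * n_units + n_units

# (inputs, units) for the four Dense layers in the model above.
layer_sizes = [(10000, 50), (50, 50), (50, 50), (50, 1)]
counts = [dense_params(n_in, n_out) for n_in, n_out in layer_sizes]
total = sum(counts)
print(counts, total)
# [500050, 2550, 2550, 51] 505201
```

Almost all of the parameters sit in the first layer, because it connects the 10,000-dimensional input to the 50 hidden units.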

**Setting up the model for training**

In [7]:
# Adam, a variant of stochastic gradient descent, is used as the optimizer;
# binary cross-entropy is the loss, matching the single sigmoid output.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
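For intuition about the loss, binary cross-entropy averages -[y·log(p) + (1-y)·log(1-p)] over the samples, where y is the 0/1 label and p is the predicted probability. The NumPy version below is only an illustrative sketch, not the Keras implementation; the clipping constant `eps` and the sample values are assumptions chosen for the example.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    per_sample = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(per_sample))

y_true = np.array([1.0, 0.0, 1.0])
# Predictions that lean the right way give a low loss.
confident = binary_crossentropy(y_true, np.array([0.9, 0.1, 0.8]))
# Predicting 0.5 everywhere gives log(2), about 0.693, whatever the labels are.
unsure = binary_crossentropy(y_true, np.array([0.5, 0.5, 0.5]))
print(confident, unsure)
```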

**Training the model**

In [8]:
model.fit(training_features, training_targets, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f04e82ff668>

**Testing the model**

In [9]:
# evaluate() returns the loss value followed by the metric values, computed in test mode
_, accuracy = model.evaluate(testing_features, testing_targets)



In [10]:
print("On testing, the model has an accuracy of:", accuracy * 100, "percent")

On testing, the model has an accuracy of: 86.30399703979492 percent
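The reported accuracy is the fraction of reviews whose predicted class matches the label. As a minimal sketch, assuming predictions are cut at a threshold of 0.5, the same number can be computed by hand in NumPy; `binary_accuracy` and the toy arrays below are hypothetical and only illustrate the calculation.

```python
import numpy as np

def binary_accuracy(y_true, y_prob, threshold=0.5):
    # Turn probabilities into hard 0/1 predictions, then compare with the labels.
    predicted = (y_prob > threshold).astype("float32")
    return float(np.mean(predicted == y_true))

toy_targets = np.array([1, 0, 1, 0], dtype="float32")
toy_probs = np.array([0.9, 0.2, 0.4, 0.6], dtype="float32")
toy_accuracy = binary_accuracy(toy_targets, toy_probs)
print(toy_accuracy)  # 0.5: the first two predictions are right, the last two are wrong
```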


**Applying the model**

In [11]:
predictions = model.predict(testing_features)
predictions[30]

array([0.9999994], dtype=float32)
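Each prediction is the sigmoid output, which can be read as the estimated probability that the review is positive (label 1 in this dataset). A value such as 0.9999994 therefore indicates a confident positive prediction. One possible way to turn these probabilities into readable labels is sketched below; `to_labels`, the 0.5 threshold and the toy values are assumptions made for illustration.

```python
import numpy as np

def to_labels(probabilities, threshold=0.5):
    # model.predict returns shape (n_samples, 1); flatten it to a 1-D array first.
    flat = np.asarray(probabilities).reshape(-1)
    return ["positive" if p > threshold else "negative" for p in flat]

# Toy stand-in for the output of model.predict.
toy_predictions = np.array([[0.9999994], [0.03], [0.51]], dtype="float32")
labels = to_labels(toy_predictions)
print(labels)  # ['positive', 'negative', 'positive']
```

With the trained model, `to_labels(predictions[:5])` would label the first five test reviews.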