## End-to-End Deep Learning Project Using Simple RNN

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding,SimpleRNN,Dense

In [None]:
## load the IMDB dataset, keeping only the max_features most frequent words

max_features = 10000
(X_train,y_train),(X_test,y_test) = imdb.load_data(num_words=max_features)

print(X_train.shape,y_train.shape)
print(X_test.shape,y_test.shape)

In [None]:
X_train[0]  # --> a list of integer word indices (each below 10000), one per word; not a one-hot encoding

In [None]:
## inspect a sample review and its label
sample_review = X_train[0]
sample_label = y_train[0]
print(f"sample_review (as integers): {sample_review}")
print(f"sample_label: {sample_label}")

In [None]:
## mapping of word indices back to words (for understanding)
word_index = imdb.get_word_index()
word_index


In [None]:
## reverse dictionary
reverse_word_index = {value:key for key,value in word_index.items()}
reverse_word_index

In [None]:
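# indices 0, 1 and 2 are reserved (padding, start of sequence, unknown word),
# so every real word is stored as its word_index value plus 3
decoded_review = ' '.join([reverse_word_index.get(i - 3, '?') for i in sample_review])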

decoded_review 
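The offset of 3 in the decoding step is easier to see on a tiny example. The sketch below uses a made-up three-word vocabulary (`toy_word_index` and `decode_review` are hypothetical names, not part of the notebook) and assumes the default `imdb.load_data` settings, where 1 marks the start of a review and 2 marks an out-of-vocabulary word.

```python
# Hypothetical miniature vocabulary; the real one comes from imdb.get_word_index().
toy_word_index = {"the": 1, "movie": 2, "great": 3}

# With the default imdb.load_data settings, 0 is padding, 1 is the start token
# and 2 is the unknown-word token, so real words are shifted up by 3.
INDEX_OFFSET = 3


def decode_review(encoded, word_index, offset=INDEX_OFFSET):
    """Turn a list of integer tokens back into a readable string."""
    reverse_index = {rank: word for word, rank in word_index.items()}
    return " ".join(reverse_index.get(token - offset, "?") for token in encoded)


# start token, "the", "movie", "great", unknown-word token
encoded_review = [1, 4, 5, 6, 2]
print(decode_review(encoded_review, toy_word_index))  # ? the movie great ?
```

The reserved tokens fall outside the vocabulary after the shift, which is why they show up as `?` in the decoded text.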


In [None]:
## pad (or truncate) every review to exactly max_len tokens so they can be batched
max_len = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_len)
X_test = sequence.pad_sequences(X_test, maxlen=max_len)
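By default `pad_sequences` pads with zeros at the front (`padding='pre'`) and, for reviews longer than `maxlen`, drops tokens from the front (`truncating='pre'`). The pure-Python sketch below only mimics that default behaviour for illustration; `pad_pre` is a hypothetical helper, not a Keras function.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic the pad_sequences defaults: padding='pre', truncating='pre'."""
    seq = list(seq)[-maxlen:]                     # keep only the last maxlen tokens
    return [value] * (maxlen - len(seq)) + seq    # left-pad with the fill value


short_review = [11, 12, 13]
long_review = [21, 22, 23, 24, 25, 26]

print(pad_pre(short_review, maxlen=5))  # [0, 0, 11, 12, 13]
print(pad_pre(long_review, maxlen=5))   # [22, 23, 24, 25, 26]
```

Padding at the front means the real words sit at the end of the sequence, closest to the final RNN step whose output feeds the classifier.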

In [None]:
X_train[0]

In [None]:
model = Sequential()
model.add(Embedding(input_dim=max_features, output_dim=128, input_shape=(max_len,)))  # Set input_shape
model.add(SimpleRNN(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.summary()
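The parameter counts reported by `model.summary()` can be checked by hand. This is a small arithmetic sketch using the layer sizes from the cell above; the variable names are only for illustration.

```python
vocab_size = 10000   # max_features
embed_dim = 128      # Embedding output_dim
rnn_units = 128      # SimpleRNN units

# Embedding: one embed_dim-sized vector per vocabulary entry
embedding_params = vocab_size * embed_dim

# SimpleRNN: input weights + recurrent weights + one bias per unit
rnn_params = rnn_units * (embed_dim + rnn_units + 1)

# Dense(1): one weight per RNN unit plus a single bias
dense_params = rnn_units * 1 + 1

total_params = embedding_params + rnn_params + dense_params
print(embedding_params, rnn_params, dense_params, total_params)
# 1280000 32896 129 1313025
```

Almost all of the parameters live in the embedding table, so `max_features` and `output_dim` dominate the model size.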


In [None]:
# create an instance of the early stopping callback
from tensorflow.keras.callbacks import EarlyStopping
early_stopping=EarlyStopping(monitor='val_loss',patience = 5,restore_best_weights = True)
early_stopping
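To make `patience` concrete, here is a simplified sketch of the bookkeeping behind early stopping. It is not the Keras implementation (which also handles options such as `min_delta`), and the validation losses are made-up numbers, not results from this model.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 0-based, for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # training would stop here
    return len(val_losses) - 1, best_epoch  # ran through every epoch


# Hypothetical validation losses for 10 epochs
losses = [0.60, 0.45, 0.40, 0.42, 0.41, 0.43, 0.44, 0.46, 0.50, 0.52]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=5)
print(stop_epoch, best_epoch)  # 7 2
```

With `restore_best_weights=True`, the model would end up with the weights from the best epoch (epoch 2 in this toy run) rather than those from the epoch where training stopped.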

In [None]:
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])

An epoch is one complete pass through the entire training dataset. During each epoch, the model processes every training sample once (in batches) and adjusts its weights after each batch.

### Why are epochs important?
Too few epochs → The model may not learn enough, leading to underfitting.

Too many epochs → The model may memorize the training data, causing overfitting.

Optimal epochs → Typically determined by monitoring validation loss, for example with early stopping.

### Epochs vs. Iterations vs. Batches
Batch: A subset of training samples processed at once.

Iteration: One update of model parameters (one batch processed).

Epoch: One full cycle through the dataset.
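The relationship between the three terms can be worked out for this notebook. The sketch assumes the 25,000 IMDB training reviews and the settings used in the `model.fit` call (`batch_size=32`, `validation_split=0.2`, `epochs=10`).

```python
import math

n_samples = 25000         # IMDB training reviews
validation_split = 0.2
batch_size = 32
epochs = 10

n_val = round(n_samples * validation_split)   # samples held out for validation
n_train = n_samples - n_val                   # samples actually used for weight updates

iterations_per_epoch = math.ceil(n_train / batch_size)
max_total_iterations = iterations_per_epoch * epochs

print(n_train, iterations_per_epoch, max_total_iterations)  # 20000 625 6250
```

So one epoch here is 625 weight updates, and the full run is at most 6,250 updates (fewer if early stopping ends training sooner).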

### batch_size
`batch_size` is the number of training samples processed before the model updates its weights.

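`validation_split=0.2` reserves 20% of the training data for validation. Keras takes this slice from the end of the arrays, before any shuffling. The validation data is not used for training; it is evaluated after each epoch to assess the model’s performance on unseen samples.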
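A rough sketch of how such a split picks its samples, assuming (as Keras does for array inputs) that the validation portion is the last fraction of the data. `split_tail` is a hypothetical helper written only for this illustration.

```python
def split_tail(samples, validation_split):
    """Split samples so the last validation_split fraction becomes the validation set."""
    split_at = int(len(samples) * (1 - validation_split))
    return samples[:split_at], samples[split_at:]


samples = list(range(10))            # stand-in for 10 training reviews
train_part, val_part = split_tail(samples, 0.2)

print(train_part)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(val_part)    # [8, 9]
```

Because the split is positional, data that is sorted by label should be shuffled before calling `fit` with `validation_split`.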

In [None]:
# train the model with early stopping

model.fit(
    X_train, y_train,
    epochs=10,
    batch_size=32,
    validation_split=0.2,
    callbacks=[early_stopping],
)

In [None]:
model.save('SimpleRNN_imdb.h5')