## Lab-4.1: Text classification

**Submission:**

* You need to upload ONE document to Canvas when you are done
  * (1) A PDF (or HTML) of the completed form of this notebook
* The final uploaded version should NOT have any code-errors present
* All outputs must be visible in the uploaded version, including code-cell outputs, images, graphs, etc

**Instructions**

* In a Jupyter notebook named `lab-4.1.ipynb`
  * Train both an LSTM and a GRU on the IMDB dataset
  * Split the data into training, validation, and test sets
  * Manually tune the hyper-parameters and ANN architecture to get the highest accuracy that you can
  * **You only need to do this week's example with Keras (i.e. no PyTorch)**
  * Normalize the data as needed
  * Visualize the results at the end where possible
  * Partition data into training, validation, and test
  * Monitor training and validation throughout training by plotting
  * Print training, validation, and test errors at the very end
  * You `MUST` use early stopping: [click here](https://keras.io/api/callbacks/early_stopping/)
  * Do `MANUAL` hyper parameter tuning to try to achieve an optimal fit model
    * i.e. best training/validation loss without over-fitting
    * Explore L1 and L2 regularization and dropout
    * Explore different optimizers
    * Use the loss functions specified in the textbook
    * Explore different options for activation functions, network size/depth, etc
* **Document what is going on in the code, as needed, with narrative markdown text between cells.**
* *Submit the version with hyper parameters that provide the optimal fit*
  * i.e. you don't need to show the outputs of your hyper-parameter tuning process
  * See the Chollet textbook for reference code


# Hyperparameters 

All hyperparameters are defined in one place to make manual tuning of the models easier.

In [1]:
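max_features = 10000  # vocabulary size: keep only the 10,000 most frequent words
maxlen = 500          # pad or truncate every review to 500 tokens
batch_size = 32
embedding_dim = 50    # length of each word-embedding vector
epochs = 10           # upper bound; early stopping may end training sooner
lstm_units = 32
gru_units = 32
dropout_rate = 0.5    # used for both input dropout and recurrent dropout
l1_reg = 0.01         # L1 penalty coefficient on the recurrent layer's input kernel
l2_reg = 0.01         # L2 penalty coefficient on the same kernel
optimizer = 'adam'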

# Data

In [2]:
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

2024-04-03 12:20:23.697014: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-03 12:20:23.725320: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [3]:
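# Load IMDB reviews as integer sequences, keeping only the max_features most frequent words
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

# Pad or truncate every review to exactly maxlen tokens
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)

# Hold out the first 10,000 training reviews for validation; train on the rest
x_val = x_train[:10000]
y_val = y_train[:10000]
x_train_part = x_train[10000:]
y_train_part = y_train[10000:]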
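
`pad_sequences` is called above with its default settings, which pad and truncate at the start of each sequence rather than the end. The plain-Python sketch below imitates that default behaviour for a single sequence. `pad_pre` is a hypothetical helper written only for illustration, not part of Keras, and it assumes `maxlen >= 1`.

```python
def pad_pre(tokens, maxlen, value=0):
    """Imitate the pad_sequences defaults (padding='pre', truncating='pre') for one sequence."""
    tokens = list(tokens)[-maxlen:]                  # truncation keeps the LAST maxlen tokens
    return [value] * (maxlen - len(tokens)) + tokens  # padding zeros go at the FRONT


short_review = pad_pre([5, 6, 7], maxlen=5)
long_review = pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short_review)  # [0, 0, 5, 6, 7]
print(long_review)   # [3, 4, 5, 6, 7]
```

With `maxlen = 500`, reviews longer than 500 tokens therefore lose their opening words, and shorter reviews are left-padded with zeros.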

In [4]:
print(tf.__version__)

2.16.1


# Build Models

In [5]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, GRU, Dense, Dropout
from tensorflow.keras.regularizers import l1_l2

## LSTM

In [6]:
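def build_lstm_model():
    model = Sequential()
    # Declare the input shape with an explicit Input object;
    # Keras 3 warns when input_shape is passed to a layer instead.
    model.add(tf.keras.Input(shape=(maxlen,)))
    model.add(Embedding(max_features, embedding_dim))
    # Single LSTM layer with input/recurrent dropout and an L1+L2 penalty on the input kernel
    model.add(LSTM(lstm_units, dropout=dropout_rate, recurrent_dropout=dropout_rate,
                   kernel_regularizer=l1_l2(l1=l1_reg, l2=l2_reg)))
    # Sigmoid output for binary sentiment classification
    model.add(Dense(1, activation='sigmoid'))
    return model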
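
The `kernel_regularizer=l1_l2(...)` argument adds a penalty to the training loss based on the layer's input kernel weights. The sketch below computes that penalty in plain Python for a handful of weights. `l1_l2_penalty` is a hypothetical helper and the weight values are made up for illustration.

```python
def l1_l2_penalty(weights, l1=0.01, l2=0.01):
    """Combined penalty: l1 * sum(|w|) + l2 * sum(w^2) over the given weights."""
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = l2 * sum(w * w for w in weights)
    return l1_term + l2_term


# Made-up weights, purely illustrative
toy_weights = [0.5, -0.5, 1.0]
penalty = l1_l2_penalty(toy_weights)
print(f"Penalty added to the loss: {penalty:.3f}")  # 0.02 + 0.015 = 0.035
```

Because the penalty is summed over every kernel weight, it can grow large relative to the cross-entropy term. The first-epoch loss of about 5.86 reported in the training output, well above the ln 2 ≈ 0.69 expected from cross-entropy alone at chance accuracy, is consistent with the penalty dominating, although that is an inference rather than something measured in this notebook. Smaller values of `l1_reg` and `l2_reg` may be worth exploring during tuning.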

In [7]:
lstm_model = build_lstm_model()
lstm_model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

2024-04-03 12:20:29.038600: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-03 12:20:29.041723: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...


## GRU

In [8]:
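def build_gru_model():
    model = Sequential()
    # Declare the input shape with an explicit Input object;
    # Keras 3 warns when input_shape is passed to a layer instead.
    model.add(tf.keras.Input(shape=(maxlen,)))
    model.add(Embedding(max_features, embedding_dim))
    # Single GRU layer with input/recurrent dropout and an L1+L2 penalty on the input kernel
    model.add(GRU(gru_units, dropout=dropout_rate, recurrent_dropout=dropout_rate,
                  kernel_regularizer=l1_l2(l1=l1_reg, l2=l2_reg)))
    # Sigmoid output for binary sentiment classification
    model.add(Dense(1, activation='sigmoid'))
    return model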

In [9]:
gru_model = build_gru_model()
gru_model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
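
When tuning network size, it can help to see how `embedding_dim`, `lstm_units`, and `gru_units` translate into trainable parameters. The sketch below applies the standard parameter-count formulas in plain Python. The helper names are hypothetical, and the GRU formula assumes the Keras default `reset_after=True`, which uses two bias vectors per gate.

```python
def embedding_params(vocab_size, dim):
    return vocab_size * dim


def lstm_params(input_dim, units):
    # 4 gates, each with an input kernel, a recurrent kernel, and one bias vector
    return 4 * (units * (input_dim + units) + units)


def gru_params(input_dim, units):
    # 3 gates; assumes reset_after=True, which gives two bias vectors per gate
    return 3 * (units * (input_dim + units) + 2 * units)


def dense_params(input_dim, outputs):
    return outputs * (input_dim + 1)


# Values copied from the hyperparameter cell
max_features, embedding_dim, lstm_units, gru_units = 10000, 50, 32, 32

shared = embedding_params(max_features, embedding_dim)
lstm_total = shared + lstm_params(embedding_dim, lstm_units) + dense_params(lstm_units, 1)
gru_total = shared + gru_params(embedding_dim, gru_units) + dense_params(gru_units, 1)
print(f"LSTM model parameters: {lstm_total}")  # 510657
print(f"GRU model parameters:  {gru_total}")   # 508097
```

Under these assumptions the embedding layer accounts for 500,000 of the parameters in either model, so changing `max_features` or `embedding_dim` affects model size far more than changing the number of recurrent units. Calling `model.summary()` on the built models is the direct way to confirm the counts.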

# Early Stopping

In [10]:
from tensorflow.keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(monitor='val_loss', patience=2, verbose=1, mode='min', restore_best_weights=True)
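
To make the effect of `patience=2` concrete, the sketch below replays a list of validation losses through a simplified version of the patience rule in plain Python. `simulate_early_stopping` is a hypothetical helper, not the Keras implementation: it ignores `min_delta`, baselines, and weight restoration, and the loss values are made up for illustration.

```python
def simulate_early_stopping(val_losses, patience=2):
    """Return (stop_epoch, best_epoch), both 1-indexed, for a simplified patience rule."""
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # training would stop here
    return len(val_losses), best_epoch    # ran out of epochs before patience was exhausted


# Made-up validation losses, purely illustrative
stop_epoch, best_epoch = simulate_early_stopping([0.60, 0.45, 0.40, 0.42, 0.41, 0.39], patience=2)
print(stop_epoch, best_epoch)  # 5 3
```

In this toy run, training stops after epoch 5 and never reaches the lower loss at epoch 6, which is the trade-off a small `patience` makes. With `restore_best_weights=True`, the model would then be rolled back to the weights from epoch 3.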

Quick check to confirm whether TensorFlow can see a GPU.

In [11]:
#print tensorflow version
print(tf.__version__)

2.16.1


In [12]:
import tensorflow as tf

physical_devices = tf.config.list_physical_devices('GPU')
if len(physical_devices) > 0:
    print(f"Num GPUs Available: {len(physical_devices)}")
    for device in physical_devices:
        print(device)
else:
    print("Not found any GPU. Falling back to CPU.")


Not found any GPU. Falling back to CPU.


In [13]:
for gpu in physical_devices:
    tf.config.experimental.set_memory_growth(gpu, True)

In [14]:
from tensorflow.python.client import device_lib

def get_available_devices():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]

print(get_available_devices())


['/device:CPU:0']


2024-04-03 12:13:14.495996: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-03 12:13:14.496111: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...


# Training

In [11]:
lstm = lstm_model.fit(x_train_part, y_train_part, epochs=epochs, batch_size=batch_size, validation_data=(x_val, y_val), callbacks=[early_stopping])
gru = gru_model.fit(x_train_part, y_train_part, epochs=epochs, batch_size=batch_size, validation_data=(x_val, y_val), callbacks=[early_stopping])

Epoch 1/10


 42/469 [=>............................] - ETA: 55s - loss: 5.8631 - accuracy: 0.5045

KeyboardInterrupt: 

In [None]:
import matplotlib.pyplot as plt

# Plot training and validation accuracy and loss for one training run
def plot_history(history, title=''):
    # With metrics=['accuracy'], the history keys are 'accuracy' and 'val_accuracy'
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    # Named epoch_range to avoid shadowing the global `epochs` hyperparameter
    epoch_range = range(1, len(acc) + 1)

    plt.figure(figsize=(12, 6))

    plt.subplot(1, 2, 1)
    plt.plot(epoch_range, acc, 'bo', label='Training acc')
    plt.plot(epoch_range, val_acc, 'b', label='Validation acc')
    plt.title(f'Training and Validation Accuracy {title}')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(epoch_range, loss, 'bo', label='Training loss')
    plt.plot(epoch_range, val_loss, 'b', label='Validation loss')
    plt.title(f'Training and Validation Loss {title}')
    plt.legend()

    plt.show()

#plot LSTM model results
plot_history(lstm, title='(LSTM)')

#plot GRU model results
plot_history(gru, title='(GRU)')

#evaluate models on test set
lstm_test_loss, lstm_test_acc = lstm_model.evaluate(x_test, y_test)
gru_test_loss, gru_test_acc = gru_model.evaluate(x_test, y_test)

print(f'LSTM Test Accuracy: {lstm_test_acc}\nGRU Test Accuracy: {gru_test_acc}')
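
The instructions ask for training, validation, and test errors at the very end, while the cell above reports test accuracy only. One possible way to recover the other two from a training history is sketched below in plain Python. `best_epoch_errors` is a hypothetical helper, it assumes the history dictionary uses the keys `accuracy`, `val_accuracy`, and `val_loss`, and the numbers are made up for illustration rather than taken from this notebook.

```python
def best_epoch_errors(history_dict):
    """Return (epoch, train_error, val_error) at the epoch with the lowest val_loss."""
    val_losses = history_dict["val_loss"]
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    train_error = 1.0 - history_dict["accuracy"][best_index]
    val_error = 1.0 - history_dict["val_accuracy"][best_index]
    return best_index + 1, train_error, val_error


# Made-up numbers, purely illustrative; not results from this notebook
example_history = {
    "accuracy": [0.70, 0.82, 0.88, 0.91],
    "val_accuracy": [0.75, 0.84, 0.86, 0.85],
    "val_loss": [0.55, 0.40, 0.36, 0.39],
}
epoch, train_err, val_err = best_epoch_errors(example_history)
print(f"Best epoch: {epoch}, training error: {train_err:.2f}, validation error: {val_err:.2f}")
```

In the notebook this would be called as `best_epoch_errors(lstm.history)` or `best_epoch_errors(gru.history)`. Training accuracy recorded in the history is averaged over each epoch with dropout active, so calling `evaluate` on the training and validation arrays is the more direct route when exact final figures are needed.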
