
In [None]:
!wget https://raw.githubusercontent.com/doantronghieu/DEEP-LEARNING/main/helper_DL.py
!pip install colorama
import matplotlib.pyplot as plt
plt.rcParams.update({'font.size':15})
import seaborn as sns
sns.set()
import helper_DL as helper

# Ungraded Lab: Using Convolutional Neural Networks

In this lab, you will build your text classification model in another way: with a convolution layer. As you learned in Course 2 of this specialization, convolutions extract features by applying filters to the input. The next sections show how to apply that idea to text data.

## Download and prepare the dataset

In [None]:
import tensorflow_datasets as tfds

# Download the pretokenized, subword-encoded dataset
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info = True, as_supervised = True)

# Get the tokenizer
tokenizer = info.features['text'].encoder

In [None]:
BUFFER_SIZE = 10000
BATCH_SIZE  = 256

# Get the train and test splits
train_data, test_data = dataset['train'], dataset['test']

# Shuffle the training data
train_dataset = train_data.shuffle(BUFFER_SIZE)

# Batch the datasets, padding each batch to the length of its longest sequence
train_dataset = train_dataset.padded_batch(BATCH_SIZE)
test_dataset  = test_data    .padded_batch(BATCH_SIZE)
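
When `padded_batch` is called without explicit shapes, each batch is padded only up to the longest sequence in that batch, with zeros as the default padding value. The sketch below imitates that behavior in plain Python. It is an illustration, not the `tf.data` implementation: `pad_batches` and `toy_sequences` are hypothetical names, and the token IDs are made up.

```python
def pad_batches(sequences, batch_size, pad_value=0):
    """Group sequences into batches and pad each batch to its own longest sequence."""
    batches = []
    for start in range(0, len(sequences), batch_size):
        batch = sequences[start:start + batch_size]
        longest = max(len(seq) for seq in batch)
        # Right-pad every sequence in this batch up to the batch's longest length
        batches.append([seq + [pad_value] * (longest - len(seq)) for seq in batch])
    return batches

# Hypothetical token-ID sequences standing in for encoded reviews
toy_sequences = [[5, 2], [7, 1, 4, 9], [3], [8, 6, 2]]
toy_batches = pad_batches(toy_sequences, batch_size=2)

for batch in toy_batches:
    print(batch)
# [[5, 2, 0, 0], [7, 1, 4, 9]]
# [[3, 0, 0], [8, 6, 2]]
```

Note that the two batches end up with different widths (4 and 3), which is why the model has to accept variable-length sequences.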

## Build the Model

In Course 2, you used 2D convolution layers because you were working with images. For temporal data such as text sequences, you will use [Conv1D](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv1D) instead, so the convolution slides along a single dimension (time). You will also append a pooling layer to reduce the output of the convolution layer. For this lab, you will use [GlobalMaxPooling1D](https://www.tensorflow.org/api_docs/python/tf/keras/layers/GlobalMaxPool1D), which takes the maximum value across the time dimension for each filter. You can also use average pooling, and you will do that in the next labs. The cell below shows how these layers behave on their own.

In [None]:
import tensorflow as tf
import tensorflow.keras as tfk
from tensorflow import nn
from tensorflow.keras import layers, losses, optimizers, models, Model
import numpy as np

In [None]:
# Hyperparameters
BATCH_SIZE  = 1   # Batch size
TIMESTEPS   = 20  # Sequence length
FEATURES    = 20  # Embedding size
FILTERS     = 128
KERNEL_SIZE = 5

# Define array input with random values
random_input = np.random.rand(BATCH_SIZE, TIMESTEPS, FEATURES)
print(f'Shape of input array: {random_input.shape}')

# Pass array to convolution layer and inspect output shape
conv1d = layers.Conv1D(filters = FILTERS, kernel_size = KERNEL_SIZE, activation = nn.relu)
result = conv1d(random_input)
print(f'Shape of conv1d output: {result.shape}')

# Pass the convolution output to the global max pooling layer and inspect output shape
gmp = layers.GlobalMaxPooling1D()
result = gmp(result)
print(f'Shape of global max pooling output: {result.shape}')
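
The shapes printed above follow from simple arithmetic. With the default `padding='valid'` and a stride of 1, the kernel fits in `TIMESTEPS - KERNEL_SIZE + 1 = 20 - 5 + 1 = 16` positions, so the convolution output is `(1, 16, 128)`, and global max pooling collapses the time axis to give `(1, 128)`. The NumPy sketch below reproduces this for a single example without the batch dimension. It is a naive illustration under those assumptions, not how Keras implements the layer; `conv1d_valid` is a hypothetical helper and the weights are random.

```python
import numpy as np

def conv1d_valid(x, kernels, bias):
    """Naive 1D convolution with 'valid' padding and stride 1, followed by ReLU.

    x:       (timesteps, features)
    kernels: (kernel_size, features, filters)
    bias:    (filters,)
    returns: (timesteps - kernel_size + 1, filters)
    """
    kernel_size, _, filters = kernels.shape
    out_steps = x.shape[0] - kernel_size + 1
    out = np.empty((out_steps, filters))
    for t in range(out_steps):
        window = x[t:t + kernel_size]  # (kernel_size, features)
        # Sum over the kernel-position and feature axes, once per filter
        out[t] = np.tensordot(window, kernels, axes=([0, 1], [0, 1])) + bias
    return np.maximum(out, 0.0)  # ReLU

rng = np.random.default_rng(0)
x = rng.random((20, 20))                     # (TIMESTEPS, FEATURES)
kernels = rng.standard_normal((5, 20, 128))  # (KERNEL_SIZE, FEATURES, FILTERS)
bias = np.zeros(128)

conv_out = conv1d_valid(x, kernels, bias)
pooled = conv_out.max(axis=0)                # global max pooling over time

print(conv_out.shape)  # (16, 128)
print(pooled.shape)    # (128,)
```

Each of the 128 filters contributes exactly one number after pooling, regardless of how long the input sequence is.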

You can build the model by appending the convolution and pooling layers after the embedding layer, as shown below.

In [None]:
# Hyperparameters
EMBEDDING_DIM = 64
FILTERS       = 128
KERNEL_SIZE   = 5
DENSE_DIM     = 64

# Build the model
model = models.Sequential([
    layers.Embedding(tokenizer.vocab_size, EMBEDDING_DIM),
    layers.Conv1D(filters = FILTERS, kernel_size = KERNEL_SIZE, activation = nn.relu),
    layers.GlobalMaxPooling1D(),
    layers.Dense(DENSE_DIM, activation = nn.relu),
    layers.Dense(1, activation = nn.sigmoid)
])

model.summary()

# Set the training parameters
model.compile(loss = losses.binary_crossentropy,
              optimizer = optimizers.Adam(),
              metrics = ['accuracy'])
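
The parameter counts reported by `model.summary()` can be checked by hand. The sketch below works through the arithmetic for each layer, assuming every layer uses a bias term (the Keras default for `Conv1D` and `Dense`). The function `count_params` is a hypothetical helper, and the vocabulary size of 8000 is a placeholder; substitute `tokenizer.vocab_size` to match the notebook.

```python
def count_params(vocab_size, embedding_dim, filters, kernel_size, dense_dim):
    """Return the trainable parameter count of each layer in the model above."""
    return {
        # One embedding vector per token in the vocabulary
        'embedding': vocab_size * embedding_dim,
        # Each filter spans kernel_size timesteps x embedding_dim features, plus a bias
        'conv1d': kernel_size * embedding_dim * filters + filters,
        # Global max pooling has no weights; it outputs one value per filter
        'dense_hidden': filters * dense_dim + dense_dim,
        'dense_output': dense_dim * 1 + 1,
    }

# vocab_size=8000 is a placeholder, not the tokenizer's actual value
params = count_params(vocab_size=8000, embedding_dim=64,
                      filters=128, kernel_size=5, dense_dim=64)
total = sum(params.values())

for name, count in params.items():
    print(f'{name}: {count}')
print(f'total: {total}')
```

The embedding layer dominates the total, while the convolution layer's size depends only on the kernel size, embedding dimension, and number of filters, not on the sequence length.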

## Train the model

Training takes around 30 seconds per epoch, though the exact time depends on your hardware. You should see the model reach higher accuracy than the previous models you've built.

In [None]:
NUM_EPOCHS = 10

history = model.fit(train_dataset, epochs = NUM_EPOCHS, validation_data = test_dataset)

In [None]:
helper.plot_history_curves(history)

## Wrap Up

In this lab, you explored another model architecture you can use for text classification. In the next lessons, you will revisit full word encoding of the IMDB reviews and compare which model works best when the data is prepared that way.