<a href="https://colab.research.google.com/github/soerenml/tensorflow-certificate/blob/master/TF_certificate.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Tensorflow certificate

This notebook follows the tutorials and guides from the official TensorFlow [website](https://www.tensorflow.org/tutorials) and Geron (2019). In most cases, the code is taken from those examples; the annotations and explanations are my own.

The notebook is intended as a general introduction to TF as well as preparation for the certificate.

# 0) Load packages

In [None]:
# Install relevant packages first
import tensorflow as tf
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Cleaner model building
from tensorflow.keras import layers
from tensorflow.keras import Sequential
from tensorflow.keras import losses

# Text classification
import os
import shutil
import re
import string

%load_ext tensorboard

print("Tensorflow Version: {version}".format(version=tf.__version__))

# Make the program deterministic
# https://www.tensorflow.org/api_docs/python/tf/random/set_seed
tf.random.set_seed(11)

**Check the hardware**

In [None]:
# An empty list means that no GPU is visible to TensorFlow
print("Is there a GPU available:", tf.config.list_physical_devices("GPU"))

# 1) Getting familiar with Tensors

Different kinds of tensors exist: <code>tf.Variable, tf.constant, tf.SparseTensor, and tf.RaggedTensor</code>. (<code>tf.placeholder</code> belongs to TF 1.x and is only available as <code>tf.compat.v1.placeholder</code> in TF 2.)

Tensors differ from NumPy arrays in that:
+ they can be placed in accelerator memory (GPU/TPU)
+ they are immutable

The transition between NumPy arrays and tensors is fluid. 
Nevertheless, NumPy arrays always live in host memory while tensors can live in GPU/TPU memory, so a conversion is not always possible without copying the data from the accelerator to the host ([Source](https://www.tensorflow.org/tutorials/customization/basics)).

In [None]:
# Let's create a two dimensional Tensor
x = tf.constant(
    [
     [1,2,3], [4,5,6]
    ]
)

# Tensors attributes
print(x)
print(x.shape)
print(x.dtype)

# Tensors can also be indexed (same as numpy arrays)
print(x[1,:])

**Conversion and performance**

TF does not convert data types automatically, as this can lead to performance problems and unexpected incompatibilities.



In [None]:
zf = tf.constant(18, dtype=tf.float64)
zi = tf.constant(18, dtype=tf.int64)
print(zf, zi)

# Adding the two raises an error because the dtypes differ.
zf + zi

In [None]:
# To fix this, cast one operand so that both share a common dtype
zf + tf.cast(zi, dtype=tf.float64)

**The fluid relationship between tensors and arrays**

TF operations accept NumPy arrays and NumPy operations accept tensors; the conversion happens automatically.

In [None]:
n = np.array([1,2,3,4,5])
print(tf.square(n))
print(np.square(x))

**Variables**

All tensor data types are immutable, except variables. For an overview of different tensor types see Geron 2019 p. 383.

During training in particular, mutable data types are needed (e.g. for updating weights). For such cases variables are used, which can be updated with <code>assign</code>. As with other data types, there is no automatic conversion, so the dtypes must match.

In [None]:
t_var = tf.Variable(10, dtype=tf.int64)
print(t_var)

# Update the variable
t_var.assign(11)

**TF Functions**

TF Functions can be used to turn Python code into a TF graph. Keep in mind that only TF constructs can be used, so the operations must be convertible to TF.

In [None]:
@tf.function
def tf_power(x):
  return x**11

# Run the tensorflow function
print(tf_power(7))

# Or simply run it as a Python function
print(tf_power.python_function(7))

# 2) Preprocessing data

Data preprocessing is pivotal and one of the most complicated parts of Tensorflow. There are two core components: tf.transform and tf

In [None]:
X = tf.range(10)
dataset = tf.data.Dataset.from_tensor_slices(X)
print(dataset)

# Iterate over the dataset
for item in dataset:
  print(item)

**Shuffle the dataset**

Geron (2019) uses the analogy of shuffling a deck of cards that is sorted with the highest cards first. Ideally you shuffle the whole deck and then draw a card. What you do not want to do is take only the first three cards, shuffle them, draw one, add the next card (number four) and repeat. The reason is simple: if the deck starts out sorted, the 'shuffled' deck still has roughly descending card values (higher cards first, lower cards at the end).

Long story short: set the buffer size to at least the number of observations in the dataset (memory permitting).
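
The card-deck argument can be checked with a small pure-Python simulation. This is only a sketch of the buffering idea, not TensorFlow's implementation; `buffered_shuffle` is a hypothetical helper written for this illustration.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Emit items via a fixed-size buffer: pick one at random, then refill the slot."""
    rng = random.Random(seed)
    buffer, output = [], []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)          # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        output.append(buffer[idx])       # emit a random buffered item
        buffer[idx] = item               # refill the freed slot
    rng.shuffle(buffer)                  # drain whatever is left at the end
    output.extend(buffer)
    return output

deck = list(range(100))                  # a sorted "deck"
small = buffered_shuffle(deck, buffer_size=3)
full = buffered_shuffle(deck, buffer_size=len(deck))

# With a buffer of 3, the value at position p can never be larger than p + 2,
# so the output stays close to the original order.
print(small[:10])
print(full[:10])
```

With the small buffer, early positions can only hold early cards, which is exactly the "still sorted" effect described above. With the full-size buffer, any card can appear anywhere.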

In [None]:
X = tf.range(10)
dataset = tf.data.Dataset.from_tensor_slices(X)

# When passing a function to map, do not call it (no parentheses)
def pre(x): 
  return x*2

# Let's add additional transformation
dataset = (dataset
           .map(pre)
           .shuffle(buffer_size=30) # Shuffle the dataset
           .repeat(3) # repeat the dataset three times
           .batch(7) # take seven observations from each dataset
           )

# Important: we repeat the dataset three times (i.e. 30 items)
# This is why we get four batches of seven and one batch of two.
for item in dataset:
  print(item)
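
The batch sizes printed above follow from simple integer division. A minimal sketch of that arithmetic, with variable names chosen only for this illustration:

```python
n_items, repeats, batch_size = 10, 3, 7

total = n_items * repeats                          # 30 items after repeat(3)
full_batches, remainder = divmod(total, batch_size)

batch_sizes = [batch_size] * full_batches
if remainder:
    batch_sizes.append(remainder)                  # the last, smaller batch

print(batch_sizes)  # [7, 7, 7, 7, 2]
```

If the smaller final batch is unwanted, `batch` also accepts `drop_remainder=True`.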

**First, let's get some data from GitHub**

In [None]:
sets = ["train.csv", "test.csv"]

url = "https://raw.githubusercontent.com/agconti/kaggle-titanic/master/data/train.csv"
# on_bad_lines requires pandas >= 1.3 (older versions use error_bad_lines=False)
df_train = pd.read_csv(url, on_bad_lines="skip")

In [None]:
# Reorder the columns
cols = df_train.columns.tolist()
cols = cols[2:] + cols[0:1] + cols[1:2]
df_train = df_train[cols]
df_train = df_train.drop(columns=['Name'])
df_train.to_csv('./train.csv', index=False)
df_train.head()

**Working with a high-speed input pipeline**

A lot of different things are happening here at the same time:

+ <code>prefetch(1)</code> - one batch is always kept ready. Hence, while one batch is being consumed by the training cycle, the next one is already being loaded.
+ <code>num_parallel_calls</code> - runs the preprocessing function on several elements in parallel

In [None]:
filepath="/content/train.csv"

n_inputs = 10
def preprocess(line):
  """
  Preprocessing function for the csv file.

  Parameters: 
    line: row of a csv file
  Return:
    x: features
    y: target
  """
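  # NOTE: float defaults only parse numeric columns; string columns (e.g. Sex)
  # must be encoded or dropped before this function can read them.
  defs = [0.] * n_inputs + [tf.constant([], dtype=tf.int32)] # define default values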
  fields = tf.io.decode_csv(records=line, record_defaults=defs) # read csv file
  x = tf.stack(fields[:-1])
  y = tf.stack(fields[-1:])
  return x, y

def csv_reader_dataset(filepath):
  """
  Create tf.data.Dataset
  
  Parameters:
    filepath: path of the file to read
  Return:
    tensorflow dataset object
  """
  dataset=tf.data.TextLineDataset(filepath).skip(1) # Read the file line by line, skip the header
  dataset=dataset.shuffle(buffer_size=30) # Shuffle the dataset 
  dataset=dataset.map(preprocess,
                      num_parallel_calls=tf.data.experimental.AUTOTUNE) # Run preprocessing
  dataset=dataset.repeat(1) # Go through the dataset once
  return dataset.batch(20).prefetch(1) # Select batch and prefetch

train_set = csv_reader_dataset(filepath)
print(train_set)


# Each element is a (features, target) pair
for x_batch, y_batch in train_set.take(1):
  print(x_batch.numpy(), y_batch.numpy())

# 3) Working with structured data (TODO)

In [None]:
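# Load dataset
sets = ["train.csv", "test.csv"]
for i in sets:
  url = "https://raw.githubusercontent.com/agconti/kaggle-titanic/master/data/{dataset}".format(dataset = i)
  # on_bad_lines requires pandas >= 1.3 (older versions use error_bad_lines=False)
  # index=False keeps the column positions used in preprocess() aligned with the file
  pd.read_csv(url, on_bad_lines="skip").to_csv("./{dataset}".format(dataset = i), index=False)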

filepath="/content/train.csv"

# Define preprocessing function
n_inputs = 10
def preprocess(line):
  """
  Select feature and target columns by position.
  Plain list indexing is used because NumPy conversion does not work
  on the symbolic tensors that Dataset.map passes in.
  """
  defs = [0.] * n_inputs + [tf.constant([])] # define default values
  fields = tf.io.decode_csv(records=line, record_defaults=defs) # parse one csv line
  x = tf.stack([fields[i] for i in (0,2,3,4,5,6,7,8,9,10)])
  y = tf.stack([fields[1]])
  return x, y

def csv_reader_dataset(filepath):
  """
  Create tf.data.Dataset
  """
  dataset=tf.data.TextLineDataset(filepath).skip(1) # Read the file line by line, skip the header
  dataset=dataset.shuffle(buffer_size=30) # Shuffle the dataset 
  dataset=dataset.map(preprocess,
                      num_parallel_calls=tf.data.experimental.AUTOTUNE) # Run preprocessing
  dataset=dataset.repeat(3) # Repeat the dataset three times
  return dataset.batch(7).prefetch(1) # Select batch and prefetch

train_set = csv_reader_dataset(filepath)
train_set

# 4) Image recognition

## Load data

In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

## Getting familiar with data shapes

Most of ML is linear algebra. Hence, it is important to become familiar with vectors, matrices, tensors and multidimensional spaces in general. Our data essentially comes in two shapes: three-dimensional tensors containing the features and one-dimensional tensors (vectors) containing the targets, each for training and for testing.

In our case the training features form a 60000x28x28 tensor. You can think of 60000 as the number of images and 28x28 as the image grid (x, y positions of the pixels). The training targets form a vector of length 60000.

P.S.: Because each image is only a two-dimensional grid, our images are grayscale, i.e. we lack a fourth dimension representing the colour channels (RGB).
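
The shapes described above can be reproduced with NumPy stand-ins. This sketch uses zero-filled arrays and only 100 images instead of 60,000 to stay light; the variable names are illustrative.

```python
import numpy as np

# Stand-ins with the same layout as the MNIST arrays (zeros instead of real pixels)
features = np.zeros((100, 28, 28), dtype=np.uint8)
targets = np.zeros((100,), dtype=np.uint8)

print(features.ndim, features.shape)   # 3 (100, 28, 28)
print(targets.ndim, targets.shape)     # 1 (100,)
print(features[0].shape)               # (28, 28): a single grayscale image

# Convolutional layers expect an explicit channel axis:
grayscale = features[..., np.newaxis]            # (100, 28, 28, 1)
rgb_like = np.repeat(grayscale, 3, axis=-1)      # (100, 28, 28, 3)
print(grayscale.shape, rgb_like.shape)
```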

**Source for the plots can be found [here](https://matplotlib.org/3.1.0/gallery/pyplots/fig_axes_labels_simple.html).**

### Formal description of datashapes

In [None]:
# This is an example of how a three-dimensional tensor is structured.
# You can imagine it as several matrices.

x = np.array(
    [
     [
      [1,2,3],
      [4,5,6],
      [7,8,9]
     ],
     [
      [1,2,3],
      [4,5,6],
      [7,8,9]
     ]
    ]
    )

print(x)

# The shape indicates two matrices with a 3x3 dimension.
print(x.shape)

### A visual description of datashapes

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D 
# Change styles of graphs
import matplotlib as mpl
# The 'seaborn' style is called 'seaborn-v0_8' in newer matplotlib releases
mpl.style.use('seaborn-v0_8' if 'seaborn-v0_8' in mpl.style.available else 'seaborn')

print("Properties of the training tensor (x_train): {}".format(x_train.shape))

x, y, z = np.indices((28, 100, 28))
cube1 = (x < 28) & (y < 1) & (z < 28)
voxels = cube1 
colors = np.empty(voxels.shape, dtype=object)
colors[cube1] = 'blue'
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d') # replaces fig.gca(projection='3d'), which newer matplotlib rejects
ax.voxels(voxels, facecolors=colors, edgecolor='k')
plt.show()

In [None]:
x, y, z = np.indices((28, 100, 28))
cube1 = (x < 1) & (y < 1) & (z < 100)
voxels = cube1 
colors = np.empty(voxels.shape, dtype=object)
colors[cube1] = 'blue'
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d') # replaces fig.gca(projection='3d'), which newer matplotlib rejects
ax.voxels(voxels, facecolors=colors, edgecolor='k')
plt.show()

In [None]:
plt.imshow(x_train[0,:,:], cmap='gray')
plt.show()
print("Number: {}".format(y_train[0]))

## Creating the model

**Excursus: numerical computation**

We are not using a softmax function for the last layer in our model. The reason is that a naively computed softmax is not necessarily numerically stable. A computer represents numbers with finite precision. Very small values are rounded to 0 (underflow) and very large values become $\infty$ (overflow), which leads to complications with softmax:

$\mathrm{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$


Btw. if you ever asked yourself 'why are we using the log-likelihood'... for the same reason. The log helps us to avoid underflow.

[Source](http://www.deeplearningbook.org/contents/numerical.html)
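
Both effects can be reproduced in a few lines of NumPy. This is a sketch for illustration; `softmax_naive` and `softmax_stable` are hypothetical helper names, and subtracting the maximum is the standard stabilisation trick rather than anything specific to this notebook.

```python
import numpy as np

def softmax_naive(x):
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # Subtracting the maximum does not change the result mathematically,
    # but keeps every exponent <= 0, so exp() cannot overflow.
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([1000.0, 1001.0, 1002.0])

with np.errstate(over="ignore", invalid="ignore"):
    naive = softmax_naive(logits)     # exp(1000) overflows to inf, inf / inf = nan
stable = softmax_stable(logits)       # roughly [0.09, 0.245, 0.665]

print(naive)
print(stable)

# Underflow and the log-likelihood: a product of many small probabilities
# collapses to 0.0, while the sum of their logs stays representable.
probs = np.full(1000, 0.01)
product = np.prod(probs)              # 1e-2000 underflows to 0.0
log_sum = np.sum(np.log(probs))       # about -4605.17

print(product, log_sum)
```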

**Layer structure**

Our first layer has an output shape of (None, 784). <code>None</code> appears as the first dimension of every layer: it is the batch dimension, which stays open because the number of examples per batch is not fixed when the model is built. 

Our first layer is a flattening layer that reshapes each 28x28 image into a one-dimensional vector (28x28=784).

Importantly, this layer has no weights, as it only reshapes its input.

The following layers are dense (fully connected) layers. Both have weights; their number follows from the layer shapes: (inputs + 1 bias) x units, i.e. (784+1)x512 and (512+1)x10.
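
The parameter counts that `model.summary()` reports for the model in the next cell (Flatten, Dense(512), Dense(10)) can be checked by hand. A small sketch, with `dense_params` as a hypothetical helper:

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit of a fully connected layer."""
    return (n_inputs + 1) * n_units

flatten_out = 28 * 28                        # 784 values per image, no weights
hidden = dense_params(flatten_out, 512)      # (784 + 1) * 512
output = dense_params(512, 10)               # (512 + 1) * 10

print(hidden)            # 401920
print(output)            # 5130
print(hidden + output)   # 407050
```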

In [None]:
# Build model
model = Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(512, activation="relu"),
    layers.Dense(10) # no softmax: the layer outputs logits (the loss uses from_logits=True)
])

# Show model structure
model.summary()

In [None]:
# Plot the model as .png
from tensorflow.keras.utils import plot_model
plot_model(model,
           to_file='model_plot.png',
           show_shapes=True,
           show_layer_names=True)

The layers are [initialized](https://keras.io/api/layers/initializers/). Hence, from a functional perspective, the model can already be used for prediction.

Nevertheless, the model is not trained yet. Let's see how the untrained model performs.

In [None]:
# Ingest image
predictions_untrained = model(x_train[:1]).numpy()

# Use a softmax function to increase interpretability.
# Softmax functions return 'regular' probabilities (SUM(p(i)) = 1)
predictions_untrained = tf.nn.softmax(predictions_untrained).numpy()
print(predictions_untrained)

print("""
Predicted value should be: {label}\n
Total probabilities sum up to: {sums}""".format(
    label=y_train[0],
    sums=np.sum(predictions_untrained[0])))

In [None]:
# Plot probabilities
def plot_bar(input):
  '''Plotting function for barcharts'''
  objects = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9')
  y_pos = np.arange(len(objects))
  plt.bar(y_pos, input, align='center', alpha=0.5)
  plt.xticks(y_pos, objects)
  plt.ylabel('Likelihood')
  plt.xlabel('Number')
  plt.show()

plot_bar(predictions_untrained[0])

### Use tf.Data for faster ingestion

The main reason to use tf.data is efficiency. For further information see [here](https://www.tensorflow.org/guide/data_performance).

In [None]:
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_ds = train_ds.shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

### Define callbacks

Callbacks are functions that Keras calls at defined points during training, for example at the end of each epoch. They can be used to calculate custom metrics or, even more importantly, to interrupt training.

In [None]:
# Tensorboard callback
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir="./logs")

# Early stopping callback
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    accuracy = (logs or {}).get('accuracy')
    # Stop as soon as the training accuracy exceeds 90%
    if accuracy is not None and accuracy > 0.9:
      print("\n Stopped training: " + str(accuracy))
      self.model.stop_training = True

# Important, the class needs to be instantiated: callbacks = myCallback()

## Train the model

In [None]:
from tensorflow.keras.losses import SparseCategoricalCrossentropy

model.compile(optimizer='adam',
              loss=SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

callback_stop = myCallback()

hist = model.fit(train_ds,
                 validation_data=test_ds, 
                 epochs=100, # setting the epochs up for the early stopping to work.
                 callbacks=[tensorboard_callback, callback_stop])

# Let's have a look into our hist object
print(hist.history)

## Tensorboard

In [None]:
%tensorboard --logdir "./logs"

## Making predictions

In [None]:
# Make pred. using model.predict
example_result = model.predict((x_train[:1]))

# In order to get meaningful probabilities that we can interpret,
# we apply a softmax function.
predictions_trained = tf.nn.softmax(example_result)
plot_bar(predictions_trained[0])

In [None]:
# Let's compare model performance between test and training
loss, accuracy = model.evaluate(x_test,  y_test, verbose=2)

print("""
Model performance training: {train}\n
Model performance test: {test}
""".format(train=hist.history['accuracy'][-1], # accuracy of the last completed epoch
           test=accuracy))

In [None]:
%%bash
rm -r ./logs

## Using CNNs

Let's use CNNs for a similar task, this time on the Fashion MNIST dataset. CNNs are characterized by two key components:

- Convolutional-layers
- Pooling-layers

Convolutional layers are "filter" layers helping to emphasize "features in an image" (see [Kernel](https://en.wikipedia.org/wiki/Kernel_(image_processing)) for further information).
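
Pooling layers are the second component: they shrink a feature map by keeping only one value per small window. A minimal NumPy sketch of 2x2 max pooling, assuming a single-channel image; `max_pool_2x2` is a hypothetical helper, not a Keras function.

```python
import numpy as np

def max_pool_2x2(image):
    """Keep the maximum of every non-overlapping 2x2 block."""
    h2, w2 = image.shape[0] // 2, image.shape[1] // 2
    trimmed = image[:h2 * 2, :w2 * 2]            # drop an odd last row/column
    blocks = trimmed.reshape(h2, 2, w2, 2)       # axes: block row, row in block, block col, col in block
    return blocks.max(axis=(1, 3))

image = np.array([[ 1,  2,  5,  6],
                  [ 3,  4,  7,  8],
                  [ 9, 10, 13, 14],
                  [11, 12, 15, 16]])

pooled = max_pool_2x2(image)
print(pooled)   # [[ 4  8]
                #  [12 16]]
```

Height and width are halved, which is what `MaxPooling2D(2, 2)` does to each feature map.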

In [None]:
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

# TODO - add callback class mycallback(tf.keras.call)


def reshape_norm(train, test):
  """
  Reshape and normalize data
  """
  training_img=train.reshape(60000, 28, 28, 1)
  training_img=training_img / 255.0
  test_img = test.reshape(10000, 28, 28, 1)
  test_img=test_img/255.0
  return training_img, test_img

train_img, test_img = reshape_norm(training_images, test_images)

# Build model
model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(1000, (3,3), activation='relu', input_shape=(28, 28, 1)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2,2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
    )

model.summary()

model.fit(train_img, training_labels, epochs=5)

# evaluate returns the loss followed by the compiled metrics
test_loss, test_acc = model.evaluate(test_img, test_labels)

### How convolutional layers work

TODO - write out the math behind it.
See notebook here: https://colab.research.google.com/github/lmoroney/dlaicourse/blob/master/Course%201%20-%20Part%206%20-%20Lesson%203%20-%20Notebook.ipynb
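
Until the math is written out, here is a minimal NumPy sketch of what a single 3x3 filter does. It slides the kernel over the image and sums the element-wise products (no padding, stride 1). The function name, the toy image and the edge-detecting kernel are assumptions made for this illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark left half, bright right half (a vertical edge)
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# Kernel that responds to dark-to-bright transitions from left to right
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map)   # every row is [0. 3. 3. 0.]
```

The output is large only where the window covers the edge. The same shape rule explains why a 3x3 filter turns a 28x28 input into a 26x26 feature map when no padding is used.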

# ImageGenerator

The image generator is very useful as it automatically labels images according to the folder structure. It is not part of tf.data.
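
The folder-based labelling can be pictured with the standard library alone. This is a sketch of the idea, not the generator's code: `labels_from_folders` and the file names are hypothetical, and class indices are assumed to follow the sorted sub-folder names.

```python
import os
import tempfile

def labels_from_folders(root):
    """Derive (relative path, label) pairs from the sub-folder names of `root`."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d)))
    class_index = {name: idx for idx, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        for filename in sorted(os.listdir(os.path.join(root, name))):
            samples.append((os.path.join(name, filename), class_index[name]))
    return class_index, samples

# Build a tiny fake dataset with empty files, just to show the mapping
with tempfile.TemporaryDirectory() as root:
    layout = {"horses": ["h1.png", "h2.png"], "humans": ["p1.png"]}
    for name, filenames in layout.items():
        os.makedirs(os.path.join(root, name))
        for filename in filenames:
            open(os.path.join(root, name, filename), "w").close()
    class_index, samples = labels_from_folders(root)

print(class_index)   # {'horses': 0, 'humans': 1}
print(samples)
```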

In [None]:
!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/horse-or-human.zip \
    -O /tmp/horse-or-human.zip

In [None]:
import os
import zipfile

local_zip = '/tmp/horse-or-human.zip'
# The context manager closes the archive automatically
with zipfile.ZipFile(local_zip, 'r') as zip_ref:
  zip_ref.extractall('/tmp/horse-or-human')

# Directory with our training horse pictures
train_horse_dir = os.path.join('/tmp/horse-or-human/horses')

# Directory with our training human pictures
train_human_dir = os.path.join('/tmp/horse-or-human/humans')

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.summary()

In [None]:
from tensorflow.keras.optimizers import RMSprop

model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(learning_rate=0.001),
              metrics=['accuracy'])

## Preprocessing

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Flow training images in batches of 128 using train_datagen generator
train_generator = ImageDataGenerator(rescale=1/255).flow_from_directory(
        '/tmp/horse-or-human/',  # This is the source directory for training images
        target_size=(300, 300),  # All images will be resized to 300x300
        batch_size=128,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

In [None]:
history = model.fit(
    train_generator,
    steps_per_epoch=8,  
    epochs=15,
    verbose=1)

### Use the model to predict (TODO)

In [None]:
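import numpy as np
from google.colab import files
from tensorflow.keras.preprocessing import image

uploaded = files.upload()

# Uploaded files are stored under their names in the working directory
for path in uploaded.keys():
  img = image.load_img(path, target_size=(300, 300))
  x = image.img_to_array(img) / 255.0 # same rescaling as during training
  x = np.expand_dims(x, axis=0)
  outcome = model.predict(x)
  # With class_mode='binary' the sigmoid output is the probability of class index 1
  print(path, outcome[0])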

In [None]:
import numpy as np
import random
from tensorflow.keras.preprocessing.image import img_to_array, load_img

# Let's define a new Model that will take an image as input, and will output
# intermediate representations for all layers in the previous model after
# the first.
successive_outputs = [layer.output for layer in model.layers[1:]]
#visualization_model = Model(img_input, successive_outputs)
visualization_model = tf.keras.models.Model(inputs = model.input, outputs = successive_outputs)
# Let's prepare a random input image from the training set.
horse_img_files = [os.path.join(train_horse_dir, f) for f in os.listdir(train_horse_dir)]
human_img_files = [os.path.join(train_human_dir, f) for f in os.listdir(train_human_dir)]
img_path = random.choice(horse_img_files + human_img_files)

img = load_img(img_path, target_size=(300, 300))  # this is a PIL image
x = img_to_array(img)  # Numpy array with shape (300, 300, 3)
x = x.reshape((1,) + x.shape)  # Numpy array with shape (1, 300, 300, 3)

# Rescale by 1/255
x /= 255

# Let's run our image through our network, thus obtaining all
# intermediate representations for this image.
successive_feature_maps = visualization_model.predict(x)

# These are the names of the layers, so we can have them as part of our plot
layer_names = [layer.name for layer in model.layers[1:]]

# Now let's display our representations
for layer_name, feature_map in zip(layer_names, successive_feature_maps):
  if len(feature_map.shape) == 4:
    # Just do this for the conv / maxpool layers, not the fully-connected layers
    n_features = feature_map.shape[-1]  # number of features in feature map
    # The feature map has shape (1, size, size, n_features)
    size = feature_map.shape[1]
    # We will tile our images in this matrix
    display_grid = np.zeros((size, size * n_features))
    for i in range(n_features):
      # Postprocess the feature to make it visually palatable
      x = feature_map[0, :, :, i]
      x -= x.mean()
      x /= x.std()
      x *= 64
      x += 128
      x = np.clip(x, 0, 255).astype('uint8')
      # We'll tile each filter into this big horizontal grid
      display_grid[:, i * size : (i + 1) * size] = x
    # Display the grid
    scale = 20. / n_features
    plt.figure(figsize=(scale * n_features, scale))
    plt.title(layer_name)
    plt.grid(False)
    plt.imshow(display_grid, aspect='auto', cmap='viridis')

# 5) Text classification

## Load Data

In [None]:
url = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
dataset = tf.keras.utils.get_file("aclImdb_v1.tar.gz", url, untar=True, cache_dir='.')
print(dataset)
dataset_dir = os.path.join(os.path.dirname(dataset), 'aclImdb')

In [None]:
%%bash
# Remove the unlabeled reviews so that only the pos and neg class folders remain
rm -r datasets/aclImdb/train/unsup/

In [None]:
#TODO - I do not understand the splitting.

batch_size = 32
seed = 42

# Define training set
raw_train_ds = tf.keras.preprocessing.text_dataset_from_directory(
    'datasets/aclImdb/train', 
    batch_size=batch_size, 
    validation_split=0.2, 
    subset='training', 
    seed=seed)

# Define validation set
raw_val_ds = tf.keras.preprocessing.text_dataset_from_directory(
    'datasets/aclImdb/train', 
    batch_size=batch_size, 
    validation_split=0.2, 
    subset='validation', 
    seed=seed)

# Define test set
raw_test_ds = tf.keras.preprocessing.text_dataset_from_directory(
    'datasets/aclImdb/test', 
    batch_size=batch_size)
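
Regarding the TODO on the splitting: the two calls with `validation_split=0.2` each see the same list of files and keep a different part of it. The sketch below shows the idea in plain Python; it is an assumption about the principle, not Keras's actual code, and `split_files` is a hypothetical helper.

```python
import random

def split_files(filenames, validation_split, subset, seed):
    """Shuffle deterministically, then keep either the first or the last part."""
    files = sorted(filenames)
    random.Random(seed).shuffle(files)           # same seed -> same order
    split_at = len(files) - int(len(files) * validation_split)
    return files[:split_at] if subset == "training" else files[split_at:]

names = ["review_{}.txt".format(i) for i in range(10)]
train = split_files(names, 0.2, "training", seed=42)
val = split_files(names, 0.2, "validation", seed=42)

print(len(train), len(val))   # 8 2
print(set(train) & set(val))  # set(): no overlap
```

This is why both calls need the same `seed`: with different seeds the shuffled order differs and the two subsets could overlap.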

In [None]:
# Have a look at the data
for text_batch, label_batch in raw_train_ds.take(1):
  for i in range(1):
    print("Review", text_batch.numpy()[i])
    print("Label", label_batch.numpy()[i])

## Preprocess the data

- Standardize: clean the data (e.g. remove punctuation and HTML tags)
- Tokenize: split sentences into words
- Vectorize: map each token to an integer index (the embeddings are learned later, in the model)
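
The three steps can be sketched in plain Python to show what the vectorization layer does conceptually. All function names here are hypothetical. The sketch assumes index 0 for padding and index 1 for unknown words, and breaks frequency ties alphabetically, which may differ from the real layer.

```python
import re
import string
from collections import Counter

def standardize(text):
    text = text.lower().replace("<br />", " ")
    return re.sub("[%s]" % re.escape(string.punctuation), "", text)

def tokenize(text):
    return text.split()

def build_vocab(texts, max_tokens):
    counts = Counter(tok for t in texts for tok in tokenize(standardize(t)))
    # Most frequent words first; ties are broken alphabetically
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    words = [word for word, _ in ranked[:max_tokens - 2]]
    return {word: idx + 2 for idx, word in enumerate(words)}   # 0 = padding, 1 = unknown

def vectorize(text, vocab, sequence_length):
    ids = [vocab.get(tok, 1) for tok in tokenize(standardize(text))]
    ids = ids[:sequence_length]                                # truncate long texts
    return ids + [0] * (sequence_length - len(ids))            # pad short texts

reviews = ["Great movie!<br />Great cast.", "Terrible movie..."]
vocab = build_vocab(reviews, max_tokens=10)
vector = vectorize("Great unknown movie", vocab, sequence_length=5)

print(vocab)    # {'great': 2, 'movie': 3, 'cast': 4, 'terrible': 5}
print(vector)   # [2, 1, 3, 0, 0]
```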

In [None]:
# Standardize / clean the data
# TODO - why is my alternative version not working?
def custom_standardization(input_data):
  lowercase = tf.strings.lower(input_data)
  stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
  return tf.strings.regex_replace(
      stripped_html,
      '[%s]' % re.escape(string.punctuation),
      '')

max_features = 10000
sequence_length = 250

vectorize_layer = tf.keras.layers.experimental.preprocessing.TextVectorization(
    standardize=custom_standardization,
    max_tokens=max_features, # number of words
    output_mode='int',
    output_sequence_length=sequence_length)

In [None]:
#TODO understand this function.
# Make a text-only dataset (without labels), then call adapt
train_text = raw_train_ds.map(lambda x, y: x)
vectorize_layer.adapt(train_text)

In [None]:
def vectorize_text(text, label):
  text = tf.expand_dims(text, -1)
  return vectorize_layer(text), label

In [None]:
train_ds = raw_train_ds.map(vectorize_text)
val_ds = raw_val_ds.map(vectorize_text)
test_ds = raw_test_ds.map(vectorize_text)

## Create the model

In [None]:
from tensorflow.keras import layers

embedding_dim = 16

model = Sequential([
  layers.Embedding(max_features + 1, embedding_dim),
  layers.Dropout(0.2),
  layers.GlobalAveragePooling1D(),
  layers.Dropout(0.2),
  layers.Dense(1) # no sigmoid: the loss expects logits (from_logits=True)
])

model.summary()
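
What the first layers of this model compute can be pictured in NumPy: the embedding is a lookup table with one row per token index, and global average pooling takes the mean over the sequence axis. This is a sketch with made-up sizes and random values, not the trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embedding_dim = 6, 4
embedding_table = rng.normal(size=(vocab_size, embedding_dim))   # one row per token index

# Two "reviews", each already vectorized to three token indices
token_ids = np.array([[2, 3, 5],
                      [4, 1, 1]])

embedded = embedding_table[token_ids]    # lookup: shape (2, 3, 4)
pooled = embedded.mean(axis=1)           # average over the sequence axis: shape (2, 4)

print(embedded.shape, pooled.shape)
```

After pooling, every review is represented by one fixed-size vector regardless of its length, which is what the final dense layer consumes.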

In [None]:
# Compile the model
model.compile(loss=losses.BinaryCrossentropy(from_logits=True),
              optimizer='adam',
              metrics=tf.metrics.BinaryAccuracy(threshold=0.0))


tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir="./logs")

# Train the model
epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs,
    callbacks=[tensorboard_callback])

In [None]:
loss, accuracy = model.evaluate(test_ds)

print("Loss: ", loss)
print("Accuracy: ", accuracy)

In [None]:
%tensorboard --logdir "./logs"

## Export the model

When defining the training pipeline, we had to standardize, tokenize and vectorize. We can bake those three steps into the model by reusing our vectorization layer.

In [None]:
# TODO - Understand where the differences in loss come from.

# Create new model
export_model = tf.keras.Sequential([
  vectorize_layer,
  model,
  layers.Activation('sigmoid')
])

# Re-compile model as we added additional layers
export_model.compile(
    loss=losses.BinaryCrossentropy(from_logits=False), optimizer="adam", metrics=['accuracy']
)

# Test it with `raw_test_ds`, which yields raw strings
loss, accuracy = export_model.evaluate(raw_test_ds)
print(accuracy)

# without 21s 11ms/step - loss: 1.6500 - accuracy: 0.8131
# with 21s 11ms/step - loss: 0.3836 - accuracy: 0.8191
# included in model: 
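
One common cause of a loss gap like the one the TODO asks about, offered here as a guess rather than a diagnosis, is a mismatch between the `from_logits` setting and what the model really outputs. If a value is squashed by a sigmoid twice, the loss changes even though the predicted class stays the same. A small sketch with made-up numbers:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_prob(p, y):
    """Binary cross-entropy for one example, given a probability."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

logit, label = 3.0, 1

p_once = sigmoid(logit)        # about 0.953: the intended probability
p_twice = sigmoid(p_once)      # about 0.722: sigmoid applied a second time

loss_once = bce_from_prob(p_once, label)     # about 0.05
loss_twice = bce_from_prob(p_twice, label)   # about 0.33

print(loss_once, loss_twice)
print(p_once > 0.5, p_twice > 0.5)           # same predicted class
```

Accuracy barely moves in such a case because the decision threshold is still crossed, while the loss can differ by a large factor.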