# TensorFlow 2: Creating Models


## Building a Neural Network in TensorFlow

TensorFlow basics, in three steps:

1. Build a neural network that classifies images.
2. Train this neural network.
3. Evaluate the accuracy of the model.


In [0]:
import tensorflow as tf

Load and prepare the MNIST dataset. Convert the samples from integers to floating-point numbers:

In [2]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


Build the tf.keras.Sequential model by stacking layers. 


In [0]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])
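
As a quick sanity check on the architecture above, the number of trainable parameters can be worked out by hand. The sketch below is plain Python arithmetic, not a TensorFlow call, and the helper name `dense_params` is made up for illustration. It assumes every Dense layer has a bias term, which is the Keras default.

```python
# Hand count of trainable parameters for the Sequential model above.
# dense_params is a hypothetical helper, not part of TensorFlow.

def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

flattened = 28 * 28                      # Flatten: 28x28 image -> 784 values
hidden = dense_params(flattened, 128)    # Dense(128, activation='relu')
output = dense_params(128, 10)           # Dense(10): one logit per class
total = hidden + output                  # Flatten and Dropout add no parameters

print(flattened, hidden, output, total)  # 784 100480 1290 101770
```

Calling `model.summary()` on the real model should report the same total.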

For each example the model returns a vector of "logits" or "log-odds" scores, one for each class.

In [4]:
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.6399462 ,  0.14820004,  0.08431803,  0.6188082 ,  0.79993546,
         0.8980921 ,  0.41747612, -0.20142302,  0.20845713,  0.9541938 ]],
      dtype=float32)

The tf.nn.softmax function converts these logits to "probabilities" for each class:

In [5]:
tf.nn.softmax(predictions).numpy()

array([[0.03407321, 0.07493775, 0.07030027, 0.1199729 , 0.14379564,
        0.15862608, 0.09809475, 0.05282765, 0.0795921 , 0.16777964]],
      dtype=float32)
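
To see what the softmax does, the same conversion can be reproduced by hand. The following is a minimal sketch in plain Python that reuses the logits printed above; the `softmax` helper is written here for illustration and is not the TensorFlow implementation.

```python
import math

# Logits copied from the model output shown earlier in this section.
logits = [-0.6399462, 0.14820004, 0.08431803, 0.6188082, 0.79993546,
          0.8980921, 0.41747612, -0.20142302, 0.20845713, 0.9541938]

def softmax(values):
    """Exponentiate and normalize; subtracting the max keeps exp() stable."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
print([round(p, 4) for p in probs])   # should agree with the array above
print(sum(probs))                     # approximately 1.0
```

The largest logit (index 9) gets the largest probability, and the values add up to one.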

The losses.SparseCategoricalCrossentropy loss takes a vector of logits and the index of the true class, and returns a scalar loss for each example.

In [0]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

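This loss equals the negative log probability of the true class: it is zero if the model is certain of the correct class. For an untrained model the probabilities are close to uniform (about 1/10 per class), so the initial loss should be near -log(1/10) ≈ 2.3.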

In [7]:
loss_fn(y_train[:1], predictions).numpy()

1.8412055
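
The value above can be checked by hand as the negative log of the softmax probability of the true class. The sketch below reuses the logits printed earlier. The true label is not shown in this notebook, so `true_class = 5` is an assumption (it is the class whose probability reproduces the reported loss).

```python
import math

# Logits copied from the model output shown earlier in this section.
logits = [-0.6399462, 0.14820004, 0.08431803, 0.6188082, 0.79993546,
          0.8980921, 0.41747612, -0.20142302, 0.20845713, 0.9541938]

# Assumption: the label of the first training image is 5.
true_class = 5

def softmax(values):
    """Exponentiate and normalize; subtracting the max keeps exp() stable."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

prob_true = softmax(logits)[true_class]
loss = -math.log(prob_true)           # sparse categorical cross-entropy
print(prob_true, loss)                # loss is close to the 1.8412055 above
```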

Compile & Train the Network


In [0]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

In [9]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fa4b3d44240>

The Model.evaluate method checks the model's performance, usually on a validation set or test set. Here it is run on the MNIST test split.

In [10]:
model.evaluate(x_test,  y_test, verbose=2)

313/313 - 0s - loss: 0.0779 - accuracy: 0.9753


[0.07794024795293808, 0.9753000140190125]

If you want your model to return probabilities, you can wrap the trained model and attach a softmax layer to it:

In [0]:
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])

In [13]:
probability_model(x_test[:5])

<tf.Tensor: shape=(5, 10), dtype=float32, numpy=
array([[1.1606705e-08, 6.5395173e-10, 8.4186649e-07, 2.0216861e-05,
        4.4661132e-12, 8.8164725e-08, 3.7035089e-14, 9.9995780e-01,
        8.7970555e-08, 2.0872523e-05],
       [1.6621294e-07, 1.0018647e-04, 9.9988234e-01, 1.6736167e-05,
        1.1169377e-11, 3.7842895e-07, 1.0506418e-07, 3.9310018e-13,
        1.3571623e-07, 5.4494408e-12],
       [1.5751104e-07, 9.9866414e-01, 1.1103492e-04, 4.0976294e-05,
        5.4666463e-05, 1.9867939e-05, 9.9182034e-06, 9.1818837e-04,
        1.7924905e-04, 1.7674761e-06],
       [9.9972743e-01, 4.3432091e-10, 1.2488502e-04, 1.7088391e-06,
        8.7030031e-07, 4.1874751e-05, 3.2910371e-05, 6.3900887e-05,
        8.5489837e-09, 6.4630926e-06],
       [6.9461424e-07, 2.1402357e-08, 5.0425056e-06, 9.1892511e-09,
        9.9755633e-01, 2.5431945e-07, 1.2099058e-06, 5.1926854e-05,
        4.8793601e-07, 2.3839176e-03]], dtype=float32)>
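
Each row above is a probability distribution over the ten digit classes, so the predicted class is the position of the largest entry. A small sketch in plain Python, with the rows copied from the output above:

```python
# Rows copied from the probability_model output above (one row per test image).
probabilities = [
    [1.1606705e-08, 6.5395173e-10, 8.4186649e-07, 2.0216861e-05,
     4.4661132e-12, 8.8164725e-08, 3.7035089e-14, 9.9995780e-01,
     8.7970555e-08, 2.0872523e-05],
    [1.6621294e-07, 1.0018647e-04, 9.9988234e-01, 1.6736167e-05,
     1.1169377e-11, 3.7842895e-07, 1.0506418e-07, 3.9310018e-13,
     1.3571623e-07, 5.4494408e-12],
    [1.5751104e-07, 9.9866414e-01, 1.1103492e-04, 4.0976294e-05,
     5.4666463e-05, 1.9867939e-05, 9.9182034e-06, 9.1818837e-04,
     1.7924905e-04, 1.7674761e-06],
    [9.9972743e-01, 4.3432091e-10, 1.2488502e-04, 1.7088391e-06,
     8.7030031e-07, 4.1874751e-05, 3.2910371e-05, 6.3900887e-05,
     8.5489837e-09, 6.4630926e-06],
    [6.9461424e-07, 2.1402357e-08, 5.0425056e-06, 9.1892511e-09,
     9.9755633e-01, 2.5431945e-07, 1.2099058e-06, 5.1926854e-05,
     4.8793601e-07, 2.3839176e-03],
]

# Index of the largest probability in each row is the predicted digit.
predicted = [row.index(max(row)) for row in probabilities]
print(predicted)  # [7, 2, 1, 0, 4]
```

With tensors, `tf.argmax(probability_model(x_test[:5]), axis=1)` does the same job.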

## Sequential API

A sequential model, as the name suggests, allows you to create models layer-by-layer in a step-by-step fashion.

The Keras Sequential API is by far the easiest way to get up and running with Keras, but it is also the most limited. You cannot create models that:

* Share layers
* Have branches (at least not easily)
* Have multiple inputs
* Have multiple outputs

In [0]:
# import the necessary packages
from tensorflow.keras.models import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import concatenate

In [0]:
def shallownet_sequential(width, height, depth, classes):
	# initialize the model along with the input shape to be
	# "channels last" ordering
	model = Sequential()
	inputShape = (height, width, depth)
	# define the first (and only) CONV => RELU layer
	model.add(Conv2D(32, (3, 3), padding="same",
		input_shape=inputShape))
	model.add(Activation("relu"))
	# softmax classifier
	model.add(Flatten())
	model.add(Dense(classes))
	model.add(Activation("softmax"))
	# return the constructed network architecture
	return model
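
To make the architecture concrete, here is the shape and parameter arithmetic for the call `shallownet_sequential(28, 28, 1, 10)` used below. It is a hand calculation in plain Python, assuming the Keras defaults of stride 1 and a bias term in both the Conv2D and Dense layers.

```python
# Shape and parameter arithmetic for shallownet_sequential(28, 28, 1, 10).
height, width, depth, classes = 28, 28, 1, 10
filters, kernel = 32, 3

# padding="same" with stride 1 keeps the spatial size; only channels change.
conv_out = (height, width, filters)
conv_params = kernel * kernel * depth * filters + filters   # weights + biases

flat = height * width * filters                              # Flatten output
dense_params = flat * classes + classes                      # weights + biases

print(conv_out, conv_params, flat, dense_params, conv_params + dense_params)
# (28, 28, 32) 320 25088 250890 251210
```

Almost all of the parameters sit in the final Dense layer, because the convolution output is flattened without any pooling.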

In [0]:
SeqModel = shallownet_sequential(28,28,1,10)

In [0]:
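# shallownet_sequential already ends in a softmax layer, so its outputs are
# probabilities, not logits; from_logits=False avoids applying softmax twice.
# (The outputs recorded below came from a run with from_logits=True.)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)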

In [0]:
SeqModel.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

In [0]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channels dimension
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]
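
Conv2D layers with "channels last" ordering expect input of shape (batch, height, width, channels), while the MNIST arrays have no channel axis. The sketch below shows the effect of the indexing on a stand-in array. It uses NumPy instead of TensorFlow, which works the same way here because both `np.newaxis` and `tf.newaxis` are simply `None`.

```python
import numpy as np

# Stand-in for a small batch of grayscale images (placeholder data, not MNIST).
images = np.zeros((5, 28, 28), dtype="float32")

# Append a trailing axis of length 1 to act as the channels dimension.
with_channels = images[..., np.newaxis]

print(images.shape, with_channels.shape)  # (5, 28, 28) (5, 28, 28, 1)
```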

In [25]:
SeqModel.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fa4b02eb048>

In [26]:
SeqModel.evaluate(x_test,  y_test, verbose=2)

313/313 - 1s - loss: 1.4841 - accuracy: 0.9776


[1.4840813875198364, 0.9775999784469604]
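
The loss above stays near 1.48 even though accuracy is high. A plausible explanation, assuming this run paired the model's softmax output with a loss created with `from_logits=True`, is that the probabilities were treated as logits and passed through softmax a second time. The sketch below works out the lowest loss that setup could reach for ten classes; the `softmax` helper is written here for illustration.

```python
import math

def softmax(values):
    """Exponentiate and normalize; subtracting the max keeps exp() stable."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

classes = 10
true_class = 0

# Best case: the model assigns probability 1 to the correct class.
perfect = [1.0 if i == true_class else 0.0 for i in range(classes)]

# Treating those probabilities as logits applies softmax a second time,
# which caps the probability of the true class at e / (e + 9).
squashed = softmax(perfect)
floor_loss = -math.log(squashed[true_class])

print(squashed[true_class], floor_loss)  # about 0.232 and 1.461
```

Under that assumption even a perfect classifier cannot score below roughly 1.46, which would account for a loss of 1.48 alongside 97.8% accuracy.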

## Functional API

The Keras Functional API is easy to use and is the approach most Keras practitioners typically favor.

Using the Functional API you can:

* Create more complex models.
* Have multiple inputs and multiple outputs.
* Easily define branches in your architectures (e.g., an Inception block, ResNet block, etc.).
* Design directed acyclic graphs (DAGs).
* Easily share layers inside the architecture.
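
The points above can be illustrated with a very small model. The following is a minimal sketch, not part of the original notebook: all layer names, sizes, and the zero-filled batch are made up for illustration. It builds a model with two inputs, one Dense layer shared by both branches, and two outputs.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Two inputs of the same shape (hypothetical names and sizes).
input_a = layers.Input(shape=(16,), name="input_a")
input_b = layers.Input(shape=(16,), name="input_b")

# One layer instance applied to both inputs: the two branches share weights.
shared = layers.Dense(8, activation="relu", name="shared_dense")
branch_a = shared(input_a)
branch_b = shared(input_b)

# Merge the branches, then split into two output heads.
merged = layers.concatenate([branch_a, branch_b])
class_logits = layers.Dense(3, name="class_logits")(merged)
score = layers.Dense(1, name="score")(merged)

model = tf.keras.Model(inputs=[input_a, input_b],
                       outputs=[class_logits, score])

# Run a dummy batch of four examples through the graph.
batch = np.zeros((4, 16), dtype="float32")
logits_out, score_out = model([batch, batch])
print(logits_out.shape, score_out.shape)

# The shared layer contributes its kernel and bias only once.
print(len(model.trainable_weights))  # 6: three Dense layers, 2 tensors each
```

None of this can be expressed with `Sequential`, which only supports a single chain of layers.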

In [0]:
def minigooglenet_functional(width, height, depth, classes):
    def conv_module(x, K, kX, kY, stride, chanDim, padding="same"):
      # define a CONV => BN => RELU pattern
      x = Conv2D(K, (kX, kY), strides=stride, padding=padding)(x)
      x = BatchNormalization(axis=chanDim)(x)
      x = Activation("relu")(x)
      # return the block
      return x
    
    def inception_module(x, numK1x1, numK3x3, chanDim):
      # define two CONV modules, then concatenate across the
      # channel dimension
      conv_1x1 = conv_module(x, numK1x1, 1, 1, (1, 1), chanDim)
      conv_3x3 = conv_module(x, numK3x3, 3, 3, (1, 1), chanDim)
      x = concatenate([conv_1x1, conv_3x3], axis=chanDim)
      # return the block
      return x

    def downsample_module(x, K, chanDim):
      # define the CONV module and POOL, then concatenate
      # across the channel dimensions
      conv_3x3 = conv_module(x, K, 3, 3, (2, 2), chanDim, padding="valid")
      pool = MaxPooling2D((3, 3), strides=(2, 2))(x)
      x = concatenate([conv_3x3, pool], axis=chanDim)
      # return the block
      return x

    # initialize the input shape to be "channels last" and the
    # channels dimension itself
    inputShape = (height, width, depth)
    chanDim = -1
    # define the model input and first CONV module
    inputs = Input(shape=inputShape)
    x = conv_module(inputs, 96, 3, 3, (1, 1), chanDim)
    # one Inception module (a second is commented out) followed by a downsample module
    x = inception_module(x, 32, 32, chanDim)
    #x = inception_module(x, 32, 48, chanDim)
    x = downsample_module(x, 80, chanDim)
    # two Inception modules (two more are commented out) followed by a downsample module
    x = inception_module(x, 112, 48, chanDim)
    #x = inception_module(x, 96, 64, chanDim)
    #x = inception_module(x, 80, 80, chanDim)
    x = inception_module(x, 48, 96, chanDim)
    x = downsample_module(x, 96, chanDim)
    # one Inception module (a second is commented out) followed by dropout;
    # the global average POOL is also commented out
    #x = inception_module(x, 176, 160, chanDim)
    x = inception_module(x, 176, 160, chanDim)
    #x = AveragePooling2D((7, 7))(x)
    x = Dropout(0.5)(x)
    # softmax classifier
    x = Flatten()(x)
    x = Dense(classes)(x)
    x = Activation("softmax")(x)
    # create the model
    model = Model(inputs, x, name="minigooglenet")
    # return the constructed network architecture
    return model
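
The concatenations in this network only work because both branches of each module produce feature maps of the same spatial size. The sketch below traces the shapes by hand for `minigooglenet_functional(28, 28, 1, 10)` as written above, with the commented-out modules skipped. It is plain Python arithmetic, and the helper `valid_out` is a made-up name.

```python
# Shape arithmetic for minigooglenet_functional(28, 28, 1, 10) as written above.

def valid_out(size, kernel=3, stride=2):
    """Output size of a 3x3, stride-2 window with 'valid' padding."""
    return (size - kernel) // stride + 1

size = 28
trace = []

channels = 96                      # first conv_module ("same" padding)
trace.append((size, channels))
channels = 32 + 32                 # inception_module(32, 32): concat of branches
trace.append((size, channels))
size = valid_out(size)             # downsample_module(80): conv and pool both shrink
channels = 80 + channels           # conv filters + pooled input channels
trace.append((size, channels))
channels = 112 + 48                # inception_module(112, 48)
trace.append((size, channels))
channels = 48 + 96                 # inception_module(48, 96)
trace.append((size, channels))
size = valid_out(size)             # downsample_module(96)
channels = 96 + channels
trace.append((size, channels))
channels = 176 + 160               # inception_module(176, 160)
trace.append((size, channels))

flat = size * size * channels      # Flatten feeds the final Dense layer
print(trace)
print(flat)  # 12096
```

The spatial size goes 28, 13, 6, and the final Dense layer receives 6 x 6 x 336 = 12096 values per image.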

In [0]:
FunctionalModel = minigooglenet_functional(28,28,1,10)

In [0]:
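# minigooglenet_functional ends in a softmax layer, so its outputs are
# probabilities, not logits; from_logits=False avoids applying softmax twice.
# (The outputs recorded below came from a run with from_logits=True.)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)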

In [7]:
fashion_mnist = tf.keras.datasets.fashion_mnist

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channels dimension
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


In [0]:
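INIT_LR = 1e-2
BATCH_SIZE = 128  # defined for reference; fit() below uses the default batch size
NUM_EPOCHS = 20   # only used to derive the decay rate; fit() below runs 60 epochs
# learning_rate is the current name of the deprecated lr argument; decay is
# accepted by the TF 2.x optimizer used here, but newer Keras releases may reject it
opt = tf.keras.optimizers.SGD(learning_rate=INIT_LR, momentum=0.9, decay=INIT_LR / NUM_EPOCHS)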
#FunctionalModel.compile(optimizer='adam',
FunctionalModel.compile(optimizer=opt,
              loss=loss_fn,
              metrics=['accuracy'])
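
The `decay` argument shrinks the learning rate as training progresses. The sketch below shows the effect under stated assumptions: it uses the inverse-time formula lr / (1 + decay * step) that the older Keras optimizers apply once per update step, the 60,000 Fashion-MNIST training images, and the default batch size of 32 (the `fit` call in this notebook does not pass `BATCH_SIZE`). The function `lr_at` is a made-up helper, not a Keras API.

```python
INIT_LR = 1e-2
NUM_EPOCHS = 20
decay = INIT_LR / NUM_EPOCHS              # 5e-4, as in the cell above

# Assumptions: 60,000 training images and the default batch size of 32.
steps_per_epoch = 60000 // 32             # 1875 updates per epoch

def lr_at(step):
    """Inverse-time decay, applied once per optimizer update (assumed formula)."""
    return INIT_LR / (1.0 + decay * step)

for epoch in (0, 1, 20, 60):
    print(epoch, lr_at(epoch * steps_per_epoch))
```

Under these assumptions the learning rate roughly halves after the first epoch and ends near 1.7e-4 after 60 epochs.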


In [26]:
history = FunctionalModel.fit(x_train, y_train, epochs=60)

Epoch 1/60
...
Epoch 60/60


In [27]:
FunctionalModel.evaluate(x_test,  y_test, verbose=2)

313/313 - 1s - loss: 1.5322 - accuracy: 0.9306


[1.5322474241256714, 0.9305999875068665]

## Model Subclassing with Keras

* Model subclassing is fully customizable and lets you implement your own custom forward pass for the model.

* However, this flexibility makes Model subclassing considerably harder to use than the Sequential or Functional APIs.

* Researchers have complete control over the network and the training process (e.g., a custom layer that performs an exotic type of convolution or pooling).
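
Before the full network, here is the pattern in its smallest form. This is a minimal sketch, not part of the original notebook: the class name `TinyClassifier` and its layer sizes are made up for illustration. Layers are created in `__init__`, and the forward pass is ordinary Python in `call`, where the `training` flag is passed on so that dropout is active only during training.

```python
import numpy as np
import tensorflow as tf

class TinyClassifier(tf.keras.Model):
    """Hypothetical two-layer classifier illustrating Model subclassing."""

    def __init__(self, classes):
        super().__init__()
        # Layers are created once here and reused on every call.
        self.flatten = tf.keras.layers.Flatten()
        self.hidden = tf.keras.layers.Dense(16, activation="relu")
        self.dropout = tf.keras.layers.Dropout(0.5)
        self.out = tf.keras.layers.Dense(classes)

    def call(self, inputs, training=False):
        # The forward pass is plain Python, so any control flow is allowed.
        x = self.flatten(inputs)
        x = self.hidden(x)
        x = self.dropout(x, training=training)  # dropout only while training
        return self.out(x)                      # raw logits, one per class

model = TinyClassifier(classes=10)
images = np.zeros((2, 28, 28, 1), dtype="float32")  # dummy batch of two images
logits = model(images, training=False)
print(logits.shape)
```

Because this sketch returns logits rather than probabilities, it would be paired with a loss created with `from_logits=True`.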

In [0]:
class MiniVGGNetModel(Model):
    def __init__(self, classes, chanDim=-1):
        # call the parent constructor
        super(MiniVGGNetModel, self).__init__()
        # initialize the layers in the first (CONV => RELU) * 2 => POOL
        # layer set
        self.conv1A = Conv2D(32, (3, 3), padding="same")
        self.act1A = Activation("relu")
        self.bn1A = BatchNormalization(axis=chanDim)
        self.conv1B = Conv2D(32, (3, 3), padding="same")
        self.act1B = Activation("relu")
        self.bn1B = BatchNormalization(axis=chanDim)
        self.pool1 = MaxPooling2D(pool_size=(2, 2))
        # initialize the layers in the second (CONV => RELU) * 2 => POOL
        # layer set
        self.conv2A = Conv2D(32, (3, 3), padding="same")
        self.act2A = Activation("relu")
        self.bn2A = BatchNormalization(axis=chanDim)
        self.conv2B = Conv2D(32, (3, 3), padding="same")
        self.act2B = Activation("relu")
        self.bn2B = BatchNormalization(axis=chanDim)
        self.pool2 = MaxPooling2D(pool_size=(2, 2))
        # initialize the layers in our fully-connected layer set
        self.flatten = Flatten()
        self.dense3 = Dense(512)
        self.act3 = Activation("relu")
        self.bn3 = BatchNormalization()
        self.do3 = Dropout(0.5)
        # initialize the layers in the softmax classifier layer set
        self.dense4 = Dense(classes)
        self.softmax = Activation("softmax")
    
    def call(self, inputs):
        # build the first (CONV => RELU) * 2 => POOL layer set
        x = self.conv1A(inputs)
        x = self.act1A(x)
        x = self.bn1A(x)
        x = self.conv1B(x)
        x = self.act1B(x)
        x = self.bn1B(x)
        x = self.pool1(x)
        # build the second (CONV => RELU) * 2 => POOL layer set
        x = self.conv2A(x)
        x = self.act2A(x)
        x = self.bn2A(x)
        x = self.conv2B(x)
        x = self.act2B(x)
        x = self.bn2B(x)
        x = self.pool2(x)
        # build our FC layer set
        x = self.flatten(x)
        x = self.dense3(x)
        x = self.act3(x)
        x = self.bn3(x)
        x = self.do3(x)
        # build the softmax classifier
        x = self.dense4(x)
        x = self.softmax(x)
        # return the class probabilities produced by the forward pass
        return x

In [0]:
SubclassingModel = MiniVGGNetModel(10)

In [0]:
INIT_LR = 1e-2
BATCH_SIZE = 128
NUM_EPOCHS = 20
# learning_rate is the current name of the deprecated lr argument
opt = tf.keras.optimizers.SGD(learning_rate=INIT_LR, momentum=0.9, decay=INIT_LR / NUM_EPOCHS)
SubclassingModel.compile(optimizer=opt,
              loss=loss_fn,
              metrics=['accuracy'])

In [20]:
history = SubclassingModel.fit(x_train, y_train, epochs=60)

Epoch 1/60
...
Epoch 60/60


In [21]:
SubclassingModel.evaluate(x_test,  y_test, verbose=2)

313/313 - 1s - loss: 1.5370 - accuracy: 0.9246


[1.53695809841156, 0.9246000051498413]