# TensorFlow 2 quickstart

This short introduction uses [Keras](https://www.tensorflow.org/guide/keras/overview) to:

1. Load a prebuilt dataset.
2. Build a neural network machine learning model that classifies images.
3. Train this neural network.
4. Evaluate the accuracy of the model.

## Set up TensorFlow

Import TensorFlow into your program to get started:

In [40]:
import numpy
import tensorflow as tf
from keras.models import Sequential 
from keras.layers import Activation, Dense, Reshape, Flatten
from keras import initializers 
from keras import regularizers 
from keras import constraints 
print("TensorFlow version:", tf.__version__)

TensorFlow version: 2.8.2


## Load a dataset

Load and prepare the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). 

The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.

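The pixel values of the images range from 0 through 255. Dividing by `255.0` scales them to the range 0 to 1 and converts the sample data from integers to floating-point numbers: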

In [25]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
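To make the scaling step concrete, here is a minimal plain-Python sketch (not TensorFlow) of what dividing by `255.0` does to 8-bit pixel values. The row of pixels is made up for illustration and does not come from MNIST.

```python
# Hypothetical row of 8-bit grayscale pixels (integers in 0..255).
pixels = [0, 51, 128, 255]

# Dividing by 255.0 turns each integer into a float between 0.0 and 1.0.
scaled = [p / 255.0 for p in pixels]

print(scaled)
```

The darkest pixel maps to 0.0 and the brightest to 1.0, so every input feature ends up on the same small scale.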


## Activations

In machine learning, an activation function determines how strongly a neuron is activated for a given input. It applies a nonlinear transformation to the neuron's weighted input, which lets the network learn relationships that a purely linear model cannot. The output of a neuron therefore depends on its activation function.
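As a rough illustration of the paragraph above, the following plain-Python sketch applies two common activation functions to the weighted sum of a single neuron. The `neuron` helper, the weights, and the inputs are hypothetical and are not part of the Keras API.

```python
import math

def relu(x):
    # Passes positive inputs through and clamps negative inputs to 0.
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real input into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias, activation):
    # Weighted sum of the inputs plus a bias, followed by the activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

out_relu = neuron([1.0, -2.0], [0.5, 1.0], 0.0, relu)        # z = -1.5
out_sigmoid = neuron([1.0, -2.0], [0.5, 1.0], 1.5, sigmoid)  # z = 0.0

print(out_relu, out_sigmoid)
```

The same weighted sum produces different outputs depending on the activation: ReLU silences the negative sum entirely, while the sigmoid maps a sum of zero to the midpoint 0.5.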


In [3]:
# Create a new model using the Sequential API.
model = Sequential()
# Create a Dense layer and add it to the model.
# Dense is a basic fully connected layer provided by Keras;
# its required argument is the number of neurons, or units (32 here).
# The first layer also needs the input shape, (16,) here.
# Every later layer takes the output of the previous layer as its input.

# input_shape: shape of the input data, excluding the batch dimension.
# kernel_initializer: initializer for the kernel weights; 'he_uniform' here.
# kernel_regularizer: regularizer applied to the kernel weights; None here.
# kernel_constraint: constraint applied to the kernel weights; 'MaxNorm' here.
# activation: activation function applied to the layer output; 'relu' here.

model.add(Dense(32, input_shape=(16,), kernel_initializer = 'he_uniform', 
   kernel_regularizer = None, kernel_constraint = 'MaxNorm', activation = 'relu')) 
model.add(Dense(16, activation = 'relu')) 
#creates final Dense layer with 8 units.
model.add(Dense(8))
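The way each layer's output feeds the next can be checked with a little arithmetic. The sketch below counts the trainable parameters of the three `Dense` layers above, assuming each layer keeps its bias vector (the Keras default); `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs, units):
    # One weight per (input, unit) pair plus one bias per unit.
    return n_inputs * units + units

layer_units = [32, 16, 8]   # units of the three Dense layers
n_inputs = 16               # from input_shape=(16,)

total = 0
for units in layer_units:
    count = dense_params(n_inputs, units)
    print(f"Dense({units}): {n_inputs} inputs -> {count} parameters")
    total += count
    n_inputs = units        # this layer's output feeds the next layer

print("total parameters:", total)
```

The total should agree with the figure reported by `model.summary()` for the same stack of layers.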

In [4]:
# Initializes every kernel weight to 0.

my_init = initializers.Zeros() 
model = Sequential() 
model.add(Dense(512, activation = 'relu', input_shape = (784,), 
   kernel_initializer = my_init))


In [9]:
# Initializes every kernel weight to 1.

my_init = initializers.Ones() 
model.add(Dense(512, activation = 'relu', input_shape = (784,), 
   kernel_initializer = my_init))

In [11]:
# Draws the initial kernel weights from a uniform distribution between minval and maxval.

my_init = initializers.RandomUniform(minval = -0.05, maxval = 0.05, seed = None) 
model.add(Dense(512, activation = 'relu', input_shape = (784,), 
   kernel_initializer = my_init))

# minval: lower bound of the random values to generate
# maxval: upper bound of the random values to generate

In [14]:
# Draws initial kernel weights whose spread adapts to the number of input and output units of the layer, adjusted by the given scale.

my_init = initializers.VarianceScaling(
   scale = 1.0, mode = 'fan_in', distribution = 'normal', seed = None) 
model.add(Dense(512, activation = 'relu', input_shape = (784,), 
   kernel_initializer = my_init))

# fan_in: the number of input units of the layer
# fan_out: the number of output units of the layer
# scale: the scaling factor (a positive float)
# mode: one of 'fan_in', 'fan_out', or 'fan_avg'
# distribution: the random distribution to draw from, such as 'normal' or 'uniform'
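To show how `scale`, `mode`, and `distribution` interact, here is a minimal sketch of the spread that `VarianceScaling` targets, based on the formulas in the Keras documentation: a standard deviation of `sqrt(scale / n)` for the normal variants and a sampling limit of `sqrt(3 * scale / n)` for the uniform one. The helper name and the 784-input, 512-unit layer are assumptions made for illustration; the sketch computes the spread only and does not sample any weights.

```python
import math

def variance_scaling_spread(fan_in, fan_out, scale=1.0, mode="fan_in",
                            distribution="normal"):
    # Choose n according to the mode.
    if mode == "fan_in":
        n = fan_in
    elif mode == "fan_out":
        n = fan_out
    elif mode == "fan_avg":
        n = (fan_in + fan_out) / 2.0
    else:
        raise ValueError(f"unknown mode: {mode}")

    if distribution == "uniform":
        # Uniform samples come from [-limit, limit].
        return math.sqrt(3.0 * scale / n)
    # Normal variants target this standard deviation.
    return math.sqrt(scale / n)

# Hypothetical Dense layer with 784 inputs and 512 units.
stddev = variance_scaling_spread(784, 512)
limit = variance_scaling_spread(784, 512, scale=2.0, distribution="uniform")

print(stddev, limit)
```

The second call uses `scale=2.0` with `mode='fan_in'` and a uniform distribution, which is the configuration behind the `'he_uniform'` initializer used earlier.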

In [18]:
# Initializes the kernel as an identity matrix scaled by gain (usable for 2D kernels only).

my_init = initializers.Identity(gain = 1.0)
model.add(Dense(512, activation = 'relu', input_shape = (784,), kernel_initializer = my_init))

In [51]:
# Constrains each unit's incoming weight vector to have unit norm.

my_constrain = constraints.UnitNorm(axis = 0) 
model = Sequential() 
model.add(Dense(512, activation = 'relu', input_shape = (784,), 
   kernel_constraint = my_constrain))

In [20]:
# Constrains each unit's incoming weight vector to have a norm less than or equal to max_value.

my_constrain = constraints.MaxNorm(max_value = 2, axis = 0) 
model = Sequential() 
model.add(Dense(512, activation = 'relu', input_shape = (784,), 
   kernel_constraint = my_constrain))
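The two constraints above can be sketched in plain Python for a single weight vector. This is an illustration of the idea rather than the Keras implementation: the `EPSILON` value is assumed to mirror the small constant Keras uses to avoid division by zero, and the weight vectors are made up.

```python
import math

EPSILON = 1e-7  # assumed small constant that avoids dividing by zero

def l2_norm(vector):
    return math.sqrt(sum(v * v for v in vector))

def unit_norm(vector):
    # Rescale so the L2 norm becomes (approximately) 1.
    norm = l2_norm(vector)
    return [v / (EPSILON + norm) for v in vector]

def max_norm(vector, max_value=2.0):
    # Leave short vectors almost untouched; shrink long ones to max_value.
    norm = l2_norm(vector)
    desired = min(norm, max_value)
    return [v * desired / (EPSILON + norm) for v in vector]

weights = [3.0, 4.0]                 # made-up weight vector with norm 5
print(l2_norm(unit_norm(weights)))   # close to 1
print(l2_norm(max_norm(weights)))    # close to 2
print(max_norm([0.6, 0.8]))          # norm 1 is below 2, so nearly unchanged
```

With `axis = 0` on a `Dense` kernel, Keras applies this kind of rescaling to the incoming weights of each unit after every gradient update.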

In [21]:
# Applying Sigmoid function 

model = Sequential() 
model.add(Dense(512, activation = 'sigmoid', input_shape = (784,)))

In [32]:
#Applies Rectified Linear Unit.

model = Sequential() 
model.add(Dense(512, activation = 'relu', input_shape = (784,)))

In [33]:
#Applies Scaled exponential linear unit.

model = Sequential() 
model.add(Dense(512, activation = 'selu', input_shape = (784,)))

In [38]:
# Reshape: changes the output to the given shape while keeping the number of elements.

model = Sequential() 
layer_1 = Dense(16, input_shape = (8,8)) 
model.add(layer_1) 
layer_2 = Reshape((16, 8)) 
model.add(layer_2) 


In [41]:
# Flatten: collapses all non-batch dimensions into a single dimension.

model = Sequential() 
layer_1 = Dense(16, input_shape=(8,8)) 
model.add(layer_1) 
layer_2 = Flatten() 
model.add(layer_2) 
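The shapes produced by the `Reshape` and `Flatten` examples can be traced by hand. The sketch below does the shape arithmetic in plain Python, leaving out the batch dimension; the three helper functions are hypothetical and only mimic how the corresponding Keras layers transform shapes.

```python
from math import prod

def dense_output_shape(input_shape, units):
    # Dense acts on the last axis only, replacing its size with `units`.
    return input_shape[:-1] + (units,)

def reshape_output_shape(input_shape, target_shape):
    # Reshape must keep the total number of elements unchanged.
    if prod(input_shape) != prod(target_shape):
        raise ValueError("total size of the new shape must match the old one")
    return tuple(target_shape)

def flatten_output_shape(input_shape):
    # Flatten collapses every non-batch axis into a single axis.
    return (prod(input_shape),)

after_dense = dense_output_shape((8, 8), 16)
after_reshape = reshape_output_shape(after_dense, (16, 8))
after_flatten = flatten_output_shape(after_dense)

print(after_dense, after_reshape, after_flatten)
```

`Dense(16)` on an `(8, 8)` input yields `(8, 16)`, which holds 128 elements, so `Reshape((16, 8))` is valid and `Flatten()` produces a vector of length 128.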

## Build a machine learning model

Build a `tf.keras.Sequential` model by stacking layers.

For each example, the model returns a vector of [logits](https://developers.google.com/machine-learning/glossary#logits) or [log-odds](https://developers.google.com/machine-learning/glossary#log-odds) scores, one for each class.

In [42]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])
predictions = model(x_train[:1]).numpy()
predictions

array([[ 0.09333167, -0.16077016, -0.4021266 ,  0.10139152,  0.06933364,
         0.68370056, -0.04405575,  0.3915459 , -0.3089633 ,  0.10826982]],
      dtype=float32)

The `tf.nn.softmax` function converts these logits to *probabilities* for each class: 

In [43]:
tf.nn.softmax(predictions).numpy()

array([[0.09923732, 0.07696973, 0.06046447, 0.10004038, 0.09688415,
        0.17908901, 0.08649846, 0.13371734, 0.06636827, 0.10073085]],
      dtype=float32)
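For reference, the softmax computation itself is short. The following plain-Python sketch (not the TensorFlow implementation) applies it to the logits printed by the untrained model above.

```python
import math

def softmax(logits):
    # Subtracting the largest logit leaves the result unchanged
    # but keeps exp() from overflowing on large inputs.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logits copied from the untrained model's output.
logits = [0.09333167, -0.16077016, -0.4021266, 0.10139152, 0.06933364,
          0.68370056, -0.04405575, 0.3915459, -0.3089633, 0.10826982]

probs = softmax(logits)
print(sum(probs))                 # sums to 1 up to rounding
print(probs.index(max(probs)))    # index of the most likely class
```

The largest logit (index 5) receives the largest probability, and the ten probabilities sum to 1.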

Note: It is possible to bake the `tf.nn.softmax` function into the activation function for the last layer of the network. While this can make the model output more directly interpretable, this approach is discouraged as it's impossible to provide an exact and numerically stable loss calculation for all models when using a softmax output. 
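One way to see the numerical issue is to compare a loss computed from softmax probabilities with one computed directly from logits. The sketch below is a plain-Python illustration under that assumption, not the code TensorFlow runs; the function names and the extreme logits are made up.

```python
import math

def naive_loss(logits, label):
    # Softmax first, then log: exp() can overflow for large logits.
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[label] / sum(exps))

def stable_loss(logits, label):
    # Work on the logits directly using the log-sum-exp trick.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

big_logits = [1000.0, 0.0]   # made-up, deliberately extreme logits

try:
    naive_loss(big_logits, 0)
    naive_failed = False
except OverflowError:
    naive_failed = True

print(naive_failed)
print(stable_loss(big_logits, 0))
```

On moderate logits both functions agree, but only the logit-based version survives extreme values, which is the motivation for passing `from_logits=True` to the loss rather than adding a softmax layer to the model.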

Define a loss function for training using `losses.SparseCategoricalCrossentropy`, which takes a vector of logits and the index of the true class and returns a scalar loss for each example.

In [23]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

This loss is equal to the negative log probability of the true class: The loss is zero if the model is sure of the correct class.

This untrained model gives probabilities close to random (1/10 for each class), so the initial loss should be close to `-tf.math.log(1/10) ~= 2.3`.

In [44]:
loss_fn(y_train[:1], predictions).numpy()

1.7198724
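The printed loss can be reproduced by hand from the softmax probabilities shown earlier. The sketch below assumes that the true label of the first training image is 5, which is consistent with the printed value; the helper name is hypothetical.

```python
import math

def cross_entropy_from_probs(probs, label):
    # Negative log of the probability assigned to the true class.
    return -math.log(probs[label])

# Softmax probabilities copied from the untrained model's output.
probs = [0.09923732, 0.07696973, 0.06046447, 0.10004038, 0.09688415,
         0.17908901, 0.08649846, 0.13371734, 0.06636827, 0.10073085]
label = 5  # assumed true class of the first training image

loss = cross_entropy_from_probs(probs, label)
uniform_loss = -math.log(1 / 10)           # a uniform guess over 10 classes
confident_loss = -math.log(0.999)          # a nearly certain, correct guess

print(loss, uniform_loss, confident_loss)
```

The untrained model happens to give the true class a probability a little above 1/10, so its loss of about 1.72 is somewhat below the uniform baseline of about 2.3, while a confident correct prediction drives the loss toward zero.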

Before you start training, configure and compile the model using Keras `Model.compile`. Set the [`optimizer`](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers) class to `adam`, set the `loss` to the `loss_fn` function you defined earlier, and specify a metric to be evaluated for the model by setting the `metrics` parameter to `accuracy`.

In [45]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

## Train and evaluate your model

Use the `Model.fit` method to adjust your model parameters and minimize the loss: 

In [46]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f225e899a90>

The `Model.evaluate` method checks the model's performance, usually on a "[Validation-set](https://developers.google.com/machine-learning/glossary#validation-set)" or "[Test-set](https://developers.google.com/machine-learning/glossary#test-set)".

In [47]:
model.evaluate(x_test,  y_test, verbose=2)

313/313 - 1s - loss: 0.0711 - accuracy: 0.9782 - 512ms/epoch - 2ms/step


[0.07109930366277695, 0.9782000184059143]
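The accuracy figure is the fraction of examples whose highest-scoring class matches the label. A minimal plain-Python sketch of that computation, using made-up logits and labels rather than the MNIST test set, could look like this:

```python
def accuracy(logit_rows, labels):
    # A prediction is correct when the largest logit sits at the label index.
    correct = 0
    for logits, label in zip(logit_rows, labels):
        predicted = logits.index(max(logits))
        if predicted == label:
            correct += 1
    return correct / len(labels)

# Made-up logits for three examples over three classes.
logit_rows = [[2.0, 0.1, -1.0],
              [0.3, 0.2, 1.5],
              [0.9, 1.1, 0.0]]
labels = [0, 2, 0]

acc = accuracy(logit_rows, labels)
print(acc)
```

Two of the three made-up predictions match their labels, so the accuracy is 2/3; softmax is not needed here because it does not change which logit is largest.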

The image classifier is now trained to ~98% accuracy on this dataset. To learn more, read the [TensorFlow tutorials](https://www.tensorflow.org/tutorials/).

If you want your model to return a probability, you can wrap the trained model, and attach the softmax to it:

In [48]:
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])

In [49]:
probability_model(x_test[:5])

<tf.Tensor: shape=(5, 10), dtype=float32, numpy=
array([[1.2262951e-08, 5.5873182e-09, 3.2069303e-05, 2.3566019e-04,
        1.0957610e-11, 8.6055032e-08, 7.1358230e-14, 9.9973029e-01,
        6.1915432e-07, 1.2540502e-06],
       [9.7944017e-09, 1.8437070e-04, 9.9974543e-01, 5.4759064e-05,
        4.4921193e-14, 2.6727346e-07, 3.8467618e-08, 5.0208125e-14,
        1.5167878e-05, 1.2717443e-12],
       [3.5975575e-07, 9.9912924e-01, 2.7252705e-04, 3.2698677e-05,
        2.6192691e-05, 5.1455363e-06, 1.8770234e-05, 1.7716298e-04,
        3.2339510e-04, 1.4557519e-05],
       [9.9991477e-01, 1.8184121e-08, 7.3918331e-05, 1.4835710e-07,
        2.3475707e-07, 2.0281620e-06, 4.6315595e-06, 6.7319769e-07,
        2.7045933e-08, 3.5114751e-06],
       [2.6022110e-06, 1.5341131e-08, 6.3212174e-06, 3.7769635e-07,
        9.9380022e-01, 1.2045654e-06, 5.1483453e-06, 5.9232429e-05,
        4.0267332e-06, 6.1208564e-03]], dtype=float32)>
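To turn these probabilities into predicted digits, take the index of the largest value in each row. The sketch below does this in plain Python on the five rows printed above.

```python
# Probabilities copied from the output of probability_model(x_test[:5]).
probabilities = [
    [1.2262951e-08, 5.5873182e-09, 3.2069303e-05, 2.3566019e-04, 1.0957610e-11,
     8.6055032e-08, 7.1358230e-14, 9.9973029e-01, 6.1915432e-07, 1.2540502e-06],
    [9.7944017e-09, 1.8437070e-04, 9.9974543e-01, 5.4759064e-05, 4.4921193e-14,
     2.6727346e-07, 3.8467618e-08, 5.0208125e-14, 1.5167878e-05, 1.2717443e-12],
    [3.5975575e-07, 9.9912924e-01, 2.7252705e-04, 3.2698677e-05, 2.6192691e-05,
     5.1455363e-06, 1.8770234e-05, 1.7716298e-04, 3.2339510e-04, 1.4557519e-05],
    [9.9991477e-01, 1.8184121e-08, 7.3918331e-05, 1.4835710e-07, 2.3475707e-07,
     2.0281620e-06, 4.6315595e-06, 6.7319769e-07, 2.7045933e-08, 3.5114751e-06],
    [2.6022110e-06, 1.5341131e-08, 6.3212174e-06, 3.7769635e-07, 9.9380022e-01,
     1.2045654e-06, 5.1483453e-06, 5.9232429e-05, 4.0267332e-06, 6.1208564e-03],
]

# The predicted digit is the position of the largest probability in each row.
predicted_digits = [row.index(max(row)) for row in probabilities]
print(predicted_digits)  # [7, 2, 1, 0, 4]
```

Each row is dominated by a single class with probability above 0.99, so the trained model is highly confident about all five test images.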