# **Keras Tutorial**

> Use the `Table of Contents` tab to navigate through this tutorial and activate the [line numbering option](https://stackoverflow.com/questions/49536407/show-code-line-numbers-in-jupyterlab) (in JupyterLab, go to the menu View > Show Line Numbers) to display the line numbers in cells and follow the commented explanations.

# Introduction

Deep learning is a major subfield of machine learning that studies algorithms inspired by the structure of the human brain. It is increasingly popular in fields such as robotics, artificial intelligence (AI), audio and video recognition, and image recognition. Artificial neural networks are the core of deep learning methodologies. Deep learning is supported by various libraries such as TensorFlow, MXNet and PyTorch. Keras is one of the most powerful and easy-to-use Python libraries for creating deep learning models.

Keras has a minimal structure that provides a clean and easy way to create deep learning models on top of TensorFlow or Theano. It is designed for defining deep learning models quickly, which makes it a strong choice for deep learning applications.

**Features**

Keras leverages various optimization techniques to make its high-level neural network API easier to use and more performant. It supports the following features:

- A consistent, simple and extensible API.
- Minimal structure: it is easy to achieve results without any frills.
- Support for multiple platforms and backends.
- A user-friendly framework that runs on both CPU and GPU.
- High scalability of computation.

**Benefits**

Keras is a powerful and dynamic framework that offers the following advantages:

- Large community support.
- Easy to test.
- Keras neural networks are written in Python, which makes things simpler.
- Keras supports both convolutional and recurrent networks.
- Deep learning models are discrete components that you can combine in many ways.

# Keras Framework

Keras provides a complete framework for creating any type of neural network. It is innovative as well as easy to learn, and it supports everything from simple neural networks to very large and complex models. This chapter covers the architecture of the Keras framework and how Keras helps in deep learning.

## Architecture of Keras

The core Keras API can be divided into three main categories:

- Model
- Layer
- Core Modules

In Keras, every ANN is represented by a **Keras Model**. In turn, every Keras Model is a composition of **Keras Layers**, which represent ANN layers such as input, hidden, output, convolution and pooling layers. Keras Models and Layers access **Keras modules** for activation functions, loss functions, regularization functions, and so on. Using Keras Models, Layers and modules, any ANN algorithm (CNN, RNN, etc.) can be represented in a simple and efficient manner.

The following diagram depicts the relationship between model, layer and core modules:

![](img/Picture1.jpg)

# Keras ecosystem

Source: [Keras ecosystem](https://keras.io/getting_started/ecosystem/)

The Keras project isn't limited to the **core Keras API** for building and training neural networks. It spans a wide range of related initiatives that cover every step of the machine learning workflow.

## KerasTuner

KerasTuner is an easy-to-use, scalable hyperparameter optimization framework that solves the pain points of hyperparameter search. Easily configure your search space with a define-by-run syntax, then leverage one of the available search algorithms to find the best hyperparameter values for your models. KerasTuner comes with Bayesian Optimization, Hyperband, and Random Search algorithms built-in, and is also designed to be easy for researchers to extend in order to experiment with new search algorithms.

## KerasNLP

KerasNLP is a natural language processing library that supports users through their entire development cycle. Our workflows are built from modular components that have state-of-the-art preset weights and architectures when used out-of-the-box and are easily customizable when more control is needed. We emphasize in-graph computation for all workflows so that developers can expect easy productionization using the TensorFlow ecosystem.

## AutoKeras

AutoKeras is an AutoML system based on Keras. It is developed by DATA Lab at Texas A&M University. The goal of AutoKeras is to make machine learning accessible for everyone. It provides high-level end-to-end APIs such as ImageClassifier or TextClassifier to solve machine learning problems in a few lines, as well as flexible building blocks to perform architecture search.

## KerasCV

KerasCV is a repository of modular building blocks (layers, metrics, losses, data-augmentation) that applied computer vision engineers can leverage to quickly assemble production-grade, state-of-the-art training and inference pipelines for common use cases such as image classification, object detection, image segmentation, image data augmentation, etc.

KerasCV can be understood as a horizontal extension of the Keras API: the components are new first-party Keras objects (layers, metrics, etc) that are too specialized to be added to core Keras, but that receive the same level of polish and backwards compatibility guarantees as the rest of the Keras API and that are maintained by the Keras team itself (unlike TFAddons).

## TensorFlow Cloud
Managed by the Keras team at Google, TensorFlow Cloud is a set of utilities to help you run large-scale Keras training jobs on GCP with very little configuration effort. Running your experiments on 8 or more GPUs in the cloud should be as easy as calling model.fit().

## TensorFlow.js
TensorFlow.js is TensorFlow's JavaScript runtime, capable of running TensorFlow models in the browser or on a Node.js server, both for training and inference. It natively supports loading Keras models, including the ability to fine-tune or retrain your Keras models directly in the browser.

## TensorFlow Lite
TensorFlow Lite is a runtime for efficient on-device inference that has native support for Keras models. Deploy your models on Android, iOS, or on embedded devices.

## Model optimization toolkit
The TensorFlow Model Optimization Toolkit is a set of utilities to make your inference models faster, more memory-efficient, and more power-efficient, by performing post-training weight quantization and pruning-aware training. It has native support for Keras models, and its pruning API is built directly on top of the Keras API.

## TFX integration
TFX is an end-to-end platform for deploying and maintaining production machine learning pipelines. TFX has native support for Keras models.



Let us see the overview of Keras models, Keras layers and Keras modules.

# Keras Overview

## Installing Keras

To use Keras, you will need to have the TensorFlow package installed. [See detailed instructions](https://www.tensorflow.org/install).

Once TensorFlow is installed, just import Keras via:

In [None]:
import tensorflow as tf
from tensorflow import keras

# Check the major version numerically (plain string comparison is fragile)
assert int(tf.__version__.split(".")[0]) >= 2

# or print tensorflow version
tf.__version__

The Keras codebase is also available on GitHub at [keras-team/keras](https://github.com/keras-team/keras).

## Sequential Model

A **Sequential model** is a linear composition of Keras layers. It is simple and minimal, and it suits any network that can be written as a plain stack of layers with a single input and a single output.

A simple sequential model is as follows:

In [None]:
# TensorFlow ≥2.0 is required
import tensorflow as tf


from tensorflow import keras
from keras.models import Sequential

from keras.layers import Dense, Activation

model = Sequential()

model.add(Dense(units=512, 
                activation = 'relu', 
                input_shape = (784,)))

model.summary()


Where,

- **Line 6** imports the **Sequential** model from Keras models.
- **Line 8** imports the **Dense** layer and the **Activation** layer.
- **Line 10** creates a new sequential model using the **Sequential** API.
- **Lines 12-14** add a dense layer (Dense API) with the **relu** activation function.
- **Line 16** prints a summary of the model (i.e. the neural network).
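The parameter count reported by `model.summary()` can be checked by hand. As a minimal sketch in plain Python (the helper `dense_param_count` is a hypothetical name, not part of Keras), a dense layer holds one weight per input-unit pair plus one bias per unit:

```python
def dense_param_count(input_dim: int, units: int, use_bias: bool = True) -> int:
    """Trainable parameters of a fully connected layer: one weight per
    (input, unit) pair plus one bias per unit."""
    return input_dim * units + (units if use_bias else 0)


# The layer from the example above: 784 inputs feeding 512 units.
params = dense_param_count(784, 512)
print(params)  # 401920
```

The same arithmetic applies to every further `Dense` layer in a stack, with `input_dim` set to the number of units of the previous layer.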

> Note that **Sequential** is a subclass of the **Model** class, which can also be used directly to create customized models. Alternatively, we can use **sub-classing** or the Keras **Functional API** to create our own complex models. These options will be covered later.

Many Keras arguments can be passed positionally, **without** their keyword names. For example:

In [None]:
model = Sequential()

model.add(Dense(512, 'relu', input_shape = (784,)))

In this example, the keyword names `units` and `activation` are omitted from the `Dense` call. It is recommended to include the keyword names to improve code readability.

## Dense Layer

Each Keras `layer` in a Keras `model` represents the corresponding layer (input layer, hidden layer or output layer) in the proposed neural network. Keras provides many pre-built layers so that complex neural networks can be created easily. Some of the important Keras layers are:

- Core Layers (including the Dense layer)
- Convolution Layers
- Pooling Layers
- Recurrent Layers
- etc.

A simple Python example that builds a neural network using the **Sequential** model and the Dense layer is given below:

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout 

num_classes =2

model = Sequential()
model.add(Dense(units=512, activation = 'relu', input_shape = (784,)))
model.add(Dropout(0.2))
model.add(Dense(units=512, activation = 'relu')) 
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation = 'softmax'))


Where,

- **Line 1** imports the **Sequential** model from Keras models.
- **Line 2** imports the **Dense**, **Activation** and **Dropout** layers.
- **Line 6** creates a new sequential model using the **Sequential** API.
- **Line 7** adds a dense layer (Dense API) with the **relu** activation function.
- **Line 8** adds a dropout layer (Dropout API) to handle over-fitting.
- **Line 9** adds another dense layer (Dense API) with the **relu** activation function.
- **Line 10** adds another dropout layer (Dropout API) to handle over-fitting.
- **Line 11** adds the final dense layer (Dense API) with the **softmax** activation function.
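To make the role of the dropout layers more concrete, here is a minimal sketch in plain Python of what dropout does to a list of activations during training. The function `dropout` below is a hypothetical stand-in, not the Keras implementation; it assumes the usual "inverted dropout" convention, in which each value is zeroed with probability `rate` and the surviving values are scaled by `1 / (1 - rate)`.

```python
import random


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate` and
    scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)          # fixed seed so the run is repeatable
activations = [1.0] * 10
dropped = dropout(activations, rate=0.2, rng=rng)
print(dropped)                  # each entry is either 0.0 or about 1.25
```

At inference time no values are dropped, which is why the scaling is applied during training instead.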

Keras also provides options to create our own customized layers. A customized layer can be created by sub-classing the **keras.layers.Layer** class, which is similar to sub-classing Keras models.

**Core Modules**

Keras also provides many built-in neural-network functions for creating Keras models and layers. Some of them are as follows:



- **Initializers** − Provides a list of initializer functions, used during the model creation phase. They are covered in detail in the _Keras Layers_ chapter.
- **Regularizers** − Provides a list of regularizer functions such as the L1 regularizer and the L2 regularizer.
- **Constraints** − Provides a list of constraint functions. They are covered in detail in the _Keras Layers_ chapter.
- **Activations** − Provides a list of activation functions. They are covered in detail in the _Keras Layers_ chapter.
- **Losses** − Provides a list of loss functions such as mean\_squared\_error, mean\_absolute\_error and poisson.
- **Metrics** − Provides a list of metric functions. They are covered in detail in the _Model Training_ chapter.
- **Optimizers** − Provides a list of optimizer functions such as adam and sgd.
- **Callbacks** − Provides a list of callback functions. They can be used during training to print intermediate data as well as to stop the training itself (the **EarlyStopping** callback) based on some condition.
- **Utilities** − Provides many utility functions useful in deep learning.
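The idea behind stopping training "based on some condition" can be sketched without Keras. The function below is a simplified, hypothetical illustration of patience-based early stopping; the real **EarlyStopping** callback offers more options (for example `min_delta` and `restore_best_weights`) and this sketch does not reproduce them.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop, or None
    if the loss kept improving often enough to finish all epochs."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss          # new best value: reset the counter
            waited = 0
        else:
            waited += 1          # no improvement this epoch
            if waited >= patience:
                return epoch
    return None


history = [0.90, 0.70, 0.65, 0.66, 0.67, 0.60]
print(early_stopping_epoch(history, patience=2))  # 4
```

With `patience=2`, training stops after two consecutive epochs without improvement, so the later value 0.60 is never reached.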

For a complete list of all modules, please check the [Keras API reference](https://keras.io/api/).

# Keras - Modules

As we learned earlier, Keras provides the `Model` and `Layer` classes along with modules of pre-defined classes, functions and variables for building and designing deep learning algorithms.

Let us first see the list of modules available in Keras as of today:

**Callbacks API**
- Usage of callbacks via the built-in fit() loop
- Using custom callbacks
- Available callbacks

**Optimizers**
- Usage with compile() & fit()
- Usage in a custom training loop
- Learning rate decay / scheduling
- Available optimizers
- Core Optimizer API
    - apply_gradients method
    - variables method

**Metrics**
- Accuracy metrics
- Probabilistic metrics
- Regression metrics
- Classification metrics based on True/False positives & negatives
- Image segmentation metrics
- Hinge metrics for "maximum-margin" classification
- Usage with compile() & fit()
- Standalone usage
- Creating custom metrics
- The add_metric() API

**Losses**
- Probabilistic losses
- Regression losses
- Hinge losses for "maximum-margin" classification
- Usage of losses with compile() & fit()
- Standalone usage of losses
- Creating custom losses
- The add_loss() API


**Data loading**
- Available dataset loading utilities
- Image data loading
- Timeseries data loading
- Text data loading
- Audio data loading


**Built-in small datasets**
- MNIST digits classification dataset
- CIFAR10 small images classification dataset
- CIFAR100 small images classification dataset
- IMDB movie review sentiment classification dataset
- Reuters newswire classification dataset
- Fashion MNIST dataset, an alternative to MNIST
- Boston Housing price regression dataset


**Utilities**
- Model plotting utilities
- Serialization utilities
- Python & NumPy utilities
- Backend utilities

**KerasTuner API**
- HyperParameters
- Tuners
- Oracles
- HyperModels


**KerasCV API**
- Layers
- Metrics
- Models
- Bounding box formats and utilities


**KerasNLP**
- Models
- Tokenizers
- Preprocessing Layers
- Modeling Layers
- Metrics
- Utils

Let us look at the **utils** module to learn some of its useful functions.

# `utils` module

**utils** provides useful utility functions for deep learning. Some of the functions provided by the **utils** module are as follows:

## to_categorical

It is used to convert a class vector (integers) to a binary class matrix, e.g. for use with categorical_crossentropy.

In [None]:
from tensorflow.keras.utils import to_categorical

labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
to_categorical(labels)
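The conversion itself is simple enough to sketch in plain Python. The helper `one_hot` below is a hypothetical illustration of the idea, not the Keras implementation; like `to_categorical`, it infers the number of classes from the largest label when none is given.

```python
def one_hot(labels, num_classes=None):
    """Convert integer class labels to a list of one-hot rows."""
    if num_classes is None:
        num_classes = max(labels) + 1
    return [[1.0 if j == label else 0.0 for j in range(num_classes)]
            for label in labels]


matrix = one_hot([0, 2, 1])
for row in matrix:
    print(row)
# [1.0, 0.0, 0.0]
# [0.0, 0.0, 1.0]
# [0.0, 1.0, 0.0]
```

Each row contains a single 1.0 at the index of its class, which is the format expected by categorical cross-entropy losses.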

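## normalize

It is used to normalize an array; by default it applies the L2 norm along the last axis.

In [None]: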
from tensorflow.keras.utils import normalize

normalize([1, 2, 3, 4, 5])
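What the call above computes for a single vector can be sketched in plain Python, assuming the default L2 norm. The helper `l2_normalize` is a hypothetical name used only for this illustration.

```python
import math


def l2_normalize(vector):
    """Divide a vector by its Euclidean (L2) norm; leave zero vectors alone."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0:
        return list(vector)
    return [v / norm for v in vector]


unit = l2_normalize([1, 2, 3, 4, 5])
print(unit)
print(sum(v * v for v in unit))  # approximately 1.0
```

After normalization the squared entries sum to one, so the vector has unit length while keeping its direction.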

## plot_model

It is used to create the model representation in dot format and save it to file.

In [None]:
from keras.utils import plot_model

plot_model(model,to_file = 'image.png')

model = Sequential()
model.add(Dense(units=512, activation = 'relu', input_shape = (784,)))

model.add(Dropout(0.2))
model.add(Dense(units=512, activation = 'relu')) 
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation = 'softmax'))


dot_img_file = '/tmp/model_1.png'
tf.keras.utils.plot_model(model, to_file=dot_img_file, show_shapes=True)


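**plot\_model** generates an image of the model's architecture, which helps in understanding its structure. With `show_shapes=True`, the image also shows the shape of each layer's input and output.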

## set_random_seed function


Sets all random seeds for the program (Python, NumPy, and TensorFlow).

You can use this utility to make almost any Keras program fully deterministic. Some limitations apply in cases where network communications are involved (e.g. parameter server distribution), which creates additional sources of randomness, or when certain non-deterministic cuDNN ops are involved.

Calling this utility is equivalent to the following:

In [None]:
import random
import numpy as np
import tensorflow as tf

seed=12345
random.seed(seed)

np.random.seed(seed)
tf.random.set_seed(seed)
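Why seeding matters can be shown with Python's `random` module alone, which is one of the three generators seeded above. This is only a sketch of the principle; the helper `draw` is a hypothetical name, and full determinism in Keras also depends on the NumPy and TensorFlow seeds.

```python
import random


def draw(seed, count=3):
    """Seed Python's generator, then draw `count` pseudo-random floats."""
    random.seed(seed)
    return [random.random() for _ in range(count)]


first = draw(12345)
second = draw(12345)
print(first == second)          # True: same seed, same sequence
print(first == draw(54321))     # False: different seed
```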


## split_dataset function

Split a dataset into a left half and a right half (e.g. train / test).

**Arguments**

* dataset: A tf.data.Dataset object, or a list/tuple of arrays with the same length.
* left_size: If float (in the range [0, 1]), it signifies the fraction of the data to pack in the left dataset. If integer, it signifies the number of samples to pack in the left dataset. If None, it defaults to the complement to right_size.
* right_size: If float (in the range [0, 1]), it signifies the fraction of the data to pack in the right dataset. If integer, it signifies the number of samples to pack in the right dataset. If None, it defaults to the complement to left_size.
* shuffle: Boolean, whether to shuffle the data before splitting it.
* seed: A random seed for shuffling.

**Returns**

A tuple of two tf.data.Dataset objects: the left and right splits.

**Example**

In [None]:
data = np.random.random(size=(1000, 4))

left_ds, right_ds = tf.keras.utils.split_dataset(data, left_size=0.8)
int(left_ds.cardinality())

int(right_ds.cardinality())
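How `left_size` translates into sample counts can be sketched in plain Python. The helper `split_sizes` is hypothetical, and the rounding of fractional sizes is an assumption made for this illustration; the exact rule used by Keras may differ for sizes that do not divide evenly.

```python
def split_sizes(total, left_size):
    """Return (left, right) sample counts for a float fraction or an int."""
    if isinstance(left_size, float):
        left = round(total * left_size)   # assumed rounding rule
    else:
        left = left_size                  # integer: number of samples
    return left, total - left


print(split_sizes(1000, 0.8))   # (800, 200)
print(split_sizes(1000, 750))   # (750, 250)
```

For the example above, 1000 samples with `left_size=0.8` give 800 samples on the left and 200 on the right.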


# Layers

As learned earlier, Keras layers are the primary building blocks of Keras models. Each layer receives input information, does some computation and finally outputs the transformed information. The output of one layer flows into the next layer as its input. Let us learn the details about layers in this chapter.



| **No** | **Layer** | **Description** |
| --- | --- | --- |
| 1 | [Dense Layer](https://www.tutorialspoint.com/keras/keras_dense_layer.htm) | The **Dense** layer is the regular, densely connected neural network layer. |
| 2 | [Dropout Layers](https://www.tutorialspoint.com/keras/keras_dropout_layers.htm) | **Dropout** is an important technique in machine learning, used to handle over-fitting. |
| 3 | [Flatten Layers](https://www.tutorialspoint.com/keras/keras_flatten_layers.htm) | **Flatten** is used to flatten the input. |
| 4 | [Reshape Layers](https://www.tutorialspoint.com/keras/keras_reshape_layers.htm) | **Reshape** is used to change the shape of the input. |
| 5 | [Permute Layers](https://www.tutorialspoint.com/keras/keras_permute_layers.htm) | **Permute** is also used to change the shape of the input, using a pattern. |
| 6 | [RepeatVector Layers](https://www.tutorialspoint.com/keras/keras_repeatvector_layers.htm) | **RepeatVector** is used to repeat the input a set number of times, n. |
| 7 | [Lambda Layers](https://www.tutorialspoint.com/keras/keras_lambda_layers.htm) | **Lambda** is used to transform the input data using an expression or function. |
| 8 | [Convolution Layers](https://www.tutorialspoint.com/keras/keras_convolution_layers.htm) | Keras contains many layers for creating convolution-based ANNs, popularly called _Convolutional Neural Networks (CNN)_. |
| 9 | [Pooling Layer](https://www.tutorialspoint.com/keras/keras_pooling_layer.htm) | It is used to perform max pooling operations on temporal data. |
| 10 | [Locally connected layer](https://www.tutorialspoint.com/keras/keras_locally_connected_layer.htm) | Locally connected layers are similar to the Conv1D layer, with the difference that Conv1D weights are shared whereas here the weights are unshared. |
| 11 | [Merge Layer](https://www.tutorialspoint.com/keras/keras_merge_layer.htm) | It is used to merge a list of inputs. |
| 12 | [Embedding Layer](https://www.tutorialspoint.com/keras/keras_embedding_layer.htm) | It performs embedding operations in the input layer. |

## The `Layer` class

Layers are the basic building blocks of neural networks in Keras. A layer consists of a tensor-in tensor-out computation function (the layer's call method) and some state, held in TensorFlow variables (the layer's weights). 

The Layer class has the following signature:


```Python
tf.keras.layers.Layer(
    trainable=True, name=None, dtype=None, dynamic=False, **kwargs
)
```

This is the class from which all layers inherit.

A layer is a callable object that takes as input one or more tensors and that outputs one or more tensors.


Users will just instantiate a layer and then treat it as a callable like a function.

**Arguments**

- trainable: Boolean, whether the layer's variables should be trainable.
- name: String name of the layer.
- dtype: The dtype of the layer's computations and weights. Can also be a tf.keras.mixed_precision.Policy, which allows the computation and weight dtype to differ. Default of None means to use tf.keras.mixed_precision.global_policy(), which is a float32 policy unless set to different value.
- dynamic: Set this to True if your layer should only be run eagerly, and should not be used to generate a static computation graph. This would be the case for a Tree-RNN or a recursive network, for example, or generally for any layer that manipulates tensors using Python control flow. If False, we assume that the layer can safely be used to generate a static computation graph.


**Attributes**

- name: The name of the layer (string).
- dtype: The dtype of the layer's weights.
- variable_dtype: Alias of dtype.
- compute_dtype: The dtype of the layer's computations. Layers automatically cast inputs to this dtype which causes the computations and output to also be in this dtype. When mixed precision is used with a tf.keras.mixed_precision.Policy, this will be different than variable_dtype.
- dtype_policy: The layer's dtype policy. See the tf.keras.mixed_precision.Policy documentation for details.
- trainable_weights: List of variables to be included in backprop.
- non_trainable_weights: List of variables that should not be included in backprop.
- weights: The concatenation of the lists trainable_weights and non_trainable_weights (in this order).
- trainable: Whether the layer should be trained (boolean), i.e. whether its potentially-trainable weights should be returned as part of layer.trainable_weights.

In this example, we see a customized implementation of a fully connected layer with a linear activation function.

In [None]:
from tensorflow.keras.layers import Layer


class SimpleDense(Layer):

    def __init__(self, units=6):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):  # Create the state of the layer (weights)
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'),
            trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):  # Defines the computation from inputs to outputs
        return tf.matmul(inputs, self.w) + self.b

Let's test this layer:

In [None]:
# Instantiates the layer.
linear_layer = SimpleDense(6)

# This will also call `build(input_shape)` and create the weights.
y = linear_layer(tf.ones((2, 2)))
assert len(linear_layer.weights) == 2

# These weights are trainable, so they're listed in `trainable_weights`:
assert len(linear_layer.trainable_weights) == 2

In [None]:
my_sum = SimpleDense(6)

x = tf.ones((2, 2))

y = my_sum(x)
print(y.numpy()) 
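The computation performed by `call()`, a matrix product followed by a bias addition, can be written out in plain Python to show where the output shape comes from. The function `dense_forward` and the weight values below are hypothetical and chosen only for illustration (three units instead of six, to keep the example short).

```python
def dense_forward(inputs, weights, bias):
    """Compute inputs @ weights + bias for lists of rows."""
    outputs = []
    for row in inputs:
        out_row = []
        for j in range(len(bias)):
            total = sum(row[i] * weights[i][j] for i in range(len(row)))
            out_row.append(total + bias[j])
        outputs.append(out_row)
    return outputs


x = [[1.0, 1.0], [1.0, 1.0]]                 # like tf.ones((2, 2))
w = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]       # hypothetical weights, shape (2, 3)
b = [0.0, 0.0, 0.0]                          # one bias per unit
y = dense_forward(x, w, b)
print(len(y), len(y[0]))  # 2 3
```

An input of shape (batch, features) multiplied by weights of shape (features, units) always yields an output of shape (batch, units).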


A layer involves computation, defined in the call() method, and a state (weight variables). State can be created in various places, at the convenience of the subclass implementer:

- in __init__();
- in the optional build() method, which is invoked by the first __call__() to the layer, and supplies the shape(s) of the input(s), which may not have been known at initialization time;
- in the first invocation of call(), with some caveats discussed below.

Layers are recursively composable: If you assign a Layer instance as an attribute of another Layer, the outer layer will start tracking the weights created by the inner layer. Nested layers should be instantiated in the __init__() method.
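The deferred creation of state in `build()` can be illustrated with a toy class in plain Python. `ToyLayer` is a hypothetical stand-in written for this sketch; it is not a Keras class and only mimics the "build on first call" behavior described above.

```python
class ToyLayer:
    """Stand-in for a Keras layer; weights are created on first call."""

    def __init__(self, units):
        self.units = units
        self.built = False
        self.weights = []

    def build(self, input_dim):
        # State depends on the input size, unknown until the first call.
        self.kernel = [[0.0] * self.units for _ in range(input_dim)]
        self.bias = [0.0] * self.units
        self.weights = [self.kernel, self.bias]
        self.built = True

    def __call__(self, inputs):
        if not self.built:
            self.build(len(inputs[0]))
        return self.call(inputs)

    def call(self, inputs):
        return [[sum(x * self.kernel[i][j] for i, x in enumerate(row)) + self.bias[j]
                 for j in range(self.units)] for row in inputs]


layer = ToyLayer(units=6)
print(len(layer.weights))        # 0: nothing built yet
outputs = layer([[1.0, 1.0]])
print(len(layer.weights))        # 2: kernel and bias now exist
print(len(outputs[0]))           # 6
```

Deferring weight creation this way lets the layer be instantiated before the input size is known.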


The same behavior is available from the built-in Dense layer, which is a Keras core layer, by instantiating it with a linear activation:

In [None]:
from keras.layers import Dense

MySimpleDense = Dense(units=6, activation='linear')

Let's review a couple of core layers.

## Core layers

Because layers are recursively composable, the following core layers can be nested:
- Input object
- Dense layer
- Activation layer
- Embedding layer
- Masking layer
- Lambda layer

For example,

In [None]:
from keras.layers import Input

model = Sequential()

model.add(Input(shape=(32,)))
model.add(Dense(units=512, 
                activation = 'softmax'))

This model can be rewritten as:

In [None]:
model = Sequential()

model.add(Dense(units=512, 
                activation = 'softmax', 
                input_shape = (32,)))

A similar model can be written with the Functional API (discussed later):

In [None]:
from keras.layers import Input
from keras import Model

# a multinomial (softmax) logistic regression in Keras
x = Input(shape=(32,))
y = Dense(16, activation='softmax')(x)

model = Model(x, y)
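The `softmax` activation used in these examples turns a vector of scores into probabilities. A minimal sketch in plain Python (the function name `softmax` and the input scores are chosen for this illustration only) looks like this:

```python
import math


def softmax(logits):
    """Map real-valued scores to probabilities that sum to 1."""
    shift = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))  # approximately 1.0
```

The largest score receives the largest probability, which is why softmax is the usual choice for the output layer of a classifier.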

## Defining Layers

A Keras layer requires the **shape of the input** (```input_shape```) to understand the structure of the input data, an **initializer** to set the initial weights, and finally an activation function to transform the output and make it non-linear.

In between, constraints restrict the range in which the weights may lie, and regularizers try to optimize the layer (and the model) by applying penalties on the weights during the optimization process.
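Both mechanisms can be sketched for a single weight vector in plain Python. The helpers `l2_penalty` and `max_norm` are hypothetical names for this illustration: the first computes an L2 regularization term, the second rescales a vector whose norm exceeds a limit. The default values are assumptions, and Keras applies its constraint per unit along an axis of the weight matrix rather than to one flat vector.

```python
import math


def l2_penalty(weights, factor=0.01):
    """L2 regularization term: factor * sum of squared weights."""
    return factor * sum(w * w for w in weights)


def max_norm(weights, max_value=2.0):
    """Rescale the weight vector if its L2 norm exceeds max_value."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_value:
        return list(weights)
    return [w * max_value / norm for w in weights]


w = [3.0, 4.0]                      # L2 norm is 5.0
print(l2_penalty(w))                # 0.25
print(max_norm(w))                  # rescaled so the norm is about 2.0
```

The penalty is added to the loss, so large weights make the loss worse, while the constraint acts directly on the weights after each update.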

To summarise, a Keras layer is described by the following details:

- Shape of the input data
- Number of neurons / units in the layer
- Initializers
- Regularizers
- Constraints
- Activations

The next sections explain each of these concepts. Before that, let us create a simple Keras layer using the Sequential model API to get an idea of how Keras models and layers work.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense

from keras import initializers
from keras import regularizers
from keras import constraints

model =Sequential()
model.add(Dense(units=32, 
                input_shape=(16,), 
                kernel_initializer ='he_uniform',
                kernel_regularizer =None, 
                kernel_constraint ='MaxNorm', activation ='relu'))

model.add(Dense(16, activation ='relu'))
model.add(Dense(8))



where,

- **Lines 1-6** import the necessary modules.
- **Line 8** creates a new model using the Sequential API.
- **Lines 9-13** create a new **Dense** layer and add it to the model. **Dense** is an entry-level layer provided by Keras, which accepts the number of neurons or units (32) as its required parameter. If the layer is the first layer, we need to provide the **input shape, (16,)**, as well. Otherwise, the output of the previous layer is used as the input of the next layer. All other parameters are optional.
  - **units** is the first parameter and represents the number of units (neurons).
  - **input\_shape** represents the shape of the input data.
  - **kernel\_initializer** represents the initializer to be used. The **he\_uniform** function is set as its value.
  - **kernel\_regularizer** represents the **regularizer** to be used. None is set as its value.
  - **kernel\_constraint** represents the constraint to be used. The **MaxNorm** function is set as its value.
  - **activation** represents the activation to be used. The relu function is set as its value.
- **Line 15** creates a second **Dense** layer with 16 units and sets **relu** as the activation function.
- **Line 16** creates the final Dense layer with 8 units.

Let us now look at the basic concepts behind a layer and how Keras supports each of them.

## Input shape

In machine learning, all types of input data, such as text, images or videos, are first converted into arrays of numbers and then fed into the algorithm. The input may be a one-dimensional array, a two-dimensional array (matrix) or a multi-dimensional array. We can specify the dimensional information using **shape**, a tuple of integers. For example, **(4,2)** represents a matrix with four rows and two columns.

In [None]:
import numpy as np
shape =(4,2)
input = np.zeros(shape)
print(input)

Similarly, **(3,4,2)** represents a three-dimensional array holding three 4x2 matrices (four rows and two columns each).

In [None]:
import numpy as np

shape =(3,4,2)
input = np.zeros(shape)
print(input)
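How a shape tuple maps onto nested data can also be seen without NumPy. The helpers `zeros` and `shape_of` below are hypothetical names for this sketch; they build a nested list from a shape and recover the shape from a nested list.

```python
def zeros(shape):
    """Build a nested list of zeros with the given shape."""
    if len(shape) == 1:
        return [0.0] * shape[0]
    return [zeros(shape[1:]) for _ in range(shape[0])]


def shape_of(nested):
    """Recover the shape tuple of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


batch = zeros((3, 4, 2))
print(shape_of(batch))                    # (3, 4, 2)
print(len(batch[0]), len(batch[0][0]))    # 4 2: each block has 4 rows, 2 columns
```

The first entry of the shape counts the outermost collection, and each following entry describes one level deeper.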

To create the first layer of the model (the input layer), the shape of the input data should be specified.

## Input Layer

## Layer weight initializers

Initializers define the way to set the initial random weights of Keras layers. The keyword arguments used for passing initializers to layers depend on the layer.

The **initializers** module provides different functions to set these initial weights. Usually, the arguments are simply `kernel_initializer` and `bias_initializer`:

In [None]:
from tensorflow.keras import layers
from tensorflow.keras import initializers

layer = layers.Dense(
    units=64,
    kernel_initializer=initializers.RandomNormal(stddev=0.01),
    bias_initializer=initializers.Zeros()
)

All built-in initializers can also be passed via their string identifier:

In [None]:
layer = layers.Dense(
    units=64,
    kernel_initializer='random_normal',
    bias_initializer='zeros'
)

The following built-in initializers are available as part of the `tf.keras.initializers` module:

### Zeros

Generates  **0**  for all weights.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

my_init = initializers.Zeros()
model =Sequential()

model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_initializer = my_init))

where **kernel\_initializer** represents the initializer for the kernel (weight matrix) of the layer.

### Ones

Generates **1** for all weights.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

my_init = initializers.Ones()

model = Sequential()
model.add(Dense(512, activation='relu',
                input_shape=(784,),
                kernel_initializer=my_init))

### Constant

Generates a constant value (say, **5**) specified by the user for all weights.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

my_init = initializers.Constant(value=5)

model = Sequential()
model.add(Dense(512, activation='relu',
                input_shape=(784,),
                kernel_initializer=my_init))

where **value** represents the constant value.

### RandomNormal

Generates weights using a normal distribution.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

my_init = initializers.RandomNormal(mean=0.0,
                                    stddev=0.05,
                                    seed=None)

model = Sequential()
model.add(Dense(512, activation='relu',
                input_shape=(784,),
                kernel_initializer=my_init))

where,

- **mean** represents the mean of the random values to generate
- **stddev** represents the standard deviation of the random values to generate
- **seed** represents the value used to seed the random number generator

### RandomUniform

Generates weights using a uniform distribution.

In [None]:
from keras import initializers

my_init = initializers.RandomUniform(minval=-0.05,
                                     maxval=0.05,
                                     seed=None)

model = Sequential()
model.add(Dense(512, activation='relu',
                input_shape=(784,),
                kernel_initializer=my_init))

where,

- **minval** represents the lower bound of the random values to generate
- **maxval** represents the upper bound of the random values to generate

### TruncatedNormal

Generates layer weights from a truncated normal distribution: values more than two standard deviations from the mean are discarded and redrawn.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

my_init = initializers.TruncatedNormal(mean =0.0, 
                                       stddev =0.05, 
                                       seed =None)

model.add(Dense(512, 
                activation ='relu', 
                input_shape =(784,),
                kernel_initializer = my_init))

### VarianceScaling

Generates layer weights whose spread adapts to the input and output shape of the layer, together with the specified scale.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

my_init = initializers.VarianceScaling(scale =1.0, 
                                       mode ='fan_in', 
                                       distribution ='normal', 
                                       seed =None)

model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_initializer = my_init))

where,

- **scale** represents the scaling factor
- **mode** is one of **fan\_in**, **fan\_out** or **fan\_avg**
- **distribution** is either **normal** or **uniform**

For the normal distribution, VarianceScaling computes **stddev** with the formula below and then draws the weights from a normal distribution with that standard deviation:

```Python 
stddev = sqrt(scale/n)
```

where **n** represents:

- the number of input units for mode = fan\_in
- the number of output units for mode = fan\_out
- the average of the number of input and output units for mode = fan\_avg

Similarly, for the uniform distribution it computes a _limit_ with the formula below and then draws the weights uniformly from [-limit, limit]:

```Python 
limit = sqrt(3 * scale / n)
```
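
The two formulas above can be checked by hand. The following is a minimal sketch in plain Python, not the Keras implementation (which may apply further corrections, for example for truncation); the helper name `variance_scaling_params` is hypothetical and the layer sizes 784 and 512 are taken from the examples in this section.

```python
import math

def variance_scaling_params(scale, mode, fan_in, fan_out):
    """Return (stddev, limit) following the two formulas quoted above."""
    if mode == "fan_in":
        n = fan_in
    elif mode == "fan_out":
        n = fan_out
    elif mode == "fan_avg":
        n = (fan_in + fan_out) / 2
    else:
        raise ValueError(f"unknown mode: {mode}")
    stddev = math.sqrt(scale / n)       # used when distribution='normal'
    limit = math.sqrt(3 * scale / n)    # used when distribution='uniform'
    return stddev, limit

# Dense(512) on 784 inputs, as in the examples of this section
stddev, limit = variance_scaling_params(1.0, "fan_in", fan_in=784, fan_out=512)
print(stddev, limit)
```

With `fan_in` mode only the number of inputs matters, so the number of units in the layer does not change the result.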

### lecun_normal

Generates layer weights from the LeCun normal distribution.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

my_init = initializers.lecun_normal(seed=None)

model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_initializer = my_init))

It computes **stddev** with the formula below and then draws the weights from a normal distribution with that standard deviation:

```Python 
stddev = sqrt(1 / fan_in)
```

where **fan\_in** represents the number of input units.

### lecun\_uniform

Generates layer weights from the LeCun uniform distribution.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

my_init = initializers.lecun_uniform(seed =None)

model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_initializer = my_init))

It computes **limit** with the formula below and then draws the weights uniformly from [-limit, limit]:

```Python 
limit = sqrt(3 / fan_in)
```

where,

- **fan\_in** represents the number of input units
- **fan\_out** represents the number of output units (not used by this formula)
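
As a rough illustration of the LeCun uniform rule, the sketch below draws weights with Python's standard `random` module instead of Keras. The value `fan_in = 784` and the sample count are assumptions chosen to match the examples in this section.

```python
import math
import random

fan_in = 784                      # number of input units (assumed, as in the examples)
limit = math.sqrt(3 / fan_in)     # limit = sqrt(3 / fan_in)

rng = random.Random(0)            # fixed seed so the draw is reproducible
weights = [rng.uniform(-limit, limit) for _ in range(1000)]

print(limit, min(weights), max(weights))
```

Every sampled weight lies inside [-limit, limit], and the interval shrinks as the number of input units grows.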

## Layer weight constraints


Classes from the `tf.keras.constraints` module allow setting constraints (e.g. non-negativity) on model parameters during training. They are per-variable projection functions applied to the target variable after each gradient update (when using `fit()`).

The exact API will depend on the layer, but the layers Dense, Conv1D, Conv2D and Conv3D have a unified API.

These layers expose two keyword arguments:

- `kernel_constraint` for the main weights matrix
- `bias_constraint` for the bias.



In [None]:
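from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.constraints import max_norm

model = Sequential()
model.add(Dense(64, kernel_constraint=max_norm(2.)))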

### NonNeg

Constrains weights to be non-negative.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import constraints

my_constrain = constraints.NonNeg()

model = Sequential()
model.add(Dense(512, activation='relu',
                input_shape=(784,),
                kernel_constraint=my_constrain))

where **kernel\_constraint** represents the constraint applied to the kernel of the layer.

### UnitNorm

Constrains the weights to have unit norm.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import constraints

my_constrain = constraints.UnitNorm(axis =0)

model =Sequential()

model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_constraint = my_constrain))

### MaxNorm

Constrains the weights to have a norm less than or equal to the given value.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import constraints

my_constrain = constraints.MaxNorm(max_value =2, axis =0)

model =Sequential()
model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_constraint = my_constrain))

where,

- **max\_value** represents the upper bound
- **axis** represents the dimension along which the constraint is applied, e.g. for shape (2,3,4), axis 0 denotes the first dimension, 1 the second dimension and 2 the third dimension
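
To make the effect of the constraint concrete, here is a minimal NumPy sketch of a max-norm projection. It is an illustration under stated assumptions rather than the Keras source: the function name `max_norm_project` and the small `eps` guard against division by zero are hypothetical, and the kernel shape (784, 512) mirrors the example above.

```python
import numpy as np

def max_norm_project(w, max_value=2.0, axis=0, eps=1e-7):
    """Rescale slices of w along `axis` whose L2 norm exceeds max_value."""
    norms = np.sqrt(np.sum(np.square(w), axis=axis, keepdims=True))
    desired = np.clip(norms, 0.0, max_value)   # norms above max_value are capped
    return w * (desired / (eps + norms))

rng = np.random.default_rng(0)
w = rng.normal(size=(784, 512))                # kernel of Dense(512) on 784 inputs
w_proj = max_norm_project(w, max_value=2.0, axis=0)

col_norms = np.linalg.norm(w_proj, axis=0)     # one norm per output unit
print(col_norms.max())
```

With `axis=0` each column, i.e. the incoming weight vector of one unit, is constrained independently.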

### MinMaxNorm

Constrains the weights to have a norm between the specified minimum and maximum values.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import constraints

my_constrain = constraints.MinMaxNorm(min_value =0.0, max_value =1.0, rate =1.0, axis =0)

model =Sequential()

model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_constraint = my_constrain))

where **rate** represents the rate at which the weight constraint is enforced.
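
The role of **rate** can be sketched in NumPy as follows. This is an assumption-laden illustration, not the library code: the helper `min_max_norm_project`, the `eps` guard and the test matrix are made up for the example, and the blending formula (a weighted mix of the clipped norm and the current norm) is the one commonly documented for this constraint.

```python
import numpy as np

def min_max_norm_project(w, min_value=0.0, max_value=1.0, rate=1.0, axis=0, eps=1e-7):
    """Move the norms of slices along `axis` towards [min_value, max_value]."""
    norms = np.sqrt(np.sum(np.square(w), axis=axis, keepdims=True))
    # rate=1.0 enforces the bounds strictly; rate<1.0 only moves part of the way
    desired = rate * np.clip(norms, min_value, max_value) + (1.0 - rate) * norms
    return w * (desired / (eps + norms))

rng = np.random.default_rng(0)
# three columns with very small, medium and large norms
w = rng.normal(size=(50, 3)) * np.array([0.01, 0.1, 1.0])

full = min_max_norm_project(w, min_value=0.5, max_value=1.0, rate=1.0)
full_norms = np.linalg.norm(full, axis=0)
print(full_norms)
```

With `rate=1.0` the first column is scaled up to the minimum norm and the last column is scaled down to the maximum norm.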

## Layer weight regularizers

Regularizers allow you to apply penalties on layer parameters or layer activity during optimization. These penalties are summed into the loss function that the network optimizes.

Regularization penalties are applied on a per-layer basis. The exact API will depend on the layer, but many layers (e.g. Dense, Conv1D, Conv2D and Conv3D) have a unified API.

These layers expose 3 keyword arguments:

- `kernel_regularizer`: Regularizer to apply a penalty on the layer's kernel
- `bias_regularizer`: Regularizer to apply a penalty on the layer's bias
- `activity_regularizer`: Regularizer to apply a penalty on the layer's output

### L1 Regularizer

It provides L1 based regularization.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

from keras import regularizers

# A regularizer that applies a L1 regularization penalty.
my_regularizer = regularizers.l1(0.1)

model = Sequential()
model.add(Dense(units = 512, activation = 'relu', 
                input_shape = (784,),
                kernel_regularizer = my_regularizer))

where **kernel\_regularizer** represents the regularizer whose penalty is applied to the kernel of the layer.

The L1 regularization penalty is computed as: `loss = l1 * reduce_sum(abs(x))`

L1 may be passed to a layer as a string identifier:

In [None]:
import tensorflow as tf

dense = tf.keras.layers.Dense(3, kernel_regularizer='l1')

In this case, the default value used is `l1=0.01`.

### L2 Regularizer

A regularizer that applies a L2 regularization penalty.

The L2 regularization penalty is computed as: `loss = l2 * reduce_sum(square(x))`

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import initializers

from keras import regularizers

my_regularizer = regularizers.l2(0.1)

model =Sequential()
model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_regularizer = my_regularizer))

L2 may be passed to a layer as a string identifier:

In [None]:
dense = tf.keras.layers.Dense(3, kernel_regularizer='l2')

In this case, the default value used is `l2=0.01`.

### L1 and L2 Regularizer

It provides both L1 and L2 based regularization.
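
In Keras this combined regularizer is available as `regularizers.l1_l2(l1=..., l2=...)`. The NumPy sketch below only illustrates the arithmetic, assuming the combined penalty is the sum of the L1 and L2 formulas given earlier in this section; the function name `l1_l2_penalty` and the sample kernel are hypothetical.

```python
import numpy as np

def l1_l2_penalty(x, l1=0.01, l2=0.01):
    """Sum of the L1 and L2 penalties from the formulas in this section."""
    x = np.asarray(x, dtype=float)
    l1_term = l1 * np.sum(np.abs(x))       # l1 * reduce_sum(abs(x))
    l2_term = l2 * np.sum(np.square(x))    # l2 * reduce_sum(square(x))
    return l1_term + l2_term

kernel = np.array([[1.0, -2.0],
                   [0.5, 0.0]])
penalty = l1_l2_penalty(kernel, l1=0.1, l2=0.1)
print(penalty)
```

Here the absolute values sum to 3.5 and the squares to 5.25, so the penalty added to the loss is 0.1 * 3.5 + 0.1 * 5.25.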

## Activations

In machine learning, an activation function determines whether, and how strongly, a specific neuron is activated. Basically, the activation function applies a nonlinear transformation to the input data and thus enables the network to learn more complex relationships. The output of a neuron depends on its activation function.

Recall the concept of a single perceptron: the output of a perceptron (neuron) is simply the result of the activation function, which takes as input the sum of all inputs multiplied by their corresponding weights, plus the overall bias, if any.

>result = Activation(SUMOF(input \* weight) + bias)

So, the activation function plays an important role in the successful learning of the model. Keras provides many activation functions in the activations module. Let us go through the activations available in the module.
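
The perceptron formula above can be written out in a few lines of plain Python. This is a minimal sketch: the function `neuron_output`, the choice of a sigmoid activation and the sample numbers are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation):
    """result = Activation(SUMOF(input * weight) + bias)"""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# 1.0 * 0.5 + 2.0 * (-0.25) + 0.0 = 0.0, and sigmoid(0.0) = 0.5
result = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0, activation=sigmoid)
print(result)
```

Swapping `sigmoid` for another function changes the output of the neuron without touching the weighted sum.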

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense
from keras import regularizers

my_regularizer = regularizers.l2(0.)

model =Sequential()

model.add(Dense(512, activation ='relu', 
                input_shape =(784,),
                kernel_regularizer = my_regularizer))

### linear

Applies the linear (identity) function: the input is returned unchanged.

In [None]:
import tensorflow as tf
from tensorflow import keras

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()

model.add(Dense(units = 512, activation = 'linear', 
                input_shape = (784,)))

Where, **activation** refers to the activation function of the layer. It can be specified simply by the name of the function, and the layer will use the corresponding activation.

### elu

Applies Exponential linear unit.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()

model.add(Dense(512, activation ='elu', 
                input_shape =(784,)))

### selu

Applies Scaled exponential linear unit.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()
model.add(Dense(512, activation ='selu', 
                input_shape =(784,)))

### relu

Applies Rectified Linear Unit.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()

model.add(Dense(512, activation ='relu', 
                input_shape =(784,)))


### softmax

Applies Softmax function.

In [None]:
from keras.models import Sequential
from keras.layers import Activation,Dense

model =Sequential()
model.add(Dense(512, activation ='softmax', 
                input_shape =(784,)))

### softplus

Applies Softplus function.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()
model.add(Dense(512, activation ='softplus', 
                input_shape =(784,)))

### softsign

Applies Softsign function.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()

model.add(Dense(512, activation ='softsign', 
                input_shape =(784,)))

### tanh

Applies Hyperbolic tangent function.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()

model.add(Dense(512, activation ='tanh', input_shape =(784,)))

### sigmoid

Applies Sigmoid function.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()

model.add(Dense(512, activation ='sigmoid', input_shape =(784,)))

### hard_sigmoid

Applies Hard Sigmoid function.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()

model.add(Dense(512, activation ='hard_sigmoid', 
                input_shape =(784,)))

### exponential

Applies exponential function.

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense

model =Sequential()

model.add(Dense(units = 512, 
                activation ='exponential', 
                input_shape =(784,)))
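
For reference, several of the activations listed above can be written with Python's `math` module. These are textbook definitions given as a sketch, not the Keras implementations, which operate on tensors and may differ in numerical details; the default `alpha=1.0` for `elu` is an assumption.

```python
import math

def relu(x):
    return max(0.0, x)

def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def softplus(x):
    return math.log(1.0 + math.exp(x))

def softsign(x):
    return x / (1.0 + abs(x))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)                              # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

for name, fn in [("relu", relu), ("elu", elu), ("softplus", softplus),
                 ("softsign", softsign), ("sigmoid", sigmoid), ("tanh", math.tanh)]:
    print(name, [round(fn(x), 4) for x in (-2.0, 0.0, 2.0)])
print("softmax", softmax([1.0, 2.0, 3.0]))
```

Unlike the other functions, softmax works on a whole vector: its outputs are positive and sum to 1, which is why it is typically used in the last layer of a classifier.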

## Customized Layer

Keras allows us to create our own customized layers. Once a new layer is created, it can be used in any model without restriction. Let us learn how to create a new layer in this chapter.

Keras provides a base **layer** class, Layer, which can be sub-classed to create our own customized layer. Let us create a simple layer that initializes its weights from a normal distribution and then computes the sum of the products of the inputs and their weights.

Step 1: Import the necessary module

First, let us import the necessary modules −

In [None]:
from keras import backend as K

from keras.layers import Layer

Here,

- **backend**  is used to access the  **dot**  function.
- **Layer**  is the base class and we will be sub-classing it to create our layer

Step 2: Define a layer class

Let us create a new class,  **MyCustomLayer**  by sub-classing  **Layer class**  −

In [None]:
class MyCustomLayer(Layer):
    ...

Step 3: Initialize the layer class

Let us initialize our new class as specified below −

In [None]:
def __init__(self, output_dim, **kwargs):
    self.output_dim = output_dim
    super(MyCustomLayer, self).__init__(**kwargs)

Here,

- **Line 2**  sets the output dimension.
- **Line 3**  calls the base or super layer's  **init**  function.



Step 4: Implement build method

**build** is the main method, and its only purpose is to build the layer properly. It can do anything related to the inner workings of the layer, such as creating its weights. Once the custom functionality is done, we call the base class **build** function. Our custom **build** function is as follows:

In [None]:
def build(self, input_shape):
    self.kernel =self.add_weight(name ='kernel',
                                 shape =(input_shape[1],self.output_dim),
                                 initializer ='normal', trainable =True)

    super(MyCustomLayer,self).build(input_shape)

Here,

- **Line 1** defines the **build** method with one argument, **input\_shape**, which holds the shape of the input data.
- **Line 2** creates the weight matrix corresponding to the input shape and stores it in **self.kernel**. This is the custom functionality of our layer. The weight is created using the 'normal' initializer.
- **Line 6** calls the base class **build** method.

Step 5: Implement call method

The **call** method implements the actual computation of the layer; it runs on every forward pass, during training as well as prediction.

Our custom **call** method is as follows:

In [None]:
def call(self, input_data):
    return K.dot(input_data, self.kernel)

Here,

- **Line 1** defines the **call** method with one argument, **input\_data**, which is the input data for our layer.
- **Line 2** returns the dot product of the input data, **input\_data**, and our layer's kernel, **self.kernel**.

Step 6: Implement compute_output_shape method

In [None]:
def compute_output_shape(self, input_shape):
    return (input_shape[0], self.output_dim)

Here,

- **Line 1**  defines  **compute\_output\_shape**  method with one argument  **input\_shape**
- **Line 2**  computes the output shape using shape of input data and output dimension set while initializing the layer.
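
What the layer computes can be mimicked in NumPy without Keras. The sketch below is only an illustration: the batch size of 4, the input dimension 16, the output dimension 32 and the standard deviation 0.05 of the stand-in kernel are assumptions, and `compute_output_shape` here is a free function rather than a method.

```python
import numpy as np

batch, input_dim, output_dim = 4, 16, 32

rng = np.random.default_rng(0)
inputs = rng.random((batch, input_dim))
# stands in for self.kernel, which build() creates with shape (input_shape[1], output_dim)
kernel = rng.normal(scale=0.05, size=(input_dim, output_dim))

outputs = inputs @ kernel          # what call() returns: dot(input_data, kernel)

def compute_output_shape(input_shape, output_dim):
    return (input_shape[0], output_dim)

print(outputs.shape, compute_output_shape(inputs.shape, output_dim))
```

The shape of the actual result and the shape reported by `compute_output_shape` agree, which is the property the method is meant to guarantee.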

Implementing **build**, **call** and **compute\_output\_shape** completes the customized layer. The final and complete code is as follows:

In [None]:
from keras import backend as K
from keras.layers import Layer

class MyCustomLayer(Layer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyCustomLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create the trainable weight matrix of shape (input_dim, output_dim).
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[1], self.output_dim),
                                      initializer='normal', trainable=True)
        super(MyCustomLayer, self).build(input_shape)  # Be sure to call this at the end

    def call(self, input_data):
        # Forward pass: multiply the input by the kernel.
        return K.dot(input_data, self.kernel)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)

Using our customized layer

Let us create a simple model using our customized layer as specified below −

In [None]:
from keras.models import Sequential
from keras.layers import Dense

model =Sequential()

model.add(MyCustomLayer(32, input_shape =(16,)))

model.add(Dense(8, activation ='softmax')) 
model.summary()

Here,

- Our  **MyCustomLayer**  is added to the model using 32 units and **(16,)** as input shape

Running the cell prints the model summary.

# Model API

As learned earlier, Keras model API represents the actual neural network model. 
There are three ways to create Keras models:

- **The Sequential model**, which is very straightforward (a simple list of layers), but is limited to single-input, single-output stacks of layers (as the name gives away).
- **The Functional API**, which is an easy-to-use, fully-featured API that supports arbitrary model architectures. For most people and most use cases, this is what you should be using. This is the Keras "industry strength" model.
- **Model subclassing**, where you implement everything from scratch on your own. Use this if you have complex, out-of-the-box research use cases.

## Models API overview

**The Model class**
- Model class
- summary method
- get_layer method


**The Sequential class**
- Sequential class
- add method
- pop method

**Model training APIs**
- compile method
- fit method
- evaluate method
- predict method
- train_on_batch method
- test_on_batch method
- predict_on_batch method
- run_eagerly property

**Model saving & serialization APIs**
- save method
- save_model function
- load_model function
- get_weights method
- set_weights method
- save_weights method
- load_weights method
- get_config method
- from_config method
- model_from_config function
- to_json method
- model_from_json function
- clone_model function

[source](https://keras.io/api/)

Let us now learn how to create models using both the **Sequential** and the **Functional API** in this section.

## Sequential

The core idea of the **Sequential API** is simply arranging the Keras layers in sequential order, hence the name _Sequential API_.

Most ANNs also have layers in sequential order, and the data flows from one layer to the next in the given order until it finally reaches the output layer.

In [None]:
## The Sequential class
tf.keras.Sequential(layers=None, name=None)


An ANN model can be created by simply calling the **Sequential()** API as specified below:

In [None]:
import tensorflow as tf
from tensorflow import keras

from keras.models import Sequential
model = Sequential()

### Add layers

To add a layer, simply create a layer using the Keras layer API and then pass it to the add() function as specified below:

In [None]:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()

input_layer = Dense(units=32, input_shape=(8,))
model.add(input_layer)
# or model.add(Dense(units=32, input_shape=(8,)))

hidden_layer = Dense(64, activation='relu')
model.add(hidden_layer)

output_layer = Dense(8)
model.add(output_layer)

Here, we have created one input layer, one hidden layer and one output layer.

### Access the model

Keras provides a few methods to get model information such as layers, input data and output data. They are as follows:

- **model.layers** − Returns all the layers of the model as a list.

In [None]:
layers = model.layers
layers

- **model.inputs** − Returns all the input tensors of the model as a list.

In [None]:
inputs = model.inputs
inputs

- **model.outputs** − Returns all the output tensors of the model as a list.

In [None]:
outputs = model.outputs
outputs

- **model.get\_weights** − Returns all the weights as NumPy arrays.
- _**model.set\_weights(weight\_numpy\_array)**_ − Set the weights of the model.

### Serialize the model

Keras provides methods to serialize the model to a configuration object or to JSON and to load it again later. They are as follows:

- _**get\_config()**_ − Returns the model configuration as a Python dictionary.

In [None]:
config = model.get_config()

- _**from\_config()**_ − Accepts the model configuration object as argument and creates the model accordingly.

In [None]:
new_model = Sequential.from_config(config)

- _**to\_json()**_ − Returns the model as a JSON string.

In [None]:
json_string = model.to_json()
json_string

- _**model\_from\_json()**_ − Accepts the JSON representation of the model and creates a new model.

In [None]:
from keras.models import model_from_json

new_model = model_from_json(json_string)

### Summarise the model

Understanding the model is a very important step in using it properly for training and prediction. Keras provides a simple method, summary, to get the full information about the model and its layers.

A summary of the model created in the previous section is as follows −

In [None]:
model.summary()
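
The parameter counts reported for Dense layers can be reproduced by hand: each unit has one weight per input plus one bias. The sketch below assumes the three-layer model built in the "Add layers" section (8 inputs, then 32, 64 and 8 units); the helper `dense_params` is hypothetical.

```python
def dense_params(n_inputs, units, use_bias=True):
    """Number of trainable parameters of a Dense layer."""
    return n_inputs * units + (units if use_bias else 0)

layer_sizes = [8, 32, 64, 8]   # input dimension followed by the units of each layer
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print(counts, total)
```

Comparing these numbers with the "Param #" column of the summary is a quick sanity check that the layers are wired as intended.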

### Train and Predict the model

The model provides functions for the training, evaluation and prediction process. They are as follows:

- **compile**  − Configure the learning process of the model
- **fit**  − Train the model using the training data
- **evaluate**  − Evaluate the model using the test data
- **predict**  − Predict the results for new input.

## Functional API

The Sequential API is used to create models layer by layer. The Functional API is an alternative approach for creating more complex models: with a functional model, you can define multiple inputs or outputs and share layers between them. You first create an input tensor, connect layers to it by calling them, and then build the model from its inputs and outputs. This section briefly explains the functional model.

Create a model

Import an input layer using the below module −

In [None]:
from keras.layers import Input

Now, create an input layer specifying input dimension shape for the model using the below code −

In [None]:
data = Input(shape=(2,3))

Define layer for the input using the below module −

In [None]:
from keras.layers import Dense

Add Dense layer for the input using the below line of code −

In [None]:
layer =Dense(2)(data)
print(layer)

Define model using the below module −

In [None]:
from keras.models import Model

Create a model in functional way by specifying both input and output layer −

In [None]:
model = Model(inputs = data, outputs = layer)

The complete code to create a simple model is shown below −

In [None]:
from keras.layers import Input
from keras.models import Model
from keras.layers import Dense

data =Input(shape=(2,3))

layer =Dense(2)(data) 

model =Model(inputs=data,outputs=layer) 

model.summary()
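
Because the input here has shape (2, 3), it helps to see what a Dense layer does with more than one non-batch dimension: it is applied along the last axis. The NumPy sketch below illustrates this with a hypothetical batch of 5 samples and randomly chosen stand-in weights.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((5, 2, 3))        # batch of 5 samples, each of shape (2, 3)
kernel = rng.normal(size=(3, 2))    # Dense(2) kernel: (last input dimension, units)
bias = np.zeros(2)

output = data @ kernel + bias       # applied along the last axis
n_params = kernel.size + bias.size

print(output.shape, n_params)
```

Each sample keeps its first dimension of size 2 while the last dimension changes from 3 to 2, and the layer holds 3 * 2 weights plus 2 biases.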

We will cover the **Keras Functional API** in more detail later.

## Model Compilation

Previously, we studied the basics of how to create a model using the Sequential and Functional APIs. This chapter explains how to compile the model. Compilation is the final step in creating a model. Once the compilation is done, we can move on to the training phase.

Let us learn a few concepts required to better understand the compilation process.

### Loss

In machine learning, a **loss** function measures the error or deviation in the learning process. Keras requires a loss function during the model compilation process.

Keras provides quite a few loss functions in the **losses** module, and they are as follows:

- mean\_squared\_error
- mean\_absolute\_error
- mean\_absolute\_percentage\_error
- mean\_squared\_logarithmic\_error
- squared\_hinge
- hinge
- categorical\_hinge
- logcosh
- huber\_loss
- categorical\_crossentropy
- sparse\_categorical\_crossentropy
- binary\_crossentropy
- kullback\_leibler\_divergence
- poisson
- cosine\_proximity
- is\_categorical\_crossentropy

All of the above loss functions accept two arguments:

- **y\_true**  − true labels as tensors
- **y\_pred**  − prediction with same shape as  **y\_true**
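
To show how the two arguments are used, here is a minimal NumPy sketch of two of the listed losses. The definitions are the standard textbook ones rather than the Keras source, the `eps` clipping value is an assumption, and the sample labels and predictions are made up.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    # average squared difference over the last axis, one value per sample
    return np.mean(np.square(y_true - y_pred), axis=-1)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # clip to avoid log(0); y_true is expected to be one-hot encoded
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.5, 0.25, 0.25]])

mse = mean_squared_error(y_true, y_pred)
cce = categorical_crossentropy(y_true, y_pred)
print(mse, cce)
```

Both functions return one loss value per sample; the second sample, whose prediction is less confident, gets the larger loss in both cases.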

Import the losses module before using loss function as specified below −

In [None]:
from keras import losses

### Optimizer

In machine learning, **optimization** is the process that adjusts the model weights to reduce the loss computed from the predictions. Keras provides quite a few optimizers in the _optimizers_ module; they are listed below.

Import the optimizers module before using optimizers as specified below:

In [None]:
from keras import optimizers

**SGD**  − Stochastic gradient descent optimizer.

In [None]:
keras.optimizers.SGD(learning_rate = 0.01, momentum = 0.0, nesterov = False)

**RMSprop**  − RMSProp optimizer.

In [None]:
keras.optimizers.RMSprop(learning_rate = 0.001, rho = 0.9)

**Adagrad**  − Adagrad optimizer.

In [None]:
keras.optimizers.Adagrad(learning_rate = 0.01)

**Adadelta**  − Adadelta optimizer.

In [None]:
keras.optimizers.Adadelta(learning_rate = 1.0, rho = 0.95)

**Adam**  − Adam optimizer.

In [None]:
keras.optimizers.Adam(learning_rate =0.001, beta_1 =0.9, beta_2 =0.999, amsgrad =False)

**Adamax** − Adamax optimizer, a variant of Adam based on the infinity norm.

In [None]:
keras.optimizers.Adamax(learning_rate = 0.002, beta_1 = 0.9, beta_2 = 0.999)

**Nadam**  − Nesterov Adam optimizer.

In [None]:
keras.optimizers.Nadam(learning_rate = 0.002, beta_1 = 0.9, beta_2 = 0.999)

We will cover the optimizers in more detail later.
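
As a small preview of what an optimizer does, the sketch below applies the SGD update with momentum to a one-dimensional problem in plain Python. The objective f(w) = w**2, the starting point and the hyperparameter values are assumptions chosen for the illustration; `sgd_step` is a hypothetical helper, not a Keras function.

```python
def sgd_step(w, grad, velocity, learning_rate=0.01, momentum=0.0):
    """One SGD update: the velocity accumulates past gradients when momentum > 0."""
    velocity = momentum * velocity - learning_rate * grad
    return w + velocity, velocity

w, v = 5.0, 0.0
for _ in range(100):
    grad = 2.0 * w                      # gradient of f(w) = w**2
    w, v = sgd_step(w, grad, v, learning_rate=0.1, momentum=0.9)

print(w)
```

After 100 steps the weight is close to 0, the minimum of the objective. With `momentum=0.0` the update reduces to plain gradient descent.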

### Metrics

In machine learning, **metrics** are used to evaluate the performance of your model. A metric is similar to a loss function, but its result is not used to train the model. Keras provides quite a few metrics in the **metrics** module, and they are as follows:

- accuracy
- binary\_accuracy
- categorical\_accuracy
- sparse\_categorical\_accuracy
- top\_k\_categorical\_accuracy
- sparse\_top\_k\_categorical\_accuracy
- cosine\_proximity
- clone\_metric

Similar to loss functions, metrics also accept the two arguments below:

- **y\_true**  − true labels as tensors
- **y\_pred**  − prediction with same shape as  **y\_true**
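
A metric such as categorical accuracy can be sketched in NumPy as follows. This is an illustrative definition under the assumption of one-hot labels, not the Keras implementation, and the sample data is made up.

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the one-hot label."""
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))

y_true = np.array([[0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1]])
y_pred = np.array([[0.1, 0.8, 0.1],    # correct: class 1
                   [0.3, 0.6, 0.1],    # wrong: predicts class 1, label is class 0
                   [0.2, 0.2, 0.6]])   # correct: class 2

acc = categorical_accuracy(y_true, y_pred)
print(acc)
```

Two of the three predictions are correct, so the metric reports two thirds. Unlike a loss, this value is not differentiable in a useful way, which is why it is only monitored and not optimized.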

Import the metrics module before using metrics as specified below −

In [None]:
from keras import metrics

### Compile the model

A Keras model provides the **compile()** method to compile the model. The arguments and default values of **compile()** are as follows:

```Python
Model.compile(
    optimizer="rmsprop",
    loss=None,
    metrics=None,
    loss_weights=None,
    weighted_metrics=None,
    run_eagerly=None,
    steps_per_execution=None,
    jit_compile=None,
    **kwargs
)
```

The important arguments are as follows −

- loss function
- Optimizer
- metrics

Sample code to compile the model is as follows:

In [None]:
from keras import losses
from keras import optimizers
from keras import metrics

model.compile(loss='mean_squared_error',
              optimizer='sgd',
              metrics=[metrics.categorical_accuracy])

where,

- loss function is set as  **mean\_squared\_error**
- optimizer is set as  **sgd**
- metrics is set as  **metrics.categorical\_accuracy**

### Model Training

Models are trained on NumPy arrays using _**fit()**_.

The main purpose of the fit function is to train the model for a fixed number of epochs; it also evaluates the model on the validation data, if any, at the end of each epoch. The history it returns can be used for graphing model performance. It has the following syntax:

```Python
Model.fit(
    x=None,
    y=None,
    batch_size=None,
    epochs=1,
    verbose="auto",
    callbacks=None,
    validation_split=0.0,
    validation_data=None,
    shuffle=True,
    class_weight=None,
    sample_weight=None,
    initial_epoch=0,
    steps_per_epoch=None,
    validation_steps=None,
    validation_batch_size=None,
    validation_freq=1,
    max_queue_size=10,
    workers=1,
    use_multiprocessing=False,
)

```

Arguments

- **x, y** − the input data and the target data, for example NumPy arrays.
- **epochs** − the number of passes over the entire training data.
- **batch\_size** − the number of samples per gradient update.
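
The relationship between these arguments can be illustrated without Keras. The sketch below splits 100 samples into batches of 32 and counts the resulting weight updates; the helper `iter_batches` is hypothetical and the numbers match the example that follows.

```python
import math

def iter_batches(n_samples, batch_size):
    """Yield (start, end) index pairs covering the data once, i.e. one epoch."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)

n_samples, batch_size, epochs = 100, 32, 5

batches = list(iter_batches(n_samples, batch_size))
steps_per_epoch = math.ceil(n_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(batches, steps_per_epoch, total_updates)
```

The last batch of each epoch is smaller because 100 is not a multiple of 32.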

**Example**


Let us apply this to a simple example with random NumPy data.

**Create data**

Let us create random data for x and y using NumPy:

In [None]:
import numpy as np

x_train = np.random.random((100,4,8))
y_train = np.random.random((100,10))

In [None]:
x_train.shape, y_train.shape

Now, create random validation data:

In [None]:
x_val = np.random.random((100,4,8))
y_val = np.random.random((100,10))

**Create model**

Let us create simple sequential model −

In [None]:
from keras.models import Sequential 
model = Sequential()


**Add layers**

Create layers to add to the model.

In [None]:
from keras.layers import LSTM, Dense

# The LSTM returns only its last output, a vector of dimension 16,
# so the model output has shape (batch, 10) and matches y_train.
model.add(LSTM(16))
model.add(Dense(10, activation = 'softmax'))

**Compile model**

Now that the model is defined, you can compile it using the command below:

In [None]:
model.compile(
    loss = 'categorical_crossentropy',
    optimizer = 'sgd',
    metrics = ['accuracy']
)

**Apply fit()**

Now we apply the _fit()_ function to train the model on our data:

In [None]:
model.fit(x_train, y_train, 
          batch_size = 32, 
          epochs = 5, 
          validation_data = (x_val, y_val))

## Dataset module

Before creating a model, we need to choose a problem, collect the required data and convert the data to NumPy arrays. Once the data is collected, we can prepare the model and train it on the collected data. Data collection is one of the most difficult phases of machine learning. Keras provides a special module, datasets, to download online machine learning data for training purposes. It fetches the data from an online server, processes it and returns it as training and test sets. The datasets available in the module are as follows:

- CIFAR10 small image classification
- CIFAR100 small image classification
- IMDB Movie reviews sentiment classification
- Reuters newswire topics classification
- MNIST database of handwritten digits
- Fashion-MNIST database of fashion articles
- Boston housing price regression dataset

In the following two examples, we will use the **MNIST database of handwritten digits** and the **Boston Housing** dataset to build two simple MLPs for classification and regression.

## Multi-Layer Perceptron (MLP)

We have learned to create, compile and train the Keras models.

Let us apply our learning and create a simple MLP-based ANN.

### MLP for classification

Let us use the **MNIST database of handwritten digits** (or mnist) as our input. mnist is a collection of 60,000 28x28 grayscale training images of the 10 digits, along with 10,000 test images.

The code below can be used to load the dataset:

In [None]:
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

where

- **Line 1** imports **mnist** from the keras dataset module.
- **Line 3** calls the **load\_data** function, which fetches the data from an online server and returns it as two tuples. The first tuple, **(x\_train, y\_train)**, holds the training images with shape **(number\_samples, 28, 28)** and their digit labels with shape **(number\_samples,)**. The second tuple, **(x\_test, y\_test)**, holds the test data in the same layout.

#### Load data and build the model

Let us choose a simple multi-layer perceptron (MLP) as represented below and try to create the model using Keras.

![](img/Picture2.jpg)

The core features of the model are as follows −

- Input layer consists of 784 values (28 x 28 = 784).
- First hidden layer, **Dense**, consists of 512 neurons and the 'relu' activation function.
- Second hidden layer, **Dropout**, uses a dropout rate of 0.2.
- Third hidden layer, again **Dense**, consists of 512 neurons and the 'relu' activation function.
- Fourth hidden layer, **Dropout**, uses a dropout rate of 0.2.
- Fifth and final layer, **Dense**, consists of 10 neurons and the 'softmax' activation function.
- Use **categorical\_crossentropy** as the loss function.
- Use **RMSprop()** as the optimizer.
- Use **accuracy** as the metric.
- Use 128 as the batch size.
- Use 20 as the number of epochs.
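
As a quick sanity check on the architecture listed above, the number of trainable parameters can be worked out by hand. A Dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases, and Dropout adds no parameters. The helper name `dense_params` is made up for this illustration and is not part of Keras.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Input size followed by the sizes of the three Dense layers.
layer_sizes = [784, 512, 512, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [401920, 262656, 5130]
print(total)      # 669706
```

If the model is built as described, `model.summary()` should report the same total.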



**Step 1 − Import the modules**

Let us import the necessary modules.

In [None]:
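# Import everything from tensorflow.keras so that all objects
# come from the same Keras build.
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.utils import to_categorical

import numpy as np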

**Step 2 − Load data**

Let us load the MNIST dataset.

In [None]:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
assert x_train.shape == (60000, 28, 28)
assert x_test.shape == (10000, 28, 28)
assert y_train.shape == (60000,)
assert y_test.shape == (10000,)

**Step 3 − Process the data**

Let us transform the dataset so that it can be fed into our model.

In [None]:
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

x_train /= 255
x_test /= 255

y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)


where

- **reshape** flattens each image from shape (28, 28) to (784, ).
- **to\_categorical** converts the vector of integer labels to a binary (one-hot) matrix with 10 columns.
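
To make these transformations concrete, here is a small NumPy sketch on two random stand-in images rather than the real MNIST data. Indexing `np.eye(10)` with the labels is one way to obtain the kind of one-hot matrix that `to_categorical` returns for integer labels. It is shown only as an illustration, not as the Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two fake 28x28 grayscale "images" with pixel values in 0..255.
images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)
labels = np.array([3, 7])

# Flatten each image to 784 values and scale the pixels to [0, 1].
flat = images.reshape(len(images), 28 * 28).astype("float32") / 255

# One-hot encode: row i has a 1.0 in column labels[i] and 0.0 elsewhere.
one_hot = np.eye(10, dtype="float32")[labels]

print(flat.shape)     # (2, 784)
print(one_hot.shape)  # (2, 10)
```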

**Step 4 − Create the model**

Let us create the actual model.

In [None]:
model = Sequential()

model.add(Dense(512, activation='relu',
                input_shape=(784,)))
model.add(Dropout(0.2))

model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(10, activation='softmax'))
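
A dropout rate of 0.2 means that, during training, roughly 20% of the incoming activations are set to zero at random. The sketch below imitates this in NumPy, assuming the usual "inverted dropout" scheme in which the surviving activations are scaled by 1 / (1 - rate). It is an illustration only, not the Keras implementation. At prediction time the layer passes its input through unchanged.

```python
import numpy as np

rate = 0.2
rng = np.random.default_rng(0)

# A small batch of activations, all equal to 1 for easy reading.
activations = np.ones((4, 5), dtype="float32")

# Keep each value with probability 1 - rate.
keep_mask = rng.random(activations.shape) >= rate

# Zero the dropped values and scale the survivors by 1 / (1 - rate).
dropped = activations * keep_mask / (1.0 - rate)

print(dropped)  # every entry is either 0.0 or 1.25
```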


**Step 5 − Compile the model**

Let us compile the model using selected loss function, optimizer and metrics.

In [None]:
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

**Step 6 − Train the model**

Let us train the model using _**fit()**_ method.

In [None]:
history = model.fit(x_train, y_train,
                    batch_size=128,
                    epochs=20,
                    verbose=1,
                    validation_data=(x_test, y_test))
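
The batch size and epoch count determine how many weight updates the training run performs. The arithmetic below is a small sketch that assumes the final, smaller batch of each epoch is kept rather than dropped.

```python
import math

num_samples = 60000
batch_size = 128
epochs = 20

# Batches (weight updates) needed to see every training sample once.
steps_per_epoch = math.ceil(num_samples / batch_size)

# Size of the final, partially filled batch in each epoch.
last_batch = num_samples - (steps_per_epoch - 1) * batch_size

total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 469
print(last_batch)       # 96
print(total_updates)    # 9380
```

With `verbose=1`, the progress bar for each epoch should therefore count up to 469 steps.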


**Final thoughts**

We have created the model, loaded the data and trained the model on the data. We still need to evaluate the model and predict the output for unknown input, which we do next. The complete code so far is:

In [None]:
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.utils import to_categorical

import numpy as np

# Load the data
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Flatten the images and scale the pixels to [0, 1]
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

x_train /= 255
x_test /= 255

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build the model
model = Sequential()

model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))

model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(10, activation='softmax'))

# Compile and train
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=128,
                    epochs=20,
                    verbose=1,
                    validation_data=(x_test, y_test))


Let us begin by evaluating the trained model.

#### Model Evaluation

Evaluation is the step during model development that checks how well the model fits the given problem and the corresponding data. Keras models provide a method, `evaluate`, which performs this evaluation. It has three main arguments:

- Test data
- Test data labels
- verbose - verbosity mode (0 for silent, 1 for a progress bar)
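
For a model compiled with categorical cross-entropy and the accuracy metric, `evaluate` returns two numbers: the loss and the accuracy. The sketch below computes both by hand in NumPy on a made-up batch of three samples and four classes. It leaves out details such as the clipping of probabilities that a real implementation applies, so treat it as an illustration of the definitions only.

```python
import numpy as np

# One-hot labels and predicted class probabilities (made-up values).
y_true = np.array([[0, 0, 1, 0],
                   [1, 0, 0, 0],
                   [0, 1, 0, 0]], dtype="float32")
y_prob = np.array([[0.1, 0.1, 0.7, 0.1],
                   [0.6, 0.2, 0.1, 0.1],
                   [0.3, 0.2, 0.4, 0.1]], dtype="float32")

# Categorical cross-entropy: mean over samples of -sum(y_true * log(y_prob)).
loss = float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

# Accuracy: fraction of samples whose most probable class is the true class.
accuracy = float(np.mean(np.argmax(y_prob, axis=1) == np.argmax(y_true, axis=1)))

print('loss:', loss)
print('accuracy:', accuracy)  # two of the three samples are correct
```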

Let us evaluate the model, which we created in the previous section using test data.

In [None]:
score = model.evaluate(x_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])


In the run this tutorial is based on, the test accuracy was 98.51%; the exact figure varies slightly from run to run. The model identifies handwritten digits well.

Even so, there is still scope to improve the model.

#### Model Prediction

_**Prediction**_ is the final step and the expected outcome of building the model. Keras provides a method, `predict`, to get predictions from the trained model. The signature of the `predict` method is as follows:

```python
predict(x,
        batch_size=None,
        verbose=0,
        steps=None,
        callbacks=None,
        max_queue_size=10,
        workers=1,
        use_multiprocessing=False)
```

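Here, all arguments are optional except the first, which refers to the unknown input data. Its shape must match the input shape the model was trained on to get a proper prediction. The `max_queue_size`, `workers` and `use_multiprocessing` arguments belong to the Keras 2 signature shown here; Keras 3 no longer accepts them.

Let us make predictions with the MLP model created above using the code below −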

In [None]:
pred = model.predict(x_test)

pred = np.argmax(pred, axis=1)[:5]
label = np.argmax(y_test, axis=1)[:5]

print(pred)
print(label)

Here,

- **Line 1** calls the predict function on the test data.
- **Line 3** gets the first five predictions.
- **Line 4** gets the first five labels of the test data.
- **Lines 6 - 7** print the predictions and the actual labels.

In the original run, both arrays were identical, which indicates that the model predicts the first five images correctly.
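
Five samples are a small check. The same comparison extends to any number of samples by locating the positions where prediction and label disagree. The arrays below are made-up stand-ins for `np.argmax(model.predict(x_test), axis=1)` and `np.argmax(y_test, axis=1)`, so the sketch runs without a trained model.

```python
import numpy as np

# Made-up class indices standing in for real predictions and labels.
pred_classes = np.array([7, 2, 1, 0, 4, 1, 4, 9, 6, 9])
true_classes = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9])

# Positions where the predicted class differs from the true class.
wrong = np.flatnonzero(pred_classes != true_classes)

print(wrong)                            # [8]
print(len(wrong) / len(true_classes))   # 0.1
```

The indices in `wrong` can then be used to look at the corresponding rows of `x_test` and see which digits the model confuses.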

### MLP for regression


In this section, let us write a simple MLP-based ANN to do regression.

So far, we have only done classification.

Now, we will try to predict a continuous value by analyzing known values and their influencing factors.

In this example, we build a regression MLP to predict house prices using the **Boston Housing** dataset, which is provided by Keras. It is a collection of housing records from the Boston area, each described by 13 features.


![](img/Picture3.jpg)

#### Load data and build the model

Let us choose a simple multi-layer perceptron (MLP) for regression and create the model using Keras.

The core features of the MLP model are as follows −

- Input layer consists of 13 values, i.e. shape (13,).
- First layer, _Dense_, consists of 64 units and the 'relu' activation function with the 'normal' kernel initializer.
- Second layer, _Dense_, consists of 64 units and the 'relu' activation function.
- Output layer, _Dense_, consists of 1 unit.
- Use **mse** as the loss function.
- Use **RMSprop** as the optimizer.
- Use **mean\_absolute\_error** as the metric (accuracy is not meaningful for regression).
- Use 128 as the batch size.
- Use 500 as the number of epochs.
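
The compile step of this example uses mean squared error (MSE) as the loss and reports mean absolute error (MAE) as the metric. The NumPy sketch below shows both definitions on made-up numbers. It illustrates the formulas and is not the Keras implementation.

```python
import numpy as np

# Made-up targets and predictions, in the same units as the house prices.
y_true = np.array([20.0, 15.0, 30.0, 25.0])
y_pred = np.array([22.0, 14.0, 27.0, 25.0])

errors = y_pred - y_true

mse = float(np.mean(errors ** 2))      # penalizes large errors more strongly
mae = float(np.mean(np.abs(errors)))   # average error in the target's units

print(mse)  # 3.5
print(mae)  # 1.5
```

MAE is the easier of the two to read because it is expressed in the same units as the target.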

**Step 1 − Import the modules**

Let us import the necessary modules.

In [None]:
from tensorflow import keras

from tensorflow.keras.datasets import boston_housing
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.callbacks import EarlyStopping

from sklearn import preprocessing

**Step 2 − Load data**

Let us load the Boston housing dataset. The code below can be used to load it.

In [None]:
(x_train, y_train), (x_test, y_test) = boston_housing.load_data()

**Step 3 − Process the data**

Let us transform the dataset so that we can feed it into our model. The data can be transformed using the code below −

In [None]:
x_train_scaled = preprocessing.scale(x_train)

scaler = preprocessing.StandardScaler().fit(x_train)

x_test_scaled = scaler.transform(x_test)

Here, we have normalized the training data using the **sklearn.preprocessing.scale** function. **preprocessing.StandardScaler().fit** returns a scaler that stores the mean and standard deviation of the training data, which we then apply to the test data using **scaler.transform**. This normalizes the test data with the same settings as the training data.
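
The idea behind this step can be written out in plain NumPy: compute the per-feature mean and standard deviation on the training data only, then apply those same statistics to both sets. The sketch uses made-up random features in place of the housing data and is meant as an illustration of the technique, not as a replacement for the scikit-learn calls.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up feature matrices: 100 training rows and 20 test rows, 3 features.
train = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
test = rng.normal(loc=50.0, scale=10.0, size=(20, 3))

# Statistics come from the training data only.
mean = train.mean(axis=0)
std = train.std(axis=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std   # reuse the training statistics

print(train_scaled.mean(axis=0).round(6))  # approximately 0 per feature
print(train_scaled.std(axis=0).round(6))   # approximately 1 per feature
```

The scaled test set will not have exactly zero mean and unit variance, which is expected because its own statistics were never used.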

**Step 4 − Create the model**

Let us create the actual model.

In [None]:
model = Sequential()

model.add(Dense(64,
                kernel_initializer='normal',
                activation='relu',
                input_shape=(13,)))

model.add(Dense(64, activation='relu'))
model.add(Dense(1))

**Step 5 − Compile the model**

Let us compile the model using selected loss function, optimizer and metrics.

In [None]:
model.compile(
    loss='mse',
    optimizer=RMSprop(),
    metrics=['mean_absolute_error']
)

**Step 6 − Train the model**

Let us train the model using **fit()** method.

In [None]:
history = model.fit(
    x_train_scaled, y_train,
    batch_size=128,
    epochs=500,
    verbose=1,
    validation_split=0.2,
    callbacks=[EarlyStopping(monitor='val_loss', patience=20)]
)

Here, we have used the callback **EarlyStopping**. This callback monitors the validation loss (`val_loss`) after each epoch and checks whether it has improved on the best value seen so far. If there is no improvement for **patience** consecutive epochs (20 here), training is stopped.
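
The patience rule can be sketched in a few lines of plain Python. The function `stopping_epoch` is a hypothetical helper written for this illustration. It simplifies the real callback, which offers further options such as a minimum improvement threshold.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best value: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Made-up validation losses: the best value is reached at epoch 3.
losses = [0.9, 0.7, 0.6, 0.65, 0.61, 0.62, 0.6]

print(stopping_epoch(losses, patience=3))  # 6
print(stopping_epoch(losses, patience=5))  # None
```

A larger patience tolerates longer plateaus at the cost of more training time.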

Executing the training cell prints, for each epoch, the loss and mean absolute error on the training and validation data.

#### Model Evaluation

As in the classification example, we use the model's `evaluate` method to check how well the model fits the given problem and the corresponding data.

**Step 7 − Evaluate the model**

Let us evaluate the model using test data.

In [None]:
score = model.evaluate(x_test_scaled, y_test, verbose=0)

# score[0] is the loss (MSE); score[1] is the compiled metric (MAE)
print('Test loss:', score[0])
print('Test MAE:', score[1])

#### Model Prediction

**Step 8 − Predict**

Finally, predict using test data as below −

In [None]:
prediction = model.predict(x_test_scaled)

print(prediction.flatten())

print(y_test)

In the original run, the predicted and actual values differed by around 10-30%, which indicates that the model predicts within a reasonable range.
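
The percentage difference between predictions and targets can be computed directly. The arrays below are made-up stand-ins for `prediction.flatten()` and `y_test`, so the numbers do not come from the trained model.

```python
import numpy as np

# Made-up values standing in for the model's predictions and the targets.
predicted = np.array([18.0, 24.0, 33.0])
actual = np.array([20.0, 20.0, 30.0])

# Absolute difference relative to the actual value, in percent.
percent_diff = np.round(np.abs(predicted - actual) / actual * 100, 1)

print(percent_diff)         # [10. 20. 10.]
print(percent_diff.mean())
```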


**Acknowledgment:**

> This tutorial is adapted from the [Keras Quick Guide](https://www.tutorialspoint.com/keras/keras_quick_guide.htm) authored by [Tutorialspoint](https://www.tutorialspoint.com). Please visit Tutorialspoint for learning materials on technical and non-technical subjects.
> In addition, this tutorial includes examples from the [Keras.io](https://keras.io) website.
