# Deep Neural Networks

- _input layer_ takes in numbers. Input data such as text, speech, or images must first be converted to numbers.
- `scalar` is a single value
- `vector` is a 1D array of values, e.g. list of numbers
- `matrix` is a 2D array of values, e.g. the pixels of a grayscale image
- `tensor` is an array of values with three or more dimensions, e.g. a color image with three channels
- `normalization` is transforming features to have the same scale [^1]
- `standardization` is rescaling values to have a mean of zero and a standard deviation of one

[^1]: https://developers.google.com/machine-learning/data-prep/transform/normalization
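
The terms above can be made concrete with NumPy (an assumption here; the notebook itself only imports TensorFlow). This sketch prints the number of dimensions of each kind of array, then applies min-max scaling (one common form of normalization) and standardization to a made-up feature column.

```python
import numpy as np

scalar = np.array(5.0)                # 0 dimensions
vector = np.array([1.0, 2.0, 3.0])    # 1 dimension
matrix = np.zeros((28, 28))           # 2 dimensions, e.g. a grayscale image
tensor = np.zeros((28, 28, 3))        # 3 dimensions, e.g. an RGB image

ranks = [a.ndim for a in (scalar, vector, matrix, tensor)]
print(ranks)  # [0, 1, 2, 3]

# Made-up feature values, for illustration only.
feature = np.array([2.0, 4.0, 6.0, 8.0])

# Min-max normalization: rescale the values to the range [0, 1].
normalized = (feature - feature.min()) / (feature.max() - feature.min())

# Standardization: subtract the mean, then divide by the standard deviation.
standardized = (feature - feature.mean()) / feature.std()

print(normalized)
print(standardized.mean(), standardized.std())
```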

In [1]:
import tensorflow



In [3]:
# Always check the version used.
tensorflow.__version__

'2.11.0'

In [7]:
from tensorflow.keras import Input

# Input?
Input(shape=(13,))

<KerasTensor: shape=(None, 13) dtype=float32 (created by layer 'input_1')>

We specify that each input sample is a vector with 13 elements.

- the `None` in `shape` refers to the batch size, which is unknown
- the `dtype` defaults to single-precision, `float32`
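
As a rough illustration of what a shape of `(None, 13)` permits, the NumPy sketch below checks that a batch of any size is acceptable as long as each row has 13 elements. NumPy and the helper name `fits_input` are assumptions made for illustration; they are not part of Keras.

```python
import numpy as np


def fits_input(batch, features=13):
    """Hypothetical helper: True if `batch` has shape (any size, features)."""
    return batch.ndim == 2 and batch.shape[1] == features


small_batch = np.zeros((1, 13), dtype=np.float32)
large_batch = np.zeros((64, 13), dtype=np.float32)
wrong_width = np.zeros((64, 12), dtype=np.float32)

print(fits_input(small_batch))  # True
print(fits_input(large_batch))  # True
print(fits_input(wrong_width))  # False
print(small_batch.dtype)        # float32
```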

## Basic Dense Layer

In [8]:
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

In [10]:
inputs = Input((13,))  # Takes an array with 13 elements
hidden = Dense(10)(inputs)  # Avoid the name `input`; it shadows the Python built-in.
hidden = Dense(10)(hidden)
output = Dense(1)(hidden)
model = Model(inputs, output)  # Pass the Input tensor, not an intermediate tensor.
model.summary()

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_5 (InputLayer)        [(None, 13)]              0         
                                                                 
 dense_3 (Dense)             (None, 10)                140       
                                                                 
 dense_4 (Dense)             (None, 10)                110       
                                                                 
 dense_5 (Dense)             (None, 1)                 11        
                                                                 
Total params: 261
Trainable params: 261
Non-trainable params: 0
_________________________________________________________________
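
For intuition, a `Dense` layer without an activation computes `x @ W + b`. The following is a minimal NumPy sketch of that arithmetic with randomly initialised weights. It is not how Keras implements the layer, and the names `dense_forward`, `W1`, `b1`, and so on are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense_forward(x, W, b):
    """Affine map performed by a dense layer with no activation."""
    return x @ W + b


batch = rng.normal(size=(4, 13))  # 4 rows, 13 features each

# One weight matrix and one bias vector per layer.
W1, b1 = rng.normal(size=(13, 10)), np.zeros(10)
W2, b2 = rng.normal(size=(10, 10)), np.zeros(10)
W3, b3 = rng.normal(size=(10, 1)), np.zeros(1)

hidden1 = dense_forward(batch, W1, b1)
hidden2 = dense_forward(hidden1, W2, b2)
output = dense_forward(hidden2, W3, b3)

print(hidden1.shape, hidden2.shape, output.shape)  # (4, 10) (4, 10) (4, 1)
```

The leading dimension (4 here) is the batch size, which Keras reports as `None` because it is not fixed when the model is built.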


## Activation Functions

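Activation functions introduce nonlinearity, which lets the network learn a nonlinear separation of the data. Without them, stacked `Dense` layers collapse into a single linear transformation.

Common activation functions: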
- rectified linear unit (ReLU)
- sigmoid
- softmax
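
The three functions listed above are short enough to write out by hand. This is a NumPy sketch for illustration (NumPy is an assumption; Keras ships its own implementations), using the standard textbook definitions.

```python
import numpy as np


def relu(x):
    """Zero out negative values, keep positive values unchanged."""
    return np.maximum(0.0, x)


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = x - np.max(x, axis=-1, keepdims=True)  # for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)


z = np.array([-2.0, 0.0, 3.0])
print(relu(z))       # [0. 0. 3.]
print(sigmoid(0.0))  # 0.5
print(softmax(z))
```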

In [11]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, ReLU

model = Sequential()
model.add(Dense(10, input_shape=(13,)))
model.add(ReLU())  # Convention is to add ReLU to each non-output layer.
model.add(Dense(10))
model.add(ReLU())  # Convention is to add ReLU to each non-output layer.
model.add(Dense(1))
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_6 (Dense)             (None, 10)                140       
                                                                 
 re_lu (ReLU)                (None, 10)                0         
                                                                 
 dense_7 (Dense)             (None, 10)                110       
                                                                 
 re_lu_1 (ReLU)              (None, 10)                0         
                                                                 
 dense_8 (Dense)             (None, 1)                 11        
                                                                 
Total params: 261
Trainable params: 261
Non-trainable params: 0
_________________________________________________________________
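
The `Param #` column can be checked by hand: a `Dense` layer with `n` inputs and `u` units has `n * u` weights plus `u` biases, while a `ReLU` layer has no parameters. The helper name `dense_params` below is made up for this sketch.

```python
def dense_params(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units


# (inputs, units) for the three Dense layers: 13 -> 10 -> 10 -> 1
layer_sizes = [(13, 10), (10, 10), (10, 1)]
counts = [dense_params(n, u) for n, u in layer_sizes]

print(counts)       # [140, 110, 11]
print(sum(counts))  # 261
```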


## Shorthand syntax

> With the shorthand syntax, the activation is part of the `Dense` layer, so `summary` doesn't list a separate `ReLU` layer.

In [16]:
model = Sequential()
model.add(Dense(10, input_shape=(13,), activation="relu"))
model.add(Dense(10, activation="relu"))
model.add(Dense(1))
model.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_18 (Dense)            (None, 10)                140       
                                                                 
 dense_19 (Dense)            (None, 10)                110       
                                                                 
 dense_20 (Dense)            (None, 1)                 11        
                                                                 
Total params: 261
Trainable params: 261
Non-trainable params: 0
_________________________________________________________________


A third option is to use the `Activation` layer.

In [17]:
from tensorflow.keras.layers import Activation

In [18]:
model = Sequential()
model.add(Dense(10, input_shape=(13,)))
model.add(Activation("relu"))
model.add(Dense(10))
model.add(Activation("relu"))
model.add(Dense(1))
model.summary()

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_21 (Dense)            (None, 10)                140       
                                                                 
 activation (Activation)     (None, 10)                0         
                                                                 
 dense_22 (Dense)            (None, 10)                110       
                                                                 
 activation_1 (Activation)   (None, 10)                0         
                                                                 
 dense_23 (Dense)            (None, 1)                 11        
                                                                 
Total params: 261
Trainable params: 261
Non-trainable params: 0
_________________________________________________________________


## DNN Binary Classifier

_Binary classifier_ (aka _logistic classifier_) predicts whether or not the input belongs to a given class.

- a) A `sigmoid` activation on the output layer squashes the output to a probability between 0 and 1.
- b) `binary_crossentropy` loss with the `rmsprop` optimizer is a common convention for a binary classifier.

In [14]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(13,), activation="relu"))
model.add(Dense(10, activation="relu"))
model.add(Dense(1, activation="sigmoid"))  # a)
model.compile(
    loss="binary_crossentropy", optimizer="rmsprop", metrics=["accuracy"]
)  # b)
model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_12 (Dense)            (None, 10)                140       
                                                                 
 dense_13 (Dense)            (None, 10)                110       
                                                                 
 dense_14 (Dense)            (None, 1)                 11        
                                                                 
Total params: 261
Trainable params: 261
Non-trainable params: 0
_________________________________________________________________
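
To show what the loss and the accuracy metric measure, here is a NumPy sketch of binary cross-entropy on made-up sigmoid outputs. The function is a textbook formulation, not Keras's own code, and thresholding at 0.5 is a common choice assumed here.

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return losses.mean()


y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_prob = np.array([0.9, 0.2, 0.4, 0.6])  # made-up sigmoid outputs

predicted = (y_prob >= 0.5).astype(int)  # probability -> class 0 or 1
accuracy = (predicted == y_true).mean()

print(predicted)  # [1 0 0 1]
print(accuracy)   # 0.5
print(binary_crossentropy(y_true, y_prob))
```

Confident wrong predictions are penalised more heavily than hesitant ones, which is why the loss is a finer training signal than accuracy.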


## DNN Multiclass Classifier

This model classifies each input into exactly one of several classes.

In [20]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(4,), activation="relu"))
model.add(Dense(10, activation="relu"))
model.add(
    Dense(5, activation="softmax")
)  # Softmax activation is used for a multiclass classifier.
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()

Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_27 (Dense)            (None, 10)                50        
                                                                 
 dense_28 (Dense)            (None, 10)                110       
                                                                 
 dense_29 (Dense)            (None, 5)                 55        
                                                                 
Total params: 215
Trainable params: 215
Non-trainable params: 0
_________________________________________________________________
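
The `categorical_crossentropy` loss expects one-hot encoded targets, one column per class. The NumPy sketch below uses made-up labels and probabilities to show the encoding and a textbook version of the loss; the helper names are illustrative and this is not Keras's implementation.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows."""
    return np.eye(num_classes)[labels]


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_prob = np.clip(y_prob, eps, 1.0)
    return -(y_true * np.log(y_prob)).sum(axis=1).mean()


labels = np.array([0, 3])  # made-up class ids, 5 classes in total
y_true = one_hot(labels, 5)

# Made-up softmax outputs: each row sums to 1.
y_prob = np.array([
    [0.7, 0.1, 0.1, 0.05, 0.05],
    [0.2, 0.2, 0.2, 0.2, 0.2],
])

print(y_true)
print(y_prob.argmax(axis=1))  # predicted class per row
print(categorical_crossentropy(y_true, y_prob))
```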


## DNN Multilabel Multiclass Classifier

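This model predicts two or more labels per input, each from its own set of classes. Because the network has more than one output layer, we use the functional API (`Model`) instead of a `Sequential` model.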

In [23]:
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

inputs = Input((3,))
x = Dense(10, activation="relu")(inputs)
x = Dense(10, activation="relu")(x)
output1 = Dense(5, activation="softmax")(x)  # Predicts one of five classes.
output2 = Dense(2, activation="softmax")(x)  # Predicts one of two classes.
model = Model(inputs, [output1, output2])
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()

Model: "model_3"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_7 (InputLayer)           [(None, 3)]          0           []                               
                                                                                                  
 dense_34 (Dense)               (None, 10)           40          ['input_7[0][0]']                
                                                                                                  
 dense_35 (Dense)               (None, 10)           110         ['dense_34[0][0]']               
                                                                                                  
 dense_36 (Dense)               (None, 5)            55          ['dense_35[0][0]']               
                                                                                            