#### Deep Neural Network Architecture - Understanding

#### Keras

In [55]:
import tensorflow as tf
import keras

##### Sequential models (from Keras) - are a plain stack of layers in which each layer has exactly one input tensor and one output tensor. They are not appropriate when the model has multiple inputs or outputs, when any layer has multiple inputs or outputs, or when a non-linear topology (e.g., branches or skip connections) is needed.

##### Functional API (from Keras) - helps to create models with a non-linear topology, shared layers, or multiple inputs and outputs. It also allows non-sequential models to be combined, for example into an ensemble model or an encoder-decoder pair.

In [45]:
# Sequential model
seq_model = keras.Sequential(name="my_seq_model")
seq_model.add(keras.Input(shape=(250, 250, 3)))
seq_model.add(keras.layers.Dense(32,activation='relu', name='layer-1'))
seq_model.add(keras.layers.Dense(32,activation='relu', name='layer-2'))
seq_model.add(keras.layers.Dense(1,activation='sigmoid', name='layer-3'))
seq_model.summary()

In [44]:
# Functional API
inputs = keras.Input(shape=(256,25,3))
dense = keras.layers.Dense(32, activation="relu", name="G-x")
x = dense(inputs)
x = keras.layers.Dense(32, activation="relu", name='G-q')(x)
outputs = keras.layers.Dense(10, name='G-k')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="my_func_model")
model.summary()
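
The multi-input case that a Sequential model cannot express can be sketched with the Functional API as follows. The input names, shapes, and layer sizes are hypothetical and chosen only for illustration.

```python
import keras

# Two separate inputs (hypothetical shapes, for illustration only)
image_features = keras.Input(shape=(64,), name="image_features")
meta_features = keras.Input(shape=(8,), name="meta_features")

# Each input is processed by its own branch
x1 = keras.layers.Dense(32, activation="relu", name="image_branch")(image_features)
x2 = keras.layers.Dense(8, activation="relu", name="meta_branch")(meta_features)

# Merging two branches is a non-linear topology, so Sequential cannot build it
merged = keras.layers.Concatenate(name="merge")([x1, x2])
outputs = keras.layers.Dense(1, activation="sigmoid", name="prediction")(merged)

multi_input_model = keras.Model(
    inputs=[image_features, meta_features],
    outputs=outputs,
    name="my_multi_input_model",
)
multi_input_model.summary()
```

The model is called with a list of two arrays, one per input, in the same order as `inputs`.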

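#### Transfer Learning - How is it done?
##### Models are made partially trainable. Often the top layers (closest to the output) are trainable, while the bottom layers (closest to the input) are non-trainable, i.e., frozen. Another method is to freeze the entire pre-trained base model and train only the new layers added on top of it.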

In [51]:
# freeze all layers of the sequential model except the last one
for layer in seq_model.layers[:-1]:
    layer.trainable=False

seq_model.compile()
seq_model.summary()
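
One way to confirm which weights remain trainable after freezing is to count the weight tensors. This is a minimal sketch on a small stand-in model; the name `demo_model` and the layer sizes are hypothetical and not part of the notebook above.

```python
import keras

# Small stand-in model (hypothetical sizes) with three Dense layers
demo_model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        keras.layers.Dense(8, activation="relu", name="hidden_1"),
        keras.layers.Dense(8, activation="relu", name="hidden_2"),
        keras.layers.Dense(1, activation="sigmoid", name="head"),
    ],
    name="freeze_demo",
)

# Freeze every layer except the last one
for layer in demo_model.layers[:-1]:
    layer.trainable = False

# Each Dense layer owns two weight tensors: a kernel and a bias
n_trainable = len(demo_model.trainable_weights)
n_frozen = len(demo_model.non_trainable_weights)
print("trainable tensors:", n_trainable, "frozen tensors:", n_frozen)
```

Only the kernel and bias of the last layer are left trainable; the four tensors of the two hidden layers are frozen.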

In [56]:
# freeze the base model. Xception is a model pre-trained on ImageNet
base_model = keras.applications.Xception(
    weights='imagenet',
    include_top=False,
    pooling='avg')

base_model.trainable = False
# Use a Sequential model to add a trainable classifier on top
ext_model = keras.Sequential([
    base_model,
    keras.layers.Dense(1000),
])
ext_model.compile()
ext_model.summary()

##### CNN and Dense - top level choice of Neural Network

##### Conv layers' parameters are the weights of learnable convolutional filters. In an image-processing context, during the forward pass a spatial filter (h x w x 3) slides across (convolves with) an input image volume (H x W x 3), computes a dot product between the filter and each overlapping patch, and produces a 2-dimensional activation map (feature map). The filters in each Conv layer learn to extract different features. Example - early layers typically respond to simple patterns such as edges, while later layers respond to more complex patterns such as contours and shapes. Finally, the activation maps of all filters in a layer are stacked along the depth dimension to produce the output volume.
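
The forward pass described above can be sketched in NumPy. This is an illustrative loop-based version with stride 1 and no padding, not how Keras implements it; the function name `conv2d_single_filter` and all sizes are hypothetical. As in most deep learning frameworks, the filter is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def conv2d_single_filter(image, kernel):
    """Slide one (h, w, C) filter over an (H, W, C) image (stride 1, no padding)."""
    H, W, C = image.shape
    h, w, c = kernel.shape
    assert c == C, "filter depth must match image depth"
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + h, j:j + w, :]
            # Dot product between the filter and the overlapping patch
            out[i, j] = np.sum(patch * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                                 # H x W x 3 input volume
filters = [rng.standard_normal((3, 3, 3)) for _ in range(4)]  # four h x w x 3 filters

# One 2D activation map per filter, stacked along the depth dimension
feature_maps = [conv2d_single_filter(image, f) for f in filters]
output_volume = np.stack(feature_maps, axis=-1)
print(output_volume.shape)  # (6, 6, 4)
```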

##### Dense / Fully Connected layers perform a linear (affine) operation on the input vector, usually followed by an activation function. Hence the output of the layer before the first Dense layer is flattened into one long vector that the Dense network can process.
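
As a rough NumPy sketch of the flatten-then-dense step: the feature volume is reshaped into a vector and passed through `W @ x + b` followed by ReLU. The shapes, the number of units, and the random weights are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output volume of the last Conv layer: 6 x 6 with depth 4
feature_volume = rng.random((6, 6, 4))

# Flatten into one long vector of length 6 * 6 * 4 = 144
flat = feature_volume.reshape(-1)

# Dense layer with 10 units: one weight row per unit, plus a bias per unit
units = 10
W = rng.standard_normal((units, flat.size)) * 0.01
b = np.zeros(units)

z = W @ flat + b         # linear (affine) operation
a = np.maximum(z, 0.0)   # ReLU activation
print(flat.shape, a.shape)  # (144,) (10,)
```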

### Lower Rank Matrices

##### Rank of a Matrix - the dimensionality of the data it holds. It measures the amount of independent information in the matrix.

##### The number of independent rows (or columns) present in the matrix determines the rank. Example: $\begin{bmatrix} 1 & 4 & 3\\ 2 & 8 & 6 \\ 3 & 12 & 9\end{bmatrix}$
##### Row 2 = 2 x row 1; row 3 = 3 x row 1. Since row 1 is the only independent row, the matrix has a rank of 1.
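
The rank of the 3 x 3 example can be checked numerically with `np.linalg.matrix_rank`. The variable names are illustrative.

```python
import numpy as np

A = np.array([
    [1, 4, 3],
    [2, 8, 6],
    [3, 12, 9],
])

# Rows 2 and 3 are multiples of row 1
print(np.array_equal(A[1], 2 * A[0]))  # True
print(np.array_equal(A[2], 3 * A[0]))  # True

rank_A = np.linalg.matrix_rank(A)
print(rank_A)  # 1

# For comparison, the 3 x 3 identity matrix has three independent rows
rank_I = np.linalg.matrix_rank(np.eye(3))
print(rank_I)  # 3
```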
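Traditional fine-tuning requires the weights of the whole model to be adjusted. **LoRA** (Low-Rank Adaptation) instead keeps the pre-trained weights frozen and trains only a much smaller set of parameters (i.e., low-rank matrices) that are added to them, which reduces the computational and memory cost.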
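
The saving can be illustrated with a small NumPy sketch of the low-rank update, assuming a frozen weight matrix `W` of size d x k and a trainable update `B @ A` of rank at most r. The sizes `d`, `k`, `r` and the random values are hypothetical; this shows only the parameter arithmetic, not a training loop.

```python
import numpy as np

d, k, r = 512, 512, 8   # hypothetical layer size and LoRA rank
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))          # pre-trained weights, kept frozen
B = np.zeros((d, r))                     # trainable, starts at zero
A = rng.standard_normal((r, k)) * 0.01   # trainable, small random start

# Adapted weights: frozen matrix plus the low-rank update
W_adapted = W + B @ A

full_params = d * k        # parameters touched by full fine-tuning
lora_params = r * (d + k)  # parameters trained by the low-rank update
print(full_params, lora_params)  # 262144 8192

# Stand-in for a trained B: the update B @ A still has rank at most r
B_trained = rng.standard_normal((d, r))
update_rank = np.linalg.matrix_rank(B_trained @ A)
print(update_rank <= r)  # True
```

Because `B` starts at zero in this sketch, the adapted weights equal the pre-trained weights before any training step.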