# Types based on depth

- Shallow neural network: a neural network with typically 1 or 2 hidden layers
- Deep neural network: a neural network with many hidden layers, often with a large number of neurons in each layer


# Convolutional Neural Network

<center><img src="images/01.01.png"  style="width: 400px, height: 300px;"/></center>

- Similar to ordinary (fully connected) neural networks, but designed for image inputs
    - Ordinary neural networks take an N × 1 input vector, where N is the number of features (columns)
    - CNNs take an R × C × N input, where R is the number of rows, C the number of columns, and N the number of channels in an image
- Take images as input
- Allow us to build image-specific properties into the architecture
    - Makes forward propagation more efficient
- Reduced parameters
    - Convolution reduces the number of parameters and speeds up computation
    - Helps retain spatial dimensions and spatial information
- Working process:
    1. Convolution applies filters to extract features along the spatial dimensions
    2. Pooling extracts the most significant patterns in the spatial dimensions
    3. The fully connected layer flattens the output of the last convolution or pooling layer and densely connects every node of the flattened layer to every node in the output layer
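
To make the "reduced parameters" point concrete, here is a minimal sketch that compares the parameter count of one fully connected layer with one convolution layer on the same input. The image size (128 × 128 × 3), the 16 units/filters, and the 3 × 3 kernel are illustrative assumptions, and both helper functions are hypothetical names written for this example.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit in a fully connected layer."""
    return n_inputs * n_units + n_units


def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters


# Assumed input: a 128 x 128 RGB image (R = 128, C = 128, N = 3)
rows, cols, channels = 128, 128, 3

# A dense layer must first flatten the image into an (R*C*N) x 1 vector
flat_inputs = rows * cols * channels
dense_count = dense_params(flat_inputs, 16)

# A convolution layer reuses the same small filter at every position
conv_count = conv2d_params(3, 3, channels, 16)

print(flat_inputs)  # 49152
print(dense_count)  # 786448
print(conv_count)   # 448
```

Because the filter weights are shared across all positions, the convolution layer's parameter count does not depend on the image size at all.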

### Convolution Layer

<center><img src="images/01.02.png"  style="width: 400px, height: 300px;"/></center>
<center><img src="images/01.03.png"  style="width: 400px, height: 300px;"/></center>
<center><img src="images/01.04.png"  style="width: 400px, height: 300px;"/></center>


- A filter (kernel) slides over the pixels of an image
- Sliding proceeds from left to right and top to bottom with a specified step size (stride)
- Each output value is the dot product of the filter and the pixel values it overlaps
- Each filter produces a new, smaller matrix (feature map) from the original or previous pixel matrix
- ReLU is applied to the output of the convolution step
    - Keeps positive values and sets negative values to zero
    - Mitigates the vanishing gradient problem associated with sigmoid activations
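
The sliding dot product described above can be sketched in NumPy as follows. This is a simplified, single-channel version with no padding; the `conv2d` and `relu` helpers and the sample values are assumptions made for illustration. Like most deep learning libraries, it does not flip the kernel.

```python
import numpy as np


def conv2d(image, kernel, stride=1):
    """Slide `kernel` over a 2-D `image` (no padding) and return the feature map."""
    img_h, img_w = image.shape
    k_h, k_w = kernel.shape
    out_h = (img_h - k_h) // stride + 1
    out_w = (img_w - k_w) // stride + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):        # top to bottom
        for j in range(out_w):    # left to right
            top, left = i * stride, j * stride
            patch = image[top:top + k_h, left:left + k_w]
            # Dot product of the filter and the pixels it overlaps
            output[i, j] = np.sum(patch * kernel)
    return output


def relu(x):
    """Keep positive values, set negative values to zero."""
    return np.maximum(x, 0)


# Hypothetical 4 x 4 single-channel image and 2 x 2 filter
image = np.array([[1, 2, 0, 1],
                  [0, 1, 3, 2],
                  [2, 1, 0, 1],
                  [1, 0, 2, 3]], dtype=float)
kernel = np.array([[1, -1],
                   [-1, 1]], dtype=float)

feature_map = conv2d(image, kernel, stride=1)  # values: [[0, 4, -2], [-2, -3, 2], [0, 3, 0]]
activated = relu(feature_map)                  # values: [[0, 4, 0], [0, 0, 2], [0, 3, 0]]
print(feature_map.shape)  # (3, 3)
```

A 4 × 4 input with a 2 × 2 filter and stride 1 gives a 3 × 3 feature map, which illustrates why each filter produces a smaller matrix than its input.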

### Pooling Layer

<center><img src="images/01.05.png"  style="width: 400px, height: 300px;"/></center>
<center><img src="images/01.06.png"  style="width: 400px, height: 300px;"/></center>


- Summarizes the information within a specified region
- Provides robustness to small spatial variations, which helps the neural network identify patterns in images
- Reduces the spatial dimensions by keeping only the summarized information
- Max pooling: keeps the maximum value in the specified region
- Average pooling: keeps the average value in the specified region
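
As a rough illustration of both pooling types, the sketch below applies a 2 × 2 window with stride 2 to a small feature map. The `pool2d` helper and the sample values are assumptions for this example, not part of any library.

```python
import numpy as np


def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Summarize each size x size region with its max or average value."""
    if mode not in ("max", "average"):
        raise ValueError("mode must be 'max' or 'average'")
    reduce_fn = np.max if mode == "max" else np.mean
    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            region = feature_map[top:top + size, left:left + size]
            output[i, j] = reduce_fn(region)
    return output


# Hypothetical 4 x 4 feature map
feature_map = np.array([[1, 3, 2, 4],
                        [5, 7, 6, 8],
                        [4, 2, 1, 3],
                        [0, 6, 5, 1]], dtype=float)

max_pooled = pool2d(feature_map, mode="max")      # values: [[7, 8], [6, 5]]
avg_pooled = pool2d(feature_map, mode="average")  # values: [[4, 5], [3, 2.5]]
print(max_pooled.shape)  # (2, 2)
```

Both variants shrink the 4 × 4 map to 2 × 2; they differ only in how each region is summarized.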

### Fully Connected Layer

- Flattens the output of the last convolution or pooling layer
- Connects every node of the current layer to every node in the next layer
- e.g. all nodes from the last convolution or pooling layer are connected to the N nodes of the output layer (for an N-class classification problem)
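
A minimal NumPy sketch of the flatten-then-connect step is shown here. The pooled shape (4 × 4 × 8), the 3 classes, the random weights, and the softmax at the end are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed output of the last pooling layer: 4 x 4 spatial grid, 8 channels
pooled = rng.normal(size=(4, 4, 8))
n_classes = 3  # assumed N-class problem

# Flatten: (4, 4, 8) -> (128,)
flat = pooled.reshape(-1)

# Dense layer: every flattened node connects to every output node,
# so the weight matrix has one row per input node and one column per class
weights = rng.normal(scale=0.1, size=(flat.size, n_classes))
bias = np.zeros(n_classes)
logits = flat @ weights + bias

# Softmax turns the N scores into class probabilities that sum to 1
exp_scores = np.exp(logits - logits.max())
probabilities = exp_scores / exp_scores.sum()

print(flat.shape, weights.shape, probabilities.shape)  # (128,) (128, 3) (3,)
```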

# Sample CNN

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.utils import to_categorical

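# X_train, X_test (image arrays of shape (n, 128, 128, 3)) and
# y_train, y_test (integer class labels) are assumed to be loaded already

# convert the target variable to one-hot (categorical) vectors
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# number of classes, inferred here from the one-hot width
# (assumes every class appears in y_train)
n_class = y_train.shape[1]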

# define the CNN model
model = Sequential()
model.add(Conv2D(16, kernel_size=(2,2), strides=(1,1), activation='relu', input_shape=(128,128,3)))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Conv2D(32, kernel_size=(2,2), strides=(1,1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(n_class, activation='softmax'))

# compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=1, validation_data=(X_test, y_test))

# evaluate the model
loss, accuracy = model.evaluate(X_test, y_test)
print('Categorical Crossentropy Loss:', loss)
print('Accuracy:', accuracy)
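
To see how each layer in the model above changes the tensor shape, the following sketch traces the sizes by hand, without TensorFlow. It assumes the Keras defaults of no padding (`padding='valid'`) and a pooling stride equal to the pool size when `strides` is omitted. The helper name is hypothetical.

```python
def output_size(size, window, stride):
    """Output length along one axis with no padding ('valid')."""
    return (size - window) // stride + 1


size, channels = 128, 3  # input_shape=(128, 128, 3)
total_params = 0

# Conv2D(16, kernel_size=(2,2), strides=(1,1))
total_params += (2 * 2 * channels + 1) * 16
size, channels = output_size(size, 2, 1), 16   # 127 x 127 x 16

# MaxPooling2D(pool_size=(2,2), strides=(2,2))
size = output_size(size, 2, 2)                 # 63 x 63 x 16

# Conv2D(32, kernel_size=(2,2), strides=(1,1))
total_params += (2 * 2 * channels + 1) * 32
size, channels = output_size(size, 2, 1), 32   # 62 x 62 x 32

# MaxPooling2D(pool_size=(2,2)); stride defaults to the pool size
size = output_size(size, 2, 2)                 # 31 x 31 x 32

# Flatten
flat_units = size * size * channels            # 30752

# Dense(128)
total_params += flat_units * 128 + 128

print(size, channels, flat_units)  # 31 32 30752
print(total_params)                # 3938672, excluding the final Dense(n_class) layer
```

Almost all of the parameters sit in the first dense layer, which is why pooling before flattening matters for model size.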


# Recurrent Neural Network

<center><img src="images/02.01.png"  style="width: 400px, height: 300px;"/></center>


- Traditional neural networks treat each input as independent of the others
- Recurrent neural networks incorporate the dependency between elements of a sequence
- RNNs are networks with loops
- Each node computes: new_output = activation(input_data * input_weight + previous_output * recurrent_weight)
- Example: LSTM (Long Short-Term Memory)
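
The per-node computation above can be sketched as a loop over a sequence in NumPy. This is a plain (vanilla) RNN cell, not an LSTM; the layer sizes, random weights, bias term, and the choice of `tanh` as the activation are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5  # assumed sizes

W_x = rng.normal(scale=0.5, size=(input_size, hidden_size))   # input weights
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))  # recurrent weights
b = np.zeros(hidden_size)

sequence = rng.normal(size=(seq_len, input_size))
h = np.zeros(hidden_size)  # there is no previous output before the first step

outputs = []
for x_t in sequence:
    # new_output = activation(input * input_weight + previous_output * recurrent_weight)
    h = np.tanh(x_t @ W_x + h @ W_h + b)
    outputs.append(h)
outputs = np.stack(outputs)

print(outputs.shape)  # (5, 4)
```

The same weights are reused at every time step, and `h` carries information from earlier inputs forward, which is the "loop" in the network.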

# Auto-encoders

<center><img src="images/02.02.png"  style="width: 400px, height: 300px;"/></center>


- A data compression algorithm
- An unsupervised neural network model
- Trains itself with back-propagation by setting the target equal to the input, so no labels are needed
- Tries to learn an approximation of the identity function
- Because neural networks use non-linear activation functions, autoencoders can learn non-linear data projections that are more expressive than PCA or other basic linear techniques
- Compression and decompression functions are learned automatically from data
- Data-specific (can only compress data similar to what they were trained on)
- How it works:
    - Takes an image as input
    - Uses an encoder to find an optimal compressed representation of the input image
    - Uses a decoder to restore the original image from the compressed representation
- Related example: RBM (Restricted Boltzmann Machine)
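
The encoder/decoder idea can be sketched with a tiny autoencoder trained by gradient descent in NumPy. This is only a toy under stated assumptions: synthetic 8-feature data instead of images, a 2-unit code, a `tanh` encoder, a linear decoder, and a hand-picked learning rate. It is not an RBM.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed data: 200 samples with 8 features lying near a 2-D subspace
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(scale=0.5, size=(2, 8)) + 0.05 * rng.normal(size=(200, 8))

n_features, code_size, lr = 8, 2, 0.1
W_enc = rng.normal(scale=0.1, size=(n_features, code_size))
b_enc = np.zeros(code_size)
W_dec = rng.normal(scale=0.1, size=(code_size, n_features))
b_dec = np.zeros(n_features)


def forward(data):
    code = np.tanh(data @ W_enc + b_enc)  # encoder: compress 8 features to 2
    recon = code @ W_dec + b_dec          # decoder: restore 8 features
    return code, recon


losses = []
for epoch in range(1000):
    code, recon = forward(X)
    error = recon - X                     # the target is the input itself
    losses.append(np.mean(error ** 2))

    # Back-propagate the mean squared reconstruction error
    grad_recon = 2 * error / error.size
    grad_W_dec = code.T @ grad_recon
    grad_b_dec = grad_recon.sum(axis=0)
    grad_code = grad_recon @ W_dec.T
    grad_pre = grad_code * (1 - code ** 2)  # derivative of tanh
    grad_W_enc = X.T @ grad_pre
    grad_b_enc = grad_pre.sum(axis=0)

    # In-place updates so forward() sees the new weights
    W_dec -= lr * grad_W_dec
    b_dec -= lr * grad_b_dec
    W_enc -= lr * grad_W_enc
    b_enc -= lr * grad_b_enc

code, recon = forward(X)
print(code.shape, recon.shape)  # (200, 2) (200, 8)
print(losses[0], losses[-1])    # reconstruction error before and after training
```

No labels appear anywhere in the training loop: the reconstruction error against the input is the only training signal.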