## Neural Networks
- inference (prediction)
- training

### Neurons and the brain

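Neural networks were originally motivated by attempts to mimic how neurons in the brain work.  
Application areas, in rough order of adoption: speech recognition > images (computer vision) > text (NLP) > ...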

### Neural network architecture
- A neural network consists of:
 1. Input layer: the vector of input features $\vec{x}$
 2. Hidden layer: multiple neurons that output a vector of activation values $\vec{a}$
 3. Output layer: produces the final prediction
- A network can have multiple hidden layers (multi-layer perceptron)
- The choice of architecture (number of hidden layers and number of units per hidden layer) affects performance

### Recognising Images (Computer Vision)

1. Concept: Input image > Output
 - Face recognition: Image > Identity of the person in the picture
 - A 1000x1000 pixel image is represented as a matrix of pixel intensities, each ranging from 0 to 255
 - The matrix is unrolled into a list / feature vector of 1,000,000 pixel intensity values
 - $\vec{x}$ > Hidden layers > probability of the person being 'XYZ'
 - Each successive hidden layer looks at a progressively bigger window of the image (lines > individual face parts > larger portions of the face)
 - Activations of later layers represent higher-level features
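
The unrolling step above can be sketched in NumPy. This is a minimal illustration, assuming a single-channel image; the random `image` array is a hypothetical stand-in for real pixel data.

```python
import numpy as np

# Hypothetical 1000x1000 single-channel image with intensities in 0..255.
rng = np.random.default_rng(seed=0)
image = rng.integers(low=0, high=256, size=(1000, 1000), dtype=np.uint8)

# Unroll the pixel grid row by row into one feature vector.
x = image.reshape(-1)

print(image.shape)  # (1000, 1000)
print(x.shape)      # (1000000,)
```

Because the unrolling goes row by row, `x[1000]` is the first pixel of the second row, `image[1, 0]`.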

### Neural network layer (A layer of neurons)


- Each neuron is a logistic regression unit
- Each neuron has its own parameters ($\vec{w}_1, b_1$; $\vec{w}_2, b_2$; ...) and outputs one activation value; together these values form the layer's activation vector
- $\vec{a}^{[1]}$ - Activation value vector of layer 1
- $\vec{a}^{[1]}$ becomes the input for layer 2

<figure>
   <img src="./images/Week4_1.png"  style="width:540px;height:300px;" >
</figure>


 ${\large  a^{[l]}_j = g({\vec{w}^{[l]}_j} \cdot \vec{a}^{[l-1]} + b^{[l]}_j)}$  
 $a^{[l]}_j$ = Activation value (a scalar) of layer $l$, unit (neuron) $j$  
 ${\vec{w}^{[l]}_j}$, $b^{[l]}_j$ = Parameters $w,b$ of layer $l$, unit (neuron) $j$  
 $\vec{a}^{[l-1]}$ = Output of layer $l-1$ (previous layer), with $\vec{a}^{[0]} = \vec{x}$  
 $g$ = Activation function (here the sigmoid)
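
As a rough sketch of the layer formula, the loop below computes one activation per unit. The helper names `sigmoid` and `dense` and all parameter values are assumptions made for illustration, not part of the original notes.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_prev, W, b):
    """Compute one layer's activations from the previous layer's output.

    a_prev: shape (n_prev,), activations of layer l-1
    W:      shape (n_units, n_prev), row j holds the weights of unit j
    b:      shape (n_units,), entry j holds the bias of unit j
    """
    a = np.zeros(W.shape[0])
    for j in range(W.shape[0]):
        z = np.dot(W[j], a_prev) + b[j]  # w_j . a_prev + b_j
        a[j] = sigmoid(z)                # a_j = g(z)
    return a

# Hypothetical values, chosen so that z is 0, 3 and -3 for the three units.
a_prev = np.array([1.0, 2.0])
W = np.array([[0.5, -0.25],
              [1.0,  1.0],
              [-1.0, -1.0]])
b = np.array([0.0, 0.0, 0.0])

a = dense(a_prev, W, b)
print(a)  # three activations, each between 0 and 1
```

The unit with $z = 0$ outputs exactly 0.5, the unit with positive $z$ outputs more than 0.5, and the unit with negative $z$ outputs less.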



### Inference: Making predictions (forward propagation)
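
Forward propagation applies the layer computation repeatedly, feeding each layer's activations into the next until the output layer is reached. The sketch below is one possible NumPy version; the `forward` helper, the parameter values, and the 0.5 decision threshold are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Pass x through each (W, b) pair in order and return the last activation."""
    a = x
    for W, b in params:
        a = sigmoid(W @ a + b)  # all units of a layer computed at once
    return a

# Hypothetical parameters: 2 inputs -> 3 hidden units -> 1 output unit.
params = [
    (np.array([[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]), np.array([0.0, 0.1, -0.1])),
    (np.array([[1.0, -1.0, 0.5]]), np.array([0.0])),
]

x = np.array([1.0, 2.0])
a_out = forward(x, params)    # shape (1,), value between 0 and 1
y_hat = int(a_out[0] >= 0.5)  # assumed threshold of 0.5 for a yes/no prediction
print(a_out, y_hat)
```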


### Building models using TensorFlow

```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras import Sequential
from tensorflow.keras.losses import MeanSquaredError, BinaryCrossentropy
from tensorflow.keras.activations import sigmoid
```

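```python
# One example with two features; shape (1, 2) because Dense expects a 2-D batch.
x = np.array([[200.0, 17.0]])
# First hidden layer: 3 units with sigmoid activation.
layer_1 = Dense(units=3, activation='sigmoid')
a1 = layer_1(x)
x
```

```
array([[200.,  17.]])
```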

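```python
# Output layer: 1 sigmoid unit fed by the activations of layer 1.
layer_2 = Dense(units=1, activation='sigmoid')
a2 = layer_2(a1)
a2  # the exact value depends on the randomly initialised weights
```

```
<tf.Tensor: shape=(1, 1), dtype=float32, numpy=array([[0.2483764]], dtype=float32)>
```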
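
The two layers can also be chained into a single model object. The sketch below assumes the Keras `Sequential` API with the same layer sizes as above. The weights are untrained and randomly initialised, so the printed value will differ from run to run.

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

# Same architecture as above: 2 inputs -> 3 sigmoid units -> 1 sigmoid unit.
model = Sequential([
    Input(shape=(2,)),
    Dense(units=3, activation='sigmoid'),
    Dense(units=1, activation='sigmoid'),
])

x = np.array([[200.0, 17.0]])
a2 = model.predict(x, verbose=0)  # forward propagation through both layers
print(a2.shape)  # (1, 1)
print(a2)        # a single value between 0 and 1
```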