<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/Logo.png?alt=media&token=06318ee3-d7a0-44a0-97ae-2c95f110e3ac" width="100" height="100" align="right"/>

## 3 Neural Networks in TensorFlow

## 3.1 Artificial Intelligence, Machine Learning and Neural Network


### <font color='Orange'> Artificial Intelligence </font>
> <font size="3">**“Artificial Intelligence is the science and engineering of making intelligent machines, especially intelligent computer programs.” by John McCarthy, 1955**</font>
    
> <font size="3">**In essence, AI is a <span style="color:#4285F4">machine</span> with <span style="color:#4285F4">cognitive functions</span> that solves problems usually solved by humans using their <span style="color:#4285F4">natural intelligence</span>.**</font>

### <font color='Orange'> Machine Learning </font>

<font size="3">**Machine learning is an application of artificial intelligence that gives machines the ability to <span style="color:#4285F4">detect patterns</span> and <span style="color:#4285F4">make predictions and recommendations</span>.**</font>

<font size="3">**Machine learning algorithms are often categorized as**</font>
> <font size="3">**Supervised learning**</font>

> <font size="3">**Unsupervised learning**</font>

> <font size="3">**Reinforcement learning**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Sup%2C%20Unsup%2C%20Rein.png?alt=media&token=4baee322-267b-4aab-b7b9-101b2c88685e" width="800" align="center"/>

### <font color='Orange'> Neural Network </font>

<font size="3">**A neural network is a massively parallel distributed processor <span style="color:#4285F4">(network)</span> made up of simple processing units <span style="color:#4285F4">(neurons)</span>. It has a natural propensity for:**</font>

> <font size="3">**storing experiential knowledge through <span style="color:#4285F4">learning</span>**</font>

> <font size="3">**making it available for use <span style="color:#4285F4">(classification/prediction)</span>**</font>

<font size="3">**A neural network resembles the brain in two respects:**</font>
> <font size="3">**Knowledge is acquired by the <span style="color:#4285F4">network</span> through a <span style="color:#4285F4">learning process</span>**</font>

> <font size="3">**Interneuron connection strengths, known as <span style="color:#4285F4">synaptic weights</span>, are used to store the acquired knowledge**</font>


## 3.2 A Gentle Introduction to Machine Learning and Neural Networks

### <font color='Orange'> Rule-based Expert System </font>

<font size="3">**A rule-based expert system is the simplest form of artificial intelligence and uses prescribed knowledge-based rules to solve a problem. The objective of an expert system is to take knowledge from a human expert and convert this into a number of well defined rules to apply explicitly to the input data.**</font>

<font size="3">**In the most basic form, the rules are conditional statements <span style="color:#4285F4">(if a, then x, else if b, then y)</span>. These systems are best suited to small problems: the more complex a system is, the more rules are required to describe it, and the harder it becomes to define rules for every possible outcome.**</font>

### <font color='#176BEF'> Examples </font>
<hr style="border:2px solid #E1F6FF"> </hr>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3AndGate.png?alt=media&token=0ffecb41-4b43-468f-8215-556e2fbefbdd" width="500" align="center"/>

In [2]:
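def AndGate(A, B):
    # Rule-based AND gate: every input combination is spelled out explicitly
    if A not in (0, 1) or B not in (0, 1):
        raise ValueError('A and B must each be 0 or 1')
    if A == 0:
        if B == 0:
            C = 0
        elif B == 1:
            C = 0
    elif A == 1:
        if B == 0:
            C = 0
        elif B == 1:
            C = 1
    return C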

In [3]:
print(AndGate(0,0))
print(AndGate(0,1))
print(AndGate(1,0))
print(AndGate(1,1))

0
0
0
1


<hr style="border:2px solid #E1F6FF"> </hr>

### <font color='Orange'> Machine Learning </font>

<font size="3">**Machine learning is an application of artificial intelligence that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3MLConcept.png?alt=media&token=fbe246b6-ee85-45e5-b4a7-ba15e00043d7" width="1000" align="center"/>

### <font color='#176BEF'> Examples </font>
<hr style="border:2px solid #E1F6FF"> </hr>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3MLExample.png?alt=media&token=153c3e16-5f9b-4fb9-a47e-b348b976262b" width="700" align="center"/>

In [4]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

X=np.array([[0, 0], 
            [0, 1],
            [1, 0], 
            [1, 1]])

y=np.array([0, 0, 0, 1]).reshape(4,1)

In [5]:
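# Two-layer network: 4 hidden units followed by a single sigmoid output
model = Sequential()
model.add(Dense(4, input_shape=(X.shape[1],)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='RMSprop', loss='binary_crossentropy', metrics=['accuracy'])
# The default batch size is 32, so the 4 samples form a single batch per epoch
model.fit(X, y, epochs=200)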

Epoch 1/200
...
Epoch 200/200


<tensorflow.python.keras.callbacks.History at 0x1b1a5f92940>

In [6]:
Loss, Acc = model.evaluate(X, y, verbose=0)
print('The accuracy is:', Acc*100, '%')

The accuracy is: 75.0 %


<hr style="border:2px solid #E1F6FF"> </hr>

### <font color='Orange'> Machine Learning Requires Big Data </font>

<font size="3">**The process of learning begins with observations or data, such as examples, direct experience, or instruction. The algorithm looks for patterns in this data and then makes predictions or classifications based on the examples it has learnt. Machine learning algorithms therefore generally become more effective and accurate as the training dataset grows, and many of them need large datasets to work well. Without large, well-maintained datasets, machine learning algorithms fall far short of their potential.**</font>

### <font color='#176BEF'> Examples </font>
<hr style="border:2px solid #E1F6FF"> </hr>

In [7]:
X=np.array([[0, 0], 
            [0, 1],
            [1, 0], 
            [1, 1]])

y=np.array([0, 0, 0, 1]).reshape(4,1)

# Repeat every sample 50 times: 200 rows, but still the same four distinct examples
X_new = np.repeat(X, 50, axis=0)
y_new = np.repeat(y, 50, axis=0)

In [8]:
model = Sequential()
model.add(Dense(4, input_shape=(X.shape[1],)))
model.add(Dense(1, activation ='sigmoid'))
model.compile(optimizer='RMSprop', loss='binary_crossentropy', metrics=['accuracy'])

model.fit(X_new, y_new, epochs=100)

Epoch 1/100
...

<tensorflow.python.keras.callbacks.History at 0x1b1a64f1610>

In [9]:
Loss_new, Acc_new = model.evaluate(X_new, y_new, verbose=0)
print('The accuracy is:', Acc_new*100, '%')

The accuracy is: 100.0 %


<font size="5"><span style="background-color:#EA4335; color:white">&nbsp;!&nbsp;</span></font>
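<font size="3">**With enough data, not only is the accuracy higher, but convergence is also faster. Note that in this example <span style="background-color: #ECECEC; color:#0047bb">X_new</span> only repeats the same four samples 50 times, so the gain comes from the larger number of weight updates per epoch; with real data, additional distinct samples also provide new information.**</font>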

<hr style="border:2px solid #E1F6FF"> </hr>

## 3.3 Logistic Regression

<font size="3">**Logistic regression is a simple form of a neural network:**</font>

> <font size="3">**It is a <span style="color:#4285F4">one-layer neural network</span>**</font>

> <font size="3">**<span style="color:#4285F4">Classifies</span> data categorically (e.g. 0 or 1)**</font>

<font size="3">**There are two main steps in logistic regression:**</font>

<font size="3">**1. Parameters Initialization**</font>
> <font size="3">**Weights <span style="color:#4285F4">w</span>**</font> <br>
> <font size="3">**Biases <span style="color:#4285F4">b</span>**</font> <br>

<font size="3">**2. Loop:**</font> <br>
> <font size="3">**1. Forward propagation - Calculate <span style="color:#4285F4">Loss</span>**</font> <br>
> <font size="3">**2. Backward propagation - Calculate <span style="color:#4285F4">Gradient</span>**</font> <br>
> <font size="3">**3. Gradient descent - <span style="color:#4285F4">Update parameters</span>**</font>

<hr style="border:2px solid #34A853"> </hr>

### <font color='#34A853'> Logistic Regression - Forward Propagation </font>

> <font size="3">**1. Takes the input and calculates the <span style="color:#4285F4">weighted sum</span>**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic1.png?alt=media&token=9be673c5-9f47-49f3-a13e-278bb0731053" width="350" align="center"/>

> <font size="3">**2. Passes the weighted sum through the <span style="color:#4285F4">sigmoid function</span>**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic2.png?alt=media&token=41b063cd-8f50-486e-a7ba-49f83a3baab9" width="350" align="center"/>

> <font size="3">**3. Returns an output <span style="color:#4285F4">probability</span> between 0 and 1**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic3.png?alt=media&token=ff57a33f-7d76-4858-8a73-f3302f66e6de" width="350" align="center"/>

> <font size="3">**4. Calculate <span style="color:#4285F4">loss</span>**</font> <br>
> <font size="3"><img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic4.png?alt=media&token=99feb6f4-3947-4f4a-ae80-66060ee1b09a" width="500" align="center"/></font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic5.png?alt=media&token=26dccc8a-76e1-49b2-b118-181186c0a7c3" width="550" align="center"/>
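
The four forward-propagation steps above can be sketched in NumPy. This is a minimal illustration, assuming the usual formulation (weighted sum z = w·x + b, sigmoid activation, binary cross-entropy loss); the helper names `sigmoid`, `forward` and `bce_loss` and the example weights are hypothetical and not taken from the figures.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w, b):
    """Steps 1-3: weighted sum, then sigmoid, giving a probability."""
    z = np.dot(w, x) + b       # step 1: weighted sum
    return sigmoid(z)          # steps 2-3: value between 0 and 1

def bce_loss(y, a):
    """Step 4: binary cross-entropy loss for one sample."""
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

x = np.array([1.0, 1.0])       # one sample with two features
w = np.array([0.5, -0.5])      # assumed example weights
b = 0.0                        # assumed example bias

a = forward(x, w, b)           # z = 0, so a = 0.5
loss = bce_loss(1, a)          # -log(0.5), roughly 0.693
print(a, loss)
```

With these assumed values the weighted sum is zero, so the model is maximally uncertain and the loss equals log 2.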

### <font color='#34A853'> Logistic Regression - Backward Propagation </font>

> <font size="3">**1. The <span style="color:#4285F4">loss function</span> measures the error for one training sample**</font> <br>

> <font size="3">**2. The <span style="color:#4285F4">cost function</span>, <span style="color:#4285F4">J</span>, combines the losses of all training samples**</font> <br>
> <font size="3"><img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic6.png?alt=media&token=2a7f2cde-c88d-41fa-a5cc-a2f3a016c4ff" width="350" align="center"/></font>
    
> <font size="3">**3. Calculate the <span style="color:#4285F4">gradient</span> w.r.t. the weights and bias**</font>
> <font size="3"><img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic7.png?alt=media&token=71fe0f8d-a0c2-49f4-9175-581acc20f530" width="350" align="center"/></font>
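
As a rough sketch of the backward step, the gradients can be written in a few lines of NumPy. It assumes the cost J is the mean binary cross-entropy over all samples, for which the gradients are dw = Xᵀ(a − y)/m and db = mean(a − y); the function names and starting values are hypothetical. A finite-difference check is included to show that the analytic gradient agrees with a numerical estimate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(X, y, w, b):
    """Mean binary cross-entropy over all m samples (assumed form of J)."""
    a = sigmoid(X @ w + b)
    return float(np.mean(-(y * np.log(a) + (1 - y) * np.log(1 - a))))

def gradients(X, y, w, b):
    """Analytic gradients of the cost w.r.t. the weights and the bias."""
    m = X.shape[0]
    a = sigmoid(X @ w + b)
    dw = X.T @ (a - y) / m
    db = float(np.mean(a - y))
    return dw, db

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # AND-gate inputs
y = np.array([0, 0, 0, 1], dtype=float)
w = np.array([0.1, -0.2])      # assumed starting weights
b = 0.05                       # assumed starting bias

dw, db = gradients(X, y, w, b)

# Numerical estimate of dJ/db using a central finite difference
eps = 1e-6
db_numeric = (cost(X, y, w, b + eps) - cost(X, y, w, b - eps)) / (2 * eps)
print(dw, db, db_numeric)
```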

### <font color='#34A853'> Logistic Regression - Gradient Descent </font>

> <font size="3">**1. Set <span style="color:#4285F4">learning rate</span>, <span style="color:#4285F4">𝜸</span>**</font> <br>

> <font size="3">**2. Update the parameters: <span style="color:#4285F4">weights</span> and <span style="color:#4285F4">bias</span>**</font> <br>
> <font size="3"><img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic8.png?alt=media&token=942b04f4-ed1d-4d19-bc7e-1e3850344bfe" width="350" align="center"/></font>
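
The update rule can be illustrated with a single step. This sketch assumes the standard form, in which each parameter moves against its gradient by an amount scaled by the learning rate; `gradient_descent_step` is a hypothetical helper and the gradient values are made up for illustration.

```python
import numpy as np

def gradient_descent_step(w, b, dw, db, learning_rate):
    """Move the parameters one small step against their gradients."""
    w_new = w - learning_rate * dw
    b_new = b - learning_rate * db
    return w_new, b_new

w = np.array([0.5, -0.5])
b = 0.0
dw = np.array([0.2, -0.1])     # assumed example gradients
db = 0.4

w, b = gradient_descent_step(w, b, dw, db, learning_rate=0.1)
print(w, b)                    # approximately [0.48, -0.49] and -0.04
```

A larger learning rate takes bigger steps and can overshoot the minimum, while a smaller one converges more slowly.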

### <font color='Orange'> Logistic Regression - Full Training Algorithm </font>

<font size="3">**For loop (<span style="color:#4285F4">epochs</span>):**</font> <br>
> <font size="3">**1. Forward propagation – Calculate <span style="color:#4285F4">Loss</span>**</font> <br>
> <font size="3">**2. Backward propagation – Calculate <span style="color:#4285F4">Gradient</span>**</font> <br>
> <font size="3">**3. Gradient descent – Update <span style="color:#4285F4">parameters</span>**</font>
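
Putting the three steps together, the loop below trains logistic regression on the AND-gate data from section 3.2. It is a minimal sketch, not the notebook's own implementation: the zero initialization, learning rate and number of epochs are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

# Parameter initialization
w = np.zeros(2)
b = 0.0
learning_rate = 1.0            # assumed value
epochs = 2000                  # assumed value

costs = []
for epoch in range(epochs):
    # 1. Forward propagation: predictions and cost
    a = sigmoid(X @ w + b)
    costs.append(float(np.mean(-(y * np.log(a) + (1 - y) * np.log(1 - a)))))
    # 2. Backward propagation: gradients of the cost
    dw = X.T @ (a - y) / len(y)
    db = np.mean(a - y)
    # 3. Gradient descent: update the parameters
    w = w - learning_rate * dw
    b = b - learning_rate * db

# Threshold the predicted probabilities at 0.5 to obtain class labels
predictions = (sigmoid(X @ w + b) > 0.5).astype(int)
print(predictions, costs[0], costs[-1])
```

Because the AND gate is linearly separable, a single sigmoid unit is enough to classify all four inputs.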

    
<hr style="border:2px solid #34A853"> </hr>

## 3.4 From Logistic Regression to Neural Network

<font size="3">**Logistic regression is a simple form of a neural network with <span style="color:#4285F4">one layer </span> and <span style="color:#4285F4">classifies</span> data categorically (e.g. 0 or 1)**</font>

<font size="3">**In practice, neural networks can be much more complex. To understand the concept, however, a neural network can be thought of as many small sigmoid units stacked together.**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN1.png?alt=media&token=dd89a015-37c0-4849-a639-3e7bedeed65c" width="650" align="center"/>

<font size="3">**Similar to logistic regression, there are two main steps in training a neural network:**</font>

<font size="3">**1. Parameters Initialization**</font>
> <font size="3">**Weights <span style="color:#4285F4">w</span>**</font> <br>
> <font size="3">**Biases <span style="color:#4285F4">b</span>**</font> <br>

<font size="3">**2. For loop (<span style="color:#4285F4">epochs</span>):**</font> <br>
> <font size="3">**1. Forward propagation - Calculate <span style="color:#4285F4">Loss</span>**</font> <br>
> <font size="3">**2. Backward propagation - Calculate <span style="color:#4285F4">Gradient</span>**</font> <br>
> <font size="3">**3. Gradient descent - <span style="color:#4285F4">Update parameters</span>**</font>

<hr style="border:2px solid #34A853"> </hr>

### <font color='#34A853'> Neural Network - Forward Propagation </font>

> <font size="3">**STEP 1 - <font color='Red'>Hidden Layer 1 </font> <font color='#7F00FF'>Node 1</font>**</font>
>> <font size="3">**1. Takes the input and calculates the <span style="color:#4285F4">weighted sum</span>**</font> <br>
>> <font size="3">**2. Passes the weighted sum through the <span style="color:#4285F4">sigmoid function</span>**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN2.png?alt=media&token=b8f4c510-9e58-4cd9-a22e-b784142c94b3" width="850" align="center"/>

<font size="5"><span style="background-color:#EA4335; color:white">&nbsp;!&nbsp;</span></font>
<font size="3">**Notation: Z<sub>1</sub><sup>[1]</sup>, where**</font>
> <font size="3">**[1] represents the corresponding <font color='Red'>layer</font>**</font> <br>
> <font size="3">**1 represents the corresponding <font color='#7F00FF'>node</font>**</font>

> <font size="3">**STEP 1 - <font color='Red'>Hidden Layer 1 </font> <font color='#7F00FF'>Node 2</font>**</font>
>> <font size="3">**1. Takes the input and calculates the <span style="color:#4285F4">weighted sum</span>**</font> <br>
>> <font size="3">**2. Passes the weighted sum through the <span style="color:#4285F4">sigmoid function</span>**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN3.png?alt=media&token=20212b88-b355-4450-ab76-725cbb59d4a4" width="850" align="center"/>

<font size="5"><span style="background-color:#EA4335; color:white">&nbsp;!&nbsp;</span></font>
<font size="3">**z, w, b and a are all vectors, and the sigmoid function is applied element-wise to z. Vectorization can therefore replace explicit for loops and improve efficiency.**</font>

### <font color='Orange'> Vectorization </font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN4.png?alt=media&token=96cfb2b4-ad8c-4f27-8621-8d0120dc90b2" width="850" align="center"/>

<font size="5"><span style="background-color:#EA4335; color:white">&nbsp;!&nbsp;</span></font>
<font size="3">**Notation: The per-node vectors are stacked into matrices, and the node subscripts are dropped so that each symbol represents the whole layer.**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN5.png?alt=media&token=7d616726-cb81-4ed7-b67c-1d82bfc423ec" width="850" align="center"/>
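
The benefit of vectorization can be seen by computing one layer in two ways. The sketch below assumes a layer with 3 input features and 2 nodes (matching the model built in section 3.5) and uses random example values; each row of `W` holds the weights of one node.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=3)         # 3 input features (assumed layer size)
W = rng.normal(size=(2, 3))    # 2 nodes, one row of weights per node
b = rng.normal(size=2)         # one bias per node

# Node-by-node version: one weighted sum and one sigmoid per loop iteration
a_loop = np.zeros(2)
for j in range(2):
    z_j = np.dot(W[j], x) + b[j]
    a_loop[j] = sigmoid(z_j)

# Vectorized version: a single matrix-vector product covers every node
a_vec = sigmoid(W @ x + b)

print(np.allclose(a_loop, a_vec))
```

Both versions give the same activations; the vectorized form simply hands the loop to optimized NumPy routines.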

> <font size="3">**STEP 1 - <font color='Red'>Hidden Layer 2 </font> <font color='#7F00FF'>Node 1</font>**</font>
>> <font size="3">**1. Takes the input and calculates the <span style="color:#4285F4">weighted sum</span>**</font> <br>
>> <font size="3">**2. Passes the weighted sum through the <span style="color:#4285F4">sigmoid function</span>**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN6.png?alt=media&token=fb1ee7b7-5841-42dd-bd49-14baf9312e1e" width="850" align="center"/>

>> <font size="3">**3. Returns an output of <span style="color:#4285F4">probability</span> between 0 & 1**</font><br>
>> <font size="3">**4. Calculate <span style="color:#4285F4">loss</span> and <span style="color:#4285F4">cost</span>**</font> <br>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN7.png?alt=media&token=2767cccb-de9f-45ab-b3cb-72535b9768a5" width="850" align="center"/>
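
The complete forward pass through both layers can be sketched as follows. The layer sizes (3 inputs, 2 nodes, 1 output), the example data and the all-zero parameters are assumptions for illustration, and the cost is assumed to be the mean binary cross-entropy.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Propagate a batch of samples through two sigmoid layers."""
    A1 = sigmoid(X @ W1.T + b1)    # first layer activations, shape (m, 2)
    A2 = sigmoid(A1 @ W2.T + b2)   # output probabilities, shape (m, 1)
    return A1, A2

def cost(y, A2):
    """Mean binary cross-entropy over all samples (assumed form of J)."""
    return float(np.mean(-(y * np.log(A2) + (1 - y) * np.log(1 - A2))))

# 4 made-up samples with 3 features each
X = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0], [1, 1, 1]], dtype=float)
y = np.array([[0], [0], [0], [1]], dtype=float)

# All-zero parameters, chosen so the result is easy to verify by hand
W1, b1 = np.zeros((2, 3)), np.zeros(2)
W2, b2 = np.zeros((1, 2)), np.zeros(1)

A1, A2 = forward(X, W1, b1, W2, b2)
J = cost(y, A2)                    # every prediction is 0.5, so J = log(2)
print(A2.ravel(), J)
```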

### <font color='#34A853'> Neural Network - Backward Propagation and Gradient Descent </font>

> <font size="3">**STEP 2**</font>
>> <font size="3">**1. Calculate the <span style="color:#4285F4">gradient</span> w.r.t. the weights and bias**</font>
>> <font size="3"><img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic7.png?alt=media&token=71fe0f8d-a0c2-49f4-9175-581acc20f530" width="350" align="center"/></font>
>> <font size="3">**2. Set the <span style="color:#4285F4">learning rate</span>, <span style="color:#4285F4">𝜸</span>**</font> <br>
>> <font size="3">**3. Update the parameters: <span style="color:#4285F4">weights</span> and <span style="color:#4285F4">bias</span>**</font> <br>
>> <font size="3"><img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3Logistic8.png?alt=media&token=942b04f4-ed1d-4d19-bc7e-1e3850344bfe" width="350" align="center"/></font>

### <font color='Orange'> Neural Network - Full Training Algorithm </font>

<font size="3">**For loop (<span style="color:#4285F4">epochs</span>):**</font> <br>
> <font size="3">**1. Forward propagation – Calculate <span style="color:#4285F4">Loss</span>**</font> <br>
> <font size="3">**2. Backward propagation – Calculate <span style="color:#4285F4">Gradient</span>**</font> <br>
> <font size="3">**3. Gradient descent – Update <span style="color:#4285F4">parameters</span>**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN8.png?alt=media&token=6c532cd1-8a5c-4395-884f-812026603f15" width="850" align="center"/>
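
For a small network, the full training loop can be written out by hand. The sketch below is one possible NumPy version, assuming two sigmoid layers, a mean binary cross-entropy cost and gradients obtained with the chain rule; the layer sizes, random seed, learning rate and epoch count are all assumptions rather than values from the figures.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # AND-gate inputs
y = np.array([[0], [0], [0], [1]], dtype=float)
m = X.shape[0]

# Parameter initialization: small random weights, zero biases
W1, b1 = rng.normal(scale=0.5, size=(2, 2)), np.zeros(2)
W2, b2 = rng.normal(scale=0.5, size=(1, 2)), np.zeros(1)
learning_rate = 0.5            # assumed value

costs = []
for epoch in range(2000):
    # 1. Forward propagation
    A1 = sigmoid(X @ W1.T + b1)
    A2 = sigmoid(A1 @ W2.T + b2)
    costs.append(float(np.mean(-(y * np.log(A2) + (1 - y) * np.log(1 - A2)))))

    # 2. Backward propagation (chain rule, output layer first)
    dZ2 = (A2 - y) / m                     # shape (m, 1)
    dW2 = dZ2.T @ A1                       # shape (1, 2)
    db2 = dZ2.sum(axis=0)
    dZ1 = (dZ2 @ W2) * A1 * (1 - A1)       # shape (m, 2)
    dW1 = dZ1.T @ X                        # shape (2, 2)
    db1 = dZ1.sum(axis=0)

    # 3. Gradient descent
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

print(costs[0], costs[-1])     # the cost should fall over training
```

In practice Keras performs these steps automatically, which is what the next section relies on.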

<hr style="border:2px solid #34A853"> </hr>

## 3.5 Build your first Neural Network

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/1Keras.png?alt=media&token=9f4add09-14d3-49ed-bc11-f0497f6e96f1" width="200" height="200" align="right"/>


<font size="3">**Keras is a simple tool for constructing a neural network. It is a high-level API of TensorFlow 2:**</font> 

> <font size="3">**an approachable, highly-productive interface for solving machine learning problems, with a focus on modern deep learning.**</font>

<font size="3">**The core data structures of Keras are layers and models.**</font>

> <font size="3">**The simplest type of model is the <span style="color:#4285F4">Sequential model</span>, a linear stack of layers.**</font>

> <font size="3">**For more complex architectures, the Keras <span style="color:#4285F4">Functional API</span> should be used, which allows you to build arbitrary graphs of layers. Models can also be written entirely from scratch via subclassing.**</font> 

### <font color='Orange'>*Sequential model - When to use*</font>

<font size="3">**A Sequential model is appropriate for**</font> 
> <font size="3">**<span style="color:#4285F4">a plain stack of layers</span> where each layer has <span style="color:#4285F4">exactly one input tensor and one output tensor</span>.**</font> 

<font size="3">**A Sequential model is not appropriate when:**</font> 

> <font size="3">**Your model has <span style="color:#4285F4">multiple inputs or multiple outputs</span>**</font> <br>
> <font size="3">**Any of your layers has <span style="color:#4285F4">multiple inputs or multiple outputs</span>**</font> <br>
> <font size="3">**You need to do <span style="color:#4285F4">layer sharing</span>**</font><br>
> <font size="3">**You want <span style="color:#4285F4">non-linear topology</span> (e.g. a residual connection, a multi-branch model)**</font>

Reference: https://keras.io/guides/sequential_model/
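
For comparison, here is a minimal sketch of a model that a Sequential model cannot express: two inputs that are processed separately and then merged. It uses the Keras Functional API; the input sizes, layer widths and names such as `input_a` are made up for illustration.

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Concatenate, Dense

# Two separate inputs (assumed sizes: 3 and 2 features)
input_a = Input(shape=(3,), name="input_a")
input_b = Input(shape=(2,), name="input_b")

# Each input gets its own branch
branch_a = Dense(4, activation="sigmoid")(input_a)
branch_b = Dense(4, activation="sigmoid")(input_b)

# Merge the branches and produce a single probability
merged = Concatenate()([branch_a, branch_b])
output = Dense(1, activation="sigmoid")(merged)

model = Model(inputs=[input_a, input_b], outputs=output)
model.summary()
```

The summary lists both input layers, which makes the non-linear topology visible.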

### <font color='Orange'>*Sequential model - How to use*</font>

<font size="3">**You can create a <span style="color:#4285F4">Sequential model</span> by**</font> 
> <font size="3">**Passing a list of layers to a Sequential constructor**</font> 

> <font size="3">**Calling the <span style="background-color: #ECECEC; color:#0047bb">.add()</span> method to set up layers incrementally**</font> 

### <font color='#176BEF'> Examples </font>
<hr style="border:2px solid #E1F6FF"> </hr>

In [1]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

In [2]:
model = Sequential(
    [
        Dense(2, input_shape=(3,)),
        Dense(1, activation="sigmoid"),
    ]
)
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 2)                 8         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 3         
Total params: 11
Trainable params: 11
Non-trainable params: 0
_________________________________________________________________


In [3]:
model = Sequential()
model.add(Dense(2, input_shape=(3,)))
model.add(Dense(1, activation ='sigmoid'))
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 2)                 8         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 3         
Total params: 11
Trainable params: 11
Non-trainable params: 0
_________________________________________________________________


<hr style="border:2px solid #E1F6FF"> </hr>

### <font color='Orange'>*Output Shape*</font>
> <font size="3">**The first (leftmost) dimension of each output shape is the batch size, because the model expects its input in batches. It is shown as <span style="color:#4285F4">None</span> because it is not fixed in advance, so the model accepts any batch size.**</font><br>

> <font size="3">**The second dimension of the output shape simply equals the number of neurons in that layer.**</font><br>

### <font color='Orange'>*Parameters*</font>
> <font size="3">**<span style="color:#4285F4">Dense Layer</span>: Param = (Input Size + 1) x Number of Neurons**</font><br>

> <font size="3">**+1 is because of Biases <span style="color:#4285F4">b</span>**</font><br>
>> <font size="3">**First Dense Layer: Param = (3 + 1) x 2 = 8**</font><br>
>> <font size="3">**Second Dense Layer: Param = (2 + 1) x 1 = 3**</font>
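
The parameter formula can be checked against Keras directly. In this sketch `dense_params` is a hypothetical helper that implements the formula above, and the model is the same two-layer example used in this section.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def dense_params(input_size, neurons):
    """(Input Size + 1) x Number of Neurons; the +1 accounts for the bias."""
    return (input_size + 1) * neurons

model = Sequential()
model.add(Dense(2, input_shape=(3,)))
model.add(Dense(1, activation='sigmoid'))

expected = [dense_params(3, 2), dense_params(2, 1)]          # [8, 3]
actual = [layer.count_params() for layer in model.layers]    # counts reported by Keras
print(expected, actual)
```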

### <font color='Orange'>*Best Practice for Deep Learning*</font>
<font size="3">**When building a new Sequential model, it is useful to**</font> 

> <font size="3">**1. incrementally stack layers with <span style="background-color: #ECECEC; color:#0047bb">.add()</span>**</font> 

> <font size="3">**2. frequently print model summaries with <span style="background-color: #ECECEC; color:#0047bb">.summary()</span>**</font> 

<font size="3">**This lets you monitor how the stack of layers is connected, which is especially useful for deep network architectures.**</font> 

<hr style="border:2px solid #34A853"> </hr>

### <font color='#34A853'> 6 lines of code build a neural network  </font>

<font size="3">**1. Import <span style="color:#4285F4">Sequential model</span> from TensorFlow Keras**</font>

<font size="3">**2. Import <span style="color:#4285F4">Dense Layer</span> from TensorFlow Keras**</font>

<font size="3">**3. Create <span style="color:#4285F4">Sequential model</span> object**</font>

<font size="3">**4. Add <span style="color:#4285F4">1<sup>st</sup></span> layer with <span style="color:#4285F4">2</span> neurons and <span style="color:#4285F4">3</span> features as input**</font>

<font size="3">**5. Add <span style="color:#4285F4">2<sup>nd</sup></span> layer with <span style="color:#4285F4">1</span> neuron as output and <span style="color:#4285F4">sigmoid</span> as activation function**</font>

<font size="3">**6. Compile the neural network model with <span style="color:#4285F4">optimizer, loss function, evaluation metrics</span>**</font>

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(2, input_shape = (3,)))
model.add(Dense(1, activation = 'sigmoid'))
model.compile(optimizer = 'RMSprop', loss = 'binary_crossentropy', metrics = ['accuracy'])

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN9.png?alt=media&token=664be587-f0fe-43ec-8217-5ca7779ca0dd" width="350" align="center"/>

<hr style="border:2px solid #34A853"> </hr>

### <font color='Orange'> 3. Create Sequential model object </font>
> <font size="3">**Keras provides two main ways of constructing a neural network, i.e. the <span style="color:#4285F4">Sequential model</span> and the <span style="color:#4285F4">Functional API</span>.**</font>

> <font size="3">**In this step, an <span style="color:#4285F4">empty</span> Sequential model object is created.**</font>


### <font color='Orange'> 4. Add 1<sup>st</sup> layer with 2 neurons and 3 features as input </font>
### <font color='Orange'> 5. Add 2<sup>nd</sup> layer with 1 neuron as output and sigmoid as activation function </font>
> <font size="3">**Once an <span style="color:#4285F4">empty</span> Sequential model object is created, layers can be added via the <span style="background-color: #ECECEC; color:#0047bb">.add()</span> method.**</font>

> <font size="3">**Keras provides plenty of pre-built layers for different neural network architectures.**</font>
>> <font size="3">**Core layer:** The <span style="color:#4285F4">dense layer</span> is one of the core layers. It is a standard fully connected layer: every neuron receives every output of the previous layer.</font><br>
<br>
>> <font size="3">**Convolution layer:** This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs.</font><br>
<br>
>> <font size="3">**Embedding layer:** This layer turns positive integers (indexes) into dense vectors of fixed size. It is used as the first layer of a neural network model.</font><br>
<br>
>> <font size="3">**Merge layer:** These layers merge a list of inputs into a single tensor. Examples are Add, Subtract, Multiply, Average, Maximum, Minimum, etc.</font><br>
<br>
>> <font size="3">**Dropout layer:** Dropout is implemented by adding Dropout layers to the network architecture. During training, each Dropout layer randomly sets a fraction of its inputs to zero; the fraction is a user-defined hyperparameter.</font><br>
<br>
>> <font size="3">**Pooling layer:** This layer downsamples its input, e.g. by max pooling. It is commonly placed between the convolution layers of a CNN.</font><br>
<br>
>> <font size="3">**Noise layer:** This layer adds random noise to its input.</font><br>
<br>
>> <font size="3">**Normalization layer:** This layer transforms its input towards a standardized form, with a mean close to zero and a standard deviation close to one. Keras supports normalization via the BatchNormalization layer.</font><br>
<br>
>> <font size="3">**Recurrent layer:** These layers process sequences of inputs. Two commonly used parameters are return_state and return_sequences.</font><br>
<br>
>> <font size="3">**Locally-connected layer:** It works similarly to a convolution layer, except that it does not share weights.</font>

Reference: https://techvidvan.com/tutorials/keras-layers/

> <font size="3">**Dense layers take several parameters. Here are a few commonly used ones:**</font>
>> <font size="3">**1<sup>st</sup> parameter: <span style="color:#4285F4">Number of neurons</span>**</font><br>
<br>
>> <font size="3">**2<sup>nd</sup> parameter: <span style="color:#4285F4">Activation</span>**</font><br>
<br>
>> <font size="3">**3<sup>rd</sup> parameter (Only if it is a first layer): <span style="color:#4285F4">Input shape</span>**</font><br>
>>> <font size="3">**If the Dense layer is the first layer, the model needs to know what input shape to expect. For this reason, the first layer in a Sequential model needs an additional parameter describing its input shape.**</font><br>
<br>
>> <font size="3">**There are several possible ways to do this, e.g. passing an <span style="background-color: #ECECEC; color:#0047bb">input_dim</span> or <span style="background-color: #ECECEC; color:#0047bb">input_shape</span>.**</font>

### <font color='Orange'> 6. Compile the neural network model with optimizer, loss function, evaluation metrics </font>
> <font size="3">**Once the neural network architecture is set up in the Sequential model object, the model can be compiled with the <span style="background-color: #ECECEC; color:#0047bb">.compile()</span> method.**</font>

> <font size="3">**<span style="background-color: #ECECEC; color:#0047bb">.compile()</span> accepts several parameters. The most important ones are:**</font>

>> <font size="3">**1<sup>st</sup> parameter: <span style="color:#4285F4">Optimizer</span>**</font><br>
<br>
>> <font size="3">**2<sup>nd</sup> parameter: <span style="color:#4285F4">Loss function</span>**</font><br>
<br>
>> <font size="3">**3<sup>rd</sup> parameter: <span style="color:#4285F4">Metrics</span>**</font>

<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN10.png?alt=media&token=9223446e-8108-4082-b9c9-225018f9f54e" width="550" align="center"/>

<font size="3">**<span style="color:#4285F4">Loss Function</span>**</font>
> <font size="3">**Once the neural network architecture is set up in the Sequential model object, samples are <span style="color:#4285F4">forward propagated</span> and the corresponding estimates, $\hat{y}$, are calculated.**</font><br>

> <font size="3">**The <span style="color:#4285F4">loss function</span> is then applied to calculate the <span style="color:#4285F4">loss values</span> between the true values (i.e. labels, y) and the predicted values (i.e. estimates, $\hat{y}$).**</font>

<font size="3">**<span style="color:#4285F4">Optimizer</span>**</font>
> <font size="3">**Based on the <span style="color:#4285F4">loss values</span>, the <span style="color:#4285F4">optimizer backward propagates</span>, calculates the <span style="color:#4285F4">gradients</span> w.r.t. the weights, W, and bias, b, and uses them to update these parameters.**</font>

<font size="5"><span style="background-color:#EA4335; color:white">&nbsp;!&nbsp;</span></font> 
<font size="3">**Training stops when either:**</font>
> <font size="3">**The maximum number of epochs passed to the <span style="background-color: #ECECEC; color:#0047bb">.fit()</span> method is reached; OR**</font>

> <font size="3">**A quantity monitored by an <span style="background-color: #ECECEC; color:#0047bb">EarlyStopping</span> callback has stopped improving.**</font>
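
Both stopping conditions can be combined in one call to `.fit()`. The sketch below reuses the AND-gate model from section 3.2 and adds an `EarlyStopping` callback; the monitored quantity (`loss`) and the `patience` value are assumptions chosen for illustration.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")
y = np.array([0, 0, 0, 1], dtype="float32").reshape(4, 1)

model = Sequential()
model.add(Dense(4, input_shape=(X.shape[1],)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='RMSprop', loss='binary_crossentropy', metrics=['accuracy'])

# Stop once the training loss has not improved for 10 consecutive epochs
early_stop = EarlyStopping(monitor='loss', patience=10)

# epochs=200 is the upper limit; the callback may end training earlier
history = model.fit(X, y, epochs=200, callbacks=[early_stop], verbose=0)
epochs_run = len(history.history['loss'])
print('Epochs actually run:', epochs_run)
```

When a validation set is available, monitoring `val_loss` instead of `loss` is the more common choice.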

<font size="3">**<span style="color:#4285F4">Metrics</span>**</font>
> <font size="3">**A metric is an additional evaluation function that is used to judge the performance of the model**</font>

> <font size="3">**Metric functions are similar to loss functions, except that the results from evaluating a metric are not used when training the model. Therefore, any loss function can also be used as a metric**</font>

> <font size="3">**Metrics are needed because it can be difficult to judge model performance from loss values alone, such as mean squared error (MSE) and root mean squared error (RMSE). Therefore, an extra metric, such as accuracy or mean absolute error (MAE), is often used for additional evaluation.**</font>
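
The difference between a loss value and a metric can be illustrated with a few made-up predictions. In this sketch the labels and predicted probabilities are assumptions; the loss is binary cross-entropy, and accuracy and MAE are computed as extra metrics.

```python
import numpy as np

y_true = np.array([0, 0, 0, 1])
y_prob = np.array([0.1, 0.4, 0.6, 0.9])    # assumed predicted probabilities

# Loss: the quantity the optimizer minimizes during training
loss = np.mean(-(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)))

# Metrics: used only to judge the model, not to train it
y_pred = (y_prob > 0.5).astype(int)        # threshold at 0.5 gives [0, 0, 1, 1]
accuracy = np.mean(y_pred == y_true)       # 3 of 4 correct = 0.75
mae = np.mean(np.abs(y_true - y_prob))     # (0.1 + 0.4 + 0.6 + 0.1) / 4 = 0.3

print(loss, accuracy, mae)
```

The accuracy of 75% is easier to interpret at a glance than the raw loss value.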

<font size="5"><span style="background-color:#EA4335; color:white">&nbsp;!&nbsp;</span></font> 
<font size="3">**Combinations of output layer activation and loss functions**</font>
<br>
<img src="https://firebasestorage.googleapis.com/v0/b/deep-learning-crash-course.appspot.com/o/3NN11.png?alt=media&token=9d57d341-c9ad-4126-918e-526bde571a1b" width="950" align="left"/>