# DIY your first Neural Network


## [A bit of context](https://people.eecs.berkeley.edu/~russell/intro.html)


**Artificial Intelligence** (AI) is a branch of Computer Science that studies how to build **entities** (devices or machines) with intelligence similar to that of a human being. There is no official definition of AI, but most definitions fall into one of the four approaches listed below. You can read more about how hard it is to define human intelligence in the first chapter of ["Artificial Intelligence: A Modern Approach"](https://people.eecs.berkeley.edu/~russell/intro.html) by [Peter Norvig](https://en.wikipedia.org/wiki/Peter_Norvig) and [Stuart Russell](https://en.wikipedia.org/wiki/Stuart_J._Russell).

* Systems that think like humans
* Systems that think rationally
* Systems that act like humans	
* Systems that act rationally


Taken together, these four ideas suggest that an AI agent can learn from experience and handle situations intelligently, making the most of each one.

**Machine Learning**, in turn, is a branch of AI. It allows systems to learn and improve automatically from experience without being explicitly programmed. **Deep Learning** is a subfield of Machine Learning concerned with emulating the way human beings acquire certain types of knowledge. Deep Learning uses statistical models that are loosely inspired by information processing and communication patterns in biological nervous systems. Their building blocks are called 'neurons', which are ['cartoon' versions of a biological neuron](https://www.youtube.com/watch?v=vdqu6fvjc5c).

<img src='DIY-assets/ai-ml-dl-differences.png' width="600"/>
<center>Figure 1- AI, Machine Learning, Deep Learning. [Image Source](https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/)</center>



## Types of Machine Learning

Since we do not have ['The Master Algorithm'](https://www.amazon.com/Master-Algorithm-Ultimate-Learning-Machine/dp/1501299387) yet, we need to use different kinds of models and algorithms depending on the problem. Different families of Machine Learning algorithms address the different kinds of problems that human beings deal with every day, such as:

* Classification problems
* Association problems
* Probabilistic problems/Decision making


### Supervised learning

These algorithms are typically used for classification tasks. All the training data is labelled, much like the first lessons a child receives from his or her parents.

<img src="DIY-assets/supervised-learning.png" width="500" />
<center>Figure 2- Supervised Learning. [Image Source](http://ywanvanloon.com/the-basics-of-machine-learning-for-recruitment/)</center>
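
To make the idea of learning from labelled data concrete, here is a minimal sketch of a 1-nearest-neighbour classifier. The measurements and labels are made up for illustration, and nearest-neighbour is just one simple supervised method, not the one used later in this activity.

```python
import numpy as np

# Hypothetical labelled training set: [height_cm, weight_kg] -> animal label.
features = np.array([[25.0, 4.0], [30.0, 5.0], [60.0, 30.0], [65.0, 35.0]])
labels = np.array(["cat", "cat", "dog", "dog"])

def predict_nearest(sample):
    """Return the label of the closest training example (1-nearest neighbour)."""
    distances = np.linalg.norm(features - sample, axis=1)
    return labels[np.argmin(distances)]

print(predict_nearest(np.array([28.0, 4.5])))   # closest to the 'cat' examples
print(predict_nearest(np.array([62.0, 33.0])))  # closest to the 'dog' examples
```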

### Unsupervised learning

These algorithms are used for association and clustering tasks. We do not always have access to labelled data: if you walk down the street, you will not see labels saying that this is a 'dog', a 'cat' or a 'horse'. However, if someone asks you what these animals have in common, the most likely answer is that they are 'animals', 'mammals' or 'four-legged animals'.

<img src="DIY-assets/unsupervised-learning.jpg" width="600" />
<center>Figure 3- Unsupervised Learning. [image source](https://allagora.wordpress.com/2017/07/19/how-can-computers-learn-to-think-from-machine-learning-to-machine-thinking/)</center>
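
As a rough illustration of grouping unlabelled data, the sketch below runs a few iterations of k-means on six made-up 2D points. The points, the number of clusters and the choice of initial centroids are all assumptions made for this example.

```python
import numpy as np

# Six unlabelled points that visibly form two groups (made-up data).
points = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
                   [8.0, 8.0], [9.0, 8.5], [8.5, 9.0]])

# Assumption: use the first and fourth point as initial centroids.
centroids = points[[0, 3]].copy()

for _ in range(10):
    # Assignment step: each point joins the cluster of its nearest centroid.
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    assignment = distances.argmin(axis=1)
    # Update step: each centroid moves to the mean of its assigned points.
    centroids = np.array([points[assignment == k].mean(axis=0) for k in range(2)])

print(assignment)  # cluster index found for each point
```

No labels were given: the two groups emerge only from the distances between points.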


### [Reinforcement learning](https://devblogs.nvidia.com/deep-learning-nutshell-reinforcement-learning/)


This family of algorithms is used for decision making. An agent takes actions in an environment with the aim of maximizing its cumulative reward. An everyday example is learning to ride a bike for the first time. You probably learnt by trial and error, and from the many times you fell, how to move your feet, how to adapt your speed to the environment, and how to use the brakes.


<img src="DIY-assets/reinforcement-learning.png" width="400"><center>Figure 4- Reinforcement Learning. [image source](https://becominghuman.ai/the-very-basics-of-reinforcement-learning-154f28a79071?gi=f0be0f4c0213)</center>
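
The trial-and-error loop can be sketched with a very small epsilon-greedy agent. Everything here is hypothetical: three possible actions, a fixed reward for each, and an exploration rate of 10%. Real reinforcement learning problems also have states and noisy rewards, which this sketch leaves out.

```python
import random

rng = random.Random(0)
true_rewards = [0.2, 0.5, 0.8]  # hypothetical fixed reward of each action
estimates = [0.0, 0.0, 0.0]     # the agent's current estimate per action
counts = [0, 0, 0]
epsilon = 0.1                   # fraction of steps spent exploring
total_reward = 0.0

for step in range(200):
    if step < 3:
        action = step                       # try every action once
    elif rng.random() < epsilon:
        action = rng.randrange(3)           # explore: pick a random action
    else:
        action = max(range(3), key=lambda a: estimates[a])  # exploit the best so far
    reward = true_rewards[action]
    counts[action] += 1
    # Running average of the rewards observed for this action.
    estimates[action] += (reward - estimates[action]) / counts[action]
    total_reward += reward

best_action = max(range(3), key=lambda a: estimates[a])
print(best_action, total_reward)
```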


## Neural Nets, a bit of history


<img src="DIY-assets/nn_timeline.jpg" />


### Deep Neural Nets

As we said previously, Neural Nets (or Neural Networks) are statistical models loosely inspired by biological neural tissue. In Deep Learning we work with Neural Nets, and we can address the three kinds of problems listed above: classification, clustering and decision making. But what does a Neural Network look like?

A biological neuron is usually depicted as in the following animation:

<img src="DIY-assets/neuron.gif" width="400">
<center>Figure 5- Information flow through a biological neuron</center>

A Neural Network might be seen as something like the following: 

<img src="DIY-assets/artificial_neuron.gif" width="400">
<center>Figure 6- Information flow through a neural network</center>


We say 'might' because there are many different neural network architectures, and the right one depends on the problem and on the performance we need. A neural network can turn into a very complex model, like the one shown below.


<img src="DIY-assets/neural-nets-deep-nets.png" width="600">
<center>Figure 7- Neural Nets vs Deep Neural Nets [Image source](https://stats.stackexchange.com/questions/182734/what-is-the-difference-between-a-neural-network-and-a-deep-neural-network-and-w)</center>



### [How the information is computed inside a NN](https://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec2.pdf)

#### How to visualize it

I encourage you to read **Rosie Campbell's** talk ['Demystifying Deep Neural Nets'](https://medium.com/manchester-futurists/demystifying-deep-neural-nets-efb726eae941). In her words: imagine you are trying to decide whether to go to a festival. You might consider things like:

* Will the weather be nice?
* What’s the music like?
* Do I have anyone to go with?
* Can I afford it?

<img src="DIY-assets/manchester-futurists_nn1.png" width="600">
<center>Figure 8- Inside a neuron [Image source](https://medium.com/manchester-futurists/demystifying-deep-neural-nets-efb726eae941)</center>

To make your decision, you would consider how important each factor is to you, weigh them all up, and see whether the result is over a certain threshold. If so, you go to the festival! It is a bit like the process we often use of weighing up pros and cons to make a decision.

<img src="DIY-assets/manchester-futurists_nn2.png" width="600">
<center>Figure 9- Inside a neuron. Weighting the information. [Image source](https://medium.com/manchester-futurists/demystifying-deep-neural-nets-efb726eae941)</center>
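
The festival decision can be written down in a few lines. The answers, the importance weights and the threshold below are invented for illustration; only the four questions come from the example.

```python
# Answers to the four questions: 1 = yes, 0 = no (made-up values).
answers = {"nice_weather": 1, "good_music": 1, "company": 0, "affordable": 1}

# Hypothetical importance of each factor to you.
weights = {"nice_weather": 0.3, "good_music": 0.4, "company": 0.2, "affordable": 0.6}

threshold = 1.0  # hypothetical: how convinced you need to be before going

# Weigh everything up and compare the result with the threshold.
score = sum(weights[name] * answers[name] for name in answers)
go_to_festival = score > threshold

print(round(score, 2), go_to_festival)
```

Changing a weight changes how much that factor counts. In a neural network these weights are not chosen by hand: they are learned from data.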

#### The perceptron

The simplest representation of how information is processed in a Neural Network is the **Perceptron**.

<img src="DIY-assets/perceptron.png" width="600">
<center>Figure 10- The perceptron. Image credit: [Geoff Hinton](https://en.wikipedia.org/wiki/Geoffrey_Hinton)</center>

Perceptrons were the first generation of neural networks. A perceptron works as follows:

1. Convert the raw input vector into a vector of **feature activations**. 

2. Learn how to weight each of the feature activations to get a single scalar quantity.

3. If this quantity is above some threshold, decide that the input vector is a positive example of the target class. 
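
The three steps above can be sketched with the classic perceptron learning rule. This is a minimal illustration under some assumptions: the raw inputs are used directly as feature activations, the target is the logical AND of two bits, the threshold is folded into a bias term, and the learning rate is 1.

```python
import numpy as np

# Step 1: feature activations (here simply the two raw input bits).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])  # target class: logical AND

weights = np.zeros(2)
bias = 0.0  # plays the role of a (negative) threshold

def predict(x):
    # Steps 2 and 3: weighted sum of the features, then threshold at zero.
    return int(np.dot(weights, x) + bias > 0)

# Learn the weights: change them only when a prediction is wrong.
for epoch in range(20):
    for x, target in zip(X, y):
        error = target - predict(x)  # -1, 0 or +1
        weights += error * x
        bias += error

print([predict(x) for x in X])
```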

<img src="DIY-assets/multiclassif_problem.png" width="600">
<center>Figure 9- Inside a neuron. Weighting the information. [Image source](https://medium.com/manchester-futurists/demystifying-deep-neural-nets-efb726eae941)</center>



### Training the network 

1. Randomly initialise the network weights and biases.
2. Gather training data.
3. Explore and transform your dataset to obtain the feature activations.
4. Feed it into the network.
5. **Check whether the network gets it right**.
6. If not, how wrong was it? Or, how right was it? (What probability or ‘confidence’ did it assign to its guess?)
7. Nudge the weights a little so that the network becomes more likely to get the answer right, and more confident about it.
8. Repeat
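
The loop above can be sketched end to end with a single sigmoid neuron trained by gradient descent. The toy data, the learning rate and the number of steps are assumptions chosen for illustration; the Keras model built later in this activity hides these details behind `model.fit`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data (made up): label is 1 when the two features sum to a positive number.
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float)

# Random initial weights, zero bias.
weights = rng.normal(scale=0.1, size=2)
bias = 0.0
learning_rate = 0.5

def forward(X):
    """Feed the data through the neuron: weighted sum, then sigmoid."""
    return 1.0 / (1.0 + np.exp(-(X @ weights + bias)))

def loss(p, y):
    """How wrong the predictions are (binary cross-entropy)."""
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

initial_loss = loss(forward(X), y)
for step in range(200):
    p = forward(X)
    # Nudge the weights a little in the direction that reduces the loss.
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    weights -= learning_rate * grad_w
    bias -= learning_rate * grad_b

final_loss = loss(forward(X), y)
accuracy = np.mean((forward(X) > 0.5) == y)
print(initial_loss, final_loss, accuracy)
```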

#### How to know how well the Network is learning? Feed Forward and Loss function

The loss function (also known as the error function or cost function) measures how far the network's predictions are from the expected outputs. The loss function you choose depends on your data and the nature of the problem you are facing. The goal of training is to find the weights and biases that minimise the loss function. To visualise this, we can plot the loss against the weights.

<img src="DIY-assets/first_time_progammer.png" width="600">
<center>Figure 11- Feed Forward, Weights vs Loss [Image source](https://medium.com/manchester-futurists/demystifying-deep-neural-nets-efb726eae941)</center>
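
As one concrete example of a loss function, the sketch below computes binary cross-entropy, the loss this notebook later passes to Keras as `loss='binary_crossentropy'`. The labels and predicted probabilities are made up to show how the loss reacts to confident, unsure and wrong predictions.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob):
    """Mean binary cross-entropy between labels (0/1) and predicted probabilities."""
    y_prob = np.clip(y_prob, 1e-7, 1 - 1e-7)  # avoid log(0)
    return float(-np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)))

y_true = np.array([1.0, 0.0, 1.0, 0.0])

confident_right = np.array([0.9, 0.1, 0.8, 0.2])
unsure = np.array([0.5, 0.5, 0.5, 0.5])
confident_wrong = np.array([0.1, 0.9, 0.2, 0.8])

for name, probs in [("confident and right", confident_right),
                    ("unsure", unsure),
                    ("confident and wrong", confident_wrong)]:
    print(name, binary_cross_entropy(y_true, probs))
```

The loss is smallest when the network is confident and right, and largest when it is confident and wrong.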

#### Backpropagation and weight updates

We start the process at the output layer and work towards the input layer, propagating the changes backwards through the network. At each layer we calculate the gradient mathematically, by taking the partial derivative of the loss with respect to the weights, and use it to update those weights.
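
To see the chain rule at work, here is a small sketch of backpropagation through a hypothetical network with 3 inputs, 4 tanh hidden units and 1 sigmoid output, for a single example. The sizes, activations and random values are assumptions, and biases are left out to keep it short. The last lines compare one backpropagated gradient with a numerical estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=3)                  # one input example with 3 features
y = 1.0                                 # its label
W1 = rng.normal(scale=0.5, size=(4, 3)) # hidden layer weights (4 units)
W2 = rng.normal(scale=0.5, size=4)      # output layer weights (1 unit)

def loss_fn(W1, W2):
    """Forward pass followed by the binary cross-entropy loss."""
    h = np.tanh(W1 @ x)
    p = 1.0 / (1.0 + np.exp(-(W2 @ h)))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Forward pass, keeping the intermediate values.
h = np.tanh(W1 @ x)
p = 1.0 / (1.0 + np.exp(-(W2 @ h)))

# Backward pass: start at the output and work towards the input.
d_z2 = p - y                 # dLoss/d(output pre-activation) for sigmoid + cross-entropy
grad_W2 = d_z2 * h           # dLoss/dW2
d_h = d_z2 * W2              # dLoss/d(hidden activations)
d_z1 = d_h * (1 - h ** 2)    # through tanh: derivative is 1 - tanh^2
grad_W1 = np.outer(d_z1, x)  # dLoss/dW1

# Numerical check of one entry using a central difference.
eps = 1e-5
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
numeric = (loss_fn(W1_plus, W2) - loss_fn(W1_minus, W2)) / (2 * eps)
print(grad_W1[0, 0], numeric)
```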

## Goal of this activity

In this activity you are going to build and train your first neural network to address a classification problem. When dealing with a classification problem, we need to answer the following questions in order to know what kind of model and which architecture we need.

* Are we dealing with continuous data?
* Are we dealing with categorical data?

A classification problem is about finding patterns in data. This is a very typical problem in computer vision. In this activity you might wonder how we are able to differentiate things. What makes us think that a chair is different from a table, or a banana from a hot dog? We all see patterns in things: lines, fur, textures, and sometimes we use colors to differentiate things as well.

<img src="DIY-assets/chihuahua-or-muffin.jpeg" width="300" />


### Ingredients

For this task you are going to use:

* Dataset: [Pima Indians Diabetes from Kaggle](https://www.kaggle.com/uciml/pima-indians-diabetes-database) (already provided) 
* Frameworks: Keras and TensorFlow
* Environment: Conda + Jupyter Notebook (already provided) 


### Environment

1. Conda + environment file provided. 

2. Since these models are computationally expensive (they involve a large number of multiplications and additions), large models are really tough to run on your laptop. If you have a gaming workstation, you might be able to run a moderately complex model. Alternatively, if you want to experiment, you can use FloydHub, a Deep Learning service in the cloud. You can run one job (one training) at a time for free. They also provide good tutorials for image classification problems.

> https://www.floydhub.com/signup


### Steps

The steps you are going to cover in this tutorial are as follows:

1. Load Data.
2. Define Model.
3. Compile Model.
4. Fit Model.
5. Evaluate Model.
6. Tie It All Together.

##### Check if TF is already installed and working

In [1]:
import tensorflow as tf
print ("TensorFlow version: " + tf.__version__)

TensorFlow version: 1.0.0


##### Import necessary modules

In [2]:
from keras import applications
from keras import backend as k 

Using TensorFlow backend.


In [3]:
# Create your first MLP in Keras
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import pandas as pd

Using TensorFlow backend.


##### Preprocess your data: Feature extraction, relevant information

In [5]:
# fix random seed for reproducibility
np.random.seed(7)

# load pima indians dataset

dataset = pd.read_csv("data/pima-indians-diabetes-no-preprocessed.csv")
dataset.head()

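| Unnamed: 0 | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |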


In [9]:
# load pima indians dataset
dataset = np.loadtxt("data/pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

print(X[:10])
print(Y[:10]) # binary data

[[6.000e+00 1.480e+02 7.200e+01 3.500e+01 0.000e+00 3.360e+01 6.270e-01
  5.000e+01]
 [1.000e+00 8.500e+01 6.600e+01 2.900e+01 0.000e+00 2.660e+01 3.510e-01
  3.100e+01]
 [8.000e+00 1.830e+02 6.400e+01 0.000e+00 0.000e+00 2.330e+01 6.720e-01
  3.200e+01]
 [1.000e+00 8.900e+01 6.600e+01 2.300e+01 9.400e+01 2.810e+01 1.670e-01
  2.100e+01]
 [0.000e+00 1.370e+02 4.000e+01 3.500e+01 1.680e+02 4.310e+01 2.288e+00
  3.300e+01]
 [5.000e+00 1.160e+02 7.400e+01 0.000e+00 0.000e+00 2.560e+01 2.010e-01
  3.000e+01]
 [3.000e+00 7.800e+01 5.000e+01 3.200e+01 8.800e+01 3.100e+01 2.480e-01
  2.600e+01]
 [1.000e+01 1.150e+02 0.000e+00 0.000e+00 0.000e+00 3.530e+01 1.340e-01
  2.900e+01]
 [2.000e+00 1.970e+02 7.000e+01 4.500e+01 5.430e+02 3.050e+01 1.580e-01
  5.300e+01]
 [8.000e+00 1.250e+02 9.600e+01 0.000e+00 0.000e+00 0.000e+00 2.320e-01
  5.400e+01]]
[1. 0. 1. 0. 1. 0. 1. 0. 1. 1.]


#### Define the model

In [10]:
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 12)                108       
_________________________________________________________________
dense_2 (Dense)              (None, 8)                 104       
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 9         
Total params: 221.0
Trainable params: 221
Non-trainable params: 0.0
_________________________________________________________________
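
The parameter counts in the summary above can be checked by hand: each `Dense` layer has one weight per input for every unit, plus one bias per unit. A short sketch of that arithmetic, using the layer sizes from the model definition:

```python
# Layer sizes of the model above: 8 inputs -> 12 units -> 8 units -> 1 unit.
layer_sizes = [8, 12, 8, 1]

# Each unit has one weight per input plus one bias: (inputs + 1) * units.
params_per_layer = [(n_in + 1) * n_out
                    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

print(params_per_layer, sum(params_per_layer))  # [108, 104, 9] 221
```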


#### Compile the model

In [11]:
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

#### Fit the model

In [12]:
# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)

Epoch 1/150
Epoch 2/150
Epoch 3/150
...

<keras.callbacks.History at 0x1216504e0>
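
To get a feeling for what `epochs=150` and `batch_size=10` mean in terms of weight updates, here is a rough calculation. It assumes the 768 rows reported in the evaluation progress bar, and that the last, smaller batch of each epoch still triggers an update.

```python
import math

n_samples = 768   # rows in the dataset, as reported by the evaluation output
batch_size = 10
epochs = 150

# One weight update per batch; the last batch of an epoch holds the leftover rows.
updates_per_epoch = math.ceil(n_samples / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, total_updates)  # 77 11550
```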

In [13]:
# evaluate the model (on the same data it was trained on, so this is training accuracy)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

 32/768 [>.............................] - ETA: 0s
acc: 78.26%
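
The accuracy above is measured on the same `X` and `Y` used for training, so it can be optimistic. A common remedy is to hold out part of the data for evaluation. The helper below is a hypothetical sketch (its name, the 20% test fraction and the seed are assumptions), shown on stand-in arrays with the same shape as the notebook's data.

```python
import numpy as np

def train_test_split_arrays(X, Y, test_fraction=0.2, seed=7):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], Y[train_idx], Y[test_idx]

# Stand-in arrays shaped like the notebook's X (768 x 8) and Y (768,).
X_demo = np.arange(768 * 8, dtype=float).reshape(768, 8)
Y_demo = np.arange(768) % 2

X_train, X_test, Y_train, Y_test = train_test_split_arrays(X_demo, Y_demo)
print(X_train.shape, X_test.shape)  # (615, 8) (153, 8)
```

With the real data you would then call `model.fit(X_train, Y_train, ...)` and `model.evaluate(X_test, Y_test)`.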


## References

1. https://people.eecs.berkeley.edu/~russell/intro.html
2. https://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec2.pdf
3. https://www.youtube.com/watch?v=LxfUGhug-iQ&list=PLkt2uSq6rBVctENoVBg1TpCC7OQi31AlC&index=7
4. https://www.learnopencv.com/keras-tutorial-transfer-learning-using-pre-trained-models/
5. https://medium.com/@margaretmz/anaconda-jupyter-notebook-tensorflow-and-keras-b91f381405f8
6. https://medium.com/@14prakash/transfer-learning-using-keras-d804b2e04ef8
7. https://medium.com/@jayeshbahire/the-xor-problem-in-neural-networks-50006411840b
8. https://medium.com/manchester-futurists/demystifying-deep-neural-nets-efb726eae941

Author and Speaker: Nohemy Veiga | email: nohe.veiga[at]gmail.com