# DIY your first Neural Network


## [A bit of context](https://people.eecs.berkeley.edu/~russell/intro.html)


**Artificial Intelligence** (AI) is a branch of Computer Science that aims to develop **entities** (devices or machines) with intelligence similar to that of a human being. There is no official definition of what AI is, but most definitions draw on the four ideas listed below. You can read more about how hard it is to define human intelligence in the first chapter of ["Artificial Intelligence: A Modern Approach"](https://people.eecs.berkeley.edu/~russell/intro.html) by [Stuart Russell](https://en.wikipedia.org/wiki/Stuart_J._Russell) and [Peter Norvig](https://en.wikipedia.org/wiki/Peter_Norvig).

* Systems that think like humans
* Systems that think rationally
* Systems that act like humans
* Systems that act rationally


Taken together, these four ideas suggest that an AI agent can learn from experience and handle situations intelligently, turning them to its advantage.

**Machine Learning**, in turn, is a branch of AI. It allows systems to learn and improve from experience automatically, without being explicitly programmed. **Deep Learning** is a subfield of Machine Learning concerned with emulating the way human beings acquire certain types of knowledge. Deep Learning uses statistical models that are loosely inspired by information processing and communication patterns in biological nervous systems. Their building blocks are called 'neurons', ['cartoon' versions of a biological neuron](https://www.youtube.com/watch?v=vdqu6fvjc5c).

<img src='DYI-assets/ai-ml-dl-differences.png' width="600"/>
<center>Figure 1- AI, Machine Learning, Deep Learning. [Image Source](https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/)</center>


### Types of Machine Learning

Since we do not have ['The Master Algorithm'](https://www.amazon.com/Master-Algorithm-Ultimate-Learning-Machine/dp/1501299387) yet, we need different kinds of models and algorithms depending on the problem. Machine Learning therefore offers different families of algorithms for the different kinds of problems that human beings deal with every day, such as:

* Classification problems
* Association problems
* Probabilistic problems/Decision making


#### Supervised learning

This kind of algorithm is typically used for classification tasks. All the data is labelled, which makes it quite similar to the first lessons a child learns from his/her parents.

<img src="DYI-assets/supervised-learning.png" width="500" />
<center>Figure 2- Supervised Learning. [Image Source](http://ywanvanloon.com/the-basics-of-machine-learning-for-recruitment/)</center>
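
As a rough illustration of learning from labelled data, here is a minimal sketch in plain Python. The animal measurements and the nearest-centroid rule are assumptions made up for this example; they are not part of the activity itself.

```python
# Toy supervised learning: every training example comes with a label.
# The data and the nearest-centroid rule are illustrative assumptions.

def fit_centroids(samples, labels):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Hypothetical features: (weight in kg, height in cm).
train_samples = [(4.0, 25.0), (5.0, 23.0), (30.0, 60.0), (28.0, 55.0)]
train_labels = ["cat", "cat", "dog", "dog"]

centroids = fit_centroids(train_samples, train_labels)
print(predict(centroids, (4.5, 24.0)))   # closest to the 'cat' centroid
print(predict(centroids, (27.0, 58.0)))  # closest to the 'dog' centroid
```

The labels play the role of the parent: the model only learns what a 'cat' is because someone told it which examples are cats.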

#### Unsupervised learning

This kind of algorithm is used for association or clustering tasks. We do not always have access to labelled data: if you walk down the street, you will not see labels saying that this is a 'dog', a 'cat' or a 'horse'. However, if someone asks you what these animals have in common, the most likely answer is that they are 'animals', 'mammals' or 'four-legged animals'.

<img src="DYI-assets/unsupervised-learning.jpg" width="600" />
<center>Figure 3- Unsupervised Learning. [Image source](https://allagora.wordpress.com/2017/07/19/how-can-computers-learn-to-think-from-machine-learning-to-machine-thinking/)</center>
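
To make the idea of grouping unlabelled data more concrete, the sketch below runs a tiny one-dimensional k-means with two clusters. The weights, the starting centers and the choice of k-means itself are assumptions for illustration only.

```python
# Toy unsupervised learning: no labels, only raw measurements.
# The data and this tiny 1-D k-means (k = 2) are illustrative assumptions.

def k_means_1d(values, centers, iterations=10):
    """Alternate between assigning points and moving the centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical animal weights in kg; nobody told us which animal is which.
weights = [3.8, 4.2, 4.5, 27.0, 29.5, 31.0]
centers, clusters = k_means_1d(weights, centers=[0.0, 50.0])
print(clusters)
```

The algorithm never learns the words 'cat' or 'dog'; it only discovers that the measurements fall into two groups.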


#### [Reinforcement learning](https://devblogs.nvidia.com/deep-learning-nutshell-reinforcement-learning/)


This family of algorithms is related to decision making. An agent takes actions in an environment with the aim of maximizing its cumulative reward. An everyday example is learning to ride a bike for the first time. You probably learnt by trial and error, and from the many times you fell, how to move your feet, how to adapt your speed to the environment, and how to use the brakes.


<img src="DYI-assets/reinforcement-learning.png" width="400">
<center>Figure 4- Reinforcement Learning. [Image source](https://becominghuman.ai/the-very-basics-of-reinforcement-learning-154f28a79071?gi=f0be0f4c0213)</center>
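
The trial-and-error loop can be sketched with a very small agent. Everything here is a made-up assumption for illustration: the two actions, their fixed rewards and the epsilon-greedy strategy (mostly repeat the best action found so far, sometimes try a random one).

```python
import random

# Toy reinforcement learning: an agent learns by trial and error which
# action pays off. The two actions, their rewards and the epsilon-greedy
# strategy are illustrative assumptions.

REWARDS = {"brake_late": 0.0, "brake_early": 1.0}  # hypothetical environment

def run_agent(steps=1000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    actions = list(REWARDS)
    estimates = {action: 0.0 for action in actions}  # learned value of each action
    counts = {action: 0 for action in actions}
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)              # explore: try something at random
        else:
            action = max(actions, key=estimates.get)  # exploit: best action so far
        reward = REWARDS[action]
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, total_reward

estimates, total_reward = run_agent()
print(estimates, total_reward)
```

Nobody tells the agent which action is correct. It only observes rewards, and its estimates end up favouring the action that accumulates more of them.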



## Neural Nets vs Deep Neural Nets

As we said previously, Neural Nets or Neural Networks are statistical models loosely inspired by biological neural tissue. In Deep Learning we work with Neural Nets, and we can address the three kinds of problems already listed: classification problems, clustering problems and decision-making problems. But what does a Neural Network look like?

A biological neuron is usually depicted as in the following animation:

<img src="DYI-assets/neuron.gif" width="400">
<center>Figure 5- Information flow through a biological neuron</center>

A Neural Network might be seen as something like the following: 

<img src="DYI-assets/artificial_neuron.gif" width="400">
<center>Figure 6- Information flow through a neural network</center>


We say 'might' because there are many different neural network architectures, and the right one depends on the problem and on the performance we need. A neural network can become a very complex model, like the one shown below.


<img src="DYI-assets/neural-nets-deep-nets.png" width="600">
<center>Figure 7- Neural Nets vs Deep Neural Nets. [Image source](https://stats.stackexchange.com/questions/182734/what-is-the-difference-between-a-neural-network-and-a-deep-neural-network-and-w)</center>


### [Computation of the information](https://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec2.pdf)

The simplest representation of how information is processed in a Neural Network is the **Perceptron**.

<img src="DYI-assets/perceptron.png" width="600">
<center>Figure 8- A perceptron. Image credit: [Geoff Hinton](https://en.wikipedia.org/wiki/Geoffrey_Hinton)</center>

Perceptrons were the first generation of neural networks. A perceptron works in three steps:

1. Convert the raw input vector into a vector of feature activations. Use hand-written programs based on common-sense to define the features.

2. Learn how to weight each of the feature activations to get a single scalar quantity.

3. If this quantity is above some threshold, decide that the input vector is a positive example of the target class. 
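
The three steps above can be sketched in a few lines of plain Python. The feature function, the logical-AND data, the threshold at zero and the learning rate are assumptions chosen to keep the example small; the update used is the classic perceptron learning rule, which changes the weights only when a prediction is wrong.

```python
# A minimal perceptron sketch following the three steps above.
# The feature function, data and learning rate are illustrative assumptions.

def features(raw):
    """Step 1: hand-written features; here the two raw inputs plus a constant bias."""
    x1, x2 = raw
    return [x1, x2, 1.0]

def predict(weights, raw):
    """Steps 2 and 3: weighted sum of the features, then a threshold at zero."""
    total = sum(w * f for w, f in zip(weights, features(raw)))
    return 1 if total > 0 else 0

def train(samples, labels, epochs=20, learning_rate=1.0):
    """Perceptron learning rule: nudge the weights only on mistakes."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for raw, target in zip(samples, labels):
            error = target - predict(weights, raw)
            weights = [w + learning_rate * error * f
                       for w, f in zip(weights, features(raw))]
    return weights

# Logical AND: linearly separable, so the perceptron can learn it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights = train(samples, labels)
print([predict(weights, s) for s in samples])
```

Folding the threshold into a constant bias feature is a common trick: the weight learned for that feature takes the place of the threshold in step 3.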


## Goal of this activity

In this activity, you are going to build and train your first neural network to address a classification problem, a very typical problem in computer vision. Along the way, you might wonder how we are able to tell things apart. What makes us think that a chair is different from a table, or a banana from a hot dog? We all see patterns in things: linear patterns, textures and sometimes colours as well. However, how can we transfer this learning or knowledge to a machine?

<img src="DYI-assets/chihuahua-or-muffin.jpeg" width="300" />


### Ingredients

For this task you are going to use:

* Dataset: MNIST
* Frameworks: Keras and TensorFlow
* Environment: Conda + Jupyter Notebook
* Test: A custom image of your choice

### Environment

1. Conda, using the environment file provided.

2. Since these models are computationally expensive (they involve a very large number of multiplications and additions), it is really tough to train them on your laptop. Instead, we will use FloydHub, a Deep Learning service in the Cloud. You can run one job (one training) at a time for free.

> https://www.floydhub.com/signup

In [None]:
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K

In [None]:
# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'

In [None]:
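nb_train_samples = 2000       # number of training images
nb_validation_samples = 800   # number of validation images
epochs = 50                   # full passes over the training set
batch_size = 16               # images processed per weight update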

In [None]:
# The colour channels go first or last depending on the backend configuration.
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

In [None]:
# Three convolution + max-pooling blocks extract visual features.
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Flatten the feature maps and classify with fully connected layers.
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))           # randomly drop half of the units while training
model.add(Dense(1))
model.add(Activation('sigmoid'))  # one output between 0 and 1 for the two classes

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
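
To get a feel for what the layers above do to a 150x150 RGB image, the following plain-Python sketch traces the feature-map sizes and counts the trainable weights. It assumes the Keras defaults (`'valid'` padding and stride 1 for `Conv2D`, a stride equal to the pool size for `MaxPooling2D`); the helper functions are hypothetical and only used for this calculation.

```python
# Trace how the layers above shrink a 150x150 RGB image, and count weights.
# Assumes the Keras defaults: 'valid' padding and stride 1 for Conv2D,
# and a stride equal to the pool size for MaxPooling2D.

def conv_output(size, kernel=3):
    return size - kernel + 1      # 'valid' convolution, stride 1

def pool_output(size, pool=2):
    return size // pool           # non-overlapping pooling, remainder dropped

size, channels = 150, 3
total_params = 0
for filters in (32, 32, 64):
    total_params += (3 * 3 * channels + 1) * filters  # kernel weights + one bias per filter
    size = pool_output(conv_output(size))
    channels = filters
    print(f"after conv+pool with {filters} filters: {size}x{size}x{channels}")

flattened = size * size * channels
total_params += (flattened + 1) * 64   # Dense(64)
total_params += (64 + 1) * 1           # Dense(1)
print("flattened features:", flattened)
print("trainable parameters:", total_params)
```

Under these assumptions, almost all of the weights sit in the first `Dense` layer, which is one reason why training is heavy on multiplications and additions.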

In [None]:
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

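# Note: fit_generator is deprecated in recent Keras releases,
# where model.fit accepts generators directly.
model.fit_generator(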
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
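
The `steps_per_epoch` and `validation_steps` arguments are just integer divisions of the sample counts by the batch size. A quick check of the arithmetic, reusing the values defined earlier (the `total_updates` name is only introduced here for illustration):

```python
# How many batches make up one epoch, given the values defined earlier.
nb_train_samples = 2000
nb_validation_samples = 800
batch_size = 16
epochs = 50

steps_per_epoch = nb_train_samples // batch_size        # 125 training batches per epoch
validation_steps = nb_validation_samples // batch_size  # 50 validation batches per epoch
total_updates = steps_per_epoch * epochs                # 6250 weight updates in total

print(steps_per_epoch, validation_steps, total_updates)
```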

In [None]:
model.save_weights('first_try.h5')

Author and Speaker: Nohemy Veiga | email: nohe.veiga[at]gmail.com