# Basics of Deep Learning
Deep learning is a subset of machine learning.
This means that with deep learning we implement algorithms that learn from data and make predictions based on it.
(Disclaimer: this summary is based on the [www.deeplizard.com](https://deeplizard.com/learn/video/gZmobeGL0Yg) website.)

Deep learning itself can be split up into **supervised learning** and **unsupervised learning**:

* **Supervised learning** means that the algorithm learns from data that has already been classified (labeled).
* **Unsupervised learning** means that the algorithm tries to group similar inputs together, without being given labels, in order to predict the correct outcome.

## Load data

It is very important to divide your data into 3 parts:
1. **Training**: used to let the model learn from.
2. **Validation**: used during training to evaluate whether the training is effective.
3. **Testing**: after the training cycle is complete, we can use the testing set to see whether the model can be used in production.
    You need this separation because the validation set is part of the training process, and you **shouldn't** mix training
    data with validation or testing data.
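
A minimal sketch of such a three-way split, using only NumPy. The 70/15/15 ratio, the fixed seed, and the synthetic `features` and `labels` arrays are assumptions chosen for illustration; they are not taken from the source.

```python
import numpy as np

# Synthetic stand-in data (assumption): 100 samples, 8 features, binary labels.
rng = np.random.default_rng(seed=0)
features = rng.normal(size=(100, 8))
labels = rng.integers(0, 2, size=100)

# Shuffle the row indices once so the three sets cannot overlap.
indices = rng.permutation(len(features))
n_train = len(features) * 70 // 100  # 70 % for training (assumed ratio)
n_val = len(features) * 15 // 100    # 15 % for validation (assumed ratio)

train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]  # the remainder is the test set

X_train, y_train = features[train_idx], labels[train_idx]
X_val, y_val = features[val_idx], labels[val_idx]
X_test, y_test = features[test_idx], labels[test_idx]

print(X_train.shape, X_val.shape, X_test.shape)
```

Shuffling the indices rather than the arrays keeps each feature row paired with its label.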

In [11]:
import numpy as np
import pandas as pd

df_np = np.loadtxt('./Dataset/pima-indians-diabetes.csv', delimiter=',')
X = df_np[:, 0:8]
y = df_np[:, 8]


## Artificial Neural Networks
Artificial neural networks are loosely based on how human brains work.
This means they compute the output using a collection of connected neurons that are organized in layers.

Layers are generally divided into the **Input layer** (where the values are fed in), the **Hidden layers** (where the transformation happens),
and the **Output layer** (where a result is formed).
Additionally, there are different types of layers, which differ in how they are connected to each other. For example:
* **Dense layer**: each output is generated using every input to the layer
* **Convolutional layers**
* **Pooling layers**
* **Recurrent layers**
* **Normalization layers**

But more on them later.

We use Keras to quickly build neural networks that perform the discussed algorithms.
The **Sequential** model is Keras's implementation of an ANN (artificial neural network):

In [12]:
import tensorflow
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.optimizers import Adam


model = Sequential([
    Dense(12, input_shape=(8,), activation='relu'),
    Dense(8, activation="relu"),
    Dense(2, activation='softmax'),
])

# Define the training process: optimizer, loss function and reported metrics
model.compile(Adam(learning_rate=.0001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.summary()

This code describes a model that takes an input with 8 columns (`input_shape=(8,)`).
The activation function will be discussed later, but it decides how the values should be transformed to generate the output values
for the layer.
The dense layer was already discussed above.
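
As a rough check on what `model.summary()` reports, the number of trainable parameters of each dense layer can be worked out by hand. The sketch below assumes the layer sizes from the model above (8 inputs, then 12, 8 and 2 units) and one bias per unit, which is the Keras default for `Dense`; the helper name `dense_param_count` is hypothetical.

```python
# Layer sizes taken from the Sequential model above: 8 inputs -> 12 -> 8 -> 2.
layer_sizes = [8, 12, 8, 2]


def dense_param_count(n_inputs, n_units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return n_inputs * n_units + n_units


param_counts = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(param_counts)

print(param_counts, total_params)  # [108, 104, 18] 230
```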

A neural net generally functions like this:
Every node is connected to the nodes in the next layer. Because each connection has a different strength, the input values have
a different amount of impact on the next layer. Each node adds its weighted input values together and transforms the sum with an activation function.
After that, the output value is passed on to the nodes in the next layer.
While training the model, the "weights" (the strengths of the connections) are updated to generate better results.
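
The description above can be sketched for a single layer in NumPy. All numbers here are made up for illustration, and the bias term is an addition that the text does not mention (Keras `Dense` layers include one by default).

```python
import numpy as np

# Hypothetical values: 3 inputs feeding a layer with 2 nodes.
inputs = np.array([1.0, 2.0, 3.0])
weights = np.array([[0.5, -1.0],
                    [0.25, 0.5],
                    [-0.5, 1.0]])  # shape (3 inputs, 2 nodes)
biases = np.array([0.0, -1.0])    # assumption: one bias per node

# Each node sums its inputs, scaled by the strength of each connection.
weighted_sum = inputs @ weights + biases

# The activation function (ReLU here) transforms the sum into the node output.
layer_output = np.maximum(0.0, weighted_sum)

print(weighted_sum)   # [-0.5  2. ]
print(layer_output)   # [0. 2.]
```

The resulting `layer_output` would then serve as the input vector of the next layer.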

The compile method defines the parameters with which the training is performed.
First we define the **optimizer**, which determines how the weights are updated.
More on how they work [here](Optimization_Algorithms.ipynb).

Next we define the **loss function**, which calculates how large the error of the model is.
A typical loss function is "mean squared error (MSE)", which squares the errors and takes their mean.
The model above uses `sparse_categorical_crossentropy` instead, which suits integer class labels.
For more detail read [here](Loss_Functions.ipynb).
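
To make the MSE definition concrete, here is a small NumPy sketch. The function name and the example targets and outputs are hypothetical; this is not how Keras implements the loss internally.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Square each error, then average over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)


# Hypothetical targets and model outputs.
targets = [1.0, 0.0, 1.0, 1.0]
outputs = [0.5, 0.5, 1.0, 0.0]

# Errors are 0.5, -0.5, 0 and 1, so the squares are 0.25, 0.25, 0 and 1.
mse = mean_squared_error(targets, outputs)
print(mse)  # 0.375
```

Squaring makes every error positive and penalizes large errors more heavily than small ones.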

The `metrics` parameter only lets us add additional information that should be displayed while training, like the accuracy.

## Activation Function
The activation function remaps the input value a node receives.
For example:
* **Sigmoid function**: takes any value and transforms it into a value between 0 and 1, where strongly negative values end up close to 0 and strongly positive
    values end up close to 1
* **ReLU function**: returns the value itself as long as it is positive; otherwise it returns 0.
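
Both functions are short enough to write out directly. The following is a plain NumPy sketch for illustration, with made-up input values, not the Keras implementation.

```python
import numpy as np


def sigmoid(z):
    """Squash any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def relu(z):
    """Keep positive values unchanged and replace negative values with 0."""
    return np.maximum(0.0, z)


values = np.array([-2.0, 0.0, 2.0])
sigmoid_values = sigmoid(values)
relu_values = relu(values)

print(sigmoid_values)  # below 0.5 for -2, exactly 0.5 for 0, above 0.5 for 2
print(relu_values)     # [0. 0. 2.]
```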

## Train an AI model

Training a model actually means solving an optimisation problem: for which weight values is the error the lowest?
To answer this, we apply methods like Stochastic Gradient Descent (SGD) to minimize the loss function.
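
As a toy illustration of the idea, the sketch below uses SGD to fit a single weight so that `weight * x` matches `y`. The data, the learning rate and the number of epochs are assumptions chosen so that the example converges; a real network updates many weights at once through backpropagation.

```python
import numpy as np

# Hypothetical data generated with a "true" weight of 3.
xs = np.array([1.0, 2.0, 3.0, 4.0])
ys = 3.0 * xs

weight = 0.0          # initial guess
learning_rate = 0.01  # assumed step size
rng = np.random.default_rng(seed=0)

for epoch in range(200):
    # "Stochastic": visit the samples one at a time, in random order.
    for i in rng.permutation(len(xs)):
        prediction = weight * xs[i]
        # Derivative of the squared error (prediction - y)^2 w.r.t. the weight.
        gradient = 2.0 * (prediction - ys[i]) * xs[i]
        # Step against the gradient to reduce the error.
        weight -= learning_rate * gradient

print(weight)  # very close to 3.0
```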
Now we run the actual training:

In [13]:
model.fit(x=X, y=y, batch_size=10, epochs=150, shuffle=True, verbose=0)

<tensorflow.python.keras.callbacks.History at 0x2dc26711488>



## Evaluate Model

In [14]:
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))

Accuracy: 72.01


## Predictions

In [15]:
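# predict class probabilities and pick the most likely class for each row
# (predict_classes is no longer available in recent TensorFlow versions)
predictions = np.argmax(model.predict(X), axis=-1)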

# summarize the first 5 cases
for i in range(5):
    print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i], y[i]))

[6.0, 148.0, 72.0, 35.0, 0.0, 33.6, 0.627, 50.0] => 1 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.6, 0.351, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.3, 0.672, 32.0] => 1 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.1, 0.167, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.1, 2.288, 33.0] => 1 (expected 1)
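
Because the last layer uses softmax, the model produces one probability per class for each row, and the predicted class is the index of the largest probability. A minimal sketch with made-up probabilities (not actual model output):

```python
import numpy as np

# Hypothetical softmax outputs for three samples; each row sums to 1.
probabilities = np.array([[0.2, 0.8],
                          [0.9, 0.1],
                          [0.4, 0.6]])

# The class label is the column index with the highest probability.
predicted_classes = np.argmax(probabilities, axis=-1)

print(predicted_classes)  # [1 0 1]
```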
