# Keras/TensorFlow

Keras is an API. 

Remember: an API (Application Programming Interface) is ... intermediary software that facilitates communication between 2 other pieces of software. ([Course material 6.1.1.](https://krspiced.pythonanywhere.com/chapters/project_pipeline/api/README.html?highlight=api#what-is-an-api))

Keras "connects" Python with TensorFlow.  

TensorFlow is the platform we use to work with neural networks. It represents computations as dataflow graphs ([think about back and forward propagation](https://colah.github.io/posts/2015-08-Backprop/)). For performance, its core is written in C++ and CUDA (a parallel computing platform for GPUs).

Keras (https://keras.io/about/) is a deep learning API written in Python that can run on several backends (TensorFlow, Microsoft Cognitive Toolkit, Theano). Since TensorFlow 1.4 it has been part of TensorFlow. It was developed with a focus on enabling fast experimentation.



### What is a tensor?
etymological origin: tension => stress/strain (in 3D structures)

![image](figures/ranks.jpg)

A tensor is an n-dimensional array of values. In a neural network, tensors hold the inputs, the weights of the different layers, the outputs of the activation functions, etc.

In [3]:
# what I needed to do/install after "conda create -n tensor":

# conda activate tensor
# sudo apt install python3-pip
# pip install --upgrade pip
# pip install tensorflow
# pip install opencv-python
# pip install jupyter
# pip install scikit-learn
# pip install matplotlib
# pip install pandas

# (my versions in 'tensor'-env: python: 3.8.10, tensorflow/keras: 2.7.0)

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

import sklearn as sk
import tensorflow as tf

print(f"Tensor Flow Version: {tf.__version__}")
print(f"Keras Version: {tf.keras.__version__}")
gpu = len(tf.config.list_physical_devices('GPU'))>0
print("GPU is", "available" if gpu else "NOT AVAILABLE")

from IPython import display

Tensor Flow Version: 2.6.2
Keras Version: 2.6.0
GPU is NOT AVAILABLE


In [4]:

scalar = np.array(6)

vector = np.array([7, 2, 9, 10])

matrix = np.array([[5.2, 3.0, 4.5],[9.1, 0.1, 0.3]])

tensor3 = np.array([[[1,4,7], [2,9,7], [1,3,0], [9,6,9]], [[2,3,4], [4,3,5], [7,7,2], [3, 9, 8]]])

tensor7 = np.random.rand(3,2,4,7,1,2,9)

print(f'\'scalar\' has {scalar.ndim} dimensions and a shape of {scalar.shape}.')
print(f'\'vector\' has {vector.ndim} dimension and a shape of {vector.shape}.')
print(f'\'matrix\' has {matrix.ndim} dimensions and a shape of {matrix.shape}.')
print(f'\'tensor3\' has {tensor3.ndim} dimensions and a shape of {tensor3.shape}.')
print(f'\'tensor7\' has {tensor7.ndim} dimensions and a shape of {tensor7.shape}.')

'scalar' has 0 dimensions and a shape of ().
'vector' has 1 dimension and a shape of (4,).
'matrix' has 2 dimensions and a shape of (2, 3).
'tensor3' has 3 dimensions and a shape of (2, 4, 3).
'tensor7' has 7 dimensions and a shape of (3, 2, 4, 7, 1, 2, 9).


If your notebook or its dependencies cause problems, you can try [Google Colab](https://colab.research.google.com) (login required). Many students use it for their final project. Be aware of the time limits (and save intermediate results).

[tensorflow playground](https://playground.tensorflow.org)

[activation functions](https://himanshuxd.medium.com/activation-functions-sigmoid-relu-leaky-relu-and-softmax-basics-for-neural-networks-and-deep-8d9c70eed91e)

## "Challenge"
Assume you have this labeled data (color codes according to the two half moons). 
![moon problem](figures/moons.png)
Task: Predict whether an arbitrary new point belongs to the upper or to the lower half moon.  
What information do we have as input?

# Solution with TensorFlow

In [None]:
from sklearn.datasets import make_moons

X, y = make_moons()

In [None]:
#X

In [None]:
#y

In [None]:
# scatter plot
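
One possible way to fill in the cells above, as a minimal sketch: it prints the shape of `X` and a few labels and draws the scatter plot. The `noise` and `random_state` values are assumptions added here so that the picture is reproducible; the call above uses the defaults.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons

# Assumption: a little noise and a fixed seed, so the plot is reproducible.
X, y = make_moons(n_samples=100, noise=0.1, random_state=42)

print(X.shape)  # one row per point, two input features per row
print(y[:10])   # labels are 0 or 1

fig, ax = plt.subplots()
# Colour each point by its label.
ax.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm")
ax.set_xlabel("X[:, 0]")
ax.set_ylabel("X[:, 1]")
ax.set_title("make_moons data, coloured by label")
```

The two columns of `X` are the coordinates of a point; they are the only input the network gets.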

## Recipe for Building an Artificial Neural Network 

1. **configure a model, give**:
    + architecture
    + number of neurons
    + layers
    + type of activation functions
    
2. **compile the model, give**:
    + optimizers (algorithm that finds the minimum of the loss function)
    + loss function (the loss function to be optimized; we choose the loss function depending on the problem we are solving)
    + metrics (metrics to be tracked over training)
    
3. **fit the model, give**:
    + epochs (number of complete passes over the training data)
    + batch size (the data is fed in batches; not all data at once)
    + validation split (fraction of the data that is held out as a validation set)
    
    
4. Evaluate  

5. Make predictions  
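
The five steps can be sketched end to end as follows. This is only an illustration under assumptions, not the solution developed in class: the layer sizes, activations, loss, number of epochs, batch size and validation fraction are all example choices.

```python
import numpy as np
import tensorflow as tf
from sklearn.datasets import make_moons

# Assumed data settings (noise and seed are illustrative choices).
X, y = make_moons(n_samples=200, noise=0.1, random_state=42)

# 1. Configure: 2 inputs -> 4 hidden neurons -> 1 output neuron.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# 2. Compile: optimizer, loss function and metrics.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# 3. Fit: epochs, batch size and validation fraction.
history = model.fit(X, y, epochs=20, batch_size=16, validation_split=0.2, verbose=0)

# 4. Evaluate: loss and accuracy (on the same data here, for brevity).
loss, accuracy = model.evaluate(X, y, verbose=0)
print(f"loss: {loss:.3f}, accuracy: {accuracy:.3f}")

# 5. Predict: the sigmoid output is a value between 0 and 1.
new_point = np.array([[0.5, -0.5]])
proba = model.predict(new_point, verbose=0)
print(proba)
```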

## 1. Simple configuration

Individual dense layers are added one by one. Different types of layers are described [here](https://towardsdatascience.com/four-common-types-of-neural-network-layers-c0d3bb2a966c).

In [None]:
from tensorflow.keras import backend as K

K.clear_session()
    
# run this as soon as you want to restart creating a model!

model = tf.keras.models.Sequential() 

# in case your system is not configured to support GPUs, you might get a warning

**Let's define the first layer**  
- units: number of neurons

- input_dim: number of input features, not counting the bias (in our example 2: X[:,0] and X[:,1])

In [None]:
#model.add


In [None]:
model.summary()

**Second layer**

In [None]:
model.summary()

**Where did 12 parameters in the first layer come from? Think about the architecture.**

In the first layer, each of the 4 neurons has 2 weights (one per input) and 1 bias of its own: 4 × (2 + 1) = 12 parameters.

In the second layer, the single neuron has 4 weights (one per output of the 1st layer) and 1 bias: 4 + 1 = 5 parameters.
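
A short sketch to check this counting with `model.add`. The sizes (4 neurons, then 1 neuron) follow from the parameter counts discussed here; the activation functions are assumptions chosen for illustration.

```python
import tensorflow as tf

model = tf.keras.models.Sequential()

# First layer: 4 neurons, each with 2 input weights and 1 bias.
# The activation functions are assumptions for illustration.
model.add(tf.keras.layers.Dense(units=4, input_dim=2, activation="relu"))

# Second layer: 1 neuron with 4 weights (one per hidden neuron) and 1 bias.
model.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))

# Number of trainable parameters per layer.
for layer in model.layers:
    print(layer.name, layer.count_params())

first, second = (layer.count_params() for layer in model.layers)
```

Recent Keras versions may warn that `input_dim` should be replaced by an explicit `Input` layer; the parameter counts stay the same.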

In [None]:
display.Image("figures/network.jpg")

## 2. Compilation

This is where Keras actually communicates with TensorFlow
and creates what's called a 'computation graph'. Keras compiles
our model into an abstract form that is executed by TensorFlow's C++ backend.

One caveat about compile: Keras keeps global state from every model you build in a session, so building and compiling models repeatedly in the same session can lead to confusing results (e.g. ever-increasing layer names in `model.summary()`).


Running Keras in Jupyter is fine, but **remember**:

    from tensorflow.keras import backend as K
    K.clear_session()

You should do this every time you use Keras to build a new model, because it clears the state left over from the previously compiled model.

     

In [None]:
K.clear_session()
model.compile(
              , # the algorithm used to optimize the weights
              , # how the loss is quantified (real values)
              , # how good the model performs (not used by opt. algorithm)
             )

 
Some nice [explanation](https://towardsdatascience.com/a-look-at-gradient-descent-and-rmsprop-optimizers-f77d483ef08b) of optimizers like [Adam](https://keras.io/api/optimizers/adam/)
    
Adam optimises the network using a variant of stochastic gradient descent that adapts the step size for each parameter. The documentation mentions that it is well suited for problems that are large in terms of data and/or parameters.

**Reminder: Stochastic gradient descent tries to minimise the loss function by following its gradient, which is estimated on randomly chosen mini-batches.**
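
To make the reminder concrete, here is a minimal NumPy sketch of plain mini-batch stochastic gradient descent on a toy linear regression. It is not Adam and not what Keras runs internally; the data, learning rate and batch size are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 3*x + 1 plus a little noise.
x = rng.uniform(-1, 1, size=200)
y = 3.0 * x + 1.0 + rng.normal(0, 0.05, size=200)

w, b = 0.0, 0.0          # parameters to learn
learning_rate = 0.1
batch_size = 16

for epoch in range(50):
    order = rng.permutation(len(x))              # shuffle once per epoch
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]    # one random mini-batch
        error = (w * x[idx] + b) - y[idx]
        # Gradients of the mean squared error on this mini-batch only.
        grad_w = 2.0 * np.mean(error * x[idx])
        grad_b = 2.0 * np.mean(error)
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print(f"w = {w:.2f}, b = {b:.2f}")
```

Each update only sees 16 samples, so the individual steps are noisy, but the parameters still end up close to the values used to generate the data.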

## 3. Fit the model to training data
 

Batch size: Number of samples per gradient update.

hint: [Machine Learning Glossary](https://developers.google.com/machine-learning/glossary) (in case you don't remember the meaning of a certain ML term)

In [None]:
history = model.fit(

                 # verbose = False,
                )

What do the parameters mean?
- X: input values
- y: output labels/values (classification/regression)
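- epochs: number of complete passes over the training data (each pass consists of many forward-backward propagation steps)
- batch_size: number of samples used per gradient update (using a subset instead of all samples reduces the computational effort => stochastic gradient)
- validation_split: fraction of the data that is held out for validation and not used for training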
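
How these parameters interact can be worked out with a little arithmetic. The sketch below uses example values (they are not prescribed by this notebook) and counts how many gradient updates a training run performs.

```python
import math

n_samples = 100          # default number of points from make_moons()
validation_split = 0.2   # example value
batch_size = 16          # example value
epochs = 50              # example value

# Keras holds out the last fraction of the samples for validation.
n_train = math.floor(n_samples * (1.0 - validation_split))
n_val = n_samples - n_train

# One gradient update per batch; the last batch may be smaller.
updates_per_epoch = math.ceil(n_train / batch_size)
total_updates = updates_per_epoch * epochs

print(n_train, n_val, updates_per_epoch, total_updates)
# → 80 20 5 250
```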

## 4. Evaluation

In [None]:
losses_accurs = pd.DataFrame(history.history)

losses_accurs[['loss', 'val_loss']].plot()
plt.title('Training and Validation Loss')
plt.xlabel('epochs')
plt.show()

losses_accurs[['accuracy', 'val_accuracy']].plot()
plt.title('Training and Validation Accuracy')
plt.xlabel('epochs')
plt.show()



**You can also get single elements through history.history.**

In [None]:
pd.DataFrame(history.history)

## 5. Predictions
(Solution to our challenge)  
Let's assume we want to know whether the point (0.5, -0.5) belongs to the upper or the lower half moon.  
We can answer this classification question with the model we just created.  
According to our X and y (see above), zeros belong to the upper half moon, ones to the lower.

In [None]:
model.predict(np.array([[0.5, -0.5]]))
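
If the model ends in a single sigmoid neuron (an assumption; the output layer is not fixed in this notebook), `predict` returns a value between 0 and 1 for every point rather than a class label. A minimal sketch of turning such values into labels, using made-up numbers instead of real model output:

```python
import numpy as np

# Made-up sigmoid outputs for three points (NOT results from a trained model).
probabilities = np.array([[0.08], [0.51], [0.97]])

# Values above 0.5 are assigned to class 1 (lower moon), the rest to class 0.
labels = (probabilities > 0.5).astype(int).ravel()
names = np.where(labels == 1, "lower", "upper")

print(labels)
print(names)
```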


## Alternative model configuration with keras.layers


Many more layer types exist than the ones we use here. 

Check out https://www.tensorflow.org/api_docs/python/tf/keras/layers



In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, LeakyReLU

In [None]:
K.clear_session()

model = Sequential([
                 # in input_shape, the trailing ',' is necessary when there is only one dimension, e.g. (2,)
    
])
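
One way the list could be filled in, as a sketch under assumptions: it reuses the 2 -> 4 -> 1 architecture from above and writes the activations as separate layers. The choice of `LeakyReLU` and `sigmoid` is illustrative only.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, LeakyReLU

# Assumed architecture: the same 2 -> 4 -> 1 network as before, with the
# activations written as separate layers.
model = Sequential([
    Dense(4, input_shape=(2,)),   # (2,) is a one-element tuple; (2) would be the int 2
    LeakyReLU(),                  # default slope for negative inputs
    Dense(1),
    Activation("sigmoid"),
])

# Activation layers have no weights, so the total matches the earlier model.
print(model.count_params())
# → 17
```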

In [None]:
model.summary()

In [None]:
X.shape

## How to save a model for later use

In [None]:
from tensorflow.keras.models import load_model

model.save("model_moons.h5")
# here you can stop your notebook
moons_model = load_model("model_moons.h5")
moons_model.summary()

## Summary

- We spoke about what a tensor is (n-dimensional array).  
- We tried out several network parameters (activation functions, # layers/nodes, ...) in TensorFlow Playground.  
- We got to know the Keras API and used it to create a neural network model.  
- We solved the challenge of the two half moon problem.
- We saved and loaded the model for later use.


## References
+ keras models api: https://keras.io/api/models/
+ keras layers api: https://keras.io/api/layers/
+ keras optimizer api: https://keras.io/api/optimizers/
+ keras metrics api: https://keras.io/api/metrics/
+ keras losses api: https://keras.io/api/losses/
+ To track your different experiments on models use https://www.tensorflow.org/tensorboard/get_started