# Simplest example
## XOR gate
- Traditional problem in the early days of AI research
- The XOR gate is a function of two variables that returns one output:
$$\begin{array}{cc|c}
x_1 & x_2 & y \\ \hline
0 & 0 & 0 \\
0 & 1 & 1 \\
1 & 0 & 1 \\
1 & 1 & 0
\end{array}$$
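As a quick check, the truth table above can be reproduced with NumPy. This is only a sketch: the names `x1`, `x2` and `y` are chosen here to mirror the table and are not used elsewhere in the notebook.

```python
import numpy as np

# All four combinations of the two binary inputs, in the same order as the table
x1 = np.array([0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1])

# XOR is 1 only when exactly one of the two inputs is 1
y = np.logical_xor(x1, x2).astype(int)

for a, b, out in zip(x1, x2, y):
    print(a, b, '|', out)
```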

This toy example ignores the usual recommendations (separate training and validation sets, etc.), but it is enough to learn how to use the libraries.

### Import the relevant libraries

In [None]:
# optional, only for Jupyter
%matplotlib notebook

# General libraries
import numpy as np                # to deal with arrays, vectors, matrices...
import matplotlib.pyplot as plt   # to plot the data
# Tensorflow
import os
HOME = os.getenv('HOME')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # to get rid of the TF compilation warnings
import tensorflow as tf
from tensorflow.keras import models
from tensorflow.keras.layers import Dense

In [None]:
# Only because my system-wide config is tuned, you don't need these lines
import matplotlib as mpl
mpl.rcParams['figure.figsize'] = 5,3
mpl.rcParams['font.size'] = 12.0

### Get the data
In this case we create the dataset manually

In [None]:
# Replicate the XOR behavior
IN_train = np.array([[0,0],
                     [0,1],
                     [1,0],
                     [1,1]])

OUT_train = np.array([[0],
                      [1],
                      [1],
                      [0]])

inp_shape = IN_train.shape[1:]
print(inp_shape)

There is no separate test data, so we reuse the training set for testing.

In [None]:
IN_test = IN_train
OUT_test = OUT_train

No normalization is needed, since all inputs and outputs are in $[0,1]$.  
There are no NaN values or missing data.
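Both statements are easy to verify programmatically. A minimal sketch, assuming the same arrays as in the cells above (they are redefined here so that the snippet runs on its own); the dictionary `checks` is a name introduced only for this example.

```python
import numpy as np

IN_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
OUT_train = np.array([[0], [1], [1], [0]])

checks = {}
for name, data in [('IN_train', IN_train), ('OUT_train', OUT_train)]:
    in_range = bool(data.min() >= 0 and data.max() <= 1)   # values within [0, 1]?
    has_nan = bool(np.isnan(data).any())                    # any NaN entries?
    checks[name] = (in_range, has_nan)
    print(f'{name}: within [0,1] = {in_range}, contains NaN = {has_nan}')
```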

### Create the NN
`tensorflow.keras.models.Sequential` builds a model as a linear stack of layers, which is all a _Multi Layer Perceptron_ needs.  
`tensorflow.keras.layers.Dense` is the standard Fully Connected layer. The default activation is `None` (a linear output), so we set it to the standard `sigmoid` in the hidden layers; the output layer below keeps the default.  
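To make the role of the activation concrete, the following NumPy sketch computes what a fully connected layer does to its input: `activation(x @ W + b)`. The helper names `sigmoid` and `dense`, and the values of `W` and `b`, are illustrative choices and are not taken from a trained model.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def dense(x, W, b, activation=None):
    """Fully connected layer: affine map followed by an optional activation."""
    z = x @ W + b
    return z if activation is None else activation(z)

x = np.array([[0.0, 1.0]])            # one sample with 2 features
W = np.array([[1.0, -1.0, 0.5],       # shape (2 inputs, 3 units), arbitrary values
              [2.0,  0.0, -0.5]])
b = np.zeros(3)

linear_out = dense(x, W, b)             # activation=None: plain affine output
sigmoid_out = dense(x, W, b, sigmoid)   # same layer followed by a sigmoid
print(linear_out)
print(sigmoid_out)
```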

You can experiment here with different architectures, activations (sigmoid, tanh...), etc.

Possible architectures:
- `input --> 2 --> 1 --> output`  (harder to train)  
- `input --> 200 --> 500 --> 1000 --> 100 --> 1 --> output` (slower to train)
- `input --> 10 --> 5 --> 3 --> 1 --> output` (reasonable)
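One way to compare these architectures is by their number of trainable parameters. The helper `count_params` and the labels in `architectures` are hypothetical names introduced for this sketch. It assumes every layer is fully connected with one bias per unit and that the input has 2 features, as in the XOR dataset.

```python
def count_params(n_inputs, units):
    """Count the weights and biases of a stack of fully connected layers."""
    total = 0
    n_in = n_inputs
    for n_out in units:
        total += n_in * n_out + n_out   # weight matrix plus one bias per unit
        n_in = n_out
    return total

architectures = {
    'small':      [2, 1],
    'large':      [200, 500, 1000, 100, 1],
    'reasonable': [10, 5, 3, 1],
}
for name, units in architectures.items():
    print(f'{name:>10}: {count_params(2, units)} parameters')
```

By the same count, a network with layers of 5, 3 and 1 units on 2 inputs has 37 parameters, which can be compared with the total reported by `model.summary()`.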

In [None]:
model = models.Sequential()
model.add(Dense(5, activation='sigmoid', input_shape=inp_shape))
model.add(Dense(3, activation='sigmoid'))
model.add(Dense(1))

# model.add(Dense(10, activation='tanh', input_shape=inp_shape))
# model.add(Dense(5, activation='tanh'))
# model.add(Dense(3, activation='tanh'))
# model.add(Dense(1))

Compile the model and check that everything is ok.  
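`optimizer`: algorithm used to update the weights from the gradients of the loss; `adam` is an adaptive variant of gradient descent.  
`loss`: error function to minimize.  
`metrics`: quantities recorded to monitor the training process; they can be other loss functions or any other useful measure, and they are not used for the optimization itself.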
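For intuition, a loss and a metric of this kind can be computed by hand. The sketch assumes that, for a single output unit, accuracy counts a prediction as correct when thresholding it at 0.5 gives the target. The values in `y_pred` are made-up numbers for illustration, not outputs of the model.

```python
import numpy as np

y_true = np.array([[0], [1], [1], [0]])
y_pred = np.array([[0.1], [0.8], [0.4], [0.3]])   # made-up network outputs

# Mean squared error: average of the squared differences
mse = np.mean((y_true - y_pred) ** 2)

# Accuracy: fraction of samples whose thresholded prediction matches the target
labels = (y_pred > 0.5).astype(int)
accuracy = np.mean(labels == y_true)

print(f'mse = {mse:.4f}, accuracy = {accuracy:.2f}')
```

Note that the two quantities can disagree: the third sample contributes most of the error and is also the only misclassified one, but a prediction of 0.49 would be just as "wrong" for the accuracy while adding less to the loss.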

In [None]:
model.compile(optimizer = 'adam',
              loss = 'mean_squared_error',
              metrics = ['accuracy'])
model.summary()

## Testing
We can now test our model, which still has its randomly initialized parameters:

In [None]:
print('Before training:')
print('Input   xpct Out   Output')
#
# We use the "predict" method to evaluate the model on the training dataset
#
predicted = model.predict(IN_train)

for i in range(IN_train.shape[0]):
    print(f'{IN_train[i]}     {OUT_train[i]}       {predicted[i][0]:.2f}')

The outputs are essentially random: the weights were randomly initialized (by default Keras sets the biases of `Dense` layers to zero) and the model has not been trained yet.
Let us train the model now.

### Training the model
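The training process is carried out by the `fit` method.  
`epochs`: number of complete passes over the training data. With only 4 samples and the default batch size of 32, each epoch is a single parameter update.  
`validation_data`: if provided, the loss and metrics are also evaluated on this dataset at the end of each epoch; it is not used to update the weights.  
`verbose`: 0 runs quietly with no output, 1 shows a progress bar for each epoch.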
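To illustrate what happens during the epochs, here is a NumPy-only sketch of a training loop on the XOR data. It is not what `fit` does internally: it uses plain full-batch gradient descent instead of Adam, a single tanh hidden layer, and hand-derived gradients. The learning rate `lr`, the hidden size and the random seed are arbitrary choices.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(2, 5))   # hidden layer, 5 units
b1 = np.zeros(5)
W2 = rng.normal(scale=0.5, size=(5, 1))   # linear output layer
b2 = np.zeros(1)

lr, epochs = 0.05, 500
losses = []
for epoch in range(epochs):
    # Forward pass over the whole dataset (one batch = one epoch here)
    H = np.tanh(X @ W1 + b1)
    Y = H @ W2 + b2
    err = Y - T
    losses.append(np.mean(err ** 2))

    # Backward pass: gradients of the mean squared error
    dY = 2 * err / len(X)
    dW2 = H.T @ dY
    db2 = dY.sum(axis=0)
    dZ = (dY @ W2.T) * (1 - H ** 2)       # chain rule through tanh
    dW1 = X.T @ dZ
    db1 = dZ.sum(axis=0)

    # One parameter update per epoch
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print(f'loss at epoch 0: {losses[0]:.4f}, after {epochs} epochs: {losses[-1]:.4f}')
```

The list `losses` plays the same role as `history.history['loss']` in the cells below: one value per epoch.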

In [None]:
# Train the model
from time import time
t_old = time()

history = model.fit(IN_train, OUT_train, epochs=500,
                    validation_data = (IN_test,OUT_test),
                    verbose=1)

print(f'Training: {time()-t_old:.1f}s')

In [None]:
# plot learning curve
err = history.history['loss']
acc = history.history['accuracy']

fig, ax = plt.subplots()
ax.plot(err,label='loss')
ax.plot(acc,label='accuracy')
ax.set_title('Learning curves')
ax.legend()
plt.show()

We can now test our model with the optimized parameters.

In [None]:
print('After training:')
print('Input   xpct Out   Output')
predicted = model.predict(IN_train)
for i in range(IN_train.shape[0]):
    print(f'{IN_train[i]}     {OUT_train[i]}       {predicted[i][0]:.2f}')

## Explore the parameters of the model
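As a reference for reading the output of the next cell: `get_weights` returns, for each `Dense` layer in order, the weight matrix of shape `(inputs, units)` followed by the bias vector of shape `(units,)`. The sketch below builds such a list by hand for a tiny `2 --> 2 --> 1` network that reproduces XOR exactly. The ReLU activation and the specific values are illustrative assumptions, not what the trained model above will contain.

```python
import numpy as np

# Hand-picked parameters in the same [kernel, bias, kernel, bias] order
weights = [
    np.array([[1.0, 1.0],
              [1.0, 1.0]]),      # hidden kernel, shape (2, 2)
    np.array([0.0, -1.0]),       # hidden bias,   shape (2,)
    np.array([[1.0],
              [-2.0]]),          # output kernel, shape (2, 1)
    np.array([0.0]),             # output bias,   shape (1,)
]

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

W1, b1, W2, b2 = weights
hidden = np.maximum(0.0, X @ W1 + b1)   # ReLU hidden layer
output = hidden @ W2 + b2               # linear output layer

for W in weights:
    print(W.shape)
print(output.ravel())
```

A trained network usually ends up with far less readable values, but the shapes and the ordering follow the same pattern.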

In [None]:
# Print the model weights and biases
for W in model.get_weights():
    print(W.shape)
    print(W)
    print('-----------')


## How to save and load models

In [None]:
# Save result
model.save('my_model.h5')

In [None]:
model_loaded = models.load_model('my_model.h5')
model_loaded.summary()

In [None]:
predicted = model_loaded.predict(IN_train)
for i in range(IN_train.shape[0]):
    # predicted[i] is a length-1 array, so take its single element before formatting
    print(f'{IN_train[i]}     {OUT_train[i]}       {predicted[i][0]:.2f}')