### The XOR-Model ###

We will build a tiny neural network in Keras that predicts the XOR data.

First of all, we need to import the libraries.

In [1]:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers





We need to define a simple dataset.


In [2]:
#Input datasets
x_train = np.array([[0,0],[0,1],[1,0],[1,1]])
y_train = np.array([[0],[1],[1],[0]])



Now we set up our parameters and define the model, exactly as in the lecture slides.

TODO: Show different ways to set up the model, with and without an Input layer

In [4]:
# Set up your model here :
x_train.shape
y_train.flatten().shape
y_train.reshape(1, -1).shape


(1, 4)

In [5]:
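from keras.models import Sequential
from keras.layers import Dense

# Fully connected network: 2 inputs -> 500 -> 100 -> 50 -> 2 output classes
model = Sequential()
model.add(Dense(500, activation='relu', input_dim=2))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation="relu"))
# Softmax over two units yields one probability per class (0 and 1)
model.add(Dense(2, activation="softmax"))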



In [6]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=["accuracy"])
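To make the compile step less of a black box, here is a minimal NumPy sketch of what a softmax output and a sparse categorical cross-entropy compute for integer labels. The helper names `softmax` and `sparse_cce` and the logits are made up for illustration; this is not Keras's own implementation.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_cce(labels, probs):
    # Mean negative log-probability that each row assigns to its true class.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# Illustrative logits for the four XOR inputs (made-up values, not model output).
logits = np.array([[2.0, 0.0], [0.0, 2.0], [0.0, 2.0], [2.0, 0.0]])
labels = np.array([0, 1, 1, 0])  # integer class ids, like y_train.flatten()

probs = softmax(logits)
loss = sparse_cce(labels, probs)
print(probs.sum(axis=1))  # every row sums to 1
print(loss)
```

The "sparse" variant expects one integer class id per sample, which is why the labels are passed as a flat vector rather than as one-hot rows.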





The model is now compiled, so the next step is to train it. We also need some predictions in order to see whether the model can indeed predict the XOR data.

In [7]:
# Train your model here, and predict the XOR-data :
model.fit(x_train, y_train.flatten(), epochs=10)


Epoch 1/10


Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x220d2f54fd0>

What is wrong with these predictions?



In [8]:
# Please display predicted values :
model.predict(x_train)




array([[0.5045543 , 0.49544576],
       [0.40081057, 0.59918934],
       [0.38915473, 0.6108453 ],
       [0.58468777, 0.41531223]], dtype=float32)
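One way to read this output: each row holds two class probabilities rather than a single 0/1 label. A small sketch, assuming the probabilities are copied by hand from the output above, turns them into class ids with `argmax`:

```python
import numpy as np

# Probabilities copied from the model.predict output shown above.
probs = np.array([[0.5045543, 0.49544576],
                  [0.40081057, 0.59918934],
                  [0.38915473, 0.6108453],
                  [0.58468777, 0.41531223]], dtype=np.float32)

predicted_classes = probs.argmax(axis=1)  # index of the larger probability per row
confidence = probs.max(axis=1)            # probability of the chosen class

print(predicted_classes)  # [0 1 1 0]
```

Note how close some rows are to 0.5: the class ids are right, but the model is not yet confident.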

In [9]:
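import pandas as pd

# For each row, keep the winning class and the probability assigned to it
predictionsFinale = []
for ligne in model.predict(x_train):
  if ligne[0] > ligne[1]:
    # class 0 is more likely
    predictionsFinale.append([0, ligne[0]])
  else:
    # class 1 is more likely (ties also land here)
    predictionsFinale.append([1, ligne[1]])

df = pd.DataFrame(predictionsFinale, columns=["predictions", "probabilités"])
df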




|   | predictions | probabilités |
|---|-------------|--------------|
| 0 | 0 | 0.504554 |
| 1 | 1 | 0.599189 |
| 2 | 1 | 0.610845 |
| 3 | 0 | 0.584688 |


In [10]:

print("Predictions:", df.predictions.values)


Predictions: [0 1 1 0]


Let's compare the predictions to the true labels. Do you notice a "type" difference?

In [11]:
print("True values:", y_train.flatten())


True values: [0 1 1 0]
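One difference that is easy to miss is the shape: `y_train` is a column of shape `(4, 1)`, while class ids obtained from the predictions form a flat vector (and `model.predict` itself returns float probabilities, not integers). The sketch below, with the predicted ids typed in by hand as an assumption, shows why the labels are flattened before comparing.

```python
import numpy as np

y_train = np.array([[0], [1], [1], [0]])  # same labels as in the data cell
pred = np.array([0, 1, 1, 0])              # class ids, e.g. from an argmax

print(y_train.shape, pred.shape)           # (4, 1) versus (4,)

# Comparing the raw arrays broadcasts (4,) against (4, 1) into a 4x4 matrix.
broadcast = (pred == y_train)
# Flattening the labels first gives the intended element-wise comparison.
elementwise = (pred == y_train.flatten())
accuracy = elementwise.mean()

print(broadcast.shape, elementwise.shape, accuracy)
```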


In [None]:
# Please display true values :


This takes ridiculously long! Let's try to speed up the training. But first, in order to "measure" how long the training takes, implement some code that stores the epoch at which the neural network has stably reached 100 percent accuracy. Stably means that the accuracy does not drop back below 100 percent afterwards.

Keras provides callbacks, and you could write a custom one. For now, you can also write a loop in which the model is trained for one epoch per iteration. Store the accuracy of each epoch in a list so that you can visualize it later.

Do not alter the cost function!

Hint: Good code style would be to put the model set up into a function and also the code to get the accuracy
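As a sketch of the "stably reached" criterion, the helper below scans a list of per-epoch accuracies and returns the first epoch after which the accuracy never drops again. The name `first_stable_epoch` and the sample history are made up for illustration.

```python
def first_stable_epoch(accuracies, target=1.0):
    """Return the first epoch index from which every later accuracy is >= target,
    or None if the run never settles at the target."""
    stable_from = None
    for epoch, acc in enumerate(accuracies):
        if acc >= target:
            if stable_from is None:
                stable_from = epoch
        else:
            stable_from = None  # accuracy dropped again, so reset
    return stable_from

# Made-up history: 100% is hit at epoch 2, lost at epoch 3, then held from epoch 4.
history = [0.5, 0.75, 1.0, 0.75, 1.0, 1.0, 1.0]
print(first_stable_epoch(history))  # 4
```

Stopping the training at the very first epoch with accuracy 1.0 would report epoch 2 here, which is why the full list of accuracies is worth keeping.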

You should get something like this format:
* Epoch 0 / 200    accuracy: 0.5
* Epoch 20 / 200    accuracy: 0.5
* Epoch 40 / 200    accuracy: 0.5
* Epoch 60 / 200    accuracy: 0.5



In [None]:
# # function for model-set-up
# def build_model():
#   # Layers

#   # Learning rate

#   # Optimizer

#   # Compile

#   return model



In [13]:
# function for model-set-up
def build_model(X, y, epochs=200):
  """Build, compile and train the model; return it with the per-epoch accuracies."""
  from keras.models import Sequential
  from keras.layers import Dense
  import keras

  # Layers
  model = Sequential()
  model.add(Dense(50, activation='relu', input_dim=2))
  model.add(Dense(10, activation='relu'))
  model.add(Dense(50, activation="relu"))
  model.add(Dense(2, activation="softmax"))

  # List that collects the accuracy value of every epoch
  accuracys = []

  # Custom callbacks must inherit from keras.callbacks.Callback
  class CustomCallback(keras.callbacks.Callback):
    # https://keras.io/guides/writing_your_own_callbacks/
    # Keras calls on_epoch_end(epoch, logs) automatically after every epoch
    def on_epoch_end(self, epoch, logs=None):
      # Record the accuracy reached in this epoch
      accuracys.append(logs['accuracy'])
      # If it has reached 100% (1.0), stop the training.
      # Note: this stops at the first such epoch; it does not check that the
      # accuracy would stay at 1.0 afterwards.
      if logs['accuracy'] == 1.0:
        self.model.stop_training = True

  # Learning rate
  learning_rate = 0.01

  # Optimizer
  opt = keras.optimizers.Adam(learning_rate=learning_rate)

  # Compile
  model.compile(optimizer=opt,
              loss='sparse_categorical_crossentropy',
              metrics=["accuracy"])

  model.fit(X, y, epochs=epochs, callbacks=[CustomCallback()])

  return model, accuracys


In [14]:
model, accuracy = build_model(x_train, y_train.flatten(), epochs=200)


Epoch 1/200
Epoch 2/200
Epoch 3/200
Epoch 4/200
Epoch 5/200
Epoch 6/200


In [15]:
from keras.models import Sequential
from keras.layers import Dense
import keras


# Layers
model = Sequential()
model.add(Dense(50, activation='relu', input_dim=2))
model.add(Dense(10, activation='relu'))
model.add(Dense(50, activation="relu"))
model.add(Dense(2, activation="softmax"))

# Learning rate
learning_rate = 0.01

# Optimizer
opt = keras.optimizers.Adam(learning_rate=learning_rate)

# Compile

model.compile(optimizer=opt,
            loss='sparse_categorical_crossentropy',
            metrics=["accuracy"])

X = x_train
y = y_train.flatten()

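def fit_model_and_get_accuracy(model, epochs=60, epoch_log_step=20):
  # Caution: steps_per_epoch sets how many batches make up one epoch; it is not
  # a logging interval. With only 4 samples the data can run out early, which is
  # a likely cause of the run below ending before epoch 60 with a final 0.0.
  model.fit(X, y, epochs=epochs, steps_per_epoch=epoch_log_step)
  # model.history is the History object of the most recent fit call
  accuracies_list = model.history.history["accuracy"]
  return accuracies_list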

fit_model_and_get_accuracy(model)


Epoch 1/60
Epoch 2/60
Epoch 3/60
Epoch 4/60
Epoch 5/60
Epoch 6/60
Epoch 7/60
Epoch 8/60
Epoch 9/60
Epoch 10/60
Epoch 11/60
Epoch 12/60
Epoch 13/60


[0.6000000238418579,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 0.0]

## First Simple Reference Model (hidden_shape = 2, hidden_size = 1)

In [16]:
# Use your functions to execute your script


# function for model-set-up
def build_model(X, y, epochs=200):
  from keras.models import Sequential
  from keras.layers import Dense
  from keras.callbacks import LearningRateScheduler
  import keras

  # Layers
  model = Sequential()
  model.add(Dense(50, activation='relu', input_dim=2))
  model.add(Dense(10, activation='relu'))
  model.add(Dense(21, activation="relu"))
  model.add(Dense(2, activation="softmax"))

  # Callback that records the accuracy after every epoch
  accuracys = []
  class CustomCallback(keras.callbacks.Callback):
    # https://keras.io/guides/writing_your_own_callbacks/
    # Keras passes the epoch index as the first argument of on_epoch_end
    def on_epoch_end(self, epoch, logs=None):
      accuracys.append(logs['accuracy'])
      # Stop as soon as the training accuracy reaches 100%
      if logs['accuracy'] == 1.0:
        self.model.stop_training = True

  # Learning rate
  learning_rate = 0.01

  # Optimizer
  opt = keras.optimizers.Adam(learning_rate=learning_rate)

  # Compile

  model.compile(optimizer=opt,
              loss='sparse_categorical_crossentropy',
              metrics=["accuracy"])

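  # Register the callback; without it the accuracys list would stay empty
  model.fit(X, y, epochs=epochs, callbacks=[CustomCallback()])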

  return model, accuracys


## More complex models

Okay, now let's get started. Make a note of the epoch at which your accuracy reaches 1.0; it is your baseline. Then try to alter the model so that it trains faster. These are the hyperparameters you need to optimize, but feel free to add others!

1) Hidden size

2) Number of hidden layers

3) Learning rate - just try different numbers
   What happens if the learning rate is too big?

4) Learning rate - try to decrease it during the training process

5) Different optimizers
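To build some intuition for points 3 and 4, here is a toy sketch that is independent of Keras: plain gradient descent on f(w) = w², where each step multiplies w by (1 - 2·lr). The function `gradient_descent` and the chosen rates are illustrative assumptions, not values from the exercise.

```python
def gradient_descent(lr_schedule, steps=20, w0=1.0):
    """Minimise f(w) = w**2 with the learning rate given by lr_schedule(step)."""
    w = w0
    for step in range(steps):
        grad = 2.0 * w                   # derivative of w**2
        w -= lr_schedule(step) * grad
    return w

small = gradient_descent(lambda step: 0.1)                  # shrinks w by 0.8 per step
too_big = gradient_descent(lambda step: 1.1)                # multiplies w by -1.2 per step
decayed = gradient_descent(lambda step: 1.1 * 0.5 ** step)  # starts at 1.1, halves every step

print(abs(small), abs(too_big), abs(decayed))
```

With the constant rate of 1.1 the iterate overshoots the minimum further on every step and grows without bound, while the same starting rate with decay settles down. Decaying this aggressively also has a cost: the steps become so small that w stops moving before it reaches 0.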



For the eager ones: The weights are initialized randomly (within limits) - so if you want better results and insights into the effect of different hyperparameters, you would have to run each experiment a couple of times (i.e. 5 to 10 minimum) and average over them. But you may ignore this in this quest - just bear it in mind!
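A sketch of what averaging over repeated runs could look like. Here `epochs_to_converge` is a hypothetical stand-in that returns a made-up number per seed; in the real experiment it would train a freshly initialized model and return the epoch of stable accuracy 1.0.

```python
import random
import statistics

def epochs_to_converge(seed):
    """Hypothetical stand-in for one training run: returns a made-up
    'epoch of stable accuracy 1.0' that depends only on the random seed."""
    rng = random.Random(seed)
    return rng.randint(20, 120)

def summarize(seeds):
    # One run per seed, then report the mean and the spread across runs.
    results = [epochs_to_converge(seed) for seed in seeds]
    return statistics.mean(results), statistics.stdev(results)

mean_epochs, spread = summarize(range(10))
print(mean_epochs, spread)
```

Reporting the spread next to the mean helps to judge whether a difference between two hyperparameter settings is larger than the run-to-run noise.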


In [None]:
# Hidden Size


# Number of hidden layers

# Learning rate

# Learning rate - continuous decrease

# Optimizers
model, accuracy = build_model(x_train, y_train.flatten(), epochs=200)
accuracy


Now visualize all your learning curves, i.e. number of epochs against accuracy. Which hyperparameter had the biggest effect?

Hint: In order to do that systematically, you could save the list of accuracies for each experiment and then display them all in one graph (at least for each tuned hyperparameter)
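One possible way to organize this, sketched with made-up curves: keep the accuracy lists in a dictionary keyed by experiment name, and pad runs that stopped early with their final value so that all curves share the same epoch axis. The helper `pad_curves` is an assumption for illustration and expects non-empty lists.

```python
def pad_curves(curves):
    """Extend each accuracy curve with its final value so all have equal length."""
    longest = max(len(curve) for curve in curves.values())
    return {name: list(curve) + [curve[-1]] * (longest - len(curve))
            for name, curve in curves.items()}

# Made-up curves for illustration; real ones would come from each experiment.
curves = {
    "lr=0.01": [0.5, 0.75, 1.0],
    "lr=0.001": [0.5, 0.5, 0.75, 0.75, 1.0],
}
padded = pad_curves(curves)

for name, curve in padded.items():
    print(name, curve)
# Each padded[name] can then be drawn with plt.plot(padded[name], label=name).
```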

In [None]:
# Here go the plots
import matplotlib.pyplot as plt


plt.plot(model.history.history['accuracy'])
#plt.plot(model.history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
#plt.legend(['train', 'test'], loc='upper left')
plt.show()


In [None]:
# More plots
import matplotlib.pyplot as plt


plt.plot(model.history.history['loss'])
#plt.plot(model.history.history['val_loss'])
# The model above was compiled with sparse_categorical_crossentropy, so this is not an MSE curve
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
#plt.legend(['train', 'test'], loc='upper left')
plt.show()



In [None]:
# Add as many plots as you want


We are still working with mean_squared_error as a cost-function. What would be a more suitable cost function?

Alter the cost function and see how fast you get to a stable accuracy of 1.0.
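As a hint for comparing cost functions, the sketch below evaluates the squared error and the binary cross-entropy for a single sample with true label 1 at a few predicted probabilities. The helper names are made up, and `p` must lie strictly between 0 and 1 for the logarithm to be defined.

```python
import math

def squared_error(y_true, p):
    return (y_true - p) ** 2

def binary_cross_entropy(y_true, p):
    # Negative log-likelihood of the true label under the predicted probability p.
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# True label is 1; p is the predicted probability of class 1.
for p in (0.9, 0.5, 0.01):
    print(p, squared_error(1, p), binary_cross_entropy(1, p))
```

For a confidently wrong prediction (p = 0.01) the squared error stays below 1, whereas the cross-entropy is about 4.6, so the wrong answer is penalized much more strongly.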


What is your best score (epoch with accuracy == 1.0)?

Can you get the score to under 100?

What might be the problem with your best model?

Answer all these questions in the text cell below

My conclusion about these experiments is: