<a href="https://colab.research.google.com/github/GabeMaldonado/UoL_Study_Materials/blob/main/Neural_networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neural networks
## Instructions:
* Go through the notebook and complete the tasks.
* Make sure you understand the examples given. If you need any help, refer to the documentation links provided or go to the Topic 7 discussion forum. 
* When a question allows a free-form answer (e.g. what do you observe?), create a new markdown cell below and answer the question in the notebook. 
* Save your notebooks when you are done.

Before you start on the tasks below, go through the information in the following links:
* <a href="http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html">scikit-learn ‘Multilayer perceptron classifier’</a>
* <a href="http://scikit-learn.org/stable/modules/neural_networks_supervised.html">scikit-learn ‘Neural network models (supervised)’</a> 

**Task 1:**
In the cell below, you can see code that implements a perceptron network (with a single node in the hidden layer). The data given to the network relates to the XOR problem: each input is a 2D point whose coordinates are either 0 or 1, and the target is 1 only when the two coordinates differ, i.e. when one is 0 and the other is 1. Run and go through the code.


In [1]:
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
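# XOR data: the four input combinations, repeated 1000 times
X = [[0., 0.], [1., 1.], [0., 1.], [1., 0.]] * 1000
y = [0, 0, 1, 1] * 1000

# one hidden unit with identity (linear) activation
# note: (1,) is a one-element tuple, whereas (1) is just the integer 1
clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(1,),
                    random_state=1, activation='identity', max_iter=2000,
                    tol=1e-5, validation_fraction=0)
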
clf.fit(X, y)
ypred=clf.predict(X)
accuracy_score(y, ypred)


0.5

**Task 2:**
What would you say regarding the performance of the network above? How would you justify the results?


In [None]:
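# write answer as comments here
# The network performs poorly: 50% accuracy is no better than random
# guessing on this balanced two-class problem.
# XOR is not linearly separable, and with activation='identity' the hidden
# layer is linear, so the decision boundary is a straight line that cannot
# separate the two classes.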
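
To see why the identity activation limits the network in Task 1, the sketch below checks two things with NumPy. It uses randomly drawn, hypothetical weights (not the fitted values stored in `clf`). First, two stacked linear layers collapse into a single linear map. Second, for any such map the scores of the two class-0 XOR points sum to the same value as the scores of the two class-1 points, so no threshold can place both class-1 points above both class-0 points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for a 2-input, 1-hidden-unit, 1-output network.
W1, b1 = rng.normal(size=(2, 1)), rng.normal(size=1)  # input -> hidden
W2, b2 = rng.normal(size=(1, 1)), rng.normal(size=1)  # hidden -> output

# The four XOR inputs, in the same order as in the notebook.
X = np.array([[0., 0.], [1., 1.], [0., 1.], [1., 0.]])

# Forward pass with identity activation: nothing nonlinear between the layers.
two_layer = (X @ W1 + b1) @ W2 + b2

# The same function written as one linear map.
W, b = W1 @ W2, b1 @ W2 + b2
one_layer = X @ W + b
print(np.allclose(two_layer, one_layer))

# Scores of the class-0 points (rows 0, 1) and class-1 points (rows 2, 3).
class0_sum = one_layer[0] + one_layer[1]
class1_sum = one_layer[2] + one_layer[3]
print(np.allclose(class0_sum, class1_sum))
```

Both checks hold for any choice of weights, which is why further training or more linear units cannot help.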

**Task 3:**
What would be the most minimal changes you could make in order to obtain 100% accuracy on this problem?


In [2]:
# code here
# keep a single hidden layer, but give it two units and a nonlinear activation function (tanh)

In [4]:
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
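# XOR data: the four input combinations, repeated 1000 times
X = [[0., 0.], [1., 1.], [0., 1.], [1., 0.]] * 1000
y = [0, 0, 1, 1] * 1000

# two hidden units with a nonlinear (tanh) activation
clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(2,),
                    random_state=1, activation='tanh', max_iter=2000,
                    tol=1e-5, validation_fraction=0)
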
clf.fit(X, y)
ypred=clf.predict(X)
accuracy_score(y, ypred)


1.0
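
As a rough illustration of why two tanh units are enough, the sketch below builds such a network by hand with NumPy. The weights, the steepness `k`, and the output rule are hand-picked assumptions for illustration; they are not the values that `MLPClassifier` learns.

```python
import numpy as np

# The four XOR inputs and their labels, in the same order as in the notebook.
X = np.array([[0., 0.], [1., 1.], [0., 1.], [1., 0.]])
y = np.array([0, 0, 1, 1])

k = 10.0  # steepness of the tanh units; hand-picked, not learned

# Each column of W1 is one hidden unit; both units look at x1 + x2.
# Unit 1 switches on when x1 + x2 > 0.5, unit 2 when x1 + x2 > 1.5.
W1 = np.array([[k, k],
               [k, k]])
b1 = np.array([-0.5 * k, -1.5 * k])
hidden = np.tanh(X @ W1 + b1)

# The score is positive only when unit 1 is on and unit 2 is off,
# i.e. when exactly one of the two inputs is 1.
score = hidden[:, 0] - hidden[:, 1] - 1.0
y_pred = (score > 0).astype(int)

print(y_pred)
print((y_pred == y).mean())
```

The nonlinearity is what matters here: replacing `np.tanh` with the identity turns `score` back into a linear function of the inputs.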

**Task 4:**
The MNIST dataset is one of the most well-known datasets that is used for machine learning tasks. It contains a set of handwritten digit images with accompanying labels.
In the cell below, you are given the code that loads the MNIST dataset and splits the data into training and test sets.
Your task is to create a multilayer perceptron classifier that performs well on MNIST (you should expect a score over 95% on the test data). Do this by completing the code in the cell below, reporting the accuracy and confusion matrix on both testing and training data.


In [5]:
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml

mnist = fetch_openml('mnist_784') 

# rescale pixel values from [0, 255] to [0, 1]
X, y = mnist.data / 255., mnist.target
# train/test split: first 60,000 samples for training, remaining 10,000 for testing
X_train, X_test = X[:60000], X[60000:]
y_train, y_test = y[:60000], y[60000:]


In [13]:
# create, train mlp model

mlp = MLPClassifier(hidden_layer_sizes=(50,),
                    max_iter=10, alpha=1e-4,
                    solver='sgd', verbose=10, tol=1e-4,
                    random_state=1, learning_rate_init=.1)

mlp.fit(X_train, y_train)
print(f"Training set accuracy {mlp.score(X_train, y_train)}")
print(f"Test set accuracy: {mlp.score(X_test, y_test)}")

Iteration 1, loss = 0.32009978
Iteration 2, loss = 0.15347534
Iteration 3, loss = 0.11544755
Iteration 4, loss = 0.09279764
Iteration 5, loss = 0.07889367
Iteration 6, loss = 0.07170497
Iteration 7, loss = 0.06282111
Iteration 8, loss = 0.05530788
Iteration 9, loss = 0.04960484
Iteration 10, loss = 0.04645355




Training set accuracy 0.9868
Test set accuracy: 0.97
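
Task 4 also asks for the confusion matrix on both splits. One possible way to report it uses `sklearn.metrics.confusion_matrix`. In the sketch below, `report` is a hypothetical helper name, and the toy label lists stand in for the real targets and predictions so that the example runs without downloading MNIST.

```python
from sklearn.metrics import accuracy_score, confusion_matrix


def report(name, y_true, y_pred):
    """Print accuracy and confusion matrix (rows: true labels, columns: predictions)."""
    acc = accuracy_score(y_true, y_pred)
    cm = confusion_matrix(y_true, y_pred)
    print(f"{name} accuracy: {acc:.4f}")
    print(cm)
    return acc, cm


# Toy string labels standing in for MNIST targets and model predictions.
y_true = ['0', '1', '1', '2', '2', '2']
y_pred = ['0', '1', '2', '2', '2', '1']
acc, cm = report("Toy", y_true, y_pred)

# With the fitted model from the cell above, the calls would be:
# report("Training", y_train, mlp.predict(X_train))
# report("Test", y_test, mlp.predict(X_test))
```

Off-diagonal entries count misclassifications, so a large value in row `i`, column `j` means that digit `i` is often predicted as digit `j`.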


In [14]:
import tensorflow as tf
def train_mnist_conv():
    """Train a small CNN on MNIST, stopping early once training accuracy exceeds 99.8%."""

    class myCallback(tf.keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs=None):
            # logs holds the metrics of the epoch that just finished
            accuracy = (logs or {}).get('accuracy')
            if accuracy is not None and accuracy > 0.998:
                print("\nReached 99.8% accuracy so cancelling training!")
                self.model.stop_training = True

    callbacks = myCallback()

    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data()

    # add a channel dimension and rescale pixel values to [0, 1]
    x_train = x_train.reshape(60000, 28, 28, 1) / 255.0
    x_test = x_test.reshape(10000, 28, 28, 1) / 255.0

    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')])

    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.summary()

    # model fitting
    history = model.fit(x_train, y_train, epochs=20, callbacks=[callbacks])

    # predicted class probabilities for every test image
    classifications = model.predict(x_test)

    # compare the predicted probabilities with the true label for one test image
    print(classifications[1])
    print(y_test[1])

    return history.epoch, history.history['accuracy'][-1]
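
The output shapes and parameter counts that `model.summary()` reports for this architecture can be checked by hand. The sketch below uses hypothetical helper functions and assumes the Keras defaults that the model relies on: stride-1 convolutions without padding, and 2x2 max pooling that halves each spatial dimension, rounding down.

```python
def conv_out(size, kernel):
    """Spatial size after a stride-1 convolution without padding."""
    return size - kernel + 1


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel convolution."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit for a dense layer."""
    return (n_in + 1) * n_out


size = conv_out(28, 3)           # first Conv2D: 28 -> 26
size = size // 2                 # first MaxPooling2D: 26 -> 13
size = conv_out(size, 3)         # second Conv2D: 13 -> 11
size = size // 2                 # second MaxPooling2D: 11 -> 5
flat = size * size * 64          # Flatten: 5 * 5 * 64 values per image

print(flat)                      # input width of the first Dense layer
print(conv_params(3, 1, 64))     # first Conv2D
print(conv_params(3, 64, 64))    # second Conv2D
print(dense_params(flat, 128))   # Dense(128)
print(dense_params(128, 10))     # Dense(10)
```

Most of the parameters sit in the first dense layer rather than in the convolutions, because every one of the flattened features connects to all 128 units.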

In [15]:
_, _ = train_mnist_conv()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 64)        640       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 64)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        36928     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1600)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               204928    
______________________________