# Perceptron Model & MLP (Multi-Layer Perceptron)

- [x] Dataset generation
- [x] Perceptron
    - [x] Activation functions
    - [x] init
    - [x] fit
    - [x] predict
- [x] Neural Network
    - [x] init
    - [x] fit
    - [x] predict
- [x] Global script

### Imports

Here we import the libraries and modules needed to run the code.
We also enable the autoreload extension so that the latest version of the local Python files is picked up.

In [2]:
%load_ext autoreload

In [3]:
%autoreload

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from random import uniform
from pprint import pprint

from perceptron import Perceptron 
from neuralnetwork import NeuralNetwork 
from neuralnetworkzouk import NeuralNetworkZouk

### Dataset Generation

I am not sure which form of the data is most appropriate, so there are 2 different generator functions:
- generate_dataset_df: creates the data as a DataFrame
- generate_dataset_array: splits the data into a 2D features array and a target array of shape (size, 1)

Although the targets were originally set to either -1 or 1 in the code, I decided to use 0 or 1 instead.

I am also not sure about the values of the features:
- float from -1 to 1
- boolean -1 or 1

In [130]:
def generate_dataset_df(size, features):
    """
    Inputs:
        size: (int) number of samples -> m
        features : (int) number of features -> n
    Return:
        (DataFrame) with m samples, n feature columns x0..x(n-1) and the output column y
    """
    return pd.DataFrame(
        # dataset: n feature values plus one output value per sample
        [[uniform(-1.0, 1.0) for _ in range(features + 1)] for _ in range(size)],
        # column labels
        columns=[f"x{i}" for i in range(features)] + ['y'])
    
def generate_dataset_array(size, features):
    """
    Inputs:
        size: (int) number of samples -> m
        features : (int) number of features -> n
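    Return:
        features: (2D array) shape (m, n), values between -1 and 1
        targets: (2D array) shape (m, 1), values 0 or 1
    """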
    # value between -1. and 1
    features = np.asarray([np.asarray([uniform(-1., 1.) for _ in range(features)]) for _ in range(size)])
    # value between 0. and 1.
#     features = np.asarray([np.asarray([uniform(0., 1.) for _ in range(features)]) for _ in range(size)])
    # value 0 or 1
#     features = np.asarray([np.random.randint(2, size=features) for _ in range(size)])

    # value between -1. and 1.
#     targets = np.heaviside(np.asarray([uniform(-1., 1.) for _ in range(size)]), 0)
    # value 0 or 1
    targets = np.random.randint(2, size=size)
    targets = targets.reshape(np.shape(targets)[0], 1)
    
    
    features[features==0] = -1.


#     targets[targets==0] = -1
    return features, targets

### Perceptron

Here we create a perceptron to check that the gradient descent works.
It appears to work for the 4 activation functions, since the error decreases.
The learning rate still needs some adjustment to avoid occasional overshooting.

In [148]:
%autoreload

x, y = generate_dataset_array(1, 10)
# Testing on the first element
test_x, test_y = x[0], y[0][0]

activation_functions = ['relu', 'sigmoid', 'heaviside', 'tanh']
for activation_function in activation_functions:
    perceptron = Perceptron(10, activation_function)
    for _ in range(20):
        perceptron.fit(test_x, test_y, 0.3)
    print(f"{activation_function.upper().ljust(10)} Initial target: {test_y} | Model output: {perceptron.predict(test_x)}")

RELU       Initial target: 0 | Model output: 3.0531133177191805e-16
SIGMOID    Initial target: 0 | Model output: 0.06157841525911837
HEAVISIDE  Initial target: 0 | Model output: 0.0
TANH       Initial target: 0 | Model output: 6.10622663543836e-16
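
The `perceptron` module itself is not shown in this notebook, so here is a minimal sketch of what one gradient-descent step could look like for the sigmoid case. `TinyPerceptron`, its zero initialisation, the squared-error loss and the sample values are all assumptions made for illustration; the real `Perceptron` class may differ. It is written in plain Python so it runs without the project files.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyPerceptron:
    """Hypothetical single-neuron model; not the project's Perceptron class."""

    def __init__(self, n_inputs):
        # Assumption: start from zero weights so the run is deterministic
        self.weights = [0.0] * n_inputs
        self.bias = 0.0

    def predict(self, inputs):
        z = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        return sigmoid(z)

    def fit(self, inputs, target, learning_rate):
        """One gradient-descent step on the squared error 0.5 * (out - target)**2."""
        out = self.predict(inputs)
        # Chain rule: dE/dz = (out - target) * sigmoid'(z), with sigmoid'(z) = out * (1 - out)
        delta = (out - target) * out * (1.0 - out)
        self.weights = [w - learning_rate * delta * x for w, x in zip(self.weights, inputs)]
        self.bias -= learning_rate * delta
        # Return the error measured before the update
        return 0.5 * (out - target) ** 2


# Made-up sample with 10 features in [-1, 1] and a target of 0
sample = [0.5, -0.2, 0.8, -1.0, 0.3, 0.0, -0.6, 0.9, 0.1, -0.4]
target = 0.0
model = TinyPerceptron(len(sample))
errors = [model.fit(sample, target, 0.3) for _ in range(20)]
print(f"first error: {errors[0]:.4f} | last error: {errors[-1]:.4f}")
```

Tracking the error returned by each `fit` call is one way to check the "error decreasing" observation, and to spot overshooting when the learning rate is too high.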


### Neural Network

Here is the main part: training the neural network and querying it.

In [None]:
%autoreload

# TRAINING_LOOP = 500_000
TRAINING_LOOP = 10000
DATASET_SIZE = 8
DATASET_FEATURES = 6
LEARNING_RATE = 1.2

neural_network = NeuralNetwork(DATASET_FEATURES, [100], 3)
# neural_network = NeuralNetwork(6, [10], 1)
X, y = generate_dataset_array(DATASET_SIZE, DATASET_FEATURES)

for i in range(TRAINING_LOOP):
    if (i%1000 == 0):
        print(f"Training loop: {i*100//TRAINING_LOOP}%")
    for inputs, target in zip(X, y):
        neural_network.fit(inputs, target, LEARNING_RATE)

predictions = neural_network.predict(np.full(DATASET_FEATURES, -1.0))
print(f"Prediction: {predictions}")

### New Dataset - Iris

This dataset consists of the petal and sepal lengths and widths of 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4 numpy.ndarray.

The rows are the samples and the columns are: Sepal Length, Sepal Width, Petal Length and Petal Width.

![dataset_presentation](https://deeplearning.cms.waikato.ac.nz/img/iris.png)

Retrieving the dataset and formatting it into the correct output for our use case.

In [168]:
from sklearn import datasets
from sklearn.preprocessing import OneHotEncoder, StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split

def generate_dataset():
    """
    Args:
        None
    Return:
        X_train, X_test: 2-D numpy arrays with the features, for training and testing
        Y_train, Y_test: 2-D numpy arrays with the one-hot targets, for training and testing
    """
    # Load the dataset as an object
    iris = datasets.load_iris()
    # Take the features
    data_X = np.asarray(iris['data'])
    # Take the targets
    data_Y = np.asarray(iris['target'])
    # Transform the 1-D array as a 2-D array
    data_Y = data_Y.reshape(np.shape(data_Y)[0], 1)
    
    # Transform the [0/1/2] label to a one-hot [1,0,0]/[0,1,0]/[0,0,1]
    enc = OneHotEncoder()
    data_Y = enc.fit_transform(data_Y).toarray()
    
    # Scale the features between 0 and 1
    scaler = MinMaxScaler()
    data_X = scaler.fit_transform(data_X)
    
    # Split the data between a training and a testing part
    return train_test_split(data_X, data_Y)
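
To make the two preprocessing steps above concrete, here is a small sketch of what one-hot encoding and min-max scaling do, written in plain Python rather than with scikit-learn. The helper names `one_hot` and `min_max_scale` and the toy values are hypothetical and only meant as an illustration of the idea, not as a replacement for `OneHotEncoder` or `MinMaxScaler`.

```python
def one_hot(labels, n_classes):
    """Turn integer labels into rows with a single 1 at the label's index."""
    return [[1.0 if i == label else 0.0 for i in range(n_classes)] for label in labels]


def min_max_scale(rows):
    """Rescale each column to [0, 1] using that column's own min and max."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column has no range, so map it to 0.0 to avoid dividing by zero
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


labels = [0, 1, 2]
encoded = one_hot(labels, 3)
print(encoded)

# Toy measurements, made up for illustration (not taken from the Iris data)
rows = [[4.0, 10.0], [6.0, 30.0], [5.0, 20.0]]
scaled = min_max_scale(rows)
print(scaled)
```

Each column is scaled independently, so features measured on different ranges end up on the same 0 to 1 scale before being fed to the network.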

Training the model. The configuration below uses 2 hidden layers of 20 perceptrons each, followed by a 3-output layer; a variant with 3 hidden layers of 20 performed the same.

In [4]:
%autoreload

TRAINING_LOOP = 10000
LEARNING_RATE = 0.8

X_train, X_test, Y_train, Y_test = generate_dataset()

INPUT_SIZE = len(X_train[0])
OUTPUT_SIZE = len(Y_train[0])

neural_network = NeuralNetworkZouk(
    INPUT_SIZE,
    [(20, 'relu'), (20, 'relu')], 
    (OUTPUT_SIZE, 'tanh'),
)

for i in range(TRAINING_LOOP):
    if (i%1000 == 0):
        print(f"Training loop: {i*100//TRAINING_LOOP}%")
    for inputs, target in zip(X_train, Y_train):
        neural_network.fit(inputs, target, LEARNING_RATE)
print(f"Training loop: 100%")

Because the raw output of the neural network is not directly interpretable, I built a translator between the raw output and the label.

To do so, we build a dictionary whose keys are the rounded outputs of the network.
We then make predictions on the labeled data and store each sample's label under the output it produced.
Each output type is matched to one label. A label can have several output types.
But if several labels correspond to the same output type, we select the label that has the most occurrences for this output type.

2 limitations of this system:
- If an output type corresponds almost equally often to different labels, we lose about 50% of the predictions for it (though one chance in two is still better than one chance in three)
- If the real data have different characteristics from the labeled data, the mapping may no longer give consistent predictions

In [178]:
def create_lookup_table(
    features,
    targets,
    model,
):
    translator = {}

    for inputs, target in zip(features, targets):
        prediction = model.predict(inputs)

        # Put to the closest int the outputs
        prediction = np.round(prediction)

        # Storing for the translator
        prediction_string = np.array2string(prediction)
        target_string = np.array2string(target)
        if prediction_string not in translator:
            translator[prediction_string] = [target_string]
        else:
            translator[prediction_string].append(target_string)        

    convertor = {}
    for type_prediction in translator:
        # If only one kind of output
        if len(list(set(translator[type_prediction]))) == 1:
            convertor[type_prediction] = translator[type_prediction][0]
        else:
            occurences = {}
            for label in list(set(translator[type_prediction])):
                occurences[label] = translator[type_prediction].count(label)
            print("Multi labels same output: ", occurences)
            convertor[type_prediction] = max(occurences, key=occurences.get)
    
    return convertor
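
As a side note, the same majority-vote idea can be sketched more compactly with `collections.Counter`. This is only an illustration under assumptions: `majority_lookup` is a hypothetical helper, it takes already computed outputs instead of a model, it uses tuples rather than strings as keys, and the outputs and labels below are made up.

```python
from collections import Counter, defaultdict


def majority_lookup(outputs, labels):
    """Map each rounded output pattern to the label seen most often with it."""
    seen = defaultdict(Counter)
    for output, label in zip(outputs, labels):
        # Round every output value to the closest int and use the pattern as key
        key = tuple(round(v) for v in output)
        seen[key][label] += 1
    # Keep only the most frequent label for each output pattern
    return {key: counts.most_common(1)[0][0] for key, counts in seen.items()}


# Made-up raw outputs and labels, for illustration only
outputs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.4], [0.2, 0.9]]
labels = ["setosa", "setosa", "virginica", "versicolor"]
table = majority_lookup(outputs, labels)
print(table)
```

Here the third sample lands on the same rounded pattern as the two setosa samples, so it is outvoted and would be translated as setosa, which is exactly the first limitation listed above.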

In [180]:
lookup_table = create_lookup_table(X_train, Y_train, neural_network)

correct_predictions = 0
for inputs, target in zip(X_test, Y_test):
    prediction = neural_network.predict(inputs)
    prediction = np.round(prediction)
    if np.array2string(prediction) in lookup_table:
        prediction_string = lookup_table[np.array2string(prediction)]
        target_string = np.array2string(target)
        if prediction_string == target_string:
            correct_predictions += 1
print(f"Prediction rate {correct_predictions*100//len(Y_test)}% | {correct_predictions}/{len(Y_test)}")

Multi labels same output:  {'[0.]': 4, '[0.5]': 36, '[1.]': 32}
Prediction rate 52% | 20/38


### New model - Iris

In [115]:
%autoreload

from sklearn import datasets
from sklearn.preprocessing import OneHotEncoder, StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split

def generate_dataset():
    # Load the dataset as an object
    iris = datasets.load_iris()
    # Take the features
    data_X = np.asarray(iris['data'])
    # Take the targets
    data_Y = np.asarray(iris['target'])
    
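    # Scale the features between 0 and 1
    scaler = MinMaxScaler()
    data_X = scaler.fit_transform(data_X)
    # Refit the same scaler on the targets: labels 0/1/2 become 0.0/0.5/1.0.
    # The returned scaler is therefore fitted on the targets (used later by inverse_transform)
    data_Y = data_Y.reshape(np.shape(data_Y)[0], 1)
    data_Y = scaler.fit_transform(data_Y)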
    
    # Split the data between a training and a testing part
    X_train, X_test, Y_train, Y_test = train_test_split(data_X, data_Y)    
    return X_train, X_test, Y_train, Y_test, scaler


TRAINING_LOOP = 10000
LEARNING_RATE = 0.1

X_train, X_test, Y_train, Y_test, scaler = generate_dataset()

INPUT_SIZE = len(X_train[0])
OUTPUT_SIZE = 3

neural_network = NeuralNetwork(
    INPUT_SIZE,
    [(20, 'sigmoid'), (20, 'sigmoid')],
    (OUTPUT_SIZE, 'sigmoid'),
)

for i in range(TRAINING_LOOP):
    if (i%1000 == 0):
        print(f"Training loop: {i*100//TRAINING_LOOP}%")
    for inputs, target in zip(X_train, Y_train):
        neural_network.fit(inputs, target[0], LEARNING_RATE)
print(f"Training loop: 100%")

Training loop: 0%
Training loop: 10%
Training loop: 20%
Training loop: 30%
Training loop: 40%
Training loop: 50%
Training loop: 60%
Training loop: 70%
Training loop: 80%
Training loop: 90%
Training loop: 100%


In [20]:
def do_look_up(features, targets, model):
    preds = [model.predict(inputs) for inputs in features]
    stats = {}
    for pred, target in zip(preds, targets):
        target_string = np.array2string(target)
        if target_string not in stats:
            stats[target_string] = [pred]
        else:
            stats[target_string].append(pred)
    for label in stats:
        values = np.asarray(stats[label])
        print(label, np.mean(values, axis=0), np.mean(values))
        print(label, np.min(values, axis=0), '\n', np.max(values, axis=0), '\n')
    pprint(stats)

In [117]:
def softmax(x):
    return np.exp(x) / np.sum(np.exp(x), axis=0)

predictions = [neural_network.predict(inputs) for inputs in X_test]

stats = {}
for pred, target in zip(predictions, scaler.inverse_transform(Y_test)):
    idx = np.argmax(softmax(pred))
    if idx not in stats:
        stats[idx] = [int(target[0])]
    else:
        stats[idx].append(int(target[0]))
  
occurences = {}
for idx in range(3):
    occurences[idx] = {}
    # An output index that was never predicted has no entry in stats
    for label in set(stats.get(idx, [])):
        occurences[idx][label] = stats[idx].count(label)

for key in occurences:
    total = sum(occurences[key].values())
    if total == 0:
        print(f"Label [{key}]: never predicted")
        continue
    # Count the majority label as the correct one for this output index
    correct = max(occurences[key].values())
    output = f"Label [{key}]: Success rate {correct*100/total}%"
    if correct != total:
        output += f", misclassification of {total-correct} items over {total}"
    print(output)

Label [0]: Success rate 100.0%
Label [1]: Success rate 100.0%
Label [2]: Success rate 75.0%, misclassification of 4 items over 16
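
A short note on the softmax/argmax step used in the cell above: softmax preserves the ordering of its inputs, so the index of the largest probability is the same as the index of the largest raw output. The sketch below illustrates this with a numerically stable softmax written in plain Python; the `argmax` helper and the sample values are assumptions made for the example, not part of the notebook's code.

```python
import math


def softmax(values):
    """Numerically stable softmax: shift by the max so exp() never overflows."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


raw = [0.2, 0.9, 0.4]  # made-up network outputs
probs = softmax(raw)
# Both calls pick the same index because softmax keeps the ordering
print(argmax(raw), argmax(probs))

# Large inputs that would overflow a naive exp() are handled as well
big = softmax([1000.0, 1001.0])
print(big)
```

Subtracting the maximum does not change the result, since the common factor cancels out in the ratio, but it keeps every exponent at or below zero.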
