# ELM Implementations on Multiple Datasets

## Extreme Learning Machine
The extreme learning machine (ELM), developed by Huang et al., is a feedforward neural network designed to achieve high training speeds. In its simplest form, it is a single-hidden-layer feedforward neural network (SLFN) whose hidden layer is random and fixed, so the model is linear only in the output weights it learns. It can be used for classification and regression, among other tasks.

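Training involves randomly initializing the input weights and biases, and then multiplying the Moore-Penrose pseudoinverse of the hidden-layer output matrix by the targets to get the output weights. There is no iterative optimization: the output weights are obtained in one step.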
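The procedure above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the notebook's implementation: the function names `elm_fit` and `elm_predict`, the sigmoid activation, the uniform weight range, and the toy regression target are all assumptions made for the example.

```python
import numpy as np

def elm_fit(X, y, n_hidden, rng):
    """Fit a single-hidden-layer ELM with a sigmoid activation (sketch)."""
    # Random input weights and biases: drawn once and never updated
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Hidden-layer output matrix H, one row per sample
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Least-squares output weights via the Moore-Penrose pseudoinverse
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy regression problem (assumed data, for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

W, b, beta = elm_fit(X, y, n_hidden=20, rng=rng)
train_mse = np.mean((elm_predict(X, W, b, beta) - y) ** 2)
print("beta shape:", beta.shape, "train MSE:", train_mse)
```

Only `beta` is learned here; `W` and `b` stay at their random initial values, which is what makes the training a single linear-algebra step.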

In this notebook, I train and evaluate multiple types of ELMs on different datasets as a means of familiarizing myself with the algorithms.

### Important Modules

In [46]:
import numpy as np
# from scipy import linalg
import csv
import random
import matplotlib.pyplot as plt
import pandas as pd

### Iris Dataset
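The first dataset I use is the Iris dataset, which consists of 3 classes. One class (Iris Setosa) is linearly separable from the other two, while the remaining two are not linearly separable from each other.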

There are 150 instances in the dataset (50 per class). Each instance has 4 features followed by a class label, as described below:

1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class: 
    - Iris Setosa - 1
    - Iris Versicolour - 2
    - Iris Virginica - 3
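As a small illustration of the layout listed above, the snippet below parses three sample rows in the `iris.data` format and applies the 1/2/3 class encoding. It is a sketch under stated assumptions: the in-memory sample stands in for the real file, and `label_map` is a hypothetical helper name, not something the notebook defines.

```python
import io
import numpy as np
import pandas as pd

# Three sample rows in the iris.data layout (illustrative, not the full file)
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)
df = pd.read_csv(sample, header=None)

# Hypothetical lookup table matching the 1/2/3 encoding listed above
label_map = {"Iris-setosa": 1, "Iris-versicolor": 2, "Iris-virginica": 3}

X = df.iloc[:, :4].to_numpy(dtype=np.float32)  # four measurements in cm
y = df.iloc[:, 4].map(label_map).to_numpy()    # integer class labels
print(X.shape, y.tolist())
```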

Armed with this information, we dive right into it. First, we load the dataset.

In [47]:
filename = 'Datasets/Iris/iris.data'
trainRatio = 0.02

# Loads the data given a filename
def loadData(filename):
    # Create the dataframe then convert it to a
    # numpy array
    data = pd.read_csv(filename, header=None)
    data = data.to_numpy(dtype=None, copy=False)
    
    return data

# Splits the data according to the given trainRatio
def splitData(data, trainRatio):
    # Find which classes are which and separate them
    setosa = data[np.where(data[:, -1] == 'Iris-setosa')]
    setosa[:,-1] = 1
    versicolour = data[np.where(data[:, -1] == 'Iris-versicolor')]
    versicolour[:,-1] = 2
    virginica = data[np.where(data[:, -1] == 'Iris-virginica')]
    virginica[:,-1] = 3

    # Shuffle the classed data
    np.random.shuffle(setosa)
    np.random.shuffle(versicolour)
    np.random.shuffle(virginica)

    # Take the first trainNum of each class
    trainNum = int(trainRatio*setosa.shape[0])
    trainSet = setosa[:trainNum,:]
    trainSet = np.concatenate((trainSet, versicolour[:trainNum,:]), axis=0)
    trainSet = np.concatenate((trainSet, virginica[:trainNum,:]), axis=0)

    # Take the rest as test data
    testSet = setosa[trainNum:,:]
    testSet = np.concatenate((testSet, versicolour[trainNum:,:]), axis=0)
    testSet = np.concatenate((testSet, virginica[trainNum:,:]), axis=0)

    # Shuffle the train and test sets
    np.random.shuffle(trainSet)
    np.random.shuffle(testSet)
    
    # Ensure the data is float32 to avoid indecipherable numpy errors
    trainSet = trainSet.astype(np.float32)
    testSet = testSet.astype(np.float32)

    return trainSet, testSet

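# Activation function
def activate(z, activation):
    if activation == 'sigmoid':
        return 1/(1+np.exp(-z))
    elif activation == 'relu':
        return np.maximum(0.0,z)
    else:
        # Fail loudly instead of silently returning None
        raise ValueError("Unknown activation: {}".format(activation))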

# Compute the accuracy of predictions as a fraction in [0, 1]
def accuracy(YPred, YTrue):
    if YPred.size != YTrue.size:
        raise ValueError("YPred and YTrue must have the same size")
    # Fraction of positions where prediction and label agree
    return np.mean(YPred.ravel() == YTrue.ravel())

# This function snaps each prediction that is within 0.5 of an
# integer class label to that label, and clamps out-of-range
# predictions to the nearest valid label. YPred is modified in place.
# Values exactly halfway between two labels (e.g. 1.5) are left unchanged.
def thresholdPreds(YPred, lowerLim, upperLim):
    for i in range(lowerLim, upperLim+1):
        YPred[np.abs(YPred-i) < 0.5] = i
    YPred[YPred < lowerLim] = lowerLim
    YPred[YPred > upperLim] = upperLim
    return YPred

# MAIN FUNCTION
def main():
    # Load the data
    data = loadData(filename)

    # Shuffle and split the data
    XTrain, XTest = splitData(data, trainRatio)
    print("trainSet Shape:", XTrain.shape)
    print("testSet Shape:", XTest.shape)

    # Separate the data into features and their labels
    YTrain = XTrain[:,-1]
    XTrain = XTrain[:,:-1]
    YTest = XTest[:,-1]
    XTest = XTest[:,:-1]

    # Now the data is prepped, we can train and 
    # test the single-layer ELM
    nHidden = 2
    activation = 'relu'
    np.random.seed(0)
    inputWeights = np.random.rand(XTrain.shape[1], nHidden)
    inputBias = np.random.rand(nHidden, 1)
    
    # Compute the forward calculation
    z = np.matmul(inputWeights.T, XTrain.T) + inputBias
    z = z.T

    # Activate the computation
    H = activate(z, activation)
    H = H.astype(np.double)
    print("H Shape:", H.shape)

    # Take the pseudoinverse of H and multiply
    # it by the labels
    Beta = np.matmul(np.linalg.pinv(H), YTrain)
    print("Beta Shape:", Beta.shape)

    # With this Beta, we should be able to carry out 
    # the classification task on the test data
    z = np.dot(inputWeights.T, XTest.T) + inputBias
    z = z.T
    H = activate(z, activation)

    YPred = np.matmul(H, Beta)

    # For multi-class classification, it is important that 
    # we threshold the data in some manner to select
    # its predicted label
    YPred = thresholdPreds(YPred, 1, 3)

    # Finally, we compute the prediction accuracy
    acc = accuracy(YPred, YTest)
    print('Prediction Accuracy:{:3.2%}'.format(acc))


In [48]:
if __name__ == '__main__':
    main()

trainSet Shape: (3, 5)
testSet Shape: (147, 5)
H Shape: (3, 2)
Beta Shape: (2,)
Prediction Accuracy:89.12%
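
The thresholding and accuracy steps in the cell above can also be expressed as round-then-clamp. The sketch below is one possible variant, not the notebook's method: `threshold_preds` and `label_accuracy` are hypothetical names, the sample values are made up, and `np.rint` rounds exact halves to the nearest even integer, so predictions such as 1.5 are handled differently from the loop-based version.

```python
import numpy as np

def threshold_preds(y_pred, lower, upper):
    # Round to the nearest integer label, then clamp into [lower, upper]
    return np.clip(np.rint(y_pred), lower, upper)

def label_accuracy(y_pred, y_true):
    # Fraction of positions where the labels agree
    return float(np.mean(y_pred == y_true))

# Made-up raw network outputs and true labels (illustration only)
raw = np.array([0.2, 1.4, 1.6, 2.49, 2.9, 3.7])
true = np.array([1.0, 1.0, 2.0, 3.0, 3.0, 3.0])

labels = threshold_preds(raw, 1, 3)
acc = label_accuracy(labels, true)
print(labels, acc)
```

Here the fourth prediction (2.49) rounds to class 2 while its true label is 3, so five of the six labels match.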
