# Exercise 1

In the tutorial you saw how to compute LDA for a two-class problem. In this exercise we will work on a multi-class problem, using the famous Iris dataset from the UCI Machine Learning Repository
(https://archive.ics.uci.edu/ml/datasets/Iris).

The Iris dataset contains measurements for 150 iris flowers from three different species.

The three classes in the Iris dataset:
1. Iris-setosa (n=50)
2. Iris-versicolor (n=50)
3. Iris-virginica (n=50)

The four features of the Iris dataset:
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
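
As a quick sanity check of the class and feature counts listed above, the sketch below loads the copy of the Iris data that ships with scikit-learn. Using `load_iris` instead of the UCI file is an assumption made here so the check runs offline; the bundled copy uses shorter class names and may differ from the UCI file in a couple of recorded values.

```python
import numpy as np
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled copy stands in for the UCI download.
iris = load_iris()
iris_X, iris_y = iris.data, iris.target

# Count how many samples fall into each of the three classes.
class_counts = dict(zip(iris.target_names.tolist(), np.bincount(iris_y).tolist()))

print(iris_X.shape)         # (number of flowers, number of features)
print(iris.feature_names)   # the four measurements, in cm
print(class_counts)
```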

<img src="iris_petal_sepal.png">



In [1]:
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
from statistics import mode
from collections import Counter 
import numpy as np
import seaborn as sns; sns.set();
import pandas as pd
from itertools import combinations 
import math
from sklearn.model_selection import train_test_split
from numpy import pi

### Importing the dataset

In [2]:
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
dataset = pd.read_csv(url, names=names)

dataset.tail()

Unnamed: 0,sepal-length,sepal-width,petal-length,petal-width,Class
145,6.7,3.0,5.2,2.3,Iris-virginica
146,6.3,2.5,5.0,1.9,Iris-virginica
147,6.5,3.0,5.2,2.0,Iris-virginica
148,6.2,3.4,5.4,2.3,Iris-virginica
149,5.9,3.0,5.1,1.8,Iris-virginica


In [3]:
dataset

Unnamed: 0,sepal-length,sepal-width,petal-length,petal-width,Class
0,5.1,3.5,1.4,0.2,Iris-setosa
1,4.9,3.0,1.4,0.2,Iris-setosa
2,4.7,3.2,1.3,0.2,Iris-setosa
3,4.6,3.1,1.5,0.2,Iris-setosa
4,5.0,3.6,1.4,0.2,Iris-setosa
...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,Iris-virginica
146,6.3,2.5,5.0,1.9,Iris-virginica
147,6.5,3.0,5.2,2.0,Iris-virginica
148,6.2,3.4,5.4,2.3,Iris-virginica


### Data preprocessing

Once the dataset is loaded into a pandas DataFrame, the first step is to separate the features from their labels, and then to split the result into training and test sets. The following code separates the features and the labels:

In [4]:
X = dataset.iloc[:, 0:4].values
y = dataset.iloc[:, 4].values

The script above assigns the first four columns of the dataset (the features) to `X`, and the values in the fifth column (the labels) to `y`.
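
To make the slicing concrete, here is a minimal sketch on a two-row frame built from rows 0 and 145 of the table shown earlier. The names `toy`, `toy_X` and `toy_y` are illustrative only and are not part of the exercise.

```python
import pandas as pd

# Two rows copied from the dataset preview, with the same column names.
toy = pd.DataFrame(
    [[5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
     [6.7, 3.0, 5.2, 2.3, "Iris-virginica"]],
    columns=["sepal-length", "sepal-width", "petal-length", "petal-width", "Class"],
)

toy_X = toy.iloc[:, 0:4].values   # columns 0..3; the stop index 4 is excluded
toy_y = toy.iloc[:, 4].values     # column 4 only, returned as a 1-D array

print(toy_X.shape, toy_y.shape)
```

Note that `toy_y` is one-dimensional at this point, which is why the notebook later reshapes the labels into a column.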

The following code divides data into training and test sets:

In [5]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

y_train = np.reshape(y_train, ((-1, 1)))
y_test = np.reshape(y_test, ((-1, 1)))
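
The sketch below illustrates what this split produces in terms of shapes, using placeholder data of the same size as Iris (150 rows, 4 columns, three classes of 50). The arrays and the class names `a`, `b`, `c` are made up for illustration. It also prints the class counts in the test split: without the `stratify` argument, `train_test_split` does not guarantee that the classes stay balanced.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same layout as Iris (hypothetical values).
demo_X = np.arange(150 * 4, dtype=float).reshape(150, 4)
demo_y = np.repeat(["a", "b", "c"], 50)

Xtr, Xte, ytr, yte = train_test_split(demo_X, demo_y, test_size=0.2, random_state=0)

# Reshape the labels into column vectors, as in the cell above.
ytr_col = ytr.reshape(-1, 1)
yte_col = yte.reshape(-1, 1)
print(Xtr.shape, Xte.shape, ytr_col.shape, yte_col.shape)

# Class counts in the test split; these need not be equal.
labels, counts = np.unique(yte, return_counts=True)
print(dict(zip(labels.tolist(), counts.tolist())))
```

If balanced splits matter, passing `stratify=demo_y` is one option to consider.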

#### Feature Scaling

We will also scale the features as part of data preprocessing. For this task, we will use scikit-learn's `StandardScaler`.

In [6]:
from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
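
The scaler is fitted on the training data only, and the test data is then transformed with the training mean and standard deviation. The following sketch, on randomly generated placeholder data (an assumption, not the Iris values), checks that `transform` is equivalent to subtracting the training mean and dividing by the training standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data standing in for the training and test features.
rng = np.random.default_rng(0)
train_part = rng.normal(loc=5.0, scale=2.0, size=(120, 4))
test_part = rng.normal(loc=5.0, scale=2.0, size=(30, 4))

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_part)   # learns mean/std from the training data
test_scaled = scaler.transform(test_part)         # reuses the training statistics

# Manual equivalent: StandardScaler uses the population standard deviation (ddof=0).
manual = (test_part - train_part.mean(axis=0)) / train_part.std(axis=0)

print(np.allclose(test_scaled, manual))
print(train_scaled.mean(axis=0).round(6), train_scaled.std(axis=0).round(6))
```

The scaled test data will generally not have exactly zero mean or unit variance, because its statistics were never used for fitting.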

## Write your code below

Write your code below to apply LDA to the Iris dataset and compute the overall accuracy of the classifier.
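
For comparison with a hand-written solution, a reference accuracy can be obtained from scikit-learn's own `LinearDiscriminantAnalysis`. This is only a baseline sketch, not the intended solution to the exercise: it assumes the bundled `load_iris` copy of the data rather than the UCI download, and all `ref_*` names are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: bundled Iris copy, same split settings as the notebook.
ref_X, ref_y = load_iris(return_X_y=True)
ref_X_train, ref_X_test, ref_y_train, ref_y_test = train_test_split(
    ref_X, ref_y, test_size=0.2, random_state=0
)

# Scale with statistics learned from the training split only.
ref_scaler = StandardScaler().fit(ref_X_train)

# Multi-class LDA handles all three classes in a single model.
ref_lda = LinearDiscriminantAnalysis()
ref_lda.fit(ref_scaler.transform(ref_X_train), ref_y_train)

ref_pred = ref_lda.predict(ref_scaler.transform(ref_X_test))
ref_accuracy = accuracy_score(ref_y_test, ref_pred)
print(ref_accuracy)
```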

In [12]:
### WRITE YOUR CODE HERE ####

# Calculating covariance of an input matrix
def calc_cov_matrix(X_input):
    n_samples = np.shape(X_input)[0]
    cov_matrix = np.array((1 / (n_samples-1)) * (X_input - X_input.mean(axis=0)).T.dot(X_input - X_input.mean(axis=0)))
    return cov_matrix

def train(X_train, y_train):

    """Train method for LDA.

    Parameters
    -----------
    X_train: ndarray (num_examples(rows) vs num_features(columns))
    Input dataset which LDA will use to obtain optimal weights during training

    y_train: ndarray (num_examples(rows) vs class_labels(columns))
    """

    # Collecting all class 0 and class 1 into separate variables
    class_X0 = X_train[np.argwhere(y_train == 0)[:, 0]]
    class_X1 = X_train[np.argwhere(y_train == 1)[:, 0]]

    # Getting number of examples in each class
    num_class_X0_samples = np.shape(class_X0)[0]
    num_class_X1_samples = np.shape(class_X1)[0]

    # Computing class mean for each label and calculating the difference between them.
    class_X0_mean = class_X0.mean(0)
    class_X1_mean = class_X1.mean(0)
    class_mean_diff = class_X1_mean - class_X0_mean
    class_mean_diff = class_mean_diff.reshape((-1, 1))
    SB = np.dot(class_mean_diff, class_mean_diff.T)

    # Calculating covariance matrix
    cov_mat_class_X0 = calc_cov_matrix(class_X0)
    cov_mat_class_X1 = calc_cov_matrix(class_X1)
    #SW = num_class_X0_samples * cov_mat_class_X0 + num_class_X0_samples * cov_mat_class_X1
    SW = cov_mat_class_X0 + cov_mat_class_X1

    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(SW).dot(SB))

    # Getting the eigenvectors with the maximum eigenvalue.
    idx = eigvals.argsort()[::-1]
    eigvals = eigvals[idx][:1]
    weights = np.atleast_1d(eigvecs[:, idx])[:, :1]

    return weights
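
For two classes, the between-class scatter `SB` has rank one, so the leading eigenvector of `pinv(SW) · SB` should point along `pinv(SW) · (mean_1 - mean_0)`. The sketch below checks this numerically on synthetic data; the data and all `*_demo` names are assumptions made for illustration.

```python
import numpy as np

# Two synthetic Gaussian classes with different means (hypothetical data).
rng = np.random.default_rng(0)
demo_X0 = rng.normal(loc=[0.0, 0.0, 0.0], scale=1.0, size=(60, 3))
demo_X1 = rng.normal(loc=[2.0, 1.0, -1.0], scale=1.0, size=(60, 3))

# Between-class scatter from the difference of class means.
mean_diff = (demo_X1.mean(axis=0) - demo_X0.mean(axis=0)).reshape(-1, 1)
SB_demo = mean_diff @ mean_diff.T

# Within-class scatter as the sum of the per-class sample covariances (ddof=1).
SW_demo = np.cov(demo_X0, rowvar=False) + np.cov(demo_X1, rowvar=False)

# Route 1: leading eigenvector of pinv(SW) @ SB.
eigvals_demo, eigvecs_demo = np.linalg.eig(np.linalg.pinv(SW_demo) @ SB_demo)
w_eig = np.real(eigvecs_demo[:, np.argmax(np.real(eigvals_demo))])

# Route 2: closed-form direction pinv(SW) @ (mean_1 - mean_0).
w_closed = (np.linalg.pinv(SW_demo) @ mean_diff).ravel()

# The two directions agree up to sign and scale, so |cosine| should be close to 1.
cosine = w_eig @ w_closed / (np.linalg.norm(w_eig) * np.linalg.norm(w_closed))
print(abs(cosine))
```

The sign of an eigenvector returned by `np.linalg.eig` is arbitrary. Any decision rule that thresholds the projection at a fixed value therefore depends on which sign happens to be returned; comparing a projected sample against the projected class means is one way to avoid that dependence.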

In [59]:
def train_one_vs_all(X_train, Y_train, base_weights, num_epochs=1000, learning_rate=0.1):
    n_samples, n_features = np.shape(X_train)
    history_weights = []
    epoch = 1
    classes = np.unique(Y_train)
    print("-------------")
    print("Creating One vs Rest Classifier....")
    print("Number of Classifiers needed: (N) =", len(classes))
    print("-------------")
    # Train one binary LDA classifier per class (closed form, no gradient descent)
    for k in classes:
        # one-vs-all binary classifier: class k is labelled 1, all other classes 0
        binary_y_train = np.where(Y_train == k, 1, 0)
        trained = train(X_train, binary_y_train)
        history_weights.append(trained)
    print()
    print("Training Complete!")
    print()
    return history_weights   

def getPredictionOvA(X_test, Y_test, trained_weights):
    
    num_test_samples = np.shape(Y_test)[0]
    y_test_predicted = np.dot(X_test, trained_weights)
    y_test_predicted = y_test_predicted.reshape((-1, 1))
    return y_test_predicted.astype(np.uint8)

# Compute the accuracy of training data and validation data
def predict_one_vs_all(X_input, Y_input, trained_weights):
    num_classes = np.unique(Y_input)
    scores = np.zeros((len(num_classes), (X_input.shape)[0]))
    for k in range(len(num_classes)):
        binary_y_input = np.where(Y_input == num_classes[k], 1, 0)
        individual_scores = getPredictionOvA(X_input, Y_input, trained_weights[k])
        individual_scores = individual_scores.reshape((-1,))
        
        scores[k, :] = individual_scores
    print(scores)  
    pred_X = np.argmax(scores, axis=0)
    predictions = pred_X.reshape((-1, 1))
    Y = np.zeros(Y_input.shape)
    for k in range(len(num_classes)):
        Y[np.where(Y_input==num_classes[k])] = k
    print("Y:", Y)
    print()
    print("predictions:", predictions)
    acc = getAccuracy(predictions, Y)
    return pred_X, acc
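
The one-vs-rest idea used above can be illustrated in isolation: each class gets its own binary target vector, and a sample is assigned to the class whose classifier gives the highest score. In this sketch the score matrix is hypothetical and hand-written, not the output of a trained model.

```python
import numpy as np

demo_labels = np.array(
    [["Iris-setosa"], ["Iris-virginica"], ["Iris-versicolor"], ["Iris-setosa"]]
)
demo_classes = np.unique(demo_labels)   # sorted alphabetically

# One binary target column per class: 1 for "this class", 0 for "the rest".
binary_targets = {str(k): np.where(demo_labels == k, 1, 0) for k in demo_classes}

# Hypothetical per-class scores: one row per class, one column per sample.
demo_scores = np.array([
    [ 1.2, -0.7, -0.1,  0.9],   # setosa-vs-rest
    [-0.3, -0.2,  0.8, -0.5],   # versicolor-vs-rest
    [-1.1,  1.5,  0.1, -0.8],   # virginica-vs-rest
])

# Pick, for each sample, the class whose classifier scored highest.
predicted_idx = np.argmax(demo_scores, axis=0)
predicted_labels = demo_classes[predicted_idx]
print(predicted_labels)
```

Signed scores are best kept as floating-point values until after the `argmax`: casting negative values to an unsigned integer type can wrap around or behave in platform-dependent ways, which changes the ranking.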

In [60]:
trained_weights_ova = train_OvO(X_train, y_train)

res, acc= predict_one_vs_all(X_test, y_test, trained_weights_ova)

-------------
Creating One vs One Classifier....
Number of Classifiers needed: (N * (N-1)/2) = 3
-------------
Classifier Number:  1
Training Classifier to Classify:  ('Iris-setosa', 'Iris-versicolor')
Classifier Number:  2
Training Classifier to Classify:  ('Iris-setosa', 'Iris-virginica')
Classifier Number:  3
Training Classifier to Classify:  ('Iris-versicolor', 'Iris-virginica')

Training Complete!

[[  1.   0. 254.   1. 255.   1. 255.   0.   0.   0.   1.   0.   0.   0.
    0. 255.   0.   0. 255. 255.   1.   0. 255. 255.   0. 254. 255.   0.
    0. 255.]
 [  1.   0. 255.   1. 255.   1. 255.   0.   0.   0.   0.   0.   0.   0.
    0. 255.   0.   0. 255. 255.   1.   0. 255. 255.   0. 255. 255.   0.
    0. 255.]
 [255.   0.   2. 255.   1. 255.   1.   0.   0.   0.   0.   0.   0.   0.
    0.   1.   0.   0.   1.   1. 255.   0.   1.   1.   0.   1.   1.   0.
    0.   1.]]
Y: [[2.]
 [1.]
 [0.]
 [2.]
 [0.]
 [2.]
 [0.]
 [1.]
 [1.]
 [1.]
 [2.]
 [1.]
 [1.]
 [1.]
 [1.]
 [0.]
 [1.]
 [1.]
 [0.]
 [0.

In [61]:
acc

0.43333333333333335

In [16]:
# Function that splits the dataset into subsets and returns K(K-1)/2 classifier weights
def train_OvO(X_train, Y_train):
    a = np.hstack((X_train, Y_train))
    x = list(combinations(np.unique(Y_train), 2))
    N = len(np.unique(Y_train))
    comb = 0
    print("-------------")
    print("Creating One vs One Classifier....")
    print("Number of Classifiers needed: (N * (N-1)/2) =", int((N * (N-1)/2)))
    print("-------------")
    trained_weights = []
    for c in x:
        print("Classifier Number: ", comb + 1)
        print("Training Classifier to Classify: ", c)
        # Generating Subsets of Data by choosing combinations of two Classes
        p_x = X_train[np.where(a[:,-1] == c[0])]
        q_x = X_train[np.where(a[:,-1] == c[1])]
        p_y = Y_train[np.where(a[:,-1] == c[0])]
        q_y = Y_train[np.where(a[:,-1] == c[1])]
        data_x = np.concatenate((p_x, q_x))
        data_y = np.concatenate((p_y, q_y))
        # Obtain the binary representation of the classes
        binary_y_train = np.where(data_y == c[1], 1, 0)
        # Train the binary classifier
        trained = train(data_x, binary_y_train)
        # Store the classifier's weights in a list
        trained_weights.append(trained)
        comb += 1
    print()
    print("Training Complete!")
    print()
    return(trained_weights)

def getPrediction(X_test, Y_test, trained_weights):
    
    num_test_samples = np.shape(Y_test)[0]
    y_test_predicted = np.dot(X_test, trained_weights)
    y_test_predicted[y_test_predicted >= 0] = 1
    y_test_predicted[y_test_predicted < 0] = 0

    y_test_predicted = y_test_predicted.reshape((-1, 1))
    return y_test_predicted.astype(np.uint8)

def getAccuracy(y_test_predicted, Y_test):
    num_test_samples = np.shape(Y_test)[0]
    miscls_test_points = np.unique(np.argwhere(y_test_predicted != Y_test)[:, 0])
    accuracy = 1-(len(miscls_test_points)/num_test_samples)
    return accuracy

def most_frequent(lab): 
    lab = list(lab)
    occurrence_count = Counter(lab)
    return occurrence_count.most_common(1)[0][0]

def OvO_Pred(X, Y, weights):
    # Generate Combinations
    c = list(combinations(np.unique(Y), 2))
    # Create an array to store the predicted values
    pred_array = []
    for i in range(np.array(weights).shape[0]):
        # Predict values based on individual classifier
        pred = getPrediction(X, Y, weights[i])
        # Obtain labels of the classified result
        labels = np.where(pred == 1, c[i][1], c[i][0])
        # Place the result in the result array
        pred_array.append(labels)

    pred_array = np.array(pred_array)
    pred_array = np.hstack(pred_array)
    res = [0]*Y.shape[0]
    # Vote to get the most frequently occurring label
    for i in range(pred_array.shape[0]):
        res[i] = most_frequent(pred_array[i])
    res = np.reshape(np.array(res), (Y.shape))
    # Obtain accuracy of the model
    acc = getAccuracy(res, Y)
    return np.array(pred_array), acc
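
The two building blocks of the one-vs-one scheme, enumerating class pairs and taking a majority vote, can be seen on a small example. The vote table below is hypothetical; in the notebook the votes come from the three pairwise LDA classifiers.

```python
from collections import Counter
from itertools import combinations

import numpy as np

demo_classes = np.array(["Iris-setosa", "Iris-versicolor", "Iris-virginica"])

# With K classes there are K * (K - 1) / 2 unordered pairs, here 3.
pairs = list(combinations(demo_classes.tolist(), 2))
print(pairs)

# Hypothetical votes: one row per sample, one column per pairwise classifier,
# in the same order as `pairs`.
votes = np.array([
    ["Iris-setosa", "Iris-setosa", "Iris-versicolor"],
    ["Iris-versicolor", "Iris-virginica", "Iris-virginica"],
    ["Iris-versicolor", "Iris-setosa", "Iris-versicolor"],
])

# Majority vote per sample.
winners = [Counter(row.tolist()).most_common(1)[0][0] for row in votes]
print(winners)
```

With three classes, each class can receive exactly one vote. In that case `most_common` returns the label that was encountered first, so ties are effectively broken by classifier order.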

In [17]:
trained_weights = train_OvO(X_train, y_train)

-------------
Creating One vs One Classifier....
Number of Classifiers needed: (N * (N-1)/2) = 3
-------------
Classifier Number:  1
Training Classifier to Classify:  ('Iris-setosa', 'Iris-versicolor')
Classifier Number:  2
Training Classifier to Classify:  ('Iris-setosa', 'Iris-virginica')
Classifier Number:  3
Training Classifier to Classify:  ('Iris-versicolor', 'Iris-virginica')

Training Complete!



In [18]:
y_train.shape

(120, 1)

In [19]:
res, acc= OvO_Pred(X_test, y_test, trained_weights)

In [20]:
acc

0.7666666666666666
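
With 30 test samples (20% of 150), an accuracy of about 0.767 corresponds to 23 correct predictions out of 30. The sketch below reproduces that arithmetic with a vectorised accuracy helper; `accuracy_from_labels` is a hypothetical name introduced here, and the labels are made up so that exactly 7 of 30 predictions are wrong.

```python
import numpy as np

def accuracy_from_labels(y_pred, y_true):
    """Fraction of positions where the predicted label equals the true label."""
    y_pred = np.asarray(y_pred).ravel()
    y_true = np.asarray(y_true).ravel()
    return float(np.mean(y_pred == y_true))

# Hypothetical labels chosen so that 7 of 30 predictions are wrong.
demo_true = np.zeros(30, dtype=int)
demo_pred = demo_true.copy()
demo_pred[:7] = 1

demo_acc = accuracy_from_labels(demo_pred, demo_true)
print(demo_acc)
```

Flattening both arrays before comparing avoids accidental broadcasting between a column vector of shape `(n, 1)` and a flat array of shape `(n,)`.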

In [21]:
x = np.unique(y_train)
x[0]

'Iris-setosa'