## TP2 : Classification using Linear & Quadratic Discriminant Analysis

First, configure your notebook:

In [13]:
import csv
# import os
import matplotlib.pyplot as plt  # explicit import instead of the pylab wildcard
import numpy as np
from numpy import linalg as la


## Reading synthetic data
Load the training and test data sets `synth_train.txt` and
`synth_test.txt` already used for k-NN. Targets belong to {1,2} and inputs belong to R^2.
There are 100 training samples and 200 test samples.

* the 1st column contains the class label of each sample,
* columns 2 & 3 contain the coordinates of each sample in 2D.

In [14]:
train = np.loadtxt('synth_train.txt')

test = np.loadtxt('synth_test.txt')

In [15]:
print("train :",train[:3])
print("test :",test[:3])
print("len(train) :",len(train))
print("len(test) :",len(test))

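X_train = train[:,1:]   # coordinates, shape (100, 2)
Y_train = train[:,0]    # labels in {1,2}, 1-D array of shape (100,)
X_test = test[:,1:]     # coordinates, shape (200, 2)
Y_test = test[:,0]      # labels in {1,2}, 1-D array of shape (200,)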

train : [[ 2.         -0.72221141  2.00447087]
 [ 2.         -0.92467912  0.48366928]
 [ 2.         -0.76602281  0.79432891]]
test : [[ 2.          0.54837733  1.2213453 ]
 [ 2.         -0.51618236  1.56239592]
 [ 2.         -0.92877833  0.92107217]]
len(train) : 100
len(test) : 200


## Recall: the main steps of discriminant analysis
* estimation of weights `pi_1` and `pi_2` for each class,
* estimation of empirical means `mu_1` and `mu_2` for each class, 
* estimation of empirical covariance matrices  `sigma_1` and `sigma_2`,
* computation of the common covariance `sigma` (average of the within-class covariances, weighted by class size),
* computation of log-probabilities of belonging to each class,
* decision of classification,
* display results.
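
For reference, the steps above correspond to the standard formulas below. This is a summary sketch using the maximum-likelihood normalization (division by $N_k$); the symbols $x_n$, $y_n$, $N_k$ and $\delta_k$ are notation introduced here for convenience.

$$
\hat\pi_k = \frac{N_k}{N}, \qquad
\hat\mu_k = \frac{1}{N_k}\sum_{n:\,y_n=k} x_n, \qquad
\hat\Sigma_k = \frac{1}{N_k}\sum_{n:\,y_n=k} (x_n-\hat\mu_k)(x_n-\hat\mu_k)^\top, \qquad
\hat\Sigma = \frac{N_1\hat\Sigma_1 + N_2\hat\Sigma_2}{N}
$$

$$
\delta_k^{\mathrm{LDA}}(x) = \log\hat\pi_k + x^\top\hat\Sigma^{-1}\hat\mu_k - \tfrac{1}{2}\,\hat\mu_k^\top\hat\Sigma^{-1}\hat\mu_k
$$

$$
\delta_k^{\mathrm{QDA}}(x) = \log\hat\pi_k - \tfrac{1}{2}\log\det\hat\Sigma_k - \tfrac{1}{2}\,(x-\hat\mu_k)^\top\hat\Sigma_k^{-1}(x-\hat\mu_k)
$$

In both cases a sample $x$ is assigned to the class $k$ with the largest score $\delta_k(x)$.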


In [16]:
# Class sizes and prior weights
N1,N2 = np.count_nonzero(train[:,0] == 1),np.count_nonzero(train[:,0] == 2)
N = len(train)

pi_1 = N1/N
pi_2 = N2/N

# Coordinates of the samples of each class and their empirical means
x_1 = train[train[:,0]==1, 1:]
x_2 = train[train[:,0]==2, 1:]
mu_1 = x_1.mean(axis=0)
mu_2 = x_2.mean(axis=0)

x1_centered = x_1-mu_1
x2_centered = x_2-mu_2

# Empirical covariance of each class, then their average weighted by class size
sigma_1 = x1_centered.T@x1_centered/N1
sigma_2 = x2_centered.T@x2_centered/N2
sigma = (N1*sigma_1+N2*sigma_2)/N

x1_centered

array([[-0.20940505, -0.78685728],
       [-0.68200519, -0.23882278],
       [ 0.55219749, -0.12594765],
       [-1.1205768 , -0.40585538],
       [-0.46047791, -0.26455755],
       [-0.68215745, -1.49359749],
       [-0.13349225, -0.85424314],
       [-0.08275763,  0.4994529 ],
       [-0.4841993 , -0.56687235],
       [ 0.96101464,  0.19597001],
       [-0.00572667,  0.2054343 ],
       [ 1.5413662 ,  1.57186913],
       [ 0.10816106,  0.04652453],
       [ 0.11153886,  1.08562931],
       [-0.22370397, -0.07382386],
       [-0.26976525,  0.23590783],
       [ 1.50095974,  1.07973423],
       [ 0.48988385,  0.45551149],
       [ 0.53215453,  0.02240428],
       [-1.05697158, -0.73664289],
       [ 0.49474016, -0.44983661],
       [-0.88077749,  0.59861899]])

## TO DO : linear & quadratic discriminant analysis (LDA & QDA)
1. Implement a classifier using LDA of the data set. 
2. Then implement QDA classification.
3. In each case (LDA & QDA) show the decision boundary and
compute the error rate on both the training set and the test set. 
4. Compare and comment on your results with LDA and QDA.
5. You may also compare your results to K nearest neighbours.
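
One possible way to organize the work above is sketched below, under stated assumptions: the helper names `fit_gaussian_classes` and `predict` are hypothetical, and the data are synthetic stand-ins rather than `synth_train.txt`. Passing a pooled covariance gives the LDA rule, omitting it gives the QDA rule.

```python
import numpy as np

def fit_gaussian_classes(X, y):
    """Estimate (prior, mean, covariance) per class and the pooled covariance."""
    params = {}
    pooled = np.zeros((X.shape[1], X.shape[1]))
    for k in np.unique(y):
        Xk = X[y == k]                          # samples of class k, shape (N_k, d)
        mu = Xk.mean(axis=0)                    # shape (d,)
        centered = Xk - mu
        cov = centered.T @ centered / len(Xk)   # shape (d, d)
        params[k] = (len(Xk) / len(X), mu, cov)
        pooled += len(Xk) * cov
    return params, pooled / len(X)

def predict(X, params, pooled=None):
    """LDA rule if a pooled covariance is given, QDA rule otherwise."""
    labels = np.array(list(params))
    scores = []
    for prior, mu, cov in params.values():
        S = cov if pooled is None else pooled
        d = X - mu
        # Mahalanobis term (x - mu)^T S^-1 (x - mu) for every row of X at once
        maha = np.einsum('ij,ij->i', d @ np.linalg.inv(S), d)
        # With a shared S, the log-det and x^T S^-1 x terms are identical for all
        # classes, so the argmax below coincides with the linear (LDA) rule.
        scores.append(np.log(prior) - 0.5 * np.log(np.linalg.det(S)) - 0.5 * maha)
    return labels[np.argmax(np.vstack(scores), axis=0)]

# Synthetic stand-in data (assumption): two well-separated Gaussian clouds
rng = np.random.default_rng(0)
X_demo = np.vstack((rng.normal([0, 0], 0.5, (50, 2)),
                    rng.normal([3, 3], 0.5, (50, 2))))
y_demo = np.repeat([1, 2], 50)

params, pooled = fit_gaussian_classes(X_demo, y_demo)
err_lda = np.mean(predict(X_demo, params, pooled) != y_demo)
err_qda = np.mean(predict(X_demo, params) != y_demo)
print("LDA error rate:", err_lda, "QDA error rate:", err_qda)
```

The same two calls can be applied to the training set and to the test set to obtain the error rates asked for in step 3.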

_Hint 1: the matrices `sigma` are of size 2x2.
More generally, pay attention to the shapes of the vectors and matrices you
manipulate._

_Hint 2: to display the decision regions, you may use:_


In [17]:
Nx1=100 # number of samples for display
Nx2=100
x1=np.linspace(-2.5,1.5,Nx1)  # sampling of the x1 axis 
x2=np.linspace(-0.5,3.5,Nx2)  # sampling of the x2 axis
[X1,X2]=np.meshgrid(x1,x2)  
x=np.column_stack((X1.flatten('F'),X2.flatten('F'))) # coordinates of the grid points, one point per row: shape (Nx1*Nx2, 2)
#N = size(x,axis=0)

# Then compute the predicted class class_L for each pair (X1,X2)
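
The shape bookkeeping between the grid and the list of points can be checked on a small example. The sketch below uses a deliberately tiny grid and a hypothetical stand-in classifier (class 1 to the left of the line x1 = 0); only the flatten and reshape round trip matters here.

```python
import numpy as np

x1 = np.linspace(-2.5, 1.5, 5)
x2 = np.linspace(-0.5, 3.5, 4)
X1, X2 = np.meshgrid(x1, x2)            # both of shape (4, 5)

# One grid point per row, shape (20, 2)
grid = np.column_stack((X1.flatten('F'), X2.flatten('F')))

# Hypothetical classifier, for illustration only: class 1 where x1 < 0
labels = np.where(grid[:, 0] < 0, 1, 2)

# Back to the grid layout; order='F' must match flatten('F')
class_L = labels.reshape(X1.shape, order='F')
print(class_L)
```

Reshaping with the default order after flattening with `'F'` would silently scramble the image, so the two orders have to agree.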


In [18]:
#LDA classification 
sigmaI=np.linalg.inv(sigma)
m1,m2=np.ravel(mu_1),np.ravel(mu_2)  # class means as 1-D vectors

# Linear score of class k: log(pi_k) + x^T Sigma^-1 mu_k - 0.5 mu_k^T Sigma^-1 mu_k
def y1(x):
    return np.log(pi_1)+x@sigmaI@m1-0.5*m1@sigmaI@m1
def y2(x):
    return np.log(pi_2)+x@sigmaI@m2-0.5*m2@sigmaI@m2

N_test=len(Y_test)
Y_predictLDA=[]
for sample in X_test:
    # assign the class with the larger score
    if y1(sample)>y2(sample):
        Y_predictLDA.append(1)
    else:
        Y_predictLDA.append(2)

# error rate = fraction of misclassified test samples
errorLDA=np.mean(np.array(Y_predictLDA)!=np.ravel(Y_test))
print("LDA test error rate :",errorLDA)


In [19]:
#QDA classification 
sigma_1I=np.linalg.inv(sigma_1)
sigma_2I=np.linalg.inv(sigma_2)
m1,m2=np.ravel(mu_1),np.ravel(mu_2)  # class means as 1-D vectors

# Quadratic score of class k:
# log(pi_k) - 0.5 log det(Sigma_k) - 0.5 (x-mu_k)^T Sigma_k^-1 (x-mu_k)
def y1(x):
    d=x-m1
    return np.log(pi_1)-0.5*np.log(np.linalg.det(sigma_1))-0.5*d@sigma_1I@d
def y2(x):
    d=x-m2
    return np.log(pi_2)-0.5*np.log(np.linalg.det(sigma_2))-0.5*d@sigma_2I@d

N_test=len(Y_test)
Y_predictQDA=[]
for sample in X_test:
    # assign the class with the larger score
    if y1(sample)>y2(sample):
        Y_predictQDA.append(1)
    else:
        Y_predictQDA.append(2)

# error rate = fraction of misclassified test samples
errorQDA=np.mean(np.array(Y_predictQDA)!=np.ravel(Y_test))
print("QDA test error rate :",errorQDA)


In [9]:
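# For graphical representation, use code such as the following (LDA regions shown here):
grid = np.column_stack((X1.ravel(), X2.ravel()))    # one grid point per row
w = np.linalg.solve(sigma, (mu_1 - mu_2).ravel())   # normal vector of the LDA boundary
b = np.log(pi_1/pi_2) - 0.5*(mu_1 + mu_2).ravel() @ w
class_L = np.where(grid @ w + b > 0, 1, 2).reshape(X1.shape)  # predicted label at each grid point
plt.imshow(class_L, origin='lower', extent = (np.min(x1),np.max(x1),np.min(x2),np.max(x2)) )
plt.show()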

## TO DO : LDA & QDA using scikit-learn module

The `scikit-learn` module is dedicated to machine learning algorithms, many of which are available through a simple interface. For LDA and QDA, have a look at the tutorial available at http://scikit-learn.org/stable/modules/lda_qda.html 

**Warning**: take a critical look at the way LDA and QDA are illustrated in the proposed example...
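
As a starting point, a minimal sketch of the scikit-learn route could look as follows. The data are synthetic stand-ins generated by a hypothetical `make_data` helper (an assumption); in the notebook they would be replaced by `X_train`, `Y_train`, `X_test` and `Y_test`.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

rng = np.random.default_rng(0)

def make_data(n):
    """Hypothetical stand-in data: two unit-variance Gaussian clouds, n samples each."""
    X = np.vstack((rng.normal([0, 0], 1.0, (n, 2)),
                   rng.normal([2, 2], 1.0, (n, 2))))
    y = np.repeat([1, 2], n)
    return X, y

X_tr, y_tr = make_data(50)
X_te, y_te = make_data(100)

results = {}
for name, model in [("LDA", LinearDiscriminantAnalysis()),
                    ("QDA", QuadraticDiscriminantAnalysis())]:
    model.fit(X_tr, y_tr)
    # score() returns the mean accuracy, so the error rate is 1 - score
    results[name] = (1 - model.score(X_tr, y_tr), 1 - model.score(X_te, y_te))
    print(name, "train error:", results[name][0], "test error:", results[name][1])
```

The library's covariance estimates may use a different normalization than the hand-written version, so small differences in the error rates would not necessarily indicate a bug.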


