# LAB 5
In the first part of this laboratory we will solve the IRIS classification task using Gaussian classifiers.

In [78]:
import sklearn.datasets
import numpy as np
import matplotlib.pyplot as plt

In [79]:
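# Load the dataset once and reuse it for both data and labels
iris = sklearn.datasets.load_iris()
# D is a 4 x 150 matrix (one sample per column), L holds the 150 class labels
D = iris["data"].T
L = iris["target"]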


In the following examples we use the dataset as returned by load_iris.
We split the dataset into two parts:
- the first part is used for model training
- the second part is used for evaluation (validation set).

We use 100 samples for training and 50 samples for evaluation.

In [80]:
seed = 0
nTrain = int(D.shape[1] * 2.0 / 3.0)
np.random.seed(seed)
idx = np.random.permutation(D.shape[1])

idxTrain = idx[0:nTrain]
idxTest = idx[nTrain:]

# DTR and LTR are training data and labels
DTR = D[:, idxTrain]
LTR = L[idxTrain]
# DTE and LTE are evaluation data and labels
DTE = D[:, idxTest]
LTE = L[idxTest]

# print(DTR.shape,LTR.shape)
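
The split above can also be wrapped in a small reusable function. The sketch below is only an illustration: the name `split_db_2to1` and the stand-in arrays `D_demo` and `L_demo` are assumptions introduced here, not part of the lab code, and the stand-in data only mimics the IRIS layout (4 features, 150 samples) so that the block runs without downloading anything.

```python
import numpy as np

def split_db_2to1(D, L, seed=0):
    """Randomly split (D, L) into 2/3 training and 1/3 evaluation data.

    D is expected to hold one sample per column, as in the cells above.
    """
    nTrain = int(D.shape[1] * 2.0 / 3.0)
    np.random.seed(seed)
    idx = np.random.permutation(D.shape[1])
    idxTrain, idxTest = idx[:nTrain], idx[nTrain:]
    return (D[:, idxTrain], L[idxTrain]), (D[:, idxTest], L[idxTest])

# Stand-in data with the IRIS layout: 4 features x 150 samples, 3 classes
D_demo = np.arange(4 * 150, dtype=float).reshape(4, 150)
L_demo = np.repeat([0, 1, 2], 50)

(DTR_demo, LTR_demo), (DTE_demo, LTE_demo) = split_db_2to1(D_demo, L_demo)
print(DTR_demo.shape, LTR_demo.shape)  # (4, 100) (100,)
print(DTE_demo.shape, LTE_demo.shape)  # (4, 50) (50,)
```

Fixing the seed inside the function makes the split reproducible, so the same training and evaluation samples are obtained on every run.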


# Multivariate Gaussian Classifier
The first model we implement is the Multivariate Gaussian Classifier (MVG). 

As we have seen, the classifier assumes that the samples of each class c ∈ {0,1,2} can be modeled as samples of a multivariate Gaussian distribution with a class-dependent mean and covariance matrix:

$$ f_{X|C}(x|c)=\mathcal{N}(x|\mu_c,C_c) $$
The ML solution for the parameters is given by the empirical mean and covariance matrix of each class:
$$ \mu^{*}_c=\frac{1}{N_c}\sum_i{x_{c,i}} $$ 
$$ C^{*}_c=\frac{1}{N_c}\sum_i{(x_{c,i}-\mu^{*}_{c})(x_{c,i}-\mu^{*}_{c})^T} $$
where $N_c$ is the number of training samples of class $c$ and $x_{c,i}$ is the $i$-th of them.
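
As a reminder of where these estimates come from, they maximize the log-likelihood of the training samples of class $c$. Written out, with $N_c$ the number of training samples of class $c$ and $M$ the number of features (a symbol introduced here for illustration; $M = 4$ for IRIS):

$$ \ell_c(\mu_c,C_c)=\sum_{i=1}^{N_c}\log\mathcal{N}(x_{c,i}|\mu_c,C_c)=-\frac{N_c M}{2}\log 2\pi-\frac{N_c}{2}\log|C_c|-\frac{1}{2}\sum_{i=1}^{N_c}(x_{c,i}-\mu_c)^T C_c^{-1}(x_{c,i}-\mu_c) $$

Each class has its own parameters, so the three log-likelihoods can be maximized independently, using only the samples of the corresponding class.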

Compute the ML estimates for the classifier parameters $(\mu_0,C_0)$, $(\mu_1,C_1)$, $(\mu_2,C_2)$:

In [82]:
# dividing by class
DTR0 = DTR[:, LTR == 0]
DTR1 = DTR[:, LTR == 1]
DTR2 = DTR[:, LTR == 2]

# mean for each class
mu0 = DTR0.mean(1).reshape(4, 1)
mu1 = DTR1.mean(1).reshape(4, 1)
mu2 = DTR2.mean(1).reshape(4, 1)

# centering data
DTR0cent = DTR0 - mu0
DTR1cent = DTR1 - mu1
DTR2cent = DTR2 - mu2

# covariance for each class
C0 = np.dot(DTR0cent, DTR0cent.T) / float(DTR0cent.shape[1])
C1 = np.dot(DTR1cent, DTR1cent.T) / float(DTR1cent.shape[1])
C2 = np.dot(DTR2cent, DTR2cent.T) / float(DTR2cent.shape[1])

print(mu0)
print(mu1)
print(mu2)
print()
print(C0)
print(C1)
print(C2)


[[4.96129032]
 [3.42903226]
 [1.46451613]
 [0.2483871 ]]
[[5.91212121]
 [2.78484848]
 [4.27272727]
 [1.33939394]]
[[6.45555556]
 [2.92777778]
 [5.41944444]
 [1.98888889]]

[[0.13140479 0.11370447 0.02862643 0.01187305]
 [0.11370447 0.16270552 0.01844953 0.01117586]
 [0.02862643 0.01844953 0.03583767 0.00526535]
 [0.01187305 0.01117586 0.00526535 0.0108845 ]]
[[0.26470156 0.09169881 0.18366391 0.05134068]
 [0.09169881 0.10613407 0.08898072 0.04211203]
 [0.18366391 0.08898072 0.21955923 0.06289256]
 [0.05134068 0.04211203 0.06289256 0.03208448]]
[[0.30080247 0.08262346 0.18614198 0.04311728]
 [0.08262346 0.08533951 0.06279321 0.05114198]
 [0.18614198 0.06279321 0.18434414 0.04188272]
 [0.04311728 0.05114198 0.04188272 0.0804321 ]]
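
The per-class computation above repeats the same three steps for each label, so it can be sketched as a loop. The function name `ml_estimates` and the random stand-in data are assumptions made for this illustration. The cross-check relies on `np.cov` with `bias=True`, which normalizes by the number of samples, like the ML estimate.

```python
import numpy as np

def ml_estimates(D, L):
    """Return {class label: (mu, C)} with the empirical mean and covariance.

    D holds one sample per column; L holds the matching integer labels.
    """
    params = {}
    for c in np.unique(L):
        Dc = D[:, L == c]                    # samples of class c
        mu = Dc.mean(axis=1).reshape(-1, 1)  # column vector, shape (M, 1)
        Dcent = Dc - mu                      # centered data (broadcasting)
        C = Dcent @ Dcent.T / Dc.shape[1]    # divide by N_c (ML estimate)
        params[int(c)] = (mu, C)
    return params

# Random stand-in data: 4 features, 3 classes with 30 samples each
rng = np.random.default_rng(0)
D_demo = rng.normal(size=(4, 90))
L_demo = np.repeat([0, 1, 2], 30)

params = ml_estimates(D_demo, L_demo)
for c, (mu, C) in params.items():
    # np.cov treats rows as variables; bias=True divides by N_c
    assert np.allclose(C, np.cov(D_demo[:, L_demo == c], bias=True))
    print(c, mu.shape, C.shape)
```

Using `reshape(-1, 1)` instead of a hard-coded `reshape(4, 1)` keeps the function independent of the number of features.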
