# QDA & LDA
### Bayes with Different Covariance Matrix

- Estimate a separate full covariance matrix for each class

>$\displaystyle {\cal{}L}_{\boldsymbol{x}}(C_k) =  G(\boldsymbol{x};\mu_k, \Sigma_k)$

> Handles correlated features well

- Consider binary problem with 2 classes

> Taking the negative logarithm of the likelihoods (times 2, dropping constants), we compare

>$\displaystyle (x\!-\!\mu_1)^T\,\Sigma_1^{-1}(x\!-\!\mu_1) + \ln\,\lvert \Sigma_1 \rvert $ vs.

>$\displaystyle (x\!-\!\mu_2)^T\,\Sigma_2^{-1}(x\!-\!\mu_2) + \ln \, \lvert\Sigma_2\rvert $

> If the first expression minus the second is below a threshold set by the class priors (zero for equal priors), we choose class 1, otherwise class 2

- This is called **Quadratic Discriminant Analysis**
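
The comparison above can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook's implementation: the class parameters `mu1`, `cov1`, `mu2`, `cov2` are made-up values, and the threshold $2\ln(\pi_1/\pi_2)$ is the one that follows from Bayes' rule with priors $\pi_1$, $\pi_2$ (an assumption, since the text does not state it).

```python
import numpy as np

def qda_score(x, mu, cov):
    """Mahalanobis term plus log-determinant for one class."""
    diff = x - mu
    maha = diff @ np.linalg.solve(cov, diff)   # (x-mu)^T cov^{-1} (x-mu)
    _, logdet = np.linalg.slogdet(cov)         # ln|cov|
    return maha + logdet

# Hypothetical class parameters, chosen only for illustration
mu1, cov1 = np.array([0.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]])
mu2, cov2 = np.array([3.0, 3.0]), np.array([[2.0, 0.5], [0.5, 1.0]])

def classify(x, prior1=0.5, prior2=0.5):
    # class 1 wins when score1 - score2 falls below 2*ln(prior1/prior2)
    threshold = 2.0 * np.log(prior1 / prior2)
    diff = qda_score(x, mu1, cov1) - qda_score(x, mu2, cov2)
    return 1 if diff < threshold else 2

print(classify(np.array([0.1, -0.2])))
print(classify(np.array([3.0, 2.8])))
```

With equal priors the threshold is zero, so the point is simply assigned to the class with the smaller score.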

### Bayes with Same Covariance Matrix

- When $\Sigma_1=\Sigma_2=\Sigma$, the quadratic terms $x^T\Sigma^{-1}x$ and the log-determinants cancel in the difference
 
>$\displaystyle (x\!-\!\mu_1)^T\,\Sigma^{-1}(x\!-\!\mu_1) $ 
>$\displaystyle -\ (x\!-\!\mu_2)^T\,\Sigma^{-1}(x\!-\!\mu_2) $
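
> Written out, the difference is linear in $x$ (a short expansion, using the symmetry of $\Sigma^{-1}$):

>$\displaystyle -2\,(\mu_1\!-\!\mu_2)^T\,\Sigma^{-1}x + \mu_1^T\,\Sigma^{-1}\mu_1 - \mu_2^T\,\Sigma^{-1}\mu_2 $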

- Hence this is called **Linear Discriminant Analysis**

> Fewer parameters to estimate during the learning process
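
To make the parameter saving concrete, here is a small count for $d$ features and $K$ classes. It is a sketch that counts only means and distinct covariance entries (priors are left out), and the helper names are hypothetical.

```python
def n_params_qda(d, k):
    # per class: d mean entries + d(d+1)/2 distinct covariance entries
    return k * (d + d * (d + 1) // 2)

def n_params_lda(d, k):
    # k class means, but a single shared covariance matrix
    return k * d + d * (d + 1) // 2

for d in (2, 10, 50):
    print(d, n_params_qda(d, 2), n_params_lda(d, 2))
```

For two classes the gap grows roughly like $d^2/2$, which is why the shared-covariance model is easier to estimate from small samples.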



# Example QDA

In [2]:
import numpy as np

In [123]:
class MyQDA(dict):
    """Gaussian classifier with one full covariance matrix per class."""

    def fit(self,X,C):
        for k in np.unique(C):
            members = (C==k)
            prior = members.sum() / float(C.size) # class frequency as prior
            S = X[members,:] # subset of class
            mu = S.mean(axis=0)
            Z = (S-mu).T # centered column vectors
            # unbiased sample covariance of this class
            cov = Z.dot(Z.T) / (Z[0,:].size-1)
            self[k] = (mu,cov,prior)
        print(mu.shape) # debug: shape of the last class mean
        return self

    def predict(self,Y):
        Cpred = -1 * np.ones(Y[:,0].size)
        for i in range(Cpred.size):
            d2min, kbest = 1e99, None
            for k in self:
                mu, cov, prior = self[k]
                diff = (Y[i,:]-mu).T
                # negative log posterior of class k, up to a constant
                d2 = diff.T.dot(np.linalg.inv(cov)).dot(diff) / 2
                d2 += np.log(np.linalg.det(cov)) / 2 - np.log(prior)
                if d2<d2min: d2min,kbest = d2,k
            Cpred[i] = kbest # keep the class with the smallest score
        return Cpred
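
The same scoring rule can be written without explicit matrix inverses or determinants, which tends to behave better numerically when a covariance matrix is close to singular. The following is a sketch under that assumption, not a replacement for the class above; `qda_predict` is a hypothetical helper that expects a dict in the same `label -> (mu, cov, prior)` layout that `MyQDA` stores.

```python
import numpy as np

def qda_predict(params, Y):
    """Score all rows of Y against every class and return the best label per row."""
    labels = list(params)
    scores = np.empty((Y.shape[0], len(labels)))
    for j, k in enumerate(labels):
        mu, cov, prior = params[k]
        diff = Y - mu                                # shape (n, d)
        sol = np.linalg.solve(cov, diff.T).T         # cov^{-1}(y - mu) without forming the inverse
        maha = np.einsum('ij,ij->i', diff, sol)      # row-wise Mahalanobis terms
        _, logdet = np.linalg.slogdet(cov)           # ln|cov| without overflow in det
        scores[:, j] = 0.5 * maha + 0.5 * logdet - np.log(prior)
    return np.asarray(labels)[scores.argmin(axis=1)]

# Toy parameters for illustration only
params = {0.0: (np.zeros(2), np.eye(2), 0.5),
          1.0: (np.array([4.0, 4.0]), np.eye(2), 0.5)}
print(qda_predict(params, np.array([[0.2, 0.1], [3.9, 4.2]])))
```

Since `MyQDA` is itself a dict with this layout, a fitted instance could be passed directly as `params`.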

In [124]:
# reference implementation
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA

D = np.loadtxt('/home/akhil/machine-learning-basics/examples/Class-Train.csv', delimiter=',')
Q = np.loadtxt('/home/akhil/machine-learning-basics/examples/Class-Query.csv', delimiter=',')
X, C = D[:,0:2], D[:,2]

Cpred = MyQDA().fit(X,C).predict(Q)
Cskit =   QDA().fit(X,C).predict(Q)

print('Number of different estimates:{:d}'.format(sum(Cpred!=Cskit))) 

(2,)
Number of different estimates:0


# QDA with Cross Validation Example

In [14]:
Dc = D.copy()
# randomize and split to D1 + D2
np.random.seed(seed=42)
np.random.shuffle(Dc)
split = int(Dc[:,0].size/2)
D1, D2 = Dc[:split,:], Dc[split:,:]
# train on one half, estimate on the other
for i,(T,Q) in enumerate([(D1,D2),(D2,D1)]):
    X, C = T[:,0:2], T[:,2]
    Cpred, Ctrue = MyQDA().fit(X,C).predict(Q[:,:2]), Q[:,2]
    print("Case #{:d} - Number of mislabeled points out of a total {:3d} points : {:2d}".format(
        i, Q.shape[0], sum(Ctrue!=Cpred)))

(2, 2)
(2, 2)
Case #0 - Number of mislabeled points out of a total 157 points : 19
(2, 2)
(2, 2)
Case #1 - Number of mislabeled points out of a total 156 points : 20
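
The two-way split above generalizes to more folds. Below is a minimal NumPy-only sketch, assuming the labels sit in the last column of `D` as in the data used here; `kfold_errors` and its `fit_predict` callback are hypothetical names introduced for illustration.

```python
import numpy as np

def kfold_errors(D, fit_predict, n_folds=5, seed=42):
    """Return the number of mislabeled points in each held-out fold."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(D.shape[0])          # shuffled row indices
    folds = np.array_split(idx, n_folds)       # nearly equal-sized folds
    errors = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        T, Q = D[train], D[test]
        # fit on the training rows, predict the held-out rows
        Cpred = fit_predict(T[:, :-1], T[:, -1], Q[:, :-1])
        errors.append(int(np.sum(Cpred != Q[:, -1])))
    return errors
```

With the class defined earlier, the callback could be `lambda X, C, Y: MyQDA().fit(X, C).predict(Y)`.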


# Example Using sklearn package

In [16]:
from sklearn.model_selection import cross_val_score
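clf = QDA()
# if the cells are run in order, X and C still hold the last training half (D2) from the loop above
cv = cross_val_score(clf, X,C, cv=10)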
print(cv)
print(np.mean(cv))
print(np.std(cv))

[0.82352941 0.82352941 0.82352941 0.875      0.73333333 0.8
 1.         1.         1.         0.93333333]
0.8812254901960784
0.09139601703590215


# Example LDA

In [196]:
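class MyLDA(dict):
    """Gaussian classifier with a single covariance matrix shared by all classes."""

    def fit(self,X,C):
        for k in np.unique(C):
            members = (C==k)
            prior = members.sum() / float(C.size) # class frequency as prior
            S = X[members,:] # subset of class
            mu = S.mean(axis=0)
            # NOTE: the shared covariance is estimated from the first class only,
            # not pooled over all classes, so it is not the usual LDA estimate
            if k == np.unique(C)[0]:
                Z = (S-mu).T # centered column vectors
                cov = Z.dot(Z.T) / (Z[0,:].size-1)
            self[k] = (mu,cov,prior)
        return self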

    def predict(self,Y):
        Cpred = -1 * np.ones(Y[:,0].size)
        for i in range(Cpred.size):
            d2min, kbest = 1e99, None
            for k in self:
                mu, cov, prior = self[k]
                diff = (Y[i,:]-mu)
                # ln|cov| is the same for every class, so it is dropped from the score
                d2 = (diff.T.dot(np.linalg.inv(cov)).dot(diff))/2 - np.log(prior)
                if d2<d2min: d2min,kbest = d2,k
            Cpred[i] = kbest
        return Cpred
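
`MyLDA.fit` above takes the shared covariance from the first class only. The more common choice is the pooled within-class covariance, sketched below. This is an illustrative alternative, not the notebook's method: `fit_pooled_lda` is a hypothetical helper, and the divisor $n - K$ is one convention among several, so exact agreement with any particular library is not guaranteed.

```python
import numpy as np

def fit_pooled_lda(X, C):
    """Return {label: (mu, cov, prior)} with one covariance pooled over all classes."""
    labels = np.unique(C)
    n, d = X.shape
    scatter = np.zeros((d, d))
    stats = {}
    for k in labels:
        S = X[C == k]                 # rows of class k
        mu = S.mean(axis=0)
        Z = S - mu                    # centered rows
        scatter += Z.T @ Z            # accumulate within-class scatter
        stats[k] = (mu, S.shape[0] / n)
    cov = scatter / (n - labels.size) # pooled estimate with n - K degrees of freedom
    return {k: (mu, cov, prior) for k, (mu, prior) in stats.items()}

# Tiny hand-made data set for illustration
X_demo = np.array([[0, 0], [2, 0], [0, 2], [2, 2],
                   [10, 10], [12, 10], [10, 12], [12, 12]], dtype=float)
C_demo = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
params = fit_pooled_lda(X_demo, C_demo)
print(params[0.0][1])
```

The returned dict uses the same `(mu, cov, prior)` layout as `MyLDA`, so the `predict` logic above could score against it unchanged.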

In [197]:
# reference implementation
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

D = np.loadtxt('/home/akhil/machine-learning-basics/examples/Class-Train.csv', delimiter=',')
Q = np.loadtxt('/home/akhil/machine-learning-basics/examples/Class-Query.csv', delimiter=',')
X, C = D[:,0:2], D[:,2]

Cpred = MyLDA().fit(X,C).predict(Q)
Cskit =   LDA().fit(X,C).predict(Q)
print('Number of different estimates: {:d}'.format(sum(Cpred!=Cskit))) 

Number of different estimates:40


## References
https://web.stanford.edu/class/stats202/content/lec9.pdf