<a href="https://colab.research.google.com/github/mimingucci/ML/blob/main/LinearDiscriminativeAnalysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

We use Bayes' rule to turn P(X|Y=k) into P(Y=k|X).<br/>
Suppose we have a classification task with K unordered classes, represented by k=1,…,K.
<br/>
1. Estimate the density of the predictors conditional on the target belonging to each class, P(x|y=k).
<br/>
2. Estimate the prior probability that the target belongs to each class, P(y=k).
<br/>
3. Using Bayes' rule, calculate the posterior probability that the target belongs to each class.
<br/>
$P(y=k|x)\propto P(x|y=k)P(y=k)$ for k=1, 2, ..., K
<br/>
We then classify an observation x as belonging to the class for which P(y=k|x) is greatest. In math, $\widehat{y}=\arg\max_{k}P(y=k|x)$ over k=1, 2, ..., K
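
As a small illustration of the three steps above, the sketch below applies Bayes' rule to a single observation. The likelihood and prior values are made up for illustration and do not come from the wine data.

```python
import numpy as np

# Hypothetical values for K = 3 classes at one observation x (illustrative only).
likelihoods = np.array([0.02, 0.10, 0.05])  # P(x | y=k)
priors = np.array([0.5, 0.3, 0.2])          # P(y=k)

# Unnormalized posteriors: P(x | y=k) * P(y=k)
unnormalized = likelihoods * priors

# Dividing by the sum (the evidence P(x)) gives proper posteriors,
# but it does not change which class has the largest value.
posteriors = unnormalized / unnormalized.sum()

y_hat = np.argmax(posteriors)  # index of the most probable class
```

Because the normalizing constant is the same for every class, the argmax can be taken over the unnormalized products directly.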



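Class Priors P(y=k)<br/>
$\widehat{\pi _{k}}=\frac{N_{k}}{N}$, where $N_{k}$ is the number of training observations in class k and N is the total number of training observations.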
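
A minimal sketch of this estimate, using a hypothetical label vector `y_toy` (not part of the notebook's data):

```python
import numpy as np

# Hypothetical labels for six observations across three classes.
y_toy = np.array([0, 0, 1, 2, 2, 2])

# np.unique returns the sorted class labels and how often each occurs.
classes, counts = np.unique(y_toy, return_counts=True)

# Prior estimate for each class: N_k / N
pi_hat = counts / len(y_toy)
```

The estimated priors always sum to 1, since every observation belongs to exactly one class.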

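In LDA, we assume <br/>
$x|y=k\sim MVN(\mu _{k}, \Sigma )$<br/>
Here, $\mu _{k}$ is the mean vector of class k and $\Sigma$ is a covariance matrix shared by all classes.<br/>
The parameters are estimated as follows:<br/>
$\widehat{\Sigma }=\frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{K}I_{nk}(x_{n}-\widehat{\mu _{k}})(x_{n}-\widehat{\mu _{k}})^{T}$
<br/>
$\widehat{\mu _{k}}=\overline{x_{k}}$<br/>
where $I_{nk}$ equals 1 if observation n belongs to class k and 0 otherwise, and $\overline{x_{k}}$ is the sample mean of the observations in class k.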
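
The pooled covariance estimate can also be computed one class at a time with matrix products instead of a loop over observations. The helper below is a sketch; the name `pooled_covariance` is hypothetical and not part of the notebook's class.

```python
import numpy as np

def pooled_covariance(X, y):
    """Shared covariance estimate: within-class scatter summed over classes, divided by N."""
    N, D = X.shape
    Sigma = np.zeros((D, D))
    for k in np.unique(y):
        X_k = X[y == k]
        # Center the class-k observations on their own class mean.
        centered = X_k - X_k.mean(axis=0)
        # centered.T @ centered equals the sum of outer products over class k.
        Sigma += centered.T @ centered
    return Sigma / N
```

Dividing by N gives the maximum-likelihood estimate used in the formula above; dividing by N - K instead would give the unbiased version.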

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets

wine = datasets.load_wine()
X, y = wine.data, wine.target

In [None]:
class LDA:

    ## Fitting the model
    def fit(self, X, y):

        ## Record info
        self.N, self.D = X.shape
        self.X = X
        self.y = y

        ## prior probabilities
        self.unique_y, unique_y_counts = np.unique(self.y, return_counts = True) # returns unique y and counts
        self.pi_ks = unique_y_counts/self.N

        self.mu_ks = []
        self.Sigma = np.zeros((self.D, self.D))
        for i, k in enumerate(self.unique_y):
            X_k = self.X[self.y == k]
            mu_k = X_k.mean(0).reshape(self.D, 1)
            self.mu_ks.append(mu_k)

            for x_n in X_k:
                x_n = x_n.reshape(-1,1)
                x_n_minus_mu_k = (x_n - mu_k)
                self.Sigma += np.dot(x_n_minus_mu_k, x_n_minus_mu_k.T)

        self.Sigma /= self.N


    ## classifications

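    def _mvn_density(self, x_n, mu_k, Sigma):
        # Unnormalized Gaussian density: the constant in front of the exponential
        # is omitted because Sigma is shared, so it is identical for every class.
        x_n_minus_mu_k = (x_n - mu_k)
        quad_form = x_n_minus_mu_k.T @ np.linalg.inv(Sigma) @ x_n_minus_mu_k
        # quad_form has shape (1, 1); .item() extracts the scalar so that
        # assigning the result into p_ks[j] does not raise a NumPy warning.
        density = np.exp(-0.5 * quad_form.item())
        return density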

    def classify(self, X_test):

        y_n = np.empty(len(X_test))
        for i, x_n in enumerate(X_test):

            x_n = x_n.reshape(-1, 1)
            p_ks = np.empty(len(self.unique_y))

            for j, k in enumerate(self.unique_y):
                p_x_given_y = self._mvn_density(x_n, self.mu_ks[j], self.Sigma)
                p_y_given_x = self.pi_ks[j]*p_x_given_y
                p_ks[j] = p_y_given_x

            y_n[i] = self.unique_y[np.argmax(p_ks)]

        return y_n


In [None]:
lda = LDA()
lda.fit(X, y)
yhat = lda.classify(X)
print(np.mean(yhat == y))

1.0
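
The classifier above exponentiates a quadratic form, which can underflow to zero for observations far from every class mean. Taking logs and dropping the term that does not depend on k gives the linear discriminant score $\delta _{k}(x)=x^{T}\Sigma ^{-1}\mu _{k}-\frac{1}{2}\mu _{k}^{T}\Sigma ^{-1}\mu _{k}+\log \pi _{k}$, which has the same argmax. The function below is a sketch of that alternative; the name `lda_log_scores` is hypothetical, and it assumes `mu_ks` is a list of (D, 1) arrays as stored by the `LDA` class.

```python
import numpy as np

def lda_log_scores(X, mu_ks, Sigma, pi_ks):
    """Return an (n_samples, K) array of linear discriminant scores."""
    # Stack the class means into a (K, D) array.
    mu = np.asarray(mu_ks).reshape(len(mu_ks), -1)
    # Solve Sigma @ A = mu.T; column k of A is Sigma^{-1} mu_k.
    A = np.linalg.solve(Sigma, mu.T)
    # x^T Sigma^{-1} mu_k for every observation and class.
    linear = X @ A
    # -1/2 mu_k^T Sigma^{-1} mu_k + log(pi_k), one value per class.
    const = -0.5 * np.sum(mu.T * A, axis=0) + np.log(pi_ks)
    return linear + const
```

With a fitted model, predictions would then be `lda.unique_y[np.argmax(lda_log_scores(X, lda.mu_ks, lda.Sigma, lda.pi_ks), axis=1)]`.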


