# Gaussian Discriminant Analysis: a generative learning algorithm


The model is implemented for binary classification (two classes).

The model makes the following assumptions:

$y$ follows a Bernoulli distribution parametrized by $\phi$.

$x | y = 0$ and $x | y = 1$ follow Gaussian distributions with means $\mu_0$ and $\mu_1$ respectively, and a shared covariance matrix $\Sigma$.

The distributions are:

$$ p(y) = \phi^y(1 - \phi)^{1-y} $$


$$ p(x|y) = \frac{1}{(2 \pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right) $$


where $d$ is the dimension of $x$ (so $\Sigma$ is a $d \times d$ matrix), and $\mu$ is $\mu_0$ for $p(x|y=0)$ and $\mu_1$ for $p(x|y=1)$.



In [None]:
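import numpy as np
import matplotlib.pyplot as plt

def prob_x_given_y(arg, mean, covariance):
    # Exponent of the Gaussian density: -1/2 (x - mu)^T Sigma^{-1} (x - mu)
    a = -0.5 * np.transpose(arg - mean).dot((np.linalg.inv(covariance)).dot(arg - mean))
    # Normalizing constant: 1 / ((2 pi)^{d/2} |Sigma|^{1/2})
    b = 1/(((2 * np.pi)**(covariance.shape[0]/2)) * np.sqrt(np.linalg.det(covariance)))
    return b * np.exp(a)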
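
The density itself can underflow to zero for points far from the mean, so it is sometimes convenient to work with its logarithm instead. The following is a minimal sketch of that idea, not part of the original implementation: the name `log_prob_x_given_y` is hypothetical, and the code assumes `covariance` is positive definite.

```python
import numpy as np


def log_prob_x_given_y(arg, mean, covariance):
    """Log of the Gaussian density (hypothetical helper, assumes a positive definite covariance)."""
    diff = np.asarray(arg, dtype=float).ravel() - np.asarray(mean, dtype=float).ravel()
    d = diff.shape[0]
    # slogdet returns (sign, log|det|); the sign is 1 for a positive definite matrix
    _, log_det = np.linalg.slogdet(covariance)
    # Solve covariance @ z = diff rather than forming the inverse explicitly
    mahalanobis = diff @ np.linalg.solve(covariance, diff)
    return -0.5 * (d * np.log(2 * np.pi) + log_det + mahalanobis)
```

Exponentiating the result should agree with `prob_x_given_y` wherever the latter does not underflow.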

In [None]:
def train(x, y):
    n = len(y)
    labels = y.reshape(n, 1)

The log-likelihood of the data can then be written as


$$ \log \prod^n_{i=1} p(x^{(i)}, y^{(i)} ; \phi, \mu_0, \mu_1, \Sigma ) = \log \prod^n_{i=1} \left( p(x^{(i)} | y^{(i)}; \mu_0, \mu_1, \Sigma) \cdot p(y^{(i)}; \phi) \right) $$
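
Because the logarithm turns the product into a sum, the log-likelihood can be evaluated as a sum of per-example terms. The sketch below illustrates this under stated assumptions: the function name `log_likelihood` and its argument order are hypothetical, `x` is an `(n, d)` array, `y` contains only 0 and 1, `phi` lies strictly between 0 and 1, and `sigma` is positive definite.

```python
import numpy as np


def log_likelihood(x, y, phi, mu_0, mu_1, sigma):
    """Sum over examples of log p(x | y) + log p(y) (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y).ravel()
    mu_0 = np.asarray(mu_0, dtype=float).ravel()
    mu_1 = np.asarray(mu_1, dtype=float).ravel()
    n, d = x.shape
    # log p(y; phi) for the Bernoulli prior
    log_prior = y * np.log(phi) + (1 - y) * np.log(1 - phi)
    # Subtract mu_1 from positive examples and mu_0 from negative ones
    diff = x - np.where(y[:, None] == 1, mu_1, mu_0)
    _, log_det = np.linalg.slogdet(sigma)
    # Row-wise (x - mu)^T Sigma^{-1} (x - mu)
    maha = np.einsum('ij,ij->i', diff, np.linalg.solve(sigma, diff.T).T)
    log_cond = -0.5 * (d * np.log(2 * np.pi) + log_det + maha)
    return float(np.sum(log_cond + log_prior))
```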


Maximizing the log-likelihood with respect to the parameters gives the estimates:

$$ \phi \ = \ \frac{1}{n} \sum_{i=1}^n 1\{y^{(i)} = 1\} $$


$$ \mu_1 \ = \ \frac{\sum_{i=1}^n 1\{y^{(i)} = 1\}x^{(i)}}{\sum_{i=1}^n 1\{y^{(i)} = 1\}} $$

$$ \mu_0 \ = \ \frac{\sum_{i=1}^n 1\{y^{(i)} = 0\}x^{(i)}}{\sum_{i=1}^n 1\{y^{(i)} = 0\}} $$


In [None]:
    positive_mean_numerator = 0
    negative_mean_numerator = 0
    y_positive_cnt = 0
    y_negative_cnt = 0
    for i in range(n):
        if labels[i] == 1:
            positive_mean_numerator += x[i, :]
            y_positive_cnt += 1
        else:
            negative_mean_numerator += x[i, :]
            y_negative_cnt += 1
    mean_positive = np.array((positive_mean_numerator / y_positive_cnt))
    mean_negative = np.array((negative_mean_numerator / y_negative_cnt))
    class_prior = y_positive_cnt / n
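
The same three estimates can also be computed without an explicit loop by using a boolean mask. This is a minimal sketch rather than the original implementation: the function name `estimate_means_and_prior` is hypothetical, and it assumes `x` has shape `(n, d)` and that both classes appear in `y`.

```python
import numpy as np


def estimate_means_and_prior(x, y):
    """Vectorized estimates of phi, mu_1 and mu_0 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # True where the label is 1; everything else counts as the negative class
    positive = np.asarray(y).ravel() == 1
    class_prior = positive.mean()                # fraction of positive examples
    mean_positive = x[positive].mean(axis=0)     # average of the positive rows
    mean_negative = x[~positive].mean(axis=0)    # average of the negative rows
    return class_prior, mean_positive, mean_negative
```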


$$ \Sigma = \frac{1}{n} \sum_{i=1}^n(x^{(i)} - \mu_{y^{(i)}})(x^{(i)} - \mu_{y^{(i)}})^T$$

In [None]:
    # Indicator of the negative class, shaped (n, 1) like labels
    y_neg = (labels != 1).astype(int)
    # Center each example by the mean of its own class
    temp = labels * (x - mean_positive) + y_neg * (x - mean_negative)
    covariance = (np.transpose(temp).dot(temp))/n
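
As a standalone illustration of the covariance formula, the sketch below centers each example by the mean of its own class and averages the outer products. The name `shared_covariance` is hypothetical, and the code assumes `x` has shape `(n, d)` and that the two class means have already been estimated.

```python
import numpy as np


def shared_covariance(x, y, mean_positive, mean_negative):
    """Shared covariance estimate: (1/n) * sum of outer products of centered examples."""
    x = np.asarray(x, dtype=float)
    positive = np.asarray(y).ravel() == 1
    # Pick the class mean matching each row's label, then center
    centered = x - np.where(positive[:, None], mean_positive, mean_negative)
    # centered.T @ centered equals the sum of the per-example outer products
    return centered.T @ centered / x.shape[0]
```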

The predict function assigns a point to the class with the larger joint probability $p(x|y) \, p(y)$:

In [None]:
def predict(arg, mean_pos, mean_neg, covariance, prior):
    pos = prob_x_given_y(arg, mean_pos, covariance) * prior
    neg = prob_x_given_y(arg, mean_neg, covariance) * (1 - prior)
    if pos > neg:
        return 1
    else:
        return 0
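
Far from both class means, the two densities can underflow to zero, in which case the comparison above always returns 0. One way around this is to compare log-probabilities instead. The following is a sketch under stated assumptions, not the original method: the name `predict_log` is hypothetical, `covariance` is assumed positive definite, and `prior` is assumed to lie strictly between 0 and 1. Since both classes share $\Sigma$, the normalizing constant cancels and is omitted.

```python
import numpy as np


def predict_log(arg, mean_pos, mean_neg, covariance, prior):
    """Classify by comparing log p(x | y) + log p(y) for the two classes (hypothetical helper)."""
    arg = np.asarray(arg, dtype=float).ravel()

    def mahalanobis(mean):
        diff = arg - np.asarray(mean, dtype=float).ravel()
        return diff @ np.linalg.solve(covariance, diff)

    # The Gaussian normalizing constant is identical for both classes, so it is dropped
    log_pos = -0.5 * mahalanobis(mean_pos) + np.log(prior)
    log_neg = -0.5 * mahalanobis(mean_neg) + np.log(1 - prior)
    return 1 if log_pos > log_neg else 0
```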

Plotting the data and the decision boundary:
    

In [None]:
    for i in range(n):
        if labels[i] == 0:
            color = '#ff2200'
        else:
            color = '#1f77b4'
        plt.scatter(x[i, 0], x[i, 1], c=color)

    axes = plt.gca()
    (x_min, x_max) = axes.get_xlim()
    (y_min, y_max) = axes.get_ylim()
    # grid resolution per axis (arbitrary choice)
    elements = n * 2
    x_grid, y_grid = np.meshgrid(np.linspace(x_min, x_max, elements), np.linspace(y_min, y_max, elements))
    p = np.empty((elements, elements))
    for i in range(elements):
        for j in range(elements):
            k = np.array([x_grid[i, j], y_grid[i, j]])
            p[i, j] = predict(k.reshape(x.shape[1], 1), mean_positive.reshape(x.shape[1], 1),
                              mean_negative.reshape(x.shape[1], 1), covariance, class_prior)
    plt.contour(x_grid, y_grid, p, levels=[0.5])

    plt.show()

Output for example data:

!["output"](plot.png)
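
The plotting code classifies the grid one point at a time, which can be slow for fine grids. A possible alternative is to classify all grid points in one vectorized call. The sketch below is an assumption-laden illustration, not the original implementation: `predict_grid` is a hypothetical name, the means, covariance and prior in the demo are made-up values, and the comparison is done in log space with the shared normalizing constant dropped.

```python
import numpy as np


def predict_grid(points, mean_pos, mean_neg, covariance, prior):
    """Classify every row of `points` (shape (m, d)) at once (hypothetical helper)."""
    points = np.asarray(points, dtype=float)

    def log_score(mean, log_prior):
        diff = points - np.asarray(mean, dtype=float).ravel()
        # Row-wise (x - mu)^T Sigma^{-1} (x - mu)
        maha = np.einsum('ij,ij->i', diff, np.linalg.solve(covariance, diff.T).T)
        return -0.5 * maha + log_prior

    pos = log_score(mean_pos, np.log(prior))
    neg = log_score(mean_neg, np.log(1 - prior))
    return (pos > neg).astype(int)


# Demo with made-up parameters: flatten the grid, classify, reshape back
x_grid, y_grid = np.meshgrid(np.linspace(-1, 3, 5), np.linspace(-1, 3, 5))
points = np.column_stack([x_grid.ravel(), y_grid.ravel()])
p = predict_grid(points, [2, 2], [0, 0], np.eye(2), 0.5).reshape(x_grid.shape)
```

The resulting array `p` has the same shape as `x_grid` and could be passed to `plt.contour(x_grid, y_grid, p, levels=[0.5])` in the same way as the loop-based version.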