### What is this about

This is an experiment in implementing Gaussian Discriminant Analysis (GDA) for multi-class classification.

### Scope

The goal of this study is to test how well Gaussian Discriminant Analysis predicts the weather type, using a predefined dataset (https://www.kaggle.com/datasets/nikhil7280/weather-type-classification).

### Approach

This study uses the following techniques and formulas, which were derived by the researcher.

### Formula

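$$
P(y=k|x) = \frac{1}{\sum_{i=1}^{K}\exp(-\theta_i^Tx + \theta_{0i})}
$$

where, when predicting for class $k$,

$\theta_i = \Sigma^{-1}(\mu_k - \mu_i)$ and $\theta_{0i} = \ln\frac{\phi_i}{\phi_k} + \frac{1}{2}(\mu_k^T\Sigma^{-1}\mu_k - \mu_i^T\Sigma^{-1}\mu_i)$

The sum runs over all $K$ classes. For $i = k$ both $\theta_i$ and $\theta_{0i}$ are zero, so that term contributes $\exp(0) = 1$.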

Notice that this looks similar to how predictions are made in softmax regression.
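The formula can be traced back to Bayes' rule. The sketch below assumes the usual GDA setup, which is not spelled out above: class priors $\phi_i$ and class-conditional densities $x \mid y = i \sim \mathcal{N}(\mu_i, \Sigma)$ with one covariance matrix $\Sigma$ shared by all classes.

$$
P(y=k|x) = \frac{\phi_k\,\mathcal{N}(x;\mu_k,\Sigma)}{\sum_{i=1}^{K}\phi_i\,\mathcal{N}(x;\mu_i,\Sigma)} = \frac{1}{\sum_{i=1}^{K}\frac{\phi_i}{\phi_k}\exp\left(\frac{1}{2}(x-\mu_k)^T\Sigma^{-1}(x-\mu_k) - \frac{1}{2}(x-\mu_i)^T\Sigma^{-1}(x-\mu_i)\right)}
$$

Expanding both quadratic forms, the $x^T\Sigma^{-1}x$ terms cancel because $\Sigma$ is shared, which leaves an exponent that is linear in $x$:

$$
\frac{1}{2}(x-\mu_k)^T\Sigma^{-1}(x-\mu_k) - \frac{1}{2}(x-\mu_i)^T\Sigma^{-1}(x-\mu_i) = -(\mu_k - \mu_i)^T\Sigma^{-1}x + \frac{1}{2}(\mu_k^T\Sigma^{-1}\mu_k - \mu_i^T\Sigma^{-1}\mu_i)
$$

Moving $\frac{\phi_i}{\phi_k}$ into the exponent as $\ln\frac{\phi_i}{\phi_k}$ gives $-\theta_i^Tx + \theta_{0i}$ for each term of the sum.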

### Matrix shortcut for computing the parameters

We can compute the parameters with the following matrix operations:

$$
\phi = \frac{1}{M}\sum_{i=1}^{M}Y_i
$$

$$
\mu = \frac{X^TY}{\sum_{i=1}^{M}Y_i}
$$

$$
\Sigma = \frac{1}{M}\sum_{i = 1}^{M}(x^{(i)} - \mu_{y^{(i)}})(x^{(i)} - \mu_{y^{(i)}})^T
$$

Where

$\phi\in\mathbb{R}^{1\times K}$, where the $k$-th entry is the prior of class $k$

$\mu\in\mathbb{R}^{D \times K}$, where the $k$-th column is the mean of class $k$; the division in its formula is applied column by column

$X\in\mathbb{R}^{M \times D}$, where the $i$-th row is the $i$-th training example, written $x^{(i)}$ as a column vector

$Y\in\mathbb{R}^{M \times K}$, where the $i$-th row $Y_i$ is the class $y^{(i)}$ of the $i$-th example, represented as a one-hot row vector

Using these, we can solve for $\theta_i$ and $\theta_{0i}$.
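As a small illustration of the $\phi$ and $\mu$ shortcuts, here is a sketch on a made-up dataset. The toy values and the use of NumPy (instead of the TensorFlow used later in this notebook) are assumptions chosen to keep the example short.

```python
import numpy as np

# Toy data (hypothetical): M = 4 examples, D = 2 features, K = 2 classes.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])
labels = np.array([0, 0, 1, 1])
Y = np.eye(2)[labels]                 # one-hot labels, shape (M, K)

M = X.shape[0]
class_counts = Y.sum(axis=0)          # number of examples per class, shape (K,)

phi = class_counts / M                # class priors, shape (K,)
mu = (X.T @ Y) / class_counts         # column k is the mean of class k, shape (D, K)

print("phi:", phi)
print("mu:\n", mu)
```

`X.T @ Y` adds up the feature vectors of each class into one column per class, so dividing by the class counts turns those sums into means.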

### Matrix Shortcut For Predictions in Multi-class

To compute $P(y = k|x)$ for every $k$, we can perform the following matrix operations:

$$
H(k) = \frac{1}{\mathrm{sum}(\exp(X'W))}
$$

Where

$X'\in\mathbb{R}^{1\times (D + 1)}$ is the set of features we want to make a prediction for, with a 1 appended for the intercept

$W\in\mathbb{R}^{(D + 1)\times K}$ is built for the class $k$ being predicted: the first $D$ rows of its $i$-th column hold $-\theta_i$, and the last row holds the $\theta_0$ terms, with $\theta_{0i}$ in the $i$-th column

$\exp$ is applied element-wise, and $\mathrm{sum}$ adds up the $K$ resulting entries

$D$ = number of dimensions (features)

$M$ = number of training examples

$K$ = number of classes

We then repeat this for every $k$.
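To make the shortcut concrete, the sketch below builds $W$ for each class and compares the result with Bayes' rule applied directly. The parameter values, the helper name `build_w`, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical parameters for K = 3 classes and D = 2 features.
phi = np.array([0.5, 0.3, 0.2])                  # class priors
mu = np.array([[0.0, 2.0, -1.0],
               [0.0, 1.0, 3.0]])                 # column k is mu_k, shape (D, K)
sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])                   # shared covariance
sigma_inv = np.linalg.inv(sigma)
x = np.array([1.0, 0.5])                         # point to classify
K = phi.size

def build_w(k):
    """Stack -theta_i (as columns) on top of the row of theta_0i for target class k."""
    theta = sigma_inv @ (mu[:, [k]] - mu)        # column i is theta_i, shape (D, K)
    quad = np.einsum("dk,de,ek->k", mu, sigma_inv, mu)   # mu_i^T Sigma^{-1} mu_i
    theta_0 = np.log(phi / phi[k]) + 0.5 * (quad[k] - quad)
    return np.vstack([-theta, theta_0])          # shape (D + 1, K)

x_aug = np.append(x, 1.0)                        # X' = [x, 1]
posterior = np.array([1.0 / np.exp(x_aug @ build_w(k)).sum() for k in range(K)])

# Reference: Bayes' rule with Gaussian densities (the shared normaliser cancels).
diff = x[:, None] - mu                           # column i is x - mu_i
log_joint = np.log(phi) - 0.5 * np.einsum("dk,de,ek->k", diff, sigma_inv, diff)
reference = np.exp(log_joint) / np.exp(log_joint).sum()

print(posterior)
print(reference)
```

Both printed vectors should agree and sum to 1, which is a quick sanity check on the signs used in $W$.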


### STAGE 1: DATA SANITIZATION AND COLLECTION

First, we load the data (X and Y) and arrange it in the intended format.

We also have to assign a numeric code to each categorical value.

In [248]:
import pandas as pd
import tensorflow as tf

# assign numbers to the categorical values
categorical_mapping = {
    "Weather Type": {"Rainy": 0, "Cloudy": 1, "Sunny": 2, "Snowy": 3},
    "Cloud Cover" : {"overcast": 0.,"partly cloudy": 1., "clear": 2., "cloudy": 3.},
    "Season"     : {"Spring": 0., "Autumn": 1., "Winter": 2., "Summer": 3.},
    "Location"    : {"inland": 0., "mountain": 1., "coastal": 2.}
}

# configs
classifications = ["Rainy", "Cloudy", "Sunny", "Snowy"]
file_path = 'dataset/weather_classification_data.csv'  # forward slash avoids the invalid "\w" escape
features = ["Temperature", "Humidity", "Wind Speed", 
            "Precipitation (%)", "Cloud Cover", "Atmospheric Pressure", 
            "UV Index", "Season", "Visibility (km)", "Location"]

dependent_variable = "Weather Type"
K = len(categorical_mapping[dependent_variable])
D = len(features)
M = None

def display_tensor(tensor, name):
    print(name + ": \n", tensor.numpy())

def encode_row_matrix_to_one_hot(row_matrix, depth=K):
    row_matrix = tf.reshape(row_matrix, [-1])
    one_hot_matrix = tf.one_hot(row_matrix, depth=depth)
    
    return one_hot_matrix

def read_csv_to_XYy(file_path, categorical_mapping, dependent_variable):
    df = pd.read_csv(file_path)

    for column, mapping in categorical_mapping.items():
        df[column] = df[column].map(mapping)

    y = df[dependent_variable]
    x = df.drop(columns = [dependent_variable])

    X = tf.convert_to_tensor(x.values, dtype=tf.float32)
    Y = encode_row_matrix_to_one_hot(tf.convert_to_tensor(y.values, dtype=tf.uint8))

    return [X, Y, y.values]

# parse the data to X and Y
X, Y, y = read_csv_to_XYy(file_path, categorical_mapping, dependent_variable)
X_T = tf.transpose(X)
M = len(X)

display_tensor(X, "X")
display_tensor(Y, "Y")

X: 
 [[14.  73.   9.5 ...  2.   3.5  0. ]
 [39.  96.   8.5 ...  0.  10.   0. ]
 [30.  64.   7.  ...  0.   5.5  1. ]
 ...
 [30.  77.   5.5 ...  1.   9.   2. ]
 [ 3.  76.  10.  ...  2.   2.   0. ]
 [-5.  38.   0.  ...  1.  10.   1. ]]
Y: 
 [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 ...
 [0. 1. 0. 0.]
 [0. 0. 0. 1.]
 [1. 0. 0. 0.]]
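One thing to watch in this stage: pandas `Series.map` with a dictionary yields `NaN` for any value missing from the dictionary, so a misspelled category would silently pass through. The sketch below shows one possible guard in plain Python. The helper names `encode_column` and `one_hot` are hypothetical and not part of this notebook.

```python
def encode_column(values, mapping):
    """Map raw category strings to numeric codes, failing loudly on unknown values."""
    unknown = sorted({v for v in values if v not in mapping})
    if unknown:
        raise ValueError(f"unmapped categories: {unknown}")
    return [mapping[v] for v in values]

def one_hot(codes, depth):
    """Return one row per code, with a single 1.0 at the code's index."""
    return [[1.0 if j == code else 0.0 for j in range(depth)] for code in codes]

weather_mapping = {"Rainy": 0, "Cloudy": 1, "Sunny": 2, "Snowy": 3}
codes = encode_column(["Rainy", "Cloudy", "Sunny"], weather_mapping)
rows = one_hot(codes, depth=len(weather_mapping))

print(codes)
print(rows)
```

Calling `encode_column(["Foggy"], weather_mapping)` raises a `ValueError` instead of producing a missing value.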




### STAGE 2: COMPUTE THE PARAMETERS
We then compute the parameters $\phi$, $\mu$, and $\Sigma$.

In [249]:

Y_SUM = tf.reduce_sum(Y, axis=0)
PHI = Y_SUM / M
MU  = tf.matmul(X_T, Y) / Y_SUM
SIGMA = tf.Variable(initial_value=tf.zeros(shape=(D, D)), dtype=tf.float32)

# Sigma
for i in range(M):
    diff = tf.transpose([X_T[:, i] - MU[:, y[i]]])
    SIGMA.assign(SIGMA + (tf.matmul(diff, tf.transpose(diff)) / M))

SIGMA_INV = tf.linalg.inv(SIGMA)

display_tensor(PHI, "PHI")
display_tensor(MU, "MU")
display_tensor(SIGMA, "SIGMA")
display_tensor(SIGMA_INV, "SIGMA_INV")

PHI: 
 [0.25 0.25 0.25 0.25]
MU: 
 [[ 2.27881813e+01  2.28236370e+01  3.24290924e+01 -1.53060603e+00]
 [ 7.83978806e+01  6.65287857e+01  5.14063644e+01  7.85102997e+01]
 [ 1.36775761e+01  8.60181808e+00  6.07318163e+00  1.09762125e+01]
 [ 7.47524261e+01  4.02863655e+01  2.49527264e+01  7.45860596e+01]
 [ 3.99090916e-01  6.60303056e-01  1.68181813e+00  3.10606062e-01]
 [ 1.00414972e+03  1.01017078e+03  1.01793933e+03  9.91051697e+02]
 [ 2.68424249e+00  3.58393931e+00  7.80454540e+00  1.95030308e+00]
 [ 1.50363636e+00  1.47272730e+00  1.49636364e+00  1.95030308e+00]
 [ 3.62848496e+00  7.07121229e+00  7.56045437e+00  3.59151506e+00]
 [ 1.04454541e+00  9.99696970e-01  1.01939392e+00  5.59090912e-01]]
SIGMA: 
 [[ 1.4457471e+02  2.8291887e+01  7.6160531e+00  3.7079231e+01
   1.6194342e-01  1.6221891e+01  3.4787211e+00 -1.7164505e-01
  -1.7499726e+00  2.2594477e-01]
 [ 2.8291887e+01  2.8425623e+02  2.7601900e+01  1.7813998e+02
  -1.6614841e+00  4.3973594e+00 -2.2474716e+00  6.0635775e-01
  -1
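The per-example loop used for $\Sigma$ can also be written as a single matrix product. The sketch below checks that both forms agree. It uses synthetic random data rather than the weather dataset, and NumPy rather than TensorFlow; both are assumptions made to keep it self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)                   # synthetic data, not the weather dataset
M, D, K = 50, 3, 4
X = rng.normal(size=(M, D))
labels = np.arange(M) % K                        # every class appears at least once
Y = np.eye(K)[labels]                            # one-hot labels, shape (M, K)
mu = (X.T @ Y) / Y.sum(axis=0)                   # class means, shape (D, K)

# Per-example accumulation, mirroring the loop in the cell above.
sigma_loop = np.zeros((D, D))
for i in range(M):
    diff = (X[i] - mu[:, labels[i]])[:, None]    # column vector, shape (D, 1)
    sigma_loop += diff @ diff.T / M

# Vectorised form: Y @ mu.T selects each example's class mean in one product.
centered = X - Y @ mu.T                          # shape (M, D)
sigma_vec = centered.T @ centered / M

print(np.allclose(sigma_loop, sigma_vec))
```

In TensorFlow the same idea could be expressed as `X - tf.matmul(Y, MU, transpose_b=True)` followed by one `tf.matmul`, which avoids `M` separate `assign` calls.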

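### STAGE 3: COMPUTING $\theta_{1i}$ AND $\theta_{0i}$

With these parameters in hand, we can compute the $\theta$ values. Here $\theta_{1i}$ refers to the vector $\theta_i$ from the Formula section; the code stores it negated (as $-\theta_i$) to match the sign in the exponent.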

In [253]:
THETA_LIST = []

# mu_i^T Sigma^{-1} mu_i for every class i
mu_t_sigma_inv_mu = []
for i in range(K):
    MU_COL = tf.expand_dims(MU[:, i], axis=1)
    RESULT = tf.matmul(tf.transpose(MU_COL), tf.matmul(SIGMA_INV, MU_COL))
    mu_t_sigma_inv_mu.append(RESULT)

mu_t_sigma_inv_mu = tf.reshape(tf.convert_to_tensor(mu_t_sigma_inv_mu, dtype=tf.float32), [-1])

# compute theta_1i and theta_0i, combine them, and append the result to THETA_LIST
for i in range(K):
    THETA_1 = tf.matmul(SIGMA_INV, tf.subtract(tf.tile(tf.expand_dims(MU[:, i], axis=1), [1, K]), MU)) * -1
    THETA_0 = tf.expand_dims(tf.math.log(PHI / PHI[i]) + 0.5 * tf.subtract(mu_t_sigma_inv_mu[i], mu_t_sigma_inv_mu), axis=0)
    THETA_LIST.append(tf.concat([THETA_1, THETA_0], axis=0))
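Building one weight matrix per class is not the only option. Because each exponent $-\theta_i^Tx + \theta_{0i}$ is a difference of two per-class linear scores, a single $D \times K$ weight matrix followed by a softmax should give the same posteriors. The sketch below checks this numerically. The parameter values and names such as `weights`, `biases`, and `pairwise` are assumptions for illustration, and NumPy stands in for TensorFlow.

```python
import numpy as np

# Hypothetical parameters: K = 3 classes, D = 2 features.
phi = np.array([0.2, 0.5, 0.3])
mu = np.array([[1.0, -1.0, 0.5],
               [0.0, 2.0, -1.5]])                # column k is mu_k
sigma_inv = np.linalg.inv(np.array([[1.5, 0.2],
                                    [0.2, 0.8]]))
x = np.array([0.3, 0.7])

# One linear score per class:
# s_k = mu_k^T Sigma^{-1} x - 0.5 * mu_k^T Sigma^{-1} mu_k + ln(phi_k)
weights = sigma_inv @ mu                         # shape (D, K)
quad = np.sum(mu * weights, axis=0)              # mu_k^T Sigma^{-1} mu_k for each k
biases = np.log(phi) - 0.5 * quad
scores = x @ weights + biases

# Softmax, shifted by the largest score for numerical stability.
shifted = np.exp(scores - scores.max())
softmax_posterior = shifted / shifted.sum()

def pairwise(k):
    """Posterior of class k via the per-class theta parameters."""
    theta = sigma_inv @ (mu[:, [k]] - mu)        # column i is theta_i
    theta_0 = np.log(phi / phi[k]) + 0.5 * (quad[k] - quad)
    return 1.0 / np.exp(-x @ theta + theta_0).sum()

pairwise_posterior = np.array([pairwise(k) for k in range(phi.size)])

print(softmax_posterior)
print(pairwise_posterior)
```

The softmax form needs only one matrix product per prediction, and subtracting the maximum score guards against overflow in `exp`.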


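### STAGE 4: PREDICTION AND TESTING
Now that we have all the parameters, we are ready to make predictions and to check the accuracy of our Gaussian Discriminant Analysis model. Note that `calculate_accuracy` averages the probability the model assigns to each example's true class, rather than counting how often the most probable class is the correct one.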

In [263]:
bias = tf.constant([1], dtype=tf.float32)

def H(row_tensor_feature):
    prediction = []

    # append a constant 1 so the last row of each weight matrix acts as the intercept
    expanded = tf.expand_dims(tf.concat([row_tensor_feature, bias], axis=0), axis=0)
    for i in range(K):
        XW = tf.matmul(expanded, THETA_LIST[i])
        prediction.append(1 / tf.reduce_sum(tf.exp(XW)))
    
    return tf.convert_to_tensor(prediction, dtype=tf.float64)

def calculate_accuracy():
    accuracy = 0
    
    for i in range(M):
        result = H(X[i])
        accuracy += result[y[i]] / M

    return accuracy * 100

print("ACCURACY ACROSS ALL DATA: ", calculate_accuracy().numpy())

ACCURACY ACROSS ALL DATA:  75.15319901837556
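The reported figure is the mean probability assigned to the true class. A conventional accuracy score would instead count how often the most probable class matches the label. The sketch below contrasts the two on a few made-up posteriors; the numbers are hypothetical and say nothing about the weather dataset.

```python
import numpy as np

# Hypothetical predicted posteriors for 4 examples and 3 classes (rows sum to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.4, 0.5, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
true_labels = np.array([0, 1, 2, 1])

# Mean probability assigned to the true class (what calculate_accuracy averages).
true_class_probs = probs[np.arange(len(true_labels)), true_labels]
mean_true_prob = true_class_probs.mean() * 100

# Classification accuracy: share of examples whose most probable class is correct.
argmax_accuracy = (probs.argmax(axis=1) == true_labels).mean() * 100

print(mean_true_prob, argmax_accuracy)
```

The two numbers can differ noticeably, so it helps to state which one is being reported.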
