### Q1

We need the distance from every sample $x$ to every class $k$.

With $N$ samples and $K$ classes, that gives $N \times K$ distance values to collect.
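As a rough illustration of the $N \times K$ count, the sketch below builds the full distance matrix with NumPy broadcasting. The sizes, the random data, and the choice of Euclidean distance to one representative point per class are assumptions made for the example; the question itself does not fix a particular distance.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, D = 5, 3, 2                     # illustrative sizes, not from the assignment
X = rng.normal(size=(N, D))           # N samples with D features each
centers = rng.normal(size=(K, D))     # one representative point per class (assumed)

# Broadcasting: (N, 1, D) - (1, K, D) -> (N, K, D); the norm over the last axis
# leaves one distance per (sample, class) pair.
dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)

print(dists.shape)                    # one row per sample, one column per class
nearest = dists.argmin(axis=1)        # index of the closest class for each sample
```

The matrix has exactly `N * K` entries, matching the count above.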

### Q2

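Covariance: $\operatorname{Cov}(X,Y) = E[(X - E[X])(Y - E[Y])]$

Correlation: $\rho_{X,Y} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}$

In a covariance matrix, the diagonal entries are the variances, so each standard deviation $\sigma$ is the square root of the corresponding diagonal entry.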

In [20]:
import numpy as np
C = np.array([[3.8600, -0.0200, 0.0100],[ -0.0200, 0.0100, -0.0200],[0.0100, -0.0200, 0.0700]])

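# Diagonal matrix of variances: the identity mask keeps only the diagonal of C
e = np.eye(len(C))
S = e*C
# Element-wise square root gives the diagonal matrix of standard deviations
sqrt_S = np.power(S, 0.5)
S_inv = np.linalg.inv(sqrt_S)
# Correlation matrix: R = S^(-1/2) C S^(-1/2)
R = S_inv.dot(C).dot(S_inv)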
R

array([[ 1.        , -0.10179732,  0.01923789],
       [-0.10179732,  1.        , -0.75592895],
       [ 0.01923789, -0.75592895,  1.        ]])
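As a cross-check, the same correlation matrix can be obtained without a matrix inverse by dividing each entry of $C$ by the product of the two corresponding standard deviations. This is only an alternative sketch of the same computation; the variable names `std` and `R` are chosen for the example.

```python
import numpy as np

C = np.array([[3.86, -0.02, 0.01],
              [-0.02, 0.01, -0.02],
              [0.01, -0.02, 0.07]])

std = np.sqrt(np.diag(C))        # standard deviations from the diagonal variances
R = C / np.outer(std, std)       # R[i, j] = C[i, j] / (std[i] * std[j])
print(np.round(R, 4))
```

The result agrees with the matrix printed above, with ones on the diagonal.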

### Q3

In [None]:
import numpy as np
import pandas
from scipy.stats import multivariate_normal
breast_train = pandas.read_csv('BreastCancerTrain.csv',header=None)
breast_test = pandas.read_csv('BreastCancerValidation.csv',header=None)

breast_train.columns = ['1','2','3','4','5','6','7','8','9','10']
breast_test.columns = ['1','2','3','4','5','6','7','8','9','10']

In [43]:
pos_data = breast_train[breast_train['10']==1]
neg_data = breast_train[breast_train['10']==0]
pos_data1 = pos_data.iloc[:,:9]
neg_data1 = neg_data.iloc[:,:9]

mu1 = pos_data1.mean() 
mu2 = neg_data1.mean()

S1 = pos_data1.cov() 
S2 = neg_data1.cov()

norm1 = multivariate_normal(mu1, S1)
norm2 = multivariate_normal(mu2, S2)

P1 = len(pos_data1)/(len(pos_data1)+len(neg_data1)) 
P2 = len(neg_data1)/(len(pos_data1)+len(neg_data1))

breast_train_X = breast_train.drop(['10'], axis=1)

pdf1_x = norm1.pdf(breast_train_X)*P1 
pdf2_x = norm2.pdf(breast_train_X)*P2 

post1 = pdf1_x/(pdf1_x + pdf2_x) 
post2 = pdf2_x/(pdf1_x + pdf2_x)

pred_train = post1>post2

# Boolean labels, built without modifying breast_train in place
labels = breast_train['10'] == 1

In [44]:
# Training accuracy (%): share of training samples classified correctly
(sum((labels == pred_train))/len(pred_train))*100

96.82997118155619

In [45]:
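# Note: this cell re-estimates the means, covariances and priors from the
# validation set itself rather than reusing the estimates from In [43]
pos_data = breast_test[breast_test['10']==1]
neg_data = breast_test[breast_test['10']==0]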
pos_data1 = pos_data.iloc[:,:9]
neg_data1 = neg_data.iloc[:,:9]

mu1 = pos_data1.mean() 
mu2 = neg_data1.mean()

S1 = pos_data1.cov() 
S2 = neg_data1.cov()

norm1 = multivariate_normal(mu1, S1)
norm2 = multivariate_normal(mu2, S2)

P1 = len(pos_data1)/(len(pos_data1)+len(neg_data1)) 
P2 = len(neg_data1)/(len(pos_data1)+len(neg_data1))

breast_test_X = breast_test.drop(['10'], axis=1) 

pdf1_x = norm1.pdf(breast_test_X)*P1 
pdf2_x = norm2.pdf(breast_test_X)*P2 

post1 = pdf1_x/(pdf1_x + pdf2_x) 
post2 = pdf2_x/(pdf1_x + pdf2_x)

pred_test = post1>post2

# Boolean labels, built without modifying breast_test in place
labels = breast_test['10'] == 1

In [46]:
# Validation accuracy (%): share of validation samples classified correctly
(sum((labels == pred_test))/len(pred_test))*100

94.64285714285714

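The values above are accuracies, not error rates: the training accuracy is 96.83% (training error 3.17%) and the validation accuracy is 94.64% (validation error 5.36%).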
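Cell In [45] estimates the means, covariances and priors from the validation file before scoring it. A common alternative is to estimate the parameters once from the training data and reuse them on the held-out data. The sketch below shows that pattern on synthetic two-class data, not on the breast cancer files; the helper names `fit_gaussian_classifier` and `predict` are hypothetical, and log-densities are used in place of raw densities for numerical stability.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussian_classifier(X, y):
    """Estimate (prior, mean, covariance) for each class from the given data."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (len(Xc) / len(X), Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def predict(params, X):
    """Pick the class with the largest log prior + log Gaussian density."""
    classes = sorted(params)
    scores = np.column_stack([
        np.log(params[c][0])
        + multivariate_normal(params[c][1], params[c][2]).logpdf(X)
        for c in classes
    ])
    return np.array(classes)[scores.argmax(axis=1)]

# Synthetic data (an assumption for the example): two well-separated Gaussians.
rng = np.random.default_rng(0)

def make_split(n):
    X0 = rng.normal(loc=0.0, size=(n, 2))
    X1 = rng.normal(loc=3.0, size=(n, 2))
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

X_train, y_train = make_split(200)
X_val, y_val = make_split(50)

params = fit_gaussian_classifier(X_train, y_train)   # fitted on training data only
val_pred = predict(params, X_val)                    # applied unchanged to held-out data
val_accuracy = (val_pred == y_val).mean() * 100
print(f"validation accuracy: {val_accuracy:.2f}%, error: {100 - val_accuracy:.2f}%")
```

Comparing log-scores gives the same decision as comparing the posteriors, because both posteriors share the same denominator.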

### Q4

In [18]:
import numpy as np
import pandas
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn import linear_model
reg2d = pandas.read_csv('reg2d.csv',header=None)
x = reg2d.iloc[:,:2]
y = reg2d.iloc[:,-1]
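poly = PolynomialFeatures(degree = 2)
# Expand (x1, x2) into the degree-2 features [1, x1, x2, x1^2, x1*x2, x2^2];
# fit_transform already fits the transformer, so no separate poly.fit call is needed
X_poly = poly.fit_transform(x)
lin1 = linear_model.LinearRegression()
fit1 = lin1.fit(X_poly, y)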
print('Coefficients', lin1.coef_)

Coefficients [ 0.         -0.25697386  0.05128251  1.14226452  0.13806308  0.8996328 ]
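For two inputs, `PolynomialFeatures(degree=2)` orders its columns as $[1, x_1, x_2, x_1^2, x_1 x_2, x_2^2]$, so the printed coefficients follow that order. The leading 0 belongs to the constant column: `LinearRegression` fits the intercept separately by default and stores it in `intercept_`. The sketch below rebuilds the same design matrix by hand with NumPy and solves the least-squares problem directly. It uses synthetic inputs and made-up coefficients, not `reg2d.csv`.

```python
import numpy as np

rng = np.random.default_rng(0)
x1, x2 = rng.uniform(-1, 1, size=(2, 200))   # synthetic inputs (assumption)

# Same column order as PolynomialFeatures(degree=2) for two inputs:
# [1, x1, x2, x1^2, x1*x2, x2^2]
A = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])

true_w = np.array([0.5, -1.0, 2.0, 0.3, 0.0, 1.5])   # made-up coefficients
y = A @ true_w                                        # noise-free target

# Ordinary least squares on the expanded features
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(w, 4))
```

Here the constant column carries the intercept directly, so its coefficient is not forced to 0 as it is in the scikit-learn output.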


### Q5

For each Gaussian component we have:

a. A $D \times D$ diagonal covariance matrix, which has $D$ parameters (one per diagonal element).

b. A $D$-dimensional mean vector, which has $D$ parameters.

c. A mixing weight, which has 1 parameter.

As a result, each Gaussian has $D + D + 1 = 2D + 1$ parameters.

In this question, there are 3 components in each mixture model and 9 features in the dataset ($D = 9$). So each mixture model has $3 \times (2D + 1) = 3 \times 19 = 57$ parameters.

Because the mixing weights must sum to 1, $k$ weights need only $k-1$ free parameters, which leaves $57 - 1 = 56$ free parameters per mixture model.

Each mixture model thus contributes 56 free parameters, so with more than one mixture model the total number of model parameters is more than 56.
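The count can be checked with a few lines of Python. The function name `gmm_free_parameters` is hypothetical, and the sketch assumes diagonal covariances as in the answer above.

```python
def gmm_free_parameters(n_features: int, n_components: int) -> int:
    """Free parameters of one Gaussian mixture with diagonal covariances."""
    means = n_components * n_features       # one D-dimensional mean per component
    variances = n_components * n_features   # D diagonal variances per component
    weights = n_components - 1              # weights sum to 1, so one is determined
    return means + variances + weights

print(gmm_free_parameters(n_features=9, n_components=3))  # 56
```

With 9 features and 3 components this gives $27 + 27 + 2 = 56$ for a single mixture model.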