
In [6]:
import numpy as np
import matplotlib.pyplot as plt

# Bernoulli naive Bayes

Run the cell below to define the following variables:

`X` = Data matrix of shape $(n, d)$. All features are binary, taking values $0$ or $1$.

`y` = Label vector. Labels are $0$ and $1$.

In [2]:
rng = np.random.default_rng(seed=1)
X1 = np.concatenate((rng.binomial(size = 50,n = 1, p =0.7), rng.binomial(size = 50,n = 1, p =0.2))).reshape(-1, 1)
X2 = np.concatenate((rng.binomial(size = 50,n = 1, p =0.6), rng.binomial(size = 50,n = 1, p =0.1))).reshape(-1, 1)
X3 = np.concatenate((rng.binomial(size = 50,n = 1, p =0.6), rng.binomial(size = 50,n = 1, p =0.2))).reshape(-1, 1)
X4 = np.concatenate((rng.binomial(size = 50,n = 1, p =0.8), rng.binomial(size = 50,n = 1, p =0.1))).reshape(-1, 1)

X = np.column_stack((X1,X2,X3,X4))

y = np.concatenate((np.zeros(50, dtype= int), np.ones(50, dtype = int))).reshape(-1, 1)
permute = rng.permuted(range(100))

X = X[permute]
y = y[permute]

In [3]:
print(X.shape)
print(y.shape)

(100, 4)
(100, 1)


## Question 1
If we train the naive Bayes model on the dataset, what will be the value of $\hat{p}$, the estimate for $P(Y=1)$?

---
$$
  \hat{p} = \frac{\text{Number of samples with Y = 1}}{\text{Total number of samples}}
$$


In [4]:
# Enter your solution here
count_y1 = np.sum(y == 1)
total_samples = X.shape[0]
p_hat = count_y1/total_samples
print(f"Estimated P(Y = 1): {p_hat}")

Estimated P(Y = 1): 0.5


## Question 2
What will be the value of $\hat{p}_0^0$, the estimate of $P(f_0=1|y=0)$?  Write your answer correct to two decimal places.

---
Using Bayes' theorem:
$$
  \hat{p}^{label}_{feature}=\hat{p}^0_0 = P(f_0 = 1 | y = 0) = \frac {P(f_0 = 1) \cdot P(y = 0 | f_0 = 1)} {P(y = 0 )}
$$
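
The Bayes expression above reduces to a simple ratio of counts once each probability is replaced by its relative-frequency estimate. This is a short derivation under that assumption; $N(\cdot)$ (number of samples satisfying a condition) and $n$ (total number of samples) are notation introduced here for illustration.

$$
  \hat{p}^0_0
  = \frac{\frac{N(f_0 = 1)}{n} \cdot \frac{N(f_0 = 1,\ y = 0)}{N(f_0 = 1)}}{\frac{N(y = 0)}{n}}
  = \frac{N(f_0 = 1,\ y = 0)}{N(y = 0)}
$$

This count ratio is the quantity the solution cell evaluates.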


In [5]:
# Enter your solution here
def p(feature, label):
  '''
  Estimate P(f_feature = 1 | y = label) by counting.

  For example, p(feature=0, label=0) is the fraction of points with label 0
  whose first feature (index 0) equals 1:
        np.sum((X[:, 0] == 1) & (y.flatten() == 0)) / np.sum(y.flatten() == 0)
  '''
  return np.sum((X[:, feature] == 1) & (y.flatten() == label)) / np.sum(y.flatten() == label)

print(p(feature=0, label=0))

0.68


## Question 3
What will be the value of $\hat{p}_0^1$, the estimate of $P(f_0=1|y=1)$?  Write your answer correct to two decimal places.



In [6]:
# Enter your solution here
print(p(feature=0, label=1))

0.26


## Question 4
What will be the value of $\hat{p}_3^1$, the estimate of $P(f_3=1|y=1)$?  Write your answer correct to two decimal places.




In [7]:
# Enter your solution here
print(p(feature=3, label=1))

0.12


## Question 5

What will be the predicted label for the point $[1, 0, 1, 0]$?

---
The Bernoulli Naive Bayes formula is:
$$
  P(Y=y | X=x) \propto P(Y=y) \prod_{i=0}^{d-1}P(f_i=x_i | Y=y)
$$
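
The product above can also be evaluated in log space, which avoids multiplying many small numbers. The following is a minimal sketch, not the notebook's solution: the function name `bernoulli_nb_log_scores` and the parameter values are hypothetical, and it assumes every feature probability lies strictly between 0 and 1.

```python
import numpy as np

def bernoulli_nb_log_scores(x, class_priors, feature_probs):
    """Unnormalised log-posterior score for each class.

    x             : (d,) binary feature vector
    class_priors  : (K,) estimates of P(Y = k)
    feature_probs : (K, d) estimates of P(f_i = 1 | Y = k), strictly in (0, 1)
    """
    x = np.asarray(x)
    # log P(f_i = x_i | Y = k): use log(p) where x_i = 1 and log(1 - p) where x_i = 0
    log_lik = x * np.log(feature_probs) + (1 - x) * np.log(1 - feature_probs)
    return np.log(class_priors) + log_lik.sum(axis=1)

# Hypothetical parameters, chosen only for illustration
# (they are not estimated from the notebook's data).
demo_priors = np.array([0.5, 0.5])
demo_probs = np.array([[0.8, 0.3, 0.6],
                       [0.2, 0.7, 0.5]])

demo_scores = bernoulli_nb_log_scores([1, 0, 1], demo_priors, demo_probs)
demo_label = int(np.argmax(demo_scores))  # class with the larger score
print(demo_scores, demo_label)
```

Because the logarithm is monotonic, taking the `argmax` of the log scores selects the same class as comparing the raw products.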


In [8]:
# Enter your solution here
def bernoulli_nb_prob(x, class_prob, feature_probs):
    prob = class_prob
    for i, xi in enumerate(x):
        prob *= feature_probs[i] if xi == 1 else (1 - feature_probs[i])
    return prob

# Probabilities for P(Y=0) and P(Y=1)
p_y0 = np.sum(y == 0) / len(y)
p_y1 = np.sum(y == 1) / len(y)

# Feature probabilities conditioned on Y=0 and Y=1
p_features_given_y0 = np.sum(X[y.flatten() == 0], axis=0) / np.sum(y.flatten() == 0)
p_features_given_y1 = np.sum(X[y.flatten() == 1], axis=0) / np.sum(y.flatten() == 1)

x1 = np.array([1, 0, 1, 0])
p_x1_y0 = bernoulli_nb_prob(x1, p_y0, p_features_given_y0)
p_x1_y1 = bernoulli_nb_prob(x1, p_y1, p_features_given_y1)
predicted_label_x1 = 1 if p_x1_y1 > p_x1_y0 else 0
print(f"Predicted label for [1,0,1,0]: {predicted_label_x1}")

Predicted label for [1,0,1,0]: 1


## Question 6

What will be the predicted label for the point $[1, 0, 1, 1]$?



In [9]:
# Enter your solution here
x2 = np.array([1, 0, 1, 1])
p_x2_y0 = bernoulli_nb_prob(x2, p_y0, p_features_given_y0)
p_x2_y1 = bernoulli_nb_prob(x2, p_y1, p_features_given_y1)
predicted_label_x2 = 1 if p_x2_y1 > p_x2_y0 else 0
print(f"Predicted label for [1,0,1,1]: {predicted_label_x2}")

Predicted label for [1,0,1,1]: 0


# Gaussian naive Bayes

Run the cell below to define the following variables:

`X_train` = Training dataset of shape $(n, d)$. Each example is drawn from a multivariate Gaussian distribution.

`y_train` = Label vector for the corresponding training examples. Labels are $0$ and $1$.

`X_test` = Test dataset of shape $(m, d)$, where $m$ is the number of examples in the test dataset. Each example is drawn from a multivariate Gaussian distribution.

`y_test` = Label vector for the corresponding test examples. Labels are $0$ and $1$.



In [2]:
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

# generate artificial data points
X, y = make_blobs(n_samples = 100,
                  n_features=2,
                  centers=[[5,5],[10,10]],
                  cluster_std=1.5,
                  random_state=2)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=123)

## Question 7

How many examples are there in the training dataset?



In [3]:
# Enter your solution here
X_train.shape[0]

80

## Question 8
How many features are there in the dataset?



In [4]:
# Enter your solution here
X_train.shape[1]

2

## Question 9

If we train the Gaussian naive Bayes model on the training dataset, what will be the value of $\hat{p}$, the estimate for $P(Y=1)$? Write your answer correct to two decimal places.





In [7]:
# Enter your solution here
def priors(y):
    return np.sum(y == 1)/len(y)

priors(y_train)

0.4875

## Question 10

If $\hat{\mu}_0 = [\mu_1, \mu_2, \dots, \mu_d]$ is the estimate for $\mu_0$, the mean of the examples labeled $0$, what will be the value of $\mu_1+\mu_2+\dots+\mu_d$? Write your answer correct to two decimal places.



In [8]:
# Enter your solution here
X_train_y0 = X_train[y_train == 0]
mu_y0 = X_train_y0.mean(axis=0)
sum_mu_y0 = mu_y0.sum()
round(sum_mu_y0, 2)

9.58

We will use a different covariance matrix for each class. The estimate for $\Sigma_k$ will be

$$\hat{\Sigma}_k = \text{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_d^2)$$ where $\sigma_i^2$ is the variance of the $i^{th}$ feature among the examples labeled $k$.
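
As a small illustration of a per-class diagonal covariance estimate, the sketch below builds one from per-feature variances on made-up data. The names `X_k`, `feature_variances`, and `sigma_hat_k` are hypothetical, and the variance is the population form (`ddof=0`), which is NumPy's default.

```python
import numpy as np

# Hypothetical examples belonging to one class (rows = examples, columns = features)
X_k = np.array([[1.0, 2.0],
                [3.0, 6.0],
                [5.0, 4.0]])

feature_variances = X_k.var(axis=0)       # variance of each feature (ddof=0)
sigma_hat_k = np.diag(feature_variances)  # diagonal covariance estimate

# The trace of a diagonal matrix is the sum of its diagonal entries,
# so it equals the sum of the per-feature variances.
print(sigma_hat_k)
print(np.trace(sigma_hat_k), feature_variances.sum())
```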



## Question 11
What will be the value of $\text{trace}({\hat{\Sigma}}_0)$?  Write your answer correct to two decimal places.

---
For Gaussian naive Bayes, we treat the covariance matrix $\hat{\Sigma}_0$ as a diagonal matrix, where each diagonal element is the variance of one feature.

$$
\hat{\Sigma}_0 =
  \begin{bmatrix}
      \sigma_1^2 & 0 & 0 & \dots & 0 \\
      0 & \sigma_2^2 & 0 & \dots & 0 \\
      0 & 0 & \sigma_3^2 & \dots & 0 \\
      \vdots & \vdots & \vdots & \ddots & \vdots \\
      0 & 0 & 0 & \dots & \sigma_d^2
  \end{bmatrix}
$$
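
A diagonal covariance matrix is what makes the model "naive": the multivariate Gaussian density factorises into a product of univariate densities, one per feature. The identity below is a standard result, written here with $\mu_i$ and $\sigma_i^2$ denoting the mean and variance of feature $i$ for the class in question.

$$
\frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)
= \prod_{i=1}^{d} \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left(-\frac{(x_i-\mu_i)^2}{2\sigma_i^2}\right)
$$

This holds because, for diagonal $\Sigma$, the determinant is $\prod_i \sigma_i^2$ and the inverse is the diagonal matrix with entries $1/\sigma_i^2$.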







In [9]:
# Enter your solution here
X_train_y0 = X_train[y_train == 0]
variances_y0 = X_train_y0.var(axis=0)
trace_sigma_0 = variances_y0.sum()
round(trace_sigma_0, 2)

4.44

## Question 12

Once we have estimated all the parameters for Gaussian naive Bayes, assuming a different covariance matrix for each class, we predict the labels for the training examples. What will be the training accuracy?

Accuracy is defined as the proportion of correctly classified examples.  Write your answer correct to two decimal places.
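
The definition of accuracy translates directly into a one-line NumPy expression. The sketch below uses made-up label vectors (`demo_true` and `demo_pred` are hypothetical names) rather than the notebook's data.

```python
import numpy as np

# Hypothetical true and predicted labels, for illustration only
demo_true = np.array([0, 1, 1, 0, 1])
demo_pred = np.array([0, 1, 0, 0, 1])

# Fraction of positions where the prediction matches the true label (4 of 5 here)
demo_accuracy = np.mean(demo_true == demo_pred)
print(demo_accuracy)  # 0.8
```

One caveat worth keeping in mind: both arrays should have the same shape. Comparing a vector of shape `(n,)` with one of shape `(n, 1)` broadcasts to an `(n, n)` matrix, so a column vector should be flattened before the comparison.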




In [13]:
# Enter your solution here
def gaussian_pdf(sample, prior, means, variances, label):
    # Density of `sample` under the Gaussian for class `label`, using a diagonal
    # covariance built from the per-feature variances. `prior` is not used here;
    # it is kept so that the call in predict_labels stays unchanged.
    covariance_matrix = np.diag(variances[label])  # Diagonal covariance matrix for the class
    normalization_factor = (
        (2 * np.pi) ** (sample.shape[0] / 2) * np.linalg.det(covariance_matrix) ** 0.5
    )
    exponent_term = -0.5 * ((sample - means[label]).T @ np.linalg.inv(covariance_matrix) @ (sample - means[label]))
    return (1 / normalization_factor) * np.exp(exponent_term)

def predict_labels(X, prior, means, variances, labels):
    log_posterior = np.zeros((X.shape[0], len(labels)))  # log posterior matrix

    for sample_idx, sample in enumerate(X):
        for class_idx, label in enumerate(labels):
            # log of likelihood + log prior for each class
            log_posterior[sample_idx, class_idx] = (
                np.log(gaussian_pdf(sample, prior, means, variances, label)) +
                np.log(prior if label == 1 else (1 - prior))
            )

    # label prediction
    return np.argmax(log_posterior, axis=1)

p = np.mean(y_train == 1)
labels = np.unique(y_train)  # e.g., [0, 1] for binary classification
mu = np.array([X_train[y_train == label].mean(axis=0) for label in labels])  # Mean for each class
sigma = np.array([X_train[y_train == label].var(axis=0) for label in labels])  # Variance for each class
training_accuracy = np.mean(predict_labels(X_train, p, mu, sigma, labels) == y_train)
print(f"Training Accuracy: {training_accuracy}")

Training Accuracy: 0.9875
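
Since the covariance is diagonal, the log-density can be computed without a determinant or matrix inverse, and for all samples and classes at once. The following is an alternative sketch under that assumption, not the solution above: the function names and the demo parameters are hypothetical.

```python
import numpy as np

def gaussian_nb_log_likelihood(X, means, variances):
    """Log-density of each row of X under each class's diagonal Gaussian.

    X: (n, d), means: (K, d), variances: (K, d). Returns an (n, K) array.
    """
    diff = X[:, None, :] - means[None, :, :]  # (n, K, d) deviations from class means
    per_feature = np.log(2 * np.pi * variances)[None, :, :] + diff ** 2 / variances[None, :, :]
    return -0.5 * per_feature.sum(axis=2)     # sum of univariate log-densities

def gaussian_nb_predict(X, class_priors, means, variances):
    """Index of the class with the largest log prior + log likelihood."""
    scores = np.log(class_priors)[None, :] + gaussian_nb_log_likelihood(X, means, variances)
    return np.argmax(scores, axis=1)

# Hypothetical parameters and points, chosen only for illustration
demo_means = np.array([[0.0, 0.0], [4.0, 4.0]])
demo_vars = np.array([[1.0, 1.0], [1.0, 1.0]])
demo_class_priors = np.array([0.5, 0.5])
demo_points = np.array([[0.1, -0.2], [3.8, 4.1]])

demo_labels = gaussian_nb_predict(demo_points, demo_class_priors, demo_means, demo_vars)
print(demo_labels)
```

Working with log-densities directly also sidesteps taking the logarithm of a density that has underflowed to zero for points far from a class mean.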


## Question 13

What will be the test accuracy?

Accuracy is defined as the proportion of correctly classified examples.  




In [15]:
# Enter your solution here
test_accuracy = np.mean(predict_labels(X_test, p, mu, sigma, labels) == y_test)
print(f"Test Accuracy: {test_accuracy}")

Test Accuracy: 1.0
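
As an optional cross-check, scikit-learn's `GaussianNB` fits the same kind of model (per-class means and per-feature variances). The sketch below regenerates the data with the same settings as the data cell so that it is self-contained. Note that `GaussianNB` adds a small `var_smoothing` term to the variances, so its scores are not guaranteed to match the hand-written implementation exactly.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Same data-generation settings as the notebook's data cell
X, y = make_blobs(n_samples=100,
                  n_features=2,
                  centers=[[5, 5], [10, 10]],
                  cluster_std=1.5,
                  random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=123)

clf = GaussianNB().fit(X_train, y_train)

sk_train_acc = clf.score(X_train, y_train)  # mean accuracy on the training set
sk_test_acc = clf.score(X_test, y_test)     # mean accuracy on the test set
print(f"sklearn training accuracy: {sk_train_acc}")
print(f"sklearn test accuracy: {sk_test_acc}")
print(f"class means: {clf.theta_}")
print(f"class priors: {clf.class_prior_}")
```

The fitted `theta_` and `class_prior_` attributes can be compared against the manually computed class means and priors from the earlier questions.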
