
Question 1. Gaussian Naïve Bayes Classifier

Implement a Gaussian Naïve Bayes classifier on the Iris dataset from sklearn.datasets using:

(i) Step-by-step implementation
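
For reference, the decision rule that the step-by-step implementation below evaluates can be written out as follows. The symbols $\pi_c$, $\mu_{c,j}$ and $\sigma_{c,j}^2$ are notation introduced here (not from the assignment text) for the class prior, the per-class feature mean and the per-class feature variance; they correspond to `priors`, `mean` and `var` in the code. The code additionally adds a small constant `eps` to each variance, which is omitted here for readability.

```latex
\hat{y} = \arg\max_{c} \left[ \log \pi_c + \sum_{j=1}^{d} \log \mathcal{N}\!\left(x_j \mid \mu_{c,j}, \sigma_{c,j}^2\right) \right],
\qquad
\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

Working with log probabilities turns the product of per-feature likelihoods into a sum, which avoids multiplying many small numbers together.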

In [2]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score


iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)



class GaussianNBManual:

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mean = {}
        self.var = {}
        self.priors = {}

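        for c in self.classes:
            X_c = X[y == c]  # rows belonging to class c
            self.mean[c] = X_c.mean(axis=0)  # per-feature mean
            self.var[c] = X_c.var(axis=0)  # per-feature population variance (ddof=0)
            self.priors[c] = X_c.shape[0] / X.shape[0]  # class frequency as prior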

    def gaussian(self, x, mean, var):
        # Per-feature Gaussian density; eps keeps the variance strictly positive.
        eps = 1e-6
        coef = 1 / np.sqrt(2 * np.pi * (var + eps))
        exponent = np.exp(-((x - mean) ** 2) / (2 * (var + eps)))
        return coef * exponent

    def predict(self, X):
        predictions = []
        for sample in X:
            posteriors = []

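            for c in self.classes:
                # Log posterior up to a constant: log prior + sum of per-feature log likelihoods.
                prior = np.log(self.priors[c])
                class_conditional = np.sum(np.log(self.gaussian(sample, self.mean[c], self.var[c])))
                posterior = prior + class_conditional
                posteriors.append(posterior)

            # Map the winning index back to its class label (labels need not be 0..n-1).
            predictions.append(self.classes[np.argmax(posteriors)])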

        return np.array(predictions)


model = GaussianNBManual()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Manual Gaussian NB Accuracy:", accuracy_score(y_test, y_pred))


Manual Gaussian NB Accuracy: 0.9777777777777777
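
One possible refinement of the manual class above: it takes `np.log` of a density that was first computed with `np.exp`, so a feature value far from the class mean can underflow to 0 and give a log likelihood of `-inf`. The sketch below evaluates the log density directly instead. The function name `gaussian_log_pdf` and the test values are illustrative assumptions, not part of the assignment.

```python
import numpy as np


def gaussian_log_pdf(x, mean, var, eps=1e-6):
    """Log of the univariate Gaussian density, evaluated per feature.

    Hypothetical helper: eps plays the same role as in the manual class.
    """
    var = var + eps
    return -0.5 * np.log(2.0 * np.pi * var) - (x - mean) ** 2 / (2.0 * var)


# An extreme point, chosen only to trigger underflow in the exp-then-log route.
x = np.array([100.0])
mean = np.array([0.0])
var = np.array([1.0])

# exp(-5000) underflows to 0.0, so the log becomes -inf (warning silenced here).
with np.errstate(divide="ignore"):
    naive = np.log(np.exp(-((x - mean) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var))

# The direct formula stays finite.
stable = gaussian_log_pdf(x, mean, var, eps=0.0)

print("log(exp(...)) route:", naive)
print("direct log-density :", stable)
```

On Iris the two routes should give the same predictions, since no feature lies that far from a class mean; the difference matters mainly for higher-dimensional or poorly scaled data.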


(ii) In-built function

In [3]:
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = GaussianNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("In-built GaussianNB Accuracy:", accuracy_score(y_test, y_pred))


In-built GaussianNB Accuracy: 0.9777777777777777
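
Matching accuracies suggest the two models agree, but it can be more convincing to compare the fitted parameters directly. The sketch below recomputes the per-class means and priors with NumPy and checks them against the `theta_` and `class_prior_` attributes of the fitted `GaussianNB`. The variable names `manual_means` and `manual_priors` are assumptions made for this illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = GaussianNB().fit(X_train, y_train)

# Recompute the statistics the same way the manual class does.
classes = np.unique(y_train)
manual_means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
manual_priors = np.array([(y_train == c).mean() for c in classes])

print("Means match :", np.allclose(clf.theta_, manual_means))
print("Priors match:", np.allclose(clf.class_prior_, manual_priors))
```

The variances are left out of the comparison because scikit-learn applies its own smoothing (controlled by `var_smoothing`), which differs from the fixed `eps = 1e-6` used in the manual version.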


Question 2. Explore the GridSearchCV tool in scikit-learn. Use
this tool to find the best value of K for a K-NN classifier on any dataset.

In [4]:
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

param_grid = {'n_neighbors': list(range(1, 21))}

grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid.fit(X_train, y_train)

print("Best K =", grid.best_params_['n_neighbors'])

best_model = grid.best_estimator_
y_pred = best_model.predict(X_test)

print("Accuracy with best K:", accuracy_score(y_test, y_pred))


Best K = 1
Accuracy with best K: 1.0
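
The output above reports only the selected K. To see how close the other candidates were, the cross-validation scores for every K can be read from `cv_results_`. The sketch below repeats the same search and lists the mean and standard deviation of the 5-fold accuracy for each K, then shows which values tie for the best mean score. The names `ks` and `tied` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

grid = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": list(range(1, 21))}, cv=5)
grid.fit(X_train, y_train)

# One entry per candidate, in the order the grid was evaluated.
ks = np.array([p["n_neighbors"] for p in grid.cv_results_["params"]])
mean_scores = grid.cv_results_["mean_test_score"]
std_scores = grid.cv_results_["std_test_score"]

for k, m, s in zip(ks, mean_scores, std_scores):
    print(f"K={k:2d}  mean CV accuracy={m:.4f}  std={s:.4f}")

# Candidates whose mean score equals the best one (possibly just a single K).
tied = ks[np.isclose(mean_scores, mean_scores.max())]
print("K values with the best mean score:", tied)
```

If several K values share the best mean score, the reported "best" K is only one of several equally good choices on this split, so the full table is worth a look before settling on a value.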
