# Gaussian Naive Bayes - Classification of Iris
**Uzair Ahmad**

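Here's a simple example that trains a Gaussian Naive Bayes classifier on the famous Iris dataset bundled with the `scikit-learn` library. The dataset contains 150 samples: 3 classes of 50 instances each, where each class is a species of iris plant. Each sample is described by four numeric features (sepal length, sepal width, petal length, and petal width).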


### Step 1: Loading the Dataset

First, we'll load the Iris dataset from `scikit-learn`.


In [1]:
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target
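
As a quick sanity check on what was just loaded, the sketch below prints the shape of the feature matrix, the feature and class names, and the number of samples per class. The use of `np.bincount` is an addition for illustration and is not part of the original notebook.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Feature matrix: one row per flower, one column per measurement
print("X shape:", X.shape)
print("Features:", iris.feature_names)

# Integer labels 0, 1, 2 map to these species names
print("Classes:", list(iris.target_names))

# Number of samples in each class
print("Samples per class:", np.bincount(y))
```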


### Step 2: Splitting the Dataset

Before training our model, we'll split the dataset into a training set and a test set. This allows us to evaluate the model's performance on unseen data.


In [2]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
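
To see what the split produced, the sketch below prints the sizes of the two sets and the class counts in the training set. It also shows an optional variant that passes `stratify=y` so each class keeps the same proportion in both sets. The stratified variant and the `_s` variable names are assumptions added for illustration; the rest of this notebook uses the plain split above.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Same split as in the notebook: 70% training, 30% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
print("Train/test sizes:", X_train.shape, X_test.shape)
print("Training samples per class:", np.bincount(y_train))

# Optional variant (not used elsewhere in this notebook): stratify on the labels
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print("Stratified training samples per class:", np.bincount(y_train_s))
```

A plain random split can leave the classes slightly unbalanced in the training set, while the stratified variant keeps them equal here because the full dataset is balanced.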



### Step 3: Training the Gaussian Naive Bayes Classifier

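Now, we'll train the Gaussian Naive Bayes classifier using the training data. Fitting estimates a prior probability for each class and, for every class and feature, a mean and a variance; these are the parameters printed in the next cell.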



In [3]:
from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()
gnb.fit(X_train, y_train)


In [6]:
# Access the parameters
print("Class Priors (class_prior_):")
print(gnb.class_prior_)

print("\nClass Counts (class_count_):")
print(gnb.class_count_)

print("\nMeans of each feature per class (theta_):")
print(gnb.theta_)

print("\nVariances of each feature per class (var_):")
print(gnb.var_)

Class Priors (class_prior_):
[0.2952381  0.35238095 0.35238095]

Class Counts (class_count_):
[31. 37. 37.]

Means of each feature per class (theta_):
[[4.96451613 3.37741935 1.46451613 0.2483871 ]
 [5.86216216 2.72432432 4.21081081 1.3027027 ]
 [6.55945946 2.98648649 5.54594595 2.00540541]]

Variances of each feature per class (var_):
[[0.1119667  0.13658689 0.03325703 0.01152966]
 [0.27532506 0.08724617 0.23934259 0.04134405]
 [0.42241052 0.09630387 0.28842951 0.08591673]]



### Step 4: Making Predictions

With our trained model, we can now make predictions on the test set.



In [7]:
y_pred = gnb.predict(X_test)
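
Besides hard class labels, the fitted model can also report how confident it is in each prediction. The sketch below, which refits the model so that it runs on its own, uses `predict_proba` to print the predicted species and the class probabilities for the first five test samples. Showing only five rows is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
gnb = GaussianNB().fit(X_train, y_train)

y_pred = gnb.predict(X_test)
proba = gnb.predict_proba(X_test)  # shape: (n_test_samples, n_classes)

# Predicted species and class probabilities for the first five test samples
for label, p in zip(y_pred[:5], proba[:5]):
    print(iris.target_names[label], np.round(p, 3))
```

Each row of `proba` sums to 1, and the label returned by `predict` is the class with the largest probability in that row.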


### Step 5: Evaluating the Model

To evaluate the performance of our classifier, we can use metrics such as accuracy.



In [8]:
from sklearn.metrics import accuracy_score

print("Accuracy:", accuracy_score(y_test, y_pred))

Accuracy: 0.9777777777777777
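
Accuracy is a single number, so it does not show which classes get confused with each other. As one possible complement, the sketch below prints a confusion matrix and a per-class report using `confusion_matrix` and `classification_report` from `sklearn.metrics`. These two metrics are additions for illustration and are not used in the original notebook.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
gnb = GaussianNB().fit(X_train, y_train)
y_pred = gnb.predict(X_test)

acc = accuracy_score(y_test, y_pred)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall, and F1 score for each species
print(classification_report(y_test, y_pred, target_names=iris.target_names))
```

The diagonal of the confusion matrix counts the correct predictions, so its sum divided by the total number of test samples equals the accuracy.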


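Running the code prints the accuracy of the Gaussian Naive Bayes classifier on the test set of the Iris dataset. Here it is about 0.978, which corresponds to 44 of the 45 test samples classified correctly. Typically, the GNB classifier does quite well on this dataset.

Because `random_state=42` fixes the split made by `train_test_split`, rerunning the code reproduces the same result. The accuracy will vary slightly if you change or remove `random_state`, since the model is then trained and tested on a different random split.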
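
To get a feel for how much the score depends on the particular split, one option is k-fold cross-validation, which trains and tests the model on several different partitions of the data. The sketch below uses `cross_val_score` with 5 folds; the number of folds is an assumption chosen for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
X, y = iris.data, iris.target

# One accuracy score per fold; each fold serves as the test set exactly once
scores = cross_val_score(GaussianNB(), X, y, cv=5)

print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
print("Standard deviation:", scores.std())
```

The spread of the fold accuracies gives a rough idea of how far a single train/test split can land from the average.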