##  **Naive Bayes Classifier for Iris Flower Dataset**

### 1. Hold-Out Validation: Importing the Dataset and Splitting
##### Loads the Iris dataset and splits it into training and test sets (80/20), stratifying by class so that both sets keep the same class proportions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=60
)
```
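
To make the effect of `stratify=y` concrete, here is a minimal NumPy-only sketch of a stratified split. The helper `stratified_split_indices` is a hypothetical name used for illustration; it is not how scikit-learn implements `train_test_split`, only an approximation of the idea (shuffle within each class, then cut each class at the same ratio).

```python
import numpy as np

def stratified_split_indices(labels, test_size=0.2, seed=0):
    """Hypothetical helper: return (train_idx, test_idx) with class proportions preserved."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)  # positions of this class
        rng.shuffle(cls_idx)                     # shuffle within the class
        n_test = int(round(test_size * len(cls_idx)))
        test_idx.extend(cls_idx[:n_test])
        train_idx.extend(cls_idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Toy labels shaped like Iris: 3 classes with 50 samples each
labels = np.repeat([0, 1, 2], 50)
train_idx, test_idx = stratified_split_indices(labels)

print(np.bincount(labels[train_idx]))  # [40 40 40]
print(np.bincount(labels[test_idx]))   # [10 10 10]
```

With 50 samples per class and an 80/20 ratio, every class contributes 40 training and 10 test samples, which is the balance the notebook's split relies on.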


### 2. Likelihood: Calculating Mean and Standard Deviation
##### Computes the per-class mean and standard deviation of each feature from the training set, and defines a Gaussian likelihood function that models the distribution of each feature given the class.

```python
# Calculate mean and standard deviation for each feature and class
num_classes = len(np.unique(y))
num_features = X.shape[1]

mean_features = np.zeros((num_classes, num_features))
std_features = np.zeros((num_classes, num_features))

for class_idx in range(num_classes):
    # Use only the training samples that belong to this class
    X_class = X_train[y_train == class_idx]
    mean_features[class_idx] = np.mean(X_class, axis=0)
    std_features[class_idx] = np.std(X_class, axis=0)

def gaussian_likelihood(x, mean, std):
    """Gaussian probability density of x for the given mean and std (element-wise)."""
    numerator = np.exp(-0.5 * ((x - mean) / std) ** 2)
    denominator = np.sqrt(2 * np.pi) * std
    return numerator / denominator
```
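
As a quick sanity check of the density function, the sketch below evaluates it at the mean of a standard normal, where the value should be 1 / sqrt(2π). The function is repeated so the block runs on its own, and the sample, mean, and standard deviation vectors are made-up values for illustration, not statistics taken from the notebook.

```python
import numpy as np

def gaussian_likelihood(x, mean, std):
    numerator = np.exp(-0.5 * ((x - mean) / std) ** 2)
    denominator = np.sqrt(2 * np.pi) * std
    return numerator / denominator

# At x == mean with std == 1 the density equals 1 / sqrt(2 * pi)
peak = gaussian_likelihood(0.0, 0.0, 1.0)
print(round(float(peak), 4))  # 0.3989

# The function broadcasts over feature vectors: one density per feature
x = np.array([5.0, 3.4, 1.5, 0.2])        # hypothetical sample
mean = np.array([5.0, 3.4, 1.5, 0.2])     # hypothetical class means
std = np.array([0.35, 0.38, 0.17, 0.10])  # hypothetical class standard deviations
per_feature = gaussian_likelihood(x, mean, std)
print(per_feature.shape)  # (4,)
```

Note that these are densities rather than probabilities: for a small standard deviation such as 0.10, the value at the mean is greater than 1.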


### 3. Prior: Calculating Prior Probabilities
##### Calculates the prior probabilities of each class based on their frequency in the training dataset.

```python
# Calculate prior probabilities as the relative frequency of each class
prior_probabilities = np.bincount(y_train) / len(y_train)
```
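
The frequency-based prior can be illustrated on a small made-up label array. The sketch also shows one caveat worth keeping in mind: `np.bincount` only counts up to the largest label it sees, so passing `minlength` keeps the result aligned with the number of classes when a class is missing. The arrays below are hypothetical and unrelated to the notebook's data.

```python
import numpy as np

# Toy labels: class 0 twice, class 1 once, class 2 three times
toy_labels = np.array([0, 0, 1, 2, 2, 2])
toy_priors = np.bincount(toy_labels) / len(toy_labels)
print(toy_priors)        # relative frequency of each class
print(toy_priors.sum())  # priors sum to 1 (up to floating-point rounding)

# If the highest class is absent, minlength keeps one entry per class
partial_labels = np.array([0, 0, 1])
partial_priors = np.bincount(partial_labels, minlength=3) / len(partial_labels)
print(partial_priors.shape)  # (3,)
```

For the stratified 80/20 Iris split, each class contributes 40 of the 120 training samples, so all three priors come out as 1/3.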


### 4. Posterior: Calculating Posterior Probabilities
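##### Defines a function that scores each class for a given sample using Bayes' theorem: the product of the per-feature Gaussian likelihoods multiplied by the class prior. The evidence term is omitted, so the scores are unnormalized posteriors; this does not change which class scores highest.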

```python
# Calculate (unnormalized) posterior probabilities for a sample
def posterior_probability(sample):
    posteriors = []
    for class_idx in range(num_classes):
        # Naive Bayes assumption: features are independent given the class,
        # so the class likelihood is the product of per-feature likelihoods
        likelihood = np.prod(gaussian_likelihood(sample, mean_features[class_idx], std_features[class_idx]))
        posterior = likelihood * prior_probabilities[class_idx]
        posteriors.append(posterior)
    return posteriors
```
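
The function above returns likelihood times prior for each class. A possible variant, sketched here under assumptions and not part of the original notebook, works in log space and normalizes the result so the values sum to 1. This tends to be more robust when many small densities are multiplied. The names `log_gaussian` and `normalized_posterior`, and the two-class parameters, are hypothetical.

```python
import numpy as np

def log_gaussian(x, mean, std):
    """Log of the Gaussian density, element-wise."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def normalized_posterior(sample, means, stds, priors):
    """Hypothetical helper: posterior probabilities that sum to 1, computed in log space."""
    # Sum of per-feature log-likelihoods plus log-prior, one value per class
    log_joint = log_gaussian(sample, means, stds).sum(axis=1) + np.log(priors)
    # Subtract the maximum before exponentiating to avoid underflow
    log_joint = log_joint - log_joint.max()
    joint = np.exp(log_joint)
    return joint / joint.sum()

# Hypothetical two-class, two-feature parameters for illustration only
means = np.array([[0.0, 0.0], [3.0, 3.0]])
stds = np.array([[1.0, 1.0], [1.0, 1.0]])
priors = np.array([0.5, 0.5])

probs = normalized_posterior(np.array([0.0, 0.0]), means, stds, priors)
midpoint = normalized_posterior(np.array([1.5, 1.5]), means, stds, priors)
print(probs)     # class 0 dominates for a sample at its mean
print(midpoint)  # equal probabilities halfway between the two means
```

Because normalization divides every class score by the same constant, the predicted class (the argmax) is the same as with the unnormalized scores.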


### 5. Prediction Accuracy
##### Predicts the class label of each test sample as the class with the highest posterior, then evaluates the accuracy of the predictions.

```python
# Predict class labels for the test set (class with the highest posterior)
y_pred = np.argmax([posterior_probability(sample) for sample in X_test], axis=1)
accuracy = accuracy_score(y_test, y_pred)
print(f"Prediction accuracy on test set: {accuracy:.4f}")
```


Prediction accuracy on test set: 0.9667
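
With 30 test samples, an accuracy of 0.9667 corresponds to 29 correct predictions and one error. A single accuracy figure does not show which classes were confused, so a confusion matrix can help. The sketch below builds one with NumPy on made-up labels; `confusion_matrix_counts` is a hypothetical helper, and the toy arrays are not the notebook's predictions.

```python
import numpy as np

def confusion_matrix_counts(y_true, y_pred, num_classes):
    """Hypothetical helper: rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)  # count each (true, predicted) pair
    return matrix

# Made-up labels for illustration only
y_true_toy = np.array([0, 0, 1, 1, 2, 2])
y_pred_toy = np.array([0, 0, 1, 2, 2, 2])

matrix = confusion_matrix_counts(y_true_toy, y_pred_toy, num_classes=3)
accuracy_toy = np.trace(matrix) / matrix.sum()  # correct predictions / all predictions
print(matrix)
print(f"{accuracy_toy:.4f}")  # 0.8333
```

The diagonal holds the correct predictions, so accuracy is the trace divided by the total count; off-diagonal entries show which true class was mistaken for which predicted class.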


### 6. Generated Samples
##### Generates synthetic samples by drawing each feature independently from the Gaussian distribution learned for each class on the training set.

```python
# Generate new samples based on learned distributions
def generate_sample(class_idx):
    # Draw one value per feature from that class's Gaussian
    return np.random.normal(mean_features[class_idx], std_features[class_idx])

num_samples = 10
new_samples = np.array([generate_sample(class_idx) for _ in range(num_samples) for class_idx in range(num_classes)])
new_labels = np.array([class_idx for _ in range(num_samples) for class_idx in range(num_classes)])

# Print generated samples and their labels
print("\nGenerated Samples:")
for i in range(num_samples * num_classes):
    class_name = iris.target_names[new_labels[i]]
    print(f"Sample {i+1} (Class: {class_name}):", new_samples[i])
```



Generated Samples:
Sample 1 (Class: setosa): [5.1343439  3.15981393 1.45926096 0.25261814]
Sample 2 (Class: versicolor): [6.42628769 3.41556082 4.35579739 1.34147817]
Sample 3 (Class: virginica): [7.43363157 3.08117443 5.98798249 1.96624181]
Sample 4 (Class: setosa): [5.06629286 3.50785377 1.41320269 0.2076264 ]
Sample 5 (Class: versicolor): [6.35769548 2.93473592 4.10560063 1.38189009]
Sample 6 (Class: virginica): [7.19072968 3.14192282 4.85026604 2.29596875]
Sample 7 (Class: setosa): [4.89038301 3.06356523 1.58449107 0.4568682 ]
Sample 8 (Class: versicolor): [5.90694355 2.55192271 4.19103935 1.17036024]
Sample 9 (Class: virginica): [7.17607616 2.87027069 6.0746439  2.09933211]
Sample 10 (Class: setosa): [5.09717107 3.43951841 1.54114045 0.37989234]
Sample 11 (Class: versicolor): [6.21429528 2.04299191 4.1723595  1.45779168]
Sample 12 (Class: virginica): [7.90415849 3.54317674 5.50433225 1.77822531]
Sample 13 (Class: setosa): [5.49908669 3.22166542 1.24980828 0.4310175 ]
Sample 14 (C