## Naive Bayes Classifier in R

Naive Bayes is a supervised, non‑linear classification algorithm based on Bayes’ theorem with the “naive” assumption that all features are independent given the class.

### Theory

Bayes’ theorem states that for events \(A\) and \(B\):

$$
P(A \mid B) \;=\;\frac{P(B \mid A)\,P(A)}{P(B)}
$$

- \(P(A\mid B)\): posterior probability of class \(A\) given predictor \(B\).  
- \(P(B\mid A)\): likelihood of predictor \(B\) given class \(A\).  
- \(P(A)\): prior probability of class \(A\).  
- \(P(B)\): marginal probability of predictor \(B\) (the evidence).  

For multiple predictors \(B_1, B_2, \dots, B_n\), the (naive) posterior becomes:

$$
P(A \mid B_1, B_2, \dots, B_n)
\propto
P(A)\,\prod_{i=1}^n P(B_i \mid A).
$$
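As a minimal sketch of the proportional form above, the snippet below turns a prior and per-predictor likelihoods into normalised posteriors. The class names and every number are hypothetical, chosen only to illustrate the arithmetic, and `naive_posterior` is an illustrative helper rather than a function from any package.

```r
# Hypothetical priors P(A) for two classes
prior <- c(class_1 = 0.4, class_2 = 0.6)

# Hypothetical likelihoods P(B_i | A): one row per class, one column per predictor
likelihood <- rbind(
  class_1 = c(B1 = 0.8, B2 = 0.5),
  class_2 = c(B1 = 0.1, B2 = 0.3)
)

naive_posterior <- function(prior, likelihood) {
  # Unnormalised posterior: P(A) * prod_i P(B_i | A)
  unnorm <- prior * apply(likelihood, 1, prod)
  # Dividing by the sum over classes plays the role of P(B_1, ..., B_n)
  unnorm / sum(unnorm)
}

posterior <- naive_posterior(prior, likelihood)
posterior
```

The predicted class is the one with the largest posterior, here `names(which.max(posterior))`.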


## Example: Iris Dataset

The **Iris** dataset contains 150 samples equally distributed among three species: *setosa*, *versicolor*, *virginica*. Four features are measured: sepal length, sepal width, petal length and petal width.

```r

# Load data and inspect structure
data(iris)
str(iris)

```

``` 
'data.frame':	150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

```

## Performing Naive Bayes on Dataset
The code below fits a Naive Bayes classifier to the Iris dataset, which has 150 observations and 5 variables (four numeric features plus the species label).

```R
# Installing Packages
install.packages("e1071")
install.packages("caTools")
install.packages("caret")

# Loading package
library(e1071)
library(caTools)
library(caret)

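# Splitting data into train and test data
set.seed(120)  # set the seed before the random split so it is reproducible
# Note: sample.split() expects a vector of labels such as iris$Species.
# Given the whole data frame it returns one logical value per column
# (length 5), which subset() recycles across the rows; this is what
# produces the 90/60 train/test split behind the output shown below.
split <- sample.split(iris, SplitRatio = 0.7)
train_cl <- subset(iris, split == TRUE)
test_cl <- subset(iris, split == FALSE)

# Feature Scaling (optional: the model below is fit on the unscaled data)
# The test set is scaled with the training means and standard deviations
train_scale <- scale(train_cl[, 1:4])
test_scale <- scale(test_cl[, 1:4],
                    center = attr(train_scale, "scaled:center"),
                    scale = attr(train_scale, "scaled:scale"))

# Fitting Naive Bayes Model to training dataset
classifier_cl <- naiveBayes(Species ~ ., data = train_cl)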
classifier_cl

```



```
Naive Bayes Classifier for Discrete Predictors

Call:
naiveBayes.default(x = X, y = Y, laplace = laplace)

A-priori probabilities:
Y
    setosa versicolor  virginica 
 0.3333333  0.3333333  0.3333333 

Conditional probabilities:
            Sepal.Length
Y                [,1]      [,2]
  setosa     5.006667 0.3768594
  versicolor 5.883333 0.4720340
  virginica  6.406667 0.6175219

            Sepal.Width
Y                [,1]      [,2]
  setosa     3.413333 0.3892817
  versicolor 2.780000 0.2696102
  virginica  2.920000 0.3623677

            Petal.Length
Y                [,1]      [,2]
  setosa     1.483333 0.1782740
  versicolor 4.233333 0.4823315
  virginica  5.416667 0.5180090

            Petal.Width
Y                 [,1]      [,2]
  setosa     0.2633333 0.1159171
  versicolor 1.3400000 0.1714039
  virginica  1.9833333 0.3029548
```

#### Interpretation:

Each species has equal prior probability (1/3), matching the balanced class distribution in the training set.

- **Conditional probabilities**: for numeric features, the output lists the parameters of a per-class Gaussian rather than probabilities, in two columns:
  - Column 1 = feature mean
  - Column 2 = feature standard deviation

For example, for **Sepal.Length**:

| Species     | mean   | sd    |
|-------------|--------|-------|
| setosa      | 5.0067 | 0.3769|
| versicolor  | 5.8833 | 0.4720|
| virginica   | 6.4067 | 0.6175|

These statistics define the Gaussian likelihood \(P(\text{Sepal.Length} \mid \text{Species})\).
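To make the Gaussian likelihood concrete, here is a small sketch that evaluates it with base R's `dnorm()` using the Sepal.Length means and standard deviations from the table above. The measurement `x_new = 5.0` is an assumed example value, not a row taken from the data.

```r
# Per-class mean and standard deviation of Sepal.Length, copied from the model output
stats <- data.frame(
  species = c("setosa", "versicolor", "virginica"),
  mean    = c(5.006667, 5.883333, 6.406667),
  sd      = c(0.3768594, 0.4720340, 0.6175219)
)

x_new <- 5.0  # hypothetical sepal length in cm (assumed for illustration)

# Gaussian density of x_new under each class: P(Sepal.Length = x_new | Species)
lik <- dnorm(x_new, mean = stats$mean, sd = stats$sd)
names(lik) <- stats$species
lik

# Class under which this single measurement is most likely
names(which.max(lik))
```

With equal priors and only this one feature, the posterior is proportional to these densities. The full classifier multiplies in the corresponding densities for the other three features.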


```R
# Predicting on test data
y_pred <- predict(classifier_cl, newdata = test_cl)

# Confusion Matrix
cm <- table(test_cl$Species, y_pred)
cm

```

```
            y_pred
             setosa versicolor virginica
  setosa         20          0         0
  versicolor      0         17         3
  virginica       0          0        20
```

#### Interpretation:

| Actual \ Pred | setosa | versicolor | virginica |
|--------------|:------:|:----------:|:---------:|
| **setosa**     | 20 | 0  | 0  |
| **versicolor** | 0  | 17 | 3  |
| **virginica**  | 0  | 0  | 20 |

- **Rows** = actual species  
- **Columns** = predicted species  

Reading the matrix row by row:

- All 20 actual *setosa* samples were correctly predicted as *setosa*.  
- Out of 20 actual *versicolor* samples, 17 were correctly predicted, and 3 were misclassified as *virginica*.  
- All 20 actual *virginica* samples were correctly predicted.
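The counts above give the overall accuracy directly. A minimal sketch in base R, re-entering the matrix by hand (`cm_manual` is an illustrative name, not an object created earlier):

```r
# Confusion matrix from the output above: rows = actual, columns = predicted
species <- c("setosa", "versicolor", "virginica")
cm_manual <- matrix(
  c(20,  0,  0,
     0, 17,  3,
     0,  0, 20),
  nrow = 3, byrow = TRUE,
  dimnames = list(actual = species, predicted = species)
)

# Overall accuracy: correctly classified samples (the diagonal) over all samples
accuracy <- sum(diag(cm_manual)) / sum(cm_manual)

# Misclassified samples: everything off the diagonal
n_errors <- sum(cm_manual) - sum(diag(cm_manual))

accuracy
n_errors
```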

```R
# Model Evaluation
confusionMatrix(cm)
```

```
Confusion Matrix and Statistics

            y_pred
             setosa versicolor virginica
  setosa         20          0         0
  versicolor      0         17         3
  virginica       0          0        20

Overall Statistics
                                          
               Accuracy : 0.95            
                 95% CI : (0.8608, 0.9896)
    No Information Rate : 0.3833          
    P-Value [Acc > NIR] : < 2.2e-16       
                                          
                  Kappa : 0.925           
                                          
 Mcnemar's Test P-Value : NA              

Statistics by Class:

                     Class: setosa Class: versicolor Class: virginica
Sensitivity                 1.0000            1.0000           0.8696
Specificity                 1.0000            0.9302           1.0000
Pos Pred Value              1.0000            0.8500           1.0000
Neg Pred Value              1.0000            1.0000           0.9250
Prevalence                  0.3333            0.2833           0.3833
Detection Rate              0.3333            0.2833           0.3333
Detection Prevalence        0.3333            0.3333           0.3333
Balanced Accuracy           1.0000            0.9651           0.9348
```

#### Interpretation:


#### Overall Statistics

| Metric | Value |
|--------|-------|
| **Accuracy** | **0.95** |
| 95 % CI | (0.8608 – 0.9896) |
| No Information Rate | 0.3833 |
| *P*-Value [Acc > NIR] | <2.2 × 10⁻¹⁶ |
| Kappa | 0.925 |

- **Accuracy (0.95)**: 95 % of all test samples are correctly classified.  
- **95 % CI**: Confidence interval for accuracy (86.08 %–98.96 %).  
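- **No Information Rate (0.3833)**: Accuracy achievable by always predicting the largest class. `confusionMatrix()` takes the columns of the table as the reference, and `cm` has the predictions in its columns, so 0.3833 (23/60) is the share of samples *predicted* as virginica. With the actual species as reference, the rate is 20/60 ≈ 0.3333.  
- **P-Value [Acc > NIR] (< 2.2 × 10⁻¹⁶)**: Model accuracy is significantly better than the No Information Rate.  
- **Kappa (0.925)**: Agreement between predictions and true labels, adjusted for chance (values above 0.8 are commonly read as almost perfect agreement).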
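Kappa and the No Information Rate can both be recomputed from the confusion matrix. The sketch below uses base R only and re-enters the matrix by hand; `cm_manual`, `cohen_kappa`, `nir_actual` and `nir_predicted` are illustrative names.

```r
# Confusion matrix from the output above: rows = actual, columns = predicted
species <- c("setosa", "versicolor", "virginica")
cm_manual <- matrix(c(20, 0, 0,  0, 17, 3,  0, 0, 20), nrow = 3, byrow = TRUE,
                    dimnames = list(actual = species, predicted = species))
n <- sum(cm_manual)

# Observed agreement (accuracy) and agreement expected by chance
p_observed <- sum(diag(cm_manual)) / n
p_expected <- sum(rowSums(cm_manual) * colSums(cm_manual)) / n^2

# Cohen's kappa: agreement beyond chance, relative to the maximum possible
cohen_kappa <- (p_observed - p_expected) / (1 - p_expected)

# Share of the largest class, by actual labels (rows) and by predictions (columns)
nir_actual    <- max(rowSums(cm_manual)) / n   # 20 / 60
nir_predicted <- max(colSums(cm_manual)) / n   # 23 / 60

cohen_kappa
c(nir_actual = nir_actual, nir_predicted = nir_predicted)
```

Kappa is unchanged if the matrix is transposed, which is why the reported 0.925 holds regardless of which dimension is treated as the reference.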


#### Statistics by Class

| Class | Sensitivity<br>(Recall) | Specificity | Precision<br>(PPV) | NPV | Balanced<br>Accuracy |
|-------|-------------------------|-------------|--------------------|-----|----------------------|
| **setosa**     | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| **versicolor** | 1.0000 | 0.9302 | 0.8500 | 1.0000 | 0.9651 |
| **virginica**  | 0.8696 | 1.0000 | 1.0000 | 0.9250 | 0.9348 |


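Per-class metrics:

- **Sensitivity**: Proportion of actual positives correctly identified.  
- **Specificity**: Proportion of actual negatives correctly identified.  
- **Precision (PPV)**: Proportion of positive predictions that are correct.  
- **NPV**: Proportion of negative predictions that are correct.  
- **Balanced Accuracy**: Average of Sensitivity and Specificity.

Note that `confusionMatrix()` reads the rows of a table as predictions and its columns as the reference, whereas `cm` was built with the actual species in rows. The report therefore swaps the two roles: for *versicolor*, the listed sensitivity (1.0000) is its precision (17 of 17 predictions correct) and the listed PPV (0.8500) is its recall (17 of 20 actual samples found). Accuracy and Kappa are unaffected by this transposition.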
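These definitions can be checked by hand with a one-vs-rest count for each class. The sketch below uses base R and takes rows as the actual species, which is how `cm` was built; `cm_manual` and `one_vs_rest` are illustrative names, not part of caret.

```r
# Confusion matrix from the output above: rows = actual, columns = predicted
species <- c("setosa", "versicolor", "virginica")
cm_manual <- matrix(c(20, 0, 0,  0, 17, 3,  0, 0, 20), nrow = 3, byrow = TRUE,
                    dimnames = list(actual = species, predicted = species))

one_vs_rest <- function(cm, class) {
  tp <- cm[class, class]
  fn <- sum(cm[class, ]) - tp    # actual `class`, predicted as another species
  fp <- sum(cm[, class]) - tp    # predicted `class`, actually another species
  tn <- sum(cm) - tp - fn - fp
  sens <- tp / (tp + fn)
  spec <- tn / (tn + fp)
  c(sensitivity = sens, specificity = spec,
    precision = tp / (tp + fp), npv = tn / (tn + fn),
    balanced_accuracy = (sens + spec) / 2)
}

# One row of metrics per species
metrics <- t(sapply(species, one_vs_rest, cm = cm_manual))
round(metrics, 4)
```

With rows as the actual classes, *versicolor* has sensitivity 17/20 = 0.85 and precision 1, the reverse of the caret report, which reads table rows as predictions.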

Overall, the classifier performs very well: every *setosa* sample is classified correctly, the only errors are three *versicolor* samples predicted as *virginica*, and balanced accuracy is high for all three classes.