### Complement Naive Bayes (CNB) Algorithm:
- CNB is a variant of the Naive Bayes algorithm designed to improve classification performance on imbalanced datasets, particularly in text classification tasks.
- It estimates each class's parameters from the data of all the other classes, which reduces the bias towards majority classes and often makes it more suitable than the standard Multinomial Naive Bayes.
### Challenge of Unbalanced Datasets:
- An unbalanced dataset is one in which some classes appear much more often than others.
- This is common in spam filtering (far more normal emails than spam) and in medical diagnosis (far more healthy cases than disease cases).


### How Complement Naive Bayes Works:
1. For each class, compute the complement frequency: the frequency of each feature in all other classes combined.
2. Estimate the conditional probabilities from these complement frequencies, with smoothing.
3. Normalize the values so that they form valid probability distributions.
4. Classify a sample by choosing the class whose complement fits the sample worst. This plays the role that the maximum posterior probability plays in standard Naive Bayes.
 
#### Formula:
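- For a class c and feature f:
  - $P(f \mid \bar{c}) = \dfrac{\mathrm{count}(f, \bar{c}) + \alpha}{\sum_{f' \in V} \mathrm{count}(f', \bar{c}) + \alpha\,|V|}$
  - Here,
     - count(f, c̅): count of feature f in the complement of class c (all classes except c).
     - α: smoothing parameter (Laplace smoothing when α = 1).
     - |V|: vocabulary size (the number of distinct features).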
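
The steps and formula above can be sketched in plain Python. This is a minimal illustration, not scikit-learn's implementation: the function names `fit_complement_nb` and `predict_complement_nb`, the three-word vocabulary, and the toy counts are all assumptions made for the example, and the sketch leaves out class priors and any extra weight normalization.

```python
import math

def fit_complement_nb(X, y, alpha=1.0):
    """Return {class: [log P(f | complement of class) for each feature f]}."""
    n_features = len(X[0])
    classes = sorted(set(y))
    # Total count of each feature within each class.
    class_counts = {c: [0.0] * n_features for c in classes}
    for row, label in zip(X, y):
        for f, value in enumerate(row):
            class_counts[label][f] += value
    total = [sum(class_counts[c][f] for c in classes) for f in range(n_features)]

    log_theta = {}
    for c in classes:
        # Complement count: feature totals over every class except c.
        comp = [total[f] - class_counts[c][f] for f in range(n_features)]
        denom = sum(comp) + alpha * n_features  # sum of counts + alpha * |V|
        log_theta[c] = [math.log((comp[f] + alpha) / denom) for f in range(n_features)]
    return log_theta

def predict_complement_nb(log_theta, row):
    # Score how well the row matches each class's complement,
    # then pick the class whose complement matches worst.
    scores = {c: sum(v * w for v, w in zip(row, weights))
              for c, weights in log_theta.items()}
    return min(scores, key=scores.get)

# Hypothetical word counts for the vocabulary ["free", "meeting", "win"].
X_toy = [[0, 2, 0], [0, 3, 0], [1, 2, 0], [3, 0, 2]]
y_toy = ["ham", "ham", "ham", "spam"]

log_theta = fit_complement_nb(X_toy, y_toy)
print(predict_complement_nb(log_theta, [2, 0, 1]))
print(predict_complement_nb(log_theta, [0, 2, 0]))
```

Although "spam" has only one training sample, its parameters are estimated from the three "ham" samples (its complement), which is how CNB avoids relying on sparse minority-class counts.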

### Implementing CNB
#### 1. Import libraries and load data:

In [3]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import ComplementNB
from sklearn.metrics import classification_report, accuracy_score

data = load_wine()
X = data.data
y = data.target

#### 2. Split into training and test sets:
- Split the dataset into 70% training and 30% testing data.
- Set random_state=42 for reproducibility.

In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
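
Since this walkthrough is about imbalanced classes, one optional variation is a stratified split, which keeps the class proportions roughly equal in the training and test sets. The sketch below is an assumption about how one might do this, not part of the original workflow. The `_strat` variable names are hypothetical, and the results reported in this walkthrough come from the unstratified split above.

```python
from collections import Counter

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)

# stratify=y asks train_test_split to preserve class proportions in both parts.
X_train_strat, X_test_strat, y_train_strat, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Compare the per-class sample counts in each part.
print("train:", sorted(Counter(y_train_strat.tolist()).items()))
print("test: ", sorted(Counter(y_test_strat.tolist()).items()))
```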

#### 3. Train the CNB classifier:
- Create a ComplementNB instance.
- Fit the classifier on the training data.

In [6]:
cnb = ComplementNB()
cnb.fit(X_train, y_train)
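
As a hedged illustration of what the fitted model holds, the sketch below passes the smoothing parameter `alpha` explicitly (1.0 is the scikit-learn default) and inspects two fitted attributes. The variable name `model` is hypothetical. Note that `ComplementNB` expects non-negative feature values, which the wine measurements satisfy.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import ComplementNB

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# alpha is the smoothing parameter from the formula; norm=False skips the
# optional second normalization of the weights.
model = ComplementNB(alpha=1.0, norm=False)
model.fit(X_train, y_train)

print(model.classes_)                  # class labels seen during fit
print(model.feature_log_prob_.shape)   # one row of weights per class
```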

#### 4. Evaluate the model:
- Predict the class labels for the test set using predict().
- Print the accuracy score and the classification report for detailed metrics.

In [9]:
y_pred = cnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
class_rep = classification_report(y_test, y_pred)

print("Accuracy score :", accuracy)
print("classification Report:\n",class_rep)


Accuracy score : 0.6666666666666666
classification Report:
               precision    recall  f1-score   support

           0       0.66      1.00      0.79        19
           1       0.64      0.67      0.65        21
           2       1.00      0.21      0.35        14

    accuracy                           0.67        54
   macro avg       0.76      0.63      0.60        54
weighted avg       0.74      0.67      0.62        54
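
The report shows low recall for class 2, so it can help to see which classes those samples are assigned to. The sketch below is one possible follow-up rather than part of the original notebook: it prints a confusion matrix for the same split and fits a `MultinomialNB` baseline for comparison. No results are quoted here because they were not part of the original run.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import ComplementNB, MultinomialNB

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

cnb = ComplementNB().fit(X_train, y_train)
y_pred = cnb.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Baseline on the same split, for comparison with CNB.
mnb = MultinomialNB().fit(X_train, y_train)
mnb_accuracy = accuracy_score(y_test, mnb.predict(X_test))
print("MultinomialNB accuracy:", mnb_accuracy)
```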

