# Naïve Bayes classifier

This project uses a **Naïve Bayes classifier** to distinguish between the preferences of English and Scottish individuals.
The dataset (*BritishScottish.csv*) contains the preferences of **6 English people** and **7 Scottish people**.

### Features (Preferences)
Each person is described by **5 binary features** (0 = dislike/absence, 1 = like/presence):

1. **Scones**  
2. **Beer**  
3. **Whisky**  
4. **Oats**  
5. **Soccer**  

The last column is the **Nationality**, labelled `British` (the English group) or `Scottish` in the dataset.

### Tasks
1. **Implement the Naïve Bayes classifier.**  
2. **Classify the example**  
   - x₁ = (1, 0, 1, 1, 0) → Determine whether it corresponds to the preferences of an English or Scottish person.  
3. **Classify the example**  
   - x₂ = (0, 1, 1, 0, 1) → Determine whether it corresponds to the preferences of an English or Scottish person.  


## Setup and Dataset

In [1]:
import pandas as pd
from naive_bayes_classifier import NaiveBayesClassifier

df_path = "BritishScottish.csv"
df = pd.read_csv(df_path)
display(df)

Unnamed: 0,scones,beer,whisky,oats,soccer,Nationality
0,0,0,1,1,1,British
1,1,0,1,1,0,British
2,1,1,0,0,1,British
3,1,1,0,0,0,British
4,0,1,0,0,1,British
5,0,0,0,1,0,British
6,1,0,0,1,1,Scottish
7,1,1,0,0,1,Scottish
8,1,1,1,1,0,Scottish
9,1,1,0,1,0,Scottish
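
The source of the imported `NaiveBayesClassifier` is not shown in this notebook. As a rough illustration of how per-class conditional probabilities can be estimated from a table like the one above, here is a minimal sketch that uses the six British rows exactly as displayed. The function name `smoothed_likelihoods` and the `alpha` parameter are hypothetical, and add-one smoothing is an assumption. Only the British class is used, because the displayed table lists four Scottish rows rather than the seven mentioned in the introduction.

```python
# Hypothetical sketch; this is NOT the project's NaiveBayesClassifier.
FEATURES = ["scones", "beer", "whisky", "oats", "soccer"]

# The six British rows, copied from the displayed table (index column dropped).
british_rows = [
    (0, 0, 1, 1, 1),
    (1, 0, 1, 1, 0),
    (1, 1, 0, 0, 1),
    (1, 1, 0, 0, 0),
    (0, 1, 0, 0, 1),
    (0, 0, 0, 1, 0),
]


def smoothed_likelihoods(rows, alpha=1):
    """Return P(feature = 1 | class) for binary features with add-alpha smoothing."""
    n = len(rows)
    return {
        name: (sum(row[j] for row in rows) + alpha) / (n + 2 * alpha)
        for j, name in enumerate(FEATURES)
    }


p_one_given_british = smoothed_likelihoods(british_rows)
for name, p in p_one_given_british.items():
    print(f"P({name} = 1 | British) = {p:.4f}")
```

Because each feature is binary, P(feature = 0 | class) is simply one minus the value returned here.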


## Implementation

In [None]:
# Suppose df has columns: scones, beer, whisky, oats, soccer, Nationality
nb = NaiveBayesClassifier()
nb.fit(df)


print("-----------------------------------------------------------")
print("Classifying x1")

# Classify x1 = (1, 0, 1, 1, 0)
x1_value = [1, 0, 1, 1, 0]
pred, evidence, posteriors = nb.predict(x1_value)

print("\nPosterior probabilities:")
for cls, prob in zip(nb.classes, posteriors):
    print(f"P({cls} | x1) = {prob:.4f}")

print("\nConditional probabilities used:")
for class_idx, cls in enumerate(nb.classes):
    print(f"\nGiven class = {cls}:")
    for feat_idx, col in enumerate(evidence.columns):
        print(f"  P({col} = {x1_value[feat_idx]} | {cls}) = {evidence.loc[class_idx, col]:.4f}")



print(f"\nPredicted class: {pred}")


print("-----------------------------------------------------------")
print("Classifying x2")

# Classify x2 = (0, 1, 1, 0, 1)
x2_value = [0, 1, 1, 0, 1]
pred, evidence, posteriors = nb.predict(x2_value)

print("\nPosterior probabilities:")
for cls, prob in zip(nb.classes, posteriors):
    print(f"P({cls} | x2) = {prob:.4f}")

print("\nConditional probabilities used:")
for class_idx, cls in enumerate(nb.classes):
    print(f"\nGiven class = {cls}:")
    for feat_idx, col in enumerate(evidence.columns):
        # Use the feature values of x2 here, not those of x1.
        print(f"  P({col} = {x2_value[feat_idx]} | {cls}) = {evidence.loc[class_idx, col]:.4f}")

-----------------------------------------------------------
Classifying x1
Scottish is the most probable class.

Posterior probabilities:
P(British | x1) = 0.0108
P(Scottish | x1) = 0.0350

Conditional probabilities used:

Given class = British:
  P(scones = 1 | British) = 0.5000
  P(beer = 0 | British) = 0.5000
  P(whisky = 1 | British) = 0.3750
  P(oats = 1 | British) = 0.5000
  P(soccer = 0 | British) = 0.5000

Given class = Scottish:
  P(scones = 1 | Scottish) = 0.8889
  P(beer = 0 | Scottish) = 0.4444
  P(whisky = 1 | Scottish) = 0.4444
  P(oats = 1 | Scottish) = 0.6667
  P(soccer = 0 | Scottish) = 0.5556

Predicted class: Scottish
-----------------------------------------------------------
Classifying x2
British is the most probable class.

Posterior probabilities:
P(British | x2) = 0.0108
P(Scottish | x2) = 0.0022

Conditional probabilities used:

Given class = British:
  P(scones = 0 | British) = 0.5000
  P(beer = 1 | British) = 0.5000
  P(whisky = 1 | British) = 0.3750
  P(oat

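
The values printed under "Posterior probabilities" for x1 (0.0108 and 0.0350) do not sum to 1, which suggests they are unnormalised scores of the form prior × likelihood rather than true posteriors. The sketch below reproduces those two scores from the printed conditional probabilities and then normalises them. The priors 6/13 and 7/13 are an assumption based on the class sizes stated in the introduction, and the variable names are illustrative only.

```python
from math import prod

# Class priors, assuming 6 British and 7 Scottish training rows (13 in total).
priors = {"British": 6 / 13, "Scottish": 7 / 13}

# Conditional probabilities for x1 = (1, 0, 1, 1, 0), copied from the printed output.
likelihoods = {
    "British": [0.5000, 0.5000, 0.3750, 0.5000, 0.5000],
    "Scottish": [0.8889, 0.4444, 0.4444, 0.6667, 0.5556],
}

# Unnormalised score: prior times the product of the per-feature likelihoods.
scores = {cls: priors[cls] * prod(likelihoods[cls]) for cls in priors}

# Dividing by the total turns the scores into posteriors that sum to 1.
total = sum(scores.values())
posteriors = {cls: score / total for cls, score in scores.items()}

for cls in priors:
    print(f"score({cls}) = {scores[cls]:.4f}, P({cls} | x1) = {posteriors[cls]:.4f}")
```

Normalising does not change which class wins, since both scores are divided by the same total; it only makes the numbers interpretable as probabilities.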