# Naive Bayes Mathematics

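Naive Bayes is a supervised machine learning algorithm based on **Bayes' Theorem**. It predicts the class of a sample from the probability of that class given the observed feature values, treating the features as conditionally independent given the class.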

![Image](https://miro.medium.com/max/500/1*IGwM9cb8W-gyJW5rkiVQPw.jpeg)

![Image](https://miro.medium.com/max/1000/1*aoKlzTs3w5tomWWxZGr81g.png)
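For reference in the calculations that follow, Bayes' Theorem for a class $C_i$ and a sample $X = (x_1, \dots, x_n)$ can be written as shown here. The symbols $C_i$, $X$ and $x_k$ are notation assumed for this note, chosen to match the `P(X|Ci)` used later in the notebook. The second expression is the "naive" independence assumption that lets the likelihood be computed one feature at a time.

$$
P(C_i \mid X) = \frac{P(X \mid C_i)\, P(C_i)}{P(X)}, \qquad
P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)
$$

Since $P(X)$ is the same for every class, comparing $P(X \mid C_i)\, P(C_i)$ across classes is enough to pick the most probable one.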

## Importing Libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## Importing Dataset

In [2]:
df = pd.read_csv("./computer_data.csv")
df

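| Unnamed: 0 | age | income | student | credit_rating | buys_computer |
|---|---|---|---|---|---|
| 0 | 30 | high | no | fair | no |
| 1 | 30 | high | no | excellent | no |
| 2 | 35 | high | no | fair | yes |
| 3 | 40 | medium | no | fair | yes |
| 4 | 40 | low | yes | fair | yes |
| 5 | 40 | low | yes | excellent | no |
| 6 | 35 | low | yes | excellent | yes |
| 7 | 30 | medium | no | fair | no |
| 8 | 30 | low | yes | fair | yes |
| 9 | 40 | medium | yes | fair | yes |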


## Calculation for Naive Bayes

There are 2 Classes:

- C1: Product Not Purchased = NO
- C2: Product Purchased = YES

In [3]:
# Total Records in Data
total = len(df)
total

14

In [4]:
# Count of Product Purchased is
p_purchased_count = len(df[df["buys_computer"] == "yes"])

# Count of Product Not Purchased is
p_not_purchased_count = len(df[df["buys_computer"] == "no"])

In [5]:
# Prior probability that the product is purchased, P(yes)

p_purchased =  p_purchased_count / total
p_purchased

0.6428571428571429

In [6]:
# Prior probability that the product is not purchased, P(no)

p_not_purchased = p_not_purchased_count / total
p_not_purchased

0.35714285714285715
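The two priors can also be computed in one pass over the labels. The sketch below is a minimal stdlib-only illustration, not the notebook's own code: the `labels` list is rebuilt from the counts implied by the outputs above (0.6429 × 14 = 9 "yes", 0.3571 × 14 = 5 "no"), and the names `labels` and `priors` are assumptions introduced here.

```python
from collections import Counter
from fractions import Fraction

# Labels rebuilt from the class counts implied by the outputs above
# (9 "yes" and 5 "no" out of 14 records); not read from the CSV.
labels = ["yes"] * 9 + ["no"] * 5

counts = Counter(labels)
total = len(labels)

# Prior P(Ci) = (records in class Ci) / (all records)
priors = {label: Fraction(n, total) for label, n in counts.items()}

for label, p in priors.items():
    print(f"P({label}) = {p} = {float(p):.4f}")
```

Using `Fraction` keeps the values exact, which makes it easy to confirm that the priors sum to 1.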

## Predicting New Results

Now we will predict whether a user buys a computer for the following sample:

X = (age = 30 , income = medium, student = yes, credit_rating = fair)

In [7]:
x = [30, "medium", "yes", "fair"]

We will compute P(X|Ci) for each class Ci, one feature at a time.

 Compute P(Age|Ci)

In [8]:
p_age_yes = len(df[(df["age"] == x[0]) & (df["buys_computer"] == "yes")]) / p_purchased_count
print("Probability of Age = 30 given Computer Purchased is", p_age_yes)

p_age_no = len(df[(df["age"] == x[0]) & (df["buys_computer"] == "no")]) / p_not_purchased_count
print("Probability of Age = 30 given Computer Not Purchased is", p_age_no)

Probability of Age = 30 given Computer Purchased is 0.2222222222222222
Probability of Age = 30 given Computer Not Purchased is 0.6


 Compute P(Income|Ci)

In [9]:
p_income_yes = len(df[(df["income"] == x[1]) & (df["buys_computer"] == "yes")]) / p_purchased_count
print("Probability of Income = medium given Computer Purchased is", p_income_yes)

p_income_no = len(df[(df["income"] == x[1]) & (df["buys_computer"] == "no")]) / p_not_purchased_count
print("Probability of Income = medium given Computer Not Purchased is", p_income_no)

Probability of Income = medium given Computer Purchased is 0.4444444444444444
Probability of Income = medium given Computer Not Purchased is 0.4


 Compute P(student|Ci)

In [10]:
p_student_yes = len(df[(df["student"] == x[2]) & (df["buys_computer"] == "yes")]) / p_purchased_count
print("Probability of Student = yes given Computer Purchased is", p_student_yes)

p_student_no = len(df[(df["student"] == x[2]) & (df["buys_computer"] == "no")]) / p_not_purchased_count
print("Probability of Student = yes given Computer Not Purchased is", p_student_no)

Probability of Student = yes given Computer Purchased is 0.6666666666666666
Probability of Student = yes given Computer Not Purchased is 0.2


 Compute P(credit_rating|Ci)

In [11]:
p_credit_yes = len(df[(df["credit_rating"] == x[3]) & (df["buys_computer"] == "yes")]) / p_purchased_count
print("Probability of Credit Rating = fair given Computer Purchased is", p_credit_yes)

p_credit_no = len(df[(df["credit_rating"] == x[3]) & (df["buys_computer"] == "no")]) / p_not_purchased_count
print("Probability of Credit Rating = fair given Computer Not Purchased is", p_credit_no)

Probability of Credit Rating = fair given Computer Purchased is 0.6666666666666666
Probability of Credit Rating = fair given Computer Not Purchased is 0.4
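The four cells above repeat the same pattern: count the rows of one class that match a feature value, then divide by the size of that class. A minimal sketch of that pattern as a single helper follows. The function name `conditional` and the dictionary form of `x` are assumptions introduced here, and the data is only the ten rows displayed in the notebook output (`len(df)` reports 14 records), so the fractions it prints differ from the notebook's values.

```python
from fractions import Fraction

# Only the ten rows displayed in the notebook output; the full CSV has 14 records.
columns = ["age", "income", "student", "credit_rating", "buys_computer"]
raw = [
    (30, "high", "no", "fair", "no"),
    (30, "high", "no", "excellent", "no"),
    (35, "high", "no", "fair", "yes"),
    (40, "medium", "no", "fair", "yes"),
    (40, "low", "yes", "fair", "yes"),
    (40, "low", "yes", "excellent", "no"),
    (35, "low", "yes", "excellent", "yes"),
    (30, "medium", "no", "fair", "no"),
    (30, "low", "yes", "fair", "yes"),
    (40, "medium", "yes", "fair", "yes"),
]
rows = [dict(zip(columns, r)) for r in raw]


def conditional(rows, feature, value, label, target="buys_computer"):
    """Estimate P(feature = value | target = label) by counting rows."""
    in_class = [r for r in rows if r[target] == label]
    matches = [r for r in in_class if r[feature] == value]
    return Fraction(len(matches), len(in_class))


# The sample to classify, keyed by column name (hypothetical dict form of x).
x = {"age": 30, "income": "medium", "student": "yes", "credit_rating": "fair"}

for label in ("yes", "no"):
    for feature, value in x.items():
        p = conditional(rows, feature, value, label)
        print(f"P({feature} = {value} | {label}) = {p}")
```

Run against the full 14-record dataset, the same helper should reproduce the values printed by the cells above.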


Now multiply the conditional probabilities for each class.

Multiplying P(X | "yes")

In [12]:
# Likelihood P(X | yes): product of the per-feature conditional probabilities
p_yes = p_age_yes * p_income_yes * p_student_yes * p_credit_yes
p_yes

0.04389574759945129

In [13]:
# Likelihood P(X | no): product of the per-feature conditional probabilities
p_no = p_age_no * p_income_no * p_student_no * p_credit_no
p_no

0.019200000000000002
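The two products can be checked with exact arithmetic. This is a small sketch under one assumption: the printed decimals are read back as the fractions they correspond to (0.2222 = 2/9, 0.4444 = 4/9, 0.6667 = 6/9 for "yes"; 0.6 = 3/5, 0.4 = 2/5, 0.2 = 1/5 for "no"). The names `cond_yes`, `cond_no`, `likelihood_yes` and `likelihood_no` are introduced here for illustration.

```python
from fractions import Fraction
from math import prod

# Per-feature conditionals for X, in the order age, income, student, credit_rating.
cond_yes = [Fraction(2, 9), Fraction(4, 9), Fraction(6, 9), Fraction(6, 9)]
cond_no = [Fraction(3, 5), Fraction(2, 5), Fraction(1, 5), Fraction(2, 5)]

# Naive independence assumption: P(X | Ci) is the product of the per-feature terms.
likelihood_yes = prod(cond_yes)
likelihood_no = prod(cond_no)

print(likelihood_yes, float(likelihood_yes))
print(likelihood_no, float(likelihood_no))
```

The exact results, 32/729 and 12/625, agree with the floating-point outputs of cells 12 and 13.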

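Normalizing the likelihoods so that they sum to 1. The class priors computed earlier are not multiplied in at this step, so it treats both classes as equally likely beforehand.

$P(Yes \mid X) = \frac {P(X \mid Yes)} {P(X \mid Yes) + P(X \mid No)}$

$P(No \mid X) = \frac {P(X \mid No)} {P(X \mid Yes) + P(X \mid No)}$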

In [14]:
# Normalized probability of buying computer
p_yes_normalized = p_yes / (p_yes + p_no)
p_yes_normalized

print(f"The probability that the user will buy a computer is {round(p_yes_normalized * 100)}%")

The probability that the user will buy a computer is 70%


In [15]:
p_no_normalized = p_no / (p_yes + p_no)
p_no_normalized

print(f"The probability that the user will not buy a computer is {round(p_no_normalized * 100)}%")

The probability that the user will not buy a computer is 30%
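The normalization above uses only the likelihoods P(X | Ci); the priors from cells 5 and 6 are never multiplied in. Bayes' Theorem weights each likelihood by its class prior, and the sketch below shows how the result changes when that is done. It is an illustration rather than the notebook's method: the inputs are the exact fractions behind the printed values (priors 9/14 and 5/14, likelihoods 32/729 and 12/625), and the names `prior`, `likelihood`, `joint` and `posterior` are assumptions.

```python
from fractions import Fraction

# Values taken from the notebook's outputs, written as exact fractions.
prior = {"yes": Fraction(9, 14), "no": Fraction(5, 14)}
likelihood = {"yes": Fraction(32, 729), "no": Fraction(12, 625)}

# Numerator of Bayes' Theorem for each class: P(X | Ci) * P(Ci)
joint = {c: likelihood[c] * prior[c] for c in prior}

# P(X) is the sum of the numerators over all classes.
evidence = sum(joint.values())

# Posterior P(Ci | X)
posterior = {c: joint[c] / evidence for c in joint}

for c, p in posterior.items():
    print(f"P({c} | X) = {float(p):.4f}")

prediction = max(posterior, key=posterior.get)
print("Predicted class:", prediction)
```

With the priors included, the posterior for "yes" is about 0.80 rather than 0.70. The predicted class is "yes" either way, but the two approaches can disagree when the classes are more imbalanced.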
