Naive Bayes Classifier From Scratch

In this notebook I build a naive Bayes classifier from scratch on a small hand-made weather dataset and use it to predict the class of a previously unseen instance.

Steps:
1. Compute the class probabilities
2. Compute the conditional probabilities
3. Make predictions with Naive Bayes


In [246]:
# Import libraries
import pandas as pd
import numpy as np

In [247]:
# Create dataset
df=pd.DataFrame()
df['outlook']=['sunny','rainy','rainy', 'rainy','rainy','rainy','sunny','sunny','sunny','rainy','sunny','rainy','rainy', 'rainy']
df['temp']=['hot','hot','cool','hot','cool','cool','cool','hot','cool','cool','cool','cool','hot','cool']
df['humidity']=['high','high','normal','high','normal','normal','normal','normal','normal','normal','normal','normal','normal','high']
df['play']=['yes','no','no','no','no','no','yes','yes','yes','no','yes','yes','no','no']

In [248]:
df

Unnamed: 0,outlook,temp,humidity,play
0,sunny,hot,high,yes
1,rainy,hot,high,no
2,rainy,cool,normal,no
3,rainy,hot,high,no
4,rainy,cool,normal,no
5,rainy,cool,normal,no
6,sunny,cool,normal,yes
7,sunny,hot,normal,yes
8,sunny,cool,normal,yes
9,rainy,cool,normal,no
10,sunny,cool,normal,yes
11,rainy,cool,normal,yes
12,rainy,hot,normal,no
13,rainy,cool,high,no


In [249]:
# Exploratory data analysis
df.dtypes

outlook     object
temp        object
humidity    object
play        object
dtype: object

In [250]:
# Inspect the unique values before converting the categorical input variables and the class variable into numbers
print('Unique values for outlook:', df['outlook'].unique())
print('Unique values for temp:', df['temp'].unique())
print('Unique values for humidity:', df['humidity'].unique())
print('Unique values for play:', df['play'].unique())

Unique values for outlook: ['sunny' 'rainy']
Unique values for temp: ['hot' 'cool']
Unique values for humidity: ['high' 'normal']
Unique values for play: ['yes' 'no']


In [251]:
encoding={'outlook': {'sunny':0, 'rainy':1},
          'temp': {'hot' : 0,'cool':1},
          'humidity':{'high':0, 'normal':1},
          'play' :{'no':0,'yes':1}}
df=df.replace(encoding)

In [252]:
df.head()

Unnamed: 0,outlook,temp,humidity,play
0,0,0,0,1
1,1,0,0,0
2,1,1,1,0
3,1,0,0,0
4,1,1,1,0


In [253]:
#1. Compute class probabilities
P_class0=sum(df['play']==0)/len(df['play'])
P_class1=sum(df['play']==1)/len(df['play'])

In [254]:
P_class0

0.5714285714285714

In [255]:
P_class1

0.42857142857142855
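
The same class probabilities can be cross-checked with pandas' `value_counts`. This is a minimal sketch, not part of the original notebook; the `play` series below simply repeats the unencoded labels from the dataset, and the name `priors` is an assumption made for illustration.

```python
import pandas as pd

# The 'play' column from the dataset above, before encoding.
play = pd.Series(['yes', 'no', 'no', 'no', 'no', 'no', 'yes',
                  'yes', 'yes', 'no', 'yes', 'yes', 'no', 'no'])

# normalize=True converts the per-class counts into relative frequencies,
# which are the class probabilities (priors).
priors = play.value_counts(normalize=True)
print(priors)
```

The value for `no` corresponds to `P_class0` and the value for `yes` to `P_class1`.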

In [256]:
# 2. Calculate the conditional probability of each input value given each class value

# Example - Outlook Input Variable (class=play is play==1, class=stay_home is play==0):
#P(outlook=sunny|class=play) = count(outlook=sunny and class=play) / count(class=play)
#P(outlook=sunny|class=stay_home) = count(outlook=sunny and class=stay_home) / count(class=stay_home)
#P(outlook=rainy|class=play) = count(outlook=rainy and class=play) / count(class=play)
#P(outlook=rainy|class=stay_home) = count(outlook=rainy and class=stay_home) / count(class=stay_home)
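
The formulas above divide a joint count by a class count. One way to sketch that for a whole feature at once is `pd.crosstab` with `normalize='columns'`. The frame `df_demo` and the name `cond_outlook` are assumptions for illustration; the data repeats the unencoded `outlook` and `play` columns from the dataset.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'outlook': ['sunny', 'rainy', 'rainy', 'rainy', 'rainy', 'rainy', 'sunny',
                'sunny', 'sunny', 'rainy', 'sunny', 'rainy', 'rainy', 'rainy'],
    'play': ['yes', 'no', 'no', 'no', 'no', 'no', 'yes',
             'yes', 'yes', 'no', 'yes', 'yes', 'no', 'no'],
})

# Rows are outlook values, columns are classes. normalize='columns' divides
# each joint count by its class total, giving P(outlook=value | class).
cond_outlook = pd.crosstab(df_demo['outlook'], df_demo['play'], normalize='columns')
print(cond_outlook)
```

Each column of the resulting table sums to 1, which is a quick sanity check that the entries are valid conditional probabilities.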

In [257]:
# Outlook Input Variable :
# The numerator is the joint count: rows where both the outlook value and the class match
p_outlook_sunny_play=sum((df['outlook']==0) & (df['play']==1))/ sum(df['play']==1)
p_outlook_sunny_stayhome=sum((df['outlook']==0) & (df['play']==0))/ sum(df['play']==0)
p_outlook_rainy_play=sum((df['outlook']==1) & (df['play']==1))/ sum(df['play']==1)
p_outlook_rainy_stayhome=sum((df['outlook']==1) & (df['play']==0))/ sum(df['play']==0)

In [258]:
print('The conditional probability of Outlook sunny given play class',p_outlook_sunny_play)
print('The conditional probability of Outlook sunny given stay_home class',p_outlook_sunny_stayhome)
print('The conditional probability of Outlook rainy given play class',p_outlook_rainy_play)
print('The conditional probability of Outlook rainy given stay_home class',p_outlook_rainy_stayhome)

The conditional probability of Outlook sunny given play class 0.8333333333333334
The conditional probability of Outlook sunny given stay_home class 0.0
The conditional probability of Outlook rainy given play class 0.16666666666666666
The conditional probability of Outlook rainy given stay_home class 1.0


In [259]:
# Temp Input Variable :
# The numerator is the joint count: rows where both the temp value and the class match
p_temp_hot_play=sum((df['temp']==0) & (df['play']==1))/ sum(df['play']==1)
p_temp_hot_stayhome=sum((df['temp']==0) & (df['play']==0))/ sum(df['play']==0)
p_temp_cool_play=sum((df['temp']==1) & (df['play']==1))/ sum(df['play']==1)
p_temp_cool_stayhome=sum((df['temp']==1) & (df['play']==0))/ sum(df['play']==0)

In [260]:
print('The conditional probability of temp hot given play class',p_temp_hot_play)
print('The conditional probability of temp hot given stay_home class',p_temp_hot_stayhome)
print('The conditional probability of temp cool given play class',p_temp_cool_play)
print('The conditional probability of temp cool given stay_home class',p_temp_cool_stayhome)

The conditional probability of temp hot given play class 0.3333333333333333
The conditional probability of temp hot given stay_home class 0.375
The conditional probability of temp cool given play class 0.6666666666666666
The conditional probability of temp cool given stay_home class 0.625


In [261]:
# Humidity Input Variable :
# The numerator is the joint count: rows where both the humidity value and the class match
p_humidity_high_play=sum((df['humidity']==0) & (df['play']==1))/ sum(df['play']==1)
p_humidity_high_stayhome=sum((df['humidity']==0) & (df['play']==0))/ sum(df['play']==0)
p_humidity_normal_play=sum((df['humidity']==1) & (df['play']==1))/ sum(df['play']==1)
p_humidity_normal_stayhome=sum((df['humidity']==1) & (df['play']==0))/ sum(df['play']==0)

In [262]:
print('The conditional probability of humidity high given play class',p_humidity_high_play)
print('The conditional probability of humidity high given stay_home class',p_humidity_high_stayhome)
print('The conditional probability of humidity normal given play class',p_humidity_normal_play)
print('The conditional probability of humidity normal given stay_home class',p_humidity_normal_stayhome)

The conditional probability of humidity high given play class 0.16666666666666666
The conditional probability of humidity high given stay_home class 0.375
The conditional probability of humidity normal given play class 0.8333333333333334
The conditional probability of humidity normal given stay_home class 0.625


In [263]:
#3. Make Predictions with Naive Bayes

## Naive Bayes algorithm intuition

Bayes' theorem lets us make predictions from data. A naive Bayes classifier uses it to compute, for each class, the probability that a given record belongs to that class, and predicts the class with the highest probability. Here is the classic form of Bayes' theorem:

### P(A∣B)= P(B∣A) * P(A) / P(B)
In words:

P(class∣data) = P(data∣class) * P(class) / P(data)
class - is a particular class (in our example the class is 0 or 1)

data - is an observation’s data

P(class∣data) - is called the posterior - this is what we are looking for

P(data∣class) - is called the likelihood. It has to be calculated for every feature in the dataset. The weather features here are categorical, so it is estimated from the counts in step 2; for continuous features it is calculated from a probability density function instead. The “naive” and “gaussian” in the names of these classifiers come from two assumptions about this likelihood:

#### Assumption that the features are independent of each other given the class. This is usually not true, and is a “naive” assumption - hence the name “naive bayes.”
#### Assumption, made only by the gaussian variant, that the values of a continuous feature (e.g. petal_length) are normally (gaussian) distributed. In that case P(data∣class) is calculated by inputting the feature's per-class mean and standard deviation into the probability density function of the normal distribution. This notebook does not need this assumption.

P(class) - is called the prior. This is just the number of instances belonging to a particular class divided by the total number of instances in the dataset.

P(data) - is called the marginal probability. It is the same for all classes, so it can be ignored when comparing them.

In a Bayes classifier, we calculate the numerator of the posterior for every class for each observation and pick the class with the largest value. This is known as the Maximum A Posteriori (MAP) estimate.

## MAP = argmax over classes A of P(B|A) * P(A)
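
The MAP rule can be sketched in a few lines. The class labels and all numbers below are hypothetical and chosen only to show the arithmetic; they are not taken from the weather dataset.

```python
from math import prod

# Hypothetical priors P(class) and per-feature likelihoods P(feature|class).
priors = {'A': 0.6, 'B': 0.4}
likelihoods = {'A': [0.5, 0.2], 'B': [0.9, 0.5]}

# Numerator of the posterior: P(data|class) * P(class), where P(data|class)
# is the product of the per-feature likelihoods (the naive assumption).
scores = {cls: prior * prod(likelihoods[cls]) for cls, prior in priors.items()}

# MAP decision: the class whose score is the largest.
map_class = max(scores, key=scores.get)
print(scores, map_class)
```

Here class `B` wins even though its prior is smaller, because its likelihood terms are larger.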


In [264]:
# Let's take an example - the first record from the dataset:
# outlook=sunny, temp=hot, humidity=high, class=play

#play = P(outlook=sunny|class=play) * P(temp=hot|class=play) * P(humidity=high|class=play) * P(class=play)
#stayhome = P(outlook=sunny|class=stayhome) * P(temp=hot|class=stayhome) * P(humidity=high|class=stayhome) * P(class=stayhome)

In [265]:
play=p_outlook_sunny_play * p_temp_hot_play * p_humidity_high_play *P_class1
stayhome=p_outlook_sunny_stayhome * p_temp_hot_stayhome * p_humidity_high_stayhome *P_class0

In [266]:
# These are unnormalized posterior scores (P(data) is not divided out), rounded for readability
print('Score for the first row of dataset to play:', round(play, 4))
print('Score for the first row of dataset to stayhome:', round(stayhome, 4))

Score for the first row of dataset to play: 0.0198
Score for the first row of dataset to stayhome: 0.0


In [267]:
# The highest score is for class 1 (play), which matches the label of the first row in the dataset.
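
The values compared above leave out the division by P(data), so they are scores rather than probabilities. If actual posterior probabilities are wanted, the scores can be divided by their sum. This is a small sketch; `normalize_scores` is a hypothetical helper, and the example scores are made up to show the arithmetic.

```python
def normalize_scores(scores):
    """Divide each class score by the sum of all scores, which equals P(data)."""
    evidence = sum(scores.values())  # assumed to be non-zero
    return {cls: score / evidence for cls, score in scores.items()}

# Made-up scores, used only for illustration.
example_scores = {'play': 0.03, 'stayhome': 0.01}
posteriors = normalize_scores(example_scores)
print(posteriors)
```

Normalizing does not change which class is largest, which is why the comparison can be made on the scores directly.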

Below I create a new instance for which we know feature values but not the class. The goal is to predict the class.

In [268]:
# New instance, using the same encoding as the training data
df_test=pd.DataFrame()
df_test['outlook']=[0]   # sunny
df_test['temp']=1        # cool
df_test['humidity']=0    # high

In [269]:
df_test

Unnamed: 0,outlook,temp,humidity
0,0,1,0


In [270]:
test=df_test.to_numpy()

In [271]:
# The test instance is outlook=0 (sunny), temp=1 (cool), humidity=0 (high)
test_play=p_outlook_sunny_play * p_temp_cool_play * p_humidity_high_play *P_class1
test_stayhome=p_outlook_sunny_stayhome * p_temp_cool_stayhome * p_humidity_high_stayhome *P_class0

In [272]:
# Unnormalized posterior scores, rounded for readability
print('Score for the test row of dataset to play:', round(test_play, 4))
print('Score for the test row of dataset to stayhome:', round(test_stayhome, 4))

Score for the test row of dataset to play: 0.0397
Score for the test row of dataset to stayhome: 0.0


In [273]:
# Because the posterior score for class 1 is the largest, we predict that the class for the test instance is play.
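
The three steps can also be wrapped into two small functions so that any instance can be classified without writing out each product by hand. This is a sketch under stated assumptions, not the notebook's own code: `fit` and `predict` are hypothetical names, the data uses the original string labels instead of the 0/1 encoding, and an unseen feature value would raise a `KeyError`.

```python
import pandas as pd

weather = pd.DataFrame({
    'outlook': ['sunny', 'rainy', 'rainy', 'rainy', 'rainy', 'rainy', 'sunny',
                'sunny', 'sunny', 'rainy', 'sunny', 'rainy', 'rainy', 'rainy'],
    'temp': ['hot', 'hot', 'cool', 'hot', 'cool', 'cool', 'cool',
             'hot', 'cool', 'cool', 'cool', 'cool', 'hot', 'cool'],
    'humidity': ['high', 'high', 'normal', 'high', 'normal', 'normal', 'normal',
                 'normal', 'normal', 'normal', 'normal', 'normal', 'normal', 'high'],
    'play': ['yes', 'no', 'no', 'no', 'no', 'no', 'yes',
             'yes', 'yes', 'no', 'yes', 'yes', 'no', 'no'],
})

def fit(data, target):
    """Steps 1 and 2: class priors and one conditional table per feature."""
    priors = data[target].value_counts(normalize=True)
    tables = {col: pd.crosstab(data[col], data[target], normalize='columns')
              for col in data.columns if col != target}
    return priors, tables

def predict(instance, priors, tables):
    """Step 3: return the class with the largest unnormalized posterior score."""
    scores = {}
    for cls in priors.index:
        score = priors[cls]
        for feature, value in instance.items():
            score *= tables[feature].loc[value, cls]
        scores[cls] = score
    return max(scores, key=scores.get)

priors, tables = fit(weather, 'play')
new_instance = {'outlook': 'sunny', 'temp': 'cool', 'humidity': 'high'}
predicted = predict(new_instance, priors, tables)
print(predicted)
```

For the test instance above (sunny, cool, high) this sketch also predicts `yes`, i.e. play.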