<a href="https://colab.research.google.com/github/muscak/Master-Machine-Learning-Algorithms/blob/master/Nonlinear-Algorithms/Naive-Bayes/Naive-Bayes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Naive Bayes<a id='naive'></a>

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable [1]. It is a simple classification algorithm that makes strong assumptions about the independence of the input variables. A learned naive Bayes model is represented by a list of probabilities, which can be stored to file [2]. Learning the model takes two steps:

1. Calculate Class Probabilities
2. Calculate Conditional Probabilities

[1] [Scikit-Learn](https://scikit-learn.org/stable/modules/naive_bayes.html#multinomial-naive-bayes)

[2] [Master Machine Learning Algorithms](https://machinelearningmastery.com/master-machine-learning-algorithms/)

[3] [Practical Statistics for Data Scientists](https://www.oreilly.com/library/view/practical-statistics-for/9781491952955/)

## Table of Contents
1. [Introduction](#introduction) 
2. [Import Libraries](#libraries) 
3. [Load Sample Data](#sampledata)
4. [Naive Bayes](#naive)
5. [Gaussian Bayes](#gaus)

## Introduction<a id='introduction'></a>

The purpose of this study is to practice a manual implementation of the Naive Bayes algorithm on manually generated data. The same data is then used to make predictions with the `MultinomialNB` class of the `sklearn` library, and the results are compared.

## Import Libraries<a id='libraries'></a>

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb
sb.set_style('whitegrid')
sb.despine(offset=10, trim=True);


## Load Sample Data<a id='sampledata'></a>
The sample data consists of 2 categorical features, `weather` and `car`, which describe the weather and the condition of the car, and 1 binary label, `action`, which records the decision to go out or stay home. In the standard naive Bayes algorithm, predictor variables must be categorical. Numeric predictors should be binned and converted to categorical predictors [3].
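
As a minimal sketch of the binning step mentioned above, `pd.cut` can turn a numeric predictor into a categorical one. The `temperature` values, the bin edges, and the labels below are hypothetical and are not part of the sample data used in this notebook.

```python
import pandas as pd

# Hypothetical numeric predictor (not in the sample data): temperature in degrees Celsius
temperature = pd.Series([31.5, 12.0, 25.3, 18.4, 8.9, 22.1])

# Assumed bin edges and labels, chosen only for illustration
bins = [-float('inf'), 15, 25, float('inf')]
labels = ['cold', 'mild', 'hot']

# pd.cut assigns each value to the interval that contains it
# (intervals are closed on the right by default)
temperature_cat = pd.cut(temperature, bins=bins, labels=labels)
print(temperature_cat.tolist())
```

The resulting categorical column could then be treated like `weather` or `car` when counting conditional probabilities.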

In [None]:
# Create features
weather = ['sunny', 'rainy', 'sunny', 'sunny', 'sunny', 'rainy', 'rainy', 'sunny', 'sunny', 'rainy']
car = ['working', 'broken', 'working', 'working', 'working', 'broken', 'broken', 'working', 'broken', 'broken']
# Create label
action = ['go-out', 'go-out', 'go-out', 'go-out', 'go-out', 'stay-home', 'stay-home', 'stay-home', 'stay-home', 'stay-home']
# Convert the data into dataframe
df = pd.DataFrame(zip(weather, car, action), columns=['weather', 'car', 'action'])
# Print the dataframe
df

| | weather | car | action |
|---|---|---|---|
| 0 | sunny | working | go-out |
| 1 | rainy | broken | go-out |
| 2 | sunny | working | go-out |
| 3 | sunny | working | go-out |
| 4 | sunny | working | go-out |
| 5 | rainy | broken | stay-home |
| 6 | rainy | broken | stay-home |
| 7 | sunny | working | stay-home |
| 8 | sunny | broken | stay-home |
| 9 | rainy | broken | stay-home |


## Manual Implementation of Naive Bayes
As defined in the [Naive Bayes](#naive) section, there are 2 steps:
1. Calculate Class Probabilities
2. Calculate Conditional Probabilities


### Calculate Class Probabilities
The class probability is the fraction of records in the training dataset that belong to each class.

$$P(class = 1) = \frac{count(class=1)}{count(class=0) + count(class=1)}$$

$$P(class = 0) = \frac{count(class=0)}{count(class=0) + count(class=1)}$$

In [None]:
n_total = len(df)  # Same as count(stay-home) + count(go-out)

p_stay_home = len(df[df['action'] == 'stay-home']) / n_total
print('Probability of stay-home: ', p_stay_home)

p_go_out = len(df[df['action'] == 'go-out']) / n_total
print('Probability of go-out: ', p_go_out)

Probability of stay-home:  0.5
Probability of go-out:  0.5
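
The same class probabilities can be obtained for any number of classes in a single call. The sketch below assumes only the `action` labels from the sample data and uses `value_counts(normalize=True)`, which divides each class count by the total number of records.

```python
import pandas as pd

# Labels from the sample data: 5 'go-out' followed by 5 'stay-home'
action = ['go-out'] * 5 + ['stay-home'] * 5
labels = pd.Series(action, name='action')

# normalize=True returns relative frequencies instead of raw counts
class_probs = labels.value_counts(normalize=True)
print(class_probs)
```

This form avoids writing one expression per class, which helps when the label has more than two values.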


### Calculate Conditional Probabilities 

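The conditional probability of each feature value given each class value is the fraction of records in that class that have the feature value:

$$P(x_j = v \mid class = i) = \frac{count(x_j = v \text{ and } class = i)}{count(class = i)}$$

These values are combined with the class probabilities to obtain the probability of class $i$ given the predictor values $x_1, \ldots, x_p$, where $y$ denotes the class:

$$P(y = i \mid x_1, x_2, \ldots, x_p) = \frac{P(y = i)\,P(x_1 \mid y = i) \cdots P(x_p \mid y = i)}{P(y = 0)\,P(x_1 \mid y = 0) \cdots P(x_p \mid y = 0) + P(y = 1)\,P(x_1 \mid y = 1) \cdots P(x_p \mid y = 1)}$$

Naive Bayes assumes that the input variables are independent of each other given the class [2]. As a result, unlike complete or exact Bayesian classification, it does not need to find other records with the same predictor profile [3].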


In [None]:
def cond_prob(feature, value, label):
    """Estimate P(feature=value | action=label) by counting records."""
    in_class = df[df['action'] == label]
    return len(in_class[in_class[feature] == value]) / len(in_class)

# For weather feature
p_sunny_go_out = cond_prob('weather', 'sunny', 'go-out')
print('P(weather=sunny | action=go-out): ', p_sunny_go_out)
p_rainy_go_out = cond_prob('weather', 'rainy', 'go-out')
print('P(weather=rainy | action=go-out): ', p_rainy_go_out)

p_sunny_stay_home = cond_prob('weather', 'sunny', 'stay-home')
print('P(weather=sunny | action=stay-home): ', p_sunny_stay_home)
p_rainy_stay_home = cond_prob('weather', 'rainy', 'stay-home')
print('P(weather=rainy | action=stay-home): ', p_rainy_stay_home)

# For car feature
p_working_go_out = cond_prob('car', 'working', 'go-out')
print('P(car=working | action=go-out): ', p_working_go_out)
p_broken_go_out = cond_prob('car', 'broken', 'go-out')
print('P(car=broken | action=go-out): ', p_broken_go_out)

p_working_stay_home = cond_prob('car', 'working', 'stay-home')
print('P(car=working | action=stay-home): ', p_working_stay_home)
p_broken_stay_home = cond_prob('car', 'broken', 'stay-home')
print('P(car=broken | action=stay-home): ', p_broken_stay_home)

P(weather=sunny | action=go-out):  0.8
P(weather=rainy | action=go-out):  0.2
P(weather=sunny | action=stay-home):  0.4
P(weather=rainy | action=stay-home):  0.6
P(car=working | action=go-out):  0.8
P(car=broken | action=go-out):  0.2
P(car=working | action=stay-home):  0.2
P(car=broken | action=stay-home):  0.8
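
All conditional probabilities for a feature can also be tabulated at once. The sketch below is one possible alternative, not the method used in this notebook: it rebuilds the sample data and relies on `pd.crosstab` with `normalize='index'`, so that each row (one class) sums to 1.

```python
import pandas as pd

# Sample data, rebuilt so that this block is self-contained
weather = ['sunny', 'rainy', 'sunny', 'sunny', 'sunny', 'rainy', 'rainy', 'sunny', 'sunny', 'rainy']
car = ['working', 'broken', 'working', 'working', 'working', 'broken', 'broken', 'working', 'broken', 'broken']
action = ['go-out'] * 5 + ['stay-home'] * 5
df = pd.DataFrame({'weather': weather, 'car': car, 'action': action})

# Rows are classes and columns are feature values.
# normalize='index' divides each row by its total, giving P(feature=value | action)
cond_weather = pd.crosstab(df['action'], df['weather'], normalize='index')
cond_car = pd.crosstab(df['action'], df['car'], normalize='index')

print(cond_weather)
print(cond_car)
```

A single value can be looked up with, for example, `cond_weather.loc['go-out', 'sunny']`.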


### Make Prediction

In [None]:
df.loc[(df['weather'] == 'sunny') & (df['car'] == 'working'), 'pred_go_out'] = p_go_out * p_sunny_go_out * p_working_go_out
df.loc[(df['weather'] == 'rainy') & (df['car'] == 'working'), 'pred_go_out'] = p_go_out * p_rainy_go_out * p_working_go_out
df.loc[(df['weather'] == 'sunny') & (df['car'] == 'broken'), 'pred_go_out'] = p_go_out * p_sunny_go_out * p_broken_go_out
df.loc[(df['weather'] == 'rainy') & (df['car'] == 'broken'), 'pred_go_out'] = p_go_out * p_rainy_go_out * p_broken_go_out

df.loc[(df['weather'] == 'sunny') & (df['car'] == 'working'), 'pred_stay_home'] = p_stay_home * p_sunny_stay_home * p_working_stay_home
df.loc[(df['weather'] == 'rainy') & (df['car'] == 'working'), 'pred_stay_home'] = p_stay_home * p_rainy_stay_home * p_working_stay_home
df.loc[(df['weather'] == 'sunny') & (df['car'] == 'broken'), 'pred_stay_home'] = p_stay_home * p_sunny_stay_home * p_broken_stay_home
df.loc[(df['weather'] == 'rainy') & (df['car'] == 'broken'), 'pred_stay_home'] = p_stay_home * p_rainy_stay_home * p_broken_stay_home

df.loc[df['pred_go_out'] > df['pred_stay_home'], 'pred_action'] = 'go-out'
df.loc[df['pred_go_out'] <= df['pred_stay_home'], 'pred_action'] = 'stay-home'

df

| | weather | car | action | pred_go_out | pred_stay_home | pred_action |
|---|---|---|---|---|---|---|
| 0 | sunny | working | go-out | 0.32 | 0.04 | go-out |
| 1 | rainy | broken | go-out | 0.02 | 0.24 | stay-home |
| 2 | sunny | working | go-out | 0.32 | 0.04 | go-out |
| 3 | sunny | working | go-out | 0.32 | 0.04 | go-out |
| 4 | sunny | working | go-out | 0.32 | 0.04 | go-out |
| 5 | rainy | broken | stay-home | 0.02 | 0.24 | stay-home |
| 6 | rainy | broken | stay-home | 0.02 | 0.24 | stay-home |
| 7 | sunny | working | stay-home | 0.32 | 0.04 | go-out |
| 8 | sunny | broken | stay-home | 0.08 | 0.16 | stay-home |
| 9 | rainy | broken | stay-home | 0.02 | 0.24 | stay-home |
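
The `pred_go_out` and `pred_stay_home` values are the numerators of the posterior formula, so they do not sum to 1. The sketch below shows how they could be normalized into posterior probabilities. The probabilities are copied from the outputs above, and the `posterior` helper is a hypothetical name introduced only for this illustration.

```python
# Class and conditional probabilities copied from the outputs above
priors = {'go-out': 0.5, 'stay-home': 0.5}
cond = {
    'go-out': {'weather': {'sunny': 0.8, 'rainy': 0.2},
               'car': {'working': 0.8, 'broken': 0.2}},
    'stay-home': {'weather': {'sunny': 0.4, 'rainy': 0.6},
                  'car': {'working': 0.2, 'broken': 0.8}},
}

def posterior(record):
    """Return P(class | record) for every class under the naive Bayes assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        # Multiply the prior by P(feature=value | class) for each feature
        for feature, value in record.items():
            score *= cond[label][feature][value]
        scores[label] = score
    # Dividing by the sum of the scores applies the denominator of the formula
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

post = posterior({'weather': 'sunny', 'car': 'working'})
print(post)
print('Predicted action:', max(post, key=post.get))
```

Normalizing does not change which class has the larger score, so the predicted action is the same as in the table.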


In [None]:
from sklearn.metrics import accuracy_score

In [None]:
acc = accuracy_score(df['action'].values, df['pred_action'].values)
print('Accuracy: ', acc)

Accuracy:  0.8


## Sklearn Implementation<a id='sklearn'></a>

We can use `MultinomialNB` from the sklearn library.

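| | weather | car | action | weather_nr | car_nr | action_nr |
|---|---|---|---|---|---|---|
| 0 | sunny | working | go-out | 1 | 1 | 1 |
| 1 | rainy | broken | go-out | 0 | 0 | 1 |
| 2 | sunny | working | go-out | 1 | 1 | 1 |
| 3 | sunny | working | go-out | 1 | 1 | 1 |
| 4 | sunny | working | go-out | 1 | 1 | 1 |
| 5 | rainy | broken | stay-home | 0 | 0 | 0 |
| 6 | rainy | broken | stay-home | 0 | 0 | 0 |
| 7 | sunny | working | stay-home | 1 | 1 | 0 |
| 8 | sunny | broken | stay-home | 1 | 0 | 0 |
| 9 | rainy | broken | stay-home | 0 | 0 | 0 |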


In [None]:
from sklearn.naive_bayes import MultinomialNB

In [None]:
model = MultinomialNB()

# Reset the dataframe to initial version
df = pd.DataFrame(zip(weather, car, action), columns=['weather', 'car', 'action'])

# Map the string values to binary values, as the model accepts only numeric values
weather_nr = {'sunny':1, 'rainy':0}
car_nr = {'working':1, 'broken':0}
action_nr = {'go-out':1, 'stay-home':0}
df['weather_nr'] = df['weather'].map(weather_nr)
df['car_nr'] = df['car'].map(car_nr)
df['action_nr'] = df['action'].map(action_nr)
df

x = df[['weather_nr', 'car_nr']]
y = df['action_nr']

model.fit(x, y)

In [None]:
y_hat = model.predict(x)
acc = accuracy_score(y, y_hat)
print('Accuracy: ', acc)

Accuracy:  0.8


As you can see, we got the same accuracy score as with the manual implementation of the algorithm.
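
A matching accuracy score does not necessarily mean that both models store the same probabilities. `MultinomialNB` treats the feature values as counts and, with its default `alpha=1.0`, applies additive smoothing. The sketch below refits the model on the same binary encoding and prints the learned parameters for comparison with the manual values.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Same binary encoding as above: columns are weather_nr (sunny=1) and car_nr (working=1)
X = np.array([[1, 1], [0, 0], [1, 1], [1, 1], [1, 1],
              [0, 0], [0, 0], [1, 1], [1, 0], [0, 0]])
# go-out=1, stay-home=0
y = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

model = MultinomialNB()  # default alpha=1.0 (additive smoothing)
model.fit(X, y)

# The model stores log probabilities; exponentiate to read them as probabilities
class_priors = np.exp(model.class_log_prior_)
feature_probs = np.exp(model.feature_log_prob_)

print('Classes:', model.classes_)
print('Class priors:', class_priors)
print('Smoothed feature probabilities per class:')
print(feature_probs)
```

Each row of `feature_probs` sums to 1 across the two feature columns, so these values are not directly comparable to `P(weather=sunny | action)` from the manual implementation, even though the predictions on this small dataset agree.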

Naive Bayes is a classification algorithm for binary (two-class) and multiclass classification problems [2] that works by calculating the posterior probability of each outcome [3]. It asks, “Within each outcome category, which predictor categories are most probable?” That information is then inverted to estimate probabilities of outcome categories, given predictor values [3].