###### Naive Bayes

<img src="Bayes_rule.png">

Naive Bayes classification is a supervised machine learning technique. It is simple, yet often one of the most effective classification techniques. It is a probabilistic classification algorithm that is well suited to categorical data and combines **Bayes' theorem** with a strong (hence "naive") independence assumption.

Assumptions made in Naive Bayes:

- All samples are i.i.d. (independent and identically distributed), i.e., each sample is drawn independently of the others from the same distribution.
- All features are conditionally independent of each other given the class label.
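
The second assumption can be written as a formula. The following is a sketch in which $x_1, \dots, x_n$ denote the feature values of one sample and $c$ a class label (this subscripted notation is introduced here for illustration and does not appear elsewhere in the text):

$$ P(x_1, \dots, x_n \mid c) = \prod_{i=1}^{n} P(x_i \mid c) $$

Under this factorization, each feature contributes its own likelihood term, which is why per-feature frequency tables are enough to score a class.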

The basic idea behind Naive Bayes is to assign a probability to every category (a finite outcome variable) based on the features in the data and to predict the category that is most likely. This probability is called the posterior probability.

$$ \text{posterior probability} = \frac{\text{conditional probability} \times \text{prior probability}}{\text{evidence}} $$

$$ P(c|x) = \frac{P(x|c)\,P(c)}{P(x)} $$


- P(c|x) is the posterior probability, i.e., the probability that an observation with attribute value **x** belongs to class label **c**.
- P(c) is the prior probability of the class, i.e., the probability of class **c** irrespective of the data.
- P(x|c) is the conditional probability (likelihood), expressing how often attribute value **x** occurs among observations with label **c**, or in other terms the probability of attribute **x** given that the hypothesis is true.
- P(x) is the evidence, also termed the prior probability of the predictor, i.e., the probability of attribute value **x**.
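
To make the four terms concrete, here is a minimal sketch of Bayes' rule as a function. The helper name `posterior` is hypothetical, and the counts plugged in are the ones from the golf data set worked through later in this section (3 of 9 "Yes" days are Sunny, 9 of 14 days are "Yes", 5 of 14 days are Sunny). Exact fractions are used so that rounding does not affect the result.

```python
from fractions import Fraction


def posterior(likelihood, prior, evidence):
    """Bayes' rule: P(c|x) = P(x|c) * P(c) / P(x). Hypothetical helper."""
    if evidence == 0:
        raise ValueError("evidence P(x) must be non-zero")
    return likelihood * prior / evidence


# P(Sunny|Yes) = 3/9, P(Yes) = 9/14, P(Sunny) = 5/14
p_yes_given_sunny = posterior(Fraction(3, 9), Fraction(9, 14), Fraction(5, 14))
print(p_yes_given_sunny)  # 3/5
```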


In [2]:
import pandas as pd
import numpy as np

In [3]:
df = pd.read_csv("data.csv")
print(df.describe())
print()
df

       Outlook Temperature Humidity  Windy Play golf
count       14          14       14     14        14
unique       3           3        2      2         2
top      Sunny        Mild   Normal  False       Yes
freq         5           6        7      8         9



Unnamed: 0,Outlook,Temperature,Humidity,Windy,Play golf
0,Rainy,Hot,High,False,No
1,Rainy,Hot,High,True,No
2,Overcast,Hot,High,False,Yes
3,Sunny,Mild,High,False,Yes
4,Sunny,Cool,Normal,False,Yes
5,Sunny,Cool,Normal,True,No
6,Overcast,Cool,Normal,True,Yes
7,Rainy,Mild,High,False,No
8,Rainy,Cool,Normal,False,Yes
9,Sunny,Mild,Normal,False,Yes


The posterior probability can be calculated in three steps. First, construct a frequency table for each attribute against the target. Then, transform the frequency tables into likelihood tables. Finally, use the Naive Bayes equation to calculate the posterior probability for each class. The class with the highest posterior probability is the predicted outcome.
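
The three steps can be sketched compactly with pandas. This is an illustrative alternative to the cell-by-cell walkthrough, not the notebook's own code. The frequency counts are typed in directly from the Outlook table shown in this section so that the block runs without `data.csv`, and the variable names (`freq`, `likelihood`, `prior`, `evidence`) are assumptions made for this example.

```python
import pandas as pd

# Step 1: frequency table of Outlook against the target (counts from this section)
freq = pd.DataFrame(
    {"Yes": [2, 3, 4], "No": [3, 2, 0]},
    index=pd.Index(["Rainy", "Sunny", "Overcast"], name="Outlook"),
)
total = freq.to_numpy().sum()  # 14 observations

# Step 2: likelihood table P(x|c), each column divided by its class total
likelihood = freq / freq.sum(axis=0)

# Prior P(c) per class and evidence P(x) per Outlook level
prior = freq.sum(axis=0) / total
evidence = freq.sum(axis=1) / total

# Step 3: posterior P(c|x) = P(x|c) * P(c) / P(x), then pick the most likely class
posterior = likelihood.mul(prior, axis=1).div(evidence, axis=0)
prediction = posterior.idxmax(axis=1)

print(posterior.round(2))
print(prediction)
```

Because only a single feature is involved, each row of `posterior` sums to one. With several features, the per-feature likelihoods would be multiplied together before normalizing.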

In [33]:
# Frequency table for the Outlook feature against the target
outlook_freq_table = pd.DataFrame(df['Outlook'].value_counts()).reset_index(drop = False)
outlook_freq_table.columns = ['level_names', 'freq']
levels = outlook_freq_table['level_names'].values.tolist()
# Count the "Yes" and "No" days for each Outlook level
outlook_freq_table['pos_freq'] = [df[(df['Outlook'] == val) & (df['Play golf'] == 'Yes')].shape[0] for val in levels]
outlook_freq_table['neg_freq'] = [df[(df['Outlook'] == val) & (df['Play golf'] == 'No')].shape[0] for val in levels]
outlook_freq_table

Unnamed: 0,level_names,freq,pos_freq,neg_freq
0,Rainy,5,2,3
1,Sunny,5,3,2
2,Overcast,4,4,0


In [34]:
# Likelihood table, stored as "count/total" strings for readability:
# pos_prob = P(x|Yes), neg_prob = P(x|No), evi_prob = P(x)
outlook_freq_table['pos_prob'] = [str(outlook_freq_table.loc[i, 'pos_freq'])+"/"+str(outlook_freq_table['pos_freq'].sum()) for i in range(outlook_freq_table.shape[0])]
outlook_freq_table['neg_prob'] = [str(outlook_freq_table.loc[i, 'neg_freq'])+"/"+str(outlook_freq_table['neg_freq'].sum()) for i in range(outlook_freq_table.shape[0])]
outlook_freq_table['evi_prob'] = [str(outlook_freq_table.loc[i, 'freq'])+"/"+str(outlook_freq_table['freq'].sum()) for i in range(outlook_freq_table.shape[0])]
outlook_freq_table

Unnamed: 0,level_names,freq,pos_freq,neg_freq,pos_prob,neg_prob,evi_prob
0,Rainy,5,2,3,2/9,3/5,5/14
1,Sunny,5,3,2,3/9,2/5,5/14
2,Overcast,4,4,0,4/9,0/5,4/14


$$ p(x|c) = p(Sunny|Yes) = 3/9 = 0.33 $$
$$ p(x) = p(Sunny) = 5/14 = 0.36 $$
$$ p(c) = p(Yes) = 9/14 = 0.64 $$

**posterior probability** (computed from the unrounded fractions)
$$ p(c|x) = p(Yes|Sunny) = \frac{\frac{3}{9} \times \frac{9}{14}}{\frac{5}{14}} = 0.60 $$

$$ p(x|c) = p(Sunny|No) = 2/5 = 0.40 $$
$$ p(x) = p(Sunny) = 5/14 = 0.36 $$
$$ p(c) = p(No) = 5/14 = 0.36 $$

**posterior probability** (computed from the unrounded fractions)
$$ p(c|x) = p(No|Sunny) = \frac{\frac{2}{5} \times \frac{5}{14}}{\frac{5}{14}} = 0.40 $$
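
As a quick consistency check on the two results, the posteriors for a given attribute value should sum to one, and the prediction is the class with the larger posterior. Written out with exact fractions (the symbol $\hat{c}$ for the predicted class is notation introduced here for illustration):

$$ p(Yes|Sunny) + p(No|Sunny) = \frac{3}{5} + \frac{2}{5} = 1 $$

$$ \hat{c} = \arg\max_{c \in \{Yes, No\}} p(c|Sunny) = Yes $$

So for a Sunny day, considering the Outlook feature alone, the classifier predicts "Yes".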