
#### Excellent detailed references
Wikipedia: https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Gaussian_naive_Bayes

Blog: https://chrisalbon.com/machine-learning/naive_bayes_classifier_from_scratch.html

In [1]:
import pandas as pd
import numpy as np

### Create Data

In [2]:
# Create an empty dataframe
data = pd.DataFrame()

# Create our target variable
data['Gender'] = ['male','male','male','male','female','female','female','female']

# Create our feature variables
data['Height'] = [6,5.92,5.58,5.92,5,5.5,5.42,5.75]
data['Weight'] = [180,190,170,165,100,150,130,150]
data['Foot_Size'] = [12,11,12,10,6,8,7,9]

# View the data
data

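| | Gender | Height | Weight | Foot_Size |
|---|---|---|---|---|
| 0 | male | 6.0 | 180 | 12 |
| 1 | male | 5.92 | 190 | 11 |
| 2 | male | 5.58 | 170 | 12 |
| 3 | male | 5.92 | 165 | 10 |
| 4 | female | 5.0 | 100 | 6 |
| 5 | female | 5.5 | 150 | 8 |
| 6 | female | 5.42 | 130 | 7 |
| 7 | female | 5.75 | 150 | 9 |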


In [3]:
# Below we create a new person whose feature values we know but whose gender we do not.
# Our goal is to predict their gender.
# Create an empty dataframe
person = pd.DataFrame()

# Create some feature values for this single row
person['Height'] = [6]
person['Weight'] = [130]
person['Foot_Size'] = [8]

# View the data
person

| | Height | Weight | Foot_Size |
|---|---|---|---|
| 0 | 6 | 130 | 8 |


# Gaussian Naive Bayes Classifier
$$
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
$$

In other words, in terms of a hypothesis H and an event E (the observed evidence):

$$
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}
$$

where:

P(H|E) == Posterior == Probability that the hypothesis is true given the event. In our use case, the probability that the person is male given the observation data (and likewise for female). It can be written as P(Male | ObservationData).

P(H) == Prior == Probability that the hypothesis is true before seeing the event. In our use case, P(H) == P(Male), i.e. the fraction of males in the whole dataset.

P(E|H) == Likelihood == Probability of the event given that the hypothesis is true. In our use case, the probability of seeing this observation data if the person is indeed male.

P(E) == Total probability of the event occurring == Probability of seeing this observation data regardless of gender, which combines both cases:
     == P(H) * P(E|H) + P(-H) * P(E | -H)
     == (Probability of being male * Probability of this observation given male) +
        (Probability of not being male * Probability of this observation given not male)




To put this in perspective for the classes Female and Male:

$$
P(\text{Class} \mid \text{Event}) = \frac{P(\text{Event} \mid \text{Class})\,P(\text{Class})}{P(\text{Event})}
$$

Let's go deeper into our example:

$$
P(\text{Class} \mid \text{ObservationData}) = \frac{P(\text{ObservationData} \mid \text{Class})\,P(\text{Class})}{P(\text{ObservationData})}
$$

where:

class is a particular class (e.g. male)
ObservationData is an observation's data
p(class|ObservationData) is called the posterior
p(ObservationData|class) is called the likelihood
p(class) is called the prior
p(ObservationData) is called the marginal probability


For each individual class, the representation looks like:

$$
P(\text{Male} \mid \text{ObservationData}) = \frac{P(\text{ObservationData} \mid \text{Male})\,P(\text{Male})}{P(\text{ObservationData})}
$$

$$
P(\text{Female} \mid \text{ObservationData}) = \frac{P(\text{ObservationData} \mid \text{Female})\,P(\text{Female})}{P(\text{ObservationData})}
$$

Now, at this stage, it is the right time to unfold the individual features in the observation data and update the equations accordingly. The "naive" assumption is that the features are independent of each other given the class, so the likelihood of the whole observation is the product of the per-feature likelihoods:

$$
\text{Posterior}(\text{Male}) = \frac{P(\text{Height} \mid \text{Male})\,P(\text{Weight} \mid \text{Male})\,P(\text{FootSize} \mid \text{Male})\,P(\text{Male})}{\text{Marginal Probability}}
$$

$$
\text{Posterior}(\text{Female}) = \frac{P(\text{Height} \mid \text{Female})\,P(\text{Weight} \mid \text{Female})\,P(\text{FootSize} \mid \text{Female})\,P(\text{Female})}{\text{Marginal Probability}}
$$
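
For reference, the marginal probability in both denominators is the same quantity. Under the two-class setup used here it can be written as the sum of the two numerators, where $x_i$ ranges over Height, Weight and FootSize (the symbol $x_i$ is notation introduced only for this sketch):

$$
P(\text{ObservationData}) = P(\text{Male}) \prod_i p(x_i \mid \text{Male}) + P(\text{Female}) \prod_i p(x_i \mid \text{Female})
$$

Because this value does not depend on the class, it scales both posteriors by the same factor and does not change which one is larger.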

### First, let's calculate the prior, i.e. P(A): in this case, the number of male rows divided by the TOTAL number of rows, and the same for females

In [31]:
# Number of males
n_male = data['Gender'][data['Gender'] == 'male'].count()

# Number of females
n_female = data['Gender'][data['Gender'] == 'female'].count()

# Total rows
total_ppl = data['Gender'].count()

# Number of males divided by the total rows
P_male = float(n_male)/float(total_ppl)

# Number of females divided by the total rows
P_female = float(n_female)/float(total_ppl)

# Use print() as a function so the cell runs on Python 3
print("Total Sample:", total_ppl, "  Male:", n_male, "  Female:", n_female)
print("Prior of Male ie. P(A):", P_male)
print("Prior of Female ie. P(A):", P_female)

Total Sample: 8   Male: 4   Female: 4
Prior of Male ie. P(A): 0.5
Prior of Female ie. P(A): 0.5
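
The same priors can be computed for any number of classes by counting labels. The following is a minimal sketch using only the standard library; the names `genders` and `priors` are illustrative and do not appear in the cells above.

```python
from collections import Counter

# Same labels as the 'Gender' column in this notebook
genders = ['male', 'male', 'male', 'male',
           'female', 'female', 'female', 'female']

# Count each class, then divide by the total number of rows
counts = Counter(genders)
total = len(genders)
priors = {label: count / total for label, count in counts.items()}

print(priors)
```

With four rows of each class out of eight, both priors come out to 0.5, matching the output above.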


### Second, let's calculate the likelihood, i.e. P(E | A)

The likelihood of each feature is the probability density function (PDF) of a normal distribution
(see Wikipedia for details: https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Gaussian_naive_Bayes)

Example:

$$
p(\text{height} \mid \text{female}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)
$$

where $x$ is the observation's height, $\mu$ is the average height of females in the data, and $\sigma^2$ is the variance of female height in the data.

So we need to calculate mu (the mean) and sigma squared (the variance) for every feature within each class.
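
As a worked example of the formula, the sketch below evaluates p(height|female) for the new person (height = 6) using only the `math` module. The mean and variance are copied from the grouped tables computed in this notebook; the variable names are illustrative.

```python
import math

# Values taken from this notebook: the new person's height, and the mean
# and sample variance of female heights in the training data
x = 6.0
mean_female_height = 5.4175
var_female_height = 0.097225

# 1 / sqrt(2 * pi * variance)
normaliser = 1.0 / math.sqrt(2.0 * math.pi * var_female_height)

# -(x - mean)^2 / (2 * variance)
exponent = -((x - mean_female_height) ** 2) / (2.0 * var_female_height)

p_height_given_female = normaliser * math.exp(exponent)
print(p_height_given_female)  # roughly 0.223
```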

In [32]:
# Group the data by gender and calculate the means of each feature
data_means = data.groupby('Gender').mean()

# View the values
data_means

| Gender | Height | Weight | Foot_Size |
|---|---|---|---|
| female | 5.4175 | 132.5 | 7.5 |
| male | 5.855 | 176.25 | 11.25 |


In [34]:
# Group the data by gender and calculate the variance of each feature
data_variance = data.groupby('Gender').var()

# View the values
data_variance

| Gender | Height | Weight | Foot_Size |
|---|---|---|---|
| female | 0.097225 | 558.333333 | 1.666667 |
| male | 0.035033 | 122.916667 | 0.916667 |
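
Note that pandas `var()` defaults to the sample variance (`ddof=1`, dividing by n - 1), which is what the table above shows. The sketch below reproduces the female height variance with the standard library to make the difference from the population variance explicit; the variable names are illustrative.

```python
import statistics

# Female heights from the dataset in this notebook
female_heights = [5, 5.5, 5.42, 5.75]

sample_var = statistics.variance(female_heights)       # divides by n - 1
population_var = statistics.pvariance(female_heights)  # divides by n

print(sample_var, population_var)
```

With only four rows per class the two estimates differ noticeably, so results from a tool that uses the population variance will not match this notebook exactly.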


In [35]:
# data_means and data_variance are both indexed by gender,
# so each statistic can be looked up directly with .loc[row, column]

# Means for male
male_height_mean = data_means.loc['male', 'Height']
male_weight_mean = data_means.loc['male', 'Weight']
male_footsize_mean = data_means.loc['male', 'Foot_Size']

# Variance for male
male_height_variance = data_variance.loc['male', 'Height']
male_weight_variance = data_variance.loc['male', 'Weight']
male_footsize_variance = data_variance.loc['male', 'Foot_Size']

# Means for female
female_height_mean = data_means.loc['female', 'Height']
female_weight_mean = data_means.loc['female', 'Weight']
female_footsize_mean = data_means.loc['female', 'Foot_Size']

# Variance for female
female_height_variance = data_variance.loc['female', 'Height']
female_weight_variance = data_variance.loc['female', 'Weight']
female_footsize_variance = data_variance.loc['female', 'Foot_Size']


### Function to calculate the probability density of each of the terms of the likelihood (e.g. p(height|female)).

In [36]:
# Create a function that calculates p(x | y):
def p_x_given_y(x, mean_y, variance_y):

    # Evaluate the normal probability density function at x
    # (a density, so the value can be greater than 1)
    p = 1/(np.sqrt(2*np.pi*variance_y)) * np.exp((-(x-mean_y)**2)/(2*variance_y))

    # Return the density
    return p
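
Multiplying several small densities can produce very small numbers, and with many features the product may underflow. A common workaround is to add log-densities instead. The sketch below is a hypothetical log-space companion to `p_x_given_y` (the name `log_p_x_given_y` is not part of the original notebook), checked here against the male height statistics from the tables above.

```python
import math

def log_p_x_given_y(x, mean_y, variance_y):
    """Log of the Gaussian density evaluated at x."""
    return (-0.5 * math.log(2.0 * math.pi * variance_y)
            - ((x - mean_y) ** 2) / (2.0 * variance_y))

# Male height mean and variance as shown in this notebook's tables
log_p = log_p_x_given_y(6.0, 5.855, 0.035033)

# Exponentiating recovers the ordinary density, roughly 1.579
p = math.exp(log_p)
print(log_p, p)
```

The recovered density is greater than 1, which is expected: it is a density, not a probability. In log space the class comparison becomes a comparison of sums, `log(prior) + sum of log-densities`, and the larger sum wins.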

### Apply Bayes Classifier To New Data Point
Alright! Our Bayes classifier is ready. Remember that the marginal probability (the denominator) is the same for every class, so we can ignore it when comparing classes. What we are actually calculating is this:

numerator of the posterior = P(female) * p(height∣female) * p(weight∣female) * p(foot size∣female)


To do this, we just need to plug in the values of the unclassified person (height = 6), the variables of the dataset (e.g. mean of female height), and the function (p_x_given_y) we made above:

In [37]:
# Numerator of the posterior if the unclassified observation is a male
P_male * \
p_x_given_y(person['Height'][0], male_height_mean, male_height_variance) * \
p_x_given_y(person['Weight'][0], male_weight_mean, male_weight_variance) * \
p_x_given_y(person['Foot_Size'][0], male_footsize_mean, male_footsize_variance)

6.1970718438780782e-09

In [38]:
# Numerator of the posterior if the unclassified observation is a female
P_female * \
p_x_given_y(person['Height'][0], female_height_mean, female_height_variance) * \
p_x_given_y(person['Weight'][0], female_weight_mean, female_weight_variance) * \
p_x_given_y(person['Foot_Size'][0], female_footsize_mean, female_footsize_variance)

0.00053779091836300176

## Because the numerator of the posterior for female is greater than that for male, we predict that the person is female.
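
If actual posterior probabilities are wanted rather than just a comparison, the two numerators can be divided by their sum, which is the marginal probability for this two-class problem. A minimal sketch, with the numerators copied from the outputs above and illustrative variable names:

```python
# Numerators printed by the two cells above
numerator_male = 6.1970718438780782e-09
numerator_female = 0.00053779091836300176

# With two classes, the marginal probability is the sum of the numerators
marginal = numerator_male + numerator_female

posterior_male = numerator_male / marginal
posterior_female = numerator_female / marginal

print("P(Male | ObservationData):", posterior_male)
print("P(Female | ObservationData):", posterior_female)
```

The two posteriors sum to 1, and nearly all of the probability mass falls on female, in line with the prediction above.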