# Gender classification model through Naive Bayes

## When to use Naive Bayes
- Naive Bayes handles categorical features, whether they are stored as strings or as numbers treated as categories.
- It is a classification algorithm; it cannot be used for regression.
- It works well for binary classification.
- It extends to multiclass classification in the same way: compute the probability of each class label, then pick the class with the largest probability.
- The number of parameters depends on the number of features, feature values, and classes, not on the size of the training data.
- It scales well to large datasets because training only requires counting.
- It gives fast predictions.
- Naive Bayes can also be a very good text classifier, for example on spam/ham datasets.
- For discrete features, Naive Bayes acts as a linear classifier in log-probability space, so it may not be suitable when the classes in a dataset are not linearly separable.
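
The multiclass rule described in the list above can be sketched in a few lines. The class names and scores here are made-up numbers for illustration only; they are not values from the dataset.

```python
# Hypothetical unnormalized scores p(X | class) * p(class) for three classes.
scores = {"class_a": 0.012, "class_b": 0.030, "class_c": 0.018}

# Normalize so the scores sum to 1 and can be read as probabilities.
total = sum(scores.values())
posteriors = {label: score / total for label, score in scores.items()}

# Predict the class with the largest probability.
predicted = max(posteriors, key=posteriors.get)
print(predicted)  # class_b
```

The binary case is the same procedure with only two entries in the dictionary.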

## Assumptions
- It assumes that the features of a dataset are independent of each other given the class. This is rarely true in practice, which is why the algorithm is called 'naive'.
    

In [1]:
import pandas as pd
import numpy as np
import pprint

#### Reading dataset

In [2]:
df = pd.read_csv('./dataset/dataset.csv')
df.head()

Unnamed: 0,Favorite Color,Favorite Music Genre,Favorite Beverage,Favorite Soft Drink,Gender
0,Cool,Rock,Vodka,7UP/Sprite,F
1,Neutral,Hip hop,Vodka,Coca Cola/Pepsi,F
2,Warm,Rock,Wine,Coca Cola/Pepsi,F
3,Warm,Folk/Traditional,Whiskey,Fanta,F
4,Cool,Rock,Vodka,Coca Cola/Pepsi,F


#### Probabilities and logic

X = {Favorite Color, Favorite Music Genre, Favorite Beverage, Favorite Soft Drink}
  = {f_color, f_music, f_bev, f_drink}
y = {Gender}

$
\begin{align}
    p( A | B ) &= \text{probability of event A given that event B has already happened} \\
               &= \frac{p( B | A ) p(A)}{p(B)}
\end{align}
$

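This algorithm assumes that the features (inputs) are independent of each other, so their joint probability factorizes into the product of the individual probabilities. Strictly speaking, Naive Bayes only needs this independence to hold within each class (conditional independence given y); the derivation below also applies it to the denominator for simplicity.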

$
\begin{align}
    p(A, B, C, ... , Z) &= p(A) p(B) p(C) ... p(Z)
\end{align}
$

$
\begin{align}
    p(\text{ f_color, f_music, f_bev, f_drink }) = p(\text{f_color}) p(\text{f_music}) p(\text{f_bev})  p(\text{f_drink})
\end{align}
$

In our case,


$
\begin{align}
    p( y | X ) &= \text{probability of event y given that event X has already happened} \\ \\
    p( y | X ) &= \frac{p( X | y ) p(y)}{p(X)} \\ \\
    p( y | \text{ [f_color, f_music, f_bev, f_drink] }) &= \frac{p( \text{ [f_color, f_music, f_bev, f_drink] }| y ) p(y)}{p(\text{ f_color, f_music, f_bev, f_drink })}\\ \\
    &= \frac{p(\text{f_color} | y) p(\text{f_music} | y) p(\text{f_bev} | y)p(\text{f_drink} | y)p(y)}{p(\text{f_color}) p(\text{f_music}) p(\text{f_bev})  p(\text{f_drink})}
\end{align}
$
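
As a rough illustration of the last line of the derivation, the numerator is just a product of per-feature conditional probabilities and the class prior. The helper name `unnormalized_posterior` and the probability values are assumptions chosen for this sketch; they are not taken from the dataset.

```python
import math

def unnormalized_posterior(likelihoods, prior):
    """Return p(f_1|y) * p(f_2|y) * ... * p(y) for one class y."""
    return math.prod(likelihoods) * prior

# Hypothetical values of p(f_color|y), p(f_music|y), p(f_bev|y), p(f_drink|y)
likelihoods_for_y = [0.5, 0.2, 0.1, 0.4]
prior_y = 0.5

score = unnormalized_posterior(likelihoods_for_y, prior_y)
print(score)
```

The denominator is the same for every class, so comparing these scores across classes is enough to pick the most probable one.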

Let me calculate for one column,

|Favorite Color    |F        |M        |p( ... \|F)   |p( ... \|M)   |
|------------------|---------|---------|--------------|--------------|
|Cool              |17       |20       |17/33 = 0.515 |20/33 = 0.606 |
|Neutral           |3        |4        |3/33 = 0.091  |4/33 = 0.121  |
|Warm              |13       |9        |13/33 = 0.394 |9/33 = 0.273  |
|Total             |33       |33       |1             |1             |

|Gender |Frequency|p(F), p(M) |
|-------|---------|-----------|
|F      |33       |33/66 = 0.5|
|M      |33       |33/66 = 0.5|
|Total  |66       |1          |
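
The two tables can be reproduced from the raw counts, and with a single feature Bayes' theorem can be applied exactly. The sketch below uses only the counts shown above; the variable names are illustrative.

```python
from fractions import Fraction

# Counts of each favorite color per gender, copied from the table above.
counts = {
    "Cool": {"F": 17, "M": 20},
    "Neutral": {"F": 3, "M": 4},
    "Warm": {"F": 13, "M": 9},
}
totals = {"F": 33, "M": 33}
n_rows = sum(totals.values())  # 66

# Conditional probabilities p(color | gender) and priors p(gender).
cond = {
    color: {g: Fraction(c[g], totals[g]) for g in totals}
    for color, c in counts.items()
}
prior = {g: Fraction(totals[g], n_rows) for g in totals}

# Bayes' theorem for one feature: p(M | Cool) = p(Cool | M) p(M) / p(Cool)
p_cool = Fraction(sum(counts["Cool"].values()), n_rows)
p_m_given_cool = cond["Cool"]["M"] * prior["M"] / p_cool

print(cond["Cool"]["M"])  # 20/33
print(p_m_given_cool)     # 20/37
```

Knowing only that the favorite color is Cool, the estimate for M is 20/37, which is simply the share of men among the 37 people who chose Cool.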

For example, suppose we want to predict the gender for:

f_color = Cool

f_music = Pop

f_bev = Vodka

f_drink = Fanta

Gender = ?

$
\begin{align}
\text{p(M | Cool, Pop, Vodka, Fanta)} &= \frac{\text{p(Cool|M) p(Pop|M) p(Vodka|M) p(Fanta|M) p(M)}}{\text{p(Cool) p(Pop) p(Vodka) p(Fanta)}} \\ \\
&=  \frac{0.606 \cdot \text{p(Pop|M) p(Vodka|M) p(Fanta|M)} \cdot 0.5}{\text{p(Cool) p(Pop) p(Vodka) p(Fanta)}}
\end{align}
$

$
\begin{align}
\text{p(F | Cool, Pop, Vodka, Fanta)} &= \frac{\text{p(Cool|F) p(Pop|F) p(Vodka|F) p(Fanta|F) p(F)}}{\text{p(Cool) p(Pop) p(Vodka) p(Fanta)}} \\ \\
&= \frac{0.515 \cdot \text{p(Pop|F) p(Vodka|F) p(Fanta|F)} \cdot 0.5}{\text{p(Cool) p(Pop) p(Vodka) p(Fanta)}}
\end{align}
$

In [3]:
# Count each feature value per gender and convert the counts to p(value | gender).
n_f = len(df.loc[df['Gender'] == 'F'].index)  # number of female rows
n_m = len(df.loc[df['Gender'] == 'M'].index)  # number of male rows

conditional_p = {}
for each_column in df.columns.drop('Gender'):  # features only; skip the label column
    conditional_p[each_column] = {}
    for each_value in df[each_column].unique():
        is_value = df[each_column] == each_value
        count_f = int((is_value & (df['Gender'] == 'F')).sum())
        count_m = int((is_value & (df['Gender'] == 'M')).sum())
        conditional_p[each_column][each_value] = {
            'F': count_f,
            'M': count_m,
            f'p( {each_value} | F)': count_f / n_f,
            f'p( {each_value} | M)': count_m / n_m,
        }

# Class priors p(M) and p(F)
m_p = n_m / len(df.index)
f_p = n_f / len(df.index)
print(m_p, f_p)
pprint.pprint(conditional_p)
           

0.5 0.5
{'Favorite Beverage': {'Beer': {'F': 6,
                                'M': 7,
                                'p( Beer | F)': 0.18181818181818182,
                                'p( Beer | M)': 0.21212121212121213},
                       "Doesn't drink": {'F': 5,
                                         'M': 9,
                                         "p( Doesn't drink | F)": 0.15151515151515152,
                                         "p( Doesn't drink | M)": 0.2727272727272727},
                       'Other': {'F': 7,
                                 'M': 4,
                                 'p( Other | F)': 0.21212121212121213,
                                 'p( Other | M)': 0.12121212121212122},
                       'Vodka': {'F': 4,
                                 'M': 5,
                                 'p( Vodka | F)': 0.12121212121212122,
                                 'p( Vodka | M)': 0.15151515151515152},
                       'Whiskey': {'F': 5,
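
The same counting step can also be written without pandas. The sketch below is an alternative formulation, not the notebook's code: it uses `collections.Counter` on a tiny made-up list of rows (the rows and the helper name `fit_counts` are hypothetical) to estimate p(value | gender).

```python
from collections import Counter, defaultdict

# Hypothetical toy rows: (favorite color, favorite beverage, gender)
rows = [
    ("Cool", "Vodka", "F"),
    ("Warm", "Wine", "F"),
    ("Cool", "Beer", "M"),
    ("Cool", "Vodka", "M"),
]
feature_names = ["Favorite Color", "Favorite Beverage"]

def fit_counts(rows, feature_names):
    """Estimate p(value | label) for every feature by counting."""
    label_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (feature, label) -> Counter of values
    for *values, label in rows:
        for name, value in zip(feature_names, values):
            value_counts[(name, label)][value] += 1
    cond = {
        key: {v: n / label_counts[key[1]] for v, n in counter.items()}
        for key, counter in value_counts.items()
    }
    return cond, label_counts

cond, label_counts = fit_counts(rows, feature_names)
print(cond[("Favorite Color", "M")])  # {'Cool': 1.0}
print(cond[("Favorite Color", "F")])  # {'Cool': 0.5, 'Warm': 0.5}
```

A value that never occurs for a class is simply absent from the dictionary here, which corresponds to an estimated probability of zero.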
        

#### Testing

Let us predict the gender for my own preferences:

color = Cool

music = Folk/Traditional

bev = Beer

drink = 7UP/Sprite

Male prob = ?

Female prob = ?

$
\begin{align}
\text{p(M | Cool, Folk/Traditional, Beer, 7UP/Sprite)} &= \frac{\text{p(Cool|M) p(Folk/Traditional|M) p(Beer|M) p(7UP/Sprite|M) p(M)}}{\text{p(Cool) p(Folk/Traditional) p(Beer) p(7UP/Sprite)}} \\ \\
\end{align}
$

$
\begin{align}
\text{p(F | Cool, Folk/Traditional, Beer, 7UP/Sprite)} &= \frac{\text{p(Cool|F) p(Folk/Traditional|F) p(Beer|F) p(7UP/Sprite|F) p(F)}}{\text{p(Cool) p(Folk/Traditional) p(Beer) p(7UP/Sprite)}} \\ \\
\end{align}
$


---
$
\begin{align}
\text{p(M | Cool, Folk/Traditional, Beer, 7UP/Sprite)} + \text{p(F | Cool, Folk/Traditional, Beer, 7UP/Sprite)} = 1 \\ \\
\end{align}
$

$
\begin{align}
\frac{\text{p(Cool|M) p(Folk/Traditional|M) p(Beer|M) p(7UP/Sprite|M) p(M)}}{\text{p(Cool) p(Folk/Traditional) p(Beer) p(7UP/Sprite)}} + \frac{\text{p(Cool|F) p(Folk/Traditional|F) p(Beer|F) p(7UP/Sprite|F) p(F)}}{\text{p(Cool) p(Folk/Traditional) p(Beer) p(7UP/Sprite)}} &= 1 \\ \\
\end{align}
$

$
\begin{align}
\frac{\text{p(Cool|M) p(Folk/Traditional|M) p(Beer|M) p(7UP/Sprite|M) p(M)} + \text{p(Cool|F) p(Folk/Traditional|F) p(Beer|F) p(7UP/Sprite|F) p(F)}}{\text{p(Cool) p(Folk/Traditional) p(Beer) p(7UP/Sprite)}} &= 1 \\ \\
\end{align}
$

$
\begin{align}
\text{p(Cool|M) p(Folk/Traditional|M) p(Beer|M) p(7UP/Sprite|M) p(M)} + \text{p(Cool|F) p(Folk/Traditional|F) p(Beer|F) p(7UP/Sprite|F) p(F)} &= \text{p(Cool) p(Folk/Traditional) p(Beer) p(7UP/Sprite)} \\ \\
\end{align}
$

To shorten the notation, write:

$
\begin{align}
\text{p(Cool|M) p(Folk/Traditional|M) p(Beer|M) p(7UP/Sprite|M) p(M)} &= \text{p(c_m) p(t_m) p(b_m) p(s_m) p(M)} \\
\text{p(Cool|F) p(Folk/Traditional|F) p(Beer|F) p(7UP/Sprite|F) p(F)} &= \text{p(c_f) p(t_f) p(b_f) p(s_f) p(F)} \\
\text{p(Cool) p(Folk/Traditional) p(Beer) p(7UP/Sprite)} &= \text{p(c) p(t) p(b) p(s)}
\end{align}
$

$
\begin{align}
\text{p(c_m) p(t_m) p(b_m) p(s_m) p(M)} + \text{p(c_f) p(t_f) p(b_f) p(s_f) p(F)} &= \text{p(c) p(t) p(b) p(s)} \\ \\
\end{align}
$

Since the denominator equals the sum of the two numerators, it never has to be computed separately. Substituting it back gives:

$
\begin{align}
\text{p(M | Cool, Folk/Traditional, Beer, 7UP/Sprite)} &= \frac{\text{p(c_m) p(t_m) p(b_m) p(s_m) p(M)}}{\text{p(c_m) p(t_m) p(b_m) p(s_m) p(M)} + \text{p(c_f) p(t_f) p(b_f) p(s_f) p(F)}}
\end{align}
$


In [7]:
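# Note: these inputs differ from the worked example above (Rock and "Doesn't drink"
# instead of Folk/Traditional and Beer); the printed result is for the values below.
f_c = 'Cool'  # favorite color
f_s = 'Rock'  # favorite music genre
f_b = "Doesn't drink"  # favorite beverage
f_d = '7UP/Sprite'  # favorite soft drink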

cp_f_c_m = conditional_p['Favorite Color'][f_c]['p( ' + f_c + ' | M)']           # p(Cool|M)
cp_f_s_m = conditional_p['Favorite Music Genre'][f_s]['p( ' + f_s + ' | M)']     # p(Rock|M)
cp_f_b_m = conditional_p['Favorite Beverage'][f_b]['p( ' + f_b + ' | M)']        # p(Doesn't drink|M)
cp_f_d_m = conditional_p['Favorite Soft Drink'][f_d]['p( ' + f_d + ' | M)']      # p(7UP|M)

cp_f_c_f = conditional_p['Favorite Color'][f_c]['p( ' + f_c + ' | F)']           # p(Cool|F)
cp_f_s_f = conditional_p['Favorite Music Genre'][f_s]['p( ' + f_s + ' | F)']     # p(Rock|F)
cp_f_b_f = conditional_p['Favorite Beverage'][f_b]['p( ' + f_b + ' | F)']        # p(Doesn't drink|F)
cp_f_d_f = conditional_p['Favorite Soft Drink'][f_d]['p( ' + f_d + ' | F)']      # p(7UP|F)

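# Normalizing constant: the sum of the unnormalized scores of both classes
p_t = (cp_f_c_m * cp_f_s_m * cp_f_b_m * cp_f_d_m * m_p) + (cp_f_c_f * cp_f_s_f * cp_f_b_f * cp_f_d_f * f_p)

# Posterior probability of each class; the two values sum to 1
p_m = (cp_f_c_m * cp_f_s_m * cp_f_b_m * cp_f_d_m * m_p) / p_t
p_f = (cp_f_c_f * cp_f_s_f * cp_f_b_f * cp_f_d_f * f_p) / p_t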

print('Male prob: {}, Female prob: {}'.format(p_m, p_f))

Male prob: 0.5436241610738255, Female prob: 0.45637583892617456
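
The hard-coded lookups in the cell above could be wrapped in a small helper. This is a sketch, not the notebook's code: the function name `posterior` is hypothetical, and the dictionary is a two-feature excerpt of `conditional_p` built from counts shown earlier (17/33 and 20/33 for Cool, 5/33 and 9/33 for Doesn't drink), so the result differs from the four-feature output above.

```python
import math

# Excerpt in the same layout as conditional_p, restricted to two feature values.
conditional_p_excerpt = {
    "Favorite Color": {"Cool": {"p( Cool | F)": 17 / 33, "p( Cool | M)": 20 / 33}},
    "Favorite Beverage": {
        "Doesn't drink": {
            "p( Doesn't drink | F)": 5 / 33,
            "p( Doesn't drink | M)": 9 / 33,
        }
    },
}
priors = {"M": 0.5, "F": 0.5}

def posterior(conditional_p, priors, observation):
    """Return normalized class probabilities for one observation."""
    scores = {}
    for label, prior in priors.items():
        likelihoods = [
            conditional_p[column][value][f"p( {value} | {label})"]
            for column, value in observation.items()
        ]
        scores[label] = math.prod(likelihoods) * prior
    total = sum(scores.values())  # plays the role of p_t in the cell above
    return {label: score / total for label, score in scores.items()}

result = posterior(
    conditional_p_excerpt,
    priors,
    {"Favorite Color": "Cool", "Favorite Beverage": "Doesn't drink"},
)
print(result)
```

With these two features the score for M is proportional to 20 × 9 = 180 and the score for F to 17 × 5 = 85, so the helper should return roughly 0.68 for M and 0.32 for F.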
