# Naive Bayes implementation from scratch

**Context**
* Two political parties' candidates:
    * Jill Stein of the Green Party (J)
    * Gary Johnson of the Libertarian Party (G)
* We have the probabilities of these candidates saying the words "freedom", "immigration" and "environment" (F, I, E)

**Objective**
* Find the probability that each candidate gave a speech, given that the speech contains the words "freedom" and "immigration"
    * P(J|F,I)
    * P(G|F,I)

# Data

**Probabilities of giving a speech**
* P(J) = 0.5
* P(G) = 0.5

**Probabilities of candidates saying words**
* P(F|J) = 0.1
* P(I|J) = 0.1
* P(E|J) = 0.8

* P(F|G) = 0.7
* P(I|G) = 0.2
* P(E|G) = 0.1

In [16]:
# Probabilities of giving a speech
p_j = 0.5
p_g = 0.5

# Probabilities of candidates saying words
p_f_j = 0.1
p_i_j = 0.1
p_e_j = 0.8

p_f_g = 0.7
p_i_g = 0.2
p_e_g = 0.1
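
Because these values are typed in by hand, a quick sanity check can catch typos. The sketch below assumes that "freedom", "immigration" and "environment" are the only three words in the vocabulary, so each candidate's word probabilities should sum to 1. The dictionary names `priors` and `word_probs` are introduced here for illustration and are not part of the original notebook.

```python
import math

# Hypothetical dictionary layout (not in the original notebook) holding the same values
priors = {"J": 0.5, "G": 0.5}
word_probs = {
    "J": {"F": 0.1, "I": 0.1, "E": 0.8},
    "G": {"F": 0.7, "I": 0.2, "E": 0.1},
}

# The priors should form a probability distribution over the candidates
assert math.isclose(sum(priors.values()), 1.0)

# Assumption: F, I and E are the only words, so each candidate's values should sum to 1
for candidate, probs in word_probs.items():
    assert math.isclose(sum(probs.values()), 1.0), candidate
```

`math.isclose` is used instead of `==` because sums of floating-point values such as 0.7 + 0.2 + 0.1 are not guaranteed to equal 1.0 exactly.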

# Calculation of probabilities

The probabilities P(J|F,I) and P(G|F,I) can be computed using Bayes' theorem as follows:

* P(J|F,I) = P(F,I|J) * P(J) / P(F,I)
* P(G|F,I) = P(F,I|G) * P(G) / P(F,I)

and, assuming the words are conditionally independent given the candidate (the "naive" assumption),

* P(F,I|J) = P(F|J) * P(I|J)
* P(F,I|G) = P(F|G) * P(I|G)

and, by the law of total probability,

* P(F,I) = P(J) * P(F,I|J) + P(G) * P(F,I|G)
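
As a worked check, plugging the values from the Data section into these formulas gives the following (final results rounded to four decimal places):

$$
\begin{aligned}
P(F,I \mid J) &= 0.1 \times 0.1 = 0.01 \\
P(F,I \mid G) &= 0.7 \times 0.2 = 0.14 \\
P(F,I) &= 0.5 \times 0.01 + 0.5 \times 0.14 = 0.075 \\
P(J \mid F,I) &= \frac{0.01 \times 0.5}{0.075} \approx 0.0667 \\
P(G \mid F,I) &= \frac{0.14 \times 0.5}{0.075} \approx 0.9333
\end{aligned}
$$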

In [26]:
# Compute the joint probabilities P(J) * P(F,I|J) and P(G) * P(F,I|G)
# for a speech containing the words "freedom" and "immigration"
p_j_text = p_j * p_f_j * p_i_j
p_g_text = p_g * p_f_g * p_i_g

print("Probability of Jill Stein giving a speech saying the words: %s" % p_j_text)
print("Probability of Gary Johnson giving a speech saying the words: %s" % p_g_text)

Probability of Jill Stein giving a speech saying the words: 0.005000000000000001
Probability of Gary Johnson giving a speech saying the words: 0.06999999999999999


In [27]:
# Compute the marginal probability (evidence) of the words being said
p_f_i = p_j_text + p_g_text

print("Probability of words being said: %s" % p_f_i)

Probability of words being said: 0.075


In [28]:
# Compute the posterior probabilities P(J|F,I) and P(G|F,I)
p_j_fi = p_j_text / p_f_i
p_g_fi = p_g_text / p_f_i

print("Probability that Jill Stein gave the speech, given the words: %s" % p_j_fi)
print("Probability that Gary Johnson gave the speech, given the words: %s" % p_g_fi)

Probability that Jill Stein gave the speech, given the words: 0.06666666666666668
Probability that Gary Johnson gave the speech, given the words: 0.9333333333333332
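
The same three steps (joint probability, marginal, posterior) generalize to any combination of words. The following is a minimal sketch and not part of the original notebook: the function name `posterior` and the dictionary layout are assumptions made for illustration, it relies on the same naive independence assumption as the formulas above, and `math.prod` requires Python 3.8 or later.

```python
import math

# Hypothetical dictionary layout holding the values from the Data section
priors = {"J": 0.5, "G": 0.5}
word_probs = {
    "J": {"F": 0.1, "I": 0.1, "E": 0.8},
    "G": {"F": 0.7, "I": 0.2, "E": 0.1},
}

def posterior(words):
    """Return P(candidate | words) for every candidate."""
    # Joint probability: P(candidate) times the product of P(word | candidate)
    joint = {
        c: priors[c] * math.prod(word_probs[c][w] for w in words)
        for c in priors
    }
    # Marginal probability of the words (law of total probability)
    evidence = sum(joint.values())
    # Bayes' theorem: normalize each joint probability by the evidence
    return {c: joint[c] / evidence for c in joint}

result = posterior(["F", "I"])
print(result)
```

Calling `posterior(["F", "I"])` should give the same posteriors as the cells above, up to floating-point rounding, and the returned values always sum to 1 because they are normalized by the same evidence term.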


## Conclusion

* Given a speech that contains both "freedom" and "immigration", there is only a 6.7% chance that the speaker is Jill Stein of the Green Party
* ... compared to a 93.3% chance that it is Gary Johnson of the Libertarian Party