# Naive Bayes Theorem

Bayes' Theorem is the basis of a branch of statistics called *Bayesian Statistics*, where we take prior knowledge into account before calculating new probabilities.

## Independent Events

The ability to determine whether two events are _independent_ is an important skill for statistics.

If two events are __independent__, then the occurrence of one event does not affect the probability of the other event. For example:

_I take the subway to work; I eat sushi for lunch._

If two events are __dependent__, then when one event occurs, the probability of the other event occurring changes in a predictable way. For example:

_It rains on Wednesday; I carry an umbrella on Wednesday._

## Conditional Probability

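_Conditional probability_ is the probability that one event happens given that another event has already happened. A closely related quantity is the probability that two events both happen, which is easiest to calculate when the two events are independent.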

If the probability of event `A` is `P(A)`, the probability of event `B` is `P(B)`, and the two events are independent, then the probability of both events occurring is the product of the probabilities:

$P(A ∩ B) = P(A) \times P(B)$
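
As a quick check of the product rule, the sketch below enumerates all 36 outcomes of rolling two fair dice. The two events (first die is even, second die shows a 6) are assumptions chosen for illustration, and `probability` is a hypothetical helper, not part of any library.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def probability(event):
    """Fraction of outcomes for which event(outcome) is True."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

# Illustrative events (assumed for this sketch):
# A = first die is even, B = second die shows a 6.
def is_a(outcome):
    return outcome[0] % 2 == 0

def is_b(outcome):
    return outcome[1] == 6

p_a = probability(is_a)
p_b = probability(is_b)
p_a_and_b = probability(lambda o: is_a(o) and is_b(o))

print(p_a, p_b, p_a_and_b)     # 1/2 1/6 1/12
print(p_a_and_b == p_a * p_b)  # True
```

Because the two dice do not influence each other, counting the outcomes directly gives the same answer as multiplying the two probabilities.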

## Bayes' Theorem

A __prior__ is extra information about how we expect the world to work.

When we draw conclusions only from the data we observe, such as the result of a single test, it's called a __Frequentist Approach__ to statistics. When we also incorporate our prior, it's called a __Bayesian Approach__.

In statistics, if we have two events (`A` and `B`), we write the probability that event `A` will happen, given that event `B` has already happened, as `P(A|B)`.
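
To make the notation concrete, here is a minimal sketch that computes `P(A|B)` by counting outcomes of a single fair die, using the standard definition `P(A|B) = P(A ∩ B) / P(B)`. The events (the roll is even, the roll is greater than 3) are assumptions chosen for illustration.

```python
from fractions import Fraction

rolls = range(1, 7)  # one fair six-sided die

# Illustrative events (assumed for this sketch):
# A = the roll is even, B = the roll is greater than 3.
b_outcomes = [r for r in rolls if r > 3]
a_and_b_outcomes = [r for r in b_outcomes if r % 2 == 0]

p_b = Fraction(len(b_outcomes), len(rolls))
p_a_and_b = Fraction(len(a_and_b_outcomes), len(rolls))

# Definition of conditional probability: P(A|B) = P(A ∩ B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 2/3
```

Knowing that the roll is greater than 3 raises the probability of an even roll from 1/2 to 2/3, so these two events are dependent.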

We can calculate `P(A|B)` using __Bayes' Theorem__, which states:

$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$

It's important to note that on the right side of the equation, we have the term `P(B|A)`. This is the probability that event `B` will happen given that event `A` has already happened. This is very different from `P(A|B)`, which is the probability we are trying to solve for. The order matters!
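
The difference between `P(B|A)` and `P(A|B)` can be seen with a small sketch. The function name `bayes_posterior` and the die events below are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def bayes_posterior(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    return p_b_given_a * p_a / p_b

# Illustrative die example (assumed): A = roll is a 6, B = roll is even.
p_a = Fraction(1, 6)
p_b = Fraction(1, 2)
p_b_given_a = Fraction(1)  # a 6 is always even

p_a_given_b = bayes_posterior(p_b_given_a, p_a, p_b)
print(p_b_given_a, p_a_given_b)  # 1 1/3
```

A roll of 6 is certain to be even, but an even roll is a 6 only one time in three, so swapping the order of the events changes the answer.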

In [1]:
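# P(positive | disease) = P(positive and disease) / P(disease)
p_positive_given_disease = (0.99 * 0.00001) / (1. / 100000.)
print(p_positive_given_disease)

# P(disease): the disease affects 1 in 100,000 people
p_disease = 1. / 100000.
print(p_disease)

# P(positive), approximated as P(disease) + 0.01
# (0.01 is presumably the test's false-positive rate)
p_positive = 0.00001 + 0.01
print(p_positive)

# Bayes' Theorem: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
p_disease_given_positive = p_positive_given_disease * p_disease / p_positive
print(p_disease_given_positive)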

0.9899999999999999
1e-05
0.01001
0.000989010989010989
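
The cell above approximates `P(positive)` as `0.00001 + 0.01`. A slightly more careful sketch uses the law of total probability instead. It assumes that 0.99 is the chance of a positive result for someone with the disease and 0.01 is the chance of a positive result for someone without it; neither interpretation is stated explicitly in the text, and the name `p_positive_given_healthy` is introduced here.

```python
# Assumed interpretation of the numbers used in the cell above.
p_disease = 1 / 100_000
p_positive_given_disease = 0.99
p_positive_given_healthy = 0.01

# Law of total probability:
# P(positive) = P(positive|disease) * P(disease)
#             + P(positive|healthy) * P(healthy)
p_positive = (p_positive_given_disease * p_disease
              + p_positive_given_healthy * (1 - p_disease))

# Bayes' Theorem
p_disease_given_positive = p_positive_given_disease * p_disease / p_positive
print(p_disease_given_positive)
```

Under these assumptions the result stays at roughly 0.1%, so the approximation barely matters here: because the disease is so rare, a positive test still leaves the disease very unlikely.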


## Spam Filters

Email spam filters use Bayes' Theorem to determine if certain words indicate that an email is spam.

Let’s take a word that often appears in spam: “enhancement”.

With just 3 facts, we can make some preliminary steps towards a good spam filter:

* “enhancement” appears in just 0.1% of non-spam emails
* “enhancement” appears in 5% of spam emails
* Spam emails make up about 20% of total emails

Given that an email contains “enhancement”, *what is the probability that the email is spam?*
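
One way to set up the calculation is to write $S$ for "the email is spam" and $E$ for "the email contains 'enhancement'" (symbols introduced here for illustration), and to expand $P(E)$ over spam and non-spam emails:

$$P(S|E) = \frac{P(E|S) \times P(S)}{P(E|S) \times P(S) + P(E|\neg S) \times P(\neg S)} = \frac{0.05 \times 0.2}{0.05 \times 0.2 + 0.001 \times 0.8} = \frac{0.01}{0.0108} \approx 0.926$$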

In [2]:
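# A = the email is spam, B = the email contains "enhancement"
a = 'spam'
b = 'enhancement'

p_spam = 0.2                          # P(A)
p_enhancement_given_spam = 0.05       # P(B|A)
p_enhancement_given_not_spam = 0.001  # P(B|not A)

# P(B), by the law of total probability
p_enhancement = (p_enhancement_given_spam * p_spam) + p_enhancement_given_not_spam * (1 - p_spam)

# P(A|B), by Bayes' Theorem
p_spam_given_enhancement = (p_enhancement_given_spam * p_spam) / p_enhancement

print(p_spam_given_enhancement)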

0.9259259259259259
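
To see how much the prior matters, the sketch below repeats the calculation for a few different spam rates. Only the 20% figure comes from the facts above; the 5% and 50% priors are hypothetical values added for comparison, and `p_spam_given_word` is a helper name introduced here.

```python
def p_spam_given_word(p_word_given_spam, p_word_given_not_spam, p_spam):
    """Posterior P(spam | word) via Bayes' Theorem."""
    # Law of total probability for P(word)
    p_word = p_word_given_spam * p_spam + p_word_given_not_spam * (1 - p_spam)
    return p_word_given_spam * p_spam / p_word

# 0.2 is the prior from the text; 0.05 and 0.5 are hypothetical alternatives.
for prior in (0.05, 0.2, 0.5):
    posterior = p_spam_given_word(0.05, 0.001, prior)
    print(f"P(spam) = {prior:.2f} -> P(spam | 'enhancement') = {posterior:.3f}")
```

The posterior rises as the prior rises: the same word is stronger evidence of spam in a mailbox that already receives a lot of spam.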


# Review

* Two events are independent if the occurrence of one event does not affect the probability of the second event
* If two events are independent then:\
$P(A ∩ B) = P(A) \times P(B)$
* A _prior_ is an additional piece of information that tells us how likely an event is
* A _Frequentist approach_ to statistics does not incorporate a prior
* A _Bayesian approach_ to statistics incorporates prior knowledge
* Bayes' Theorem is the following:\
$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$

# Dr. Dirac's Statistics Midterm

Dr. Dirac is administering a statistics midterm exam and wants to use Bayes' Theorem to help him understand the following:
* *Given that a student answered a question correctly, what is the probability that she really knows the material?*

Dr. Dirac knows the following probabilities based on many years of teaching:
* There is a question on the exam that 60% of students know the correct answer to.
* Given that a student knows the correct answer, there is still a 15% chance that the student picks the wrong answer.
* Given that a student does not know the answer, there is still a 20% chance that the student picks the correct answer by guessing.

Using these probabilities, we can answer the question.

In [3]:
# P(A|B) = P(knows the material|answers correctly)
a = 'student knows the material'
b = 'student answers a question correctly'

# P(A) or P(knows the material)
p_knows_material = 0.60

# P(not A) or P(does not know the material): the remaining 40% of students
p_does_not_know_material = 0.40

# P(B|A) or P(answers correctly|knows the material)
p_answers_correctly_given_knows_material = 1 - 0.15

# P(B|not A) or P(answers correctly|does not know the material)
p_answers_correctly_given_guessing = 0.20

# P(B) =
# P(answers correctly|knows the material) * P(knows the material) +
# P(answers correctly|does not know the material) * P(does not know the material)
p_any_student_answers_correctly = (p_answers_correctly_given_knows_material * p_knows_material) + (p_answers_correctly_given_guessing * p_does_not_know_material)

# calculating P(A|B) = P(knows the material|answers correctly)
p_knows_material_given_answers_correctly = p_answers_correctly_given_knows_material * p_knows_material / p_any_student_answers_correctly

print(p_knows_material_given_answers_correctly)

0.8644067796610169
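
As a sanity check on this result, the sketch below simulates a large number of students with the probabilities Dr. Dirac provided and counts how many of the correct answers came from students who knew the material. The sample size and random seed are arbitrary choices made for this illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

n_students = 200_000
answered_correctly = 0
knew_and_correct = 0

for _ in range(n_students):
    # 60% of students know the material.
    knows = random.random() < 0.60
    # 85% correct if the student knows the material, 20% by guessing otherwise.
    p_correct = 0.85 if knows else 0.20
    correct = random.random() < p_correct
    if correct:
        answered_correctly += 1
        if knows:
            knew_and_correct += 1

# Share of correct answers that came from students who knew the material.
estimate = knew_and_correct / answered_correctly
print(estimate)
```

The estimate should land close to the value from Bayes' Theorem, with small differences from run to run if the seed is changed.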
