# 06 Probability

For our purposes you should think of probability as a way of quantifying the uncertainty associated with events chosen from some universe of events.

Notationally, we write P(E) to mean “the probability of the event E.”

We’ll use probability theory to build models. We’ll use probability theory to
evaluate models. We’ll use probability theory all over the place.

One could, were one so inclined, get really deep into the philosophy of what probability theory means. (This is best done over beers.) We won’t be doing that.


## Dependence and Independence

Roughly speaking, we say that two events E and F are dependent if knowing something about whether E happens gives us information about whether F happens (and vice versa). Otherwise, they are independent.

For instance, if we flip a fair coin twice, knowing whether the first flip is heads gives us no information about whether the second flip is heads. These events are independent. On the other hand, knowing whether the first flip is heads certainly gives us information about whether both flips are tails. (If the first flip is heads, then definitely it’s not the case that both flips are tails.) These two events are dependent.

Mathematically, we say that two events E and F are independent if the probability that they both happen is the product of the probabilities that each one happens:

$$
P(E, F) = P(E)P(F)
$$

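In the example, the probability of “first flip heads” is 1/2, and the probability of “both flips tails” is 1/4, but the probability of “first flip heads and both flips tails” is 0. Since $1/2 \times 1/4 = 1/8 \neq 0$, the product rule fails, which confirms that these two events are dependent.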
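The coin-flip example can also be checked exactly by enumerating the four equally likely outcomes. The following is a minimal sketch; the helper names (`prob`, `independent`, and the event predicates) are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# All four equally likely outcomes of two fair coin flips.
outcomes = list(product("HT", repeat=2))


def prob(event) -> Fraction:
    """Exact probability of an event, given as a predicate on an outcome."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


# Hypothetical event predicates for the example in the text.
def first_heads(o) -> bool:
    return o[0] == "H"


def second_heads(o) -> bool:
    return o[1] == "H"


def both_tails(o) -> bool:
    return o == ("T", "T")


def independent(e, f) -> bool:
    """Check the product rule P(E, F) == P(E) * P(F)."""
    return prob(lambda o: e(o) and f(o)) == prob(e) * prob(f)


print(independent(first_heads, second_heads))  # True
print(independent(first_heads, both_tails))    # False
```

Using `Fraction` keeps the comparison exact, so the equality test is not affected by floating-point rounding.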

## Conditional Probability


When two events E and F are independent, then by definition we have:

$$
P(E, F) = P(E)P(F)
$$

If they are not necessarily independent (and if the probability of F is not
zero), then we define the probability of E “conditional on F” as:

$$
P(E|F) = \frac{P(E, F)}{P(F)}
$$

You should think of this as the probability that E happens, given that we
know that F happens. We often rewrite this as:

$$
P(E, F) = P(E | F) P(F)
$$

When E and F are independent, you can check that this gives:

$$
P(E|F) = P(E)
$$

which is the mathematical way of expressing that knowing F occurred gives us no additional information about whether E occurred.
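To spell out that check, substitute the independence condition into the definition of conditional probability (assuming, as before, that the probability of F is not zero):

$$
P(E|F) = \frac{P(E, F)}{P(F)} = \frac{P(E)P(F)}{P(F)} = P(E)
$$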

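As an example, consider a family with two children, where each child is equally likely to be a boy or a girl, independently of the other. The simulation below estimates the probability that both children are girls, first conditional on the older child being a girl, and then conditional on at least one of the children being a girl.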

```python
import enum
import random


# An Enum is a typed set of enumerated values. We can use them
# to make our code more descriptive and readable.
class Kid(enum.Enum):
    BOY = 0
    GIRL = 1


def random_kid() -> Kid:
    return random.choice([Kid.BOY, Kid.GIRL])


both_girls = 0
older_girl = 0
either_girl = 0

random.seed(0)

for _ in range(10000):
    younger = random_kid()
    older = random_kid()
    if older == Kid.GIRL:
        older_girl += 1
    if older == Kid.GIRL and younger == Kid.GIRL:
        both_girls += 1
    if older == Kid.GIRL or younger == Kid.GIRL:
        either_girl += 1

# Estimate each conditional probability as a ratio of counts.
print('P(both | older):', both_girls / older_girl)
print('P(both | either):', both_girls / either_girl)
```

```
P(both | older): 0.5007089325501317
P(both | either): 0.3311897106109325
```
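The simulated values are close to 1/2 and 1/3. As a cross-check, the exact values can be obtained by counting the four equally likely families directly. This is a sketch under the same assumption as above (each child is independently a boy or a girl with equal probability); the names `families`, `conditional`, and the `is_*` predicates are hypothetical helpers introduced here.

```python
import enum
from fractions import Fraction
from itertools import product


class Kid(enum.Enum):
    BOY = 0
    GIRL = 1


# Each family is a (younger, older) pair; all four pairs are equally likely.
families = list(product(Kid, repeat=2))


def conditional(event, given) -> Fraction:
    """P(event | given), computed by counting equally likely families."""
    given_count = sum(1 for fam in families if given(fam))
    joint_count = sum(1 for fam in families if given(fam) and event(fam))
    return Fraction(joint_count, given_count)


def is_both_girls(fam) -> bool:
    return all(kid == Kid.GIRL for kid in fam)


def is_older_girl(fam) -> bool:
    return fam[1] == Kid.GIRL


def is_either_girl(fam) -> bool:
    return any(kid == Kid.GIRL for kid in fam)


print(conditional(is_both_girls, is_older_girl))   # 1/2
print(conditional(is_both_girls, is_either_girl))  # 1/3
```

The second result is 1/3 rather than 1/2 because three of the four families have at least one girl, and only one of those three has two girls.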


## Bayes’s Theorem

Bayes’s theorem is a way of "reversing" conditional probabilities.
Let's say we need to know the probability of some event E conditional
on some other event F occurring, but we only have information about the
probability of F conditional on E occurring. Using the definition of
conditional probability twice tells us that:

$$
P(E|F) = \frac{P(E, F)}{P(F)} = \frac{P(F|E)P(E)}{P(F)}
$$

The event F can be split into the two mutually exclusive events "F and E" and
"F and not E". If we write ¬E for "not E" (i.e. "E doesn't happen"), then

$$
P(F) = P(F, E) + P(F, \neg E)
$$
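
Applying the definition of conditional probability to each term on the right-hand side (a step the text leaves implicit) gives:

$$
P(F) = P(F|E)P(E) + P(F|\neg E)P(\neg E)
$$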

So that:

$$
P(E|F) = \frac{P(F|E)P(E)}{P(F|E)P(E) + P(F|\neg E)P(\neg E)}
$$

which is how Bayes’s theorem is often stated.

This theorem often gets used to demonstrate why data scientists are smarter than doctors. Imagine a certain disease that affects 1 in every 10,000 people. And imagine that there is a test for this disease that gives the correct result (“diseased” if you have the disease, “nondiseased” if you don’t) 99% of the time.

What does a positive test mean? Let’s use T for the event “your test is positive” and D for the event “you have the disease.” Then Bayes’s theorem says that the probability that you have the disease, conditional on testing positive, is:

$$
P(D|T) = \frac{P(T|D)P(D)}{P(T|D)P(D) + P(T|\neg D)P(\neg D)}
$$

Here we know that $P(T|D)$, the probability that someone with the disease
tests positive, is 0.99. $P(D)$, the probability that any given person has the
disease, is $1/10,000 = 0.0001$. $P(T|\neg D)$, the probability that someone
without the disease tests positive, is 0.01. And $P(\neg D)$, the probability
that any given person doesn't have the disease, is 0.9999. If you substitute
these numbers into Bayes’s theorem, you find:

$$
P(D|T) = 0.98\%
$$

That is, less than 1% of the people who test positive actually have the disease.

A more intuitive way to see this is to imagine a population of 1 million people. You’d expect 100 of them to have the disease, and 99 of those 100 to test positive. On the other hand, you’d expect 999,900 of them not to have the disease, and 9,999 of those to test positive. That means you’d expect only 99 out of (99 + 9999) positive testers to actually have the disease.
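
Both calculations can be reproduced in a few lines. The sketch below assumes the numbers from the example (a prevalence of 1 in 10,000 and a test that is correct 99% of the time); the function name `bayes_posterior` and its parameter names are illustrative, not from the original notes.

```python
def bayes_posterior(prior: float,
                    p_pos_given_true: float,
                    p_pos_given_false: float) -> float:
    """P(D | T) computed from P(D), P(T | D), and P(T | not D)."""
    numerator = p_pos_given_true * prior
    evidence = numerator + p_pos_given_false * (1 - prior)
    return numerator / evidence


p_disease = 1 / 10_000   # P(D)
accuracy = 0.99          # P(T | D); the false positive rate is 1 - accuracy

posterior = bayes_posterior(p_disease, accuracy, 1 - accuracy)
print(f"P(D | T) = {posterior:.2%}")  # P(D | T) = 0.98%

# The same answer by counting in a population of 1 million people.
population = 1_000_000
sick = population // 10_000                    # 100 people have the disease
true_positives = sick * 99 // 100              # 99 of them test positive
false_positives = (population - sick) // 100   # 9,999 healthy people test positive
print(true_positives / (true_positives + false_positives))
```

The two results agree because the counting argument is Bayes’s theorem with every probability multiplied by the population size.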