# Probability notes

### Conditional probability:
If two events depend on each other, what's the probability that B occurs once we know A has occurred?
Equation: P(B | A) = P(A and B) / P(A), defined when P(A) > 0
i.e. the probability of B given that A has occurred
P(A and B), also written P(A intersect B): the probability of A and B both occurring
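
As a small check of the formula, the sketch below counts outcomes for two fair dice. The events A ("first die shows 6") and B ("total is at least 10") are hypothetical choices made only for illustration, as is the helper `prob`.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes) by counting."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def a(o):
    # A (hypothetical event): the first die shows 6
    return o[0] == 6

def b(o):
    # B (hypothetical event): the total is at least 10
    return o[0] + o[1] >= 10

p_a = prob(a)
p_a_and_b = prob(lambda o: a(o) and b(o))
p_b_given_a = p_a_and_b / p_a  # P(B | A) = P(A and B) / P(A)

print(p_b_given_a)  # 1/2
print(prob(b))      # 1/6
```

Knowing that A occurred raises the probability of B from 1/6 to 1/2, which is what it means for the two events to depend on each other.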
 
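### Partition theorem / law of total probability:
If B1, ..., Bn partition the sample space (mutually exclusive, and together they cover every outcome):
P(A) = sum( P(A and Bi) ) = sum( P(A | Bi) * P(Bi) )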
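
A quick numerical check of this identity, again by counting dice outcomes. The event A ("total is 7") and the partition B1..B6 ("first die shows i") are assumptions picked for the example, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioned = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in conditioned if event(o)), len(conditioned))

def a(o):
    # A (hypothetical event): the total is 7
    return o[0] + o[1] == 7

def first_die_is(i):
    # B_i: the first die shows i; B_1..B_6 partition the sample space
    return lambda o: o[0] == i

# sum over the partition of P(A | Bi) * P(Bi)
total = sum(prob(a, given=first_die_is(i)) * prob(first_die_is(i))
            for i in range(1, 7))

print(total, prob(a))  # 1/6 1/6
```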

### Bayes theorem
Bayes' theorem can be written as follows:
Given a hypothesis, H, and an event, E, with P(E) > 0, the probability of H given E is: P(H | E) = ( P(E | H) * P(H) ) / P(E)
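
The formula translates directly into a small helper. The function name `bayes` and the card example (H = "the card is a king", E = "the card is a face card") are illustrative assumptions.

```python
from fractions import Fraction

def bayes(p_e_given_h, p_h, p_e):
    """Return P(H | E) = P(E | H) * P(H) / P(E)."""
    if p_e == 0:
        raise ValueError("P(E) must be positive")
    return p_e_given_h * p_h / p_e

# Hypothetical example with one card drawn from a full deck:
# H = "card is a king", E = "card is a face card (jack, queen, or king)".
p_king_given_face = bayes(
    p_e_given_h=Fraction(1),   # every king is a face card
    p_h=Fraction(4, 52),       # 4 kings in the deck
    p_e=Fraction(12, 52),      # 12 face cards in the deck
)
print(p_king_given_face)  # 1/3
```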
 

Question from http://www.ams.sunysb.edu/~jsbm/courses/311/conditioning.pdf

Two cards from an ordinary deck of 52 cards are missing.
What is the probability that a random card drawn from this deck is a spade?

There are 3 possibilities for the missing cards: exactly one is a spade, both are spades,
or neither is a spade. These cases partition the sample space, so the law of total probability applies.

In [4]:
import random

def factorial(n):
    """Return n! for a non-negative integer n."""
    if n > 1:
        return n * factorial(n - 1)
    return 1  # base case: 0! = 1! = 1

def C(n, r):
    """Number of ways to choose r items from a set of size n."""
    return factorial(n) // (factorial(r) * factorial(n - r))

two_cards_selected = C(52, 2)

both_spades = C(13, 2)
prob_both = both_spades / two_cards_selected

neither_spades = C(39, 2)
prob_neither = neither_spades / two_cards_selected

one_spade_out_of_two = C(13, 1) * C(39, 1)
prob_one = one_spade_out_of_two / two_cards_selected

# weight each case by the chance of then drawing a spade from the remaining 50 cards
probability_spade = (prob_both * 11 / 50) + (prob_neither * 13 / 50) + (prob_one * 12 / 50)
probability_spade

0.25

### Above uses the law of total probability, the denominator in Bayes' formula:
Often, for a given partition of S into sets F1, ..., Fn,
we want to know the probability that some particular case,
Fj, occurred, given that some event E occurred.
P(Fj | E) = P(Fj and E) / P(E)

Using the multiplication rule, rewrite the numerator:
P(Fj and E) = P(E | Fj) * P(Fj)

Using the law of total probability, rewrite the denominator:
P(E) = sum(P(E | Fi) * P(Fi)) for i = 1, ..., n
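
To make the formula concrete for the missing-cards question, here is a minimal sketch that takes E = "the drawn card is a spade" and F0, F1, F2 = "0, 1, or 2 of the missing cards are spades". The dictionary names are my own, and `math.comb` assumes Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

pairs = comb(52, 2)  # ways to choose the two missing cards

# P(Fk): k of the two missing cards are spades
prior = {
    0: Fraction(comb(39, 2), pairs),
    1: Fraction(13 * 39, pairs),
    2: Fraction(comb(13, 2), pairs),
}

# P(E | Fk): 13 - k spades remain among the 50 cards left
likelihood = {k: Fraction(13 - k, 50) for k in prior}

# Denominator: law of total probability
p_e = sum(likelihood[k] * prior[k] for k in prior)

# Bayes' formula for each case
posterior = {k: likelihood[k] * prior[k] / p_e for k in prior}

print(p_e)  # 1/4
for k, p in posterior.items():
    print(k, p)
# 0 247/425
# 1 156/425
# 2 22/425
```

So after seeing a spade drawn, the case where neither missing card is a spade becomes slightly more likely (247/425 is about 0.58, up from a prior of 19/34, about 0.56).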

This value could also have been computed directly as 13/52.
The reason is symmetry: we don't know which two cards are missing, so each of the original 52 cards is equally likely to be the one drawn, and 13 of them are spades.
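
One way to convince yourself of this is to repeat the case-by-case sum for any number of missing cards, not just two. The function `p_spade` below is a hypothetical helper written for this check; it uses exact fractions and `math.comb` (Python 3.8+).

```python
from fractions import Fraction
from math import comb

def p_spade(missing):
    """P(drawn card is a spade) when `missing` unknown cards were lost."""
    total = Fraction(0)
    for k in range(min(13, missing) + 1):
        # P(exactly k of the missing cards are spades)
        p_k = Fraction(comb(13, k) * comb(39, missing - k), comb(52, missing))
        # P(spade | k spades missing): 13 - k spades among the remaining cards
        total += p_k * Fraction(13 - k, 52 - missing)
    return total

# The answer is 1/4 no matter how many cards are missing (0 to 51).
assert all(p_spade(m) == Fraction(1, 4) for m in range(52))
print(p_spade(2))  # 1/4
```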

### Testing out the probability with data:
- First I'll create the deck class
- Then create a sample of shuffled decks, remove two random cards from each, and draw one card
- Keep track of how many of the drawn cards are actually spades

In [5]:
class Deck:
    def __init__(self):
        suits = ['Club', 'Spade', 'Diamond', 'Heart']
        ranks = ['A'] + list(range(2, 11)) + ['Jack', 'Queen', 'King']
        self.deck = [[rank, suit] for rank in ranks for suit in suits]
    
    def __getitem__(self, i):
        return self.deck[i]
    
    def __setitem__(self, i, val):
        self.deck[i] = val
        
    def __repr__(self):
        return str(self.deck)
    
    def __len__(self):
        return len(self.deck)
    
    def __delitem__(self, idx):
        del self.deck[idx]
        
    def shuffle(self):
        random.shuffle(self.deck)
        
    def draw(self):
        # remove and return the last card in the deck
        return self.deck.pop()
        
d = Deck()
d.shuffle()
d

[[4, 'Spade'], [5, 'Heart'], ['Jack', 'Club'], [4, 'Heart'], [5, 'Club'], [10, 'Diamond'], [9, 'Spade'], ['Jack', 'Heart'], [2, 'Spade'], ['A', 'Spade'], [8, 'Heart'], [10, 'Club'], [2, 'Diamond'], [8, 'Spade'], [4, 'Club'], ['Queen', 'Diamond'], [2, 'Club'], [6, 'Club'], ['Queen', 'Club'], [7, 'Heart'], [9, 'Diamond'], [6, 'Diamond'], [3, 'Club'], [3, 'Diamond'], [9, 'Club'], [6, 'Spade'], [7, 'Diamond'], ['King', 'Diamond'], [3, 'Spade'], ['Jack', 'Diamond'], ['A', 'Diamond'], ['King', 'Spade'], ['A', 'Club'], ['A', 'Heart'], ['King', 'Club'], [10, 'Heart'], [7, 'Club'], [9, 'Heart'], [8, 'Diamond'], [8, 'Club'], ['Jack', 'Spade'], ['King', 'Heart'], [6, 'Heart'], [7, 'Spade'], ['Queen', 'Spade'], [4, 'Diamond'], [2, 'Heart'], [3, 'Heart'], ['Queen', 'Heart'], [10, 'Spade'], [5, 'Diamond'], [5, 'Spade']]

In [10]:
decks = [Deck() for _ in range(1000)]
spades = 0
for deck in decks:
    deck.shuffle()
    # lose two cards at random positions
    for _ in range(2):
        del deck[random.randrange(len(deck))]
    # draw from what is left and count spades
    if deck.draw()[1] == 'Spade':
        spades += 1
print(spades / len(decks))

0.248


As we can see, the simulated result (0.248) is very close to the mathematical probability (0.25).
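
To put a rough number on "close", one can compare the gap with the standard error of a proportion estimated from 1000 independent trials. This is a back-of-the-envelope sketch that assumes each deck is an independent trial with success probability 0.25; the observed value 0.248 is the output shown above.

```python
import math

p = 0.25          # theoretical probability of drawing a spade
n = 1000          # number of simulated decks
observed = 0.248  # proportion of spades in the run above

# Standard error of a sample proportion: sqrt(p * (1 - p) / n)
std_error = math.sqrt(p * (1 - p) / n)

# How many standard errors the observed value is from the theoretical one
z = (observed - p) / std_error

print(round(std_error, 4))  # 0.0137
print(round(z, 2))          # -0.15
```

A gap of about 0.15 standard errors is well within ordinary sampling noise, so a different run could just as easily land a little above 0.25.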

## Bayes Theorem Notes

Bayes' theorem can be written as follows:

Given a hypothesis, H, and an event, E, with P(E) > 0,
the probability of H given E is:
P(H | E) = ( P(E | H) * P(H) ) / P(E)

1) Consider a test to detect a disease that 0.1% of the population has.
The test is 99% effective in detecting an infected person.
However, the test gives a false positive result in 0.5% of cases where the person is not infected.
If a person tests positive for the disease, what is the probability that they actually have it?

In [12]:
# probability of having the disease
P_disease = .001
# probability of positive result given the person is infected
P_posi_infected = .99
# probability of a false positive
P_posi_not_infected = .005 

# need to find the probability of a positive test occurring to begin with:
# use law of total probability
# P(test positive) = P(test positive | have disease) * P(have disease)
# + P(test positive | not having disease) * P(not having disease)
# = .99 * .001 + .005 * (1 - .001)
P_posi = P_posi_infected * P_disease + P_posi_not_infected * (1 - P_disease)

# If a person tests positive for the disease what is the probability
# that they actually have it? i.e. P(infected | positive test)
# P(infected | positive test) =
# (P(positive test | infected) * P(infected) ) / P(positive test)
P_actually_have = (P_posi_infected * P_disease) / P_posi
P_actually_have

0.16541353383458646
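
The result can feel surprisingly low, so here is the same calculation redone with whole-number counts. The population of 1,000,000 is an arbitrary assumption chosen so that every count is an integer; the rates are the ones given in the question.

```python
population = 1_000_000  # hypothetical population size

infected = population * 1 // 1000         # 0.1% have the disease
healthy = population - infected

true_positives = infected * 99 // 100     # 99% of infected people test positive
false_positives = healthy * 5 // 1000     # 0.5% of healthy people test positive

# Among everyone who tests positive, the share who are actually infected
p_infected_given_positive = true_positives / (true_positives + false_positives)

print(true_positives, false_positives)      # 990 4995
print(round(p_infected_given_positive, 4))  # 0.1654
```

Because the disease is rare, the 4995 false positives from the large healthy group outnumber the 990 true positives by about five to one, which is why a positive test only implies roughly a 16.5% chance of infection.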

In [None]:
# to be continued...