In [6]:
from scipy.stats import binom, beta

**Probability**
- how likely do we believe an event is to happen?
- e.g. 90% sure
- probability of event X happening: P(X)
- in the 90% case: P(X) = 0.9

**Likelihood**
- measures the frequency with which an event has happened in the past
- in ML, likelihood (the past) is often used to predict probability (the future)

**Odds**
- alternate way of expressing probability, e.g. odds of 2 (2:1) or 7/3 (7:3)
- formulated as O(X)
- P(X) = O(X)/(1+O(X))
- Odds play a role in Bayesian statistics as well as logistic regression
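
The conversion above can be sketched in a few lines of Python. The function names are made up for illustration, and the inverse formula O(X) = P(X)/(1-P(X)) is not in the notes; it follows from rearranging P(X) = O(X)/(1+O(X)).

```python
from fractions import Fraction

def odds_to_probability(odds):
    # P(X) = O(X) / (1 + O(X))
    return odds / (1 + odds)

def probability_to_odds(p):
    # inverse of the formula above: O(X) = P(X) / (1 - P(X))
    return p / (1 - p)

# odds of 7/3 correspond to a probability of 7/10
p_from_odds = odds_to_probability(Fraction(7, 3))
# converting back recovers the original odds
odds_from_p = probability_to_odds(p_from_odds)
print(p_from_odds, odds_from_p)
```

Using `Fraction` keeps the arithmetic exact; plain floats work the same way.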

**Probability vs Statistics**
- Probability calculates the probability of an outcome
- statistics uses data to estimate it

**Calculating Probabilities**
- count all possible (equally likely) outcomes, e.g. 12
- probability of 1 specific outcome happening: 1/12

**Marginal Probability**
- P(X) for exactly one outcome

**Joint Probability**
- probability of two independent events occurring together
- P(A AND B) = P(A) * P(B)

**Union Probability**
- probability of getting event A or B
  
- *mutually exclusive events*: only one of the events can happen, e.g. rolling a die. You can't get a 4 and a 6 with one die
- P(A OR B) = P(A) + P(B) = 1/6 + 1/6 = 2/6 = 1/3

- *non mutually exclusive events*: the events can happen together, e.g. rolling a die and tossing a coin. What is P(heads OR 6)? Several outcomes satisfy this, and "heads and 6" would be counted twice
- in this case you can add the probabilities, but you have to subtract the joint probability
- P(A OR B) = P(A) + P(B) - P(A AND B)
- for independent events: P(A OR B) = P(A) + P(B) - P(A) * P(B)
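
As a sanity check, the die-and-coin case can be verified by enumerating all 12 outcomes. This is a minimal sketch assuming a fair coin and a fair six-sided die; the helper `probability` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

# all 12 equally likely (coin, die) outcomes
outcomes = list(product(["heads", "tails"], range(1, 7)))

def probability(event):
    # fraction of outcomes for which the event holds
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

p_heads = probability(lambda o: o[0] == "heads")
p_six = probability(lambda o: o[1] == 6)
p_both = probability(lambda o: o[0] == "heads" and o[1] == 6)
p_either = probability(lambda o: o[0] == "heads" or o[1] == 6)

# the union formula agrees with the direct count
print(p_either, p_heads + p_six - p_both)
```

Both expressions give 7/12: adding 1/2 and 1/6 counts the "heads and 6" outcome twice, so 1/12 is subtracted once.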


**Conditional Probability**
- P(A) given event B has occurred
- P(A GIVEN B) or P(A|B)
- e.g. 85% of cancer patients drink coffee: P(coffee|cancer) = 0.85
- directionality matters!
- P(coffee|cancer) says little on its own; P(cancer|coffee) is the quantity of interest
- so what is P(cancer|coffee)?

**Bayes Theorem**
- Bayes Theorem can be used to flip conditional probabilities
- P(A|B) = (P(B|A)*P(A))/P(B)
- P(cancer|coffee) = 0.0065
- Bayes Theorem can also be used to chain multiple conditional probabilities together to keep updating the belief based on new information
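
The flip can be worked through numerically. P(coffee|cancer) = 0.85 comes from the notes; P(cancer) = 0.005 and P(coffee) = 0.65 are assumptions, not stated above, chosen because they reproduce the 0.0065 result.

```python
def bayes(p_b_given_a, p_a, p_b):
    # P(A|B) = P(B|A) * P(A) / P(B)
    return p_b_given_a * p_a / p_b

p_coffee_given_cancer = 0.85  # from the notes
p_cancer = 0.005              # assumed base rate of cancer
p_coffee = 0.65               # assumed share of coffee drinkers

p_cancer_given_coffee = bayes(p_coffee_given_cancer, p_cancer, p_coffee)
print(round(p_cancer_given_coffee, 4))
```

Under these assumptions, P(cancer|coffee) is far smaller than P(coffee|cancer), which is why the direction of the condition matters.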

**Joint conditional Probability**
- P(coffee AND cancer), not to be confused with P(coffee|cancer)
- P(coffee AND cancer) = P(coffee|cancer) * P(cancer)
- i.e. the probability of having cancer times the probability of drinking coffee given that someone has cancer
- in general: P(A AND B) = P(B) * P(A|B)

**Union conditional probability**
- P(A OR B) = P(A) + P(B) - P(A|B) * P(B)
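
A short numeric sketch of the joint and union formulas for dependent events. Only P(coffee|cancer) = 0.85 is taken from the notes; P(cancer) = 0.005 and P(coffee) = 0.65 are assumed values for illustration.

```python
p_coffee_given_cancer = 0.85  # from the notes
p_cancer = 0.005              # assumed
p_coffee = 0.65               # assumed

# P(A AND B) = P(A|B) * P(B)
p_coffee_and_cancer = p_coffee_given_cancer * p_cancer

# P(A OR B) = P(A) + P(B) - P(A|B) * P(B)
p_coffee_or_cancer = p_coffee + p_cancer - p_coffee_and_cancer

print(p_coffee_and_cancer, p_coffee_or_cancer)
```

Multiplying P(coffee) * P(cancer) directly would only be valid if the two events were independent, which is why the conditional probability is used for the joint term.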

**Binomial Distribution**
-  how likely k successes are in n trials, given a success probability p
-  e.g. a jet test is 90% likely to be successful. The binomial distribution gives the probability of, say, 8 successes out of 10 trials.
-  the binomial distribution takes the underlying success rate as given

In [5]:
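n = 10   # number of trials
p = 0.9  # probability of success per trial

# probability of exactly k successes, for every possible k
for k in range(n+1):
    probability = binom.pmf(k,n,p)
    print(k,"-",probability)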

0 - 9.999999999999981e-11
1 - 8.999999999999986e-09
2 - 3.644999999999997e-07
3 - 8.747999999999995e-06
4 - 0.00013778099999999982
5 - 0.0014880347999999988
6 - 0.011160261000000001
7 - 0.05739562799999998
8 - 0.1937102445
9 - 0.387420489
10 - 0.3486784401000001
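
The values printed above can be reproduced without scipy from the standard binomial formula P(k) = C(n, k) * p^k * (1-p)^(n-k). This is a minimal sketch; `binomial_pmf` is a hypothetical helper name, not part of scipy.

```python
from math import comb

def binomial_pmf(k, n, p):
    # probability of exactly k successes in n independent trials
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.9
pmf_values = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(pmf_values[8])    # probability of exactly 8 successes
print(sum(pmf_values))  # all cases together should sum to (about) 1
```

Summing over every k from 0 to n covers all possible outcomes, so the total should come out at 1 up to floating-point rounding.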


**Beta Distribution**
- given *alpha* successes and *beta* failures, the beta distribution shows us the likelihood of different underlying probabilities of success (or of any other event in question)
- the beta distribution is a *probability distribution*, meaning the entire area under the curve is 1 or 100%
- so we define a range of underlying probabilities and calculate the area under the curve over that range

**Cumulative Distribution Function**
- every continuous probability distribution has a CDF which calculates area up to a given x-value
- Example using CDF on Beta Distribution described above:

In [7]:
a = 8 # successes
b = 2 # failures

p = beta.cdf(.90, a, b)
print(p)

0.7748409780000002


-> *this means that, given 8 successes and 2 failures, the likelihood of the underlying probability of success being less than 90% is 77.48%*

In [8]:
a = 8 # successes
b = 2 # failures

p = 1 - beta.cdf(.90, a, b)
print(p)

0.22515902199999982


-> *this means that, given 8 successes and 2 failures, the likelihood of the underlying probability of success being over 90% is 22.52%*

**Exercises**

*ex 4*
- 137 booked passengers
- P(NO SHOW) = 0.4
- P(at least 50 NO SHOWS)?

- binom: treating a no-show as a "success" with P = 0.4, how likely is it that 50 or more of the 137 passengers do not show up?
- sum the likelihood of each case where k >= 50
- beta, by contrast: assuming a successes and b failures, what is the likelihood of the underlying probability being in a certain range

In [18]:
n = 137
p = 0.4
p_at_least_50_noshow = 0.0

# k = 50 ... 137 covers "at least 50" no-shows
for k in range(50,138):
    p_at_least_50_noshow += binom.pmf(k,n,p)

print(p_at_least_50_noshow)

0.8220955881474249
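
The same tail probability can be computed without an explicit loop. A sketch using scipy's survival function `binom.sf`, which returns P(X > k); "at least 50" therefore corresponds to k = 49.

```python
from scipy.stats import binom

n = 137  # booked passengers
p = 0.4  # probability that a passenger is a no-show

# sf(49) = P(X > 49) = P(X >= 50)
p_at_least_50_noshow = binom.sf(49, n, p)
print(p_at_least_50_noshow)
```

The off-by-one is the part to watch: `binom.sf(50, n, p)` would give the probability of more than 50 no-shows, not at least 50.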


*ex 5*
- 19 coin flips
- 15 heads
- 4 tails
- does the coin have a good probability of being fair? i.e. how high is the likelihood that the underlying probability of heads is between 40% and 60%?

In [20]:
a = 15 # heads
b = 4 # tails

p = beta.cdf(.60, a, b) - beta.cdf(.40, a, b)
print(p)

0.03256646286049279
