## Probability

**Probability** is how strongly we believe an event will happen, often expressed as a percentage, as in "There is a 70% chance my flight will be late." We will call this probability P(X).

**Likelihood** measures the frequency of events that have already occurred.

In statistics and machine learning, we often use likelihood (the past) in the form of data to predict probability (the future).

If $$P(X) = .70$$, then the probability that X does not happen is:
$$P(not \: X) = 1 - P(X) = .30$$



Another distinction between probability and likelihood is that probability is a value between 0 and 1, while likelihood can be any positive number.

Alternatively, probability can be expressed as odds O(X), such as 7:3, 7/3, or 2.333.

To turn odds O(X) into a proportional probability P(X), use the following formula:

$$P(X) = \frac{O(X)}{1 + O(X)}$$

For example, with O(X) = 7:3 = 7/3 = 2.333, converting to a proportional probability gives:
$$ P(X) = \frac{\frac{7}{3}}{1 + \frac{7}{3}}$$
$$ P(X) = .7$$

Conversely, you can turn a probability into odds by dividing the probability of the event occurring by the probability it will not occur:

$$O(X) = \frac{P(X)}{1 - P(X)}$$
$$O(X) = \frac{.70}{1 - .70}$$
$$O(X) = \frac{7}{3}$$
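The two conversion formulas above can be sketched as a pair of small helper functions. The function names `odds_to_probability` and `probability_to_odds` are illustrative choices, not part of any library.

```python
def odds_to_probability(odds):
    """Convert odds O(X) to a probability: P(X) = O(X) / (1 + O(X))."""
    return odds / (1 + odds)


def probability_to_odds(p):
    """Convert a probability to odds: O(X) = P(X) / (1 - P(X))."""
    if p >= 1:
        # a certain event has no finite odds
        raise ValueError("odds are undefined when P(X) = 1")
    return p / (1 - p)


p_from_odds = odds_to_probability(7 / 3)
odds_from_p = probability_to_odds(0.70)

print(round(p_from_odds, 4))  # 0.7
print(round(odds_from_p, 4))  # 2.3333
```

Converting odds to a probability and back should return the value you started with, up to floating point rounding.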

#### ODDS ARE USEFUL!
If I have odds of 2.0, that means I feel an event is two times more likely to happen than not to happen. That can be a more intuitive way to describe a belief than a percentage of 66.67%.

### Joint Probabilities

Joint probability is the probability of two or more events occurring simultaneously. When the events are independent (one does not affect the other), we can calculate the joint probability of two events by multiplying the probability of one event by the probability of the other.

Let's say we have a coin and a six-sided die and want to calculate the probability of flipping heads and rolling a 6.

Let A be the event of flipping heads and B be the event of rolling a 6.

$$P(A) = \frac{1}{2}$$
$$P(B) = \frac{1}{6}$$
$$P(A \cap B) = P(A) * P(B) = \frac{1}{2} * \frac{1}{6} = \frac{1}{12} = .08333$$
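One way to check the product rule is to list every coin and die combination and count the ones that match. The sketch below assumes a fair coin and a fair die, so all 12 combinations are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

coin = ["heads", "tails"]
die = [1, 2, 3, 4, 5, 6]

# every (coin, die) pair; 2 * 6 = 12 equally likely outcomes
outcomes = list(product(coin, die))

# outcomes where we flip heads AND roll a 6
favorable = [o for o in outcomes if o == ("heads", 6)]

p_joint_counted = Fraction(len(favorable), len(outcomes))
p_joint_product = Fraction(1, 2) * Fraction(1, 6)

print(p_joint_counted)  # 1/12
print(p_joint_product)  # 1/12
```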

### Union Probabilities

Union probability is the probability of one event or another event occurring. We can calculate the union probability of two events by adding the probability of one event to the probability of the other event and subtracting the joint probability of both events.


Let's start with <i>mutually exclusive</i> events. If I roll one die, I can't simultaneously get a 4 and a 6, so the joint probability is zero and the union probability is easy:

$$P(4 \:OR\: 6) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3} = .3333$$

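For events that are not <i>mutually exclusive</i>, we need to subtract the joint probability of both events so the overlap is not counted twice. For example, a coin flip can land heads while a die roll also lands on 6. For independent events like these, the joint probability is the product of the two probabilities, so the union probability is: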

$$P(A \: OR \: B) = P(A) + P(B) - P(A \cap B)$$
$$P(A \:OR \: B) = P(A) + P(B) - P(A) \: * \: P(B)$$

With P(heads) = .5 and P(6) = .1666, the probability of flipping heads or rolling a 6 is:
$$ P(heads \: OR\: 6) = .5 + .1666 - (.5 * .1666) = .5833$$
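The subtraction in the union formula can be checked by counting outcomes directly. This is a minimal sketch, again assuming a fair coin and a fair die; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# all 12 equally likely (coin, die) outcomes
outcomes = list(product(["heads", "tails"], range(1, 7)))

# outcomes where the coin is heads OR the die shows 6 (or both)
heads_or_six = [(c, d) for c, d in outcomes if c == "heads" or d == 6]
p_union_counted = Fraction(len(heads_or_six), len(outcomes))

# same result from P(A) + P(B) - P(A) * P(B)
p_heads = Fraction(1, 2)
p_six = Fraction(1, 6)
p_union_formula = p_heads + p_six - p_heads * p_six

print(p_union_counted)  # 7/12
print(p_union_formula)  # 7/12
```

Without the subtraction, the outcome ("heads", 6) would be counted once for heads and once for 6, giving 8/12 instead of 7/12.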
<br><br>



### Conditional Probability and Bayes' Theorem

**Conditional Probability**

Conditional probability is the probability of event A occurring given that event B has occurred. It is typically expressed as $$P(A | B)$$, read as "the probability of A given B".
<br><br>

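**Bayes' Theorem**

Bayes' Theorem lets us flip a conditional probability: given P(B|A), P(A), and P(B), we can find P(A|B). Suppose we know the following:

P(Coffee) = .65<br>
P(Cancer) = .005<br>
P(Coffee|Cancer) = .85<br>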

$$P(A|B) = \frac{P(B|A)*P(A)}{P(B)}$$
$$P(Cancer|Coffee) = \frac{P(Coffee|Cancer) * P(Cancer)}{P(Coffee)}$$
$$P(Cancer|Coffee) = \frac{.85 * .005}{.65} = .0065$$



In [33]:
p_coffee_drinker = 0.65
p_cancer = .005
p_coffee_drinker_given_cancer = .85

p_cancer_given_coffee_drinker = (p_coffee_drinker_given_cancer * p_cancer) / p_coffee_drinker
print(p_cancer_given_coffee_drinker)

0.006538461538461539
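The same answer can be reached by counting people rather than multiplying probabilities, which some readers find more intuitive. The sketch below assumes a hypothetical population of 100,000; only the three probabilities come from the example above.

```python
population = 100_000  # hypothetical population size, chosen for round numbers

# head counts implied by the three known probabilities
cancer = round(population * 0.005)        # people with cancer
coffee = round(population * 0.65)         # coffee drinkers
coffee_and_cancer = round(cancer * 0.85)  # cancer patients who drink coffee

# of all coffee drinkers, what fraction has cancer?
p_cancer_given_coffee = coffee_and_cancer / coffee

print(cancer, coffee, coffee_and_cancer)  # 500 65000 425
print(round(p_cancer_given_coffee, 4))    # 0.0065
```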


### Joint and Union Conditional Probabilities


**Joint Probability**

I want to find the probability somebody is a coffee drinker AND they have cancer.
$$P(Coffee \: and \: Cancer)$$

**Option 1**: If we don't have any conditional probability available, the best we can do is assume the two events are independent:

$$P(Coffee) * P(Cancer) = .65 * .005 = .00325$$

**Option 2**: If conditional probability is available, what we can do is:

$$P(Coffee | Cancer) * P(Cancer) = .85 * .005 = .00425$$

or we can do it the other way, using the P(Cancer | Coffee) we calculated with Bayes' Theorem:
$$P(Cancer | Coffee) * P(Coffee) = .006538 * .65 = .00425$$

so our final formula for joint probability using conditional probability is:
$$P(A \cap B) = P(A|B) * P(B)$$
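As a quick check, the sketch below computes the joint probability in both directions and compares it with the independence estimate from Option 1. The variable names are illustrative; the inputs are the three probabilities from the coffee and cancer example.

```python
p_coffee = 0.65
p_cancer = 0.005
p_coffee_given_cancer = 0.85

# Bayes' Theorem gives the conditional probability in the other direction
p_cancer_given_coffee = p_coffee_given_cancer * p_cancer / p_coffee

# P(A and B) = P(A|B) * P(B), computed both ways
joint_via_cancer = p_coffee_given_cancer * p_cancer
joint_via_coffee = p_cancer_given_coffee * p_coffee

# Option 1: treat the events as independent
joint_if_independent = p_coffee * p_cancer

print(round(joint_via_cancer, 5))      # 0.00425
print(round(joint_via_coffee, 5))      # 0.00425
print(round(joint_if_independent, 5))  # 0.00325
```

Both conditional routes agree, while the independence assumption gives a lower value because it ignores that cancer patients in this example drink coffee more often than the general population.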


**Union Probability**

$$P(A \: OR \: B) = P(A) + P(B) - P(A|B) * P(B)$$ 
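Applied to the coffee and cancer numbers, the union formula might look like the following sketch, with A as "drinks coffee" and B as "has cancer". The variable names are illustrative.

```python
p_coffee = 0.65
p_cancer = 0.005
p_coffee_given_cancer = 0.85

# joint probability from the conditional: P(A|B) * P(B)
p_coffee_and_cancer = p_coffee_given_cancer * p_cancer

# P(A or B) = P(A) + P(B) - P(A|B) * P(B)
p_coffee_or_cancer = p_coffee + p_cancer - p_coffee_and_cancer

print(round(p_coffee_or_cancer, 5))  # 0.65075
```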

### Binomial Distribution

The binomial distribution is a discrete probability distribution that describes how likely it is to get a given number of successes in an experiment that is repeated multiple times, where each trial ends in either success or failure. The binomial distribution assumes the result of one trial is not affected by the outcome of another trial. The Bernoulli distribution is the special case of the binomial distribution with a single trial (n = 1).

The binomial distribution formula is:
$$P(X = k) = C(n, k) * p^k * (1-p)^{n-k}$$
$$P(X = k) = \binom{n}{k} * p^k * (1-p)^{n-k}$$
$$P(X = k) = \frac{n!}{k!(n-k)!} * p^k * (1-p)^{n-k}$$

where:
- n = number of trials
- k = number of successes
- p = probability of success in one trial

In [34]:
from math import factorial


def binom_distr(k, n, p):
  return (factorial(n) / (factorial(k) * factorial(n - k))) * (p ** k) * ((1 - p) ** (n - k))

binom_distr(9, 10, 0.9)

0.38742048900000003

In [35]:
# same calculation with a hand-written factorial instead of math.factorial
def factorial(n):
  fact = 1
  for i in range(1, n + 1):
    fact *= i
  return fact

def binom_distr(k, n, p):
  return (factorial(n) / (factorial(k) * factorial(n - k))) * (p ** k) * ((1 -p)**(n - k))

binom_distr(9, 10, 0.9)

0.38742048900000003

In [36]:
from scipy.stats import binom

n = 10
p = 0.9

total_probability = 0
for k in range(n + 1):
  probability = binom.pmf(k, n, p)
  total_probability += probability
  # print("Total probability: {0}".format(total_probability))
  print("{0} - {1}".format(k, probability))
  # print('\n')

0 - 9.999999999999981e-11
1 - 8.999999999999986e-09
2 - 3.644999999999997e-07
3 - 8.747999999999995e-06
4 - 0.00013778099999999982
5 - 0.0014880347999999988
6 - 0.011160261000000001
7 - 0.05739562799999998
8 - 0.1937102445
9 - 0.387420489
10 - 0.3486784401000001
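Two properties of the table above are worth checking: the probabilities for every possible k add up to 1, and adding a range of rows gives the probability of "at least k" successes. The sketch below uses only the standard library (`math.comb` needs Python 3.8 or later); `binom_pmf` is an illustrative helper, not the SciPy function.

```python
from math import comb, isclose


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n = 10
p = 0.9

pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# all outcomes together must cover every possibility
total = sum(pmf)

# probability of at least 8 successes: add the rows for k = 8, 9, 10
p_at_least_8 = sum(pmf[8:])

print(isclose(total, 1.0))     # True
print(round(p_at_least_8, 4))  # 0.9298
```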


### Beta Distribution

The beta distribution allows us to see the likelihood of different underlying probabilities for an event to occur given alpha successes and beta failures.

In [37]:
from scipy.stats import beta

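a = 8 # number of successes
b = 2 # number of failures

# area to the left of 0.90: probability the underlying rate is 90% or less
p = beta.cdf(.90, a, b)
print(p)
# area to the right of 0.90: probability the underlying rate is above 90%
print('area in a beta distribution: {}'.format(1 - p))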

0.7748409780000002
area in a beta distribution: 0.22515902199999982
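The value returned by `beta.cdf` is an area under the beta density curve. To make that concrete, the sketch below approximates the same area with a simple midpoint sum using only the standard library. The helper names and the `steps` parameter are illustrative assumptions; in practice `scipy.stats.beta.cdf` is the better tool.

```python
from math import gamma


def beta_pdf(x, a, b):
    """Beta density at x for shape parameters a and b."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def beta_cdf_midpoint(x, a, b, steps=10_000):
    """Approximate the area under the density from 0 to x with thin rectangles."""
    width = x / steps
    heights = (beta_pdf((i + 0.5) * width, a, b) for i in range(steps))
    return sum(heights) * width


area_left = beta_cdf_midpoint(0.90, 8, 2)

print(round(area_left, 4))      # 0.7748
print(round(1 - area_left, 4))  # 0.2252
```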


In [38]:
a = 30
b = 6
p = 1.0 - beta.cdf(0.90, a, b)
print(p)

0.13163577484183697


In [39]:
# to find the area in the middle
a = 30
b = 6
p = beta.cdf(0.90, a, b) - beta.cdf(0.80, a, b)
print(p)

0.5962725311986752


### Exercises

1. There is a 30% chance of rain today, and a 40% chance your umbrella order will arrive on time.<br> You are eager to walk in the rain today and cannot do so unless both happen! What is the probability that you will be able to walk in the rain today?

In [40]:
# joint probability problem

# P(A and B) = P(A) * P(B)
p_rain = 0.30
p_umbrella = 0.40
p_walking = p_rain * p_umbrella
print(p_walking)

0.12


2. There is a 30% chance of rain today, and a 40% chance your umbrella will arrive on time.<br>
You will be able to run errands only if it does not rain or your umbrella arrives.<br>
What is the probability it will not rain OR your umbrella arrives?

In [41]:
# non mutually exclusive union probability
# P(A or B) = P(A) + P(B) - P(A and B)
p_rain = 0.30
p_umbrella = 0.40
p_no_rain = 1 - p_rain

# the overlap is "no rain AND umbrella arrives", so subtract p_no_rain * p_umbrella
p_not_rain_or_umbrella = (p_no_rain + p_umbrella) - (p_no_rain * p_umbrella)
print(round(p_not_rain_or_umbrella, 4))


0.82


3. There is a 30% chance of rain today, and 40% chance your umbrella order will arrive on time.<br>
However, you found out if it rains there is only a 20% chance your umbrella will arrive on time.<br>
What is the probability it will rain AND your umbrella will arrive on time?

In [42]:
# Joint probability using a conditional probability
# P(rain and umbrella) = P(umbrella | rain) * P(rain)

p_rain = 0.30
p_umbrella = 0.40
p_umbrella_if_it_rains = 0.20
p_rain_and_umbrella = (p_umbrella_if_it_rains * p_rain)
print(p_rain_and_umbrella)

0.06


4. You have 137 passengers booked on a flight from Las Vegas to Dallas.<br>However, it is Las Vegas on a Sunday morning and you estimate each passenger is 40% likely to not show up.<br>You are trying to figure out how many seats to overbook so the plane does not fly with empty seats.<br> How likely is it that at least 50 passengers will not show up?

In [49]:
# Binomial distribution
p = 0.40
n = 137
k = 50

total_probability = 0
for x in range(k, (n + 1)):
  probability = binom.pmf(x, n, p)
  total_probability += probability
print("Total probability: {0}".format(total_probability))

Total probability: 0.8220955881474249


5. You flipped a coin 19 times and got heads 15 times and tails 4 times. Do you think this coin has any good probability of being fair? Why or why not?

In [54]:
heads = 15 # number of successes
tail = 4 # number of failures

p = 1.0 -  beta.cdf(.50, heads, tail)
print(p)

# there is a 99.6% chance the underlying probability of heads is above 0.5,
# so the coin is unlikely to be fair

0.9962310791015625
