<a href="https://colab.research.google.com/github/LawtonAlyssa/Adafruit_BMP280_Library/blob/master/class_lectures/0126Lesson.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Course matters

* **Exam 1** scheduled for 2/10, but we used an extra day for Ch 2, which moves the exam to 2/14 (♥)
 * Might even be 2/16

* Q&A due Saturday
  * No 'insignificant' questions

# Probability

Topics:
* Motivation:
 * Too unlikely?

* RVs
 * Binomial

* Conditional probability
 * Monty Hall

* Bayes

Probability is the foundation of stats. For instance:

* If I ask 10 randomly selected HPU students if they love donuts and all 10 say 'yes', should I believe that 100% of HPU students love donuts? 99%? 90%?

* If a street performer flips 10 consecutive heads as part of an act, should I believe that the coin is fair?

* If 100 groups of 40 people are randomly selected, in how many of them should there be (at least) two people who share a birthday?



We'll use Python to compute most probabilities. 
* sometimes directly/theoretically
* sometimes via *simulation*
 * we already saw simulation in the 0112 lesson, where we drew 400 samples of 3 and of 30 individuals to see why sample variance has $n-1$ in the denominator.
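
As a taste of the simulation approach, here is a minimal sketch for the birthday question above. It assumes 365 equally likely birthdays (no leap years), and the fixed seed is only there to make the run repeatable; neither assumption comes from the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption) so the run is repeatable

groups, people = 100, 40
# Each row is one group; each entry is a birthday drawn uniformly from 365 days
# (assumes no leap years and equally likely birthdays).
birthdays = rng.integers(365, size=(groups, people))

# A group has a shared birthday when it contains fewer than 40 distinct days.
shared = sum(len(np.unique(row)) < people for row in birthdays)
print(shared, "of", groups, "groups have a shared birthday")
```

Dropping the seed and rerunning the cell shows how much the count varies from one set of 100 groups to the next.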

In [None]:
import numpy as np
from numpy.random import default_rng
rng = default_rng()

In [None]:
rfloat = rng.random()
print(rfloat)

0.22733602246716966


In [None]:
# import numpy as np
import scipy.stats

A **random variable (RV)** is a variable, $X$, whose value is generated randomly according to some specific probabilities.

Example. A **binomial RV** $X$ with parameters $n$ and $p$ takes the values $k=0,1,\ldots,n$ with probabilities
\begin{align*}
P(X=k) &= \binom{n}{k}p^{k}(1-p)^{n-k} \\
       &= \frac{n!}{k!(n-k)!}p^{k}(1-p)^{n-k}
\end{align*}

This models repeating a 2-outcome experiment $n$ times independently and recording the number of 'successes', each of which has probability $p$.
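
The formula can be checked against SciPy with a short sketch. The helper name `binomial_pmf` is made up for illustration, and $n=10$, $p=0.5$ are just sample values.

```python
from math import comb

import numpy as np
import scipy.stats


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial RV, computed straight from the formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5  # sample values chosen for illustration
ks = np.arange(n + 1)
by_formula = np.array([binomial_pmf(int(k), n, p) for k in ks])
by_scipy = scipy.stats.binom(n, p).pmf(ks)

print(np.allclose(by_formula, by_scipy))  # do the two computations agree?
print(by_formula.sum())                   # the probabilities should add up to 1
```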



E.g. a coin is flipped 10 times, what's the probability of

* exactly 7 H's
* at least 7 H's

Here, $n=10$, $p=0.5$. Note that this is the exact same question as

A family has 10 children, what's the probability of
* exactly 7 females
* at least 7 females.



Solution.

1. Theoretically/directly by hand:
* exactly 7:
$$
\frac{10!}{7!3!}0.5^{7}0.5^{3} = \frac{120}{1024}
$$

* at least 7: do the same for exactly 7, exactly 8, exactly 9, and exactly 10, then add the results.
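
Carrying that sum out as a check (every term shares the factor $0.5^{10} = 1/1024$):

$$
P(X\ge 7) = \frac{\binom{10}{7}+\binom{10}{8}+\binom{10}{9}+\binom{10}{10}}{1024} = \frac{120+45+10+1}{1024} = \frac{176}{1024} = 0.171875
$$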

2. Theoretically/directly via SciPy's binomial
* exactly 7

In [None]:
print(scipy.stats.binom(10,0.5).pmf(7))
print(120/1024)

0.11718750000000014
0.1171875


* at least 7:

In [None]:
total_prob = 0
for k in range(7,11):
  total_prob += scipy.stats.binom(10,0.5).pmf(k)
print(total_prob)

0.17187500000000014


Even better:

In [None]:
1-scipy.stats.binom(10,0.5).cdf(6)

0.171875

We used this to determine what `range` outputs:

In [None]:
list(range(7,11))

[7, 8, 9, 10]

A question was asked: what is the probability that a 6-sided die, rolled 10 times, yields at least seven 2's?

That's *still* binomial, with $n=10$, $p=1/6$.

In [None]:
1-scipy.stats.binom(10,1/6).cdf(6)

0.0002675214652237967


3. Simulation
* exactly 7:

In [None]:
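trials = 1000000
num_sevens = 0
for _ in range(trials):
  # flip 10 fair coins (0 = T, 1 = H) and count the heads
  if np.sum(rng.integers(2,size=10)) == 7:
    num_sevens += 1
# fraction of trials with exactly 7 heads
num_sevens/trials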


0.117389
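
The "at least 7" case can be simulated the same way. The sketch below is one possible vectorized variant, not the version run in class: it draws all flips in a single array, uses fewer trials to keep memory modest, and fixes a seed for repeatability (all assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption) for repeatability

trials = 200_000
# One row per trial: 10 fair coin flips coded as 0 (T) or 1 (H).
flips = rng.integers(2, size=(trials, 10))
heads = flips.sum(axis=1)  # number of heads in each trial

prob_exactly_7 = np.mean(heads == 7)
prob_at_least_7 = np.mean(heads >= 7)
print(prob_exactly_7, prob_at_least_7)
```

Both estimates should land close to the exact values 120/1024 and 0.171875 found above.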

## Cond. probability


Conditional probability is the probability of something happening GIVEN some extra information.

Example: Suppose I roll a fair die twice. What's the probability that the first roll was a 4 given that the sum of the two rolls was
* 6
* 7

Solution. Here's a table of all possible sums

|Roll 1 $\rightarrow$ |1|2|3|4|5|6
|-|-|-|-|-|-|-
|Roll 2 $\downarrow$||||
|1|2|3|4|5|6|7
|2|3|4|5|6|7|8
|3|4|5|6|7|8|9
|4|5|6|7|8|9|10
|5|6|7|8|9|10|11
|6|7|8|9|10|11|12

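* 6:
 * P(1st is 4|sum is 6) = 1/5, because knowing the sum is 6 rules out a 6 on either roll: only 5 outcomes remain (first roll 1 through 5), and exactly one of them has first roll 4.
* 7:
 * P(1st is 4|sum is 7) = 1/6, because nothing is ruled out: every first roll from 1 to 6 is still possible, so knowing that the sum is 7 doesn't change the probability that the 1st is 4.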

Note that the definition of **independence** is this: A is independent of B if being given A doesn't change the probability of B (or vice versa).

So above we concluded that "1st is 4" is not independent of "sum is 6", but "1st is 4" is independent of "sum is 7".
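
These conditional probabilities can also be checked by brute-force counting over the 36 equally likely outcomes. This is a minimal sketch; the helper names `prob` and `first_is_4` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) outcomes.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event, given=lambda o: True):
    """P(event | given), found by counting equally likely outcomes."""
    kept = [o for o in outcomes if given(o)]
    return Fraction(sum(event(o) for o in kept), len(kept))


def first_is_4(o):
    return o[0] == 4


p_given_6 = prob(first_is_4, lambda o: sum(o) == 6)
p_given_7 = prob(first_is_4, lambda o: sum(o) == 7)
p_plain = prob(first_is_4)
print(p_given_6, p_given_7, p_plain)  # prints: 1/5 1/6 1/6
```

Conditioning on "sum is 7" leaves the probability at its unconditional value of 1/6, which is exactly the independence statement; conditioning on "sum is 6" changes it.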

Monty Hall problem:
You're on a game show with three doors: two hide goats and the third hides a car. You pick a door, and then the host opens one of the other two doors to reveal a goat. Now you have the option to switch to the remaining unopened door. Should you?
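
One way to explore the question is by simulation. The sketch below assumes the host always opens a goat door other than yours and chooses at random when two such doors are available; the trial count and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption) for repeatability

trials = 20_000
stay_wins = 0
switch_wins = 0
for _ in range(trials):
    car = rng.integers(3)   # door hiding the car
    pick = rng.integers(3)  # contestant's first pick
    # Host opens a goat door that is neither the pick nor the car.
    openable = [d for d in range(3) if d != pick and d != car]
    opened = rng.choice(openable)
    # Switching means taking the one door that is neither picked nor opened.
    switched = next(d for d in range(3) if d != pick and d != opened)
    stay_wins += pick == car
    switch_wins += switched == car

print(stay_wins / trials, switch_wins / trials)
```

Under these assumptions, staying wins about 1/3 of the time and switching about 2/3, since switching wins exactly when the first pick was a goat.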



## Bayes

|Driver type|% of drivers|prob. of collision|
|---|--|--
|Teen|8|0.15
|Young adult| 16|0.08
|Midlife| 45| 0.04
|Senior| 31| 0.05

* a. If a driver is selected at random, what's the prob of an accident?
* b. If a driver is in an accident, what's the prob it was a teen?

![test](https://docs.google.com/drawings/d/e/2PACX-1vRX-0_CvFemeOa1z0uCAYY_irewvQjoYQB4gD_xfWnwrf3BNC3DH_oqXqoO_dbvl6cnKQc3OM4yXIl9/pub?w=480&h=360)
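
Parts a and b can be worked numerically with the law of total probability and Bayes' rule. This is a minimal sketch using the numbers from the table; the dictionary layout and variable names are illustrative choices.

```python
# Values copied from the driver table: (share of drivers, P(collision | type)).
drivers = {
    "Teen": (0.08, 0.15),
    "Young adult": (0.16, 0.08),
    "Midlife": (0.45, 0.04),
    "Senior": (0.31, 0.05),
}

# a. Law of total probability: sum over types of P(type) * P(collision | type).
p_collision = sum(share * rate for share, rate in drivers.values())

# b. Bayes' rule: P(Teen | collision) = P(Teen) * P(collision | Teen) / P(collision).
share_teen, rate_teen = drivers["Teen"]
p_teen_given_collision = share_teen * rate_teen / p_collision

print(round(p_collision, 4), round(p_teen_given_collision, 4))
```

Teens are 8% of drivers but account for roughly a fifth of collisions, because their collision probability is well above the overall rate.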