### 1. Recall in the example from Lesson 10, there was another player, Bob, who had two Jacks and was looking to get a four-of-a-kind. For Bob, the random variable of interest was the number of Jacks among the community cards. Use a box model to argue that Bob’s random variable also has a hypergeometric distribution. What are its parameters?

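- Box model: 48 cards remain in the deck (the 52-card deck minus the four cards already in the players' hands), of which 2 are Jacks. Label each Jack's ticket 1 and each of the other 46 tickets 0
- The 5 community cards are 5 draws from this box without replacement
- Let X be the random variable representing the number of Jacks in the community cards, i.e. the sum of the 5 draws

- $X \sim Hypergeometric(48, 2, 5)$, i.e. a population of 48 containing 2 Jacks, with 5 draws

- From the combinations perspective, for $x = 0, 1, 2$
$$
    P(X = x) = \frac{\binom{2}{x} \binom{46}{5-x}}{\binom{48}{5}}
$$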
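
As a quick sanity check on the formula above, here is a minimal sketch that evaluates the PMF with the standard library only. The helper name `jack_pmf` and its keyword arguments are illustrative, not part of the lesson.

```python
from math import comb

def jack_pmf(x, total=48, jacks=2, draws=5):
    """P(X = x) for the number of Jacks among the community cards."""
    return comb(jacks, x) * comb(total - jacks, draws - x) / comb(total, draws)

# X can only be 0, 1, or 2 because only 2 Jacks remain in the deck
pmf = {x: jack_pmf(x) for x in range(3)}
for x, p in pmf.items():
    print(f"P(X = {x}) = {p:.4f}")
```

The three probabilities should sum to 1, which is a cheap way to confirm that the parameters were entered in the right order.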

### 2. In order to ensure safety, a random sample of cars on each production line are crash-tested before being released to the public. The process of crash testing destroys the car. Suppose that a production line contains 10 defective and 190 working cars. If 4 of these cars are chosen at random for crash-testing, what is:
- the probability that at least 1 car will be found defective?
- the probability that exactly 2 cars will be found defective?

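- Let X be the number of defective cars in the sample of 4
- X ~ Hypergeometric(200, 10, 4): 200 cars in total, 10 defective, 4 drawn without replacement

- the probability that at least 1 car will be found defective? --> $P(X \geq 1) = 1 - f(0) \approx 18.7\%$
- the probability that exactly 2 cars will be found defective? --> $f(2) \approx 1.2\%$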
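
For reference, both answers can be written out from the hypergeometric PMF. This is a worked sketch with rounded values, using the same $\binom{\cdot}{\cdot}$ notation as Problem 1:

$$\begin{align}
    P(X \geq 1) &= 1 - P(X = 0) = 1 - \frac{\binom{10}{0} \binom{190}{4}}{\binom{200}{4}} \approx 0.187 \\
    P(X = 2) &= \frac{\binom{10}{2} \binom{190}{2}}{\binom{200}{4}} = \frac{45 \cdot 17955}{64684950} \approx 0.0125
\end{align}$$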

In [4]:
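from scipy import stats
import numpy as np
# scipy's argument order: hypergeom(population size, successes in population, sample size)
X = stats.hypergeom(200, 10, 4)
1 - X.pmf(0)           # P(X >= 1); not echoed, since a cell only displays its last expression
X.cdf(2) - X.cdf(1)    # P(X = 2)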

0.012490927178578581

### 3. The state proposes a lottery in which you select 6 numbers from 1 to 15. When it is time to draw, the lottery selects 8 different numbers, and you win if at least 4 of the 6 numbers you picked are among the 8 numbers that the lottery drew. What is the probability you win the prize?



- There are 15 numbers
    - 8 are positive (i.e. have been selected by the lottery)
    - 7 are negative (i.e. have not been selected by the lottery)
- Let X be the number of positives in a sample of 6
    - Therefore, X~Hypergeometric(15, 8, 6)
    - To win, you need $X \geq 4$. So compute $P(X \geq 4)$

$$\begin{align}
    P(X \geq 4) &= 1 - P(X \leq 3) \\
    &= 1 - F(3) \\
    &= 1 - f(3) - f(2) - f(1) - f(0) \\
    &\approx 0.378
\end{align}$$

- From the hypergeometric PMF, $f(3)$ is shown below; $f(2)$, $f(1)$, and $f(0)$ follow the same logic

$$
    f(3) = \frac{\binom{8}{3} \binom{7}{3}}{\binom{15}{6}}
$$

- We'll compute this below


In [98]:
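from scipy import stats
import numpy as np
X = stats.hypergeom(15, 8, 6)                   # 15 numbers, 8 drawn by the lottery, 6 picked by you
prob = np.sum(X.pmf([4, 5, 6]))                 # P(X >= 4) summed directly
prob2 = 1 - np.sum(X.pmf([0, 1, 2, 3]))         # same probability via the complement
assert np.isclose(prob, prob2)                  # compare floats with a tolerance, not ==
prob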

0.3776223776223776

In [101]:
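# Monte Carlo check: 1 marks a number the lottery drew, 0 a number it did not
population = [1] * 8 + [0] * 7
# each trial picks 6 numbers without replacement and wins if at least 4 were drawn
simulate_lottery = [np.sum(np.random.choice(population, 6, replace=False)) >= 4 for _ in range(10_000)]
win_rate = np.mean(simulate_lottery)
win_rate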

0.3785

### 4. You are enrolled in 3 courses this quarter, and the breakdown of majors by class is as follows:

- Class 1: 4 Statistics majors and 6 Computer Science majors
- Class 2: 17 Statistics majors and 13 Computer Science majors
- Class 3: 11 Statistics majors and 9 Computer Science majors

(i) If you take a simple random sample of 20% of the students in Class 1, what is the probability that the number of Statistics majors in your sample is equal to the number of Computer Science majors? (Note: In a simple random sample, each student can be selected at most once.)

(ii) Now, suppose you pick one of your 3 classes at random and then choose a random sample of 20% of students from that class. What is the probability that the number of Statistics majors in your sample is equal to the number of Computer Science majors in your sample? 

(i)

- 20% of the 10 students in Class 1 is 2 students
- So the question is: if I randomly sample 2 students without replacement, what is the probability that I get 1 of each major?
- Let X be the number of Computer Science majors in the sample
- X~Hypergeom(10, 6, 2)
- Find P(X=1), or f(1)

$$\begin{align}
    f(1) = \frac{\binom{6}{1}\binom{4}{1}}{\binom{10}{2}} = \frac{24}{45} \approx 0.533
\end{align}$$


(ii)

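- This is the same procedure, averaged over the three classes, each of which is picked with probability $\frac{1}{3}$ (law of total probability)
- A 20% sample is 2 students in Class 1, 6 students in Class 2 (20% of 30), and 4 students in Class 3 (20% of 20), so an even split means 1, 3, and 2 Computer Science majors respectively
- i.e. $\frac{1}{3} \cdot f(1)_{class1,N=2} + \frac{1}{3} \cdot f(3)_{class2,N=6} + \frac{1}{3} \cdot f(2)_{class3,N=4} \approx 0.423$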
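
The weighted sum can be evaluated with a short script. This is a minimal sketch using only the standard library. The helper `prob_equal_split` and the `classes` list are illustrative names, and the 20% sample size is computed with integer arithmetic on the assumption that it comes out whole, as it does for all three classes here.

```python
from math import comb

def prob_equal_split(n_stat, n_cs, percent=20):
    """P(a sample has equally many Statistics and CS majors) for one class."""
    class_size = n_stat + n_cs
    sample_size = class_size * percent // 100   # 2, 6, and 4 for the three classes
    if sample_size % 2:                         # an odd sample cannot split evenly
        return 0.0
    half = sample_size // 2
    return comb(n_cs, half) * comb(n_stat, half) / comb(class_size, sample_size)

classes = [(4, 6), (17, 13), (11, 9)]           # (Statistics, CS) counts per class
per_class = [prob_equal_split(s, c) for s, c in classes]
overall = sum(per_class) / len(per_class)       # each class is picked with probability 1/3
print(per_class, overall)
```

The first entry of `per_class` reproduces the answer to part (i), and `overall` is the answer to part (ii).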

### 5. In Texas Hold’em, each player has 2 cards of their own, and all players share 5 cards in the center of the table. A player has a flush when there are at least 5 cards of the same suit out of the 7 total cards. The deck is shuffled between hands, so that the probability you obtain a flush is independent from hand to hand. What is the probability that you get a flush at least once in 10 hands of Texas Hold’em? (Hint: First, calculate the probability of a flush of spades. Then, repeat for the other suits, and add the probabilities together to obtain the overall probability of a flush.)


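- Assumptions
    - Assume only 1 player, so no other players' cards are known
    - Hence, a total of 7 cards are drawn (2 in hand, 5 community cards)
    - 52 cards in a deck

- A flush can be in diamonds, clubs, hearts, or spades, with the remaining cards taking on any suit
- Let's compute the probability of a spades flush first. By symmetry, the same value applies to each of the other suits
    - Let any spade be tagged as a 1. In a deck, there are 13 1s and 39 0s
    - You want at least 5 of the 7 cards to come from the 13 spades
    - The remaining cards (at most 2) come from the other 39

- Let X be the number of spades you draw
- X~Hypergeom(52, 13, 7)
- Compute $P(X \geq 5) = 1 - P(X \leq 4) = 1 - F(4) \approx 0.00764$

- Multiply by 4 to get the probability of a flush in a single hand, about 0.0306. Adding is valid because 7 cards cannot contain 5 cards of two different suits, so the four events are mutually exclusive
- The hands are independent, so the probability of at least one flush in 10 hands is $1 - (1 - 4 \cdot P(X \geq 5))^{10} \approx 0.267$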

In [105]:
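from scipy import stats
import numpy as np

X = stats.hypergeom(52, 13, 7)                   # 52 cards, 13 spades, 7 cards seen
np.sum(X.pmf([5, 6, 7]))                         # P(X >= 5) summed directly (not echoed)
1 - np.sum(X.pmf([0, 1, 2, 3, 4]))               # same probability via the complement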

0.007641442330863835

In [127]:
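# Monte Carlo check: 1 marks a spade, 0 any other card
population = [0] * 39 + [1] * 13
# each trial deals 7 cards without replacement and checks for at least 5 spades
flush_draws = [np.sum(np.random.choice(population, 7, replace=False)) >= 5 for _ in range(25_000)]
np.mean(flush_draws)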

0.00774
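
The cells above only give the probability of a spades flush in one hand. To get from there to the question actually asked (at least one flush of any suit in 10 hands), something like the following sketch could be used. It relies only on the standard library, and the function name `p_suit_flush` is illustrative.

```python
from math import comb

def p_suit_flush(deck=52, suit_cards=13, hand=7, need=5):
    """P(at least `need` cards of one specific suit among `hand` cards)."""
    favorable = sum(
        comb(suit_cards, k) * comb(deck - suit_cards, hand - k)
        for k in range(need, hand + 1)
    )
    return favorable / comb(deck, hand)

p_spades = p_suit_flush()                     # P(X >= 5) for spades
p_flush = 4 * p_spades                        # the four suits are mutually exclusive with 7 cards
p_at_least_once = 1 - (1 - p_flush) ** 10     # complement of "no flush in 10 independent hands"
print(p_spades, p_flush, p_at_least_once)
```

Working with the complement avoids summing the probabilities of exactly 1, 2, ..., 10 flushes.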

### 6. There are 25 coins in a jar. 15 are quarters, 7 are dimes, and 3 are pennies. Each time you reach in the jar, you are equally likely to pick any of the coins in the jar. The coins are not replaced in the jar after each draw. What is the minimum number of times you must reach in the jar to have at least a 50% chance of getting all 3 pennies? (Hint: In this question, n is unknown. You will have to try a few different values of n to get the answer.)

- Let X be the number of pennies drawn
- X ~ Hypergeom(25, 3, n)
- You want to find the smallest $n$ such that $f(3) \geq 0.5$
- For simplicity, just use Python to try every value of $n$
- You need at least 21 draws, which is higher than one might expect

In [148]:
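import numpy as np
from scipy import stats
# P(all 3 pennies drawn) for every sample size n = 0, ..., 25
probs = [stats.hypergeom(25, 3, n).pmf(3) for n in range(26)]
# the list index equals n, so the first index with probability >= 0.5 is the answer
print([n for n, p in enumerate(probs) if p >= 0.5][0])
probs;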



21
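
The result can be cross-checked without scipy. Drawing all 3 pennies within $n$ draws is the same as all 3 pennies sitting among the first $n$ of the 25 positions in a random ordering, which has probability $\binom{n}{3} / \binom{25}{3}$. The sketch below uses that identity with exact fractions; the helper name `p_all_pennies` is illustrative.

```python
from fractions import Fraction
from math import comb

def p_all_pennies(n, coins=25, pennies=3):
    """Exact P(all pennies are among the first n coins drawn)."""
    return Fraction(comb(n, pennies), comb(coins, pennies))

# smallest number of draws with at least a 50% chance of holding all 3 pennies
min_draws = next(n for n in range(26) if p_all_pennies(n) >= Fraction(1, 2))
print(min_draws, float(p_all_pennies(20)), float(p_all_pennies(21)))
```

Printing the probabilities for 20 and 21 draws shows how close 20 draws comes to the 50% threshold without reaching it.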
