# Discrete Distributions Exercises

In [5]:
import pandas as pd
import scipy as sp
import scipy.stats  # load the submodule explicitly so that sp.stats is available


# 1. 
(3.5.11)<br>
A computer system uses passwords that are exactly six characters and each character is one of the 26 letters (a–z) or 10 integers (0–9).<br>
Suppose that 10,000 users of the system have unique passwords. A hacker randomly selects (with replacement) 100,000 passwords from the potential set, and a match to a user’s password is called a hit.<br>

## 1.A - What is the distribution of the number of hits?
### Answer
Let <em>X</em> be a random variable that denotes the number of selected passwords that match a user's password. Then <em>X</em> follows a binomial distribution, since the selections form a series of independent Bernoulli trials where each outcome is either "hit" or "no hit".<br>
<em>n</em> is the number of trials, which in this case is the number of passwords the hacker selects.<br>
<em>p</em> is the probability of a hit on a single trial. There are $36^6$ possible passwords, because each of the six characters is one of 36 symbols and symbols may repeat. Of these, 10000 belong to users, so $p = 10^4/36^6$.

$$ 
    X \sim B(n=100000, p=\frac{10^4}{36^6})
$$
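
As a quick check of the parameters above, the size of the password space and the per-trial hit probability can be computed directly. This is only a minimal sketch; the variable names are illustrative and not part of the exercise.

```python
# Illustrative names; they do not come from the exercise text.
alphabet_size = 26 + 10                        # letters a-z plus digits 0-9
password_length = 6
n_possible = alphabet_size ** password_length  # 36**6 possible passwords
n_users = 10_000                               # passwords that count as a hit

p_hit = n_users / n_possible                   # hit probability for one selection
print(n_possible)
print(f"{p_hit:.3e}")
```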

## 1.B - What is the probability of no hits?
Getting no hits is the same as $X = 0$ for the binomial distribution. We calculate the probability by using the PMF.

In [12]:
x = 0
n = 100000
p = pow(10,4) / pow(36,6)

p_no_hits = sp.stats.binom.pmf(x, n, p)
round(p_no_hits, 2)

0.63
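
For $x = 0$ the binomial PMF reduces to $(1-p)^n$, so the result can be cross-checked without the library. The sketch below assumes the same `n` and `p` as the cell above and redefines them so that it runs on its own.

```python
import scipy.stats

n = 100_000
p = 10**4 / 36**6

p_closed_form = (1 - p) ** n                 # binomial PMF evaluated at x = 0
p_scipy = scipy.stats.binom.pmf(0, n, p)     # same quantity from scipy

print(round(p_closed_form, 4), round(p_scipy, 4))
```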

## 1.C - What are the mean and variance of the hits?

In [20]:
hits_mean = sp.stats.binom.mean(n, p)
hits_variance = sp.stats.binom.var(n, p)

print(f"Mean: {round(hits_mean, 2)}")
print(f"Variance: {round(hits_variance, 2)}")

Mean: 0.46
Variance: 0.46
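
The same numbers follow from the closed-form expressions $E(X) = np$ and $V(X) = np(1-p)$. Because $p$ is tiny here, the factor $(1-p)$ is almost 1, which is why the mean and variance agree to two decimals. A minimal sketch, again redefining `n` and `p` so it is self-contained:

```python
import scipy.stats

n = 100_000
p = 10**4 / 36**6

mean_closed_form = n * p                # E(X) = n p
var_closed_form = n * p * (1 - p)       # V(X) = n p (1 - p)

print(f"Mean: {mean_closed_form:.4f}")
print(f"Variance: {var_closed_form:.4f}")
print(f"Difference: {mean_closed_form - var_closed_form:.2e}")
```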


# 2
(3.5.13)<br>
Because not all airline passengers show up for their reserved seat, an airline sells 125 tickets for a flight that holds only 120 passengers.<br>
The probability that a passenger does not show up is 0.10, and the passengers behave independently.

## 2.A - What is the probability that every passenger who shows up can take the flight?

### Answer
We know that the passengers behave independently and that there are 2 outcomes for each: show up or do not show up.<br>
Let <em>X</em> be a random variable that denotes how many passengers do not show up. Then <em>X</em> follows a binomial distribution, since we have a series of independent Bernoulli trials where each outcome is either "show up" or "do not show up" for each ticket sold.<br>
<em>n</em> is the number of trials, which in this case is the number of tickets sold for the flight.<br>
<em>p</em> is the probability of not showing up, which from the description is 0.10.<br>

$$
 X \sim B(n=125, p=0.10)
$$

For every passenger who shows up to get a seat on a flight with 120 seats when 125 tickets have been sold, at least 5 ticket holders must not show up. <br>
$$P(X \geq 5) = 1 - P(X < 5) = 1 - P(X \leq 4)$$

In [25]:
n = 125
p = 0.1
p_all_can_take_flight = 1 - sp.stats.binom.cdf(4, n, p)
print(f"{round(p_all_can_take_flight, 3)}")

0.996
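
An equivalent way to look at the same event is to count the passengers who do show up. Assuming a hypothetical variable $Y = 125 - X$ with $Y \sim B(125, 0.9)$, everyone gets a seat exactly when $Y \leq 120$. The sketch below checks that both formulations give the same probability.

```python
import scipy.stats

n_tickets = 125
p_no_show = 0.1

# Formulation used above: X = number of no-shows, need X >= 5.
p_via_no_shows = 1 - scipy.stats.binom.cdf(4, n_tickets, p_no_show)

# Alternative: Y = number of passengers who show up, need Y <= 120.
p_via_show_ups = scipy.stats.binom.cdf(120, n_tickets, 1 - p_no_show)

print(round(p_via_no_shows, 3), round(p_via_show_ups, 3))
```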


## 2.B - What is the probability that a flight departs with empty seats?
### Answer
If the flight has 120 seats and we sell 125 tickets, then at least 6 ticket holders must not show up for the flight to depart with empty seats
$$ P(X \geq 6) = 1 - P(X < 6) = 1 - P(X \leq 5) $$

In [26]:
p_flight_with_empty_seats = 1 - sp.stats.binom.cdf(5, n, p)
print(f"{round(p_flight_with_empty_seats, 3)}")

0.989
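
scipy also exposes the upper tail directly through the survival function `sf(k) = P(X > k)`, which avoids the `1 - cdf` subtraction. The sketch below uses it and also checks that this answer differs from the "everyone gets a seat" probability by exactly $P(X = 5)$, the case where the flight is exactly full.

```python
import scipy.stats

n, p = 125, 0.1  # tickets sold, no-show probability

p_empty_seats = scipy.stats.binom.sf(5, n, p)     # P(X > 5) = P(X >= 6)
p_all_seated = scipy.stats.binom.sf(4, n, p)      # P(X > 4) = P(X >= 5)
p_exactly_full = scipy.stats.binom.pmf(5, n, p)   # P(X = 5): exactly 120 show up

print(round(p_empty_seats, 3))
print(round(p_all_seated - p_exactly_full, 3))    # same value, by subtraction
```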


# 3 
(3.6.6)<br>
A player of a video game is confronted with a series of opponents and has an 80% probability of defeating each one.<br> 
Success with any opponent is independent of previous encounters. Until defeated, the player continues to contest opponents.

## 3.A - What is the PMF of the number of opponents contested in the game
### Answer
We essentially have a geometric distribution, which is the special case of the negative binomial distribution that counts the number of trials until the first success.<br> 
Quick overview: 
- Binomial distribution
    - Models # of successes in a fixed number of trials
- Negative binomial distribution
    - Models # of trials until a fixed number of successes
- Geometric distribution
    - Models # of trials until the first success

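The PMF of a geometric distribution is
$$
 P(X=x) = (1-p)^{x-1} p, \quad x = 1, 2, \ldots
$$
Here <em>X</em> is the number of opponents contested and a "success" is the trial that ends the game, i.e. the player being defeated, so $p = 1 - 0.8 = 0.2$.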
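
The formula can be checked against `scipy.stats.geom`, which uses the same convention (support starting at 1, counting the trial on which the success occurs). The sketch assumes $p = 0.2$, the probability that the player loses to a given opponent; `geom_pmf` is a helper written for this illustration.

```python
import scipy.stats

p = 0.2  # assumed: probability that the player is defeated by an opponent

def geom_pmf(x: int, p: float) -> float:
    """P(X = x): x - 1 wins followed by one loss."""
    return (1 - p) ** (x - 1) * p

# Compare the hand-written PMF with scipy for the first few values of x.
for x in range(1, 5):
    print(x, round(geom_pmf(x, p), 4), round(scipy.stats.geom.pmf(x, p), 4))
```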

## 3.B - What is the probability a player defeats at least two opponents in a game?
### Answer
Since a geometric distribution models the number of trials until the first success, and a game ends the first time the player is defeated, we treat "player defeated" as the success: <br>
Let <em>X</em> denote the number of opponents contested and <em>p</em> be the chance that the player is defeated, which is $1-P_{defeat~opponent} = 1 - 0.8 = 0.2$ <br>
<em>X</em> includes the final opponent, who defeats the player, so defeating at least 2 opponents means contesting at least 3:
$$
 P(X \geq 3) = 1 - P(X < 3) = 1 - P(X \leq 2) = 0.8^2
$$

In [27]:
p = 0.2
p_defeat_2_or_more = 1 - sp.stats.geom.cdf(2, p)
print(f"{round(p_defeat_2_or_more, 2)}")

0.64


## 3.C - What is the expected number of opponents contested in a game?
The expectation is the number of opponents we expect the player to contest, including the one who finally defeats the player, when the player has a 20% chance of being defeated by each opponent. This is the mean of the geometric distribution, $1/p$.

In [32]:
expected_opponents = sp.stats.geom.mean(0.2) # Number of expected trials until first success
print(f"{expected_opponents}")

5.0
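
The value 5 is the closed-form mean $1/p$ of a geometric distribution. As a rough check, the sketch below also sums $x \cdot P(X = x)$ directly; the series is truncated at 1000 terms, an arbitrary cutoff chosen for illustration, beyond which the remaining terms are negligible.

```python
p = 0.2  # probability that the player is defeated by an opponent

mean_closed_form = 1 / p

# E(X) = sum over x of x * (1 - p)**(x - 1) * p, truncated at 1000 terms.
mean_series = sum(x * (1 - p) ** (x - 1) * p for x in range(1, 1001))

print(mean_closed_form, round(mean_series, 6))
```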


## 3.D - What is the probability that a player contests four or more opponents in a game?
### Answer
We are looking for 
$$
P(X \geq 4) = 1 - P(X < 4) = 1 - P(X \leq 3)
$$

In [35]:
p = 0.2
p_four_or_more = 1- sp.stats.geom.cdf(3, p)
print(f"{p_four_or_more}")

0.512
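
For a geometric variable the upper tail has a simple closed form: contesting at least $k$ opponents requires winning the first $k-1$ contests, so $P(X \geq k) = (1-p)^{k-1}$. The sketch below compares this with scipy's survival function `sf(k) = P(X > k)`.

```python
import scipy.stats

p = 0.2  # probability that the player is defeated by an opponent
k = 4

tail_closed_form = (1 - p) ** (k - 1)          # win the first k - 1 contests
tail_scipy = scipy.stats.geom.sf(k - 1, p)     # P(X > k - 1) = P(X >= k)

print(round(tail_closed_form, 3), round(tail_scipy, 3))
```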


## 3.E - What is the expected number of game plays until a player contests four or more opponents?
### Answer
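The chance of contesting 4 or more opponents in a single game is 
$$
P(X \geq 4) = 0.512
$$
We now let <em>Y</em> be a random variable that denotes the number of games played until the player contests 4 or more opponents in a game. Games are independent, so <em>Y</em> is again geometric, now with success probability $P(X \geq 4)$. The expected number of games played is then the mean of this new geometric distribution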

In [39]:
expected_games_until_4_or_more_contested = sp.stats.geom.mean(p_four_or_more)
print(f"{round(expected_games_until_4_or_more_contested,2)}") # 1.95 ~ 2 games 

1.95
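
The result can also be sanity-checked with a small Monte Carlo simulation of the game itself. This is only a sketch: the function names, the seed and the number of runs are arbitrary choices made for illustration, and the estimate will fluctuate slightly around $1/0.512 \approx 1.95$.

```python
import random

def opponents_contested(rng: random.Random, p_lose: float = 0.2) -> int:
    """Play one game and return the number of opponents contested."""
    count = 1
    while rng.random() >= p_lose:   # player wins with probability 1 - p_lose
        count += 1                  # and moves on to the next opponent
    return count

def plays_until_long_game(rng: random.Random, threshold: int = 4) -> int:
    """Count game plays until one game reaches `threshold` opponents."""
    plays = 1
    while opponents_contested(rng) < threshold:
        plays += 1
    return plays

rng = random.Random(0)              # fixed seed for reproducibility
n_runs = 100_000
estimate = sum(plays_until_long_game(rng) for _ in range(n_runs)) / n_runs
print(round(estimate, 2))
```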
