In [1]:
import numpy as np
import pandas as pd
from scipy.stats import norm, binom
from scipy.special import comb
from IPython.display import display, Latex

In [2]:
from utils.src.stats import norm_prop

# [Random Variables](https://en.wikipedia.org/wiki/Random_variable)

A **random variable** (also called **random quantity**, **aleatory variable**, or **stochastic variable**) is a mathematical formalization of a quantity or object which depends on [random](https://en.wikipedia.org/wiki/Randomness "Randomness") events.[[1]](https://en.wikipedia.org/wiki/Random_variable#cite_note-:2-1)

A **random variable** ${\displaystyle X}$ is a [measurable function](https://en.wikipedia.org/wiki/Measurable_function "Measurable function") ${\displaystyle X\colon \Omega \to E}$ from a set of possible [outcomes](https://en.wikipedia.org/wiki/Outcome_(probability) "Outcome (probability)") ${\displaystyle \Omega }$ to a [measurable space](https://en.wikipedia.org/wiki/Measurable_space "Measurable space") ${\displaystyle E}$. The technical axiomatic definition requires ${\displaystyle \Omega }$ to be a sample space of a [probability triple](https://en.wikipedia.org/wiki/Probability_space "Probability space") ${\displaystyle (\Omega ,{\mathcal {F}},\operatorname {P})}$ (see the [measure-theoretic definition](https://en.wikipedia.org/wiki/Random_variable#Measure-theoretic_definition)). A random variable is often denoted by capital [roman letters](https://en.wikipedia.org/wiki/Latin_script "Latin script") such as ${\displaystyle X}$, ${\displaystyle Y}$, ${\displaystyle Z}$, ${\displaystyle T}$.

The probability that ${\displaystyle X}$ takes on a value in a measurable set ${\displaystyle S\subseteq E}$ is written as

${\displaystyle \operatorname {P} (X\in S)=\operatorname {P} (\{\omega \in \Omega \mid X(\omega )\in S\})}$
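
As a concrete illustration of this definition, here is a minimal sketch on a finite sample space: two fair dice, with $X$ mapping each outcome to the sum of the faces. The names `omega`, `X`, and `prob_X_in` are illustrative and are not part of this notebook's utilities.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of rolling two fair dice (36 equally likely points).
omega = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: maps an outcome (a, b) to the sum of the two faces."""
    a, b = outcome
    return a + b

def prob_X_in(S):
    """P(X in S) = P({w in omega : X(w) in S}) under the uniform measure."""
    favorable = [w for w in omega if X(w) in S]
    return Fraction(len(favorable), len(omega))

print(prob_X_in({7}))       # probability that the sum is exactly 7
print(prob_X_in({11, 12}))  # probability that the sum is at least 11
```

Here the event $\{X \in S\}$ is built explicitly as a subset of $\Omega$, which mirrors the right-hand side of the formula.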

- Discrete RV
- Continuous RV

## [Expected Value](https://en.wikipedia.org/wiki/Expected_value)

In [probability theory](https://en.wikipedia.org/wiki/Probability_theory "Probability theory"), the **expected value** (also called **expectation**, **mathematical expectation**, **mean**, **average**, or **first moment**) is a generalization of the [weighted average](https://en.wikipedia.org/wiki/Weighted_average "Weighted average"). Informally, the expected value is the [arithmetic mean](https://en.wikipedia.org/wiki/Arithmetic_mean "Arithmetic mean") of a large number of [independently](https://en.wikipedia.org/wiki/Independence_(probability_theory) "Independence (probability theory)") selected [outcomes](https://en.wikipedia.org/wiki/Experiment_(probability_theory) "Experiment (probability theory)") of a [random variable](https://en.wikipedia.org/wiki/Random_variable "Random variable").

The expected value of a random variable $X$ is often denoted by $E[X]$, $E(X)$, or $EX$, with $E$ also often stylized as $\operatorname{E}$ or ${\displaystyle \mathbb {E} }$.

$\displaystyle \operatorname {E} [X]=\sum _{i=1}^{\infty }x_{i}\,p_{i}$

$\displaystyle \operatorname {Var} (X)=\operatorname {E} [X^{2}]-(\operatorname {E} [X])^{2}$
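
The two formulas above can be checked numerically for a finite distribution. The sketch below uses a fair six-sided die (a made-up example, chosen only for illustration) and confirms that the shortcut $E[X^2] - (E[X])^2$ agrees with the definition $E[(X - \mu)^2]$. Exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
xs = [1, 2, 3, 4, 5, 6]
ps = [Fraction(1, 6)] * 6

def expectation(values, probs):
    """E[X] = sum of x_i * p_i over a finite support."""
    return sum(x * p for x, p in zip(values, probs))

mean = expectation(xs, ps)
var_shortcut = expectation([x**2 for x in xs], ps) - mean**2
var_definition = expectation([(x - mean)**2 for x in xs], ps)

print(mean)            # 7/2
print(var_shortcut)    # 35/12
print(var_definition)  # 35/12
```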

In [3]:
def rv_mean(X, P):
    """rv_mean calculates the expected value (also known as mean) of the random variable X.

    Args:
        X (array like): Random Variable X.
        P (array like): The probability distribution of random variable X.

    Returns:
        float: The expected value (also known as mean) of the random variable X.
    """
    mu = np.average(a=X, weights=P)

    return mu

In [4]:
def rv_std(X, P):
    """rv_std calculates the standard deviation of the random variable X.

    Args:
        X (array like): Random Variable X.
        P (array like): The probability distribution of random variable X.

    Returns:
        float: The standard deviation of the random variable X.
    """
    X, P = np.asarray(X), np.asarray(P)  # accept lists as well as arrays
    mu = rv_mean(X, P)
    sd = np.sum((X - mu)**2 * P)**0.5

    return sd

### Example 1: Expected Value

Mahnoor owns and operates Mahnoor's Coffee Shop. The city of Laketown, Australia, where Mahnoor's Coffee Shop is located, recently enacted a ban on all foam cups to help protect the environment.

Instead of switching to paper cups, Mahnoor has decided to risk being fined by the city and to continue to use foam cups. She estimates that this will save her $10,000$ Australian dollars. She also estimates that there is a $12\%$ chance that she will be fined. The fine would be for $100,000$ Australian dollars.

**Find the expected value of Mahnoor's decision to continue to use foam cups.**

||Value|Probability|Value * Probability|
|:-:|:-:|:-:|:-:|
|Mahnoor is fined|-90,000|0.12|-10,800|
|Mahnoor is not fined|10,000|0.88|8,800|

The expected value is -10,800 + 8,800 = -2,000 Australian dollars.
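
The table's arithmetic can be reproduced in a few lines. This is only a small sketch; the variable names are illustrative.

```python
# Net outcome in Australian dollars for each scenario.
savings, fine = 10_000, 100_000
p_fined = 0.12

outcomes = {
    "fined": (savings - fine, p_fined),     # net -90,000 with probability 0.12
    "not fined": (savings, 1 - p_fined),    # net +10,000 with probability 0.88
}

# Expected value: sum of value * probability over the two scenarios.
expected_value = sum(value * prob for value, prob in outcomes.values())
print(round(expected_value, 2))
```

A negative expected value means that, on average, continuing to use foam cups costs Mahnoor money.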

### Example 2: Expected Value & Standard Deviation

A patient is sick with a certain infection whose treatment involves taking a $\$20$ drug that has a $90\%$ chance of curing the infection. If that drug doesn't work, the patient then takes an $\$80$ drug that is almost guaranteed to cure the infection.

The table below displays the probability distribution of $X =$ the total amount of money a randomly selected patient spends on this treatment plan.

|X = total spent|\$20|\$100|
|:-:|:-:|:-:|
|P(X)|90%|10%|

Calculate $\mu_X$ and $\sigma_X$

In [5]:
X = np.array([20, 100])
P = np.array([0.9, 0.1])

In [6]:
print("Expected Value: ", rv_mean(X, P))
print("Standard Deviation: ", rv_std(X, P))

Expected Value:  28.0
Standard Deviation:  24.0


### Example 3: Expected Value

Marvin the monkey is taking a multiple choice test as part of an experiment. There are $4$ questions on the test and each question has $2$ different answer choices. Since Marvin is a monkey, he will be guessing on each question.

The test is graded according to the grading scheme below.

**What is the expected number of points Marvin will score?**  
Round your answer to the nearest hundredth.

|Correct answers|0|1|2|3|4|
|:-:|:-:|:-:|:-:|:-:|:-:|
|Points|0|3|5|7|10|

In [7]:
n, p = 4, 1/2
ks = np.arange(n+1)
points = np.array([0, 3, 5, 7, 10])
df = pd.DataFrame({'Correct answers': ks, 'Points': points})
df

||Correct answers|Points|
|:-:|:-:|:-:|
|0|0|0|
|1|1|3|
|2|2|5|
|3|3|7|
|4|4|10|


In [8]:
df['Probability'] = df['Correct answers'].apply(lambda k: binom.pmf(k, n, p))
df

||Correct answers|Points|Probability|
|:-:|:-:|:-:|:-:|
|0|0|0|0.0625|
|1|1|3|0.25|
|2|2|5|0.375|
|3|3|7|0.25|
|4|4|10|0.0625|


In [9]:
display(Latex(f"$E[X] = {sum(df['Probability'] * df['Points'])}$"))

<IPython.core.display.Latex object>

### Example 4: Expected Value

Ahmed is playing a lottery game where he must pick $2$ numbers from $0$ to $9$ and then one letter out of the $26$-letter English alphabet. (He may choose the same number both times.)

If his ticket matches the $2$ numbers and $1$ letter drawn in order, he wins the grand prize and receives $\$10{,}405$. If just his letter matches but one or both of his numbers do not match, he wins the small prize of $\$100$. Under any other outcome, he loses and receives nothing. The game costs him $\$5$ to play.

He has chosen the ticket $\text{04R}$.

**What is Ahmed's expected profit when he buys a ticket?**  
Round your answer to the nearest hundredth.

$E[X] = P(\text{grand prize}) \cdot (10405 - 5) + P(\text{small prize}) \cdot (100 - 5) + P(\text{no prize}) \cdot (-5)$

In [10]:
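cost = -5
# Grand prize: both digits and the letter match, in order.
P_grand_prize, N_grand_prize = 1/10 * 1/10 * 1/26, 10405 + cost
# Small prize: the letter matches but at least one digit does not.
P_small_prize, N_small_prize = 1/26 - P_grand_prize, 100 + cost
# No prize: the letter does not match (the three cases must sum to 1).
P_no_prize = 1 - P_grand_prize - P_small_prize

E = P_grand_prize * N_grand_prize + P_small_prize * N_small_prize + P_no_prize * cost
E = round(E, 2)
display(Latex(f"$E[04R] = {E}$"))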

<IPython.core.display.Latex object>
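
As a cross-check on the formula for $E[X]$, the sketch below enumerates every possible draw instead of computing the three probabilities by hand. It assumes each of the $10 \cdot 10 \cdot 26$ draws is equally likely; the helper `profit` is illustrative and not part of the original notebook.

```python
from fractions import Fraction
from itertools import product
from string import ascii_uppercase

ticket = ("0", "4", "R")   # Ahmed's ticket 04R
cost = 5

def profit(draw):
    """Profit for one drawn (digit, digit, letter) combination."""
    if draw == ticket:
        return 10405 - cost      # grand prize
    if draw[2] == ticket[2]:
        return 100 - cost        # letter matches, digits do not both match
    return -cost                 # no prize

# All 10 * 10 * 26 = 2600 equally likely draws.
draws = list(product("0123456789", "0123456789", ascii_uppercase))
expected_profit = sum(Fraction(profit(d)) for d in draws) / len(draws)

print(expected_profit, round(float(expected_profit), 2))
```

Enumeration is slower than the closed-form calculation, but it makes it hard to miscount the outcomes that fall into each prize category.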

### Example 5: Expected Value

Jessica is playing a game where there are $4$ blue markers and $6$ red markers in a box. She is going to pick $3$ markers without replacement.

If she picks all $3$ red markers, she will win a total of $\$500$. If the first marker she picks is red but not all $3$ markers are red, she will win a total of $\$100$. Under any other outcome, she will win $\$0$.

**What is the expected value of Jessica's winnings?**  
Round your answer to the nearest cent.

$E[X] = P(\text{R, R, R}) \cdot 500 + (P(\text{R, X, X}) - P(\text{R, R, R})) \cdot 100 + P(\text{others}) \cdot 0$

In [11]:
# Total ways to choose any 3 of the 10 markers
N, k = 10, 3
XXX = comb(N, k)
# Ways to choose 3 of the 6 red markers
N, k = 6, 3
RRR = comb(N, k)
# P(R, R, R)
p1 = RRR / XXX
p1

0.16666666666666666

In [12]:
# Ways to choose 2 red markers from the 5 reds left after a red first pick
N, k = 5, 2
RR = comb(N, k)
# Ways to choose any 2 of the 9 markers left after the first pick
N, k = 9, 2
XX = comb(N, k)
# P(first marker is red, but the other two are not both red)
p2 = 6/10 * (XX - RR) / XX
p2

0.43333333333333335

In [13]:
E = p1 * 500 + p2 * 100
E = round(E, 2)
display(Latex(f"$E[X] = {E}$"))

<IPython.core.display.Latex object>
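
The same expected value can be obtained by brute force, enumerating every ordered draw of three markers without replacement. This is a sketch for verification only; `winnings` is an illustrative helper, and each ordered draw is assumed to be equally likely.

```python
from fractions import Fraction
from itertools import permutations

markers = ["R"] * 6 + ["B"] * 4   # 6 red and 4 blue markers

def winnings(draw):
    """Payout for an ordered draw of three markers."""
    if draw == ("R", "R", "R"):
        return 500
    if draw[0] == "R":
        return 100               # first marker red, but not all three red
    return 0

# Ordered draws without replacement: 10 * 9 * 8 = 720 equally likely sequences.
draws = list(permutations(markers, 3))
expected_winnings = sum(Fraction(winnings(d)) for d in draws) / len(draws)

print(round(float(expected_winnings), 2))
```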

### Example 6: Expected Value

Xavier is trying to decide if he should spend $\$25$ to purchase virus protection software for his new computer.

There are currently three viruses in circulation. The software that he is considering purchasing will protect him perfectly from the Trojan virus and the Pikachu virus, but will not protect him from the ADA virus.

Based on his research, Xavier makes the table below which shows the probability and cost of repair for the three viruses currently in circulation. Assume that Xavier can get at most one virus and that this table is accurate.

In [14]:
df = pd.DataFrame({
    "Type of virus": ["Pikachu", "Trojan", "ADA"], 
    "Cost of repair $": [400, 200, 300],
    "Probability": [0.2, 0.2, 0.1]})

df

||Type of virus|Cost of repair $|Probability|
|:-:|:-:|:-:|:-:|
|0|Pikachu|400|0.2|
|1|Trojan|200|0.2|
|2|ADA|300|0.1|


**What is the expected total cost of viruses and software if Xavier purchases the virus protection software? Round your answer to the nearest dollar.**

In [15]:
300 * 0.1 + 25

55.0

**What is the expected total cost of viruses and software if Xavier does not purchase the virus protection software? Round your answer to the nearest dollar.**

In [16]:
np.sum(df["Cost of repair $"] * df.Probability)

150.0

### Example 7: Expected Value

Paul has the option of a high deductible or a low deductible health insurance plan.

If Paul chooses the low deductible plan, he will have to pay the first $\$1000$ of any medical costs, and the rest of the costs will be paid by his insurance company. The low deductible plan costs $\$8000$ for a year.

If Paul chooses the high deductible plan, he will have to pay the first $\$2500$ of any medical costs, and the rest of the costs will be paid by his insurance company. The high deductible plan costs $\$7500$ for a year.

To help himself choose a plan, Paul found some statistics about common health problems for people similar to him. Assume that the table below correctly shows the probabilities and costs of total medical incidents within the next year.

In [17]:
df = pd.DataFrame({
    "Medical costs $": [0, 1000, 4000, 7000, 15000],
    "Probabiltiy": [0.3, 0.25, 0.2, 0.2, 0.05]})
df

||Medical costs $|Probabiltiy|
|:-:|:-:|:-:|
|0|0|0.3|
|1|1000|0.25|
|2|4000|0.2|
|3|7000|0.2|
|4|15000|0.05|


**Including the cost of insurance, what are the expected total medical costs that Paul must pay with the low deductible plan? Round your answer to the nearest dollar.**

In [18]:
deductible, insurance = 1000, 8000
df["Low Deductible"] = df["Medical costs $"].apply(lambda cost: deductible if cost >= deductible else cost)
df

||Medical costs $|Probabiltiy|Low Deductible|
|:-:|:-:|:-:|:-:|
|0|0|0.3|0|
|1|1000|0.25|1000|
|2|4000|0.2|1000|
|3|7000|0.2|1000|
|4|15000|0.05|1000|


In [19]:
np.sum(df["Low Deductible"] * df.Probabiltiy) + insurance

8700.0

**Including the cost of insurance, what are the expected total medical costs that Paul must pay with the high deductible plan? Round your answer to the nearest dollar.**

In [20]:
deductible, insurance = 2500, 7500
df["High Deductible"] = df["Medical costs $"].apply(lambda cost: deductible if cost >= deductible else cost)
df

||Medical costs $|Probabiltiy|Low Deductible|High Deductible|
|:-:|:-:|:-:|:-:|:-:|
|0|0|0.3|0|0|
|1|1000|0.25|1000|1000|
|2|4000|0.2|1000|2500|
|3|7000|0.2|1000|2500|
|4|15000|0.05|1000|2500|


In [21]:
np.sum(df["High Deductible"] * df.Probabiltiy) + insurance

8875.0

**If Paul wants the best payoff in the long run and must buy one of the two insurance plans, he should purchase the `Low Deductible` plan since it has lower expected cost.**

### Example 8: Expected Value

Francisco is at the arcade and can buy a token for $\$2$ that gives him the choice of playing either Smack the Mole or Motorcycle Chasers.

**Smack the Mole:** He plays $3$ rounds, and for each round he has a $30\%$ chance of smacking the mole. If he smacks the mole in all $3$ rounds, he will win a stuffed bear worth $\$50$.

**Motorcycle Chasers:** He will participate in $4$ races, and for each race he has an $80\%$ chance of winning the race. If he wins all $4$ races, he will win a replica motorcycle worth $\$70$.

**What is Francisco's expected value from playing Smack the Mole? Round your answer to the nearest cent.**

In [22]:
round(0.3**3 * (50-2) + (1 - 0.3**3) * -2, 2)

-0.65

**What is Francisco's expected value from playing Motorcycle Chasers? Round your answer to the nearest cent.**

In [23]:
round(0.8**4 * (70-2) + (1 - 0.8**4) * -2, 2)

26.67

### Example 9: Expected Value & Cumulative

Rory has entered The Irish Gunman’s Open. In The Irish Gunman’s Open, Rory will hit golf balls at targets. If he hits the target, he will advance to the next stage, and if he misses the target, Rory will exit The Open. The Open consists of $4$ possible stages.

If the best stage Rory completes is the second stage, he will win $100$ euros. If the best stage Rory completes is the third stage, Rory will win a total of $300$ euros. If he can complete all $4$ stages, he will win a total of $1000$ euros.

Because each stage gets progressively more difficult, the probability of Rory hitting the targets decreases. The probability of Rory hitting each target given that he has reached that stage is given in the data table below.

**What is the expected value of Rory's prize money from The Irish Gunman's Open?**  
Round your answer to the nearest hundredth.

In [24]:
stage = np.array([1, 2, 3, 4])
probability = np.array([.6, .5, .4, .3])
prize = np.array([0, 100, 300, 1000])

df = pd.DataFrame({'Stage': stage, 'Probability': probability, 'Prize': prize})
df

||Stage|Probability|Prize|
|:-:|:-:|:-:|:-:|
|0|1|0.6|0|
|1|2|0.5|100|
|2|3|0.4|300|
|3|4|0.3|1000|


In [25]:
p4 = 0.6 * 0.5 * 0.4 * 0.3
p3 = 0.6 * 0.5 * 0.4 * 0.7
p2 = 0.6 * 0.5 * 0.6
p1 = 1 - (p4 + p3 + p2)
E = np.sum(np.array([0, 100, 300, 1000]) * np.array([p1, p2, p3, p4]))
E = round(E, 2)
display(Latex(f"$E[X] = {E}$"))

<IPython.core.display.Latex object>
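
The hand-written products above follow a general pattern: the probability of completing stages $1$ through $k$ is a running product of the conditional probabilities, and the probability that stage $k$ is the best one completed is the difference between consecutive running products. A minimal sketch of that pattern, using only the standard library (the variable names are illustrative):

```python
from itertools import accumulate
from operator import mul

# P(hit target at stage k | reached stage k), taken from the table above.
p_hit = [0.6, 0.5, 0.4, 0.3]
prize = [0, 100, 300, 1000]      # prize if stage k is the best stage completed

# P(complete stages 1..k): running product, approximately [0.6, 0.3, 0.12, 0.036].
p_complete = list(accumulate(p_hit, mul))

# P(best stage completed is exactly k) = P(complete k) - P(complete k + 1),
# where "completing stage 5" has probability 0 because there are only 4 stages.
p_best = [a - b for a, b in zip(p_complete, p_complete[1:] + [0.0])]

expected_prize = sum(p * x for p, x in zip(p_best, prize))
print(round(expected_prize, 2))
```

Missing the very first target is not listed in `p_best`, but it pays nothing, so it does not affect the expected value.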

### Example 10: Expected Value - Combinations

Vera has challenged Alexey to a round of Marker Mixup. Marker Mixup is a game where there is a bag of $5$ red markers numbered $1$ through $5$, and another bag with $5$ green markers numbered $6$ through $10$.

Alexey will grab $1$ marker from each bag, and if the $2$ markers add up to more than $12$, he will win $\$5$. If the sum is exactly $12$, he will break even, and if the sum is less than $12$, he will lose $\$6$.

**What is Alexey's expected value of playing Marker Mixup?**  
Round your answer to the nearest cent.

In [26]:
from itertools import product

In [27]:
red = [1, 2, 3, 4, 5]
green = [6, 7, 8, 9, 10]

In [28]:
combs = np.array(list(product(red, green)))
combs

array([[ 1,  6],
       [ 1,  7],
       [ 1,  8],
       [ 1,  9],
       [ 1, 10],
       [ 2,  6],
       [ 2,  7],
       [ 2,  8],
       [ 2,  9],
       [ 2, 10],
       [ 3,  6],
       [ 3,  7],
       [ 3,  8],
       [ 3,  9],
       [ 3, 10],
       [ 4,  6],
       [ 4,  7],
       [ 4,  8],
       [ 4,  9],
       [ 4, 10],
       [ 5,  6],
       [ 5,  7],
       [ 5,  8],
       [ 5,  9],
       [ 5, 10]])

In [29]:
p_more_than_12 = np.sum(np.sum(combs, axis=1) > 12) / len(combs)
p_equal_to_12 = np.sum(np.sum(combs, axis=1) == 12) / len(combs)
p_less_than_12 = np.sum(np.sum(combs, axis=1) < 12) / len(combs)
print(p_more_than_12)
print(p_equal_to_12)
print(p_less_than_12)

0.24
0.16
0.6


In [30]:
E = p_more_than_12 * 5 + p_equal_to_12 * 0 + p_less_than_12 * -6
E = round(E, 2)
display(Latex(f"$E[X] = {E}$"))

<IPython.core.display.Latex object>

---

## Transforming Random Variables

### Example 1

Mr. Gupta gave his students a quiz with three questions on it. Let $X$ represent the number of questions that a randomly chosen student answered correctly. Here is the probability distribution of $X$ along with summary statistics:

|X = # correct|0|1|2|3|
|:-:|:-:|:-:|:-:|:-:|
|P(X)|0.05|0.20|0.50|0.25|

- Mean: $\mu_X = 1.95$
- SD: $\sigma_X \approx 0.8$

Mr. Gupta decides to score the tests by giving $10$ points for each correct question. He also plans to give every student $5$ additional bonus points. Let $Y$ represent a random student's score.

**What are the mean and standard deviation of $Y$?**

In [31]:
X = np.array([0, 1, 2, 3])
P = np.array([0.05, 0.20, 0.50, 0.25])
mu_X, sd_X = 1.95, 0.8

df = pd.DataFrame({'X': X, 'P(X)': P})
df

||X|P(X)|
|:-:|:-:|:-:|
|0|0|0.05|
|1|1|0.2|
|2|2|0.5|
|3|3|0.25|


In [32]:
df['Y'] = df.X * 10 + 5
df

||X|P(X)|Y|
|:-:|:-:|:-:|:-:|
|0|0|0.05|5|
|1|1|0.2|15|
|2|2|0.5|25|
|3|3|0.25|35|


In [33]:
mu_Y = mu_X * 10 + 5
sd_Y = sd_X * 10
display(Latex(rf'$\mu_Y = \mu_X \cdot 10 + 5 = {mu_Y}$'))
display(Latex(rf'$\sigma_Y = \sigma_X \cdot 10 = {sd_Y}$'))

<IPython.core.display.Latex object>

<IPython.core.display.Latex object>
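
The rules used here, $\mu_Y = a\,\mu_X + b$ and $\sigma_Y = |a|\,\sigma_X$ for $Y = aX + b$, can be verified by computing the mean and standard deviation of $Y$ directly from its distribution. The sketch below does that with the standard library only; `mean_sd` is an illustrative helper, and lowercase names are used so that nothing defined earlier in the notebook is overwritten.

```python
import math

xs = [0, 1, 2, 3]
ps = [0.05, 0.20, 0.50, 0.25]
a, b = 10, 5                      # Y = a * X + b

def mean_sd(values, probs):
    """Mean and standard deviation of a finite discrete distribution."""
    mu = sum(v * p for v, p in zip(values, probs))
    var = sum((v - mu) ** 2 * p for v, p in zip(values, probs))
    return mu, math.sqrt(var)

mu_x, sd_x = mean_sd(xs, ps)
mu_y, sd_y = mean_sd([a * x + b for x in xs], ps)

# The shift b moves the mean only; the scale a affects both mean and spread.
print(math.isclose(mu_y, a * mu_x + b))
print(math.isclose(sd_y, abs(a) * sd_x))
```

Because the summary statistic $\sigma_X \approx 0.8$ is rounded, the directly computed $\sigma_Y$ differs slightly from $8.0$.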

---

## [Combining Random Variables](https://www.khanacademy.org/math/statistics-probability/random-variables-stats-library/combine-random-variables/a/combining-random-variables-article)

We can form new distributions by combining random variables. If we know the mean and standard deviation of the original distributions, we can use that information to find the mean and standard deviation of the resulting distribution.

We can combine means directly, but we can't do this with standard deviations. We can combine variances as long as it's reasonable to assume that the variables are independent.

- $E[X+Y] = E[X] + E[Y]$, $E[X-Y] = E[X] - E[Y]$
- $Var[X+Y] = Var[X] + Var[Y]$, $Var[X-Y] = Var[X] + Var[Y]$

- Make sure that the variables are independent or that it's reasonable to assume independence, before combining variances.
- Even when we subtract two random variables, we still add their variances; subtracting two variables increases the overall variability in the outcomes.
- We can find the standard deviation of the combined distributions by taking the square root of the combined variances.
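
These rules can be confirmed exactly on a small example. The sketch below takes $X$ and $Y$ to be two independent fair dice (a made-up example for illustration) and computes the variance of $X + Y$ and of $X - Y$ by enumerating the joint distribution. The helper `variance` is illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p_joint = Fraction(1, 36)         # independence: P(X=x, Y=y) = P(X=x) * P(Y=y)

def variance(values_with_probs):
    """Variance of a finite distribution given as (value, probability) pairs."""
    pairs = list(values_with_probs)
    mu = sum(v * p for v, p in pairs)
    return sum((v - mu) ** 2 * p for v, p in pairs)

var_single = variance((x, Fraction(1, 6)) for x in faces)
var_sum = variance((x + y, p_joint) for x, y in product(faces, faces))
var_diff = variance((x - y, p_joint) for x, y in product(faces, faces))

print(var_single, var_sum, var_diff)   # 35/12 35/6 35/6
```

Both the sum and the difference have variance $2 \cdot \tfrac{35}{12}$, so subtracting the variables does not subtract their variances.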

---

## [Combining Normal Random Variables](https://www.khanacademy.org/math/statistics-probability/random-variables-stats-library/combine-random-variables/a/combining-normal-random-variables)

When we combine variables that each follow a normal distribution, the resulting distribution is also normally distributed. This lets us answer interesting questions about the resulting distribution.

To review the `norm_rv` module, click [here](utils/norm/norm_rv.py).

In [34]:
from utils.src.stats import norm_rv
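
The source of `norm_rv` is not shown in this notebook, so the sketch below is only a guess at the kind of computation such a helper performs, not its actual implementation. The hypothetical function `combined_normal_prob` combines independent normal variables into a total (`'T'`) or into the difference of the first two (`'D'`), then reads a probability off the resulting normal distribution using the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def combined_normal_prob(dp, mus, sds, method="T", area="above"):
    """Hypothetical stand-in for a helper like norm_rv (assumed behaviour).

    method 'T': total of all variables; 'D': first variable minus the second.
    area: 'above', 'below', or 'between' (dp is then a [low, high] pair).
    """
    if method == "T":
        mu = sum(mus)
    elif method == "D":
        mu = mus[0] - mus[1]
    else:
        raise ValueError("method must be 'T' or 'D'")
    # Variances add for both sums and differences of independent variables.
    sd = math.sqrt(sum(s ** 2 for s in sds))
    dist = NormalDist(mu, sd)

    if area == "above":
        return 1 - dist.cdf(dp)
    if area == "below":
        return dist.cdf(dp)
    if area == "between":
        low, high = dp
        return dist.cdf(high) - dist.cdf(low)
    raise ValueError("area must be 'above', 'below', or 'between'")

# Total of 4 independent N(65, 12) variables: probability that the total exceeds 290.
print(round(combined_normal_prob(290, [65] * 4, [12] * 4, "T", "above"), 2))
```

The argument order mirrors the calls to `norm_rv` in the examples that follow, but that correspondence is an assumption.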

### Example 1: Proportion above a value

A carnival ride has cars that each hold $4$ adult passengers. The weights of the passengers for this ride are normally distributed with a mean of $65 \text{kg}$ and a standard deviation of $12 \text{kg}$. Assume that the weights of passengers are independent from each other.

Let $T =$ the total weight of $4$ selected adult passengers for this ride.

**Find the probability that the total weight exceeds $290 \text{kg}$.**  
_You may round your answer to two decimal places._

In [35]:
dp = 290
mus = [65]*4
sds = [12]*4
method = 'T'
area = 'above'
p = round(norm_rv(dp, mus, sds, method, area), 2)

display(Latex(f'$P(T>290) = {p}$'))

<IPython.core.display.Latex object>

### Example 2: Proportion between two values

Some nations require their students to pass an exam before earning their primary school degrees or diplomas. A certain nation gives students an exam whose scores are normally distributed with a mean of $41$ points and a standard deviation of $9$ points.

Suppose we select $2$ of these testers at random, and define the random variable $D$ as the difference between their scores. We can assume that their scores are independent.

**Find the probability that their scores are within $10$ points of each other.**  
_You may round your answer to two decimal places._

In [36]:
dp = [-10, 10]
mus = [41]*2
sds = [9]*2
method = 'D'
area = 'between'
p = round(norm_rv(dp, mus, sds, method, area), 2)

display(Latex(rf'$P(|D| \leq 10) = {p}$'))

<IPython.core.display.Latex object>

### Example 3: Proportion above a value

A breakfast cereal producer makes its most popular product by combining just raisins and flakes in each box of cereal. The amounts of flakes in the boxes of this cereal are normally distributed with a mean of $370\text{g}$ and a standard deviation of $24\text{g}$. The amounts of raisins are also normally distributed with a mean of $170\text{g}$ and a standard deviation of $7\text{g}$.

Let $T =$ the total amount of product in a randomly selected box, and assume that the amounts of flakes and raisins are independent of each other.

**Find the probability that the total amount of product exceeds $515\text{g}$.**  
_You may round your answer to two decimal places._

In [37]:
dp = 515
mus = [370, 170]
sds = [24, 7]
method = 'T'
area = 'above'
p = round(norm_rv(dp, mus, sds, method, area), 2)

display(Latex(f'$P(T>515) = {p}$'))

<IPython.core.display.Latex object>

### Example 4: Proportion below a value

Suppose that populations of men and women have the following summary statistics for their heights (in centimeters):

- Men: $\mu_M=172$, $\sigma_M=7.2$
- Women: $\mu_W=162$, $\sigma_W=5.4$

Both distributions are approximately normal. Suppose we randomly select one man and one woman from their respective populations and calculate the difference between their heights. We can assume that their heights are independent.

**Find the probability that the woman is taller than the man.**  
_You may round your answer to two decimal places._

In [38]:
dp = 0
mus = [172, 162]
sds = [7.2, 5.4]
method = 'D'
area = 'below'
p = norm_rv(dp, mus, sds, method, area)
p = round(p, 2)

# With D = (man's height) - (woman's height), the woman is taller
# than the man exactly when D < 0.
display(Latex(r'$P(\text{woman taller}) = P(D < 0) = $ ' + f'${p}$'))

<IPython.core.display.Latex object>

---

## Binomial Random Variables

- Each trial can be classified as a success or a failure.
- Number of trials is fixed.
- Trials are independent:
    - 10% rule: when sampling without replacement from a finite population, trials can be treated as independent if the sample size is at most $10\%$ of the population size.
    - Finite population, sampling with replacement: trials are independent.
    - Infinite population: trials are independent.
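
The 10% rule can be illustrated by comparing the exact probability for sampling without replacement (hypergeometric) against the binomial probability. The populations below are made-up numbers chosen for illustration, and both helper functions are sketches written with the standard library's `math.comb`.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeom_pmf(k, n, successes, population):
    """P(X = k) when drawing n items without replacement from a finite population."""
    return comb(successes, k) * comb(population - successes, n - k) / comb(population, n)

n, k, p = 10, 2, 0.2

# Sample is 1% of the population: sampling without replacement is nearly binomial.
small_fraction = hypergeom_pmf(k, n, successes=200, population=1000)
# Sample is 50% of the population: the binomial model is a poor approximation.
large_fraction = hypergeom_pmf(k, n, successes=4, population=20)

print(round(binom_pmf(k, n, p), 4))
print(round(small_fraction, 4), round(large_fraction, 4))
```

In both populations $20\%$ of the items are successes, so any gap between the exact value and the binomial value comes only from the lack of independence between draws.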

### Example 1

Based on previous data, an electronics manufacturer knows that $2\%$ of its computer processors are defective. Suppose the manufacturer randomly selects these processors until one is found with a defect. Let $D$ represent the number of processors it takes to find the first one that is defective. Assume that defective processors are independent.

**Is $D$ a binomial variable? Why or why not?**

**Ans**: No. The manufacturer keeps selecting processors until the first defect appears, so there is no fixed number of trials and $D$ is not a binomial variable.

---

# Binomial Random Variables

See also [Binomial Distribution](./binomial_distribution.ipynb)

---

# Normal Random Variables

See also [Normal Distribution](./normal_distribution.ipynb)