In [None]:
import math
import scipy.stats as st

import ipywidgets as widgets
from ipywidgets import interact
from itertools import combinations, permutations, combinations_with_replacement, product

# Chapter 4: Discrete Random Variables

* `Discrete data`: Data you can count.
* `Random variable`: describes the outcomes of a statistical experiment in words.
    * The values of a random variable can vary with each repetition of an experiment.

## Random Variable Notation
* **Upper case** letters such as $X$ or $Y$ denote a random variable. 
* **Lower case** letters such as $x$ or $y$ denote the _value_ of a random variable.
> Said another way, $X$ and $Y$ are described in words, whereas $x$ and $y$ are given as numbers.

##### <span style="color:orange">Example:</span>
Let:
* $X$ = the number of heads you get when you toss three fair coins.
* The sample space for the toss of three fair coins is:
    * TTT; THH; HTH; HHT; HTT; THT; TTH; HHH

Then:
* $x$ = 0, 1, 2, 3

Notice $X$ is in words and $x$ is a number.  $x$ values are countable outcomes.

In [None]:
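flip_options = 'HT'
# combinations_with_replacement ignores order, so it returns only 4 unordered
# selections (HHH, HHT, HTT, TTT), not the full sample space of 8 outcomes.
output_values = list(combinations_with_replacement(flip_options, 3))
[''.join(x) for x in output_values]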

In [None]:
# product with repeat=3 keeps order, giving all 2**3 = 8 outcomes of the sample space.
output_values = list(product(flip_options, repeat=3))
[''.join(x) for x in output_values]
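
To connect the sample space to the values $x$ = 0, 1, 2, 3, the sketch below counts the heads in each of the eight ordered outcomes and turns those counts into probabilities. It assumes the coins are fair, so every outcome is equally likely; the names `heads_counts` and `pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 ordered outcomes of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# The value x of the random variable X is the number of heads in an outcome.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# With fair coins every outcome is equally likely, so P(x) = (matching outcomes) / 8.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(heads_counts.items())}
for x, p in pmf.items():
    print(x, p)
```

The keys of `pmf` are the countable values $x$, while $X$ itself is the verbal description "number of heads in three tosses".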

## 4.1 Probability Distribution Function (PDF) for a Discrete Random Variable
Two Characteristics:
1. Each probability is between zero and one, inclusive.
2. The sum of the probabilities is one.

##### <span style="color:orange">Example 4.1:</span>
$P(x)$ = probability that $X$ takes on a value of $x$.

|$x$|$P(x)$|
|--|--|
|0|$P(x=0)=\frac{2}{50}$|
|1|$P(x=1)=\frac{11}{50}$|
|2|$P(x=2)=\frac{23}{50}$|
|3|$P(x=3)=\frac{9}{50}$|
|4|$P(x=4)=\frac{4}{50}$|
|5|$P(x=5)=\frac{1}{50}$|

$X$ takes on the values 0, 1, 2, 3, 4, 5.  This is a discrete PDF because:
* Each $P(x)$ is between **zero** and **one**, inclusive.
* The sum of the probabilities is **one**, that is,
    * $\frac{2}{50} + \frac{11}{50} +\frac{23}{50} +\frac{9}{50} +\frac{4}{50} +\frac{1}{50} = 1$
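
The two characteristics can be checked mechanically. The following is a minimal sketch, not part of the original example: `is_valid_pmf` is a hypothetical helper name, and exact fractions are used so that the sum can be compared with 1 without rounding error.

```python
from fractions import Fraction

def is_valid_pmf(probabilities):
    """Return True if every value is in [0, 1] and the values sum to exactly 1."""
    values = list(probabilities.values())
    return all(0 <= p <= 1 for p in values) and sum(values) == 1

# Counts out of 50 taken from the table in Example 4.1.
example_4_1 = {x: Fraction(n, 50) for x, n in enumerate([2, 11, 23, 9, 4, 1])}
print(is_valid_pmf(example_4_1))  # True
```

If the probabilities are stored as floats instead, comparing the sum with `math.isclose(total, 1.0)` is a safer choice than `==`.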


##### <span style="color:orange">Example 4.2:</span>
Suppose Nancy has classes **three days** a week.  She attends classes three days a week **80%** of the time, **two days 15%** of the time, **one day 4%** of the time, and **no days 1%** of the time.  Suppose one week is randomly selected.

Solutions:

a. Let $X$ = the number of days Nancy **attends class per week**.

b. $X$ takes on what values?
* 0, 1, 2, 3

c. Suppose one week is randomly chosen.  Construct a probability distribution table (called a PDF table) like the one in Example 4.1.  The table should have two columns labeled $x$ and $P(x)$.  What does the $P(x)$ column sum to?

|$x$|$P(x)$|
|--|--|
|0|0.01|
|1|0.04|
|2|0.15|
|3|0.80|

$$\therefore \sum_{x=0}^3 P(x) = 0.01+0.04+0.15+0.80=1.0$$



## 4.2 Mean or Expected Value and Standard Deviation
* `Expected Value`: often referred to as the **"long-term" average or mean**.  This means that over the long term of doing an experiment over and over, you would **expect** this average.
* `Probability`: does not describe the short-term results of an experiment. It gives information about what can be expected in the long term (illustrating the **Law of Large Numbers**).
* `The Law of Large Numbers`: states that, as the number of trials in a probability experiment increases, the difference between the theoretical probability of an event and the relative frequency approaches zero (**the theoretical probability and the relative frequency get closer and closer together**).
* $\mu$ : the **mean** or **expected value** of the experiment is denoted by the Greek letter $\mu$. After conducting many trials of an experiment, you would expect this average value.

> Note: To find the expected value or long-term average, $\mu$, simply multiply each value of the random variable by its probability and add the products.
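
The Law of Large Numbers can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the text: the experiment is a fair coin toss with theoretical probability 0.5 for heads, and the seed and trial counts are arbitrary choices.

```python
import random

def relative_frequency_of_heads(num_tosses, rng):
    """Toss a simulated fair coin num_tosses times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses

rng = random.Random(0)  # fixed seed (arbitrary) so that reruns give the same numbers
for n in (10, 100, 10_000, 100_000):
    freq = relative_frequency_of_heads(n, rng)
    print(f"n={n:>7}  relative frequency={freq:.4f}  |difference|={abs(freq - 0.5):.4f}")
```

The difference does not have to shrink at every step, but it tends toward zero as the number of tosses grows.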


##### <span style="color:orange">Example 4.3:</span>

A men's soccer team plays soccer zero, one, or two days a week. The probability that they play zero days is 0.2,
the probability that they play one day is 0.5, and the probability that they play two days is 0.3. Find the long-term
average or expected value, $\mu$, of the number of days per week the men's soccer team plays soccer.

To solve this, let:
* the random variable $X$ = the number of days the men's soccer team plays soccer per week.
* $X$ takes on the values 0, 1, 2
* Construct a PDF table and add a column $x \cdot P(x)$. In this column, multiply each $x$ value by its probability.

##### The Expected Value Table:
|$x$|$P(x)$|$x \cdot P(x)$|
|--|--|--|
|0|0.2|(0)(0.2)=0|
|1|0.5|(1)(0.5)=0.5|
|2|0.3|(2)(0.3)=0.6|

The _long term average_ or _expected value_ is 0 + 0.5 + 0.6 = 1.1

The number 1.1 is the long-term average or expected value if the men's soccer team plays soccer week after week after week.

We say $\mu=1.1$
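
The same table can be reproduced in a few lines. The sketch below stores the PDF as a dictionary (an illustrative choice) and uses a hypothetical helper, `expected_value`, that sums $x \cdot P(x)$.

```python
# PDF from Example 4.3: key = days played per week (x), value = P(x).
soccer_pdf = {0: 0.2, 1: 0.5, 2: 0.3}

def expected_value(pdf):
    """Return the sum of x * P(x) over every value x."""
    return sum(x * p for x, p in pdf.items())

mu = expected_value(soccer_pdf)
print(round(mu, 4))  # 1.1
```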

##### <span style="color:orange">Example 4.4:</span>
Find the expected value of the number of times a newborn baby's crying wakes its mother after midnight.
The expected value is the expected number of times per week a newborn baby's crying wakes its mother after
midnight. Calculate the standard deviation of the variable as well.

|$x$|$P(x)$|$x \cdot P(x)$|$(x-\mu)^2 \cdot P(x)$|
|--|--|--|--|
|0|$P(x=0)=\frac{2}{50}$|$(0) (\frac{2}{50} ) = 0$|$(0-2.1)^2 \cdot 0.04 = 0.1764$|
|1|$P(x=1)=\frac{11}{50}$|$(1) (\frac{11}{50} ) = \frac{11}{50}$|$(1-2.1)^2 \cdot 0.22 = 0.2662$|
|2|$P(x=2)=\frac{23}{50}$|$(2) (\frac{23}{50} ) = \frac{46}{50}$|$(2-2.1)^2 \cdot 0.46 = 0.0046$|
|3|$P(x=3)=\frac{9}{50}$|$(3) (\frac{9}{50} ) = \frac{27}{50}$|$(3-2.1)^2 \cdot 0.18 = 0.1458$|
|4|$P(x=4)=\frac{4}{50}$|$(4) (\frac{4}{50} ) = \frac{16}{50}$|$(4-2.1)^2 \cdot 0.08 = 0.2888$|
|5|$P(x=5)=\frac{1}{50}$|$(5) (\frac{1}{50} ) = \frac{5}{50}$|$(5-2.1)^2 \cdot 0.02 = 0.1682$|

Add the values in the third column of the table to find the expected value of $X$:

$\mu$ = Expected Value = $\frac{105}{50}=2.1$

Use $\mu$ to complete the table.  The fourth column of this table will provide the values you need to calculate the standard deviation.  For each value $x$, multiply the square of its _deviation_ by its _probability_.  (Each deviation has the format $x-\mu$).

Add the values in the fourth column of the table:

0.1764 + 0.2662 + 0.0046 + 0.1458 + 0.2888 + 0.1682 = 1.05

The _standard deviation_ of $X$ is the square root of this sum: $\sigma = \sqrt{1.05} \approx 1.0247$
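
As a cross-check of the table arithmetic, the mean, variance, and standard deviation can be computed directly from the PDF. This is a minimal sketch that uses exact fractions for the probabilities; the variable names are illustrative.

```python
import math
from fractions import Fraction

# P(x) for x = 0..5, as counts out of 50 from the table above.
pdf = {x: Fraction(n, 50) for x, n in enumerate([2, 11, 23, 9, 4, 1])}

mu = sum(x * p for x, p in pdf.items())                     # third column, summed
variance = sum((x - mu) ** 2 * p for x, p in pdf.items())   # fourth column, summed
sigma = math.sqrt(variance)

print(float(mu), float(variance), round(sigma, 4))  # 2.1 1.05 1.0247
```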

##### <span style="color:orange">Example 4.5:</span>

Suppose you play a game of chance in which five numbers are chosen from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. A computer randomly selects five numbers from zero to nine with replacement. You pay \\$2 to play and could profit \\$100,000 if you match all five numbers in order (you get your \\$2 back plus \\$100,000). Over the long term, what is your **expected** profit of playing the game?

To do this problem, set up an expected value table for the amount of money you can profit.

Let $X$ = the amount of money you profit. The values of $x$ are not 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Since you are interested
in your profit (or loss), the values of $x$ are 100,000 dollars and −2 dollars.

To win, you must get all five numbers correct, in order. The probability of choosing one correct number is $\frac{1}{10}$ because there are ten numbers. You may choose a number more than once. The probability of choosing all five numbers correctly and in order is:
* $\frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} = 1 \cdot 10^{-5} = 0.00001$

Therefore, the probability of:
* winning is 0.00001
* losing is 1 - 0.00001 = 0.99999

The expected value table is as follows:

||$x$|$P(x)$|$x \cdot P(x)$|
|--|--|--|--|
|Loss|-2|0.99999|$-2 \cdot 0.99999 = -1.99998$|
|Profit|100,000|0.00001|$100000 \cdot 0.00001 = 1$|

Adding the last column gives $\mu = -1.99998 + 1 = -0.99998$. Since –0.99998 is about –1, you would, on average, expect to lose approximately \\$1 for each game you play. However, each time you play, you either lose \\$2 or profit \\$100,000. The \\$1 is the average or expected LOSS per game after playing this game over and over.
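
The expected profit is the sum of the two products $x \cdot P(x)$, which can be verified exactly. The sketch below uses fractions to avoid rounding; the variable names are illustrative and not from the original problem.

```python
from fractions import Fraction

p_win = Fraction(1, 10) ** 5   # five digits, each matched with probability 1/10
p_lose = 1 - p_win

# x is the profit in dollars: +100,000 on a win, -2 on a loss.
expected_profit = 100_000 * p_win + (-2) * p_lose
print(float(expected_profit))  # -0.99998
```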

##### <span style="color:orange">Example 4.6:</span>

Suppose you play a game with a biased coin.  You play each game by tossing the coin once.
* $P(\text{heads})= \frac{2}{3}$ 
* $P(\text{tails})= \frac{1}{3}$ 

If you toss a head, you pay \\$6.  If you toss a tail, you win \\$10.

If you play this game many times, will you come out ahead?

a. Define a random variable $X$.
* $X$ = amount of profit

b. Compute the following expected value table:

||$x$|$P(x)$|$x \cdot P(x)$|
|--|--|--|--|
|LOSE|-6|$\frac{2}{3}$|$\frac{-12}{3}$|
|WIN|10|$\frac{1}{3}$|$\frac{10}{3}$|

c.  What is the expected value, $\mu$?  Do you come out ahead?
* Add the last column of the table.  The expected value $\mu = \frac{-2}{3}$. You lose, on average, about 67 cents each time you play the game, so you do not come out ahead.
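
To see what "playing many times" looks like, the exact expected value can be compared with a simulated average. This is a sketch only: the number of plays and the seed are arbitrary assumptions, and `average_profit` is a hypothetical helper.

```python
import random
from fractions import Fraction

# Profit x -> P(x) for the biased-coin game.
game = {-6: Fraction(2, 3), 10: Fraction(1, 3)}
mu = sum(x * p for x, p in game.items())

def average_profit(num_plays, rng):
    """Simulate num_plays tosses: heads (probability 2/3) pays -6, tails pays +10."""
    total = sum(-6 if rng.random() < 2 / 3 else 10 for _ in range(num_plays))
    return total / num_plays

rng = random.Random(0)  # fixed seed (arbitrary) for repeatable output
print(mu)                            # exact expected value: -2/3
print(average_profit(100_000, rng))  # simulated long-run average per game
```

The simulated average should land near $-\frac{2}{3}$, although any single run will differ slightly.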

## 4.3 Binomial Distribution
There are three characteristics of a binomial experiment.
1. There is a fixed number of trials. Think of trials as repetitions of an experiment.
    * $n$ denotes the number of trials.
2. There are only two possible outcomes, called "success" and "failure," for each trial.
    * $p$ denotes the probability of a success on one trial, and
    * $q$ denotes the probability of a failure on one trial.
    * $p + q = 1$.
3. The $n$ trials are independent and are repeated using identical conditions. Because the $n$ trials are independent, the outcome of one trial does not help in predicting the outcome of another trial. Another way of saying this is that for each individual trial, the probability, $p$, of a success and probability, $q$, of a failure remain the same. For example, randomly guessing at a true-false statistics question has only two outcomes. If a success is guessing correctly, then a failure is guessing incorrectly. Suppose Joe always guesses correctly on any statistics true-false question with probability $p = 0.6$. Then, $q = 0.4$. This means that for every true-false statistics question Joe answers, his probability of success ($p = 0.6$) and his probability of failure ($q = 0.4$) remain the same.

The outcomes of a binomial experiment fit a **binomial probability distribution**. The random variable $X$ = the number of
successes obtained in the $n$ independent trials.

The mean, $\mu$, and variance, $\sigma^2$, for the binomial probability distribution are $\mu=np$ and $\sigma^2 = npq$. The standard deviation, $\sigma$,
is then $\sigma = \sqrt{npq}$.
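
The formulas $\mu = np$ and $\sigma^2 = npq$ can be checked numerically against the definitions of mean and variance. The sketch below relies on the standard binomial probability $P(X = k) = \binom{n}{k} p^k q^{n-k}$, which is not stated in the text above; $p = 0.6$ is Joe's success probability, while $n = 10$ is an assumed number of questions chosen only for illustration.

```python
from math import comb, isclose, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.6   # n = 10 is an assumption; p = 0.6 comes from the Joe example
q = 1 - p

# Mean and variance from their definitions: sum of x*P(x) and (x - mu)^2 * P(x).
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
variance = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))

print(isclose(mean, n * p), isclose(variance, n * p * q))  # True True
print(round(sqrt(n * p * q), 4))                           # standard deviation
```

`math.comb` requires Python 3.8 or later.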

Any experiment that has characteristics two and three and where $n = 1$ is called a **Bernoulli Trial** (named after Jacob
Bernoulli who, in the late 1600s, studied them extensively). A binomial experiment takes place when the number of
successes is counted in one or more Bernoulli Trials.

##### <span style="color:orange">Example 4.6:</span>
A fair coin is flipped 15 times.  Each flip is independent. What is the probability of getting more than ten heads?

Answer:

* Let $X$ = the number of heads in 15 flips of the fair coin.
* $X$ takes on the values 0, 1, 2, 3, ..., 15.
* Since the coin is fair
    * $p=0.5$
    * $q=0.5$
* The number of trials is:
    * $n=15$
* State the probability question mathematically:
    * $P(x \gt 10)$
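
One way to evaluate $P(x \gt 10)$ is to add the binomial probabilities for 11 through 15 heads. The following is a minimal sketch using the standard binomial formula with only the standard library; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 15, 0.5  # 15 flips of a fair coin

# "More than ten heads" means 11, 12, 13, 14, or 15 heads.
prob_more_than_ten = sum(binomial_pmf(k, n, p) for k in range(11, n + 1))
print(round(prob_more_than_ten, 4))  # 0.0592
```

If SciPy is available, the survival function `st.binom.sf(10, 15, 0.5)` should give the same value, since it returns the probability of a result strictly greater than 10.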