# Lecture 2: Story Proofs, Axioms of Probability


## Stat 110, Prof. Joe Blitzstein, Harvard University

----

## Sampling, continued

Choose $k$ objects out of $n$

|           | ordered | unordered |
|-----------|:---------:|:-----------:|
| __w/ replacement__   | $n^k$     |   ???     |
| __w/o replacement__  | $n(n-1)(n-2) \ldots (n-k+1)$ | $\binom{n}{k}$  |


* __ordered, w/ replacement__: there are $n$ choices for each of the $k$ positions, so the count $n^k$ follows from the multiplication rule.
* __ordered, w/out replacement__: there are $n$ choices for the 1<sup>st</sup> position; $n-1$ for the 2<sup>nd</sup>; $n-2$ for the 3<sup>rd</sup>; and $n-k+1$ for the $k$<sup>th</sup>.
* __unordered, w/ replacement__: _we will get to this shortly..._
* __unordered, w/out replacement__: the binomial coefficient; think of choosing a hand from a deck of cards.

To complete our discussion of sampling, recall that of the four ways of sampling as shown above, all except the case of __unordered, with replacement__ follow immediately from the multiplication rule. 
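
The three counts that come from the multiplication rule can be sanity-checked by brute-force enumeration. The sketch below uses Python's `itertools` and `math` modules; the values $n=5$ and $k=3$ are illustrative assumptions, not taken from the lecture.

```python
from itertools import combinations, permutations, product
from math import comb, perm

# Illustrative values (an assumption, not from the lecture).
n, k = 5, 3
items = range(n)

# Ordered, with replacement: every length-k sequence over n items.
ordered_with = sum(1 for _ in product(items, repeat=k))

# Ordered, without replacement: length-k sequences with no repeats.
ordered_without = sum(1 for _ in permutations(items, k))

# Unordered, without replacement: size-k subsets.
unordered_without = sum(1 for _ in combinations(items, k))

assert ordered_with == n ** k            # n^k
assert ordered_without == perm(n, k)     # n(n-1)...(n-k+1)
assert unordered_without == comb(n, k)   # binomial coefficient
```

Enumeration only works for small $n$ and $k$, but it is a quick way to confirm that a counting formula matches what it claims to count.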

For the __unordered, with replacement__ case, the answer is $\binom{n+k-1}{k}$; let's see if we can prove this.

### A simple proof

We start off with some simple edge cases.

If we let $k=0$, then we are not choosing anything, and so there is exactly one way to do this: the empty set.
\begin\{align\}
    \text{let }k = 0  \Rightarrow \binom{n+0-1}{0} &= \binom{n-1}{0} \\\\
    &= 1
\end\{align\}

If we let $k=1$, then there are $n$ ways we could select a single item out of a total of $n$. 
\begin\{align\}
    \text{let }k = 1  \Rightarrow \binom{n+1-1}{1} &= \binom{n}{1} \\\\
    &= n
\end\{align\}

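Now let's consider a simple but non-trivial case. If we let $n=2$, then a selection is fully determined by how many of the $k$ picks are the first object, which can be anything from $0$ to $k$, so there are $k+1$ possibilities. The formula agrees: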
\begin\{align\}
    \text{let }n = 2  \Rightarrow \binom{2+k-1}{k} &= \binom{k+1}{k} \\\\
    &= \binom{k+1}{1} \\\\
    &= k+1
\end\{align\}

Here's an example of $n=2, k=7$:

![title](images/L0201.png)

But notice that what we are really doing here is placing $n-1$ dividers among $k$ elements. In other words, we are choosing $k$ slots for the elements out of $n+k-1$ slots in total; the remaining $n-1$ slots hold the dividers.

![title](images/L0202.png)

And we can easily extend this picture to other values of $n$ and $k$.

![title](images/L0203.png)

And the number of ways to select $k$ items out of $n$, unordered and with replacement, is:

\begin\{align\}
    \text{choose } k \text{ out of } n \text{ items, unordered, with replacement}  &= \binom{n+k-1}{k} \\\\
                                        &= \binom{n+k-1}{n-1}
\end\{align\}
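
The divider picture can be turned into a small program. The sketch below is one possible rendering, with a hypothetical helper name `multisets_via_slots`: it picks which $n-1$ of the $n+k-1$ slots hold dividers, reads off how many elements fall in each of the $n$ groups, and compares the result with Python's built-in `itertools.combinations_with_replacement`. The values $n=3$, $k=4$ are illustrative assumptions.

```python
from itertools import combinations, combinations_with_replacement
from math import comb


def multisets_via_slots(n, k):
    """Yield each unordered selection of k items from n types as a tuple of counts.

    Hypothetical helper: chooses the n-1 divider positions among n+k-1 slots;
    the gaps between dividers give the count for each item type.
    """
    total = n + k - 1
    for dividers in combinations(range(total), n - 1):
        counts = []
        prev = -1
        for d in dividers:
            counts.append(d - prev - 1)   # elements between consecutive dividers
            prev = d
        counts.append(total - prev - 1)   # elements after the last divider
        yield tuple(counts)


# Illustrative values (an assumption, not from the lecture).
n, k = 3, 4
from_slots = set(multisets_via_slots(n, k))

# Direct enumeration, converted to the same tuple-of-counts form.
direct = {
    tuple(c.count(i) for i in range(n))
    for c in combinations_with_replacement(range(n), k)
}

assert from_slots == direct
assert len(from_slots) == comb(n + k - 1, k)
```

Choosing divider positions rather than element positions corresponds to the $\binom{n+k-1}{n-1}$ form of the count; both forms give the same number.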

## Story Proof
A story proof is a proof by _interpretation_. No algebra needed, just intuition.

Here are some examples that we have already come across.

### Ex. 1 
$$ \binom{n}{k} = \binom{n}{n-k} $$

Choosing which $k$ elements out of $n$ to include is the same as choosing which $n-k$ elements to leave out. We've just seen this above!

### Ex. 2
$$ n \binom{n-1}{k-1} = k \binom{n}{k} $$

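Imagine picking $k$ people out of $n$, and then designating one of the $k$ as president. You can either select all $k$ people and then choose 1 president from among those $k$ (the right-hand side), or you can select a president first from the $n$ people and then choose the remaining $k-1$ out of the other $n-1$ people (the left-hand side). Both procedures count the same thing, so the two sides are equal.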
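
To see that both procedures really produce the same collection of (group, president) outcomes, here is a minimal enumeration sketch. The names `team_first` and `pres_first` and the values $n=6$, $k=3$ are illustrative assumptions.

```python
from itertools import combinations
from math import comb

# Illustrative values (an assumption, not from the lecture).
n, k = 6, 3
people = range(n)

# Procedure 1: pick the k people, then pick a president among them.
team_first = {
    (frozenset(team), pres)
    for team in combinations(people, k)
    for pres in team
}

# Procedure 2: pick the president, then pick the other k-1 members.
pres_first = {
    (frozenset((pres,) + rest), pres)
    for pres in people
    for rest in combinations([p for p in people if p != pres], k - 1)
}

assert team_first == pres_first
assert len(team_first) == k * comb(n, k) == n * comb(n - 1, k - 1)
```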

### Ex. 3
$$ \binom{m+n}{k} = \sum_{j=0}^{k} \binom{m}{j} \binom{n}{k-j} $$

Suppose you had $m$ boys and $n$ girls, and you needed to select $k$ children out of them all. You could do this by first choosing $j$ out of the $m$ boys, and then choosing the remaining $k-j$ out of the $n$ girls. For a fixed $j$, the multiplication rule gives $\binom{m}{j} \binom{n}{k-j}$ ways; since the cases for different $j$ do not overlap, summing over $j = 0, 1, \ldots, k$ gives the total. This is known as [Vandermonde's identity](https://en.wikipedia.org/wiki/Vandermonde%27s_identity).
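
A quick numerical check of the identity, as a sketch: the function name `vandermonde_rhs` is hypothetical, and the test values are arbitrary. It relies on `math.comb(a, b)` returning $0$ when $b > a$, which matches the convention that there is no way to choose more children than are available.

```python
from math import comb


def vandermonde_rhs(m, n, k):
    """Sum over j of (ways to pick j of m boys) * (ways to pick k-j of n girls)."""
    return sum(comb(m, j) * comb(n, k - j) for j in range(k + 1))


# Arbitrary small values (an assumption, not from the lecture).
for m, n, k in [(4, 5, 3), (2, 7, 5), (3, 3, 6), (0, 4, 2)]:
    assert vandermonde_rhs(m, n, k) == comb(m + n, k)
```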

----

## Non-na&iuml;ve Definition of Probability

Now we move from the na&iuml;ve definition of probability into the more abstract and general.

#### Definition: non-na&iuml;ve definition of probability
> Let $S$ be a sample space, the set of all possible outcomes of some experiment. $S$ no longer has to be _finite_, and the outcomes no longer have to be _equally likely_.
> 
> Let $A$ be an event in, or a subset of, $S$.
>
> Let $P$ be a function that takes an event $A \subseteq S$ as input and returns a value $P(A)$ between $0$ and $1$.

And we have the following axioms:

### Axiom 1

> \begin\{align\}
>    P(\emptyset) = 0 \\\\
>    P(S) = 1
> \end\{align\}

The probability of the empty set, or a null event, is by definition $0$.

The probability of the entire space is by definition $1$.

These are the 2 extremes, and this is why Prof. Blitzstein lumps them together in one rule.

### Axiom 2

> $$ P\left(\bigcup_{n=1}^{\infty} A_{n}\right) = \sum_{n=1}^{\infty} P(A_{n}) \quad \text{if } A_1, A_2, \ldots \text{ are disjoint (non-overlapping)} $$

Every theorem about probability follows from these 2 rules. You might want to have a look at [Kolmogorov's axioms](http://mathworld.wolfram.com/KolmogorovsAxioms.html).
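
As a concrete illustration, here is a minimal sketch of a probability function on a finite sample space whose outcomes are not equally likely. The loaded die and its weights are made up for this example; `P` simply adds up the weights of the outcomes in an event. Only finite unions are checked here, which is a special case of the countable additivity stated in Axiom 2.

```python
from fractions import Fraction

# Hypothetical loaded die: face 6 is three times as likely as any other face.
weights = {
    1: Fraction(1, 8),
    2: Fraction(1, 8),
    3: Fraction(1, 8),
    4: Fraction(1, 8),
    5: Fraction(1, 8),
    6: Fraction(3, 8),
}
S = set(weights)  # the sample space


def P(event):
    """Probability of an event (a subset of S): the sum of its outcomes' weights."""
    return sum((weights[s] for s in event), Fraction(0))


# Axiom 1: the two extremes.
assert P(set()) == 0
assert P(S) == 1

# Axiom 2 (finite case): additivity holds for disjoint events.
A, B = {1, 2}, {6}
assert A.isdisjoint(B)
assert P(A | B) == P(A) + P(B)

# For overlapping events, simply adding the probabilities overcounts.
C = {2, 6}
assert not A.isdisjoint(C)
assert P(A | C) < P(A) + P(C)
```

Note that the naive definition would assign $P(\{6\}) = 1/6$ here; the general definition lets $P$ reflect the unequal weights while still satisfying both axioms.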

----

View [Lecture 2: Story Proofs, Axioms of Probability | Statistics 110](http://bit.ly/2nOw0JV) on YouTube.