# Probability

## Probability versus Statistics

In probability, we assume we know what the world is like, and predict what we would then see.

In statistics, we make observations and use them to learn something about the world.
That generally requires a mathematical model for how the observations were generated.
Sometimes, that model comes from knowledge of the process that generated the data,
for instance, a randomized experiment or a physical measurement where the
underlying physics is understood.
In other cases, the model is basically an assumption.
When the model does not have a firm basis in actual knowledge, inferences are suspect.
As Robert L. Parker says, "the more you assume, the less you know."

Probability is like a _forward problem_ in science or engineering: the physics, including any parameters,
is known. The issue is to predict what will be observed, up to instrumental error and other noise.

Statistics, at least inferential statistics, is like an _inverse problem_. The physics is known
but there are unknown parameters. The issue is to use the (noisy) data to learn about parameters
of the system.

Probability is a combination of mathematics and philosophy.
The philosophy connects the mathematics to the world.

We will start with the mathematics.

## Naive Set Theory


The mathematics of probability is expressed most naturally using set theory.
We will review the basic terminology of naive set theory: how to define and
manipulate sets, operations on sets that yield other sets, special
relationships among sets, and so on.
Translating word problems into the language of set theory is crucial in solving logic and
probability problems.
Venn diagrams are useful for visualizing the relationships among sets.
In addition to its crucial role in probability, 
set theory is intimately connected to
categorical logic and to propositional logic.



A <span class="termOfArt">set</span> is a collection of things (called the
<span class="termOfArt">elements</span> of the
set or the <span class="termOfArt">members</span> of the set)
without regard to their order.
We often define sets by
listing their contents within curly braces \{ \}.
For example, $\{1, 2, 3\}$ is the set whose elements are the numbers 1, 2, and 3.
Another way to define a set is to characterize
its elements.
For example $\{x : P(x)\}$
is the set of all values of $x$ for which
$P(x)$ is true.
(The function $P(x)$ is called the
<span class="termOfArt">predicate function</span> or
<span class="termOfArt">membership function</span>.)

Sets usually are defined with respect to a universe of things
that contains everything of interest.
The symbol $\mathbf{S}$ denotes the universe.

The set with no elements is denoted $\{\}$
or $\emptyset$.
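
The two ways of defining a set can be sketched in Python, whose built-in `set` type also ignores order. This is only an illustration over a small finite universe; the names `universe`, `is_even`, and `evens` are ours, not part of the text.

```python
# A set defined by listing its elements; order and repetition do not matter.
A = {1, 2, 3}

# A small finite universe to draw elements from (an assumption for this sketch).
universe = set(range(1, 11))

def is_even(x):
    """Hypothetical predicate P(x): true when x is even."""
    return x % 2 == 0

# Set-builder notation {x : P(x)}, restricted to the universe.
evens = {x for x in universe if is_even(x)}

# The empty set. Note that {} in Python is an empty dict, not an empty set.
empty = set()

print(A == {3, 1, 2, 2})   # True: same elements
print(sorted(evens))       # [2, 4, 6, 8, 10]
print(len(empty))          # 0
```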


<span class="termOfArt">Venn diagrams</span>
represent sets and the relationships among sets pictorially.
A Venn diagram represents the universe $\mathbf{S}$ by a two-dimensional
region (usually a rectangle or a circle).
Sets within $\mathbf{S}$ are denoted by closed regions
within $\mathbf{S}$.
Shading or highlighting is used in Venn diagrams to draw attention
to special relationships or sets.
Here is a Venn diagram for $\mathbf{S}$.


<div class="venn">
<div class="vennCaption">Venn Diagram for $\mathbf{S}$</div>
<div class="vennUniverse left blue">$\mathbf{S}$</div>
</div>


If $x$ is in the set $A$ we write
$x\in A$, pronounced &quot;x is an element of
A&quot; or &quot;x is a member of A.&quot;
Equivalently, we write $A \ni x$, pronounced &quot;A contains x.&quot;
If $x$ is not an element of $A$
we write $x \not\in A$.
Here is a Venn diagram showing $\mathbf{S}$,
a set $A$ within $\mathbf{S}$,
and a point $x\in A$.


<div class="venn">
<div class="vennCaption">Venn Diagram for $x\in A$</div>
<div class="vennUniverse left">$\mathbf{S}$</div>
<div class="rect60x70_center blue left">$A$
<p class="math">&middot;$x$</p>
</div>
</div>


If the sets $A$ and $B$
have exactly the same elements, we say that the sets are <em>equal</em> and we
write $A = B$.
Remember that order does not matter: the set $\{a, b, c\}$
is equal to the set $\{c, a, b\}$, but not to the set
$\{a, b, d\}$ nor to the set $\{a, b, c, d\}$.
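
Membership and equality can be checked directly in Python. The sketch below simply mirrors the examples in the text, with the elements represented as one-letter strings (a choice made here for illustration).

```python
A = {"a", "b", "c"}

# Membership: x in A corresponds to "x is an element of A".
a_is_member = "a" in A          # True
d_is_not_member = "d" not in A  # True

# Equality depends only on the elements, not on the order they are listed.
same = (A == {"c", "a", "b"})             # True
different_1 = (A == {"a", "b", "d"})      # False
different_2 = (A == {"a", "b", "c", "d"}) # False

print(a_is_member, d_is_not_member, same, different_1, different_2)
```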

The <span class="termOfArt">complement</span> of the set $A$
(with respect to the universe
$\mathbf{S}$), denoted $A^c$,
is the set of all things in $\mathbf{S}$ that are <em>not</em>
in $A$.
The complement of the set $A$ is pronounced
&quot;A complement&quot; or &quot;not A.&quot;
For instance, if the universe $\mathbf{S}$ is the set of living
people and $A$ comprises all living people who are over 6' tall, then
$A^c$ comprises all living people who are no more than 6' tall.
If $\mathbf{S}$ is the set of all ravens and
$A$ comprises all black ravens, then
$A^c$ is the set of all ravens that are not black.
The complement of the complement of a set is the original set:
$(A^c)^c = A$.
The set of all ravens that are not (not black) is the set of all ravens that are black.
The empty set is the complement of the universal set $\mathbf{S}$,
and vice versa: $\mathbf{S}^c = \emptyset$ and
$\emptyset^c = \mathbf{S}$.
Nothing in $\mathbf{S}$ is not in
$\mathbf{S}$.
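
Python has no built-in complement, because a complement only makes sense relative to a universe. A minimal sketch, assuming a small finite universe `S` and a hypothetical helper `complement`, is:

```python
S = set(range(1, 11))          # the universe for this sketch
A = {2, 4, 6, 8, 10}

def complement(A, S):
    """Elements of the universe S that are not in A (hypothetical helper)."""
    return S - A

A_c = complement(A, S)

print(sorted(A_c))                    # [1, 3, 5, 7, 9]
print(complement(A_c, S) == A)        # True: (A^c)^c = A
print(complement(S, S) == set())      # True: S^c is empty
print(complement(set(), S) == S)      # True: the complement of the empty set is S
```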

Here is a Venn diagram illustrating the set $A^c$.
The yellow region is $A$ and the blue region is
$A^c$.

<div class="venn">
<div class="vennCaption">Venn Diagram for $A$ (yellow area) and
$A^c$ (blue area)</div>
<div class="vennUniverse blue left">$\mathbf{S}$
 <p align="left">$A^c$</p>
</div>
<div class="rect60x70_center yellow right">$A$</div>
</div>

Suppose we have two sets, $A$ and $B$.
If every element of $A$ is also an element of $B$,
we say that $A$ is a <span class="termOfArt">subset</span>
of $B$ and we write $A \subset B$ or $B \supset A$.
For instance,  $\{1, 2\}$ is a subset of
$\{1, 2, 3\}$, but
$\{1, 4\}$ is not a subset of $\{1, 2, 3\}$.
With respect to the universe $\mathbf{S}$ of animals,
the set of ravens is a subset of the set of birds, but
the set of ravens is not a subset of the set of fish.
Nor is the set of black birds a subset of the set of ravens, because some
black birds are not ravens.
Here is a Venn diagram illustrating $A \subset B$.

<div class="venn">
<div class="vennCaption">Venn Diagram for $A \subset B$</div>
<div class="vennUniverse left">$\mathbf{S}$</div>
<div class="rect60x70_center blue right">$B$</div>
<div class="rect40x50_center_high yellow left" >$A$&nbsp;</div>
</div>
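
In Python, the subset relation used here corresponds to the `<=` operator (or the `issubset` method), which allows the two sets to be equal. The `<` operator tests for a *proper* subset, which is a stricter condition than the $\subset$ of these notes. The variable names below are ours.

```python
small = {1, 2}
big = {1, 2, 3}

is_subset = small <= big            # True: every element of small is in big
not_subset = {1, 4} <= big          # False: 4 is not in big
self_subset = big <= big            # True: every set is a subset of itself
proper_self = big < big             # False: < requires the sets to differ
superset = big >= small             # True: the same relation read the other way

print(is_subset, not_subset, self_subset, proper_self, superset)
```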

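The <span class="termOfArt">union</span> of $A$ and $B$, written $A \cup B$, is the set of all elements that are in $A$, in $B$, or in both.
The <span class="termOfArt">intersection</span> of $A$ and $B$, written $A \cap B$, is the set of all elements that are in both $A$ and $B$.
Union is associative and commutative, and so is intersection, but the two operations cannot be freely regrouped with
each other.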
For instance, $A \cap (B\cup C)$ is not generally equal to
$(A\cap B) \cup C$.
However, there are rules for combining (distributing) unions and intersections:

+ $A\cap (B\cup C) = (A \cap B) \cup  (A \cap C)$.
+ $A \cup (B\cap C) = (A\cup B) \cap  (A\cup C)$.
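
A small numerical check, using Python's `|` for union and `&` for intersection, shows both the failure of regrouping and the two distributive rules. The particular sets are arbitrary choices for illustration, and one example checks the rules without proving them.

```python
A = {1, 2}
B = {2, 3}
C = {3, 4}

# Mixing the operations: the grouping matters.
left = A & (B | C)       # {2}
right = (A & B) | C      # {2, 3, 4}
regrouping_fails = (left != right)

# The distributive rules hold.
dist_1 = (A & (B | C)) == ((A & B) | (A & C))
dist_2 = (A | (B & C)) == ((A | B) & (A | C))

print(regrouping_fails, dist_1, dist_2)   # True True True
```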

Two sets are <span class="termOfArt">disjoint</span> or 
<span class="termOfArt">mutually exclusive</span>
if their intersection is the empty set;
that is, if the two sets have no elements in common.
For instance, the sets $\{1, 2, 3\}$ and $\{0, 4\}$
are disjoint.
In symbols, $A$ and $B$ are disjoint if
$A\cap B = \emptyset$.
Here is a Venn diagram illustrating sets $A$ and
$B$ that are disjoint:

<div class="venn">
<div class="vennCaption">
$A$ and $B$ <span class="termOfArt">disjoint</span>:
$A\cap B = \emptyset$.
</div>
<div class="vennUniverse">$\mathbf{S}$</div>
<div class="rect40x50_left blue left">$A$</div>
<div class="rect40x50_right yellow right">$B$&nbsp;</div>
</div>


A collection of sets $\{A_{1}, A_{2}, A_{3}, \ldots \}$
is <span class="termOfArt">disjoint</span> if every pair of sets in the collection
is disjoint; that is,
the collection is disjoint if
$A_{i} \cap A_{j} = \emptyset$ whenever
$i \ne j$  ($i \ne j$ means $i$
is not equal to $j$).
For instance, the sets $\{1, 2, 3\}$, $\{0, 4\}$,
$\{5\}$, and $\{-1, -2, -3, -4\}$ are disjoint, but
the sets $\{1, 2, 3\}$, $\{1, 4\}$,
$\{5\}$, and $\{-1, -2, -3, -4\}$ are not.
The set of black ravens and the set of white ravens are mutually exclusive.
The empty set and any other set are mutually exclusive, because the intersection of the empty set
and any other set is the empty set.
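
Checking whether a collection is disjoint means checking every pair. A minimal sketch, with a hypothetical helper `is_disjoint_collection`, applied to the two collections in the example above:

```python
from itertools import combinations

def is_disjoint_collection(sets):
    """True if every pair of sets in the collection has empty intersection."""
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))

first = [{1, 2, 3}, {0, 4}, {5}, {-1, -2, -3, -4}]
second = [{1, 2, 3}, {1, 4}, {5}, {-1, -2, -3, -4}]   # 1 is in two of the sets

print(is_disjoint_collection(first))    # True
print(is_disjoint_collection(second))   # False
```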

Sometimes we need to ensure that every element of some set is in at least one set in a collection
of sets.
This issue arises in counting and in probability when we want to break a complicated set into smaller
pieces and know that we have not lost anything.
The corresponding technical term is <span class="termOfArt">exhaust</span>.
A collection of sets $\{A_{1}, A_{2},
A_{3}, \ldots  \}$ <span class="termOfArt">exhausts</span> a set
$A$ (the collection <em>is exhaustive of</em> $A$) if
every element of $A$ is in at least one of the sets
$A_{1}, A_{2}, A_{3}, \ldots $; that is, if
$A$ is a subset of
$A_{1} \cup  A_{2} \cup A_{3} \cup  \cdots$.
For instance, the sets $\{1, 2, 3\}$, $\{1, 4\}$,
$\{3, 5\}$, and $\{-1, -2, -3, -4\}$ exhaust
the set $\{2, 3, 4, 5\}$, but they do not exhaust the set
$\{0, 1, 2\}$.
The set of men, the set of women, and the set of children exhaust the set of people.
The set of black ravens and the set of non-black ravens exhaust the set of ravens.
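
The definition translates directly into a subset test against the union of the collection. The helper name `exhausts` below is ours; the sets are those of the numerical example.

```python
def exhausts(collection, A):
    """True if every element of A is in at least one set of the collection."""
    union_of_all = set().union(*collection)
    return A <= union_of_all

collection = [{1, 2, 3}, {1, 4}, {3, 5}, {-1, -2, -3, -4}]

print(exhausts(collection, {2, 3, 4, 5}))   # True
print(exhausts(collection, {0, 1, 2}))      # False: 0 is in none of the sets
```

Note that the sets in an exhaustive collection may overlap, as $\{1, 2, 3\}$ and $\{1, 4\}$ do here.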


Sometimes we need to ensure that every element of some set is in exactly one set in a collection
of sets&mdash;and that nothing but members of that set are in the collection.
This issue arises in counting and in probability when we want to break a complicated set into
smaller parts while avoiding double-counting or counting anything extra.
The corresponding technical term is <span class="termOfArt">partition</span>.
A collection of sets $ \{A_{1}, A_{2}, A_{3}, \ldots \}$ <span class="termOfArt">partitions</span> a set
$A$ (the collection <em>is a partition of</em> $A$) if
the collection is disjoint, each element $A_{i}$
of the collection is a subset of $A$,
and the collection exhausts $A$;
that is, if each element of $A$ is in exactly one of the sets
$A_{1}, A_{2}, A_{3}, \ldots $ and $A = A_{1} \cup  A_{2} \cup 
A_{3} \cup  \cdots $.
For instance, the sets $\{a\}, \{b, d\},$ and $\{c\}$
partition the set $\{a, b, c, d\}$.
The set of black ravens and the set of non-black ravens partition the set of ravens.
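
The three conditions in the definition (disjoint, each a subset of $A$, and exhaustive of $A$) can be tested one at a time. This is a sketch with a hypothetical helper `is_partition`; like the definition above, it does not require the sets in the collection to be nonempty.

```python
from itertools import combinations

def is_partition(collection, A):
    """True if the collection is disjoint, lies inside A, and exhausts A."""
    pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(collection, 2))
    all_subsets = all(a <= A for a in collection)
    covers = A <= set().union(*collection)
    return pairwise_disjoint and all_subsets and covers

A = {"a", "b", "c", "d"}

print(is_partition([{"a"}, {"b", "d"}, {"c"}], A))   # True
print(is_partition([{"a", "b"}, {"b", "c", "d"}], A)) # False: "b" is counted twice
print(is_partition([{"a"}, {"b"}], A))                # False: "c" and "d" are lost
print(is_partition([{"a", "b"}, {"c", "d", "e"}], A)) # False: "e" is not in A
```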

### Useful Properties of Subsets, Complements, Unions and Intersections


+ If $A\subset B$ and $B\subset A$ then $A = B$. (If two sets are subsets of each other, they are the same set.)

+ $\emptyset \subset A$. (The empty set is a subset of every set.)

+ $\emptyset \cup A = A.$
(The union of the empty set and any other set is that set.)

+ $\emptyset \cap A = \emptyset.$
(The intersection of the empty set and any other set is empty.)

+ If $A \subset B$ and $B \subset C$ then $A \subset C$. (The subset relation is transitive.)

+ If $A \subset B$ then $B^c \subset A^c$.
(Complementation reverses the subset relation.)

+ $A \cap B \subset  A$.  Moreover, $A\cap B = A$ if and only if $A \subset B$.

+ $A \subset  A\cup B$.   Moreover, $A = A\cup B$ if and only if $B \subset A$.

+ $(A\cap B)^c = A^c \cup B^c$. (de Morgan)

+ $(A\cup B)^c = A^c\cap B^c$. (de Morgan)

+ $A \cap B = B \cap A$. (Intersection is commutative.)

+ $A\cup B = B\cup A.$ (Union is commutative.)

+ $A\cap (B\cap C) = (A\cap B)\cap C.$ (Intersection is associative.)

+ $A\cup (B\cup C) = (A\cup B)\cup C.$ (Union is associative.)

+ $A\cap (B\cup C) = (A\cap B)\cup (A\cap C).$ (Distribution of intersection over union.)

+ $A\cup (B\cap C) = (A\cup B)\cap (A\cup C).$ (Distribution of union over intersection.)
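
Several of these properties can be checked by brute force over every pair of subsets of a small universe. The sketch below does so for a three-element universe; it is a sanity check under that assumption, not a proof, and the helper names `power_set` and `comp` are ours.

```python
from itertools import combinations

S = frozenset({1, 2, 3})  # a tiny universe, chosen only for this check

def power_set(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(A):
    """Complement of A with respect to S."""
    return S - A

checks = []
for A in power_set(S):
    for B in power_set(S):
        # de Morgan's laws
        checks.append(comp(A & B) == (comp(A) | comp(B)))
        checks.append(comp(A | B) == (comp(A) & comp(B)))
        # A ∩ B = A if and only if A ⊂ B;  A ∪ B = A if and only if B ⊂ A
        checks.append(((A & B) == A) == (A <= B))
        checks.append(((A | B) == A) == (B <= A))
        # Complementation reverses the subset relation
        if A <= B:
            checks.append(comp(B) <= comp(A))

print(len(power_set(S)))   # 8 subsets of a 3-element set
print(all(checks))         # True
```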


## Cardinality, Counting, and The Inclusion-Exclusion Principle

### Cardinality
The <span class="termOfArt">cardinality</span> of a set is the number of elements it contains.
The cardinality of the set $A$ is denoted $|A|$ or $\#A$.
If the elements of the set $A$ can be put in one-to-one correspondence with the integers $1, 2, \ldots, n$, then $A$ is <span class="termOfArt">finite</span>
and $\#A = n$.
If the elements of $A$ can be put in one-to-one correspondence with the 
positive integers $1, 2, 3, \ldots$, then $A$ is <span class="termOfArt">countable</span>, and $\#A = \aleph_0$.
If the elements of $A$ can be put in one-to-one correspondence with the
real numbers, then $A$ is <span class="termOfArt">uncountable</span>, and $\#A =c$.
(More generally, if $A$ has more elements than there are integers, then $A$ is uncountable.)

There are many strategies for counting the elements of a set aside from simply
enumerating them.

For instance, if $\{ A_i \}$ is a <span class="termOfArt">partition</span>
of $A$, then $\#A = \sum_i \#A_i$.

Suppose $A = A_1 \cup A_2$, but that possibly $A_1 A_2 \ne \emptyset$ (here and below, $A_1A_2$ is shorthand for $A_1 \cap A_2$),
so $\{A_1, A_2\}$ might not be a partition of $A$, because
$A_1$ and $A_2$ are not necessarily disjoint.
Then still 
$$\#A = \#A_1 + \#A_2 - \#A_1A_2.$$

This is seen most easily using a Venn diagram, and can be proved
by constructing a partition of $A$,
$A = A_1A_2^c \cup A_1^cA_2 \cup A_1A_2$, and noting that
$\#A_1 + \#A_2 = \#A_1A_2^c + \#A_1^cA_2 + 2\#A_1A_2 = \#A + \#A_1A_2$.
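
The three-piece partition and the resulting count can be verified on a concrete pair of overlapping sets. The sets below are arbitrary choices for illustration; `len` plays the role of $\#$.

```python
A1 = {1, 2, 3, 4}
A2 = {3, 4, 5}
A = A1 | A2

only_A1 = A1 - A2    # elements in A1 and not in A2
only_A2 = A2 - A1    # elements in A2 and not in A1
both = A1 & A2       # elements in both

# The three pieces partition A, so their sizes add up to the size of A.
sum_of_pieces = len(only_A1) + len(only_A2) + len(both)

# Adding the sizes of A1 and A2 counts the overlap twice; subtract it once.
two_set_formula = len(A1) + len(A2) - len(both)

print(len(A), sum_of_pieces, two_set_formula)   # 5 5 5
```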

If 
$A = A_1 \cup A_2 \cup A_3$ but $\{A_1, A_2, A_3\}$ are not necessarily disjoint,
then still

$$\#A = \#A_1 + \#A_2 + \#A_3 - \#A_1A_2 - \#A_1A_3 - \#A_2A_3 + \#A_1A_2A_3.$$

More generally, if $A = \cup_{i=1}^n A_i$, then the <span class="termOfArt">Inclusion-Exclusion Principle</span> holds:

$$ \#A = \sum_{i=1}^n \#A_i - \sum_{1 \le i_1 < i_2 \le n} \#(A_{i_1}A_{i_2}) +
\sum_{1 \le i_1 < i_2 < i_3 \le n} \#(A_{i_1}A_{i_2}A_{i_3}) - \cdots 
+(-1)^{k-1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} \# (A_{i_1}A_{i_2} \cdots A_{i_k}) + \cdots
+(-1)^{n-1} \#(A_1 A_2 \cdots A_n).
$$

The Inclusion-Exclusion Principle makes some complicated counting problems tractable.
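
As an illustration, the general formula can be coded as a signed sum over all $k$-fold intersections. The function name `inclusion_exclusion_count` and the example (integers from 1 to 100 divisible by 2, 3, or 5) are ours. For sets this small, taking the union directly is simpler; the point of the sketch is only to show that the alternating sum gives the same count.

```python
from itertools import combinations

def inclusion_exclusion_count(sets):
    """Size of the union of finite sets, computed by inclusion-exclusion."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)            # + for odd k, - for even k
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total

numbers = range(1, 101)
div_by_2 = {m for m in numbers if m % 2 == 0}
div_by_3 = {m for m in numbers if m % 3 == 0}
div_by_5 = {m for m in numbers if m % 5 == 0}

by_formula = inclusion_exclusion_count([div_by_2, div_by_3, div_by_5])
by_union = len(div_by_2 | div_by_3 | div_by_5)

print(by_formula, by_union)   # 74 74
```

Here the terms are $50 + 33 + 20 - 16 - 10 - 6 + 3 = 74$.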

## Connecting Probability to Set Theory

A <span class="termOfArt">random experiment</span> or <span class="termOfArt">random trial</span>
is basically any
situation whose outcome is not perfectly predictable, but for which we
can specify all possible outcomes, and that shows long-term regularities.
For example, when we toss a coin, we do not know how it will land,
but it certainly must land heads, tails, on its edge, or not land at all.
There is no other possibility.
The set of all possible outcomes of a random experiment is called the
<span class="termOfArt">outcome space</span>.
The letter $\mathbf{S}$ will denote the outcome space.
We are free to choose the outcome space to correspond to what we deem
relevant for the experiment, as long as it is essentially inevitable that the
random experiment will result in some outcome in the outcome space.
For example, the outcome space we just described was {heads, tails, edge, doesn't land}.
It might be adequate for our purposes for the outcome space to be {heads, not heads}.

Often we shall tailor outcome spaces for specific problems.
Here is an example: Imagine a box containing tickets that are indistinguishable
except that each has written upon it a unique number between 1 and the number
of tickets, $n$.
We shake the box, draw a ticket from the box without looking, and record
the number written on the ticket we happened to draw.
The natural outcome space of this experiment is the set of numbers
$\{1, 2,  \ldots  , n\}$.
However, suppose we are interested only in whether the number on the
ticket we draw is even.
The outcome space then could be reduced to {even number on ticket, odd number on ticket},
or coded even more abstractly as $\{0, 1\}$, 
where the outcome is the number of even-numbered tickets drawn.
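
The relationship between the natural outcome space and the coded one can be sketched as follows. The value `n = 10`, the seed, and the use of `random.Random.randint` to stand in for shaking the box and drawing are all assumptions of this sketch; in particular, `randint` makes every ticket equally likely, which the description of the experiment does not by itself guarantee.

```python
import random

n = 10                      # number of tickets (assumed for this sketch)
rng = random.Random(2024)   # seeded only so the sketch is repeatable

natural_space = set(range(1, n + 1))   # {1, 2, ..., n}
coded_space = {0, 1}                   # number of even-numbered tickets drawn

ticket = rng.randint(1, n)             # one draw from the box
coded = 1 if ticket % 2 == 0 else 0    # coarser description of the same draw

# Whatever ticket is drawn, each description lands in its own outcome space.
print(ticket in natural_space, coded in coded_space)   # True True
```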

An <span class="termOfArt">event</span> is a subset of outcome space: a collection of outcomes in the
outcome space.
$A$ is an event if $A \subset \mathbf{S}$.
For example, in the experiment of drawing a numbered ticket from the box,
suppose there are 10 tickets in all, and that we choose the outcome space
$\mathbf{S}$ to be the numbers
$\{1, 2, 3,  \ldots  , 9, 10\}$.
Then &quot;we draw the number 1&quot; is the event $\{1\}$, and
&quot;we draw an even number&quot; is the event $\{2, 4, 6, 8, 10\}$,
both of which are subsets of the set of possible outcomes, the outcome space
$\mathbf{S}$.

Two events are <span class="termOfArt">disjoint</span> or
<span class="termOfArt">mutually exclusive</span>
if the occurrence of one is incompatible with the occurrence of the other;
that is, if they have no outcome in common.
This is equivalent to the definition of disjoint sets, viewing
events as sets.
The event $A$ <span class="termOfArt">implies</span> the event
$B$ if $A \subset B$: then if $A$ occurs, $B$ must
also occur (if the outcome that occurs is in $A$, the outcome
that occurs is also in $B$, because every element of $A$
is an element of $B$).
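
Because events are sets, these relationships reduce to set operations. The sketch below uses the ten-ticket example; the event `draw_multiple_of_four` and the observed outcome `8` are hypothetical additions for illustration.

```python
S = set(range(1, 11))                    # outcome space {1, ..., 10}

draw_one = {1}                           # "we draw the number 1"
draw_even = {n for n in S if n % 2 == 0} # "we draw an even number"
draw_multiple_of_four = {4, 8}           # hypothetical extra event

# Each event is a subset of the outcome space.
are_events = all(E <= S for E in (draw_one, draw_even, draw_multiple_of_four))

# Disjoint events have no outcome in common.
one_and_even_disjoint = draw_one.isdisjoint(draw_even)

# A implies B when A is a subset of B.
four_implies_even = draw_multiple_of_four <= draw_even

# An event occurs when the observed outcome is one of its elements.
outcome = 8                              # hypothetical observed outcome
occurred = [outcome in E for E in (draw_one, draw_even, draw_multiple_of_four)]

print(are_events, one_and_even_disjoint, four_implies_even)   # True True True
print(occurred)                                               # [False, True, True]
```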

1. Counting and combinatorics
    + Sets: unions, intersections, partitions
    + De Morgan's Laws
    + The Inclusion-Exclusion principle
    + The Fundamental Rule of Counting
    + Combinations
    + Permutations
    + Strategies for counting



2. Theories of probability
    + Equally likely outcomes
    + Frequency Theory
    + Subjective Theory
    + Shortcomings of the theories
    + Rates versus probabilities
    + Measurement error
    + Where does probability come from in physical problems?
    + Making sense of geophysical probabilities
        - Earthquake probabilities
        - Probability of magnetic reversals
        - Probability that Earth is more than 5B years old
