# Lecture 2 - Introduction to Probability Theory

> Probability theory is nothing but common sense reduced to calculation. P. Laplace (1812)

Probability theory is an extension of Aristotelian logic, see \cite{jaynes2003}, in the sense that it allows us to reason under uncertainty.

## The basic desiderata of probability theory
It is actually possible to derive the rules of probability based on a system of common sense requirements.
Paraphrasing
[Chapter 1](http://home.fnal.gov/~paterno/images/jaynesbook/cc01p.pdf) of \cite{jaynes2003},
we would like our system to satisfy the following desiderata:

1) *Degrees of plausibility are represented by real numbers.*

2) *The system should have a qualitative correspondence with common sense.*

3) *The system should be consistent in the sense that:*
    
   + *If a conclusion can be reasoned out in more than one way, then every possible way must lead to the same result.*
    
   + *All the evidence relevant to a question should be taken into account.*
    
   + *Equivalent states of knowledge must be represented by equivalent plausibility assignments.*

## How to speak about probabilities?
Let
+ A be a logical sentence,
+ B be another logical sentence, and
+ I be all other information we know.

There is no restriction on what A and B may be, as long as neither of them is a contradiction.
We write as a shortcut:
$$
\mbox{not}\;A \equiv \neg A,
$$
$$
A\;\mbox{and}\;B \equiv A,B \equiv AB,
$$
$$
A\;\mbox{or}\;B \equiv A + B.
$$

We **write**:
$$
p(A|BI),
$$
and we **read**:
> the probability of A being true given that we know that B and I are true

or (assuming knowledge I is implied)

> the probability of A being true given that we know that B is true

or (making it even shorter)

> the probability of A given B.

$$
p(\mbox{something} | \mbox{everything known}) = \mbox{probability that something is true conditioned on what is known}.
$$

$p(A|B,I)$ is just a number between 0 and 1 that corresponds to the degree of plausibility of A conditioned on B and I.
The values 0 and 1 are special.

+ If
$$
p(A|BI) = 0,
$$
we say that we are certain that A is false if B is true.

+ If
$$
p(A|BI) = 1,
$$
we say that we are certain that A is true if B is true.

+ If
$$
p(A|BI) \in (0, 1),
$$
we say that we are uncertain about A given that B is true.
Depending on whether $p(A|B,I)$ is closer to 0 or to 1, we believe more in one possibility or the other.
Complete ignorance about whether A is true or false corresponds to a probability of 0.5.

## The rules of probability theory

According to
[Chapter 2](http://home.fnal.gov/~paterno/images/jaynesbook/cc02m.pdf) of \cite{jaynes2003} the desiderata are enough
to derive the rules of probability.
These rules are:

+ The **obvious rule** (for lack of a better name):
$$
p(A | I) + p(\neg A | I) = 1.
$$

+ The **product rule** (the basis of Bayes' rule, also known as Bayes' theorem):
$$
p(AB|I) = p(A|BI)p(B|I).
$$
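
To see the two rules in concrete numbers, the sketch below builds a small joint distribution over two binary sentences A and B and computes marginal and conditional probabilities from it. The table values and the helper names (`joint`, `p_A`, `p_B`, `p_A_given_B`) are made up for illustration, and the background information I is implicit in the table. Since the conditional is read off the table as joint over marginal, this is a consistency check of the notation, not a proof of the rules.

```python
# Hypothetical joint probabilities p(A=a, B=b | I); illustrative numbers only.
joint = {
    (True, True): 0.2,
    (True, False): 0.1,
    (False, True): 0.4,
    (False, False): 0.3,
}

def p_A(a):
    """Marginal p(A=a | I), obtained by summing the joint over B."""
    return sum(p for (ai, _), p in joint.items() if ai == a)

def p_B(b):
    """Marginal p(B=b | I), obtained by summing the joint over A."""
    return sum(p for (_, bi), p in joint.items() if bi == b)

def p_A_given_B(a, b):
    """Conditional p(A=a | B=b, I), computed as joint / marginal."""
    return joint[(a, b)] / p_B(b)

# Obvious rule: p(A|I) + p(not A|I) should equal 1.
obvious_total = p_A(True) + p_A(False)

# Product rule: p(A|BI) p(B|I) should reproduce p(AB|I).
product_rhs = p_A_given_B(True, True) * p_B(True)

print("p(A|I) + p(not A|I) =", obvious_total)
print("p(AB|I) =", joint[(True, True)], " p(A|BI) p(B|I) =", product_rhs)
```

Up to floating-point rounding, the first printed value should be 1 and the two values on the second line should agree.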

All the other rules of probability theory can be derived from these two rules.
To demonstrate this, let's prove that:
$$
p(A + B|I) = p(A|I) + p(B|I) - p(AB|I).
$$
Here we go:
\begin{eqnarray*}
p(A+B|I) &=& 1 - p(\neg A \neg B|I)\;\mbox{(obvious rule)}\\
         &=& 1 - p(\neg A|\neg BI)p(\neg B|I)\;\mbox{(product rule)}\\
         &=& 1 - [1 - p(A |\neg BI)]p(\neg B|I)\;\mbox{(obvious rule)}\\
         &=& 1 - p(\neg B|I) + p(A|\neg B I)p(\neg B|I)\\
         &=& 1 - [1 - p(B|I)] + p(A|\neg B I)p(\neg B|I)\\
         &=& p(B|I) + p(A|\neg B I)p(\neg B|I)\\
         &=& p(B|I) + p(A\neg B|I)\\
         &=& p(B|I) + p(\neg B|AI)p(A|I)\\
         &=& p(B|I) + [1 - p(B|AI)] p(A|I)\\
         &=& p(B|I) + p(A|I) - p(B|AI)p(A|I)\\
         &=& p(A|I) + p(B|I) - p(AB|I).
\end{eqnarray*}
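
The first step of the derivation silently uses De Morgan's law, $A + B \equiv \neg(\neg A\,\neg B)$, before applying the obvious rule. The following sketch checks that identity by truth table and then verifies the final formula numerically. The joint table and the helper `prob` are assumptions introduced only for this example.

```python
from itertools import product

# De Morgan: "A or B" has the same truth value as "not (not A and not B)".
de_morgan_holds = all(
    (a or b) == (not ((not a) and (not b)))
    for a, b in product([True, False], repeat=2)
)

# Hypothetical joint probabilities p(A=a, B=b | I); illustrative numbers only.
joint = {
    (True, True): 0.2,
    (True, False): 0.1,
    (False, True): 0.4,
    (False, False): 0.3,
}

def prob(event):
    """Probability that event(a, b) is true under the joint table."""
    return sum(p for (a, b), p in joint.items() if event(a, b))

p_of_A = prob(lambda a, b: a)
p_of_B = prob(lambda a, b: b)
p_of_AB = prob(lambda a, b: a and b)
p_of_A_or_B = prob(lambda a, b: a or b)

print("De Morgan holds for all truth values:", de_morgan_holds)
print("p(A+B|I) =", p_of_A_or_B)
print("p(A|I) + p(B|I) - p(AB|I) =", p_of_A + p_of_B - p_of_AB)
```

The last two printed numbers should agree up to floating-point rounding.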

### The sum rule
Now consider a finite set of logical sentences, $B_1,\dots,B_n$ such that:
1. One of them is definitely true:
    $$
    p(B_1+\dots+B_n|I) = 1.
    $$
2. They are mutually exclusive:
    $$
    p(B_iB_j|I) = 0,\;\mbox{if}\;i\not=j.
    $$

The **sum rule** states that:
    $$
    p(A|I) = \sum_i p(AB_i|I) = \sum_i p(A|B_i I)p(B_i|I).
    $$
We can prove this by induction. Let's just prove it for $n=2$:
\begin{eqnarray*}
p(A|I) &=& p[A(B_1+B_2)|I]\\
       &=& p(AB_1 + AB_2|I)\\
       &=& p(AB_1|I) + p(AB_2|I) - p(AB_1B_2|I)\\
       &=& p(AB_1|I) + p(AB_2|I),
\end{eqnarray*}
since
$$
p(AB_1B_2|I) = p(A|B_1B_2I)p(B_1B_2|I) = 0.
$$
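
To make the sum rule concrete, here is a minimal sketch with $n=3$ mutually exclusive and exhaustive sentences $B_1, B_2, B_3$. The numbers in `p_B` and `p_A_given_B` are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical p(B_i | I) for three mutually exclusive, exhaustive sentences.
p_B = [0.5, 0.3, 0.2]

# Hypothetical p(A | B_i, I) for each i.
p_A_given_B = [0.9, 0.5, 0.1]

# The B_i are exhaustive and mutually exclusive, so their probabilities sum to one.
assert abs(sum(p_B) - 1.0) < 1e-12

# Product rule, term by term: p(A B_i | I) = p(A | B_i I) p(B_i | I).
p_AB = [pa * pb for pa, pb in zip(p_A_given_B, p_B)]

# Sum rule: p(A | I) = sum_i p(A B_i | I).
p_A = sum(p_AB)

print("p(A B_i | I) for each i:", p_AB)
print("p(A | I) =", p_A)
```

With these numbers the three terms are 0.45, 0.15, and 0.02, so the printed value of $p(A|I)$ should be close to 0.62.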

Why is the sum rule useful?

## Exercises

To solve the following exercises use:
+ Common sense
+ The obvious, the product, and the sum rules of probability.


1. This exercise demonstrates that probability theory is actually an extension of logic. Assume that you know that "A implies B". That is, your prior information is:
   $$
   I = \{A\implies B\}.
   $$
   Show that:
     1. $p(AB|I) = p(A|I)$ (use common sense).
     2. If $p(A|I) = 1$, then $p(B|I) = 1$.
     3. If $p(B|I) = 0$, then $p(A|I) = 0$.
     4. Parts 2 and 3 show that probability theory is consistent with Aristotelian logic. Now you will discover how it extends it. Show that if B is true, then A becomes more plausible, i.e.
      $$
      p(A|BI) \ge p(A|I).
      $$
     5. Give at least two examples of part 4 that apply to various scientific fields. To get you started, here are two examples:
       1. A: It is raining. B: There are clouds in the sky. Clearly, $A\implies B$. Part 4 tells us that if there are clouds in the sky, rain becomes more plausible.
       2. A: General relativity is correct. B: Light is deflected in the presence of massive bodies. Here $A\implies B$. Observing that B is true makes A more plausible.
     6. Show that if A is false, then B becomes less plausible.
        $$
        p(B|\neg A I) \le p(B|I).
        $$
     7. Can you think of an example of scientific reasoning that involves part 6? For example:
        1. A: It is raining. B: There are clouds in the sky. Part 6 tells us that if it is not raining, then it is less plausible that there are clouds in the sky.
     8. Do parts 4 and 6 contradict Karl Popper's *principle of falsification*: "A theory in the empirical sciences can never be proven, but it can be falsified, meaning that it can and should be scrutinized by decisive experiments"? (Source: [Wikipedia](https://en.wikipedia.org/wiki/Karl_Popper))
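
Before attempting the proofs, it can help to check the claims of exercise 1 on one concrete example. The sketch below is a numerical illustration, not a solution: it encodes "A implies B" by assigning zero probability to the combination (A true, B false) in a hypothetical joint table. The numbers and the helper `prob` are assumptions made for this example only.

```python
# Hypothetical joint p(A=a, B=b | I). "A implies B" is encoded by giving
# zero probability to the combination (A true, B false).
joint = {
    (True, True): 0.3,
    (True, False): 0.0,
    (False, True): 0.4,
    (False, False): 0.3,
}

def prob(event):
    """Probability that event(a, b) is true under the joint table."""
    return sum(p for (a, b), p in joint.items() if event(a, b))

p_A = prob(lambda a, b: a)
p_B = prob(lambda a, b: b)
p_AB = prob(lambda a, b: a and b)
p_not_A = prob(lambda a, b: not a)

# Conditionals computed as joint / marginal.
p_A_given_B = p_AB / p_B
p_B_given_not_A = prob(lambda a, b: (not a) and b) / p_not_A

print("p(AB|I) =", p_AB, " p(A|I) =", p_A)
print("p(A|BI) =", p_A_given_B, ">= p(A|I) =", p_A)
print("p(B|not A, I) =", p_B_given_not_A, "<= p(B|I) =", p_B)
```

A single table like this can only illustrate the inequalities; the exercise asks you to show that they hold for every assignment consistent with $A\implies B$.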