### Phys 629, Fall 2023, University of Mississippi


# Lecture 2, Chapter 3: Probability and Statistical Distributions

Material in this lecture and notebook is based upon the Basic Stats portion of G. Richards' "Astrostatistics" class at Drexel University (PHYS 440/540, https://github.com/gtrichards/PHYS_440_540), the Introduction to Probability & Statistics portion of A. Connolly's & Ž. Ivezić's "Astrostatistics & Machine Learning" class at the University of Washington (ASTR 598, https://github.com/dirac-institute/uw-astr598-w18), J. Bovy's mini-course on "Statistics & Inference in Astrophysics" at the University of Toronto (http://astro.utoronto.ca/~bovy/teaching.html), and Stephen R. Taylor (https://github.com/VanderbiltAstronomy/astr_8070_s22). 

##### Reading:

- [Textbook](http://press.princeton.edu/titles/10159.html) Chapter 3. 
- [David Hogg: "Data analysis recipes: Probability calculus for inference"](https://arxiv.org/abs/1205.4446)

***Exercises required for class participation are in <font color='red'>red</font>.***

## Notation

First we need to go over some of the notation that the book uses.   

$x$ is a scalar quantity, measured $N$ times

$x_i$ is a single measurement with $i=1,...,N$

$\{x_i\}$ refers to the set of all $N$ measurements

We are generally trying to *estimate* $h(x)$, the *true* distribution from which the values of $x$ are drawn. We will refer to $h(x)$ as the probability density (distribution) function or the "pdf", and $h(x)dx$ is the probability of a value lying between $x$ and $x+dx$. 

The "left to right" integral of $h(x)$ is the cumulative distribution function ("cdf"), $H(x) = \int_{-\infty}^{x}h(x')dx'$. The inverse function of the cdf is the **quantile function**, e.g., what value has 90% of the distribution below it?
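As a minimal sketch of the cdf and its inverse, assume for illustration that $h(x)$ is a standard normal distribution (this choice is an assumption, not something fixed by the text). The example uses `statistics.NormalDist` from the Python standard library; `scipy.stats.norm` provides the same operations as `cdf` and `ppf`.

```python
from statistics import NormalDist

# Assumption: h(x) is a standard normal pdf (mean 0, standard deviation 1).
h = NormalDist(mu=0.0, sigma=1.0)

x = 1.0
H_x = h.cdf(x)           # H(x): probability of a value falling at or below x
x_90 = h.inv_cdf(0.90)   # quantile function: value with 90% of the distribution below it

print(f"H({x}) = {H_x:.4f}")
print(f"90% quantile = {x_90:.4f}")
print(f"H(x_90) = {h.cdf(x_90):.4f}")  # applying the cdf to the quantile recovers 0.90
```

The last line shows that the quantile function undoes the cdf: feeding the 90% quantile back into $H$ returns 0.90.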

While $h(x)$ is the "true" pdf, what we *measure* from the data is the **empirical** pdf, which is denoted $f(x)$.  So, $f(x)$ is a *model* of $h(x)$.  In principle, with infinite data $f(x) \rightarrow h(x)$, but in reality measurement errors keep this from being strictly true. Likewise, the empirical cdf is denoted $F(x)$.
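The following sketch illustrates how an empirical cdf $F(x)$ approaches the true cdf $H(x)$ as the number of measurements grows. It assumes a standard normal $h(x)$ and noise-free draws; the helper `empirical_cdf`, the seed, and the sample sizes are hypothetical choices made only for this example.

```python
import random
from statistics import NormalDist

random.seed(42)                      # fixed seed so the run is repeatable
h = NormalDist(mu=0.0, sigma=1.0)    # assumed "true" distribution

def empirical_cdf(samples, x):
    """F(x): fraction of the measurements {x_i} that are <= x."""
    return sum(1 for s in samples if s <= x) / len(samples)

x0 = 0.5
results = {}
for n in (100, 10_000):
    samples = [random.gauss(0.0, 1.0) for _ in range(n)]
    results[n] = empirical_cdf(samples, x0)
    print(f"N = {n:6d}: F({x0}) = {results[n]:.4f}, H({x0}) = {h.cdf(x0):.4f}")
```

With more draws, $F(x_0)$ should typically land closer to $H(x_0)$, although any single run fluctuates.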

## Probability

The probability of $A$, $p(A)$, is the probability that some event will happen (say a coin toss), or if the process is continuous, the probability of $A$ falling in a certain range.

### Probability axioms (Kolmogorov axioms):
The probability $p(A)$ must satisfy three Kolmogorov axioms:
1. $p(A) \geq 0$ for each $A$. ($p(A)$ must be non-negative)
2. $p(\Omega) = 1$, where $\Omega$ is the set of all possible outcomes. (sum/integral of the pdf must be unity)
3. If $A_1, A_2,...$ are disjoint events, then $p(\cup_{i=1}^{\infty} A_i)=\sum_{i=1}^{\infty}p(A_i)$, where $\cup$ stands for "union".

If we have two events, $A$ and $B$, the possible combinations are illustrated by the following figure:
![Figure 3.1](http://www.astroml.org/_images/fig_prob_sum_1.png)

$A \cup B$ is the *union* of sets $A$ and $B$.

$A \cap B$ is the *intersection* of sets $A$ and $B$.

The probability that *either* $A$ or $B$ will happen is the *union*, given by

$$p(A \cup B) = p(A) + p(B) - p(A \cap B)$$

The figure makes it clear why the last term is necessary.  Since $A$ and $B$ overlap, we are double-counting the region where *both* $A$ and $B$ happen, so we have to subtract this out.  

If $\bar{A}$ is the complement of the event A, then

$$p(A)+p(\bar{A}) = 1$$
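The sum rule and the complement rule can be checked by direct counting. This sketch assumes a fair six-sided die, with $A$ = "the roll is even" and $B$ = "the roll is greater than 3"; the die, the events, and the helper `p` are illustrative choices, not part of the text above.

```python
from fractions import Fraction

omega = set(range(1, 7))               # all outcomes of one fair die roll
A = {x for x in omega if x % 2 == 0}   # event A: roll is even
B = {x for x in omega if x > 3}        # event B: roll is greater than 3

def p(event):
    """Probability of an event when all outcomes in omega are equally likely."""
    return Fraction(len(event), len(omega))

lhs = p(A | B)                   # p(A union B), counted directly
rhs = p(A) + p(B) - p(A & B)     # sum rule, subtracting the double-counted overlap
A_bar = omega - A                # complement of A

print(lhs, rhs)                  # both are 2/3
print(p(A) + p(A_bar))           # 1
```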

The probability that *both* $A$ and $B$ will happen, $p(A \cap B)$, is 
$$p(A \cap B) = p(A|B)p(B) = p(B|A)p(A)$$

where $p(A|B)$ is the probability of $A$ *given that* $B$ is true and is called the *conditional probability*.  So the $|$ is short for "given that".

The **law of total probability** says that, if the events $B_i$ are disjoint and together cover all possible outcomes, then

$$p(A) = \sum_ip(A|B_i)p(B_i)$$
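A small counting check of the law of total probability, again assuming a fair six-sided die. The event $A$ = "roll is at least 4" and the partition $B_1=\{1,2\}$, $B_2=\{3,4\}$, $B_3=\{5,6\}$ are hypothetical choices for illustration, as are the helpers `p` and `p_given`.

```python
from fractions import Fraction

omega = set(range(1, 7))               # outcomes of one fair die roll
A = {4, 5, 6}                          # event A: roll is at least 4
partition = [{1, 2}, {3, 4}, {5, 6}]   # disjoint events B_i that together cover omega

def p(event):
    """Probability of an event for equally likely outcomes."""
    return Fraction(len(event), len(omega))

def p_given(event, condition):
    """Conditional probability p(event | condition) = p(event and condition) / p(condition)."""
    return p(event & condition) / p(condition)

total = sum(p_given(A, B_i) * p(B_i) for B_i in partition)

print(total, p(A))   # both are 1/2
```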

Note that different people use different notation and the following all mean the same thing

$$p(A \cap B) = p(A,B) = p(AB) = p(A \,{\rm and}\, B)$$

We will use the comma notation as in the textbook.


It is important to realize that the following is *always* true

$$p(A,B) = p(A|B)p(B) = p(B|A)p(A)$$

## Random Variables
A random or stochastic variable is a variable that can take a set of possible different values, each with an associated probability. 


The relation $p(A,B) = p(A|B)p(B)$ holds in general. However, if $A$ and $B$ are independent random variables, then it simplifies to

$$p(A,B) = p(A)p(B)$$

Let's look at an example.

If you have a bag with 5 marbles, 3 yellow and 2 blue and you want to know the probability of picking 2 yellow marbles in a row, that would be

$$p(Y_1,Y_2) = p(Y_1)p(Y_2|Y_1).$$

$p(Y_1) = \frac{3}{5}$ since you have an equally likely chance of drawing any of the 5 marbles.

If you did not put the first marble back in the bag after drawing it (sampling *without* "replacement"), then the probability is

$p(Y_2|Y_1) = \frac{2}{4}$, so that

$$p(Y_1,Y_2) = \frac{3}{5}\frac{2}{4} = \frac{3}{10}.$$

But if you put the first marble back, then

$p(Y_2|Y_1) = \frac{3}{5} = p(Y_2)$, so that 

$$p(Y_1,Y_2) = \frac{3}{5}\frac{3}{5} = \frac{9}{25}.$$

In the first case $A$ and $B$ (or rather $Y_1$ and $Y_2$) are *not* independent, whereas in the second case they are.
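The two yellow-marble results can be confirmed by enumerating every equally likely ordered pair of draws. This is only a sketch: the list `marbles` and the helper `p_two_yellow` are names introduced for this example.

```python
from fractions import Fraction
from itertools import permutations, product

marbles = ["Y", "Y", "Y", "B", "B"]   # 3 yellow, 2 blue
idx = range(len(marbles))             # label each marble so identical colors stay distinct

def p_two_yellow(draws):
    """Fraction of equally likely (first, second) draws in which both marbles are yellow."""
    draws = list(draws)
    hits = sum(1 for i, j in draws if marbles[i] == "Y" and marbles[j] == "Y")
    return Fraction(hits, len(draws))

# Without replacement: the second draw cannot repeat the first marble.
p_without = p_two_yellow(permutations(idx, 2))
# With replacement: any marble can be drawn again.
p_with = p_two_yellow(product(idx, repeat=2))

print(p_without)   # 3/10
print(p_with)      # 9/25
```

Only the replacement case factorizes as $p(Y_1)p(Y_2) = \frac{3}{5}\frac{3}{5}$, which is the signature of independent draws.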

<font color='red'>What is the probability of both marbles being blue? Compute this in both cases: (i) when you put the first blue marble back and (ii) when you do not put the first blue marble back.</font>

We say that two random variables, $A$ and $B$, are independent *if*

$p(A,B) = p(A)p(B)$ (knowing $B$ does not give any information about $A$ and vice versa).

Here is a more complicated example from 
[Jo Bovy's class at UToronto](http://astro.utoronto.ca/%7Ebovy/teaching.html)
![Bovy_L1-StatMiniCourse_page21](figures/Bovy_L1-StatMiniCourse_page21.png)

As illustrated, 

$$p(A \,{\rm or}\, B|C) = p(A|C) + p(B|C) - p(A \, {\rm and}\, B|C)$$ 

This illustration also explains why $p(x|y)p(y) = p(y|x)p(x)$ (used below),

or in the notation of this figure: 

$$p(A \, {\rm and}\, B) \equiv p(A,B) = p(A|B)p(B) = p(B|A)p(A)$$
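Both relations can be verified by counting on a small sample space. This sketch does not reproduce the figure; it assumes a fair six-sided die with $A$ = "even", $B$ = "greater than 3", and $C = \{2,3,4,5\}$, all hypothetical choices, along with the helpers `p` and `p_given`.

```python
from fractions import Fraction

omega = set(range(1, 7))   # outcomes of one fair die roll
A = {2, 4, 6}              # event A: roll is even
B = {4, 5, 6}              # event B: roll is greater than 3
C = {2, 3, 4, 5}           # conditioning event C

def p(event):
    """Probability of an event for equally likely outcomes."""
    return Fraction(len(event), len(omega))

def p_given(event, condition):
    """Conditional probability p(event | condition)."""
    return p(event & condition) / p(condition)

# Sum rule conditioned on C.
lhs = p_given(A | B, C)
rhs = p_given(A, C) + p_given(B, C) - p_given(A & B, C)
print(lhs, rhs)            # both are 3/4

# Product rule, written both ways.
joint = p(A & B)
print(joint, p_given(A, B) * p(B), p_given(B, A) * p(A))   # all are 1/3
```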



Need more help with this?  Try watching some Khan Academy videos and working through the exercises:
[https://www.khanacademy.org/math/probability/probability-geometry](https://www.khanacademy.org/math/probability/probability-geometry)