
# Probability

* An **experiment** is any activity or process whose outcome is subject to uncertainty.

* The **sample space** of an experiment, denoted by $\mathcal{S}$, is the set of all possible
outcomes of that experiment.

* An **event** is any collection (subset) of outcomes contained in the sample space $\mathcal{S}$. 
 
## Example:  
Consider an experiment in which each of **three** vehicles taking a particular freeway
exit turns left (L) or right (R) at the end of the exit ramp. The **eight** possible outcomes
that comprise the sample space are 
$$\mathcal{S}=\{LLL, RLL, LRL, LLR, LRR, RLR, RRL, RRR\}.$$
Thus, there are eight simple events, among which are $E_1=\{LLL\}$ and $E_5=\{LRR\}$.
Some compound events include 

 + $A=\{RLL, LRL, LLR\}$: the event that exactly one of the three
vehicles turns right; 

 + $B=\{LLL, RLL, LRL, LLR\}$:  the event that at most one of the
vehicles turns right; 

 + $C=\{LLL, RRR\}$: the event that all three vehicles turn in the
same direction. 
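
The example above can be reproduced by enumerating the sample space directly. The following is a minimal sketch; the names `sample_space` and `event` are illustrative choices, not part of the original example.

```python
from itertools import product

# Each outcome is a string such as "LRL": one letter per vehicle.
sample_space = {"".join(turns) for turns in product("LR", repeat=3)}

def event(predicate):
    """Return the subset of outcomes for which predicate(outcome) is true."""
    return {outcome for outcome in sample_space if predicate(outcome)}

A = event(lambda o: o.count("R") == 1)   # exactly one vehicle turns right
B = event(lambda o: o.count("R") <= 1)   # at most one vehicle turns right
C = event(lambda o: len(set(o)) == 1)    # all vehicles turn the same way

print(len(sample_space))  # 8
print(sorted(A))          # ['LLR', 'LRL', 'RLL']
```

Representing events as Python sets means that union, intersection, and complement map onto `|`, `&`, and `sample_space - A`.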


# Axioms of Probability

* For any event $A$, $P(A)\ge 0$.
* $P(\mathcal{S})=1$.
* If $A_1$, $A_2$, $A_3$, $\ldots$ is an infinite collection of disjoint events, then $P(A_1 \cup  A_2 \cup A_3 \cup \ldots)=\sum_{i=1}^{\infty} P(A_i)$. 

*Note*: Events $A_1$, $A_2$, $A_3$, $\ldots$  are called *disjoint* or *mutually exclusive* if for any pair of events $A_{i}$ and $A_{j}$, we have $A_i \cap A_j=\emptyset$.  



# Conditional Probability

For any two events $A$ and $B$ with $P(B)> 0$, the conditional probability of $A$
given that $B$ has occurred is defined by
\begin{equation*}
    P(A|B)=\frac{P(A\cap B)}{P(B)}.
\end{equation*}
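
As an illustration, the sketch below evaluates a conditional probability for the three-vehicle example. It assumes that all eight outcomes are equally likely, which the example itself does not state; the helper names `prob` and `cond_prob` are hypothetical.

```python
from fractions import Fraction
from itertools import product

sample_space = {"".join(t) for t in product("LR", repeat=3)}

def prob(event):
    """P(event) under the ASSUMED equally-likely model."""
    return Fraction(len(event), len(sample_space))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); only defined when P(b) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)

A = {o for o in sample_space if o.count("R") == 1}   # exactly one right turn
B = {o for o in sample_space if o.count("R") <= 1}   # at most one right turn

print(cond_prob(A, B))  # 3/4
```

Here $A \cap B = A$, so $P(A|B) = (3/8)/(4/8) = 3/4$ under the stated assumption.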

# The Law of Total Probability

Let $A_1, \ldots, A_k$ be mutually exclusive and exhaustive events. Then for any other event $B$,
\begin{equation*}
    P(B)=\sum_{i=1}^{k} P(B|A_i)P(A_i).
\end{equation*}

*Note*: Events $A_1$, $A_2$, $A_3$, $\ldots$, $A_k$  are called *exhaustive* if $A_1 \cup  A_2 \cup A_3 \cup \ldots \cup A_k =\mathcal{S}$. 


# Bayes' Theorem

Let $A_1, \ldots, A_k$ be a collection of $k$ mutually exclusive and exhaustive events
with prior probabilities $P(A_i)$, $i=1, \ldots, k$. Then for any other event $B$ for
which $P(B) > 0$, the posterior probability of $A_j$, $j=1, \ldots, k$, given that $B$ has occurred is
\begin{align*}
    P(A_j|B) & =\frac{P(A_j \cap B)}{P(B)} \\
            & = \frac{P(B|A_j)P(A_j)}{\sum_{i=1}^{k} P(B|A_i)P(A_i)}.
\end{align*}
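
A small numerical sketch may help connect the law of total probability with Bayes' theorem. All numbers below are hypothetical (three sources $A_1, A_2, A_3$ with made-up priors and made-up values of $P(B|A_i)$); exact fractions are used so that the arithmetic can be checked by hand.

```python
from fractions import Fraction as F

# Hypothetical priors P(A_i); they must sum to 1 (mutually exclusive, exhaustive).
priors = [F(5, 10), F(3, 10), F(2, 10)]
# Hypothetical conditional probabilities P(B | A_i).
likelihoods = [F(1, 100), F(2, 100), F(3, 100)]

# Law of total probability: P(B) = sum_i P(B | A_i) P(A_i).
p_b = sum(l * p for l, p in zip(likelihoods, priors))

# Bayes' theorem: P(A_j | B) = P(B | A_j) P(A_j) / P(B).
posteriors = [l * p / p_b for l, p in zip(likelihoods, priors)]

print(p_b)                            # 17/1000
print([str(q) for q in posteriors])   # ['5/17', '6/17', '6/17']
```

The posteriors sum to 1, as they must, since the $A_j$ partition the sample space.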


# Independence

* Two events $A$ and $B$ are independent if $P(A | B)= P(A)$ and are dependent
otherwise. 

* The above definition is symmetric. That is, provided $P(A)>0$ and $P(B)>0$, $P(B | A)= P(B)$ if and only if (iff) $P(A | B)= P(A)$. 

* $A$ and $B$ are independent iff 
$P(A \cap B)= P(A) \times  P(B)$. 

* Events $A_1, \ldots,  A_n$ are mutually independent if for every $k$, $k=2, \ldots, n$ and
every subset of indices $i_1, i_2, \ldots, i_k$, we have 
$$P(A_{i_1} \cap A_{i_2} \cap \ldots \cap A_{i_k})=P(A_{i_1}) P(A_{i_2}) \ldots P(A_{i_k}).$$
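
The product conditions can be checked mechanically on a finite sample space. The sketch below reuses the three-vehicle example and assumes, purely for illustration, that the eight outcomes are equally likely; `independent` and `mutually_independent` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

sample_space = {"".join(t) for t in product("LR", repeat=3)}

def prob(event):
    """P(event) under the ASSUMED equally-likely model."""
    return Fraction(len(event), len(sample_space))

def independent(a, b):
    """Pairwise check: P(a and b) == P(a) P(b)."""
    return prob(a & b) == prob(a) * prob(b)

def mutually_independent(events):
    """Check the product rule for every subset of size 2, ..., n."""
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            if prob(set.intersection(*subset)) != prod(prob(e) for e in subset):
                return False
    return True

# R[i]: the event that vehicle i turns right.
R = [{o for o in sample_space if o[i] == "R"} for i in range(3)]
A = {o for o in sample_space if o.count("R") == 1}   # exactly one right turn
C = {"LLL", "RRR"}                                   # all turn the same way

print(mutually_independent(R))  # True
print(independent(A, C))        # False: A and C are disjoint with positive probability
```

The second result illustrates that disjoint events with positive probability are never independent.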

# Random Variables

* For a given sample space $\mathcal{S}$ of some experiment, a random variable (rv) is
any rule that associates a number with each outcome in $\mathcal{S}$. In mathematical
language, a random variable is a function whose domain is the sample space
and whose values are real numbers, i.e., $X : \mathcal{S} \to \mathbb{R}$.

## Example

When a student calls a university help desk for technical support, he/she will either
immediately be able to speak to someone ($S$, for success) or will be placed on hold
($F$, for failure). With $\mathcal{S}=\{S, F\}$, define an rv $X$ by
$$X(S)=1, \;  X(F)=0 .$$
The rv $X$ indicates whether (1) or not (0) the student can immediately speak to
someone.

# Types of Random Variables


A *discrete* random variable is an rv whose possible values either constitute a finite set or else can be listed in an infinite sequence in which there is a first element, a second element, and so on ("countably" infinite).



A random variable is *continuous* if both of the following apply:

1. Its set of possible values consists either of all numbers in a single interval on the number line (possibly infinite in extent, e.g., from $-\infty$ to $\infty$) or all numbers in a disjoint union of such intervals (e.g., $[0, 10] \cup [20, 30]$).
2. No possible value of the variable has positive probability, that is, $P(X = c) =0$ for any possible value $c$.

# Probability Distributions


* The probability distribution or probability mass function (pmf) of a discrete rv $X$
is defined for every number $x$ by $$p(x)= P(X = x)=  P(\{\omega \in \mathcal{S} : X(\omega)= x\}).$$


*Note*: The conditions $p(x) \ge 0$ and $\sum_{\text{all possible } x} p(x)=1$ are required of any pmf.
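
To make the definition concrete, the sketch below builds a pmf by pushing outcome probabilities through a random variable. It uses a hypothetical rv, $X$ = number of right turns in the three-vehicle example, and assumes equally likely outcomes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

sample_space = ["".join(t) for t in product("LR", repeat=3)]

def X(outcome):
    """Random variable: maps an outcome to the number of right turns."""
    return outcome.count("R")

# p(x) = P({omega in S : X(omega) = x}), with each outcome ASSUMED to have probability 1/8.
counts = Counter(X(o) for o in sample_space)
pmf = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

print({x: str(p) for x, p in pmf.items()})  # {0: '1/8', 1: '3/8', 2: '3/8', 3: '1/8'}
print(sum(pmf.values()))                    # 1
```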

* Let $X$ be a continuous rv. Then a probability distribution or probability density
function (pdf) of $X$ is a function $f(x)$ such that for any two numbers $a$ and
$b$ with $a \le b$,
$$P(a \le X \le  b) = \int_{a}^{b} f(x) d x .$$

*Note*: For $f(x)$ to be a legitimate pdf, it must satisfy the following two conditions: 

1. $f (x) \ge 0$ for all $x$; 
2. $\int_{-\infty}^{\infty} f(x)dx=1$.


# Cumulative Distribution Function 

The cumulative distribution function (cdf) of a random variable $X$ gives, for every number $x$, the probability that the observed value of $X$ is at most $x$: 

\begin{equation*}
    F(x)=P(X \le x)= \begin{cases}
        \sum_{y: y \le x} p(y)\\
        \int_{-\infty}^{x} f(y)dy.
    \end{cases}
\end{equation*}
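
For a discrete rv the cdf is a step function obtained by accumulating the pmf. A minimal sketch, using a hypothetical pmf on $\{0, 1, 2, 3\}$:

```python
from fractions import Fraction as F

# Hypothetical pmf (values chosen for illustration only).
pmf = {0: F(1, 8), 1: F(3, 8), 2: F(3, 8), 3: F(1, 8)}

def cdf(x):
    """F(x) = P(X <= x) = sum of p(y) over all y <= x."""
    return sum((p for y, p in pmf.items() if y <= x), F(0))

print(cdf(-1))   # 0
print(cdf(1.5))  # 1/2
print(cdf(3))    # 1
```

Note that `cdf` is defined for every real `x`, not only for the possible values of $X$.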

# Expected Value 

The expected value of an rv $X$ is:

\begin{equation*}
    E(X)=\mu_{X}=\begin{cases}
        \sum_{x} x p(x)\\
        \int_{-\infty}^{\infty} x f(x)dx.
    \end{cases}
\end{equation*}

The expected value of any function $h(X)$ of an rv $X$ is: 

\begin{equation*}
    E(h(X))=\begin{cases}
        \sum_{x} h(x) p(x)\\
        \int_{-\infty}^{\infty} h(x) f(x)dx.
    \end{cases}
\end{equation*}

# Variance

The variance of an rv $X$ is:

\begin{equation*}
    Var(X)=\sigma^{2}_{X}=E[(X-\mu)^{2}]= \begin{cases}
        \sum_{x} (x-\mu)^{2} p(x)\\
        \int_{-\infty}^{\infty}  (x-\mu)^{2} f(x)dx.
    \end{cases}
\end{equation*}


*Note 1*: $Var(X)=E(X^2)-[E(X)]^2$.

*Note 2*: Standard deviation (SD) of $X$ is $\sigma_{X}=\sqrt{\sigma_{X}^{2}}$. 
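
The definition of the variance and the shortcut in Note 1 can be compared numerically. The sketch below uses a hypothetical discrete pmf and a helper `expect` (an illustrative name) that implements $E[h(X)]=\sum_x h(x)p(x)$.

```python
from fractions import Fraction as F

# Hypothetical pmf (values chosen for illustration only).
pmf = {0: F(1, 8), 1: F(3, 8), 2: F(3, 8), 3: F(1, 8)}

def expect(h):
    """E[h(X)] = sum over x of h(x) * p(x)."""
    return sum(h(x) * p for x, p in pmf.items())

mu = expect(lambda x: x)                        # E(X)
var_def = expect(lambda x: (x - mu) ** 2)       # E[(X - mu)^2]
var_short = expect(lambda x: x ** 2) - mu ** 2  # E(X^2) - [E(X)]^2
sd = float(var_def) ** 0.5                      # standard deviation

print(mu, var_def, var_short)  # 3/2 3/4 3/4
```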

# Expected Value and Variance of a Linear Function of $X$

* $E(aX+b)=aE(X)+ b$.

* $Var(aX+b)=a^2 Var(X)$.
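
Both rules can be verified on a small example. In the sketch below the pmf and the constants `a` and `b` are hypothetical; the mean and variance of $aX+b$ are computed directly from the definition and compared with the right-hand sides above.

```python
from fractions import Fraction as F

pmf = {0: F(1, 8), 1: F(3, 8), 2: F(3, 8), 3: F(1, 8)}  # hypothetical pmf
a, b = 2, 1                                             # hypothetical constants

def expect(h):
    """E[h(X)] = sum over x of h(x) * p(x)."""
    return sum(h(x) * p for x, p in pmf.items())

def variance(h):
    """Var[h(X)] = E[(h(X) - E[h(X)])^2]."""
    m = expect(h)
    return expect(lambda x: (h(x) - m) ** 2)

mean_x = expect(lambda x: x)
var_x = variance(lambda x: x)

print(expect(lambda x: a * x + b), a * mean_x + b)  # 4 4
print(variance(lambda x: a * x + b), a**2 * var_x)  # 3 3
```

The shift `b` moves the mean but drops out of the variance, since it cancels in $(aX+b) - E(aX+b)$.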

# Higher Order Moments

* The $k$th moment of the distribution of an rv $X$ is defined as 
\begin{equation*}
    E(X^k)=\begin{cases}
        \sum_{x} x^k p(x)\\
        \int_{-\infty}^{\infty} x^k f(x)dx.
    \end{cases}
\end{equation*}


* The $k$th central moment of the distribution of an rv $X$ is defined as 
\begin{equation*}
    E[(X-\mu)^k]=\begin{cases}
        \sum_{x} (x-\mu)^k p(x)\\
        \int_{-\infty}^{\infty} (x-\mu)^k f(x)dx.
    \end{cases}
\end{equation*}



# Percentile of a Continuous Distribution

Let $p$ be a number between $0$ and $1$. The $(100p)$th percentile of the distribution
of a continuous rv $X$, denoted by $\eta(p)$, is defined by
$$ p= F(\eta(p))=\int_{-\infty}^{\eta(p)} f(x)dx.$$

![](images/percentile.png)
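
When the cdf can be inverted in closed form, percentiles follow directly from $p = F(\eta(p))$. As a sketch, take the exponential cdf $F(x)=1-e^{-\lambda x}$ (introduced later in these notes); solving for $\eta(p)$ gives $\eta(p) = -\ln(1-p)/\lambda$. The rate `lam = 2.0` below is a hypothetical value.

```python
import math

def exp_cdf(x, lam):
    """Exponential cdf: F(x) = 1 - exp(-lam * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def exp_percentile(p, lam):
    """Solve p = F(eta) for eta: eta(p) = -ln(1 - p) / lam."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return -math.log(1.0 - p) / lam

lam = 2.0                           # hypothetical rate parameter
median = exp_percentile(0.5, lam)   # the 50th percentile
print(round(median, 4))             # 0.3466
print(round(exp_cdf(median, lam), 4))  # 0.5
```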

# Important Discrete RVs

## Bernoulli: 

* Any random variable whose only possible values are 0 (failure) and 1 (success). 
* $X \sim Bernoulli(p)$ $\Rightarrow$ $P(X=0)=1-p$, $P(X=1)=p$.   
* $E(X)=p$.
* $Var(X)=p(1-p)$. 

## Binomial: 

* **Binomial experiment**: A sequence of $n$ independent and identically distributed Bernoulli trials. 
* **Binomial rv**: The number of successes $X$ among the $n$ independent and identically distributed trials. 
* $X \sim Binomial(n,p)$.
* $P(X=x)= {n \choose x} p^x (1-p)^{n-x}, \; x=0, 1, \ldots, n$.
* $E(X)=np$.
* $Var(X)=np(1-p)$. 
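
The binomial pmf ${n \choose x} p^x (1-p)^{n-x}$ and its moments can be checked numerically. The sketch below uses hypothetical parameters `n = 10` and `p = 0.3`.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3   # hypothetical parameters
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)                                          # should be 1
mean = sum(x * q for x, q in enumerate(probs))              # should be n p
var = sum((x - mean) ** 2 * q for x, q in enumerate(probs)) # should be n p (1 - p)

print(round(total, 10), round(mean, 10), round(var, 10))  # 1.0 3.0 2.1
```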

# Important Discrete RVs (cont'd)

* **Hypergeometric**

* **Negative Binomial**

* **Poisson** 

Review the above distributions in the textbook. 

# Important Continuous RVs

## Uniform:

* $X \sim Uniform(a,b)$.
* $f(x;a,b)=\frac{1}{b-a}, \; x \in [a,b]$.

## Exponential:

* $X \sim Exp(\lambda)$.
* $f(x)=\lambda e^{-\lambda x}, \; x\ge 0$. 
* $F(x)=1-e^{-\lambda x}, \; x\ge 0$. 
* $E(X)=\frac{1}{\lambda}$.
* $Var(X)=\frac{1}{\lambda^2}$.

and 

* Gamma

* Chi-Squared

* Weibull

* Beta

* Lognormal

Review the above distributions in the textbook. 

# Normal Distribution

* A continuous rv $X$ is said to have a normal distribution with parameters $\mu$
and $\sigma$ (or $\mu$ and $\sigma^2$), where $-\infty < \mu < \infty$ and $\sigma>0$ if the pdf of $X$ is
\begin{equation*}
    f(x; \mu, \sigma)=\frac{1}{\sqrt{2\pi}\,\sigma}  e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \;  -\infty < x < \infty. 
\end{equation*}

* $X \sim N(\mu, \sigma^2)$. 


# Standard Normal Distribution

* The normal distribution with parameters $\mu=0$ and $\sigma=1$ is called the *standard normal distribution*. 
* An rv having the standard normal distribution is called a *standard normal rv* and will be denoted by $Z$. Its pdf is
\begin{equation*}
    f(z; 0, 1)=\frac{1}{\sqrt{2\pi}}  e^{-\frac{z^2}{2}}, \;  -\infty < z < \infty. 
\end{equation*}

* The graph of $f (z; 0, 1)$ is called the standard normal (or $z$) curve. Its inflection
points are at $-1$ and $1$. The cdf of $Z$ is $P(Z \le  z)=\int_{-\infty}^{z} f(y; 0, 1) \, dy$, which is denoted by $\Phi(z)$.


![](images/stnormal.png)
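
$\Phi(z)$ has no elementary closed form, but it can be written in terms of the error function as $\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$. The sketch below uses this identity with Python's `math.erf`; the function names are illustrative.

```python
import math

def std_normal_pdf(z):
    """Standard normal density f(z; 0, 1)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Phi(z), computed via the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(std_normal_cdf(0.0))             # 0.5
print(round(std_normal_cdf(1.96), 4))  # 0.975
```

By symmetry of the $z$ curve about 0, `std_normal_cdf(-z)` equals `1 - std_normal_cdf(z)` up to floating-point error.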

# Percentiles of Standard Normal Distribution

![](images/percentile_stnormal.png)

# $z_{\alpha}$ Notation for $z$ Critical Values

* $z_{\alpha}$ will denote the value on the $z$ axis for which $\alpha$ of the area under the $z$ curve lies to the right of $z_{\alpha}$. 

* $z_{\alpha}$ is the $100(1-\alpha)$th percentile of the standard normal distribution. 

![](images/critical_val.png)

![](images/symmetry.png)
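
Since $z_{\alpha}$ is the $100(1-\alpha)$th percentile, it can be computed as $\Phi^{-1}(1-\alpha)$. A minimal sketch using `statistics.NormalDist` from the Python standard library (available from Python 3.8); `z_critical` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_critical(alpha):
    """z_alpha: the point with area alpha to its right under the z curve."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha)

print(round(z_critical(0.05), 3))   # 1.645
print(round(z_critical(0.025), 3))  # 1.96
```

By symmetry, the point with area $\alpha$ to its left is $-z_{\alpha}$.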

# Nonstandard Normal Distribution

* If $X \sim Normal(\mu, \sigma^2)$, then 
$$Z=\frac{X-\mu}{\sigma}$$
has a standard normal distribution. 

\begin{align*}
    P(a \le X \le b) = & P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right)\\
    =& \Phi\left(\frac{b-\mu}{\sigma}\right) -\Phi\left(\frac{a-\mu}{\sigma}\right).
\end{align*}

* $P(X\le a)=\Phi(\frac{a-\mu}{\sigma})$. 
* $P(X\ge b)=1- \Phi(\frac{b-\mu}{\sigma})$. 
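
Standardizing reduces any normal probability to a difference of $\Phi$ values: $P(a \le X \le b) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$. The sketch below applies this with hypothetical parameters $\mu = 100$ and $\sigma = 15$, and cross-checks the result against the cdf of the nonstandard distribution.

```python
from statistics import NormalDist

def normal_interval_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), via standardization."""
    phi = NormalDist().cdf   # standard normal cdf, Phi
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

mu, sigma = 100.0, 15.0      # hypothetical parameters
p = normal_interval_prob(85.0, 115.0, mu, sigma)

# Cross-check without standardizing.
x_dist = NormalDist(mu, sigma)
direct = x_dist.cdf(115.0) - x_dist.cdf(85.0)

print(round(p, 4), round(direct, 4))  # 0.6827 0.6827
```

The interval here is $\mu \pm \sigma$, so the result is the familiar "about 68%" figure.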

# Jointly Distributed Random Variables


## Joint Probability Distribution

* $p(x,y)=P(X=x, Y=y)$.

\begin{equation*}
    P[(X,Y) \in A]=\begin{cases}
        \sum_{(x,y) \in A} p(x,y)\\
        \iint_{(x,y) \in A} f(x,y) \, dx \, dy.
    \end{cases}
\end{equation*}

## Marginal Distribution

* $p_X(x)=\sum_{y : p(x,y)>0} p(x,y)$

* $f_{X}(x)=\int_{-\infty}^{\infty} f(x,y)dy$

## Independence

Two random variables $X$ and $Y$ are said to be *independent* if for every pair of
$x$ and $y$ values, we have 
\begin{equation*}
    p(x,y)=p_X(x) p_Y(y), \; \text{when $X$ and $Y$ are discrete,}
\end{equation*}
or 
\begin{equation*}
    f(x,y)=f_X(x) f_Y(y), \; \text{when $X$ and $Y$ are continuous.}
\end{equation*}
If the above is not satisfied for all $(x,y)$, then $X$ and $Y$ are dependent. 
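
For discrete rv's, marginals and the independence check are simple table operations. The sketch below uses a hypothetical joint pmf stored as a dictionary keyed by $(x, y)$; the numbers are made up for illustration.

```python
from collections import defaultdict
from fractions import Fraction as F

# Hypothetical joint pmf p(x, y); the entries sum to 1.
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}

def marginals(joint):
    """Return (p_X, p_Y), obtained by summing the joint pmf over the other variable."""
    px, py = defaultdict(F), defaultdict(F)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return dict(px), dict(py)

def independent(joint):
    """True iff p(x, y) == p_X(x) * p_Y(y) for every pair (x, y)."""
    px, py = marginals(joint)
    return all(joint.get((x, y), F(0)) == px[x] * py[y] for x in px for y in py)

px, py = marginals(joint)
print({x: str(p) for x, p in px.items()})  # {0: '2/5', 1: '3/5'}
print(independent(joint))                  # False

# A joint pmf built as the product of the marginals is independent by construction.
product_joint = {(x, y): px[x] * py[y] for x in px for y in py}
print(independent(product_joint))          # True
```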


# More Than Two Random Variables

* $p(x_1,x_2, \ldots, x_n)=P(X_1=x_1, X_2=x_2, \ldots, X_n=x_n)$. 

* $P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n)=\int_{a_1}^{b_1} \ldots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)  dx_1 \ldots d x_n$. 

* The random variables $X_1, \ldots, X_n$ are said to be *independent* if for every subset $X_{i_1}, \ldots, X_{i_{k}}$ of the variables (each pair, each triple, etc.), the joint pmf or pdf of the subset is equal to the product of the marginal pmf's or pdf's. 

# Expected Value

\begin{equation*}
    E[h(X,Y)]=\begin{cases}
        \sum_{x} \sum_{y} h(x,y)p(x,y)\\
        \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x,y) f(x,y) dx dy.
    \end{cases}
\end{equation*}

# Covariance 
\begin{equation*}
    Cov(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]=\begin{cases}
        \sum_{x} \sum_{y} (x-\mu_X)(y-\mu_Y)p(x,y)\\
        \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x-\mu_X)(y-\mu_Y) f(x,y) dx dy.
    \end{cases}
\end{equation*}

*Note*: $Cov(X,Y)=E(XY)-\mu_X \mu_Y$. 

# Correlation

\begin{equation*}
    Corr(X,Y)=\rho_{X,Y}= \frac{Cov(X,Y)}{\sigma_X \sigma_Y}.
\end{equation*}

* If $a$ and $c$ are either both positive or both negative, $Corr(aX + b, cY + d) = Corr(X, Y)$.
* For any two rv's $X$ and $Y$, $-1 \le \rho \le 1$. The two variables are said to be
uncorrelated when $\rho=0$. 
* If $X$ and $Y$ are independent, then $\rho=0$, but $\rho=0$ does not imply
independence.
* $\rho=1$ or $\rho=-1$ iff $Y=aX + b$ for some numbers $a$ and $b$ with $a  \neq 0$.
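
Covariance and correlation can be computed from a joint pmf using $E[h(X,Y)]=\sum_x\sum_y h(x,y)p(x,y)$. The sketch below uses a hypothetical joint pmf (made-up numbers) and compares the definition of $Cov(X,Y)$ with the shortcut $E(XY)-\mu_X\mu_Y$.

```python
import math
from fractions import Fraction as F

# Hypothetical joint pmf p(x, y); the entries sum to 1.
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}

def expect(h):
    """E[h(X, Y)] = sum over (x, y) of h(x, y) * p(x, y)."""
    return sum(h(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)

cov_def = expect(lambda x, y: (x - mu_x) * (y - mu_y))  # definition
cov_short = expect(lambda x, y: x * y) - mu_x * mu_y    # shortcut formula

sd_x = math.sqrt(expect(lambda x, y: (x - mu_x) ** 2))
sd_y = math.sqrt(expect(lambda x, y: (y - mu_y) ** 2))
rho = float(cov_def) / (sd_x * sd_y)

print(cov_def, cov_short)  # -1/25 -1/25
print(round(rho, 4))       # -0.0983
```

The correlation is unit-free and lies in $[-1, 1]$, whereas the covariance depends on the units of $X$ and $Y$.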

