# Chapter 1: Review of Probability Theory

This chapter reviews the concepts of probability theory needed for the more advanced topics in wireless communications. It covers conditional probability, independence, probability density functions (PDF), cumulative distribution functions (CDF), functions of random variables, and moments. It then turns to Gaussian random variables and vectors, both real and complex, which are central to modeling and analysis in communication systems, and to stochastic processes, including stationarity, ergodicity, and power spectral density.



## Conditional Probability
- **Definition:**

  $$ 
  P(A|L)=\frac{P(AL)}{P(L)}
  $$  

- **Total Probability Theorem:**

  $$ 
  P(B)=\sum_i P(B|A_i)P(A_i), \quad \text{where } S=\bigcup_i A_i \text{ and } A_iA_j=\emptyset \text{ for } i\neq j
  $$  

- **Bayes' Theorem:**

  $$ 
  P(s|y)=\frac{P(y|s)P(s)}{P(y)}
  $$  
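
As a minimal sketch of how the total probability theorem and Bayes' theorem combine, consider a hypothetical binary symmetric channel in which a bit $s$ is sent and a bit $y$ is received. The prior $P(s=1)=0.3$, the flip probability $0.1$, and the function name `posterior` are illustrative assumptions, not values from the text.

```python
def posterior(prior_s1, p_flip, y):
    """Return P(s=1 | y) for a binary symmetric channel.

    prior_s1: prior probability P(s=1) (assumed value)
    p_flip:   crossover probability P(y != s) (assumed value)
    y:        observed bit, 0 or 1
    """
    prior = {0: 1.0 - prior_s1, 1: prior_s1}
    # Likelihood P(y|s): the bit arrives unchanged with probability 1 - p_flip.
    likelihood = {s: (1.0 - p_flip) if y == s else p_flip for s in (0, 1)}
    # Total probability theorem: P(y) = sum_s P(y|s) P(s).
    p_y = sum(likelihood[s] * prior[s] for s in (0, 1))
    # Bayes' theorem: P(s=1|y) = P(y|s=1) P(s=1) / P(y).
    return likelihood[1] * prior[1] / p_y


p1_given_y1 = posterior(prior_s1=0.3, p_flip=0.1, y=1)
p1_given_y0 = posterior(prior_s1=0.3, p_flip=0.1, y=0)
print(p1_given_y1, p1_given_y0)
```

With these assumed numbers, observing $y=1$ raises the probability that $s=1$ from $0.3$ to $0.27/0.34\approx 0.79$.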

## Independence
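$A$ and $B$ are independent if and only if any one of the following equivalent conditions holds (the conditional forms require $P(B)>0$ and $P(A)>0$, respectively):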

$$ 
P(AB) = P(A)P(B)
$$  

$$ 
P(A|B) = P(A)
$$  

$$ 
P(B|A) = P(B)
$$  

## PDF and CDF
- **CDF (Cumulative Distribution Function):**

  $$ 
  F_{\mathbf{x}}(x)=P(\mathbf{x}\leq x)
  $$  

- **PDF (Probability Density Function):**

  $$ 
  f_{\mathbf{x}}(x)=\frac{dF_{\mathbf{x}}(x)}{dx}
  $$  
  
  $$ 
  F_{\mathbf{x}}(x)=\int_{-\infty}^{x}f_{\mathbf{x}}(u)du
  $$  

## Function of a Random Variable
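Let $\mathbf{x}$ be a random variable with probability density function $f_{\mathbf{x}}(x)$, and define

$$ 
\mathbf{y}=g(\mathbf{x})
$$  

If $g$ is differentiable and strictly increasing, then

$$ 
P(\mathbf{y}\leq y)=P(\mathbf{x}\leq g^{-1}(y))
$$  

Differentiating with respect to $y$ gives the PDF of $\mathbf{y}$ (the absolute value also covers a strictly decreasing $g$):

$$ 
f_{\mathbf{y}}(y)=\frac{f_{\mathbf{x}}(g^{-1}(y))}{\left|\frac{dy}{dx}\right|_{x=g^{-1}(y)}}
$$  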

If $\mathbf{x}\sim U(0,1)$ and $F_{\mathbf{y}}$ is an invertible CDF, then

$$\mathbf{y}=F_{\mathbf{y}}^{-1}(\mathbf{x})\sim f_{\mathbf{y}}(y)$$
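
The relation above is the basis of inverse-transform sampling. The following is a minimal sketch, assuming an exponential target with rate `rate`, for which $F_{\mathbf{y}}(y)=1-e^{-\text{rate}\cdot y}$ and $F_{\mathbf{y}}^{-1}(x)=-\ln(1-x)/\text{rate}$. The target distribution, the rate value, and the helper name `sample_exponential` are choices made for this illustration.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw one exponential sample by inverse-transform sampling."""
    x = rng.random()                   # x ~ U(0, 1), value in [0, 1)
    return -math.log(1.0 - x) / rate   # F^{-1}(x) for F(y) = 1 - exp(-rate * y)


rng = random.Random(0)                 # fixed seed, for reproducibility only
rate = 2.0                             # assumed rate parameter
samples = [sample_exponential(rate, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)                     # should be close to 1 / rate = 0.5
```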

If the equation $y=g(x)$ has multiple roots $x_1,x_2,\dots,x_m$ for a given $y$:

$$ 
f_{\mathbf{y}}(y)=\frac{f_{\mathbf{x}}(x_1)}{|g'(x_1)|}+\frac{f_{\mathbf{x}}(x_2)}{|g'(x_2)|}+\dots+\frac{f_{\mathbf{x}}(x_m)}{|g'(x_m)|}
$$  
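
As a worked illustration of the multiple-root formula (the choice $g(x)=x^2$ and the standard Gaussian input are assumptions made for this example), take $\mathbf{y}=\mathbf{x}^2$. For $y>0$ the equation $y=x^2$ has the two roots $x_1=\sqrt{y}$ and $x_2=-\sqrt{y}$, with $|g'(x_1)|=|g'(x_2)|=2\sqrt{y}$, so

$$ 
f_{\mathbf{y}}(y)=\frac{f_{\mathbf{x}}(\sqrt{y})+f_{\mathbf{x}}(-\sqrt{y})}{2\sqrt{y}}, \quad y>0
$$  

and $f_{\mathbf{y}}(y)=0$ for $y<0$, where the equation has no real root. If $\mathbf{x}$ is Gaussian with zero mean and unit variance, this gives

$$ 
f_{\mathbf{y}}(y)=\frac{1}{\sqrt{2\pi y}}e^{-\frac{y}{2}}, \quad y>0
$$  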

## First and Second Moments

$$ 
E\{\mathbf{x}\}=\sum_kx_kp_k \quad \text{and} \quad E\{\mathbf{x}\}=\int_{-\infty}^{\infty}xf_{\mathbf{x}}(x)dx 
$$  

$$ 
E\{\mathbf{x}|M\}=\sum_kx_kP(\mathbf{x}=x_k|M) \quad \text{and} \quad E\{\mathbf{x}|M\}=\int_{-\infty}^{\infty}xf_{\mathbf{x}}(x|M)dx 
$$  

$$ 
E\{g(\mathbf{x})\}=\sum_kg(x_k)p_k \quad \text{and} \quad E\{g(\mathbf{x})\}=\int_{-\infty}^{\infty}g(x)f_{\mathbf{x}}(x)dx 
$$  

$$ 
V\{\mathbf{x}\}=\sum_k(x_k-E\{\mathbf{x}\})^2p_k  \quad \text{and} \quad  V\{\mathbf{x}\}=\int_{-\infty}^{\infty}(x-E\{\mathbf{x}\})^2f_{\mathbf{x}}(x)dx
$$  
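
The discrete forms of these definitions can be sketched in a few lines. The fair six-sided die and the helper name `expectation` are assumptions chosen for this illustration.

```python
values = [1, 2, 3, 4, 5, 6]        # outcomes x_k of a fair die (assumed example)
probs = [1 / 6] * 6                # probabilities p_k


def expectation(g, values, probs):
    """E{g(x)} = sum_k g(x_k) p_k for a discrete random variable."""
    return sum(g(x) * p for x, p in zip(values, probs))


mean = expectation(lambda x: x, values, probs)
variance = expectation(lambda x: (x - mean) ** 2, values, probs)
second_moment = expectation(lambda x: x ** 2, values, probs)
print(mean, variance, second_moment - mean ** 2)
```

The last two printed values agree, which illustrates the identity $V\{\mathbf{x}\}=E\{\mathbf{x}^2\}-E\{\mathbf{x}\}^2$.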

## Two Random Variables

$$ 
F_{\mathbf{x}\mathbf{y}}(x,y)=P(\mathbf{x}\leq x,\mathbf{y}\leq y)
$$  

$$ 
f_{\mathbf{x}\mathbf{y}}(x,y)=\frac{\partial^2 F_{\mathbf{x}\mathbf{y}}(x,y)}{\partial x\, \partial y}
$$  

$$ 
F_{\mathbf{x}\mathbf{y}}(x,y)=\int_{-\infty}^{x}\int_{-\infty}^{y}f_{\mathbf{x}\mathbf{y}}(u,v)\,dv\,du
$$  

Marginal PDF:

$$ 
f_{\mathbf{x}}(x)=\int_{-\infty}^{\infty}f_{\mathbf{x}\mathbf{y}}(x,y)dy
$$  

- Let $\mathbf{z}=\mathbf{x}+\mathbf{y}$ where $\mathbf{x}$ and $\mathbf{y}$ are independent, then 

$$f_{\mathbf{z}}(z)=f_{\mathbf{x}}(z) \ast f_{\mathbf{y}}(z)=\int_{-\infty}^{\infty}f_{\mathbf{x}}(u)f_{\mathbf{y}}(z-u)\,du$$

- All previous results can be generalized to vectors of random variables.
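
The convolution rule for the sum of two independent random variables has a direct discrete analogue, in which probability mass functions replace densities and a sum replaces the integral. The sketch below assumes two fair dice as the example. The helper name `convolve` is hypothetical.

```python
def convolve(p, q):
    """Discrete convolution: r[n] = sum_k p[k] q[n - k]."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r


die = [1 / 6] * 6                  # PMF of one fair die, index k <-> face k + 1
pmf_sum = convolve(die, die)       # PMF of the total, index n <-> total n + 2
p_seven = pmf_sum[7 - 2]           # probability that the two dice sum to 7
print(p_seven)
```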

## Real Scalar Gaussian Random Variable
- **PDF of a real scalar Gaussian random variable with mean $\mu$ and variance $\sigma^2$:**

  $$ 
  f_{\mathbf{x}}(x)=\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
  $$ 

- The random variable $\mathbf{x}$ is denoted by $\mathbf{x}\sim\mathcal{N}(\mu,\sigma^2)$ and the standard Gaussian by $\mathbf{w}\sim\mathcal{N}(0,1)$.
- The function $Q(\cdot)$ is the complementary CDF of the standard Gaussian and decreases exponentially:

  $$ 
  Q(a)=P(\mathbf{w}>a)=\int_a^\infty\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}} dx
  $$ 

  $$ 
  \frac{1}{\sqrt{2\pi}a}\left(1-\frac{1}{a^2}\right)e^{-\frac{a^2}{2}}<Q(a)<e^{-\frac{a^2}{2}}, \quad a>1
  $$  
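
The $Q$ function has no closed form, but it can be evaluated through the complementary error function using the identity $Q(a)=\frac{1}{2}\operatorname{erfc}(a/\sqrt{2})$. The sketch below uses that identity to check the bounds quoted above at a few test points. The function names and the test points are assumptions for illustration.

```python
import math


def q_function(a):
    """Q(a) = P(w > a) for w ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))


def q_bounds(a):
    """Lower and upper bounds on Q(a) quoted above (informative for a > 1)."""
    gauss = math.exp(-a * a / 2.0)
    lower = (1.0 / (math.sqrt(2.0 * math.pi) * a)) * (1.0 - 1.0 / (a * a)) * gauss
    return lower, gauss


for a in (1.5, 3.0, 5.0):
    lower, upper = q_bounds(a)
    print(a, lower, q_function(a), upper)
```

The lower bound becomes tight as $a$ grows, while the upper bound stays loose by a factor of roughly $\sqrt{2\pi}a$.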

## Real Gaussian Random Vector
- **PDF of a real Gaussian vector with mean $\vec{\mu}$ and covariance matrix $\vec{C}$:**

  $$ 
  f_{\vec{\mathbf{x}}}(\vec{x})=\frac{1}{(2\pi)^{(n/2)}\sqrt{\det \vec{C}}}e^{\frac{-1}{2}(\vec{x}-\vec{\mu})^T\vec{C}^{-1}(\vec{x}-\vec{\mu})} 
  $$  

  $$ 
  \vec{\mu}= E\{\vec{\mathbf{x}}\} \text{ and } \vec{C}=E\{(\vec{\mathbf{x}}-\vec{\mu})(\vec{\mathbf{x}}-\vec{\mu})^T\}
  $$  

- The vector $\vec{\mathbf{x}}$ is denoted by $\vec{\mathbf{x}} \sim \mathcal{N}(\vec{\mu},\vec{C})$
- Standard Gaussian vector: $n$ iid standard Gaussian elements:

  $$ 
  \vec{\mathbf{w}} \sim \mathcal{N}(0,\vec{I})
  $$  

  $$ 
  f_{\vec{\mathbf{w}}}(\vec{w})=\frac{1}{(2\pi)^{(n/2)}}e^{-\frac{||\vec{w}||^2}{2}}
  $$ 

- The PDF of the standard Gaussian vector depends only on the magnitude of $\vec{\mathbf{w}}$

## Real Gaussian Random Vector: Linear Transformations

- If $\vec{\mathbf{x}} \sim \mathcal{N}(\vec{\mu}_x,\vec{C}_x)$ then

  $$ 
  \vec{\mathbf{y}}=\vec{A}\vec{\mathbf{x}}+\vec{b} \sim \mathcal{N}(\vec{A}\vec{\mu}_x+\vec{b},\vec{A}\vec{C}_x\vec{A}^T)
  $$  

- Let $\vec{O}$ be an orthonormal transform ($\vec{O}^T\vec{O}=I$), then $\vec{O}\vec{\mathbf{w}}$ is also a standard Gaussian vector:
  - $\vec{\mathbf{w}}$ is invariant under coordinate system rotations
  - Projections of a standard Gaussian vector in orthonormal directions are independent
  - These results can be generalized for vectors of iid Gaussian elements with zero mean and variance $\sigma^2$
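
The affine-transformation rule can be checked empirically. The following is a Monte Carlo sketch in two dimensions, assuming a standard Gaussian input ($\vec{\mu}_x=0$, $\vec{C}_x=\vec{I}$) so that the predicted output mean is $\vec{b}$ and the predicted covariance is $\vec{A}\vec{A}^T$. The matrix `A`, the vector `b`, and the sample size are arbitrary illustrative choices.

```python
import random

rng = random.Random(1)             # fixed seed, for reproducibility only
A = [[2.0, 0.0], [1.0, 1.0]]       # assumed 2x2 matrix
b = [1.0, -1.0]                    # assumed offset vector
n = 100_000

samples = []
for _ in range(n):
    w = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]            # w ~ N(0, I)
    y = [A[i][0] * w[0] + A[i][1] * w[1] + b[i] for i in range(2)]
    samples.append(y)

# Sample mean and sample covariance of y.
mean = [sum(y[i] for y in samples) / n for i in range(2)]
cov = [[sum((y[i] - mean[i]) * (y[j] - mean[j]) for y in samples) / n
        for j in range(2)] for i in range(2)]

# Theory: mean = A*0 + b = b, covariance = A I A^T = A A^T.
cov_theory = [[sum(A[i][k] * A[j][k] for k in range(2)) for j in range(2)]
              for i in range(2)]
print(mean, cov, cov_theory)
```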

## Complex Random Vector
- For the analysis of communication systems, we use the baseband representation which requires complex random vectors:

  $$ 
  \vec{\mathbf{x}}=\vec{\mathbf{x}}_R+j\vec{\mathbf{x}}_I \quad \text{where $\vec{\mathbf{x}}_R$ and $\vec{\mathbf{x}}_I$ are real random vectors}
  $$  

- Additionally, in communication systems, we mainly work with complex random vectors that have the same distribution for different phase shifts:

  $$ 
  \text{$\vec{\mathbf{x}}$ is circularly symmetric if } e^{j\theta}\vec{\mathbf{x}}\sim\vec{\mathbf{x}}, \forall \theta
  $$  

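- Circular symmetry implies $E[\vec{\mathbf{x}}]=0$ and $E[\vec{\mathbf{x}}\vec{\mathbf{x}}^T]=0$, so the first and second-order statistics of a circularly symmetric vector are completely described by the covariance matrix $E[\vec{\mathbf{x}}\vec{\mathbf{x}}^H]$, where $(\cdot)^H$ denotes the Hermitian (conjugate) transpose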

## ZMCCSG
- $\mathbf{x}=\mathbf{x}_R+j\mathbf{x}_I$, where $\mathbf{x}_R$ and $\mathbf{x}_I$ are iid $\mathcal{N}(0,\frac{\sigma^2}{2})$, is a zero-mean complex circularly symmetric Gaussian (ZMCCSG) random variable with variance $\sigma^2$
- The ZMCCSG random variable $\mathbf{x}$ is denoted by $\mathbf{x}\sim\mathcal{CN}(0,\sigma^2)$ and the standard ZMCCSG random variable $\mathbf{w}$ is denoted by $\mathbf{w}\sim\mathcal{CN}(0,1)$
- The phase of $\mathbf{w}$ is uniformly distributed between 0 and $2\pi$, $|\mathbf{w}|$ follows a Rayleigh distribution, and $|\mathbf{w}|^2$ follows an exponential distribution
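
These properties can be checked by simulation. The sketch below draws samples of $\mathbf{w}\sim\mathcal{CN}(0,1)$ from two iid real Gaussians of variance $1/2$, then tests three consequences: $E\{|\mathbf{w}|^2\}=1$, the exponential tail $P(|\mathbf{w}|^2>1)=e^{-1}$, and a uniform phase, for which a quarter of the samples should fall in the first quadrant. The sample size and the chosen test statistics are assumptions for illustration.

```python
import cmath
import math
import random

rng = random.Random(2)             # fixed seed, for reproducibility only
sigma2 = 1.0                       # total variance of w ~ CN(0, sigma2)
std = math.sqrt(sigma2 / 2.0)      # each real component is N(0, sigma2 / 2)
n = 100_000

w = [complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) for _ in range(n)]

power = [abs(z) ** 2 for z in w]
mean_power = sum(power) / n                        # estimates E{|w|^2} = sigma2
# Exponential with mean sigma2: P(|w|^2 > t) = exp(-t / sigma2); here t = 1.
tail = sum(1 for p in power if p > 1.0) / n
# Uniform phase: about a quarter of the samples have phase in [0, pi/2).
first_quadrant = sum(1 for z in w if 0.0 <= cmath.phase(z) < math.pi / 2) / n
print(mean_power, tail, math.exp(-1.0), first_quadrant)
```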

## Complex Gaussian Vector
- A vector of $n$ iid elements $\mathcal{CN}(0,1)$ is a standard complex Gaussian vector $\vec{\mathbf{w}}\sim\mathcal{CN}(0,I)$

  $$ 
  f_{\vec{\mathbf{w}}}(\vec{w})=\frac{1}{\pi^{n}}e^{-||\vec{w}||^2}, \quad \vec{w} \in \mathbb{C}^n
  $$  

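- Let $\vec{A}$ be an invertible complex $n\times n$ matrix, with $(\cdot)^*$ denoting the conjugate transpose (written $(\cdot)^H$ above), then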

  $$ 
  \vec{\mathbf{x}}=\vec{A}\vec{\mathbf{w}}\sim\mathcal{CN}(0,\vec{K}=\vec{A}\vec{A}^*) 
  $$ 

  $$ 
  f_{\vec{\mathbf{x}}}(\vec{x})=\frac{1}{\pi^{n}\det\vec{K}}e^{-\vec{x}^*\vec{K}^{-1}\vec{x}}
  $$  

- Let $\vec{U}$ be a unitary transform ($\vec{U}^*\vec{U}=I$), then $\vec{U}\vec{\mathbf{w}}\sim\vec{\mathbf{w}}$:
  - $\vec{\mathbf{w}}$ has the same distribution in different orthonormal bases (i.e., $\vec{\mathbf{w}}$ is isotropic)
  - The projections of $\vec{\mathbf{w}}$ in orthonormal directions are independent
  - These results can be generalized for $\mathcal{CN}(0,\sigma^2I)$

## Definition of a Stochastic Process
**Definition:**
A stochastic process $\mathbf{x}(t)$ is a function of time $x(t,\zeta)$ that depends on the outcome $\zeta$ of a random experiment.

Interpretations of $\mathbf{x}(t)$:
- A set of functions when $t$ and $\zeta$ are variable
- A non-random function (or a sample of the process) when $t$ is variable and $\zeta$ is fixed
- A random variable when $t$ is fixed and $\zeta$ is variable
- A specific value when both $t$ and $\zeta$ are fixed

## Stationarity
- A process is strictly stationary if

  $$ 
  f_{\mathbf{x}(t_1),\dots,\mathbf{x}(t_n)}(x_1,\dots,x_n)=f_{\mathbf{x}(t_1+\tau),\dots,\mathbf{x}(t_n+\tau)}(x_1,\dots,x_n), \forall \tau, \forall n
  $$  

- A process is strictly cyclo-stationary if

  $$ 
  f_{\mathbf{x}(t_1),\dots,\mathbf{x}(t_n)}(x_1,\dots,x_n)=f_{\mathbf{x}(t_1+\tau),\dots,\mathbf{x}(t_n+\tau)}(x_1,\dots,x_n) \quad \tau=kT\text{ ($k$ is an integer)}, \forall n
  $$ 

- A process is wide-sense stationary (WSS) if

  $$ 
  E\{\mathbf{x}(t)\}=\mu, \forall t 
  $$  

  $$ 
  E\{\mathbf{x}(t)\mathbf{x}(t+\tau)\}=R(t,t+\tau)=R(\tau), \forall t, \forall \tau
  $$  

- A process is wide-sense cyclo-stationary if

  $$ 
  E\{\mathbf{x}(t)\}=E\{\mathbf{x}(t+kT)\}, \forall t, \text{ ($k$ is an integer)} 
  $$  

  $$ 
  R(t_1,t_2)=R(t_1+kT,t_2+kT)
  $$  

- If $\mathbf{x}(t)$ is wide-sense cyclo-stationary, then $\mathbf{x}(t+\theta)$, with $\theta\sim U(0,T)$ independent of $\mathbf{x}(t)$, is wide-sense stationary.

## Power Spectral Density
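- The power spectral density of a continuous-time stationary process is the Fourier transform of its autocorrelation function $R(\tau)$, and the autocorrelation is recovered by the inverse transform: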

  $$ 
  S(f)=\int_{-\infty}^{\infty}R(\tau)e^{-j2\pi f\tau}d\tau 
  $$  

  $$ 
  R(\tau)=\int_{-\infty}^{\infty}S(f)e^{j2\pi f\tau}df
  $$  

- The power spectral density of a discrete-time stationary process $\{\mathbf{x}(n)\}$, where $n$ is an integer, is defined as follows:

  $$ 
  R(n)=E\{\mathbf{x}(m)\mathbf{x}(m+n)\} 
  $$ 

  $$ 
  S(f)=\sum_{n=-\infty}^{\infty}R(n)e^{-j2\pi f n}
  $$  
  
  $$ 
  R(n)=\int_{-1/2}^{1/2}S(f)e^{j2\pi f n}df
  $$  
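
As a concrete case of the discrete-time definitions, consider a first-order moving-average process $\mathbf{x}(n)=\mathbf{w}(n)+a\,\mathbf{w}(n-1)$, where $\mathbf{w}(n)$ is assumed to be zero-mean white noise with unit variance. Its autocorrelation is $R(0)=1+a^2$, $R(\pm 1)=a$, and zero elsewhere. The sketch below evaluates $S(f)$ from the defining sum and checks that integrating $S(f)$ over one period of the normalized frequency recovers $R(0)$. The process, the coefficient `a`, and the function names are assumptions for illustration.

```python
import cmath
import math

a = 0.5                                    # assumed MA(1) coefficient
R = {-1: a, 0: 1.0 + a * a, 1: a}          # autocorrelation of x(n) = w(n) + a*w(n-1)


def psd(f):
    """S(f) = sum_n R(n) exp(-j 2 pi f n); real because R(n) = R(-n)."""
    return sum(r * cmath.exp(-2j * math.pi * f * n) for n, r in R.items()).real


def psd_closed_form(f):
    """Closed form for this example: S(f) = 1 + a^2 + 2 a cos(2 pi f)."""
    return 1.0 + a * a + 2.0 * a * math.cos(2.0 * math.pi * f)


# Midpoint-rule integral of S(f) over one period [-1/2, 1/2] recovers R(0).
N = 1000
r0 = sum(psd(-0.5 + (k + 0.5) / N) for k in range(N)) / N
print(psd(0.0), psd(0.5), r0)
```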

## Power
Power in a band from $f_1$ to $f_2$:

$$ 
P_{f_1-f_2}=\int_{f_1}^{f_2}S(f)df
$$  

Physical interpretations:
- $E\{\mathbf{x}(t)\}$: DC component
- $E\{\mathbf{x}^2(t)\}=R(0)=\int_{-\infty}^{\infty}S(f)df$: total power
- $E\{\mathbf{x}(t)\}^2$: DC power
- $V\{\mathbf{x}(t)\}=E\{\mathbf{x}^2(t)\}-E\{\mathbf{x}(t)\}^2$: AC power

## Ergodicity
- A stationary process is ergodic if
  $$ 
  E\{\mathbf{x}^m(t_1)\}=\lim_{T\rightarrow\infty}\frac{1}{2T}\int_{-T}^T\mathbf{x}^m(t)dt, \quad \forall m
  $$  
- For an ergodic process, the statistics and power spectral density can be extracted from a single realization of the process
- Complete ergodicity is difficult to demonstrate but is often assumed
- A process is ergodic in the mean if the previous relation is true for $m=1$
- A stationary process is ergodic in the mean if
  $$ 
  \lim_{T\rightarrow\infty}\frac{1}{T}\int_{0}^T(R(\tau)-\mu^2)d\tau=0
  $$  
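
Ergodicity in the mean can be illustrated with a discrete-time sketch, where a sum over samples stands in for the time integral. The two processes below are assumptions chosen for contrast. The first, $\mathbf{x}(n)=\mu+\mathbf{w}(n)$ with iid $\mathbf{w}(n)$, is ergodic in the mean. The second, $\mathbf{x}(n)=\mathbf{a}$ with a single random draw $\mathbf{a}$, is stationary but not ergodic in the mean.

```python
import random

rng = random.Random(3)       # fixed seed, for reproducibility only
mu = 2.0                     # assumed ensemble mean
n = 50_000                   # length of the single realization

# Ergodic in the mean: x(n) = mu + w(n), with w(n) iid N(0, 1).
realization = [mu + rng.gauss(0.0, 1.0) for _ in range(n)]
time_average = sum(realization) / n

# Not ergodic in the mean: x(n) = a for all n, with a ~ N(mu, 1) drawn once.
a = rng.gauss(mu, 1.0)
constant_realization = [a] * n
constant_time_average = sum(constant_realization) / n

print(time_average, constant_time_average)
```

The first time average approaches $\mu$, while the second equals the drawn value of $\mathbf{a}$ regardless of the realization length. This matches the criterion above: for the second process $R(\tau)-\mu^2=1$ for all $\tau$, so the averaged integral does not tend to zero.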

## Linear Systems
The WSS process $\mathbf{x}(t)$ is filtered by a linear time-invariant system with frequency response $H(f)$ to produce $\mathbf{y}(t)$, as follows:

![Linear System](./Figures/sys_lineaire.png)

- $\mathbf{y}(t)$ is a WSS process
- The power spectral density of $\mathbf{y}(t)$ is given by:

  $$ 
  S_{\mathbf{y}}(f)=S_{\mathbf{x}}(f)\left|H(f)\right|^2
  $$  
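
The input-output relation for the power spectral density can be sketched numerically in discrete time. The example assumes a white-noise input with flat spectrum $S_{\mathbf{x}}(f)=\sigma^2$ and a short FIR filter. Both the filter taps `h` and the value of `sigma2` are arbitrary illustrative choices. The output power obtained by integrating $S_{\mathbf{y}}(f)$ over one period is compared with the time-domain value $\sigma^2\sum_k h_k^2$.

```python
import cmath
import math

h = [0.5, 0.3, 0.2]          # assumed FIR impulse response
sigma2 = 2.0                 # assumed white-noise input: S_x(f) = sigma2 for all f


def H(f):
    """Frequency response H(f) = sum_k h[k] exp(-j 2 pi f k)."""
    return sum(hk * cmath.exp(-2j * math.pi * f * k) for k, hk in enumerate(h))


def S_y(f):
    """Output PSD S_y(f) = S_x(f) |H(f)|^2."""
    return sigma2 * abs(H(f)) ** 2


# Output power: integrate S_y(f) over one period [-1/2, 1/2] (midpoint rule).
N = 1000
power_from_psd = sum(S_y(-0.5 + (k + 0.5) / N) for k in range(N)) / N
# Time-domain check (Parseval): sigma2 * sum_k h[k]^2.
power_time_domain = sigma2 * sum(hk * hk for hk in h)
print(S_y(0.0), power_from_psd, power_time_domain)
```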

## Gaussian Process
- A process $\mathbf{x}(t)$ is Gaussian if

  $$ 
  \vec{\mathbf{x}}=\begin{pmatrix}
  \mathbf{x}(t_1) \\
  \vdots \\
  \mathbf{x}(t_n)
  \end{pmatrix}=
  \begin{pmatrix}
  \mathbf{x}_1 \\
  \vdots \\
  \mathbf{x}_n
  \end{pmatrix} \quad \forall n,\ \forall t_1,\dots,t_n
  $$  
  
  is a Gaussian random vector.
- The output of a linear time-invariant system with a WSS Gaussian process as input is also a WSS Gaussian process.