With every random variable $X$, we associate a function called the cumulative distribution function of $X$.

## Cumulative distribution function (cdf)

The cumulative distribution function (cdf) of a random variable $X$, denoted by $F_X(x)$, is defined by

$$F_{X}(x)=P_{X}(X\leq x),\forall x$$

The cdf is defined for both discrete and continuous random variables:
- If $X$ is discrete: $F(x)$ is a step-function.
- If $X$ is continuous: $F(x)$ is a continuous function.
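To make the two cases concrete, here is a minimal Python sketch. The fair six-sided die and the Uniform(0, 1) variable are illustrative choices that do not come from the text above, and the names `die_cdf` and `uniform_cdf` are hypothetical.

```python
from fractions import Fraction

def die_cdf(x):
    """cdf of a fair six-sided die (assumed example): F(x) = P(X <= x)."""
    # Count the faces 1..6 that are <= x; each face has probability 1/6.
    return Fraction(sum(1 for face in range(1, 7) if face <= x), 6)

def uniform_cdf(x):
    """cdf of a Uniform(0, 1) random variable (assumed example)."""
    # F(x) = 0 for x < 0, x on [0, 1], and 1 for x > 1.
    return min(max(x, 0.0), 1.0)

# Step function: flat between integers, with a jump at each face value.
print(die_cdf(2), die_cdf(2.9), die_cdf(3))
# Continuous function: small changes in x give small changes in F(x).
print(uniform_cdf(0.25), uniform_cdf(0.26))
```

The die cdf stays at $\frac13$ on $[2,3)$ and jumps to $\frac12$ at $x=3$, while the uniform cdf has no jumps.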

## Properties of a cdf
### Theorem 1

A function $F(x)$ is a cdf **if and only if** the following 3 conditions hold:

1. $\lim_{x\to-\infty}F(x)=0$ and $\lim_{x\to\infty}F(x)=1$
2. $F(x)$ is a non-decreasing function of $x$
3. $F(x)$ is right continuous. That is, $\forall x_0\in \mathbb{R}$ we have $\lim_{x\downarrow x_0}F(x)=F(x_0)$
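The three conditions can be spot-checked numerically. The sketch below is only a sanity check on a finite grid, not a proof: it can miss problems between grid points or outside the grid. The helper `spot_check_cdf`, the logistic candidate, and the two step functions are all hypothetical examples introduced here.

```python
import math

def logistic(x):
    """Candidate function: the logistic curve 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def right_step(x):
    """Step at 0 that is right continuous: F(0) = 1."""
    return 1.0 if x >= 0 else 0.0

def left_step(x):
    """Step at 0 that is only left continuous: F(0) = 0."""
    return 1.0 if x > 0 else 0.0

def spot_check_cdf(F, lo=-50.0, hi=50.0, n=2001, tol=1e-6):
    """Numerically spot-check the three conditions on a grid (not a proof)."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [F(x) for x in xs]
    # Condition 1: F is near 0 far to the left and near 1 far to the right.
    limits_ok = abs(ys[0]) < tol and abs(ys[-1] - 1.0) < tol
    # Condition 2: F never decreases along the grid (tiny float slack allowed).
    monotone_ok = all(later >= earlier - 1e-12 for earlier, later in zip(ys, ys[1:]))
    # Condition 3: F(x + h) stays close to F(x) for a tiny h > 0 at each grid point.
    right_cont_ok = all(abs(F(x + 1e-9) - F(x)) < tol for x in xs)
    return limits_ok and monotone_ok and right_cont_ok

print(spot_check_cdf(logistic), spot_check_cdf(right_step), spot_check_cdf(left_step))
```

The left-continuous step fails condition 3 at $x_0=0$, because $F(0)=0$ while $F(x)=1$ for every $x>0$.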

## Identically distributed

Random variables $X$ and $Y$ are identically distributed (id) if for every $A\in\mathcal{B}^1$ we have

$$P(X\in A)=P(Y\in A)$$

Here $\mathcal{B}^1$ denotes the smallest $\sigma$-algebra containing the half-open intervals of $\mathbb{R}$ (the Borel $\sigma$-algebra on $\mathbb{R}$).

### Theorem

The following are equivalent:
1. $X$ and $Y$ are identically distributed ($X\overset{D}{\operatorname*{=}}Y$)
2. $F_X(t) = F_Y(t)$ for all $t\in\mathbb{R}$

**Identically distributed does not necessarily mean equal**

$X\overset{D}{\operatorname*{=}}Y$ does NOT necessarily mean that $X=Y$

Consider the following example:

- Let $X$ be a random variable defined on a sample space $S = \{s_1, s_2\}$, where

  $$X(s_1) = 0, \quad X(s_2) = 1.$$

- Let $Y$ be a different random variable on the same sample space:

  $$Y(s_1) = 1, \quad Y(s_2) = 0.$$

- Suppose $P(\{s_1\}) = P(\{s_2\}) = 0.5$. Then $X$ and $Y$ have the same distribution:

  $$P(X = 0) = 0.5, \quad P(X = 1) = 0.5,$$

  and similarly

  $$P(Y = 0) = 0.5, \quad P(Y = 1) = 0.5.$$

- Therefore, $X \overset{D}{=} Y$ (they are identically distributed).

- However, $X \neq Y$ because they take different values on the same outcomes: $X(s_1) = 0$ but $Y(s_1) = 1$, so they are not the same random variable.

### Conclusion

When $X \overset{D}{=} Y$, it only means that $X$ and $Y$ have the same probability distribution. It does not mean they are equal as random variables: they may assign different values to individual outcomes.
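The example can be reproduced in a few lines of Python. This is a minimal sketch: the dictionary encoding of the sample space and the helper `distribution` are hypothetical choices made here for illustration.

```python
from collections import Counter
from fractions import Fraction

# Sample space S = {s1, s2} with P(s1) = P(s2) = 1/2, as in the example.
P = {"s1": Fraction(1, 2), "s2": Fraction(1, 2)}
X = {"s1": 0, "s2": 1}
Y = {"s1": 1, "s2": 0}

def distribution(rv, prob):
    """Map each value of the random variable to its total probability."""
    dist = Counter()
    for outcome, p in prob.items():
        dist[rv[outcome]] += p
    return dict(dist)

same_distribution = distribution(X, P) == distribution(Y, P)
equal_everywhere = all(X[s] == Y[s] for s in P)

print(same_distribution)  # True: X and Y are identically distributed
print(equal_everywhere)   # False: X(s) != Y(s) for each outcome s
```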

## Probability mass function (pmf)

The *pmf* of a **discrete random variable** is defined as

$$f(x)=P_X(X=x),\forall x\in\mathbb{R}$$

- If $X$ is a discrete random variable with cdf $F(x)$ and pmf $f(x)$ then 

$$F(x) = P_{X}(X\leq x) = \sum_{t\leq x}f(t)=\sum_{t\leq x}P(X=t), \forall x\in\mathbb{R}$$

- Note that both $F(x)$ and pmf $f(x)$ are defined for all $x\in\mathbb{R}$
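As an illustration of the sum formula, the sketch below builds a cdf from a pmf. The random variable (number of heads in two fair coin tosses) is an assumed example, and the names `SUPPORT_PMF`, `pmf`, and `cdf` are hypothetical.

```python
from fractions import Fraction

# Assumed example: X = number of heads in two fair coin tosses.
SUPPORT_PMF = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def pmf(x):
    """f(x) = P(X = x); zero for every real x outside the support."""
    return SUPPORT_PMF.get(x, Fraction(0))

def cdf(x):
    """F(x) = sum of f(t) over the support points t <= x."""
    return sum((p for t, p in SUPPORT_PMF.items() if t <= x), Fraction(0))

# Both functions accept any real x, not only the support points 0, 1, 2.
print(pmf(0.5), cdf(0.5))
print(pmf(1), cdf(1.5))
```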

## Probability density function (pdf)

The probability density function (pdf)  of a continuous random variable $X$ is a function $f_X(x)$ that satisfies

$$F_X(x)=P(X\leq x)=\int_{-\infty}^xf_X(u)du$$

- By convention, we write $F(x)$ for a cdf and $f(x)$ for a pmf or pdf.

## Properties of a pdf or pmf

### Theorem

A function $f_X(x)$ is a pdf or pmf of a random variable $X$ if and only if
1. $f_X(x)\geq0,\forall x\in\mathbb{R}$

2. $\begin{cases}\sum_{x}f_X(x)=1,&\text{if } X \text{ is discrete}.\\\int_{-\infty}^{\infty}f_X(x)\,dx=1,&\text{if } X \text{ is continuous}.\end{cases}$

- If $f(x)$ is a pdf/pmf then (1) and (2) hold.
- If (1) and (2) hold for some function $f(x)$, then $f(x)$ is pdf or pmf.

### To determine the distribution of a random variable
- To determine the probability distribution of a random variable it is sufficient to know either $F$ or $f$
- Say we know $f(x)$. Then we can determine $F(x)$
    - If $X$ is discrete:
    $$F(x)=\sum_{u\leq x}f(u)$$
    - If $X$ is continuous:
    $$F(x)=\int_{-\infty}^xf(u)du$$
- Say we know $F(x)$. Then we can determine $f(x)$
    - If $X$ is discrete:
    $$f(x)=F(x)-\lim_{u\uparrow x}F(u)$$
    - If $X$ is continuous and the derivative of $F$ exists:
    $$f(x)=\frac d{dx}F(x)$$
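Going from $F$ back to $f$ can be sketched numerically. In the code below, the left limit is approximated by evaluating $F$ slightly to the left of $x$, and the derivative by a central difference. The fair die and the exponential cdf $1-e^{-x}$ are assumed examples, and the helpers `jump` and `derivative` are hypothetical.

```python
import math
from fractions import Fraction

def die_cdf(x):
    """cdf of a fair six-sided die (assumed discrete example)."""
    return Fraction(sum(1 for face in range(1, 7) if face <= x), 6)

def jump(F, x, eps=1e-9):
    """Approximate f(x) = F(x) - (left limit of F at x) using F(x - eps)."""
    return F(x) - F(x - eps)

def exp_cdf(x):
    """cdf of an Exponential(1) variable (assumed continuous example)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def derivative(F, x, h=1e-6):
    """Central-difference approximation to dF/dx at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

print(jump(die_cdf, 3))    # mass at a face value
print(jump(die_cdf, 2.5))  # no mass between face values
print(derivative(exp_cdf, 1.0), math.exp(-1.0))  # both close to e^(-1)
```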

### More on cdfs

### Theorem

Let $X$ be a random variable (discrete or continuous) with cdf $F$ and let $a,b\in\mathbb{R}$ where $a<b$. Then

1. $P(a<X\leq b)=F(b)-F(a)$
2. $P(a<X\leq b)=P(X\leq b)-P(X\leq a)$
3. $P(a<X\leq b)=P(\{s\in S: a<X(s)\leq b\})$

Let $X$ be a **continuous random variable**. Then

$\begin{aligned}
F(b)-F(a)& =P(a<X\leq b) \\
&=P(a\leq X\leq b) \\
&=P(a\leq X<b) \\
&=P(a<X<b)
\end{aligned}$
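The identity $P(a<X\leq b)=F(b)-F(a)$ holds for any random variable, but the chain of equalities with $\leq$ replaced by $<$ relies on continuity. The sketch below uses a fair die (an assumed discrete example with hypothetical helper names) to show where the chain breaks when $P(X=a)>0$.

```python
from fractions import Fraction

FACES = range(1, 7)

def die_cdf(x):
    """cdf of a fair six-sided die (assumed example)."""
    return Fraction(sum(1 for face in FACES if face <= x), 6)

def prob(event):
    """P(X in event) by direct counting; `event` is a predicate on faces."""
    return Fraction(sum(1 for face in FACES if event(face)), 6)

a, b = 2, 5
half_open = prob(lambda k: a < k <= b)   # P(a < X <= b)
closed = prob(lambda k: a <= k <= b)     # P(a <= X <= b)

print(half_open == die_cdf(b) - die_cdf(a))  # True for any random variable
print(closed - half_open)                    # equals P(X = a), nonzero for the die
```

For a continuous random variable $P(X=a)=0$, so the two probabilities coincide.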

## Example: Partly discrete and partly continuous $X$

Such a random variable can be used, for example, to model truncated observations:

$$F(x)=\left\{\begin{array}{ll}0&\mathrm{if~}x<0\\\frac12+\frac x2&\mathrm{if~}0\leq x\leq1\\1&\mathrm{if~}x>1\end{array}\right.$$

![Description text](1-6-1.png)

`A point mass at zero` means that a random variable takes the value 0 with **positive probability**. Let’s assume we have a random variable $X$ with a point mass at zero. This could be expressed as:

$$P(X=0)=\alpha$$

where $0<\alpha\leq 1$ is the probability that $X$ equals zero. The remaining probability, $1-\alpha$, is distributed over other possible outcomes of $X$ (possibly continuously over some range).

### CDF representation

For a distribution with a point mass at zero, the cumulative distribution function (CDF) would reflect a jump at $x=0$. For example:

$$F(x)=\begin{cases}0&\text{if }x<0\\\alpha&\text{if }x=0\\F_{\text{cont}}(x)&\text{if }x>0\end{cases}$$

- $F_\text{cont}(x)$ represents the CDF for values greater than zero, which could be continuous or discrete.
- The CDF jumps from 0 to $\alpha$ at $x=0$, indicating the point mass.

Therefore, the cdf of our example can be written as:

$$F(x)=\begin{cases}0&\text{if }x<0\\\frac{1}{2}&\text{if }x=0\\\frac{1}{2}+\frac{1}{2}x&\text{if }x\in(0,1]\\1&\text{if }x>1\end{cases}$$

$$f(x)=\begin{cases}0&\text{if }x<0\\\frac{1}{2}&\text{if }x=0\\\frac{1}{2}&\text{if }x\in(0,1]\\0&\text{if }x>1\end{cases}$$

On $(0,1]$ the density is $\frac{d}{dx}\left(\frac12+\frac x2\right)=\frac12$.

That means our $X$ is partly discrete and partly continuous: at $x=0$ there is a point mass (a pmf value), and for $0<x\leq1$ there is a density (a pdf). It is a kind of mixture distribution.
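Probabilities for this example can be read directly off the cdf. The sketch below uses exact fractions; the helper `left_limit` is hypothetical and approximates the left limit by stepping slightly left of $x$, which happens to be exact at $x=0$ here because $F$ is constant for $x<0$.

```python
from fractions import Fraction

def F(x):
    """cdf from the example: jump of 1/2 at 0, then linear up to 1 at x = 1."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x <= 1:
        return Fraction(1, 2) + x / 2
    return Fraction(1)

def left_limit(x, eps=Fraction(1, 10**9)):
    """Approximate the limit of F(u) as u increases to x."""
    return F(Fraction(x) - eps)

point_mass_at_zero = F(0) - left_limit(0)      # P(X = 0), the discrete part
continuous_part = F(1) - F(0)                  # P(0 < X <= 1), the continuous part
lower_half = F(Fraction(1, 2)) - F(0)          # P(0 < X <= 1/2)

print(point_mass_at_zero, continuous_part, lower_half)
```

The point mass and the continuous part each carry probability $\frac12$, so together they account for all of the probability.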

### Another solution

If $F_1(x)$ and $F_2(x)$ are both cdfs and $\alpha\in[0,1]$, then $F(x)=\alpha F_{1}(x)+(1-\alpha)F_{2}(x)$ is also a cdf.

Here we have, with $\alpha=\frac12$:
- for the discrete part (a point mass at zero),

$$F_{1}(x)=\begin{cases}0&x<0\\1&x\geq0\end{cases}$$

- for the continuous part (uniform on $[0,1]$),

$$F_2(x)=\begin{cases}0&x<0\\x&x\in[0,1]\\1&x>1\end{cases}$$

Then $F(x)=\alpha F_{1}(x)+(1-\alpha)F_{2}(x)$ gives

$$F(x)=\begin{cases}\frac{1}{2}\cdot0+\frac{1}{2}\cdot0=0&\mathrm{if~}x<0\\\frac{1}{2}\cdot1+\frac{1}{2}x=\frac12+\frac x2&\mathrm{if~}x\in[0,1]\\\frac{1}{2}\cdot1+\frac{1}{2}\cdot1=1&\mathrm{if~}x>1\end{cases}$$
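As a quick check, the mixture formula can be compared with the piecewise cdf of the example on a grid of points. This is a minimal sketch: the function names and the choice of grid are hypothetical, and agreement on a grid is a sanity check rather than a proof.

```python
from fractions import Fraction

def F1(x):
    """cdf of a point mass at zero (discrete part)."""
    return Fraction(0) if x < 0 else Fraction(1)

def F2(x):
    """cdf of the continuous part: 0 below 0, x on [0, 1], 1 above 1."""
    if x < 0:
        return Fraction(0)
    if x <= 1:
        return Fraction(x)
    return Fraction(1)

def mixture(x, alpha=Fraction(1, 2)):
    """F(x) = alpha * F1(x) + (1 - alpha) * F2(x)."""
    return alpha * F1(x) + (1 - alpha) * F2(x)

def F_direct(x):
    """Piecewise cdf of the example, written out directly."""
    if x < 0:
        return Fraction(0)
    if x <= 1:
        return Fraction(1, 2) + Fraction(x) / 2
    return Fraction(1)

# Grid from -1 to 2 in steps of 1/10, using exact fractions.
grid = [Fraction(k, 10) for k in range(-10, 21)]
print(all(mixture(x) == F_direct(x) for x in grid))
```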