# Seminar 4

## Random variables

A **random variable** is a function from the sample space to the real numbers, $X: S \to \mathbb{R}$.

This means that every outcome $\omega \in S$ is assigned a real number $X(\omega)$.

Not every function qualifies (it has to be measurable), but this is slightly beyond the scope of our course.
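As an illustration of the definition, here is a minimal sketch. The two-dice sample space and the choice of $X$ as the sum of the faces are assumptions made for this example only; they do not come from the seminar.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered outcomes of two fair six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

def X(omega):
    """Random variable: maps an outcome (d1, d2) to the real number d1 + d2."""
    return omega[0] + omega[1]

# All outcomes are equally likely, so P(X = 7) is the share of outcomes with X(omega) = 7.
favourable = sum(1 for omega in sample_space if X(omega) == 7)
prob_seven = Fraction(favourable, len(sample_space))
print(prob_seven)  # 1/6
```

The random variable itself is deterministic; the randomness comes only from which outcome $\omega$ occurs.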

## Distribution of a random variable

Consider a random variable $X: S \to \mathbb{R}$. We introduce **the distribution** (or distribution law) $\mathcal{L}$ of $X$. The distribution assigns probabilities to sets of real numbers in the same way as the probability function $P$ assigns probabilities to events: for $B \subseteq \mathbb{R}$, $\mathcal{L}(B) = P(X \in B)$. We write $X \sim \mathcal{L}$.

## Two types of distributions

A probability distribution can be **discrete** or **continuous**.

Discrete random variables take at most countably many values (think integers), whereas continuous random variables take uncountably many values (think reals).

There is also a third type of distribution (singular), which you never encounter in practice. A distribution can also be a mixture of several types, which you do not normally encounter either.

## Example 1

Consider an event $A$ and the random variable $X = \mathbb{I}\text{nd}_A$, the indicator of $A$:
$$
\mathbb{I}\text{nd}_A(\omega) = \begin{cases}
1, & \omega \in A, \\
0, & \text{else}
\end{cases}
$$

$$
\mathbb{P}(X = 1) = \mathbb{P}(A) = p
$$

$$
\mathbb{P}(X = 0) = 1 - \mathbb{P}(A) = 1 - p
$$

We say that $X$ follows the **Bernoulli distribution** with parameter $p$ and write $X \sim Be(p)$.

We will call $\mathbb{P}_X(x) = \mathbb{P}(X = x)$, viewed as a function of the value $x$, the **probability mass function** (PMF) of $X$.

## Bernoulli trial scheme

Previously we worked with independent events within a single probability space. Sometimes, however, we run multiple trials: the probability space of each trial is known, and we are interested in a probability space that covers all the trials at once. We can build it as the direct product of the individual probability spaces.

If all probability spaces are the same and equal to:
- $S = \{0, 1\}$
- $\mathbb{P}(1) = p$ and $\mathbb{P}(0) = 1 - p$

Then we call such an experiment a **Bernoulli trial scheme**, and its probability space is:
- $S = \{(i_1, \ldots, i_n) : i_j \in \{0, 1\}\}$
- $\mathbb{P}(i_1, \ldots, i_n) = p^{k} (1 - p)^{n - k}$, where $k = \sum_{j=1}^n i_j$ is the number of indices $j$ such that $i_j = 1$
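The product probability can be sketched as follows. The function name `sequence_probability` and the values $n = 4$, $p = 0.3$ are illustrative assumptions; the check simply confirms that the probabilities of all $2^n$ sequences add up to one.

```python
from itertools import product

def sequence_probability(outcome, p):
    """Probability of one 0/1 sequence in the Bernoulli trial scheme."""
    ones = sum(outcome)            # number of j with i_j = 1
    zeros = len(outcome) - ones    # number of j with i_j = 0
    return p ** ones * (1 - p) ** zeros

n, p = 4, 0.3  # assumed example values
total = sum(sequence_probability(seq, p) for seq in product((0, 1), repeat=n))
print(total)  # close to 1, up to floating-point rounding
```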

## Example 2

Consider independent random variables $X_1, \ldots, X_n \sim Be(p)$. Then $Y = \sum_{k=1}^n X_k$ follows the **Binomial distribution** with parameters $n$ and $p$, written $Y \sim Bi(n, p)$. What is $\mathbb{P}(Y = k)$?

## Solution 2

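Each outcome of the Bernoulli trial scheme with exactly $k$ ones has probability $p^k (1-p)^{n-k}$, and there are $\binom{n}{k}$ such outcomes. Hence, if $Y \sim Bi(n, p)$, then for $k = 0, 1, \ldots, n$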
$$
\mathbb{P}(Y = k) = \begin{pmatrix}n\\k\end{pmatrix} p^k (1-p)^{n-k}
$$
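To make the formula concrete, the sketch below compares it with a brute-force sum over all 0/1 sequences of length $n$. The function names and the test values are assumptions for illustration.

```python
from itertools import product
from math import comb

def binomial_pmf(k, n, p):
    """Closed form: C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_pmf_bruteforce(k, n, p):
    """Sum the probabilities of all 0/1 sequences of length n with exactly k ones."""
    return sum(
        p**k * (1 - p) ** (n - k)
        for seq in product((0, 1), repeat=n)
        if sum(seq) == k
    )

n, p = 5, 0.3  # assumed example values
for k in range(n + 1):
    print(k, binomial_pmf(k, n, p), binomial_pmf_bruteforce(k, n, p))
```

The brute-force version enumerates $2^n$ sequences, so it is only practical for small $n$.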

## Example 3

We say $X$ follows the discrete uniform distribution on $\{1, \ldots, n\}$ and write $X \sim DU([1, n])$ if
$$
\mathbb{P}(X = k) = \frac1n, \quad k = 1, \ldots, n
$$

## Example 4

Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls out of the urn at random without replacement. Let $X$ be the number of white balls in the sample. What is the distribution of $X$? What is its PMF?

## Solution 4

$X$ follows the **hypergeometric distribution**, $X \sim HGeom(w, b, n)$. For $\max(0, n - b) \leqslant k \leqslant \min(w, n)$,
$$
\mathbb{P}(X = k) = \frac{\begin{pmatrix}w\\k\end{pmatrix}\begin{pmatrix}b\\n-k\end{pmatrix}}{\begin{pmatrix}w+b\\n\end{pmatrix}}
$$
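A minimal sketch of this PMF, assuming the parameter order `(k, w, b, n)` and returning zero outside the support; the urn sizes in the example are made up.

```python
from math import comb

def hypergeom_pmf(k, w, b, n):
    """P(X = k): k white balls among n drawn without replacement from w white and b black."""
    # Outside the support we cannot pick k white or n - k black balls.
    if k < 0 or k > w or n - k < 0 or n - k > b:
        return 0.0
    return comb(w, k) * comb(b, n - k) / comb(w + b, n)

w, b, n = 5, 3, 4  # assumed example values
pmf = {k: hypergeom_pmf(k, w, b, n) for k in range(n + 1)}
print(pmf)
```

Here $k = 0$ has probability zero, because only three black balls are available for four draws.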

## Example 4 (continued)

What is the difference between hypergeometric and binomial distributions?

Reminder:
- Binomial story: Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls from the urn with replacement. Let $X$ be the number of white balls in the sample.
- Hypergeometric story: Consider an urn with $w$ white balls and $b$ black balls. We draw $n$ balls out of the urn at random without replacement. Let $X$ be the number of white balls in the sample.

The Bernoulli trials in the binomial story are independent, each with success probability $w/(w+b)$. The Bernoulli trials in the hypergeometric story are dependent, since the sampling is done without replacement.
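The two stories can be compared numerically. The sketch below assumes $p = w/(w+b)$ for the binomial story and uses made-up urn sizes. When the urn is large compared with the sample, removing a ball barely changes the proportions, so the two PMFs should nearly coincide.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) when drawing n balls with replacement, success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeom_pmf(k, w, b, n):
    """P(X = k) when drawing n balls without replacement from w white and b black."""
    if k < 0 or k > w or n - k < 0 or n - k > b:
        return 0.0
    return comb(w, k) * comb(b, n - k) / comb(w + b, n)

def max_pmf_gap(w, b, n):
    """Largest absolute difference between the two PMFs over k = 0..n."""
    p = w / (w + b)
    return max(
        abs(binomial_pmf(k, n, p) - hypergeom_pmf(k, w, b, n))
        for k in range(n + 1)
    )

small_urn_gap = max_pmf_gap(3, 2, 3)        # sample is most of the urn
large_urn_gap = max_pmf_gap(3000, 2000, 3)  # sample is tiny compared with the urn
print(small_urn_gap, large_urn_gap)
```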

## Example 5

Let $X$ and $Y$ be independent $\mathbb{Z}$-valued random variables. What is $\mathbb{P}(X + Y = k)$?

## Solution 5

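The event $\{X + Y = k\}$ is the disjoint union of the events $\{X = m, Y = k - m\}$ over $m \in \mathbb{Z}$, and by independence each of them has probability $\mathbb{P}(X = m) \mathbb{P}(Y = k - m)$:
$$
\mathbb{P}(X + Y = k) = \sum_{m \in \mathbb{Z}} \mathbb{P}(X = m) \mathbb{P}(Y = k - m)
$$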
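This discrete convolution can be sketched in a few lines. Representing a PMF as a dictionary `{value: probability}` and the helper name `pmf_of_sum` are assumptions of this example; the fair die is just a convenient $DU([1, 6])$ test case.

```python
from collections import defaultdict
from fractions import Fraction

def pmf_of_sum(pmf_x, pmf_y):
    """PMF of X + Y for independent integer-valued X and Y given as {value: probability}."""
    result = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for m, px in pmf_x.items():
        for j, py in pmf_y.items():
            # The pair (X = m, Y = j) contributes to the value k = m + j.
            result[m + j] += px * py
    return dict(result)

die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = pmf_of_sum(die, die)
print(two_dice[7])  # 1/6
```

Looping over pairs $(m, j)$ and accumulating at $k = m + j$ is the same sum as in the formula, just grouped by the pair rather than by $k$.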

## Example 6

Let $X \sim Bi(n, p)$ and $Y \sim Bi(m, p)$ be independent. What is the distribution of $Z = X + Y$?

## Solution 6

$$
\begin{aligned}
\mathbb{P}(X + Y = k) & = \sum_j \begin{pmatrix}n\\j\end{pmatrix} p^j (1-p)^{n-j} \begin{pmatrix}m\\k-j\end{pmatrix} p^{k-j} (1-p)^{m-k+j} = \\
& = p^{k} (1-p)^{n+m-k} \sum_j \begin{pmatrix}n\\j\end{pmatrix} \begin{pmatrix}m\\k-j\end{pmatrix} = \\
& =\begin{pmatrix}n+m\\k\end{pmatrix} p^{k} (1-p)^{n+m-k}
\end{aligned}
$$

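The last step uses Vandermonde's identity $\sum_j \binom{n}{j} \binom{m}{k-j} = \binom{n+m}{k}$. Therefore
$$
Z \sim Bi(n+m, p)
$$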
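A quick numerical sanity check of this result, under assumed values $n = 4$, $m = 6$, $p = 0.35$: the convolution sum is evaluated directly and compared with the $Bi(n+m, p)$ PMF.

```python
from math import comb, isclose

def binomial_pmf(k, n, p):
    """C(n, k) * p^k * (1 - p)^(n - k) for 0 <= k <= n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def sum_pmf(k, n, m, p):
    """P(X + Y = k) for independent X ~ Bi(n, p), Y ~ Bi(m, p), by direct convolution."""
    # j is restricted to 0 <= j <= n and 0 <= k - j <= m, where both factors are defined.
    return sum(
        binomial_pmf(j, n, p) * binomial_pmf(k - j, m, p)
        for j in range(max(0, k - m), min(n, k) + 1)
    )

n, m, p = 4, 6, 0.35  # assumed example values
matches = all(
    isclose(sum_pmf(k, n, m, p), binomial_pmf(k, n + m, p))
    for k in range(n + m + 1)
)
print(matches)
```

A check like this does not replace the derivation, but it catches index mistakes in the summation limits.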

## Cumulative distribution function

Note that the distribution of a discrete random variable $X$ is uniquely defined by its PMF $\mathbb{P}(X = x_i)$. In general, we define the distribution using the cumulative distribution function (CDF):
$$
F_X(x) = \mathbb{P}(X < x)
$$

It has the following properties:
- $F_X$ is non-decreasing
- $\lim\limits_{x\to-\infty}F_X(x) = 0$
- $\lim\limits_{x\to+\infty}F_X(x) = 1$
- $F_X$ is left-continuous

Interestingly enough, the converse is also true: any function with the properties above is the CDF of some probability distribution on $\mathbb{R}$, and this distribution is unique.
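For a discrete distribution, the definition $F_X(x) = \mathbb{P}(X < x)$ can be sketched as below. The dictionary representation and the $Be(1/3)$ example are assumptions. Note the strict inequality: at a jump point the CDF still has its left-hand value, which is what left continuity means here.

```python
from fractions import Fraction

def cdf(pmf, x):
    """F_X(x) = P(X < x) for a discrete PMF given as {value: probability}."""
    return sum((p for value, p in pmf.items() if value < x), Fraction(0))

# Assumed example: X ~ Be(1/3), so P(X = 0) = 2/3 and P(X = 1) = 1/3.
bernoulli = {0: Fraction(2, 3), 1: Fraction(1, 3)}

for x in (-1, 0, 0.5, 1, 1.5):
    print(x, cdf(bernoulli, x))
```

With this convention `cdf(bernoulli, 1)` is $2/3$, and the value $1$ is reached only for $x > 1$.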

## Probability density function

- If $X$ has a discrete distribution, then $F_X$ has at most countably many jumps, of sizes $p_i = \mathbb{P}(X = x_i)$ at the points $x_i$, and it is continuous at every $x \neq x_i$
- If $X$ has an absolutely continuous distribution, then $F_X$ is differentiable a.e. and can be recovered from its derivative:
    $$
    F_X(x) = \int\limits_{-\infty}^x f_X(t) dt
    $$
    
    where $f_X$ is the **probability density function** (PDF) and $f_X(x) = F'_X(x)$ a.e.

## Example 7

We say that a random variable $X$ is distributed uniformly on $[a, b]$, $a < b$, and write $X \sim U([a, b])$ if
$$
f_X(x) = \begin{cases}
\frac{1}{b-a}, & a \leqslant x \leqslant b, \\
0, & \text{else}
\end{cases}
$$
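To connect this density with the CDF, the sketch below integrates it numerically with the midpoint rule. The function names, the step count, and the interval $[2, 5]$ are assumptions. Since the density vanishes below $a$, the integral from $-\infty$ reduces to an integral from $a$, and inside $[a, b]$ the result should be close to $(x - a)/(b - a)$.

```python
def uniform_pdf(x, a, b):
    """Density of U([a, b]): 1 / (b - a) on [a, b], zero elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def cdf_by_integration(x, a, b, steps=10_000):
    """Approximate F_X(x) by integrating the density over [a, min(x, b)] (midpoint rule)."""
    if x <= a:
        return 0.0
    upper = min(x, b)  # the density is zero above b
    h = (upper - a) / steps
    return sum(uniform_pdf(a + (i + 0.5) * h, a, b) for i in range(steps)) * h

a, b = 2.0, 5.0  # assumed example values
print(cdf_by_integration(3.0, a, b))   # close to (3 - 2) / (5 - 2)
print(cdf_by_integration(10.0, a, b))  # close to 1
```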

## Example 8

Let $X$ and $Y$ be independent random variables with PDFs $f_X$ and $f_Y$ respectively. Then their sum $Z = X + Y$ has an absolutely continuous distribution with density
$$
f_Z(z) = \int\limits_{-\infty}^{+\infty} f_X(x) f_Y(z-x) dx
$$

## Example 9

Let $X, Y \sim U([0, 1])$ be independent and $Z = X + Y$. Find $f_Z(z)$.

## Solution 9

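Since $f_X(x) = 1$ on $[0, 1]$ and $f_Y(z - x) = 1$ exactly when $z - 1 \leqslant x \leqslant z$, the integral equals the length of $[0, 1] \cap [z - 1, z]$:
$$
f_Z(z) = \int\limits_{0}^1 f_X(x) f_Y(z-x) dx = \int\limits_{0}^1 f_Y(z-x) dx = \begin{cases}
z, & 0 \leqslant z \leqslant 1, \\
2 - z, & 1 \leqslant z \leqslant  2, \\
0, & \text{else}
\end{cases}
$$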
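The triangular density can be checked numerically by evaluating the convolution integral with the midpoint rule. This is only a sketch: the function names and the step count are assumptions, and the agreement is approximate (to within about one grid step).

```python
def uniform01_pdf(x):
    """Density of U([0, 1])."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def sum_density_numeric(z, steps=10_000):
    """Midpoint-rule approximation of the integral of f_Y(z - x) over x in [0, 1]."""
    h = 1.0 / steps
    return sum(uniform01_pdf(z - (i + 0.5) * h) for i in range(steps)) * h

def sum_density_exact(z):
    """Closed-form triangular density of Z = X + Y."""
    if 0.0 <= z <= 1.0:
        return z
    if 1.0 < z <= 2.0:
        return 2.0 - z
    return 0.0

for z in (-0.5, 0.25, 1.0, 1.5, 2.5):
    print(z, sum_density_numeric(z), sum_density_exact(z))
```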