# A1 Probability distributions

## Support functions for probability mass functions

In [18]:
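# Checks if a list sums to 1 (within floating-point tolerance)
check_sum_equals_one <- function(a_list) {
    # accumulate the total; named `total` to avoid masking base::sum
    total <- 0

    for(i in seq_along(a_list)) {
        total <- total + a_list[[i]]
    }

    # exact comparison with == can fail for decimal fractions,
    # so compare against 1 with a tolerance instead
    isTRUE(all.equal(total, 1))
}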

In [19]:
# Checks if every element in list > 0
check_all_positive <- function(a_list) {
    # declare flag
    a_positive <- TRUE

    # seq_along() handles an empty list safely, unlike 1:length()
    for(i in seq_along(a_list)) {
        if(a_list[[i]] <= 0) {
            a_positive <- FALSE
        }
    }

    a_positive
}

In [10]:
# calculates the mean of a pmf with values x and probabilities p
mean_pmf <- function(x, p) {
    mean <- 0

    for(i in seq_along(x)) {
        mean <- mean + x[[i]] * p[[i]]
    }
    mean
}

In [13]:
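# calculates the variance of a pmf with values x and probabilities p
var_pmf <- function(x, p) {
    mean <- 0
    var <- 0

    # calculate mean
    for(i in seq_along(x)) {
        mean <- mean + x[[i]] * p[[i]]
    }

    # calculate var as the probability-weighted squared deviation from the mean
    for(i in seq_along(x)) {
        var <- var + (x[[i]] - mean)^2 * p[[i]]
    }
    var
}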

## 1 Probability density function of a continuous random variable

### Example 1.1.1

Suppose that a random variable $X$ has range $(0,1)$ and that its p.d.f. is given by

$$f(x) = \frac{3}{4} (x^{2} + 1), \hspace{3mm} x \in (0,1).$$

Calculate the following:

**(a)** $P(X < 1/5)$

**(b)** $P(X > 3/8)$

**(c)** $P(1/4 \leq X \leq 1/2)$

**(d)** $E(X)$

**(e)** $V(X)$

In [1]:
f <- function(x) {0.75 * (x ** 2 + 1)}

In [2]:
xf <- function(x) {x * 0.75 * (x ** 2 + 1)}

In [3]:
x2f <- function(x) {x ** 2 * 0.75 * (x ** 2 + 1)}

#### (a)

The probability $P(X \leq x) = F(x)$ for a continuous random variable with range $(a, b)$ is

$$
F(x) = \int_{a}^{x} f(y) \> dy.
$$

For a continuous random variable $P(X < x) = P(X \leq x)$, so $P(X < 1/5) = F(1/5) = \cdots$

In [5]:
integrate(f = f, lower = 0, upper = 1/5)[[1]]
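
As a cross-check on the numerical integration, the same probability can be evaluated by hand from the antiderivative of $f$:

$$
P(X < 1/5) = \frac{3}{4} \bigg[ \frac{1}{3} y^{3} + y \bigg]_{0}^{1/5}
= \frac{3}{4} \bigg( \frac{1}{375} + \frac{1}{5} \bigg)
= \frac{19}{125} = 0.152.
$$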

#### (b)

The probability $P(X > x) = P(X \geq x)$ for a continuous random variable with range $(a, b)$ is

$$
P(X \geq x) = \int_{x}^{b} f(y) \> dy.
$$

So $P(X > 3/8) = \cdots$

In [6]:
integrate(f = f, lower = 3/8, upper = 1)[[1]]
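
The numerical result can be checked against the exact value, obtained by hand from the antiderivative of $f$:

$$
P(X > 3/8) = \frac{3}{4} \bigg[ \frac{1}{3} y^{3} + y \bigg]_{3/8}^{1}
= \frac{3}{4} \bigg( \frac{4}{3} - \frac{201}{512} \bigg)
= \frac{1445}{2048} \approx 0.7056.
$$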

#### (c)

The probability $P(x_{1} \leq X \leq x_{2})$ for a continuous random variable is

$$
P(x_{1} \leq X \leq x_{2})
  = \int_{x_{1}}^{x_{2}} f(x) \> dx.
$$

So $P(1/4 \leq X \leq 1/2) = \cdots$

In [7]:
integrate(f = f, lower = 1/4, upper = 1/2)[[1]]
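
For comparison with the numerical output, the exact value worked by hand is:

$$
P(1/4 \leq X \leq 1/2) = \frac{3}{4} \bigg[ \frac{1}{3} y^{3} + y \bigg]_{1/4}^{1/2}
= \frac{3}{4} \bigg( \frac{13}{24} - \frac{49}{192} \bigg)
= \frac{55}{256} \approx 0.2148.
$$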

#### (d)

The expected value of a continuous random variable with p.d.f. $f(x)$ is

$$
E(X) = \mu = \int_{a}^{b} x \> f(x) \> dx.
$$

In [10]:
mu <- integrate(f = xf, lower = 0, upper = 1)[[1]]
mu
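
Because $x \> f(x)$ is a polynomial, the expected value can also be found exactly, which gives a check on the value of `mu`:

$$
E(X) = \frac{3}{4} \int_{0}^{1} (x^{3} + x) \> dx
= \frac{3}{4} \bigg( \frac{1}{4} + \frac{1}{2} \bigg)
= \frac{9}{16} = 0.5625.
$$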

#### (e)

The variance of a continuous random variable with p.d.f. $f(x)$ is

$$
V(X) = E(X^{2}) - E(X)^{2}
= \bigg\{\int_{a}^{b} x^{2} \> f(x) \> dx \bigg\} - \mu^{2}.
$$

In [11]:
integrate(f = x2f, lower = 0, upper = 1)[[1]] - mu ** 2
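
The exact variance follows from the same formula. Here $E(X) = \frac{3}{4} \int_{0}^{1} (x^{3} + x) \> dx = \frac{9}{16}$, and the second moment is worked out first:

$$
\begin{aligned}
E(X^{2}) &= \frac{3}{4} \int_{0}^{1} (x^{4} + x^{2}) \> dx
  = \frac{3}{4} \bigg( \frac{1}{5} + \frac{1}{3} \bigg) = \frac{2}{5}, \\
V(X) &= \frac{2}{5} - \bigg( \frac{9}{16} \bigg)^{2}
  = \frac{107}{1280} \approx 0.0836.
\end{aligned}
$$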

### Example 1.1.2

Suppose that a continuous variable $X$ has range $(-1,1)$. The following is a function $f(x)$ of $x$:

$$f(x) = 1 - x^{2}.$$

Is $f(x)$ a valid p.d.f. for $X$?

In [12]:
f <- function(x) {1 - x ** 2}

Check the properties of a valid p.d.f. on the range $(a, b)$:

(1) $\int_{a}^{b} f(x) \> dx = 1$

(2) $f(x) \geq 0$ for all $x \in (a, b)$

In [14]:
integrate(f = f, lower = -1, upper = 1)[[1]]

The integral evaluates to $4/3 \neq 1$, so property (1) fails and $f(x)$ is not a valid p.d.f.
But does a normalising constant, $k$, exist, such that $k \> f(x)$ would be a valid p.d.f.?

$$
\begin{aligned}
    1 = \int_{-1}^{1} k \> (1 - x^{2}) \> dx &= k \int_{-1}^{1} (1 - x^{2}) \> dx \\
      &= k \bigg( \frac{4}{3} \bigg) \\
    k &= \frac{3}{4}.
\end{aligned}
$$

Hence, $f(x)$ is not a valid p.d.f., but $\frac{3}{4} f(x)$ is valid.
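
The conclusion can be checked numerically. The following is a minimal sketch, assuming $k = 3/4$ as derived above; the names `k`, `g`, `area` and `grid` are introduced here for illustration, and the non-negativity test is only a spot check on a grid of interior points, not a proof.

```r
# scaled candidate p.d.f. on (-1, 1); k and g are illustrative names
k <- 3 / 4
g <- function(x) {k * (1 - x^2)}

# property 1: the total area should be 1, up to numerical tolerance
area <- integrate(f = g, lower = -1, upper = 1)$value
isTRUE(all.equal(area, 1))

# property 2: spot-check non-negativity at interior points of (-1, 1)
grid <- seq(-0.99, 0.99, by = 0.01)
all(g(grid) >= 0)
```

Comparing with `all.equal()` rather than `==` avoids a spurious failure caused by rounding error in the numerical integral.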

### Example 1.1.3

Suppose that a continuous variable $X$ can only take values in the range $0$ to $1$. The following is a function $f(x)$ of $x$:

$$f(x) = \frac{3}{4} (x^{2} + 1), \hspace{3mm} x \in (0,1).$$

What is the c.d.f. associated with $X$?

The c.d.f. for a continuous random variable with p.d.f. $f(y)$ is

$$
F(x) = \int_{a}^{x} f(y) \> dy.
$$

Therefore for the p.d.f. in question

$$
\begin{aligned}
F(x) = \int_{0}^{x} \frac{3}{4} (y^{2} + 1) \> dy &= \frac{3}{4} \int_{0}^{x} (y^{2} + 1) \> dy \\
  &= \frac{3}{4} \bigg[ \frac{1}{3} y^{3} + y \bigg]_{0}^{x} \\
  &= \frac{3}{4} \bigg\{ \frac{1}{3} x^{3} + x - \bigg( 0 + 0 \bigg) \bigg\} \\
  &= \frac{1}{4} ( x^{3} + 3x ).
\end{aligned}
$$
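
The closed form can be compared with numerical integration of the p.d.f. at a few points. This is a sketch only: the names `f_pdf`, `F_cdf`, `xs` and `numeric_cdf` are assumptions introduced for the check, and the test points are arbitrary.

```r
# p.d.f. of the example and the closed-form c.d.f. derived above
f_pdf <- function(y) {0.75 * (y^2 + 1)}
F_cdf <- function(x) {0.25 * (x^3 + 3 * x)}

# integrate the p.d.f. from 0 up to each test point
xs <- c(0.2, 0.375, 0.5, 1)
numeric_cdf <- sapply(xs, function(x) {
    integrate(f = f_pdf, lower = 0, upper = x)$value
})

# TRUE if the two agree up to numerical tolerance
isTRUE(all.equal(F_cdf(xs), numeric_cdf))
```

Agreement at the upper end of the range, where the c.d.f. should equal 1, is a quick sanity check that the p.d.f. is properly normalised.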

## 2 Probability mass function of a discrete random variable

-----

### Example 1.2.1

The random variable $X$ has a range $\{1, 2, 3, 4, 5\}$.
**Table 1** shows a function $p(x)$ of $X$.

| $x$    | 1    | 2    | 3    | 4    | 5    |
|--------|------|------|------|------|------|
| $p(x)$ | 0.30 | 0.25 | 0.10 | 0.20 | 0.15 |

**(a)** Confirm that $p(x)$ is a valid p.m.f. for $X$

Calculate the following:

**(b)** $P(X = 3)$

**(c)** $P(X > 2)$

**(d)** $E(X)$

**(e)** $V(X)$ 

In [3]:
# declare list for x
x <- c(1, 2, 3, 4, 5)

In [4]:
# declare list for p
p <- c(0.3, 0.25, 0.1, 0.2, 0.15)

#### (a)

Check the properties of a valid p.m.f.

(1) $\sum_{x} p(x) = 1$

(2) $p(x) > 0$

In [20]:
# property 1.
check_sum_equals_one(p)

In [21]:
# property 2
check_all_positive(p)

As both tests return `TRUE`, $p(x)$ is a valid p.m.f. for $X$.

#### (b)

The probability $p(3) = \cdots$

In [26]:
p[[3]]

#### (c)

The probability $P(X > 2) = \cdots$

In [27]:
p[[3]] + p[[4]] + p[[5]]
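
The same probability can be picked out with logical indexing, which avoids listing the positions by hand. A minimal sketch, repeating the `x` and `p` declared for this example so that it runs on its own:

```r
# values and probabilities from Table 1
x <- c(1, 2, 3, 4, 5)
p <- c(0.3, 0.25, 0.1, 0.2, 0.15)

# keep only the probabilities whose x value exceeds 2, then add them up
sum(p[x > 2])
```

By hand this is $0.10 + 0.20 + 0.15 = 0.45$.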

#### (d)

The expected value of a discrete random variable with p.m.f. $p(x)$ is

$$
E(X) = \mu = \sum x \> p(x).
$$

In [11]:
mean_pmf(x, p)
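
Written out term by term from Table 1, the sum is:

$$
E(X) = 1(0.30) + 2(0.25) + 3(0.10) + 4(0.20) + 5(0.15) = 2.65.
$$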

#### (e)

The variance of a discrete random variable with p.m.f. $p(x)$ is

$$
V(X) = E[(X - \mu)^{2}] = \sum_{x} (x - \mu)^{2} \> p(x).
$$

In [14]:
var_pmf(x, p)
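
Written out term by term with $\mu = 1(0.30) + 2(0.25) + 3(0.10) + 4(0.20) + 5(0.15) = 2.65$, the sum is:

$$
\begin{aligned}
V(X) &= (1 - 2.65)^{2}(0.30) + (2 - 2.65)^{2}(0.25) + (3 - 2.65)^{2}(0.10) \\
     &\quad + (4 - 2.65)^{2}(0.20) + (5 - 2.65)^{2}(0.15) \\
     &= 0.81675 + 0.105625 + 0.01225 + 0.3645 + 0.828375 \\
     &= 2.1275.
\end{aligned}
$$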

### Example 1.2.2 (June 2017)

The random variable $X$ has a range $\{1, 2, 3, 4\}$.
**Table 2** shows a function $p(x)$ of $X$.

| $x$    | 1    | 2    | 3    | 4    |
|--------|------|------|------|------|
| $p(x)$ | 0.20 | 0.15 | 0.30 | 0.35 |

**(a)** Confirm that $p(x)$ is a valid p.m.f. for $X$

Calculate the following:

**(b)** $P(X = 2)$

**(c)** $P(X \leq 2)$

**(d)** $E(X)$

**(e)** $V(X)$ 

In [30]:
# declare list for x
x <- c(1, 2, 3, 4)

In [31]:
# declare list for p
p <- c(0.2, 0.15, 0.30, 0.35)

#### (a)

Check the properties of a valid p.m.f.

(1) $\sum_{x} p(x) = 1$

(2) $p(x) > 0$

In [22]:
# property 1.
check_sum_equals_one(p)

In [23]:
# property 2
check_all_positive(p)

As both tests return `TRUE`, $p(x)$ is a valid p.m.f. for $X$.

#### (b)

The probability $p(2) = \cdots$

In [26]:
p[[2]]

#### (c)

The probability $P(X \leq 2) = \cdots$

In [27]:
p[[1]] + p[[2]]

#### (d)

The expected value $E(X) = \cdots$

In [32]:
mean_pmf(x, p)

#### (e)

The variance $V(X) = \cdots$

In [33]:
var_pmf(x, p)
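
The loop-based helpers can be cross-checked with vectorised arithmetic. This is a minimal sketch, repeating the `x` and `p` of this example so that it runs on its own; the names `mu_check` and `var_check` are introduced here for illustration only.

```r
# values and probabilities for Example 1.2.2
x <- c(1, 2, 3, 4)
p <- c(0.2, 0.15, 0.30, 0.35)

# mean: probability-weighted sum of the values
mu_check <- sum(x * p)

# variance: probability-weighted sum of squared deviations from the mean
var_check <- sum((x - mu_check)^2 * p)

mu_check
var_check
```

By hand, $E(X) = 0.20 + 0.30 + 0.90 + 1.40 = 2.8$ and $V(X) = 0.648 + 0.096 + 0.012 + 0.504 = 1.26$, so both the helpers and the vectorised versions should agree with these values up to floating-point rounding.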