# Continuous Random Variables

Date: 2020-11-09

## Scenario

Let $X$ be a continuous random variable with p.d.f.

$$
f(x) = \begin{cases}
    &4x^{3} \hspace{3mm} &x \in (0, 1] \\
    &0 \hspace{3mm} &\text{otherwise}
\end{cases}
$$

-----

## Questions

**(a)** Prove that $f(x)$ is a valid p.d.f. for $X$.

**(b)** State the c.d.f. of $X$.

**(c)** Calculate $P(X \leq 0.5)$

**(d)** Calculate $P(X > 0.42)$

**(e)** Calculate $P(0.15 \leq X \leq 0.65)$

**(f)** Calculate the expected value of $X$.

**(g)** Calculate the variance of $X$.

-----

In [1]:
from scipy.integrate import quad
import numpy as np
import pandas as pd

## (a)

A probability density function $f(x)$ for a continuous random variable $X$ has the following properties.

1. $f(x) \geq 0$ for all $x$ in the range of $X$
2. $\int f(x) \> dx = 1$ over the range of $X$

Let us confirm the p.d.f. has these properties.
We first define a function that returns the value of $f(a)$, where $a \in (0, 1]$.

In [2]:
def pdf(a: float) -> float:
    '''Calculates value of f(a).
    Valid range of a: (0, 1]'''
    return 4 * (a ** 3)

### Property 1: $f(x) \geq 0$

Rather than looping over individual values of $x$, we will

1. Create a **DataFrame** with columns $x$ and $f(x)$
2. Filter the **DataFrame** for rows where $f(x) < 0$
3. Check that the size of the filtered **DataFrame** is 0

If $f(x) \geq 0$ at every point of the grid, the filtered **DataFrame** will have size 0.

In [3]:
# build a grid of x values covering [0, 1)
x = np.arange(start=0, stop=1, step=0.000001)

# create the df (columns must be list-like, not a set)
df_pdf = pd.DataFrame(data=x, columns=['x'])
df_pdf['f'] = pdf(x)

# filter the df for negative densities, check df.size is 0
filtered_df = df_pdf.query('f < 0')
filtered_df.size == 0

True

### Property 2: $\int f(x) \> dx = 1$

Let us calculate

$$
\int_{0}^{1} 4x^{3} \> dx = \ldots
$$

In [4]:
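# function quad returns a tuple (value, error estimate), select first element
# compare with a tolerance rather than testing floats for exact equality
abs(quad(pdf, a=0, b=1)[0] - 1) < 1e-9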

True

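As both checks returned `True`, the numerical evidence agrees with the analytic argument: $4x^{3} \geq 0$ on $(0, 1]$ and $\int_{0}^{1} 4x^{3} \> dx = \big[ x^{4} \big]_{0}^{1} = 1$, so $f(x)$ is a valid p.d.f.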
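As a cross-check that does not rely on `scipy` or `pandas`, the two properties can also be tested with the standard library alone. This is only a sketch: the helper names `pdf_value` and `midpoint_integral`, the grid of 1000 points and the choice of the midpoint rule are assumptions made for illustration, not part of the original notebook.

```python
def pdf_value(x: float) -> float:
    """p.d.f. from the scenario: 4x^3 on (0, 1], 0 otherwise."""
    return 4 * x ** 3 if 0 < x <= 1 else 0.0


def midpoint_integral(func, lower: float, upper: float, n: int = 10_000) -> float:
    """Approximate the integral of func over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / n
    return width * sum(func(lower + (i + 0.5) * width) for i in range(n))


# Property 1: f is non-negative at every grid point in (0, 1]
grid = [(i + 1) / 1000 for i in range(1000)]
non_negative = all(pdf_value(x) >= 0 for x in grid)

# Property 2: the total area under f is approximately 1
total_area = midpoint_integral(pdf_value, 0.0, 1.0)

print(non_negative, round(total_area, 6))
```

The midpoint rule is less accurate than `quad`, so the area is compared with 1 only after rounding.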

## (b)

The c.d.f. of a continuous random variable $X$ with p.d.f. $f$ is

$$
F(x) = \int_{a}^{x} f(y) \> dy,
$$

where $a$ is the lower bound of the range of $X$.

So for the $f$ given in the scenario and $x \in (0, 1]$

$$
\begin{aligned}
    F(x) &= \int_{0}^{x} 4 y^{3} \> dy \\
        &= 4 \bigg[ \frac{1}{4} y^{4} \bigg]_{0}^{x} \\
        &= 4 \bigg( \frac{1}{4} x^{4} - \frac{1}{4} 0^{4} \bigg) \\
        &= x^{4}.
\end{aligned}
$$

Outside this interval, $F(x) = 0$ for $x \leq 0$ and $F(x) = 1$ for $x > 1$.
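The c.d.f. can be sketched as a Python function, taking $F$ to be 0 to the left of the range of $X$ and 1 to the right of it. The function name `cdf` and the test grid are assumptions made for illustration.

```python
def cdf(x: float) -> float:
    """c.d.f. of X: 0 for x <= 0, x^4 for 0 < x <= 1, and 1 for x > 1."""
    if x <= 0:
        return 0.0
    if x > 1:
        return 1.0
    return x ** 4


# F should start at 0, end at 1, and never decrease in between
points = [i / 100 for i in range(-50, 151)]
values = [cdf(p) for p in points]
is_non_decreasing = all(a <= b for a, b in zip(values, values[1:]))

print(cdf(0.0), cdf(1.0), is_non_decreasing)
```

A function like this could be used in place of repeated calls to `quad` whenever only values of $F$ are needed.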

## (c)

The probability that $X$ is less than or equal to 0.5 is $P(X \leq 0.5) = F(0.5)$.
Using the c.d.f. we defined in **(b)**

$$
F(0.5) = 0.5^{4} = \ldots
$$

In [5]:
# calculate using scipy
quad(pdf, a=0, b=0.5)[0]

0.0625

## (d)

The probability that $X$ is greater than 0.42 is $P(X > 0.42) = 1 - F(0.42)$.
Using the c.d.f. we defined in **(b)**

$$
P(X > 0.42) = 1 - F(0.42) = 1 - (0.42)^{4} = \ldots
$$

In [6]:
# calculate using scipy
1 - quad(pdf, a=0, b=0.42)[0]

0.96888304

## (e)

The probability that $X$ is between 0.15 and 0.65 is $P(0.15 \leq X \leq 0.65) = F(0.65) - F(0.15)$.
Using the c.d.f. we defined in **(b)**

$$
P(0.15 \leq X \leq 0.65) = F(0.65) - F(0.15) = (0.65)^{4} - (0.15)^{4} = \ldots
$$

In [7]:
# confirming using scipy
quad(pdf, a=0.15, b=0.65)[0]

0.178
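Because $F(x) = x^{4}$ is a polynomial, the three probabilities computed in **(c)**, **(d)** and **(e)** can also be evaluated exactly from the closed form, without numerical integration. The sketch below uses `fractions.Fraction` from the standard library; the names `cdf_exact`, `p_c`, `p_d` and `p_e` are assumptions made for illustration.

```python
from fractions import Fraction


def cdf_exact(x: Fraction) -> Fraction:
    """F(x) = x^4, valid for 0 < x <= 1."""
    return x ** 4


# P(X <= 0.5)
p_c = cdf_exact(Fraction(1, 2))
# P(X > 0.42)
p_d = 1 - cdf_exact(Fraction(42, 100))
# P(0.15 <= X <= 0.65)
p_e = cdf_exact(Fraction(65, 100)) - cdf_exact(Fraction(15, 100))

for label, p in [("(c)", p_c), ("(d)", p_d), ("(e)", p_e)]:
    print(label, p, float(p))
```

Exact fractions avoid the small floating-point discrepancies that numerical integration can introduce, which makes them a convenient reference when checking the `quad` results.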

## (f)

For the given p.d.f., $E(X)$ is

$$
\begin{aligned}
    E(X) = \int_{a}^{b} x \> f(x) \> dx &= \int_{0}^{1} x \> 4x^{3} \> dx \\
        &= 4 \int_{0}^{1} x^{4} \> dx \\
        &= 4 \bigg[ \frac{1}{5} x^{5} \bigg]_{0}^{1} \\
        &= \frac{4}{5}(1^{5} - 0^{5}) \\
        &= \ldots
\end{aligned}
$$

In [8]:
# define xfx
def xfx(a: float) -> float:
    '''Calculates value of a * f(a).
    Valid range of a: (0, 1]'''
    return a * 4 * (a ** 3)

In [9]:
# integrate function within valid range.
mean = quad(xfx, a=0, b=1)[0]
mean

0.8

## (g)

For the given p.d.f., $V(X)$ is

$$
\begin{aligned}
    V(X) =  E(X^{2}) - (E(X))^{2} &= \int_{0}^{1} x^{2} \> 4x^{3} \> dx - (E(X))^{2} \\
        &= 4 \int_{0}^{1} x^{5} \> dx - \bigg( \frac{4}{5} \bigg)^{2} \\
        &= 4 \bigg[ \frac{1}{6} x^{6} \bigg]_{0}^{1} - \frac{16}{25} \\
        &= \frac{2}{3}(1^{6} - 0^{6}) - \frac{16}{25} \\
        &= \frac{2}{3} - \frac{16}{25} \\
        &= \ldots
\end{aligned}
$$

In [10]:
def x2fx(a: float) -> float:
    '''Calculates value of a^2 * f(a).
    Valid range of a: (0, 1]'''
    return (a ** 2) * 4 * (a ** 3)


quad(x2fx, a=0, b=1)[0] - (mean ** 2)

0.026666666666666616
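The integrals in **(f)** and **(g)** are both instances of the raw moment $E(X^{n}) = \int_{0}^{1} x^{n} \> 4x^{3} \> dx = \frac{4}{n+4}$. As a sketch, this closed form can be used to recompute the mean and variance exactly. The names `raw_moment`, `mean_exact` and `variance_exact` are assumptions made for illustration.

```python
from fractions import Fraction


def raw_moment(n: int) -> Fraction:
    """E(X^n) for f(x) = 4x^3 on (0, 1]: the integral of 4x^(n+3) over [0, 1]."""
    return Fraction(4, n + 4)


# E(X)
mean_exact = raw_moment(1)
# V(X) = E(X^2) - (E(X))^2
variance_exact = raw_moment(2) - mean_exact ** 2

print(mean_exact, variance_exact, float(variance_exact))
```

The exact variance is $\frac{2}{75}$, which agrees with the `quad` result above up to floating-point rounding.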