# Continuous Probability Distributions

## Introduction

In probability, a distribution function describes how probability is assigned to the values a random variable can take. Typically, all outcomes in the sample space will have a probability associated with them. There are two types of probability distributions: continuous and discrete. In this lesson we will focus on continuous distributions.

For a continuous distribution, probability is spread over a range of values. **To calculate the probability of an interval, we compute the area under the distribution curve over that interval; the total area over the whole sample space is 1.** At any single point of the curve, the area is exactly zero. Note that we can approximate a continuous distribution by a discrete one by dividing the smooth curve into multiple bins, where each bin represents the probability of a certain range. See the image below for an example:


[Link to the Image here](https://commons.wikimedia.org/wiki/File:Compound_Interest_with_Varying_Frequencies.svg#/media/File:Compound_Interest_with_Varying_Frequencies.svg)
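The binning idea can be sketched in a few lines of Python. This is only an illustration: the standard normal distribution, the interval from -4 to 4, and the choice of eight bins are assumptions made for the example, not part of the lesson's figure.

```python
import numpy as np
from scipy.stats import norm

# Bin edges covering most of the standard normal's probability mass
# (the interval and the number of bins are arbitrary illustrative choices).
edges = np.linspace(-4, 4, 9)

# Probability of each bin = area under the curve between its two edges,
# obtained here as a difference of cumulative probabilities.
bin_probs = np.diff(norm.cdf(edges))

for left, right, p in zip(edges[:-1], edges[1:], bin_probs):
    print(f"P({left:+.0f} < X <= {right:+.0f}) = {p:.4f}")

# The bins together hold almost all of the probability; the small
# remainder sits in the tails outside the chosen interval.
print("Total probability in bins:", bin_probs.sum())
```

Using more, narrower bins brings the discrete approximation closer to the smooth curve.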


## Range and Domain

Before we move on, let us have a look at two important concepts in mathematics. 

- **Domain**: The domain of a function is the set of values for which the function is defined.

- **Range**: The set of all values a function can take. 

This is exemplified below:

![alt text](https://s3-us-west-2.amazonaws.com/courses-images-archive-read-only/wp-content/uploads/sites/924/2015/11/25200622/CNX_Precalc_Figure_01_02_0062.jpg)

## Continuous Random Variables

Continuous distributions come from continuous random variables. For these random variables, **the set of possible values is uncountable**. They are described by probability density functions (PDFs). We say $X$ is a continuous random variable if there exists a non-negative function $f(x)$, defined for all real $x \in (-\infty,\infty)$, such that for any set $B$ of [real numbers](https://en.wikipedia.org/wiki/Real_number):

$$P(X \in B) = \int_{B} f(x) dx.$$

To make this concrete, let $B$ be the interval $[a, b]$:

$$P(a \leq X \leq b) = \int_{a}^{b} f(x) dx.$$

If we let $a = b$ in the preceding formula, then:

$$P(X = a) = \int_{a}^{a} f(x) dx = 0.$$

In words, this equation states that **the probability that a continuous random variable will assume any particular value is zero.**
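As a rough numerical check of these formulas, the integrals can be evaluated with `scipy.integrate.quad`. The sketch below assumes the standard normal density as $f$ and an arbitrary interval $[-1, 2]$; both are illustrative choices.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

a, b = -1.0, 2.0  # illustrative interval endpoints

# P(a <= X <= b): area under the PDF between a and b
area, _ = quad(norm.pdf, a, b)

# P(X = a): integrating from a to a covers no width, so the area is zero
point_area, _ = quad(norm.pdf, a, a)

# Integrating over the whole real line gives the total probability
total_area, _ = quad(norm.pdf, -np.inf, np.inf)

print("P(a <= X <= b):", area)
print("P(X = a):", point_area)
print("Total area:", total_area)
```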


![alt text](https://work.thaslwanter.at/Stats/html/_images/PDF.png)

## Cumulative Distribution Function



Recall the following example from a previous lecture:

*   $P(X = 1) = 1/6$
*   $P(X = 2) = 1/6$
*   $P(X = 3) = 1/6$
*   $P(X = 4) = 1/6$
*   $P(X = 5) = 1/6$
*   $P(X = 6) = 1/6$

As discussed before, this list gives the probability of throwing each number with a fair die. Now suppose we are playing a board game and are not interested in the probability of throwing any particular number, but only in the probability that we throw at or below (or above) a certain number. We can define this as:

$$F(x) = P(X \leq x),$$

where $x$ is a particular number. The cumulative probability defined here gives the probability that the random variable is less than or equal to $x$.

The cumulative probability distribution can then be defined as follows:

*   $P(X \leq 1) = 1/6$
*   $P(X \leq 2) = 2/6$
*   $P(X \leq 3) = 3/6$
*   $P(X \leq 4) = 4/6$
*   $P(X \leq 5) = 5/6$
*   $P(X \leq 6) = 6/6$
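The cumulative values above are just running sums of the individual probabilities. A minimal sketch using exact fractions follows; the variable names are illustrative, and `Fraction` prints results in lowest terms, so 2/6 appears as 1/3.

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)

# Probability of each face of a fair die
pmf = {x: Fraction(1, 6) for x in faces}

# Running sum of the individual probabilities gives P(X <= x)
cdf = dict(zip(faces, accumulate(pmf[x] for x in faces)))

for x in faces:
    print(f"P(X <= {x}) = {cdf[x]}")

# Probability of throwing above 4 is the complement of P(X <= 4)
print("P(X > 4) =", 1 - cdf[4])
```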




### Cumulative Distribution Formal Definition

For a continuous random variable with PDF $f$, the cumulative distribution function (CDF) is defined as:

$$F(a) = P(X \in (-\infty, a]) = \int_{-\infty}^{a} f(x) dx.$$

Some important properties of the CDF are as follows:

- It is a non-decreasing function;
- Its values lie in the interval $[0, 1]$;
- Differentiating the CDF gives the PDF (wherever the derivative exists).
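These properties can be checked numerically. The sketch below assumes the standard normal distribution and a finite grid of points, so it illustrates the properties rather than proving them.

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-4, 4, 2001)
cdf = norm.cdf(x)

# Non-decreasing: successive differences are never negative
is_non_decreasing = bool(np.all(np.diff(cdf) >= 0))

# Values stay within [0, 1]
in_unit_interval = bool(cdf.min() >= 0 and cdf.max() <= 1)

# A numerical derivative of the CDF should be close to the PDF
numerical_pdf = np.gradient(cdf, x)
max_error = np.max(np.abs(numerical_pdf - norm.pdf(x)))

print("Non-decreasing:", is_non_decreasing)
print("Within [0, 1]:", in_unit_interval)
print("Largest gap between d(CDF)/dx and PDF:", max_error)
```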



![alt text](http://work.thaslwanter.at/Stats/html/_images/PDF_CDF.png)

## Common Continuous Distributions

Now we will discuss some common types of continuous distributions.


### Uniform Distribution

The standard uniform distribution is defined by the following density function:

$$f(x)=
    \begin{cases}
      1, & \text{if}\ 0 < x < 1 \\
      0, & \text{otherwise}
    \end{cases}$$

A more general definition, where $x$ is not restricted to lie between 0 and 1, is given as follows:

$$f(x)=
    \begin{cases}
      \frac{1}{\beta - \alpha}, & \text{if}\ \alpha < x < \beta \\
      0, & \text{otherwise}
    \end{cases}$$

*Note: The set of values on which the function is defined is called the domain, whereas the set of values the function takes on that domain is called the range.*

- Domain for $x$: $(-\infty, \infty)$
- Range: $\{0, \frac{1}{\beta - \alpha}\}$ (which is $\{0, 1\}$ for the standard uniform distribution)
- Parameters: $\alpha$, $\beta$ (with $\alpha < \beta$)

**One of the most common applications of the uniform distribution is to generate random numbers without any bias.** The probability density is constant across the interval $(\alpha, \beta)$, so every sub-interval of a given length is equally likely.
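A minimal sketch of unbiased random number generation, assuming NumPy's `default_rng` generator with illustrative endpoints 1 and 6 and a fixed seed used only for reproducibility. `rng.uniform` draws from the half-open interval that includes the lower endpoint and excludes the upper one.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility
low, high = 1.0, 6.0                 # illustrative interval endpoints

samples = rng.uniform(low, high, size=100_000)

# Split the interval into equal-width bins; with no bias, each bin
# should receive roughly the same share of the samples.
counts, edges = np.histogram(samples, bins=5, range=(low, high))
shares = counts / samples.size

print("Bin edges:", edges)
print("Share of samples per bin:", shares)
```

With five equal-width bins, each share should come out close to 0.2, up to sampling noise.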

*NOTE: The uniform distribution can also be discrete. Consider the example of rolling a die. Probability of each event happening is the same (1/6).*

Standard Uniform Distribution in Python code example:

In [None]:
import numpy as np
from scipy.stats import uniform
import matplotlib.pyplot as plt

In [None]:
fig, ax = plt.subplots(1, 1)
x = np.linspace(0,1, 100)
ax.plot(x, uniform.pdf(x))

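Uniform distribution on the interval (a, b):

In [None]:
fig, ax = plt.subplots(1, 1)
a = 1
b = 6
x = np.linspace(a, b, 100)
# SciPy parameterizes the uniform distribution by loc and scale,
# covering [loc, loc + scale], so the interval (a, b) needs scale = b - a
y = uniform.pdf(x, loc=a, scale=b - a)
ax.plot(x, y)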

### Normal Distribution (Gaussian)

The normal distribution is also known as the bell-shaped curve. Its density function is defined as:

$$f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-(x-\mu)^{2}/2 \sigma^{2}}$$

- Domain: Any real number
- Range: $(0, \frac{1}{\sigma \sqrt{2 \pi}}]$
- Parameters: mean $\mu$, standard deviation $\sigma$

Some important properties of the normal distribution are:

- It is symmetrical about the mean
- Total area under the curve is 1
- The parameters $\mu$ and $\sigma$ determine the location and spread of the curve
- The mean $\mu$ can be any real number; the standard deviation $\sigma$ must be positive
- When $\mu = 0$ and $\sigma = 1$, it is called the Standard Normal Distribution
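The first two properties can be checked numerically. In this sketch the mean of 2 and standard deviation of 3 are arbitrary illustrative values, and the symmetry check uses only a few sample offsets from the mean.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

mu, sigma = 2.0, 3.0  # illustrative parameter values
dist = norm(loc=mu, scale=sigma)

# Symmetry: the density is the same at equal distances on either side of the mean
offsets = np.array([0.5, 1.0, 2.5])
symmetric = bool(np.allclose(dist.pdf(mu - offsets), dist.pdf(mu + offsets)))

# Total area under the curve
total_area, _ = quad(dist.pdf, -np.inf, np.inf)

# The curve peaks at the mean with height 1 / (sigma * sqrt(2 * pi))
peak = dist.pdf(mu)

print("Symmetric about the mean:", symmetric)
print("Total area:", total_area)
print("Peak height:", peak)
```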

Python code example:



In [None]:
import math
from scipy.stats import norm

In [None]:
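# Normal distribution with mean 0 and standard deviation 3 (variance 9)
normal = norm(0, math.sqrt(9))
fig, ax = plt.subplots(1, 1)
# Plot over three standard deviations on either side of the mean
x = np.linspace(-9, 9, 1000)
y = normal.pdf(x)
ax.plot(x, y)
plt.show()

In [None]:
# Print the probability density at x = 3 (a density, not a probability)
print(normal.pdf(3))

In [None]:
# Print the probability that X <= 2
print(normal.cdf(2))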

In [None]:
# Draw a random sample
print(normal.rvs()) 

### Exponential Distribution

The exponential distribution's density function is defined as:


$$f(x)=
    \begin{cases}
      \lambda e^{-\lambda x}, & \text{if}\ x \geq 0 \\
      0, & \text{otherwise}
    \end{cases}$$

- Domain: Any real number
- Range: $[0, \lambda]$
- Parameter: rate $\lambda > 0$

![alt text](https://upload.wikimedia.org/wikipedia/commons/thumb/0/02/Exponential_probability_density.svg/1200px-Exponential_probability_density.svg.png)

For $a \geq 0$, the CDF is given as (it is 0 for $a < 0$):

$$F(a) = 1 - e^{-\lambda a}$$

**The exponential distribution is often concerned with the amount of time until some specific event occurs. For instance, the amount of time customers spend in a store, the time between bus arrivals at a station, the amount of time customers spend on a website, etc.** The distributions of these time measures are often modeled as exponential.
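As a small worked example, the CDF formula can be compared against SciPy. The rate of 0.5 events per time unit and the cutoff of 3 time units are assumed values chosen for illustration. SciPy's `expon` takes `scale = 1 / lambda` rather than the rate itself.

```python
import numpy as np
from scipy.stats import expon

lam = 0.5  # assumed rate: on average one event every 2 time units
a = 3.0    # illustrative waiting-time cutoff

# Probability that the event occurs within a time units, from the CDF formula
by_formula = 1 - np.exp(-lam * a)

# The same probability from SciPy, passing the rate through scale = 1 / lambda
by_scipy = expon(scale=1 / lam).cdf(a)

print("Formula:", by_formula)
print("SciPy:  ", by_scipy)
```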




Python code example:


In [None]:
from scipy.stats import expon

In [None]:
"""
N.B. "A common parameterization for expon is in terms of the rate parameter 
lambda, such that pdf = lambda * exp(-lambda * x). This parameterization 
corresponds to using scale = 1 / lambda." (SciPy documentation)
"""
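lam = 1  # rate parameter lambda
# expon's first positional argument is loc (a shift), not the rate,
# so the rate is passed through scale = 1 / lambda
exp = expon(scale=1 / lam)
x = np.linspace(0, 10, 100)
y = exp.pdf(x)
fig, ax = plt.subplots(1, 1)
ax.plot(x, y)

In [None]:
print(exp.pdf(4))  # probability density at x = 4
print(exp.cdf(2))  # probability that X <= 2
#print(exp.rvs())  # draw a random sample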

## Summary

In this lesson we learned about continuous random variables and continuous probability distributions, including the continuous uniform, normal, and exponential distributions. We learned how to characterize a distribution by its PDF (probability density function), its CDF (cumulative distribution function), and its parameters. We also looked at how to generate random variables from a distribution and calculate some statistics using SciPy.

Additional Resources:

* https://www.statisticshowto.datasciencecentral.com/cumulative-distribution-function/
* https://www.youtube.com/watch?v=OWSOhpS00_s
* https://wiki.ubc.ca/1.4_The_Cumulative_Distribution_Function
* https://towardsdatascience.com/probability-concepts-explained-probability-distributions-introduction-part-3-4a5db81858dc
* https://medium.com/@srowen/common-probability-distributions-347e6b945ce4
* https://machinelearningmastery.com/continuous-probability-distributions-for-machine-learning/

