## Sampling Probability Mass Functions (PMFs)
The probability mass function (PMF) of a discrete random variable $X$ is a function that specifies the probability that $X$ takes the value $x$, where $x$ is one of the possible values of $X$, i.e., $x\in{X\left(\Omega\right)}$:

$$
\begin{equation*}
p_{X}(x) = P\left(X=x\right)
\end{equation*}
$$

where $\Omega$ is the sample space and $X\left(\Omega\right)$ is the set of values that $X$ can take.
A probability mass function must satisfy the normalization condition:

$$
\begin{equation*}
\sum_{x\in{X(\Omega)}}p_{X}(x)=1
\end{equation*}
$$
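
The normalization condition can be checked numerically for any finite table of probabilities. The snippet below is a minimal sketch, not part of the original notebook: `is_valid_pmf`, `fair_die`, and `not_a_pmf` are hypothetical names introduced for illustration, and the check also tests non-negativity, which the text above leaves implicit.

```julia
# Hypothetical helper (not from the notebook): check the conditions a PMF must satisfy.
# `pmf` maps each outcome x to its probability p_X(x).
function is_valid_pmf(pmf::AbstractDict; atol::Float64 = 1e-12)
    nonnegative = all(p -> p >= 0.0, values(pmf))              # every probability is >= 0
    normalized = isapprox(sum(values(pmf)), 1.0; atol = atol)  # probabilities sum to 1
    return nonnegative && normalized
end

# a fair six-sided die: each face has probability 1/6
fair_die = Dict(x => 1/6 for x in 1:6)

# a table whose entries sum to 0.9, so it is not a PMF
not_a_pmf = Dict(0 => 0.5, 1 => 0.4)

println(is_valid_pmf(fair_die))
println(is_valid_pmf(not_a_pmf))
```

The tolerance `atol` is there because floating-point sums rarely equal 1 exactly.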

### Learning objectives
In this example, we use the `Distributions.jl` package to sample from several probability mass functions. In particular, we look at the PMFs and properties of [Bernoulli random variables](https://en.wikipedia.org/wiki/Bernoulli_distribution), [geometric random variables](https://en.wikipedia.org/wiki/Geometric_distribution), [binomial random variables](https://en.wikipedia.org/wiki/Binomial_distribution), and [Poisson random variables](https://en.wikipedia.org/wiki/Poisson_distribution).

### Setup

In [2]:
include("Include.jl");

[32m[1m  Activating[22m[39m project at `~/Desktop/julia_work/CHEME-5760-Examples-F23`
[32m[1m    Updating[22m[39m registry at `~/.julia/registries/General.toml`
[32m[1m    Updating[22m[39m git-repo `https://github.com/varnerlab/VLDecisionsPackage.jl.git`
[32m[1m   Installed[22m[39m CodecBzip2 ─────── v0.8.0
[32m[1m   Installed[22m[39m MathOptInterface ─ v1.20.0
[32m[1m    Updating[22m[39m `~/Desktop/julia_work/CHEME-5760-Examples-F23/Project.toml`
  [90m[10f378ab] [39m[93m~ VLDecisionsPackage v0.1.0 `https://github.com/varnerlab/VLDecisionsPackage.jl.git#main` ⇒ v0.1.0 `https://github.com/varnerlab/VLDecisionsPackage.jl.git#main`[39m
[32m[1m    Updating[22m[39m `~/Desktop/julia_work/CHEME-5760-Examples-F23/Manifest.toml`
  [90m[523fee87] [39m[93m↑ CodecBzip2 v0.7.2 ⇒ v0.8.0[39m
  [90m[187b0558] [39m[93m↑ ConstructionBase v1.5.3 ⇒ v1.5.4[39m
  [90m[b8f27783] [39m[93m↑ MathOptInterface v1.19.0 ⇒ v1.20.0[39m
  [90m[892a3eda] [39m[93m↑ StringM

budget (generic function with 1 method)

In [20]:
p = 0.64;
number_of_samples = 100;
number_of_trials = 100;

### Bernoulli random variables
A Bernoulli random variable $X$ models a binary outcome: either `1` or `0`,
where `1` occurs with probability $p$ and `0` occurs with probability $1-p$.
The probability mass function (PMF) of the Bernoulli random variable $X$ is:

$$
\begin{equation}
p_{X}(x) = \begin{cases}
    p & \text{if } x = 1 \\
    1 - p & \text{if } x = 0
  \end{cases}
\end{equation}
$$

where $0<p<1$ is called the Bernoulli parameter. The expectation of a Bernoulli random variable equals:

$$
\begin{equation}
\mathbb{E}\left[X\right] = p
\end{equation}
$$

while the variance $\text{Var}(X)$ equals:

$$
\begin{equation}
\text{Var}\left[X\right] = p\cdot{(1-p)}
\end{equation}
$$
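
Both results follow directly from the PMF. As a short worked derivation, using the standard identity $\text{Var}\left[X\right] = \mathbb{E}\left[X^{2}\right] - \left(\mathbb{E}\left[X\right]\right)^{2}$:

$$
\begin{eqnarray*}
\mathbb{E}\left[X\right] &=& 0\cdot{(1-p)} + 1\cdot{p} = p\\
\mathbb{E}\left[X^{2}\right] &=& 0^{2}\cdot{(1-p)} + 1^{2}\cdot{p} = p\\
\text{Var}\left[X\right] &=& p - p^{2} = p\cdot{(1-p)}
\end{eqnarray*}
$$

For example, $p = 0.64$ gives $\text{Var}\left[X\right] = 0.64\cdot{0.36} = 0.2304$.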

In [17]:
let p = p, number_of_samples = number_of_samples

    # build a Bernoulli distribution
    d = Bernoulli(p)

    # sample (check expectation, and variance)
    samples = rand(d, number_of_samples);

    # build a table -
    data_for_table = Array{Any,2}(undef, 2, 3)
    table_header = ["", "E(X)", "Var(X)"]

    # row 1: model
    data_for_table[1,1] = "model"
    data_for_table[1,2] = mean(d);
    data_for_table[1,3] = var(d);

    # row 2: samples
    data_for_table[2,1] = "samples"
    data_for_table[2,2] = mean(samples);
    data_for_table[2,3] = var(samples);
    pretty_table(data_for_table, header=table_header);
end

┌─────────┬──────┬─────────┐
│         │ E(X) │  Var(X) │
├─────────┼──────┼─────────┤
│   model │ 0.64 │  0.2304 │
│ samples │ 0.62 │ 0.23798 │
└─────────┴──────┴─────────┘
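
The gap between the model and sample rows reflects the finite number of samples. The sketch below illustrates this without `Distributions.jl`, by thresholding uniform random numbers. The helper `sample_bernoulli`, the seed, and the sample sizes are assumptions introduced for illustration, and this is not necessarily how `rand(Bernoulli(p), n)` is implemented internally.

```julia
using Random, Statistics

# Hypothetical helper (not from the notebook): draw n Bernoulli(p) samples by
# thresholding uniform draws on [0, 1); a draw below p counts as a success (1).
sample_bernoulli(rng::AbstractRNG, p::Float64, n::Int) = [rand(rng) < p ? 1 : 0 for _ in 1:n]

rng = MersenneTwister(1234)   # fixed seed (arbitrary choice) for repeatability
p = 0.64

# the sample mean and variance should approach p and p*(1-p) as n grows
for n in (100, 10_000, 1_000_000)
    x = sample_bernoulli(rng, p, n)
    println("n = $n: mean = $(mean(x)), var = $(var(x))")
end
```

With only 100 samples, as in the cell above, deviations of a few hundredths from the model values are expected.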


### Geometric random variables
Geometric random variables model the number of trials required 
to obtain the first success in a sequence of independent Bernoulli trials.  The probability mass function for a geometric random variable is given by:

$$
\begin{equation*}
p_{X}(k) = (1-p)^{(k-1)}p\qquad{k=1,2,\dots}
\end{equation*}
$$

where $p$ denotes the geometric parameter $0<p<1$. The expectation of a geometric random variable $X$ is given by:

$$
\begin{equation*}
\mathbb{E}\left[X\right] = \frac{1}{p}
\end{equation*}
$$

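while the variance $\text{Var}(X)$ is given by:

$$
\begin{equation*}
\text{Var}\left[X\right] = \frac{1-p}{p^2}
\end{equation*}
$$

Note that `Distributions.jl` uses a different convention: `Geometric(p)` models the number of failures before the first success, $Y = X - 1$, with support $k=0,1,\dots$. Its expectation is therefore $\mathbb{E}\left[Y\right] = (1-p)/p$, while its variance is the same as that of $X$.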
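
The "number of trials until the first success" definition can be simulated directly. The sketch below uses only Julia's standard library. The helper `trials_until_success`, the seed, and the number of samples are assumptions made for illustration. It also computes the shifted count of failures before the first success, which has the same variance but a mean that is smaller by one.

```julia
using Random, Statistics

# Hypothetical helper (not from the notebook): count Bernoulli(p) trials up to
# and including the first success.
function trials_until_success(rng::AbstractRNG, p::Float64)
    k = 1
    while rand(rng) >= p      # a failure occurs with probability 1 - p
        k += 1
    end
    return k
end

rng = MersenneTwister(1234)   # fixed seed (arbitrary choice) for repeatability
p = 0.64
trials = [trials_until_success(rng, p) for _ in 1:100_000]
failures = trials .- 1        # failures before the first success

println("trials:   mean ≈ $(mean(trials)), 1/p = $(1/p)")
println("failures: mean ≈ $(mean(failures)), (1-p)/p = $((1-p)/p)")
println("variance ≈ $(var(trials)), (1-p)/p^2 = $((1-p)/p^2)")
```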

In [26]:
let p = p, number_of_samples = number_of_samples
   
    # build a Geometric distribution
    d = Geometric(p)

    # sample (check expectation, and variance)
    samples = rand(d, number_of_samples);

    # build a table -
    data_for_table = Array{Any,2}(undef, 2, 3)
    table_header = ["", "E(X)", "Var(X)"]

    # row 1: model
    # NOTE: Geometric(p) in Distributions.jl counts the failures before the first
    # success (support 0,1,...), so mean(d) = (1-p)/p rather than 1/p
    data_for_table[1,1] = "model"
    data_for_table[1,2] = mean(d);
    data_for_table[1,3] = var(d);

    # row 2: samples
    data_for_table[2,1] = "samples"
    data_for_table[2,2] = mean(samples);
    data_for_table[2,3] = var(samples);
    pretty_table(data_for_table, header=table_header);
end

┌─────────┬────────┬──────────┐
│         │   E(X) │   Var(X) │
├─────────┼────────┼──────────┤
│   model │ 0.5625 │ 0.878906 │
│ samples │   0.58 │ 0.811717 │
└─────────┴────────┴──────────┘


### Binomial random variables
A binomial random variable $X$ counts the number of successes in $n$ independent Bernoulli trials. Its
probability mass function is:

$$
\begin{equation*}
p_{X}(k) = \binom{n}{k}p^{k}\left(1-p\right)^{n-k}\qquad{k=0,1,\dots,n}
\end{equation*}
$$

where $k$ denotes the number of successes in $n$ independent experiments, the binomial parameter $0<p<1$ is the probability 
of a successful trial and:

$$
\begin{equation*}
\binom{n}{k} = \frac{n!}{k!\left(n-k\right)!}
\end{equation*}
$$

is the binomial coefficient. The expectation and variance of a binomial random variable are given by:

$$
\begin{eqnarray*}
\mathbb{E}\left[X\right] &=& np\\
\text{Var}\left[X\right] &=& np(1-p)
\end{eqnarray*}
$$
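
These formulas can be checked by evaluating the PMF term by term. The sketch below does so with Julia's built-in `binomial` function. The helper `binomial_pmf` and the choice $n = 10$ are assumptions made for illustration; a small $n$ is used because `binomial(n, k)` overflows `Int64` for large arguments such as `binomial(100, 50)`.

```julia
# Hypothetical helper (not from the notebook): evaluate the binomial PMF directly
# from the formula, using Julia's built-in binomial coefficient.
binomial_pmf(n::Int, p::Float64, k::Int) = binomial(n, k) * p^k * (1 - p)^(n - k)

n, p = 10, 0.64
pmf = [binomial_pmf(n, p, k) for k in 0:n]   # pmf[k + 1] holds p_X(k)

total = sum(pmf)                                          # should be 1
mean_X = sum(k * pmf[k + 1] for k in 0:n)                 # should be n*p
var_X = sum((k - mean_X)^2 * pmf[k + 1] for k in 0:n)     # should be n*p*(1-p)

println("sum = $total")
println("mean = $mean_X, np = $(n * p)")
println("var = $var_X, np(1-p) = $(n * p * (1 - p))")
```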

In [27]:
let p = p, number_of_samples = number_of_samples, number_of_trials = number_of_trials

    # build a Binomial distribution
    d = Binomial(number_of_trials,p)

    # sample (check expectation, and variance)
    samples = rand(d,number_of_samples);

    # build a table -
    data_for_table = Array{Any,2}(undef, 2, 3)
    table_header = ["", "E(X)", "Var(X)"]

    # row 1: model
    data_for_table[1,1] = "model"
    data_for_table[1,2] = mean(d);
    data_for_table[1,3] = var(d);

    # row 2: samples
    data_for_table[2,1] = "samples"
    data_for_table[2,2] = mean(samples);
    data_for_table[2,3] = var(samples);
    pretty_table(data_for_table, header=table_header);
end

┌─────────┬───────┬─────────┐
│         │  E(X) │  Var(X) │
├─────────┼───────┼─────────┤
│   model │  64.0 │   23.04 │
│ samples │ 63.43 │ 24.4496 │
└─────────┴───────┴─────────┘


### Poisson random variables
Poisson random variables model the number of occurrences of an event in a fixed interval of time or space.
The probability mass function for a Poisson random variable is given by:

$$
\begin{equation*}
p_{X}(x) = \frac{\lambda^{x}}{x!}\exp\left(-\lambda\right)\qquad{x=0,1,2,\dots}
\end{equation*}
$$

where $\lambda>0$ denotes the Poisson parameter, and $!$ denotes the factorial function. The expectation of a Poisson random variable $X$ is given by:

$$
\begin{equation*}
\mathbb{E}\left[X\right] = \lambda
\end{equation*}
$$

while the variance $\text{Var}(X)$ is given by:

$$
\begin{equation*}
\text{Var}\left[X\right] = \lambda
\end{equation*}
$$
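
The equality of the mean and the variance can be verified numerically from the PMF. The sketch below builds the PMF with the recurrence $p_{X}(x) = p_{X}(x-1)\cdot\lambda/x$, which follows from the formula above and avoids large factorials. The helper `poisson_pmf` and the truncation point `xmax` are assumptions made for illustration; the probability mass beyond `xmax` is negligible for the small $\lambda$ used here.

```julia
# Hypothetical helper (not from the notebook): Poisson PMF values for x = 0, 1, ..., xmax,
# built with the recurrence p(x) = p(x-1) * λ / x starting from p(0) = exp(-λ).
function poisson_pmf(λ::Float64, xmax::Int)
    pmf = zeros(xmax + 1)
    pmf[1] = exp(-λ)                  # p(0)
    for x in 1:xmax
        pmf[x + 1] = pmf[x] * λ / x   # pmf[x + 1] holds p(x)
    end
    return pmf
end

λ = 0.64
xmax = 30                             # truncation point (an assumption)
pmf = poisson_pmf(λ, xmax)

total = sum(pmf)                                            # should be close to 1
mean_X = sum(x * pmf[x + 1] for x in 0:xmax)                # should be close to λ
var_X = sum((x - mean_X)^2 * pmf[x + 1] for x in 0:xmax)    # should be close to λ

println("sum = $total, mean = $mean_X, var = $var_X, λ = $λ")
```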

In [30]:
let λ = p, number_of_samples = number_of_samples

    # build a Poisson distribution with rate λ (set equal to p above)
    d = Poisson(λ)

    # sample (check expectation, and variance)
    samples = rand(d, number_of_samples);

    # build a table -
    data_for_table = Array{Any,2}(undef, 2, 3)
    table_header = ["", "E(X)", "Var(X)"]

    # row 1: model
    data_for_table[1,1] = "model"
    data_for_table[1,2] = mean(d);
    data_for_table[1,3] = var(d);

    # row 2: samples
    data_for_table[2,1] = "samples"
    data_for_table[2,2] = mean(samples);
    data_for_table[2,3] = var(samples);
    pretty_table(data_for_table, header=table_header);
end

┌─────────┬──────┬──────────┐
│         │ E(X) │   Var(X) │
├─────────┼──────┼──────────┤
│   model │ 0.64 │     0.64 │
│ samples │ 0.58 │ 0.650101 │
└─────────┴──────┴──────────┘
