![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)
# **Review: `pdf`, `cdf`, and `ppf` of Continuous Random Variables**

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## **Terms and Properties of Continuous Probability Distributions**

### **$\color{green}{\textbf{p}}\color{red}{\textbf{df:}}$ Probability Density Function => $\color{green}{\textbf{ a curve}}$**:
* $f(x)$: the height of the curve at $X = x$; this is a density, not a probability

### **$\color{blue}{\textbf{c}}\color{red}{\textbf{df:}}$ Cumulative Distribution Function => $\color{blue}{\textbf{ area}}$ under the $\color{red}{\textbf{ pdf}} \color{green}{\textbf{ curve}}$**

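* $F(x) = P(X \le x)$: the area under the pdf curve to the left of $x$
* $P(k_1 \le X \le k_2) = P(k_1 < X \le k_2) = P(k_1 \le X < k_2) = P(k_1 < X < k_2) = F(k_2) - F(k_1)$, because a single point carries zero probability for a continuous random variable
* The **total** area under any $\color{red}{\textbf{ pdf}} =$ total cumulative probability $\color{blue}{\textbf{cdf}} = 1$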
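
The area interpretation can be checked numerically. The sketch below is only an illustration: the parameters `mu`, `sigma`, `k1`, and `k2` are example values, and `scipy.integrate.quad` is used to integrate the pdf so the result can be compared with the difference of two cdf values.

```python
import numpy as np
from scipy import integrate, stats

mu, sigma = 2.0, 0.5   # example parameters (assumed for illustration)
k1, k2 = 1.8, 2.75     # example interval endpoints

# Area under the pdf between k1 and k2, by numerical integration
area, _ = integrate.quad(stats.norm.pdf, k1, k2, args=(mu, sigma))

# The same probability from the cdf: F(k2) - F(k1)
prob = (stats.norm.cdf(k2, loc=mu, scale=sigma)
        - stats.norm.cdf(k1, loc=mu, scale=sigma))

# Total area under the pdf over the whole real line
total, _ = integrate.quad(stats.norm.pdf, -np.inf, np.inf, args=(mu, sigma))

print(f"area = {area:.6f}, cdf difference = {prob:.6f}, total = {total:.6f}")
```

The first two numbers should agree up to integration error, and the total should be very close to 1.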

### **$\color{orange}{\textbf{pp}}\color{red}{\textbf{f:}}$ Percent Point Function, the inverse of the $\color{blue}{\textbf{c}}\color{red}{\textbf{df}}$ => $\color{orange}{\textbf{ a point}}$ on the x-axis of the random variable, corresponding to the $\color{blue}{\textbf{ area}}$ = $\color{blue}{\textbf{c}}\color{red}{\textbf{df}}$ under the $\color{green}{\textbf{p}}\color{red}{\textbf{df}}$ $\color{green}{\textbf{ curve}}$**

$x = F^{-1}(p)$, where $F$ is the cdf and $p$ is the cumulative probability (the percentile written as a fraction)
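
A minimal sketch of the inverse relationship, assuming example values for `mu`, `sigma`, and `p`: `ppf` turns an area into a point, and `cdf` turns that point back into the same area.

```python
from scipy import stats

mu, sigma = 16.43, 0.8   # example parameters (assumed for illustration)
p = 0.20                 # left-tail area, i.e. the 20th percentile

x = stats.norm.ppf(p, loc=mu, scale=sigma)        # point on the x-axis
p_back = stats.norm.cdf(x, loc=mu, scale=sigma)   # area to the left of x

print(f"ppf({p}) = {x:.4f}, cdf({x:.4f}) = {p_back:.4f}")
```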

In [1]:
#@title

mu =   16.43
population_std =  0.8
sample_size =   15
alpha = 0.20

import numpy as np
import plotly.graph_objects as go
import scipy.stats  as stats

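std = population_std / np.sqrt(sample_size)  # standard deviation of the sample mean (standard error)

# ppf maps the left-tail area alpha to a point on the x-axis;
# cdf maps that point back to the area, so prob_left equals alpha
x_left = stats.norm.ppf(alpha, loc = mu, scale = std)
prob_left = stats.norm.cdf(x = x_left, loc = mu, scale = std)

prob_right = 1 - prob_left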

fig = go.Figure()

x_min = mu - 5 * std
x_max = mu + 5 * std

#add the density curve pdf to the plot
x_pdf = np.arange(x_min, x_max,0.01) # array of input values for pdf
y_pdf = stats.norm.pdf(x_pdf, loc = mu, scale = std)
fig.add_trace(go.Scatter(
                          x = x_pdf, y = y_pdf, mode = "lines", line_color ="black"
                      ))

# inputs for creating shade under the left tail
xx_left = np.arange(x_min, x_left, 0.01)
yy_left = stats.norm.pdf(xx_left, loc = mu, scale = std)
fig.add_trace(go.Scatter(
                          x = xx_left, y = yy_left, line_color = "black", fill = "tozeroy", fillcolor = 'lightgreen'
                      ))

# mark the point x_left = ppf(alpha) on the x-axis with a short red dotted line
fig.add_shape(type = 'line', x0 = x_left, y0 = -.03, x1 = x_left, y1 = 0.03, line = dict( color="red", dash = 'dot',width = 9))

# describe cdf
fig.add_annotation(
    x = x_left - 0.15
    , y = (1/8) * stats.norm.pdf(mu, loc = mu, scale = std)
    , text = f"Cumulative Distribution Function<br> An<b>  area</b> represents <br> <b>  cdf</b>({x_left: .1f} ) <br>= probability <br>= P( x  < {x_left: .1f} ) = {prob_left: .3f}"
    , font=dict(size=25, color="purple", family="Sans Serif")
    , align="left",

      ax= -120,
      ay=-120,

      arrowhead=3,
      arrowsize=1,
      arrowwidth=3,
      xanchor="center",
      yanchor="bottom",
    )
#describe ppf
fig.add_annotation(
      x=x_left,
      y=0,
      ax=60,
      ay=-60,
      text = "Percent Point Function <br>A <b>point</b> represents <br>Value of variable <br>= ppf(area/probability)",
      arrowhead=3,
      arrowsize=1,
      arrowwidth=3,
      xanchor="left",
      yanchor="bottom",
      font=dict(size=25, color="purple", family="Sans Serif")
      )
# describe pdf
fig.add_annotation(
      x= mu + 0.1,
      y= stats.norm.pdf(mu + 0.1, loc = mu, scale = std) ,
      ax = 100,
      ay= 0 ,
      text="probability density function<br>a <b>curve </b><br>represents pdf(x) ",
      arrowhead=3,
      arrowsize=1,
      arrowwidth=3,
      xanchor="left",
      yanchor="middle",
      font=dict(size=25, color="purple", family="Sans Serif")
        )

fig.update_layout(
    height = 800, width = 1200,
    title = "Illustration of pdf, cdf and ppf",
    title_x = 0.5,
    xaxis = dict( title = r"$X$", zeroline=True,  linewidth=1, linecolor='black', mirror=True,),
    yaxis = dict (title = "Probability Density Function", zeroline=True,  linewidth=1, linecolor='black', mirror=True),
    showlegend = False,
    font=dict(size=16, color="black", family="Sans Serif"),
    plot_bgcolor='rgba(0,0,0,0)',
    )

fig.show()

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)
# **Visualization of Cumulative Distribution Function `cdf`**

<details>
  <summary><b>Show Visual Illustration of different intervals of random variables for pdf and cdf</b></summary>
<img src = "https://drive.google.com/uc?id=1bL0qo_K5ChM0Yu6YfX1Yw7vhP0VRAXCK" />
</details>

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## **Mathematical Notation of Continuous Random Variables and their Probability Distribution Functions**

### **Uniform Distribution Function: $ X \sim U(a, b)$**

<details>
  <summary><b>Show explanation of parameters of uniform distribution </b></summary>
$U: \text{Uniform Probability Density Distribution Function}\\
X: \text{ a continuous random variable} $

<b>The Parameters:</b>\
$a= \text{the lowest value of }X \text{ and } \\
b = \text{ the highest value of }X.$
</details>
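
One practical point when moving from $U(a, b)$ to code: `scipy.stats.uniform` is parameterized by `loc` and `scale`, covering the interval from `loc` to `loc + scale`, so $U(a, b)$ corresponds to `loc=a, scale=b - a`. The values of `a` and `b` below are examples chosen for illustration.

```python
from scipy import stats

a, b = 2.0, 10.0                        # example lowest and highest values
X = stats.uniform(loc=a, scale=b - a)   # U(a, b) in SciPy's parameterization

density = X.pdf(5.0)    # constant height 1 / (b - a) inside [a, b]
area_left = X.cdf(4.0)  # (4 - a) / (b - a)
median = X.ppf(0.5)     # midpoint of [a, b]

print(density, area_left, median)
```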

### **Normal Distribution Function: $ X \sim N(\mu, \sigma)$ or $ X \sim N(\bar{x}, s)$**

<details>
  <summary><b>Show explanation of parameters of normal distribution </b></summary>
$N: \text{Normal Probability Density Distribution Function}\\
X: \text{ a continuous random variable} $

<b>The Parameters:</b>\
$\mu= \text{population mean } \\
\sigma = \text{ population standard deviation } \\
\bar{x} = \text{sample mean } \\
s = \text{ sample standard deviation }$
</details>

### **Standard Normal Distribution Function: $ X \sim N(0, 1)$**
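
Any normal variable can be mapped onto the standard normal with the z-score $z = (x - \mu)/\sigma$. The sketch below, with example values for `mu`, `sigma`, and `x`, checks that the cdf computed directly agrees with the cdf of the standardized value.

```python
from scipy import stats

mu, sigma, x = 2.0, 0.5, 2.75   # example values (assumed for illustration)

z = (x - mu) / sigma            # z-score of x

p_direct = stats.norm.cdf(x, loc=mu, scale=sigma)   # X ~ N(mu, sigma)
p_standard = stats.norm.cdf(z)                      # Z ~ N(0, 1)

print(f"z = {z:.2f}, P(X < x) = {p_direct:.4f}, P(Z < z) = {p_standard:.4f}")
```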

### **Student's t Distribution Function: $ X \sim T_{df}$**

<details>
  <summary><b>Show explanation of parameters of Student's t distribution </b></summary>
$T: \text{Student's t Probability Density Distribution Function}\\
X: \text{ a continuous random variable} $

<b>The Parameters:</b>\
$df = \text{the degrees of freedom of the distribution} = n-1, \text{where } n \text{ is the sample size.}$
</details>


### **Chi-Square Distribution Function: $ X \sim \chi^2_{df}$**

<details>
  <summary><b>Show explanation of parameters of Chi-Squared distribution </b></summary>
$\chi^2: \text{Chi-Square Probability Density Distribution Function}\\
X: \text{ a continuous random variable} $

<b>The Parameters:</b>\
$df = \text{the degrees of freedom of the distribution}$
</details>
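
The degrees of freedom fix both the center and the spread of a Chi-Square distribution: its mean is $df$ and its variance is $2\,df$. A short check, using `dof = 3` as an example value:

```python
import numpy as np
from scipy import stats

dof = 3  # example degrees of freedom

mean, var = stats.chi2.stats(df=dof, moments="mv")
mean, var = float(mean), float(var)
std = stats.chi2.std(df=dof)

print(f"mean = {mean}, variance = {var}, std = {std:.4f}")
```

These values are useful for choosing a plotting range, for example from 0 to the mean plus a few standard deviations.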

### **F Distribution Function: $ X \sim F_{df_n\,, \, df_d}$**

<details>
  <summary><b>Show explanation of parameters of F distribution </b></summary>
$F: \text{F Probability Density Distribution Function}$

$$F_{df_n \, ,\, df_d} =\frac{\chi_n^{2}\, \big / \,df_n}{\chi_d^{2} \,\big / \,df_d}$$

$X: \text{ a continuous random variable} $

<b>The Parameters:</b>\
$df_n,  df_d =$ the degrees of freedom of the independent Chi-Square distributions in the numerator and denominator, respectively.
</details>
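
The ratio definition can be illustrated by simulation. This is a rough sketch, not a proof: the seed, the sample count `n`, and the evaluation point `x` are arbitrary choices, and the empirical value only approximates the theoretical cdf.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
dfn, dfd, n = 1, 11, 200_000     # example degrees of freedom and sample count

# Two independent chi-square samples, each divided by its degrees of freedom
num = rng.chisquare(dfn, size=n) / dfn
den = rng.chisquare(dfd, size=n) / dfd
ratio = num / den

x = 0.3
empirical = np.mean(ratio <= x)                   # fraction of simulated ratios <= x
theoretical = stats.f.cdf(x, dfn=dfn, dfd=dfd)    # cdf of F(dfn, dfd) at x

print(f"empirical = {empirical:.4f}, theoretical = {theoretical:.4f}")
```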

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
# **Normal Distribution $ X \sim N(\mu, \sigma)$ or $ X \sim N(\bar{x}, s)$**

---

## **Syntax Structure:** $\hspace{20mm}$ `scipy.stats.norm`.**$\color{green}{\text{method}}( \color{magenta}{\text{parameters}})$**

|Method with Parameters|Output|
|--|--|
|```rvs(loc=0, scale=1, size=1, random_state=None)```|Random variates|
|```pdf(x, loc=0, scale=1)```|Probability density function|
|```cdf(x, loc=0, scale=1)```|Cumulative distribution function|
|```ppf(q, loc=0, scale=1)```|Percent point function (inverse of cdf — percentiles)|
|```stats(loc=0, scale=1, moments='mv')```|Mean('m'), variance('v'), skew('s'), and/or kurtosis('k')|
|```median(loc=0, scale=1)```|Median of the distribution.|
|```mean(loc=0, scale=1)```|Mean of the distribution.|
|```var(loc=0, scale=1)```|Variance of the distribution.|
|```std(loc=0, scale=1)```|Standard deviation of the distribution.|
|```interval(confidence, loc=0, scale=1)```|Endpoints of the central range that contains the fraction `confidence` of the distribution|

* `loc` represents mean and `scale` represents standard deviation
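
As a small sketch of how these methods fit together, the parameters can be fixed once by creating a "frozen" distribution and then calling the methods without repeating `loc` and `scale`. The values 2 and 0.5 are examples.

```python
from scipy import stats

dist = stats.norm(loc=2, scale=0.5)   # frozen normal distribution N(2, 0.5)

center = dist.mean()
spread = dist.std()
low, high = dist.interval(0.95)             # central range holding 95% of the area
covered = dist.cdf(high) - dist.cdf(low)    # area between the two endpoints
sample = dist.rvs(size=5, random_state=0)   # five random variates

print(center, spread, (low, high), covered)
print(sample)
```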

---

## **🔖 $\color{blue}{\textbf{Lab Work:}}$** Find the Probability $P(1.8 < x < 2.75)$ where the random variable $X$ has a normal distribution $X \sim N(2, 0.5)$ and graph the probability

In [2]:
import numpy as np
import plotly.graph_objects as go
import scipy.stats as stats

x_left = 1.8 # left end of the random variable
x_right = 2.75 # right end of the random variable
mu = 2 # mean
std = 0.5 # standard deviation

# Calculate the probability for the random variable with values between 1.8 and 2.75
# Calculate the cdf for the x_left
prob_left = stats.norm.cdf(
    x_left, # left end point
    loc=mu,
    scale=std
)

# Calculate the cdf for x_right, the shaded area from -infinity < x < x_right
prob_right = stats.norm.cdf(
    x_right,
    loc=mu,
    scale=std
)
# You could calculate the right tail by doing 1 - prob_right

prob_middle = prob_right - prob_left
print(f"The probability of a random variable 1.8 < x < 2.75: {prob_middle:.4f}")


The probability of a random variable 1.8 < x < 2.75: 0.5886


In [3]:
x_min = mu - 5 * std # Spread to the left by five standard deviations
x_max = mu + 5 * std # Spread to the right by five standard deviations

# Compute the ordered pairs to plot

# Create a list of values of the random variable between `x_min` and `x_max`
# The more points we have, the smoother the pdf curve will be
x_pdf = np.arange(
    x_min,
    x_max,
    0.01 # Resolution
)

# Calculate the y-values with pdf (Probability Density Function) for each value of a
# random variable in the previously defined list. This is distribution dependent.
y_pdf = stats.norm.pdf(
    x=x_pdf, # list of values of the random variable
    loc=mu, # location of the center
    scale=std # Spread of the curve
)

# Plot the normal probability density function (pdf)
fig = go.Figure() # Opens a plotting area
fig.add_trace(
    go.Scatter(
        x=x_pdf, # x-coordinate for the plot
        y=y_pdf, # y-coordinate for the plot
        mode="lines",
        line_color="black"
    )
)

# Shade area from x_left to x_right (the middle part)
xx_mid = np.arange(x_left, x_right, 0.01) # An array of values for shading
yy_mid = stats.norm.pdf(
    x=xx_mid,
    loc=mu,
    scale=std
)

fig.add_trace(
    go.Scatter(
        x=xx_mid, # x-coordinate for the middle plot
        y=yy_mid, # y-coordinate for the middle plot
        line_color="rgba(47, 128, 29, 1.0)",
        fill="tozeroy",
        fillcolor="rgba(47, 128, 29, 0.5)"
    )
)

# Specify layout parameters
fig.update_layout(
    height=500,
    width=1000,
    title="Normal Distribution Probability Function",
    title_x=0.5,
    xaxis=dict(
        title="Random Variable X",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    yaxis=dict(
        title="Probability Density Function: f(x)",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    font=dict(
        size=16,
        color="grey",
        family="Sans Serif"
    ),
    plot_bgcolor="white"
)
fig.add_annotation(
    x=mu - 3 * std,
    y=(1/3) * stats.norm.pdf(x=mu, loc=mu, scale=std),
    text=f"Shaded area represents <br>P({x_left} < x < {x_right})<br>value = {prob_middle:.4f}",
    font=dict(
        size=18,
        color="blue",
        family="Sans Serif"
    ),
    align="left"
)
fig.show()

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## **🔖 $\color{green}{\textbf{TODO 1: }}$**


> ### **1.** Compute the Probability $P(x < 70)$ where the random variable $X$ has a Student's t distribution $X \sim t_{9},\, \mu = 65$, and $std = 3$
> ### **2.** Graph the pdf and indicate the area that represents the probability.
> ### **3.** Adapt your plot code from **2** into an app that computes and illustrates $P(x < x_{\text{left}})$, letting the user input the value of the random variable x_left (or another variable name of your choice) and the degrees of freedom.

In [8]:
# @title Interactive Probability Visualization {"run":"auto"}
df_user = 5 # @param {"type":"slider","min":1,"max":50,"step":1}
random_variable = 65 # @param {"type":"slider","min":50,"max":80,"step":1}
mu = 65
std = 3
df = df_user
x_right = random_variable

# Calculate the cdf for x_right, the shaded area from x < x_right
prob_right = stats.t.cdf(
    x_right,
    df=df,
    loc=mu,
    scale=std
)

x_min = mu - 5 * std # Spread to the left by five standard deviations
x_max = mu + 5 * std # Spread to the right by five standard deviations

# Compute the ordered pairs to plot

# Create a list of values of the random variable between `x_min` and `x_max`
# The more points we have, the smoother the pdf curve will be
x_pdf = np.arange(
    x_min,
    x_max,
    0.01 # Resolution
)

# Calculate the y-values with pdf (Probability Density Function) for each value of a
# random variable in the previously defined list. This is distribution dependent.
y_pdf = stats.t.pdf(
    x=x_pdf, # list of values of the random variable
    df=df,
    loc=mu, # location of the center
    scale=std # Spread of the curve
)

# Plot the Student's t probability density function (pdf)
fig = go.Figure() # Opens a plotting area
fig.add_trace(
    go.Scatter(
        x=x_pdf, # x-coordinate for the plot
        y=y_pdf, # y-coordinate for the plot
        mode="lines",
        line_color="black"
    )
)

# Shade area from x_min to x_right (the left tail)
x_shade = np.arange(x_min, x_right, 0.01) # An array of values for shading
yy_mid = stats.t.pdf(
    x=x_shade,
    df=df,
    loc=mu,
    scale=std
)

fig.add_trace(
    go.Scatter(
        x=x_shade, # x-coordinate for the middle plot
        y=yy_mid, # y-coordinate for the middle plot
        line_color="rgba(47, 128, 29, 1.0)",
        fill="tozeroy",
        fillcolor="rgba(47, 128, 29, 0.5)"
    )
)

# Specify layout parameters
fig.update_layout(
    height=500,
    width=1000,
    title="Student's T Distribution Probability Function",
    title_x=0.5,
    xaxis=dict(
        title="Random Variable X",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    yaxis=dict(
        title="Probability Density Function: f(x)",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    font=dict(
        size=16,
        color="grey",
        family="Sans Serif"
    ),
    plot_bgcolor="white"
)
fig.add_annotation(
    x=mu - 3 * std,
    y=(1/3) * stats.t.pdf(x=mu, df=df, loc=mu, scale=std),
    text=f"Shaded area represents <br>P(x < {x_right})<br>value = {prob_right:.4f}",
    font=dict(
        size=18,
        color="blue",
        family="Sans Serif"
    ),
    align="left"
)
fig.show()
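
For part **1** specifically, the probability can be computed directly with the values from the problem statement ($df = 9$, $\mu = 65$, $std = 3$, $x = 70$). The sketch below also standardizes by hand to show that `loc` and `scale` only shift and stretch the standard t curve. Note that for `scipy.stats.t` the `scale` argument is not exactly the standard deviation; treating it as such follows the exercise's convention.

```python
from scipy import stats

x, dof, mu, std = 70, 9, 65, 3   # values from the problem statement

p = stats.t.cdf(x, df=dof, loc=mu, scale=std)

# Equivalent computation on the standard t curve
t_score = (x - mu) / std
p_standard = stats.t.cdf(t_score, df=dof)

print(f"P(x < {x}) = {p:.4f}")
```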

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## **🔖 $\color{green}{\textbf{TODO 2: }}$**


> ### **1.** Compute the Probability $P(1.4 < x < 2.2)$ where the random variable $X$ has a normal distribution $X \sim N(2.1, 0.7)$

>### **2.** Graph the pdf and indicate the area that represents the probability.
>### **3.** Adapt your plot code from **2** into an app that computes and illustrates $P(x_{\text{left}} < x < x_{\text{right}})$, letting the user input the values of the random variable x_left and x_right (or other variable names of your choice), the standard deviation, and the mean of the distribution.

In [None]:
# @title Probability within a Normal Distribution {"run":"auto"}
# @markdown Drag the slider to modify the left bound
user_x_left = 2.1 # @param {"type":"slider","min":0,"max":4,"step":0.1}
# @markdown Drag the slider to modify the right bound
user_x_right = 2.2 # @param {"type":"slider","min":0,"max":4,"step":0.1}
if user_x_left >= user_x_right:
    raise ValueError("Left bound must be less than right bound")

x_left = user_x_left # left end of the random variable
x_right = user_x_right # right end of the random variable
mu = 2.1 # mean
std = 0.7 # standard deviation

# Calculate the probability for the random variable with values between x_left and x_right
# Calculate the cdf for the x_left
prob_left = stats.norm.cdf(
    x_left, # left end point
    loc=mu,
    scale=std
)

# Calculate the cdf for x_right, the shaded area from -infinity < x < x_right
prob_right = stats.norm.cdf(
    x_right,
    loc=mu,
    scale=std
)
# You could calculate the right tail by doing 1 - prob_right
prob_middle = prob_right - prob_left

x_min = mu - 5 * std # Spread to the left by five standard deviations
x_max = mu + 5 * std # Spread to the right by five standard deviations

# Compute the ordered pairs to plot

# Create a list of values of the random variable between `x_min` and `x_max`
# The more points we have, the smoother the pdf curve will be
x_pdf = np.arange(
    x_min,
    x_max,
    0.01 # Resolution
)

# Calculate the y-values with pdf (Probability Density Function) for each value of a
# random variable in the previously defined list. This is distribution dependent.
y_pdf = stats.norm.pdf(
    x=x_pdf, # list of values of the random variable
    loc=mu, # location of the center
    scale=std # Spread of the curve
)

# Plot the normal probability density function (pdf)
fig = go.Figure() # Opens a plotting area
fig.add_trace(
    go.Scatter(
        x=x_pdf, # x-coordinate for the plot
        y=y_pdf, # y-coordinate for the plot
        mode="lines",
        line_color="black"
    )
)

# Shade area from x_left to x_right (the middle part)
xx_mid = np.arange(x_left, x_right, 0.01) # An array of values for shading
yy_mid = stats.norm.pdf(
    x=xx_mid,
    loc=mu,
    scale=std
)

fig.add_trace(
    go.Scatter(
        x=xx_mid, # x-coordinate for the middle plot
        y=yy_mid, # y-coordinate for the middle plot
        line_color="rgba(47, 128, 29, 1.0)",
        fill="tozeroy",
        fillcolor="rgba(47, 128, 29, 0.5)"
    )
)

# Specify layout parameters
fig.update_layout(
    height=500,
    width=1000,
    title="Normal Distribution Probability Function",
    title_x=0.5,
    xaxis=dict(
        title="Random Variable X",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    yaxis=dict(
        title="Probability Density Function: f(x)",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    font=dict(
        size=16,
        color="grey",
        family="Sans Serif"
    ),
    plot_bgcolor="white"
)
fig.add_annotation(
    x=mu - 3 * std,
    y=(1/3) * stats.norm.pdf(x=mu, loc=mu, scale=std),
    text=f"Shaded area represents <br>P({x_left} < x < {x_right})<br>value = {prob_middle:.4f}",
    font=dict(
        size=18,
        color="blue",
        family="Sans Serif"
    ),
    align="left"
)
fig.show()
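
For part **1** specifically, the bounds from the problem statement (1.4 and 2.2) can be plugged in directly. A minimal sketch:

```python
from scipy import stats

mu, std = 2.1, 0.7            # parameters from the problem statement
x_left, x_right = 1.4, 2.2    # interval from the problem statement

# P(x_left < x < x_right) = F(x_right) - F(x_left)
prob_middle = (stats.norm.cdf(x_right, loc=mu, scale=std)
               - stats.norm.cdf(x_left, loc=mu, scale=std))

print(f"P({x_left} < x < {x_right}) = {prob_middle:.4f}")
```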

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## **$\color{green}{\textbf{TODO 3:}}$**



>### **1.** Compute the two-tailed probability for a normal approximation of the binomial distribution of one sample proportion: $P\left(\hat{p} < p - \left(\frac{x_{\text{success}}}{n} - p\right)\right)$ or $P\left(\hat{p} > \frac{x_{\text{success}}}{n}\right)$, with $\frac{x_{\text{success}}}{n} = \frac{55}{120}$, where $x_{\text{success}}$ is the number of successes in the sample, $n$ is the sample size, the probability of success on each trial is $p = 0.41$, and the random variable $\hat{p}$ has a normal distribution $ \hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$
>### **2.** Graph the pdf and indicate the area that represents the probability.

In [None]:
import numpy as np
import plotly.graph_objects as go
import scipy.stats as stats

# Parameters for binomial experiment
p_success = 0.41
p_hat = (55/120)
sample_size = 120

# Approximate the sampling distribution of p_hat with a normal distribution
mu = p_success
std = np.sqrt(p_success * (1 - p_success)/sample_size)

x_right = p_hat
x_left = p_success - (p_hat - p_success)

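# Calculate the probability of two tails
# (the normal curve is symmetric, so the right tail equals the left tail)
prob_left = stats.norm.cdf(
    x=x_left,
    loc=mu,
    scale=std
)
prob_tails = 2 * prob_left
print(f"The two-tailed probability: {prob_tails:.4f}")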

fig = go.Figure()

x_min = mu - 5 * std
x_max = mu + 5 * std

x_pdf = np.arange(x_min, x_max, 0.001)
y_pdf = stats.norm.pdf(
    x_pdf,
    loc=mu,
    scale=std
)

fig.add_trace(
    go.Scatter(
        x=x_pdf,
        y=y_pdf,
        mode="lines",
        line_color="black"
    )
)

# Left tail
xx_left = np.arange(x_min, x_left, 0.001)
yy_left = stats.norm.pdf(
    x=xx_left,
    loc=mu,
    scale=std
)

fig.add_trace(
    go.Scatter(
        x=xx_left,
        y=yy_left,
        line_color="magenta",
        fill="tozeroy",
        fillcolor="rgba(204, 235, 52, 0.7)"
    )
)

# Right tail
xx_right = np.arange(x_right, x_max, 0.001)
yy_right = stats.norm.pdf(
    x=xx_right,
    loc=mu,
    scale=std
)

fig.add_trace(
    go.Scatter(
        x=xx_right,
        y=yy_right,
        line_color="magenta",
        fill="tozeroy",
        fillcolor="rgba(204, 235, 52, 0.7)"
    )
)

fig.add_annotation(
    x=mu - 3 * std,
    y=(1/3) * stats.norm.pdf(x=mu, loc=mu, scale=std),
    text=f"Shaded area <br>represents Probability<br>P(x < {x_left:.3f})<br>={prob_left:.3f}",
    font=dict(
        size=14,
        color="black"
    ),
    align="left"
)
fig.add_annotation(
    x=mu + 3 * std,
    y=(1/3) * stats.norm.pdf(x=mu, loc=mu, scale=std),
    text=f"Shaded area <br>represents Probability<br>P(x > {x_right:.3f})<br>={prob_left:.3f}",
    font=dict(
        size=14,
        color="black"
    ),
    align="left"
)

fig.update_layout(
    width = 1000, height = 500,
    title = "Normal Distribution Probability Function (Two Tails)",
    title_x = 0.5,
    yaxis = dict(
        title="Probability Density Function",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    xaxis = dict(
        title="Random Variable X",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    xaxis_tickangle = 0,
    showlegend = False,
    plot_bgcolor = "white"
)

fig.show()
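
To get a feel for how good the normal approximation is here, the upper tail can be compared with the exact binomial tail. This comparison is an addition for illustration, not part of the original task; it uses `scipy.stats.binom.sf`, where `sf(k)` is $P(X > k)$.

```python
import numpy as np
from scipy import stats

n, p, successes = 120, 0.41, 55   # values from the problem statement
p_hat = successes / n
se = np.sqrt(p * (1 - p) / n)     # standard deviation of p_hat

# Normal approximation of the upper tail P(p_hat > 55/120)
upper_normal = stats.norm.sf(p_hat, loc=p, scale=se)

# Exact binomial upper tail P(X >= 55) = P(X > 54)
upper_exact = stats.binom.sf(successes - 1, n, p)

print(f"normal approximation: {upper_normal:.4f}")
print(f"exact binomial:       {upper_exact:.4f}")
```

The two values differ somewhat because the normal curve is continuous while the binomial count is discrete.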

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## **$\color{green}{\textbf{TODO 4:}}$**



> ### **1.** Find the Probability $P(x > 0.3)$ where the random variable $X$ has an F distribution $X \sim F_{1, 11}$
> ### **2.** Graph the pdf and indicate the area that represents the probability.

In [None]:
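mu = 0    # loc: left at the default (no shift)
std = 1   # scale: left at the default (no stretch)
dfn = 1   # numerator degrees of freedom
dfd = 11  # denominator degrees of freedom
x_min = 0.01  # F is never negative; start just above 0 because the pdf is unbounded at 0 when dfn = 1
x_max = mu + 5 * std

x_left = 0.3 # left end of the shaded right tail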

# Calculate the cdf for the x_left
prob_left = stats.f.cdf(
    x_left, # end point
    dfn=dfn,
    dfd=dfd
)
prob_right = 1 - prob_left
print(f"The probability of a random variable x < {x_left}: {prob_left:.4f}")
print(f"The probability of a random variable x > {x_left}: {prob_right:.4f}")

x_pdf = np.arange(x_min, x_max, 0.01)  # 0.01 is the step size; a larger step gives a coarser curve
# calculate the pdf y-values for each value of the random variable in x_pdf
# this is distribution dependent
y_pdf = stats.f.pdf(
    x = x_pdf,
    dfn = dfn,
    dfd = dfd
)

xx_mid = np.arange(x_left, x_max, 0.01) # An array of values for shading

yy_mid = stats.f.pdf(
    x=xx_mid,
    dfn=dfn,
    dfd=dfd,
    loc=mu,
    scale=std
)

fig = go.Figure()  #open a plotting area

fig.add_trace(
    go.Scatter(
        x=x_pdf,  #x coordinate for the plot
        y=y_pdf,  #y coordinate
        mode="lines",
        line_color="black"
    )
)

fig.add_trace(
    go.Scatter(
        x=xx_mid, # x-coordinate for the middle plot
        y=yy_mid, # y-coordinate for the middle plot
        line_color="rgba(47, 128, 29, 1.0)",
        fill="tozeroy",
        fillcolor="rgba(47, 128, 29, 0.5)"
    )
)

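fig.add_annotation(
    x=mu + 3 * std,
    # place the label at half the pdf height at x_left (the pdf at 0 is unbounded when dfn = 1)
    y=(1/2) * stats.f.pdf(x=x_left, dfn=dfn, dfd=dfd),
    # the shaded right tail is prob_right, not prob_middle from an earlier cell
    text=f"Shaded area represents <br>P(x > {x_left})<br>value = {prob_right:.4f}",
    font=dict(
        size=18,
        color="black",
        family="Sans Serif"
    ),
    align="right"
)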

fig.update_layout(
    height=500,
    width=1000,
    title="F Distribution Probability Function",
    title_x=0.5,
    xaxis=dict(
        title="Random Variable X",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    yaxis=dict(
        title="Probability Density Function: f(x)",
        zeroline=True,
        linewidth=1,
        linecolor="black"
    ),
    font=dict(
        size=16,
        color="grey",
        family="Sans Serif"
    ),
    plot_bgcolor="white"
)

fig.show()

The probability of a random variable x < 0.3: 0.4052
The probability of a random variable x > 0.3: 0.5948
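
One way to sanity-check this result: when the numerator has 1 degree of freedom, an $F_{1, d}$ variable is the square of a Student's t variable with $d$ degrees of freedom, so $P(F_{1,11} > 0.3) = P(|t_{11}| > \sqrt{0.3})$. A short sketch of that check:

```python
import numpy as np
from scipy import stats

x, dfd = 0.3, 11

# Right tail of F(1, 11)
p_f = 1 - stats.f.cdf(x, dfn=1, dfd=dfd)

# Both tails of t(11) beyond sqrt(x)
p_t = 2 * (1 - stats.t.cdf(np.sqrt(x), df=dfd))

print(f"F tail = {p_f:.4f}, two-sided t tail = {p_t:.4f}")
```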


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## **$\color{green}{\textbf{TODO 5:}}$**



> ### **1.** Compute the Probability  $P(x > 2)$ where the random variable $X$ has a $\chi^2$ distribution $X \sim \chi^2_{3}$
> ### **2.** Graph the pdf and indicate the area that represents the probability.

In [None]:
dof = 3
mu = dof  # the mean of a chi-square distribution equals its degrees of freedom
std = np.sqrt(2 * dof)  # its variance is 2 * df
# use the mean as a center
# define the range of values of the random variable
x_min = 0  # a chi-square variable is a sum of squares, so it is never negative
x_max = mu + 5 * std  # spread to the right by 5 standard deviations


x_left = 2
prob_left = stats.chi2.cdf(
    x_left, # end point
    df=dof
)
prob_right = 1 - prob_left
print(f"The probability of a random variable x < {x_left}: {prob_left:.4f}")
print(f"The probability of a random variable x > {x_left}: {prob_right:.4f}")

xx_mid = np.arange(x_left, x_max, 0.01) # An array of values for shading

yy_mid = stats.chi2.pdf(
    x=xx_mid,
    df=dof
)

# create a list of values of the random variable between x_min and x_max
# the more points we have, the smoother the pdf curve will be
x_pdf = np.arange(x_min, x_max, 0.01)  # 0.01 is the step size; a larger step gives a coarser curve
# calculate the pdf y-values for each value of the random variable in x_pdf
# this is distribution dependent
y_pdf = stats.chi2.pdf(
    x = x_pdf,
    df = dof, # Degree of freedom for the distribution
)

# plot the chi^2 probability density function (pdf)

fig = go.Figure()  #open a plotting area

fig.add_trace(
    go.Scatter(
        x = x_pdf,  #x coordinate for the plot
        y = y_pdf,  #y coordinate
        mode = "lines",
        line_color = "black"
    )
)
fig.add_trace(
    go.Scatter(
        x=xx_mid, # x-coordinate for the middle plot
        y=yy_mid, # y-coordinate for the middle plot
        line_color="rgba(47, 128, 29, 1.0)",
        fill="tozeroy",
        fillcolor="rgba(47, 128, 29, 0.5)"
    )
)

fig.update_layout(
    height = 500, width = 800,
    title = r"$\chi^2 \text{ Probability Density Function}$",
    title_x = 0.5,
    xaxis = dict(
        title = "Random Variable x",
        zeroline = True,
        linewidth = 1,
        linecolor = "black"
    ),
    yaxis = dict(
        title = "Probability Density Function pdf: f(x)",
        zeroline = True,
        linewidth = 1,
        linecolor = "black"
    ),
    font = dict(
        size = 16,
        color = "black",
        family = "Sans Serif"
    ),
    plot_bgcolor = "white"
)

fig.show()

The probability of a random variable x < 2: 0.4276
The probability of a random variable x > 2: 0.5724
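
As an alternative to computing `1 - cdf`, SciPy distributions also provide a survival function `sf` that returns the right-tail area directly, and `isf` that inverts it. A brief sketch with the values from this task:

```python
from scipy import stats

dof, x = 3, 2

right_tail = stats.chi2.sf(x, df=dof)          # P(X > x)
via_cdf = 1 - stats.chi2.cdf(x, df=dof)        # same area from the cdf
x_back = stats.chi2.isf(right_tail, df=dof)    # point with that right-tail area

print(f"P(x > {x}) = {right_tail:.4f}")
```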


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## **$\color{green}{\textbf{TODO 6:}}$**

> ### **1.** Find the 25th percentile of the normal distribution with mean = 15 and standard deviation of 2.
> ### **2.** Find the 25th percentile of the Student's t distribution with mean = 15, standard deviation of 2, and 5 degrees of freedom.
> ### **3.** Discuss whether, in this case, you can replace the Student's t distribution with the related normal distribution.

In [None]:
mu = 15
std = 2
dof = 5
percentile = 0.25
print(f"The {percentile:.1%} percentile of the normal distribution(mu: {mu}, std: {std}) is: {stats.norm.ppf(percentile, loc=mu, scale=std):.4f}")
print(f"The {percentile:.1%} percentile of the student's t distribution(mu: {mu}, std: {std}, df: {dof}) is: {stats.t.ppf(percentile, df=dof, loc=mu, scale=std):.4f}")

The 25.0% percentile of the normal distribution(mu: 15, std: 2) is: 13.6510
The 25.0% percentile of the student's t distribution(mu: 15, std: 2, df: 5) is: 13.5466


In this instance, with only 5 degrees of freedom, the Student's t distribution has heavier tails than the normal distribution: its 25th percentile is 13.5466 versus 13.6510 for the normal. That gap is too large to replace the Student's t distribution with the normal distribution. From the searching I have done, a common rule of thumb is that at about 30 degrees of freedom or more the normal distribution can be substituted for the Student's t distribution, though this is not a hard and fast rule.
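
The rule of thumb can be explored by watching the gap between the two 25th percentiles shrink as the degrees of freedom grow. The list of `dof` values below is an arbitrary choice for illustration.

```python
from scipy import stats

mu, scale, q = 15, 2, 0.25
normal_q = stats.norm.ppf(q, loc=mu, scale=scale)

gaps = {}
for dof in (5, 10, 30, 100, 1000):
    t_q = stats.t.ppf(q, df=dof, loc=mu, scale=scale)
    gaps[dof] = normal_q - t_q   # how far the t percentile sits below the normal one
    print(f"df = {dof:>4}: t percentile = {t_q:.4f}, gap to normal = {gaps[dof]:.4f}")
```

Whether a given gap is small enough depends on the precision the application needs, which is why 30 is a guideline rather than a strict cutoff.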