In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("lab06.ipynb")

### CE 93: Lab Assignment 06

You must submit the lab to Gradescope by the due date. You will submit the zip file produced by running the final cell of the assignment.

## About this Lab
The objective of this assignment is to work with different common discrete and continuous distributions.

## Instructions 
**Run the first cell, Initialize Otter**, to import the autograder and submission exporter.

Throughout the assignment, replace `...` with your answers. We use `...` as a placeholder, and these should be deleted and replaced with your answers.

Any part listed as a "<font color='red'>**Question**</font>" should be answered to receive credit.

**Please save your work after every question!**

To read the documentation on a Python function, you can type `help()` and add the function name between parentheses.

**Run the cell below**, to import the required modules.

In [None]:
# Please run this cell, and do not modify the contents
import math
import numpy as np
import scipy
from scipy.stats import *                    # common distributions
import pandas as pd
import statistics as stats
import cmath
import re
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import hashlib
import ipywidgets as widgets
from ipywidgets import FileUpload
from IPython.display import display
from PIL import Image
import os
import resources

def get_hash(num):
    """Helper function for assessing correctness"""
    return hashlib.md5(str(num).encode()).hexdigest()

### Introduction

Construction is a part of everyday life for most driving Californians. It is pretty common to come across one or more instances of traffic being redirected during a daily commute. CalTrans (California Department of Transportation) is interested in the number of cars using Highway 17 every day because they will need to **redirect traffic in one lane** in order to fix a pothole. Because this is a busy highway, they plan to complete the construction at around 11pm on a weekday.

They have asked you, as someone with experience in engineering data analysis, to help them understand the traffic patterns and the impact of the planned construction. So, in this lab, you will use properties of common distributions to perform these analyses.

### Modeling the Number of Cars

<font color='red'>**Question 1.0.**</font> Assume the number of cars that pass through the construction part of the Highway can be modeled as a **Poisson distribution**. Based on the lecture notes, what is(are) possible input parameter(s) for the Poisson distribution? Assign ALL that apply to the variable `q1_0`. (0.5 pts)

**A.** Average rate of occurrences \
**B.** Standard deviation of occurrences \
**C.** Probability of success \
**D.** Average number of occurrences \
**E.** Number of trials \
**F.** None of the options

Answer in the next cell. Add each selected choice as a string and separate the choices with commas. For example, if you want to select `"A"` and `"B"`, your answer should be `"A", "B"`.\
Assign your answer to the given variable.
Remember to put quotes around each answer choice.

In [None]:
# ANSWER CELL
q1_0 = ...
q1_0

In [None]:
grader.check("q1.0")

### Poisson Distribution using `scipy.stats`
The `scipy.stats` module contains a large number of common probability distributions. After importing `scipy.stats` (which was done in the second code cell above), we can use built-in methods to get the probability mass function, cumulative distribution function, mean, variance, and more of a Poisson distribution. However, it is **very important** to understand the inputs to these distributions in Python, as they might differ from what we discussed in class.

Read the documentation for `scipy.stats.poisson` [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.poisson.html). This function in Python takes one input parameter, which is called `mu` (we defined it as $\lambda$ in the lecture).

In the documentation of `scipy.stats.poisson`, the probability mass function (PMF) of the Poisson distribution is given as:

$$f(k) = \text{exp}(-\mu)\dfrac{\mu^k}{k!}, \ \ \ k = \{0, 1, 2, ...\}$$

This is equivalent to what we defined in the lecture. If $\lambda$ is the average **number** of occurrences in the time interval we are interested in, $t$, we defined the PMF as:

$$f(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}, \ \ \ x = \{0, 1, 2, ...\}$$

If $\lambda$ is the average **rate** of occurrences per unit time and $t$ is the length of the time interval we are interested in, we defined the PMF as:

$$f(x) = \dfrac{e^{-\lambda t} (\lambda t)^x}{x!}, \ \ \ x = \{0, 1, 2, ...\}$$

By comparing the given equation for the probability mass function in the documentation of `scipy.stats.poisson` with the lecture notes, it should be evident that $\mu$ represents the average **number** of occurrences in the time interval we are interested in, $t$.
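
As a quick numerical check of this equivalence, the sketch below compares `poisson.pmf` with the lecture formula written out by hand. The values (a rate of 2 occurrences per hour over 3 hours) and the variable names are hypothetical and unrelated to the cars problem.

```python
import math
from scipy.stats import poisson

# Hypothetical values for illustration only (not the cars problem)
rate = 2.0   # average rate of occurrences per hour (lambda in the lecture)
t = 3.0      # length of the time interval in hours
x = 4        # number of occurrences of interest

# Average number of occurrences in the interval
mu = rate * t

# Lecture formula: f(x) = exp(-lambda*t) * (lambda*t)^x / x!
manual = math.exp(-rate * t) * (rate * t) ** x / math.factorial(x)

# scipy takes the average number of occurrences directly
from_scipy = poisson.pmf(x, mu)

print(manual, from_scipy)  # the two values should agree
```

If the two printed values agree, `mu` is playing the role of $\lambda t$ from the lecture.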

### Probability Mass Function

To get the probability mass function of a Poisson distribution in Python, we can use `poisson.pmf(k, mu)`.

The input `mu` is the parameter of the distribution as discussed above. The input `k` represents possible values of the random variable (equivalent to $x$). 

`k` could be a single value:
> For example, if the average number of occurrences is 1 and you want to compute $P(X=3)$ for a Poisson distribution, you can directly use `poisson.pmf(k=3, mu=1)`

`k` could also be a data structure:
> For example, if the average number of occurrences is 1 and you want to compute $P(X=0), P(X=1), P(X=2), P(X=3),$ for a Poisson distribution, you can directly use `poisson.pmf(k=np.arange(4), mu=1)`

<div class="alert alert-block alert-warning"> <b>NOTE!</b> <code>np.arange(start, stop, step)</code> is used to create arrays that are in order and evenly spaced. <code>start</code> is the number at which to start the array and is an optional argument, and thus, if not specified, it will take the default value 0. <code>stop</code> is the number at which to stop the array (excluded) and is a required argument. <code>step</code> is the increment in the values of the array and is an optional argument, and thus, if not specified, it will take the default value 1. So when using <code>np.arange(4)</code>, since there is only one argument, this has to be the required argument <code>stop</code>. So, <code>start</code> and <code>step</code> are assigned their default values. This means that <code>np.arange(4)</code> is equivalent to <code>np.arange(0, 4, 1)</code>. Keep in mind that the value <code>stop</code> is excluded from the array, so <code>np.arange(4)</code> creates an array with the values [0, 1, 2, 3]. </div>
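
The two ways of passing `k` described above can be tried side by side. This is only a small sketch with `mu=1`, matching the examples in the text; the variable names `single` and `several` are illustrative.

```python
import numpy as np
from scipy.stats import poisson

# np.arange(4) contains 0, 1, 2, 3 (the stop value 4 is excluded)
k = np.arange(4)

# Single value: P(X = 3)
single = poisson.pmf(k=3, mu=1)

# Array of values: P(X = 0), P(X = 1), P(X = 2), P(X = 3)
several = poisson.pmf(k=k, mu=1)

print(k)
print(single)
print(several)
```

The last entry of `several` should match `single`, since both correspond to $P(X=3)$.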

Recall that CalTrans is interested in the number of cars using Highway 17 because they will need to **redirect traffic in one lane** in order to fix a pothole. Because this is a busy highway, they plan to complete the construction at around 11pm on a weekday.

Based on historical data, the average **rate** of cars passing on one lane at 11pm on a weekday is 1.2 cars **per minute**. CalTrans estimates that the construction will take between 1 and 2 **hours**.

For simplicity, we will use **1.5 hours** as the duration we are interested in (middle value of the estimate provided by CalTrans of 1 to 2 hours).

<font color='red'>**Question 2.0.**</font> What is the probability that exactly 100 cars will need to be redirected if the construction project takes 1.5 hours? Recall that the average rate is 1.2 cars per minute. Assign your answer to `q2_0`. Do not just manually type the numeric answer. Use Python expressions that return the desired answer and assign them to the variable. (0.5 pts)

*Hint: `poisson.pmf(k, mu)` returns the probability mass function of a Poisson distribution. So, you simply need to use the value of `k`, which represents the number of occurrences of interest, and the correct value of `mu`, which represents the average number of occurrences in the time interval we are interested in.*

In [None]:
# ANSWER CELL
q2_0 = ...
print(f'Probability of exactly 100 cars passing is P(X=100) = {q2_0:.3f}')

In [None]:
grader.check("q2.0")

Next, you want to plot the full probability mass function of the number of cars.

<font color='red'>**Question 2.1.**</font> Plot the PMF of the number of cars during a duration of 1.5 hours. Follow these steps: (1.0 pt)

1. Create an array named `k` for the possible number of cars starting at 0 and ending at 200 (inclusive): {0, 1, 2, ..., 199, 200}
2. Calculate the input parameter `mu` correctly based on the documentation of `scipy.stats.poisson`. Recall that the average rate is 1.2 cars per minute and we want the probability mass function over a duration of 1.5 hours.
3. Plot the PMF using [`plt.plot()`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html). Click on the function to read its documentation. Set the line style of the plot to a dotted line  `linestyle = ':'` and the marker to a circle `marker = 'o'`.
4. Set the x-axis label to 'Number of cars' and the y-axis label to 'Probability'

*Hint: Remember that `poisson.pmf(k, mu)` returns the probability mass function of a Poisson distribution. So, we simply need to create an array for the possible values of the number of cars, `k`, calculate the probability mass function at all of the possible values of the number of cars using `poisson.pmf(k, mu)`, and then plot them.*
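
To illustrate the general pattern (create the values, evaluate the PMF, then plot), here is a minimal sketch for a different, hypothetical distribution: a Binomial with `n = 10` and `p = 0.3`, using `binom.pmf` from `scipy.stats`. The parameters and labels are assumptions chosen for illustration, not part of the cars problem.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom

# Hypothetical example: Binomial with n = 10 trials and p = 0.3
n, p = 10, 0.3

# Possible values 0, 1, ..., 10 (stop is excluded, so use n + 1)
values = np.arange(0, n + 1)

# PMF evaluated at every possible value
probs = binom.pmf(values, n, p)

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(5, 3))

# x-axis values first, then y-axis values, then the style options
(line,) = ax.plot(values, probs, linestyle=':', marker='o')

ax.set_xlabel('Number of successes')
ax.set_ylabel('Probability')

plt.tight_layout()
plt.show()
```

The sketch calls `ax.plot(...)` on the axes object; `plt.plot(...)` takes the same arguments and draws on the current axes.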

In [None]:
# ANSWER CELL

# Do not modify this line for grading purposes
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Edit the code below to plot the PMF of the number of cars (only edit where you have ...)

# Create an array for the possible values of the random variable
# Recall that np.arange(start, stop, step) returns values within the half-open interval [start, stop),
# with spacing between the values given by step: [start, start+step, start+2*step, ..., stop-step].
k = ...

# Define the input parameter based on the documentation of scipy.stats.poisson and any given input
mu = ...

# create figure and axes
fig_1, ax_1 = plt.subplots(nrows=1, ncols=1, figsize=(5,3))

# Plot the PMF using plt.plot(). You give it first the x-axis values, then the y-axis values. 
# You can also control other properties of the plot (color, line style, etc.) 
# specify linestyle = ':'
# specify marker = 'o'
...

# Label the axes and change any other properties
...

# Display the plot
plt.tight_layout()
plt.show()

In [None]:
grader.check("q2.1")

### Cumulative Distribution Function

We can also use `poisson.cdf(k, mu)` to calculate and plot the cumulative distribution function of a Poisson distribution in Python. Similar to `poisson.pmf(k, mu)`, `k` can be a single value or an array of values.
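
As a small sketch of the array case, the code below evaluates the CDF at several values for a Poisson distribution with `mu=1` and checks that it equals the running total of the PMF. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import poisson

# Values 0, 1, ..., 5
k = np.arange(0, 6)

pmf_values = poisson.pmf(k, mu=1)   # P(X = 0), ..., P(X = 5)
cdf_values = poisson.cdf(k, mu=1)   # P(X <= 0), ..., P(X <= 5)

# The CDF is the running total of the PMF
matches = np.allclose(cdf_values, np.cumsum(pmf_values))

print(cdf_values)
print(matches)
```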

> For example, if the average number of occurrences is 1 and you want to compute $P(X\leq 3)$ for a Poisson distribution, you can directly use `poisson.cdf(k=3, mu=1)`

<font color='red'>**Question 3.0.**</font> What is the probability that less than or equal to 100 cars will need to be redirected if the construction project takes 1.5 hours? Recall that the average rate is 1.2 cars per minute. Assign your answer to `q3_0`. Do not just manually type the numeric answer. Use Python expressions that return the desired answer and assign them to the variable. (0.5 pts)

In [None]:
# ANSWER CELL
q3_0 = ...
print(f'Probability of less than or equal to 100 cars passing is P(X <= 100) = {q3_0:.3f}')

In [None]:
grader.check("q3.0")

Next, you want to plot the full cumulative distribution function of the number of cars.

<font color='red'>**Question 3.1.**</font> Plot the CDF of the number of cars during a duration of 1.5 hours. Follow these steps: (0.75 pts)

1. Create an array named `k` for the possible number of cars starting at 0 and ending at 200 (inclusive): {0, 1, 2, ..., 199, 200}
2. Calculate the input parameter `mu` correctly based on the documentation of `scipy.stats.poisson`. Recall that the average rate is 1.2 cars per minute and we want the cumulative distribution function over a duration of 1.5 hours.
3. Plot the CDF using [`plt.plot()`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html). Click on the function to read its documentation. Set the line style of the plot to a dotted line  `linestyle = ':'`, the marker to a circle `marker = 'o'`, and the marker size to 3 `markersize=3`.
4. Set the x-axis label to 'Number of cars' and the y-axis label to 'Cumulative Probability'

In [None]:
# ANSWER CELL

# Do not modify this line for grading purposes
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Edit the code below to plot the CDF of the number of cars (only edit where you have ...)

# Create an array for the possible values of the random variable
# Recall that np.arange(start, stop, step) returns values within the half-open interval [start, stop),
# with spacing between the values given by step: [start, start+step, start+2*step, ..., stop-step].
k = ...

# Define the input parameter based on the documentation of scipy.stats.poisson and any given input
mu = ...

# create figure and axes
fig_2, ax_2 = plt.subplots(nrows=1, ncols=1, figsize=(5,3))

# Plot the CDF using plt.plot(). You give it first the x-axis values, then the y-axis values.
# You can also control other properties of the plot (color, line style, etc.) 
# specify linestyle = ':'
# specify marker = 'o'
# specify markersize=3
...

# Label the axes and change any other properties
...

# Display the plot
plt.tight_layout()
plt.show()

In [None]:
grader.check("q3.1")

### Probability Calculations

There are different ways to calculate probabilities of a Poisson distribution using `scipy.stats.poisson`.

<font color='red'>**Question 4.0.**</font> If `mu=1` (not the cars problem), which of the following are appropriate methods to calculate $P(X>3)$ for a Poisson distribution? For this example, assume 999 represents $\infty$. You can run the different functions below and check their output. Assign ALL that apply to the variable `q4_0`. (0.5 pts)

**Recall that `np.arange(start, stop, step)` returns values within the half-open interval [start, stop), with spacing between the values given by step: [start, start+step, start+2*step, ..., stop-step].**

**A.** `1 - poisson.cdf(3, mu=1)` \
**B.** `1 - poisson.cdf(4, mu=1)` \
**C.** `sum(poisson.pmf(np.arange(3, 1000, 1), mu=1))` \
**D.** `sum(poisson.pmf(np.arange(4, 1000, 1), mu=1))` \
**E.** `1 - sum(poisson.pmf(np.arange(0, 2, 1), mu=1))` \
**F.** `1 - sum(poisson.pmf(np.arange(0, 3, 1), mu=1))` \
**G.** `1 - sum(poisson.pmf(np.arange(0, 4, 1), mu=1))` 


Answer in the next cell. Add each selected choice as a string and separate the choices with commas. For example, if you want to select `"A"` and `"B"`, your answer should be `"A", "B"`.\
Assign your answer to the given variable.
Remember to put quotes around each answer choice.

In [None]:
# Try any code here
...

In [None]:
# ANSWER CELL
q4_0 = ...
q4_0

In [None]:
grader.check("q4.0")

Now back to the cars problem. The project manager is worried that if there are more than 130 cars to be redirected, a large back up will develop on the highway, which will cause delays and might cause traffic accidents. 

<font color='red'>**Question 4.1.**</font> What is the probability that more than 130 cars will need to be redirected if the construction project takes 1.5 hours? Recall that the average rate is 1.2 cars per minute. If needed, assume 999 represents $\infty$. Assign your answer to `q4_1`. Do not just manually type the numeric answer. Use Python expressions that return the desired answer and assign them to the variable. (0.5 pts)

In [None]:
# ANSWER CELL
q4_1 = ...
print(f'Probability of having more than 130 cars is P(X > 130) = {q4_1:.3f}')

In [None]:
grader.check("q4.1")

### Time Between Car Arrivals

Two CalTrans employees observing the construction being done are watching the cars passing and wonder when the next car will arrive. There are no cars in the lane when they begin their timer. Recall that we modeled the number of cars arriving as a Poisson distribution.

<font color='red'>**Question 5.0.**</font> Let random variable $Y$ represent the time between successive cars. What distribution is appropriate for modeling $Y$? Assign your answer to the variable `q5_0` as a string. (0.25 pts)

**A.** Lognormal \
**B.** Binomial \
**C.** Hypergeometric \
**D.** Exponential \
**E.** Normal \
**F.** Negative Binomial \
**G.** Geometric \
**H.** None of the options

Your answer should be a string, e.g., `"A"`, `"B"`, etc.\
Remember to put quotes around your answer choice.

In [None]:
# ANSWER CELL
q5_0 = ...
q5_0

In [None]:
grader.check("q5_0")

We previously saw how to use `scipy.stats.poisson`. There are similar functions for all the common distributions that we have discussed, both discrete and continuous. We saw a table of common discrete distributions in Python in Lab 05. 

The table below lists the common continuous distributions we have discussed in class and their corresponding Python functions and their input parameters. You can click on the different Python functions to read their documentation. **Note that the input parameters might use different symbols or be different from the ones we defined in the lecture.** So it's important you read the corresponding documentation carefully.


| Distribution  | Python function | Relation to Parameters from Lecture |
|:--------------|:----------------|:------------------------------------|
| Exponential   | [`expon(scale)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.expon.html) | `scale=` $1/ \lambda$ |
| Gamma         | [`gamma(a, scale)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html) | `a=` $r$, `scale=` $1/ \lambda$ |
| Uniform         | [`uniform(loc, scale)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.uniform.html) | `loc=` $a$, `scale=` $b-a$ |
| Normal        | [`norm(loc, scale)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html) | `loc=` $\mu$, `scale=` $\sigma$ |
| Lognormal     | [`lognorm(s, scale)`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html) |  `s=` $\zeta$, `scale=` $e^\lambda$ |
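
The mapping in the table can be checked against known results for each distribution. The sketch below uses made-up lecture-style parameters (all values and variable names are hypothetical) and compares each Python result with the corresponding formula.

```python
import numpy as np
from scipy.stats import expon, gamma, uniform, norm, lognorm

# Hypothetical lecture-style parameters (illustration only)
lam = 2.0                  # rate for the exponential and gamma
r = 3                      # gamma shape parameter
a, b = 1.0, 5.0            # uniform lower and upper limits
mu, sigma = 10.0, 2.0      # normal mean and standard deviation
lam_ln, zeta = 0.5, 0.25   # lognormal parameters (of ln X)

exp_mean = expon.mean(scale=1 / lam)                        # expected: 1/lam
gam_mean = gamma.mean(a=r, scale=1 / lam)                   # expected: r/lam
uni_mean = uniform.mean(loc=a, scale=b - a)                 # expected: (a + b)/2
nor_mean = norm.mean(loc=mu, scale=sigma)                   # expected: mu
logn_med = lognorm.median(s=zeta, scale=np.exp(lam_ln))     # expected: exp(lam_ln)

print(exp_mean, gam_mean, uni_mean, nor_mean, logn_med)
```

Note that `scale` means something different for each distribution, which is why the table matters more than the argument name.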

All these distributions have different methods that can be used for different purposes. We provide examples of the most common methods based on the exponential distribution with `scale=1`, but these methods can be used with any of the distributions from the table above. You will just need to adjust the function names and input parameters accordingly for other distributions.
* `expon.pdf(x, scale=1)`: Get the probability density function at $x: f(x)$
* `expon.cdf(x, scale=1)`: Get the cumulative distribution function at $x: F(x)$
* `expon.mean(scale=1)`: Get the expected value of the random variable
* `expon.median(scale=1)`: Get the median of the random variable
* `expon.var(scale=1)`: Get the variance of the random variable
* `expon.std(scale=1)`: Get the standard deviation of the random variable
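
As a sketch of these methods in use, the code below evaluates each one for an exponential distribution with a hypothetical `scale=2` (that is, $\lambda = 0.5$ in the lecture notation). The comments give the closed-form value each call should reproduce.

```python
import numpy as np
from scipy.stats import expon

scale = 2.0   # hypothetical value: scale = 1/lambda with lambda = 0.5
x = 1.0

pdf_at_x = expon.pdf(x, scale=scale)   # lambda * exp(-lambda * x)
cdf_at_x = expon.cdf(x, scale=scale)   # 1 - exp(-lambda * x)
mean = expon.mean(scale=scale)         # 1/lambda
median = expon.median(scale=scale)     # ln(2)/lambda
var = expon.var(scale=scale)           # 1/lambda^2
std = expon.std(scale=scale)           # 1/lambda

print(pdf_at_x, cdf_at_x, mean, median, var, std)
```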

### Probability Density Function

We defined the random variable $Y$ as the time between successive cars. Recall that the average rate of car arrivals is 1.2 cars per minute.

<font color='red'>**Question 6.0.**</font> Based on the distribution you selected for $Y$, what is the **density** for the time between successive arrivals of 2 minutes? Assign your answer to `q6_0`. Do not just manually type the numeric answer. Use Python expressions that return the desired answer and assign them to the variable. (0.5 pts)

*Hint: `expon.pdf(x, scale)` returns the probability density function of an Exponential distribution at $x$. Use the name of the distribution you selected for the random variable and its corresponding parameters based on the table above.*

In [None]:
# ANSWER CELL
q6_0 = ...
print(f'Density of 2 mins is f(Y=2) = {q6_0:.3f}')

In [None]:
grader.check("q6.0")

Next, you want to plot the full probability density function of the time between successive cars.

<font color='red'>**Question 6.1.**</font> Plot the PDF of the time between successive cars during construction. Follow these steps: (1.0 pt)

1. Create an array named `x` for the possible values of the time between successive cars starting at 0 and ending at 10 mins (inclusive) with a time step of 0.01 min: {0, 0.01, 0.02, ..., 9.99, 10}
2. Calculate the input parameter `param` correctly based on the documentation of the distribution you selected. This parameter could be `scale` or any other parameter needed for the distribution you selected. If you need any other parameters, feel free to define them. Recall that the average rate is 1.2 cars per minute.
3. Plot the PDF using [`plt.plot()`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html). Click on the function to read its documentation. Keep the line style as a solid line. This is a continuous distribution.
4. Set the x-axis label to 'Time (min)' and the y-axis label to 'Density'
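
Because `np.arange` excludes `stop`, an inclusive grid needs a `stop` value slightly past the last point. The sketch below shows this on a small hypothetical grid (0 to 2 with a step of 0.5), together with `np.linspace` as an alternative that takes the number of points instead of a step.

```python
import numpy as np

# Hypothetical grid: 0 to 2 inclusive with a step of 0.5
step = 0.5

# stop is excluded, so go one step past the last value we want
grid = np.arange(0, 2 + step, step)

# Alternative: specify the number of points; both endpoints are included
same_grid = np.linspace(0, 2, 5)

print(grid)       # values 0, 0.5, 1, 1.5, 2
print(same_grid)
```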

In [None]:
# ANSWER CELL

# Do not modify this line for grading purposes
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Edit the code below to plot the PDF of the time between successive cars (only edit where you have ...)

# Create an array for the possible values of the random variable
# Recall that np.arange(start, stop, step) returns values within the half-open interval [start, stop),
# with spacing between the values given by step: [start, start+step, start+2*step, ..., stop-step].
x = ...

# Define the input parameter based on the documentation of the distribution you selected
param = ...

# create figure and axes
fig_3, ax_3 = plt.subplots(nrows=1, ncols=1, figsize=(5,3))

# Plot the PDF using plt.plot(). You give it first the x-axis values, then the y-axis values.
...

# Label the axes and change any other properties
...

# Display the plot
plt.tight_layout()
plt.show()

In [None]:
grader.check("q6.1")

### Probability Calculations

We can calculate probabilities of common continuous distributions directly in Python. However, not all the methods that could be used to calculate probabilities of discrete distributions (refer to Question 4.0) are appropriate in this case. Why? Unlike discrete distributions for which we can use `pmf` to get probabilities, continuous distributions are represented by density, `pdf`, which is not the same as probability. However, we can still use some of the same methods. Mainly, the CDF is similar for both discrete and continuous distributions and provides the probability that the random variable is less than or equal to a certain value.
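
To make the difference between density and probability concrete, here is a minimal sketch for an exponential distribution with a hypothetical `scale=0.25` (that is, $\lambda = 4$; not the cars problem). It shows a density value larger than 1 and a probability over an interval obtained from the CDF.

```python
from scipy.stats import expon

scale = 0.25   # hypothetical value: scale = 1/lambda with lambda = 4

# A density is not a probability: here f(0) = lambda = 4, which exceeds 1
density_at_0 = expon.pdf(0, scale=scale)

# Probabilities come from the CDF: P(0.1 < Y <= 0.5) = F(0.5) - F(0.1)
p_interval = expon.cdf(0.5, scale=scale) - expon.cdf(0.1, scale=scale)

print(density_at_0)
print(p_interval)
```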

The two CalTrans employees observing the construction being done started a timer to record the time until the next car arrives. There are no cars in the lane when they started their timer. Recall that the average rate is 1.2 cars per minute.

<font color='red'>**Question 7.0.**</font> What is the probability that the next car will arrive **exactly** 1 minute after starting the timer? Assign your answer to `q7_0`. (0.5 pts)

In [None]:
# ANSWER CELL
q7_0 = ...
print(f'Probability that the next car will arrive exactly after 1 min is P(Y=1) = {q7_0:.3f}')

In [None]:
grader.check("q7.0")

<font color='red'>**Question 7.1.**</font> What is the probability that the next car will arrive **more than** 2 minutes after starting the timer? Assign your answer to `q7_1`. Do not just manually type the numeric answer. Use Python expressions that return the desired answer and assign them to the variable. (0.5 pts)

In [None]:
# ANSWER CELL
q7_1 = ...
print(f'Probability that the next car will arrive in more than 2 mins is P(Y>2) = {q7_1:.3f}')

In [None]:
grader.check("q7_1")

### What if No Cars Arrive Within 5 Minutes?

<font color='red'>**Question 8.0.**</font> After 5 minutes from starting the timer, no cars have passed by yet. One of the CalTrans employees (who went to Stanford FYI) tells the other that because they have not seen any cars for 5 minutes, there is a higher chance than usual of a car passing by in the next 30 seconds. Is the employee right? Assign your answer to the variable `q8_0` as a string. (0.5 pts)

**A.** Yes because the distribution for the time between cars is memoryless \
**B.** Yes because a car hasn't passed by for some time \
**C.** No because the distribution for the time between cars is memoryless \
**D.** No because 5 minutes is not too long \
**E.** We can't tell based on the given information

Your answer should be a string, e.g., `"A"`, `"B"`, etc.\
Remember to put quotes around your answer choice.

In [None]:
# ANSWER CELL
q8_0 = ...
q8_0

In [None]:
grader.check("q8.0")

### Time Until Fifth Car Arrival

<font color='red'>**Question 8.1.**</font> Several cars happened to arrive shortly after one another. What distribution is appropriate for modeling the time until the fifth car arrives? Assign your answer to the variable `q8_1` as a string. (0.5 pts)

**A.** Gamma \
**B.** Binomial \
**C.** Hypergeometric \
**D.** Exponential \
**E.** Normal \
**F.** Negative Binomial \
**G.** Geometric \
**H.** None of the options


Your answer should be a string, e.g., `"A"`, `"B"`, etc.\
Remember to put quotes around your answer choice.

In [None]:
# ANSWER CELL
q8_1 = ...
q8_1

In [None]:
grader.check("q8_1")

<font color='red'>**Question 8.2.**</font> What is the **median** time until the fifth car arrives? Assign your answer to `q8_2`. Do not just manually type the numeric answer. Use Python expressions that return the desired answer and assign them to the variable. (0.5 pts)

All common distributions have different methods that can be used for different purposes. Below are examples of the most common methods based on the exponential distribution with `scale=1`, but these methods can be used with any of the distributions from the table above. You will just need to adjust the function names and input parameters accordingly for other distributions.
* `expon.pdf(x, scale=1)`: Get the probability density function at $x: f(x)$
* `expon.cdf(x, scale=1)`: Get the cumulative distribution function at $x: F(x)$
* `expon.mean(scale=1)`: Get the expected value of the random variable
* `expon.median(scale=1)`: Get the median of the random variable
* `expon.var(scale=1)`: Get the variance of the random variable
* `expon.std(scale=1)`: Get the standard deviation of the random variable

**Note that the input parameters might use different symbols or be different from the ones we defined in the lecture.** So it's important you read the documentation in the table above carefully.

In [None]:
# ANSWER CELL
q8_2 = ...
print(f'The median time until the fifth car arrives is {q8_2:.3f} mins')

In [None]:
grader.check("q8.2")

### Weather and Construction Disruptions

Lastly, CalTrans is concerned about having significant rainfall during construction, which could significantly delay the construction work. Based on historical measurements over several years, the daily rainfall is assumed to have a normal distribution with a mean of 0.5 inch and a standard deviation of 0.2 inch.

<font color='red'>**Question 9.0.**</font> Plot the PDF of the daily rainfall. Follow these steps: (0.50 pts)

1. Create an array named `r` for the possible values of the daily rainfall starting at 0 and ending at 1.5 inch (inclusive) with a step of 0.01 inch: {0, 0.01, 0.02, ..., 1.49, 1.50}
2. Plot the PDF using [`plt.plot()`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html). Click on the function to read its documentation. Keep the line style as a solid line. This is a continuous distribution.
3. Set the x-axis label to 'Rainfall (inch)' and the y-axis label to 'Density'

In [None]:
# ANSWER CELL

# Do not modify this line for grading purposes
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Edit the code below to plot the PDF of the daily rainfall (only edit where you have ...)

# Create an array for the possible values of the random variable
# Recall that np.arange(start, stop, step) returns values within the half-open interval [start, stop),
# with spacing between the values given by step: [start, start+step, start+2*step, ..., stop-step].
r = ...

# create figure and axes
fig_4, ax_4 = plt.subplots(nrows=1, ncols=1, figsize=(5,3))

# Plot the PDF using plt.plot(). You give it first the x-axis values, then the y-axis values.
...

# Label the axes and change any other properties
...

# Display the plot
plt.tight_layout()
plt.show()

In [None]:
grader.check("q9.0")

<font color='red'>**Question 9.1.**</font> What is the probability that on the construction day, the daily rainfall will be less than 0.2 inch (which would be considered light intensity)? Assign your answer to `q9_1`. Do not just manually type the numeric answer. Use Python expressions that return the desired answer and assign them to the variable. (0.25 pts)

In [None]:
# ANSWER CELL
q9_1 = ...
print(f'The probability that there will be light rainfall is {q9_1:.3f}')

In [None]:
grader.check("q9.1")

<font color='red'>**Question 9.2.**</font> What is the probability that on the construction day, the daily rainfall will be between 0.2 inch and 0.8 inch (which would be considered moderate intensity)? Assign your answer to `q9_2`. Do not just manually type the numeric answer. Use Python expressions that return the desired answer and assign them to the variable. (0.5 pts)

In [None]:
# ANSWER CELL
q9_2 = ...
print(f'The probability that there will be moderate rainfall is {q9_2:.3f}')

In [None]:
grader.check("q9.2")

<font color='red'>**Question 9.3.**</font> Based on the PDF of the daily rainfall and properties of common distributions, which distribution would be more appropriate to use in this case? Assign your answer to the variable `q9_3` as a string. (0.25 pts)

**A.** Normal because it is the most common distribution \
**B.** Lognormal because rainfall can't take negative values and lognormal is defined for positive values only \
**C.** Exponential because the occurrences are Poisson \
**D.** Uniform because we don't have any information about rainfall \
**E.** None of the options

Your answer should be a string, e.g., `"A"`, `"B"`, etc.\
Remember to put quotes around your answer choice.

In [None]:
# ANSWER CELL
q9_3 = ...
q9_3

In [None]:
grader.check("q9.3")

### You're done with this Lab!

**Important submission information:** After completing the assignment, click on the Save icon from the Tool Bar &nbsp;<i class="fa fa-save" style="font-size:16px;"></i>&nbsp;. After saving your notebook, **run the cell with** `grader.check_all()` and confirm that you pass the same tests as in the notebook. Then, **run the final cell** `grader.export()` and click the link to download the zip file. Finally, go to Gradescope and submit the zip file to the corresponding assignment. 

**Once you have submitted, stay on the Gradescope page to confirm that you pass the same tests as in the notebook.**

In [None]:
%matplotlib inline
img = mpimg.imread('resources/animal.jpeg')
imgplot = plt.imshow(img)
imgplot.axes.get_xaxis().set_visible(False)
imgplot.axes.get_yaxis().set_visible(False)
print("Congratulations on finishing this lab!")
plt.show()

---

To double-check your work, the cell below will rerun all of the autograder tests.

In [None]:
grader.check_all()

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

Make sure you submit the .zip file to Gradescope.

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False)