In [1]:
import numpy as np
from scipy import stats

## Homework 11 ##

### 1. Working with Densities ###
Let $X$ have density given by

$$
f(x) ~ = ~ 
\begin{cases}
c(1-x^2) ~~~ -1 < x < 1 \\
0 ~~~~~~~~~~~~~~ \text{ otherwise }
\end{cases}
$$

Here $c$ is a constant. 

**a)** Sketch (by hand) a graph of the density. It's a good idea to start by finding the values of $f$ at $x = \pm 1$, $0$, and $\pm 0.5$.

Now find each of the following, and use a code cell if necessary to provide the numerical values as well.

**b)** $c$

**c)** the cdf of $X$

**d)** $P(\vert X \vert > 0.5)$

**e)** $E(X)$

**f)** $SD(X)$
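
Hand-derived answers to parts **b** through **f** can be sanity-checked numerically. The sketch below is one possible check, not a required method: it approximates the integrals with a midpoint rule on a fine grid. The helper `shape`, the grid size `n`, and all variable names are assumptions introduced for illustration.

```python
import numpy as np

def shape(x):
    """Unnormalized density: 1 - x^2 on (-1, 1), 0 elsewhere."""
    return np.where(np.abs(x) < 1, 1 - x**2, 0.0)

# Midpoint rule on a fine grid over (-1, 1); n is an arbitrary choice.
n = 200_000
edges = np.linspace(-1, 1, n + 1)
mid = (edges[:-1] + edges[1:]) / 2
dx = edges[1] - edges[0]

area = np.sum(shape(mid)) * dx      # integral of the unnormalized shape
c_numeric = 1 / area                # constant that makes the total area 1
f = c_numeric * shape(mid)          # density values at the grid midpoints

mean_numeric = np.sum(mid * f) * dx
sd_numeric = np.sqrt(np.sum((mid - mean_numeric) ** 2 * f) * dx)
tail_numeric = np.sum(f[np.abs(mid) > 0.5]) * dx   # P(|X| > 0.5)

print(c_numeric, mean_numeric, sd_numeric, tail_numeric)
```

If the printed values disagree with your exact answers beyond the fourth or fifth decimal place, recheck the corresponding integral.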

In [2]:
### Code Cell for Problem 1





### 2. Screen-Saver ###

A laptop screen-saver displays colored discs of random radius. Let $R_i$ be the radius of the $i$th disc, and suppose $R_1, R_2, R_3 \ldots $ are i.i.d. with density given by

$$
f(r) ~ = ~ 
\begin{cases}
2r ~~~~~~~~ 0 < r < 1 \\
0 ~~~~~~~~~ \text{ otherwise }
\end{cases}
$$

What is the chance that at least one of the first 10 discs has area less than $\frac{\pi}{4}$?
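
A simulation can serve as a rough check on the exact answer. The sketch below assumes inverse-cdf sampling: since the cdf of each radius is $r^2$ on $(0, 1)$, the square root of a uniform $(0, 1)$ random number has density $2r$. The seed, the number of trials, and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
trials = 100_000                 # simulated runs of 10 discs each

# Inverse-cdf sampling: F(r) = r^2 on (0, 1), so sqrt(U) has density 2r.
radii = np.sqrt(rng.random((trials, 10)))
areas = np.pi * radii ** 2

# For each run, check whether any of the 10 discs has area below pi/4.
at_least_one_small = np.any(areas < np.pi / 4, axis=1)
estimate = at_least_one_small.mean()

print(estimate)
```

The estimate is random, so expect agreement with the exact probability only to about two decimal places.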

In [3]:
### Code Cell for Problem 2





### 3. Gauss Model for Measurement Error ###

Probabilistic models are often used to describe the processes that generate data. Necessarily, the models are simplifications of reality. Good modeling is a balance between reducing complexity and retaining the crucial features of the phenomenon being studied. 

["Measure twice, cut once"](https://en.wiktionary.org/wiki/measure_twice_and_cut_once) is a proverb that arises from the experience that repeated measurements lead to more accurate results. The *Gauss model* for measurement error is a set of assumptions about repeated measurements made using sophisticated measuring devices such as those that use [lasers](https://en.wikipedia.org/wiki/Lidar) to measure distance. 

In essence, the model says that each measurement is the true value plus a random error, and specifies some assumptions about the errors.

Let $X_1, X_2, X_3, \ldots$ be repeated measurements on the same quantity. The Gauss model says that the quantity has a true value that is an unknown constant $\mu$, and the $i$th measurement is

$$
X_i ~ = ~ \mu + \epsilon_i
$$

where $\epsilon_1, \epsilon_2, \epsilon_3 \ldots $ are i.i.d. random errors with expectation 0. It's traditional in statistics to use the Greek letter $\epsilon$ to represent a random error.

You can think of the observation $X_i$ as the sum of the *signal* $\mu$ and the *noise* $\epsilon_i$. Your job as a data scientist is to extract the signal from this sum.

Suppose that measurements on a distance $\mu$ meters follow the Gauss model and that the distribution of each $\epsilon_i$ is uniform on the interval $(-5, 5)$ centimeters.

**a)** What is the chance that a single measurement is strictly within 1 centimeter of the true distance $\mu$ meters?

**b)** Approximately what is the chance that the average of 100 measurements is strictly within 1 centimeter of the true distance $\mu$ meters?
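
The effect of averaging can be seen in a simulation of the errors alone, since $\mu$ cancels when a measurement is compared with the true distance. The sketch below is a rough check under the stated uniform model, not a substitute for the calculation. The seed, the number of repetitions, and the variable names are assumptions; the SD formula $(b-a)/\sqrt{12}$ for the uniform $(a, b)$ distribution is a standard fact.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
reps = 20_000                    # number of simulated experiments (assumed)
n = 100                          # measurements per experiment

# Each row holds the 100 errors (in cm) of one experiment.
errors = rng.uniform(-5, 5, size=(reps, n))

# Part a: a single error is within 1 cm of zero.
single_within_1 = np.mean(np.abs(errors[:, 0]) < 1)

# Part b: the average of the 100 errors is within 1 cm of zero.
averages = errors.mean(axis=1)
average_within_1 = np.mean(np.abs(averages) < 1)

# SD of one uniform (-5, 5) error, and of the average of n of them.
sd_single = 10 / np.sqrt(12)
sd_average = sd_single / np.sqrt(n)

print(single_within_1, average_within_1, averages.std(), sd_average)
```

Comparing `averages.std()` with `sd_average` shows how much the noise shrinks when 100 measurements are averaged.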

In [4]:
### Code Cell for Problem 3





### 4. California Earthquakes ###

California is prone to earthquakes, and geostatisticians use probabilistic models to try to understand them. In this exercise, "earthquake" will mean an earthquake of magnitude 4.9 or greater on the Richter scale. The figure below shows the distribution of the gaps in time (measured in days) between earthquakes in California during the years 1857 to 2014. The red curve is an exponential density.

![earthquake gaps](hw11_earthquakes.png)

You can see why it is tempting to use an exponential model for the gaps. You can also see that it's an oversimplification. But it's a start. 

Suppose the time between two earthquakes has the exponential distribution with mean 1100 days. 

Use `np.e` for $e$ and `np.log` for the $\log$ function when you compute numerical values below.

**a)** Find the chance that an earthquake happens within a year after the previous one. Ignore leap years; just use 365 for the number of days in a year. You are using a simplified model for data, so your answers will be rough anyway.

**b)** Fill in the blank with a number:

There is a 50% chance that an earthquake happens less than $\underline{~~~~~~~~~~~~~~~}$ days after the previous one.

**c)** Let $T$ be the time between two earthquakes, measured in days. Now let $X$ be $T$ measured in years, using 365 days as the length of a year. Write the relation between $T$ and $X$, and then find $P(X \le x \text{ years})$ for $x > 0$.

**d)** Use Part **c** to identify the distribution of $X$. Provide its name, the parameter or parameters, and the expectation.
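
For the numerical parts, one possible layout of the computation is sketched below. It assumes the standard exponential cdf $1 - e^{-\lambda t}$ with rate $\lambda$ equal to 1 over the mean, and uses `np.e` and `np.log` as requested above. The helper names `expon_cdf` and `expon_quantile` are hypothetical, introduced only for this illustration.

```python
import numpy as np

mean_gap = 1100          # mean time between earthquakes, in days
rate = 1 / mean_gap      # exponential rate, per day

def expon_cdf(t, rate):
    """P(T <= t) for an exponential time T with the given rate."""
    return 1 - np.e ** (-rate * t)

def expon_quantile(p, rate):
    """The time t for which P(T <= t) = p, found by inverting the cdf."""
    return -np.log(1 - p) / rate

p_within_year = expon_cdf(365, rate)       # Part a
median_gap = expon_quantile(0.5, rate)     # Part b

print(p_within_year, median_gap)
```

A quick consistency check is that `expon_cdf(median_gap, rate)` returns 0.5 up to rounding.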

In [None]:
### Code Cell for Problem 4





### 5. Minimum of Independent Exponentials ###

A device has two electronic components. Let $T_1$ be the lifetime of Component 1, and suppose $T_1$ has the exponential distribution with mean 5 years. Let $T_2$ be the lifetime of Component 2, and suppose $T_2$ has the exponential distribution with mean 4 years. 

Suppose $T_1$ and $T_2$ are independent of each other, and let $M = \min(T_1, T_2)$ be the minimum of the two lifetimes. In other words, $M$ is the first time one of the two components dies.

**a)** For each $t > 0$, find $P(M > t)$.

[Hint: If the minimum has to be bigger than $t$, what does that tell you about each of the lifetimes?]

**b)** Use Part **a** to identify the distribution of $M$. Provide its name and parameter (or parameters, if there is more than one).

**c)** Find the numerical value of $E(M)$.
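
A simulation offers a rough check on parts **a** and **c**. The sketch below draws independent lifetimes with the two stated means and compares the empirical survival probability of the minimum with the product of the two individual survival probabilities, which is what independence implies. The seed, the sample size, and the test time `t` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
size = 200_000                   # number of simulated devices (assumed)

# NumPy's exponential sampler takes the mean as its `scale` argument.
t1 = rng.exponential(scale=5, size=size)   # Component 1 lifetimes, years
t2 = rng.exponential(scale=4, size=size)   # Component 2 lifetimes, years
m = np.minimum(t1, t2)                     # time of the first failure

t = 2.0                                    # an arbitrary test time, years
empirical_survival = np.mean(m > t)
product_of_survivals = np.exp(-t / 5) * np.exp(-t / 4)

print(empirical_survival, product_of_survivals, m.mean())
```

Changing `t` and rerunning should keep the two survival values close, and `m.mean()` should be close to your answer to part **c**.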

In [None]:
### Code Cell for Problem 5





## Submission Instructions     
Please follow the directions below to properly submit your homework.

* Scan all pages of **your work** into a PDF. You can use any scanner or a phone. Please **DO NOT** simply take pictures using your phone. 
* Please start a new page for each question. If you have already written multiple questions on the same page, you can crop the image or fold your page over (the old-fashioned way). This helps expedite grading.
* It is your responsibility to check that all the work on all the scanned pages is legible.


### Submitting
* Submit the assignment to Homework 11 on Gradescope. Use the entry code **MYBG2Z** if you haven't already joined the class.
* **Make sure to assign each page of your pdf to the correct question.**
