# Week Eight: Distributions

Your solutions must do two things: produce the requested answer when the worksheet is run with the given dataset, and also pass when run against different datasets that are not provided. Those hidden datasets often include unusual inputs, so try to make your code handle any input dataset it could reasonably receive.

To be graded, your notebook must be runnable start to finish. If you can't make an in-notebook test pass, comment it out to attempt to get partial credit. Replace the `...` markers with your code. Do not change the names of the pre-defined variables and functions.

Plots should have the required elements of a plot: axis labels, units where applicable, and a legend if more than one marker or line type is present. Titles are not required.

## Problem 1: Generating a distribution

Generate the following distribution two ways. The PDF is:

$$
P(x) = 1 - x^2
$$

for $x$ from -1 to 1. As written, this PDF is not normalized; its integral over that range is 4/3.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import cauchy, expon
from scipy.optimize import minimize

In [None]:
x = np.linspace(-1,1,100)
y = 1 - x**2
plt.plot(x,y)
plt.show()

### 1.1: Method 1

Use the rejection method to generate the distribution. `N` is the maximum number of samples to generate (your function can produce fewer).
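
As a minimal sketch of the rejection method, the example below samples a different, purely illustrative shape, $1 - |x|$ on $[-1, 1]$, rather than the PDF from this problem. The names `rejection_sample` and `triangle` are hypothetical and not part of the worksheet. It draws candidate points uniformly in a bounding box and keeps only those that fall under the curve, which is why the number of returned samples is at most the number of trials.

```python
import numpy as np


def rejection_sample(pdf, lo, hi, pdf_max, n_trials, rng=None):
    """Return samples of `pdf` on [lo, hi]; at most `n_trials` values come back."""
    rng = np.random.default_rng() if rng is None else rng
    # Candidate x positions, uniform over the allowed range.
    x = rng.uniform(lo, hi, size=n_trials)
    # Candidate heights, uniform between 0 and an upper bound on the PDF.
    u = rng.uniform(0.0, pdf_max, size=n_trials)
    # Keep only the points that land under the curve.
    return x[u < pdf(x)]


def triangle(x):
    """Illustrative unnormalized PDF (an assumption, not the worksheet's PDF)."""
    return 1.0 - np.abs(x)


toy_samples = rejection_sample(
    triangle, -1.0, 1.0, 1.0, 100_000, rng=np.random.default_rng(0)
)
```

The PDF does not need to be normalized for this to work; only an upper bound on its height over the range is required.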

In [None]:
def generate_dist_1(N):
    ...

In [None]:
vals = generate_dist_1(1_000_000)
plt.hist(vals, bins=np.linspace(-1,1,100), density=True)
x = np.linspace(-1,1,100)
y = (1 - x**2)/(4/3) # 4/3 = normalization factor
plt.plot(x,y)
plt.show()

### 1.2: Method 2

Use the inverse CDF method to generate the distribution. You can calculate the CDF fairly easily. Record your work to calculate the CDF in a markdown cell, in comments or a docstring, or do it with sympy. For the inverse CDF, use an approximation, such as interpolation (unless you can invert the function symbolically, which I did not have much luck with). Remember to normalize the CDF to 1. (If you can't do this method, try using the binned technique from class.)
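
One possible way to approximate an inverse CDF by interpolation is sketched below on a toy case, not on this problem's PDF. The function name `sample_by_interpolated_inverse_cdf` and the toy CDF $x^3$ on $[0, 1]$ are assumptions made for illustration. The idea is to tabulate the CDF on a grid, normalize it to run from 0 to 1, and then interpolate with the roles of the axes swapped.

```python
import numpy as np


def sample_by_interpolated_inverse_cdf(cdf, lo, hi, n, n_grid=2001, rng=None):
    """Draw `n` samples by inverting a tabulated, monotonically increasing CDF."""
    rng = np.random.default_rng() if rng is None else rng
    grid = np.linspace(lo, hi, n_grid)
    c = cdf(grid)
    # Normalize so the tabulated CDF runs from 0 at `lo` to 1 at `hi`.
    c = (c - c[0]) / (c[-1] - c[0])
    # Uniform random numbers play the role of CDF values.
    u = rng.uniform(0.0, 1.0, size=n)
    # Swap axes: look up x as a function of the CDF value.
    return np.interp(u, c, grid)


# Toy CDF (an assumption for illustration): x**3 on [0, 1], i.e. PDF 3*x**2.
toy_inverse_samples = sample_by_interpolated_inverse_cdf(
    lambda x: x**3, 0.0, 1.0, 100_000, rng=np.random.default_rng(1)
)
```

Unlike the rejection method, every uniform random number produces a sample, so exactly `n` values are returned.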

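Here the prime marks the unnormalized CDF, not a derivative.

$$
\textrm{CDF}'(a) = \int_{-1}^{a} f(x) \, dx
$$

$$
\textrm{CDF}(a) = \frac{\textrm{CDF}'(a)}{\textrm{CDF}'(1)}
$$

$$
\textrm{CDF}'(a) = \int_{-1}^{a} f(x) \, dx = x-\frac{x^{3}}{3} \biggr|_{x=-1}^{x=a}
$$

$$
\textrm{CDF}'(a) = a - \frac{a^3}{3} + \frac{2}{3}
$$

$$
\textrm{CDF}'(1) = \frac{4}{3}
$$

$$
\textrm{CDF}(a) = \frac{\textrm{CDF}'(a)}{\textrm{CDF}'(1)} = \frac{3}{4} a
                                 - \frac{1}{4} a^3
                                 + \frac{1}{2}
$$

Writing $y = \textrm{CDF}(a)$, the equation to invert for $a$ is:

$$
4 y = 3 a - a^3 + 2
$$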
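
The closed-form CDF above can be sanity-checked numerically. The sketch below compares it against a cumulative trapezoid integral of $1 - x^2$; the variable names are illustrative only.

```python
import numpy as np

a = np.linspace(-1.0, 1.0, 2001)
pdf_unnorm = 1.0 - a**2

# Cumulative trapezoid integral of the unnormalized PDF, starting from 0 at a = -1.
steps = 0.5 * (pdf_unnorm[1:] + pdf_unnorm[:-1]) * np.diff(a)
cdf_numeric = np.concatenate(([0.0], np.cumsum(steps)))
cdf_numeric /= cdf_numeric[-1]  # normalize so the CDF ends at 1

# Closed form from the derivation: CDF(a) = 3a/4 - a^3/4 + 1/2.
cdf_closed = 0.75 * a - 0.25 * a**3 + 0.5

max_diff = np.max(np.abs(cdf_numeric - cdf_closed))
print(max_diff)
```

A small `max_diff` indicates that the derivation and the normalization are consistent.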

In [None]:
def generate_dist_2(N):
    ...

In [None]:
data = generate_dist_2(1_000_000)
plt.hist(data, bins=np.linspace(-1,1,101), density=True)
x = np.linspace(-1,1,100)
y = (1 - x**2)/(4/3) # 4/3 = normalization factor
plt.plot(x,y)
plt.show()

## Problem 2: Unbinned fitting

Fit the following unbinned dataset with the sum of a Cauchy and an exponential distribution. You can implement this yourself using `cauchy` and `expon` from `scipy.stats`. The range is from 0 to 20. The only tricky part is normalizing the PDFs over that range, but since you have each CDF, it should be pretty easy.

The standard Cauchy PDF (location 0, scale 1) is:

$$
f(x) = \frac{1}{\pi (1 + x^2)}
$$
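
The normalization over a finite range mentioned above can be sketched as follows: divide the PDF by the difference of the CDF at the two ends of the range. The helper name `truncated_pdf` and the parameter values (`scale=5.0`, `loc=10.0`, `scale=1.0`) are assumptions chosen only for illustration; they are not the fitted values.

```python
import numpy as np
from scipy.stats import cauchy, expon


def truncated_pdf(dist, x, lo, hi):
    """PDF of a frozen scipy.stats distribution, renormalized to [lo, hi]."""
    x = np.asarray(x, dtype=float)
    # Probability mass that the untruncated distribution puts inside [lo, hi].
    norm = dist.cdf(hi) - dist.cdf(lo)
    inside = (x >= lo) & (x <= hi)
    return np.where(inside, dist.pdf(x) / norm, 0.0)


grid = np.linspace(0.0, 20.0, 20001)
for dist in (expon(scale=5.0), cauchy(loc=10.0, scale=1.0)):
    y = truncated_pdf(dist, grid, 0.0, 20.0)
    # Trapezoid-rule area under the truncated PDF; it should be close to 1.
    area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(grid))
    print(area)
```

Two PDFs that are each normalized this way can be mixed with a fraction between 0 and 1 and the result stays normalized.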

In [None]:
vals = np.loadtxt('week8prob2.csv')

In [None]:
plt.hist(vals, bins=np.linspace(0,20,50), density=True)
plt.show()

In [None]:
def f(params, x):
    ...

def nll_f(params, x):
    return ... # Can be one line

For the fit, try initial parameters like `[10, 1, 10, .1]`. You should probably use bounds, with 0 to 1 for the fraction. You can also constrain the location of the Cauchy a bit by eye from the plot above. I had the best luck with the `SLSQP` method.
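
To show the general shape of an unbinned fit with `minimize`, bounds, and `SLSQP`, here is a toy sketch that fits a single exponential to synthetic data. It is not the Cauchy plus exponential model of this problem, and the names `toy_nll`, `toy_data`, and `toy_res`, along with the starting value and bounds, are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import expon


def toy_nll(params, x):
    """Negative log-likelihood of an exponential with a free scale."""
    (scale,) = params
    return -np.sum(expon.logpdf(x, scale=scale))


# Synthetic data with a known scale of 2.0 (illustrative only).
rng = np.random.default_rng(0)
toy_data = rng.exponential(scale=2.0, size=5_000)

toy_res = minimize(
    toy_nll,
    x0=[1.0],             # initial guess for the scale
    args=(toy_data,),     # extra arguments passed on to toy_nll
    bounds=[(0.1, 10.0)], # keep the scale positive and in a sensible range
    method="SLSQP",
)
print(toy_res.x, toy_data.mean())
```

For an exponential, the maximum-likelihood scale is the sample mean, so the two printed numbers should be close. A fit with more parameters follows the same pattern, with one `(low, high)` pair in `bounds` per parameter.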

In [None]:
def fit_data(data):
    res = minimize(...)
    return res

In [None]:
res = fit_data(vals)
res

In [None]:
print(res.x)
plt.hist(vals, bins=np.linspace(0,20,50), density=True)
x = np.linspace(0,20,100)
y = f(res.x, x)
plt.plot(x, y, color='C1')
plt.show()

## Problem 3: Error catching

Call `f`, and either return its output value(s), or return the string form of the exception it throws if that exception is a `MessageException`. Don't do anything special for any other kind of exception; just let it error out as normal.


> Note: you will return `str(e)`, which is `"Print me"`.

In [None]:
class MessageException(Exception):
    pass

def throw_error():
    raise MessageException("Print me")

In [None]:
def return_result_or_msg(f):
    try:
        return f()
    except MessageException as e:
        return str(e)

In [None]:
assert return_result_or_msg(throw_error) == 'Print me'
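
The other two behaviors described above can be checked in a similar way. In the sketch below, the definitions are repeated so that the block runs on its own, and the helpers `return_value` and `raise_value_error` are hypothetical names introduced only for this check.

```python
class MessageException(Exception):
    pass


def return_result_or_msg(f):
    try:
        return f()
    except MessageException as e:
        return str(e)


def return_value():
    # A function that succeeds: its result should pass straight through.
    return 42


def raise_value_error():
    # A different exception type: it should not be caught.
    raise ValueError("not caught")


assert return_result_or_msg(return_value) == 42

try:
    return_result_or_msg(raise_value_error)
except ValueError as e:
    propagated = str(e)
else:
    propagated = None

assert propagated == "not caught"
```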