# Section 4.2 Single Model Results Evaluation

In [None]:
import os
import arviz as az
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Change working directory
if os.path.split(os.getcwd())[-1] != "notebooks":
    os.chdir(os.path.join(".."))

np.random.seed(0)

In [None]:
az.style.use('arviz-white')

## Activity: Estimating the effect of treatment on plant growth
Your statistician friend is really into plants. As a side hobby she decided to test three fertilizers: BudgetFertilizer, GreenPower, and RootsGalore.


![Plants](../../img/Plants.jpg)  

Her methodology was as follows:

1. Wait until plants were 10 inches tall
2. Treat with one of the three fertilizers
3. Measure the height of the plant each day until day 10

She was also nice enough to write down the mathematical model for you. Note that we add 10 to $\mu$ because she waited until each plant was 10 inches tall before starting the experiment. Because of this, we do not need to estimate the intercept: it is always 10 inches on Day 0.

$$
\begin{aligned}
\beta &\sim \mathcal{N}(0, 1) \\
\epsilon &\sim \operatorname{HalfCauchy}(1) \\
\mu &= \beta x + 10 \\
\text{height} &\sim \mathcal{N}(\mu, \epsilon)
\end{aligned}
$$

In this model, $x$ is the number of days since adding fertilizer (days since treatment), $\beta$ is the growth rate in inches per day, and $\epsilon$ is the scale of the noise around the expected height $\mu$.
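To make the model above concrete, here is a minimal sketch that draws one simulated growth curve from it using NumPy. It samples from the priors rather than from your friend's fitted posteriors, and it assumes measurements on days 0 through 10; the function name `simulate_heights` is hypothetical and only used for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate_heights(days, rng):
    """Draw one growth curve from the priors of the model above (illustrative only)."""
    beta = rng.normal(0.0, 1.0)           # growth rate, beta ~ N(0, 1)
    epsilon = abs(rng.standard_cauchy())  # noise scale, HalfCauchy(1) = |Cauchy(0, 1)|
    mu = beta * days + 10.0               # expected height, intercept fixed at 10 inches
    heights = rng.normal(mu, epsilon)     # one noisy measurement per day
    return beta, epsilon, mu, heights


# Assumption: one measurement per day, day 0 through day 10
days = np.arange(0, 11)
beta, epsilon, mu, heights = simulate_heights(days, rng)
print(f"beta={beta:.2f}, epsilon={epsilon:.2f}")
print(np.round(heights, 1))
```

Because the intercept is fixed, the expected height on day 0 is always 10 inches, whatever value of $\beta$ is drawn.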

She was able to perform the inference runs, and the results are stored in the files `GreenPower.nc`, `BudgetFertilizer.nc`, and `RootsGalore.nc`. She's been too busy to interpret the results and asked you to help by answering the following questions. Remember that your friend is a statistician, so she'll want to know the highest posterior density (HPD) interval for each estimate.

* Which fertilizer helps plants grow the best?
* Which fertilizer exhibits the least variability? Which exhibits the most variability?
* BudgetFertilizer is the cheapest, which is nice because she's on a grad student stipend. Should she use this fertilizer?
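As a reminder of what a highest posterior density interval is, the sketch below computes the narrowest interval containing a given fraction of a set of samples. It is a simplified illustration for unimodal samples, not ArviZ's implementation; the name `hpd_interval` is hypothetical, the 0.94 default is an assumption about what you would want to report, and the draws are synthetic rather than taken from the `.nc` files.

```python
import numpy as np


def hpd_interval(samples, prob=0.94):
    """Return the narrowest interval spanning roughly `prob` of the samples."""
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    n = len(sorted_samples)
    n_included = int(np.floor(prob * n))  # index offset between the two endpoints
    # Width of every candidate interval with that index offset
    widths = sorted_samples[n_included:] - sorted_samples[: n - n_included]
    start = int(np.argmin(widths))        # the narrowest candidate wins
    return sorted_samples[start], sorted_samples[start + n_included]


# Synthetic stand-in for posterior draws (not real results)
rng = np.random.default_rng(0)
draws = rng.normal(loc=0.0, scale=1.0, size=4000)
low, high = hpd_interval(draws)
print(f"94% HPD: [{low:.2f}, {high:.2f}]")
```

For skewed distributions such as the posterior of $\epsilon$, this interval generally differs from an equal-tailed interval because it shifts toward the region of highest density.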

Unfortunately she forgot to give you the raw data, but fortunately you just paid attention to Section 4.1, where we learned how to interpret posterior distributions using `az.plot_posterior`, `az.plot_forest`, and `az.summary`.

Can you help your friend out?

### Exercise 1: Plot the Posterior Estimations of all three fertilizers

#### Posterior Plots

In [None]:
greenpower = az.from_netcdf(os.path.join("inference_data", "GreenPower.nc"))
# Plot Posterior

In [None]:
budgetfertilizer = az.from_netcdf(os.path.join("inference_data", "BudgetFertilizer.nc"))
# Plot Posterior

In [None]:
rootsgalore = az.from_netcdf(os.path.join("inference_data", "RootsGalore.nc"))
# Plot Posterior

#### Forest Plots
**Hint:** Multiple inference runs can be plotted in the same forest plot. Take a look at the `model_names` argument.

In [None]:
# Plot Forest


#### Summary Tables
We can quickly compare numerically by using `az.summary`.

In [None]:
# Create Summary Tables, one for each fertilizer


The summary tables tell the same story as the plots above: GreenPower is the best fertilizer for growth, RootsGalore is the most consistent, and BudgetFertilizer is not a good fertilizer, as it seems to reduce plant height over time.

### Exercise 2: Give recommendations to your friend
* **Which fertilizer helps plants grow the best?**
Hint: The estimate of the slope $\beta$ tells us which fertilizer is the best, as it represents day over day growth rate.

* **Which fertilizer is the most consistent? Which is the least consistent?**
Hint: The estimated distribution $\epsilon$ indicates the fertilizer variability. 

* **BudgetFertilizer is the cheapest, which is nice because she's on a grad student stipend and spent all her money on planet observations. Should she use this fertilizer?**
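
One way to back up a recommendation about a fertilizer is to ask what fraction of the posterior draws of $\beta$ fall below zero, since a negative slope means the plants shrink over time. The sketch below assumes you have extracted the draws into a NumPy array; the helper `prob_negative` is hypothetical and the draws shown are synthetic placeholders, not results from the `.nc` files.

```python
import numpy as np


def prob_negative(beta_draws):
    """Fraction of posterior draws in which the growth rate is below zero."""
    beta_draws = np.asarray(beta_draws, dtype=float)
    return float(np.mean(beta_draws < 0))


# Synthetic placeholder draws; replace with draws from a real posterior
rng = np.random.default_rng(0)
fake_beta_draws = rng.normal(loc=-0.5, scale=0.2, size=2000)
print(f"P(beta < 0) = {prob_negative(fake_beta_draws):.2f}")
```

A value close to 1 would suggest the fertilizer is very likely hurting growth, while a value near 0.5 would mean the data cannot tell growth from shrinkage.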
