# lognormal distributions for cost-effectiveness analysis
1. "everything is lognormal".
2. for $X,Y$ independent, $E[XY] = E[X]E[Y]$ 
3. for $X,Y$ independent, $Var[X+Y] = Var[X] + Var[Y]$.
4. we can think of a lognormal distribution with parameters $\mu, \sigma^2$ as a variable centered around $e^{\mu}$ with a *multiplicative* spread of $e^{\sigma}$. I expect this is how most people implicitly think of lognormal distributions when they make estimations.
5. for a lognormally distributed $X$, $E[X] = e^{\mu + \sigma^2/2}$. In particular, this is different from the median of $X$ which is $e^{\mu}$ (its "center") 
6. generally, $E[\frac{1}{X}] \neq \frac{1}{E[X]}$
7. in particular, even for $X,Y$ independent, $E[\frac{X}{Y}] \neq \frac{E[X]}{E[Y]}$ generally
8. for $X\sim \text{Lognormal}(\mu_X, \sigma^2_X)$, its reciprocal is $\frac{1}{X}\sim \text{Lognormal}(-\mu_X, \sigma^2_X)$
9. for $Y\sim \text{Lognormal}(\mu_Y, \sigma^2_Y)$, $XY$ is lognormally distributed with parameters $\mu_X + \mu_Y$ and (if they are independent) $\sigma^2_X + \sigma^2_Y$
10. in many cases, cost-effectiveness estimates are of the form $\frac{\prod X_i}{\prod Y_j}$, where all variables are assumed to be independently lognormally distributed (maybe there are more multiplicative factors). By the reciprocal and product rules above, such an estimate is itself lognormal: its logarithm is a sum of independent normal variables, $\sum \log X_i - \sum \log Y_j$.
11. its expected value is $E[\frac{\prod X_i}{\prod Y_j}] = e^{\sum \mu_{X_i} - \sum \mu_{Y_j} + \frac{\sum \sigma^2_{X_i} + \sum \sigma^2_{Y_j}}{2}}$
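
The closed form for the expected value of the ratio can be checked numerically. The following is a minimal sketch in Python using only the standard library. The parameter values and the helper names `lognormal_mean` and `ratio_params` are made up for illustration and don't come from any real cost-effectiveness model.

```python
import math
import random
import statistics


def lognormal_mean(mu, sigma2):
    """Closed-form mean of Lognormal(mu, sigma2): exp(mu + sigma2 / 2)."""
    return math.exp(mu + sigma2 / 2)


def ratio_params(numerators, denominators):
    """(mu, sigma2) of prod(X_i) / prod(Y_j) for independent lognormal factors.

    Each factor is given as a (mu, sigma2) pair.
    """
    mu = sum(m for m, _ in numerators) - sum(m for m, _ in denominators)
    sigma2 = sum(s for _, s in numerators) + sum(s for _, s in denominators)
    return mu, sigma2


# Hypothetical (mu, sigma2) pairs, chosen only for illustration.
numerators = [(1.0, 0.25), (0.5, 0.09)]
denominators = [(2.0, 0.16)]

mu, sigma2 = ratio_params(numerators, denominators)
closed_form = lognormal_mean(mu, sigma2)

# The tempting shortcut: divide the product of means by the product of means.
naive = (math.prod(lognormal_mean(m, s) for m, s in numerators)
         / math.prod(lognormal_mean(m, s) for m, s in denominators))

# Monte Carlo check. random.lognormvariate takes sigma, not sigma squared.
rng = random.Random(0)


def sample_ratio():
    top = math.prod(rng.lognormvariate(m, math.sqrt(s)) for m, s in numerators)
    bottom = math.prod(rng.lognormvariate(m, math.sqrt(s)) for m, s in denominators)
    return top / bottom


samples = [sample_ratio() for _ in range(200_000)]
mc_mean = statistics.fmean(samples)
mc_median = statistics.median(samples)

print(f"closed-form mean: {closed_form:.4f}")
print(f"simulated mean:   {mc_mean:.4f}")
print(f"median exp(mu):   {math.exp(mu):.4f}")
print(f"simulated median: {mc_median:.4f}")
print(f"naive E[X]/E[Y]:  {naive:.4f}")
```

With these made-up parameters the ratio has $\mu = -0.5$ and $\sigma^2 = 0.5$, so its median is $e^{-0.5}$ while its mean is $e^{-0.25}$. The naive ratio of means is smaller than the true mean by a factor of $e^{\sigma^2_Y}$ (here $e^{0.16}$), because $E[\frac{1}{Y}] = e^{-\mu_Y + \sigma^2_Y/2}$ whereas $\frac{1}{E[Y]} = e^{-\mu_Y - \sigma^2_Y/2}$.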


## Everything is lognormal!(?)



# Estimation Uncertainty

When estimating a variable $X$, we have both "epistemic" and "statistical" (or "aleatoric") uncertainty about its value. Statistical uncertainty can be thought of as the inherent randomness of $X$ (e.g. the number of heads in 10 coin flips). Epistemic uncertainty is uncertainty about the value of $X$ due to our lack of knowledge about it (e.g. the number of people in the world who have ever lived).

For example, in a randomized controlled trial (RCT), we try to estimate the effect of a treatment on some outcome. That effect usually depends on many particular factors, such as the financial and cultural aspects of the population involved, the time of day the treatment was administered, the prevalence of a specific disease in a particular location, etc. With infinite knowledge, we could account for all of these factors and estimate the effect as a function of them. However, as this isn't possible, we can instead try to model the effect as a random variable with some *statistical* uncertainty.

If we conduct that RCT well, have a large sample size, and draw randomly from our target population, then we can find a good fit for the distribution of the effect, which can then be used to predict the effect of a future large-scale program. In this case, we have very little *epistemic* uncertainty about the effect, but the statistical uncertainty is still present and irreducible.

If we tried to apply the results of that RCT to a different population, then we would have to account for the epistemic uncertainty as well. For example, if we had conducted the RCT in a rural area of a developing country and wanted to apply the results to an urban area of that same country, then we would have to make some educated guesses about the effect of the treatment in the new population. This is epistemic uncertainty territory.
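
One way to picture the difference is a toy simulation. The sketch below assumes, purely for illustration, that the effect within a population is lognormal with log-scale parameters `MU` and `SIGMA`, and that moving to a new population shifts `MU` by an unknown amount drawn from a normal distribution with standard deviation `TAU`. All three values and both function names are hypothetical, and this is only one of many ways the two kinds of uncertainty could be modeled.

```python
import math
import random
import statistics

rng = random.Random(1)

# Hypothetical values, not taken from any real trial.
MU, SIGMA = 0.0, 0.3   # statistical uncertainty within the studied population
TAU = 0.4              # assumed epistemic uncertainty about the new population's mu


def draw_same_population():
    """Effect in the studied population: only statistical uncertainty."""
    return rng.lognormvariate(MU, SIGMA)


def draw_new_population():
    """Effect in a new population: uncertain mu, then statistical noise on top."""
    mu_new = rng.gauss(MU, TAU)               # epistemic layer
    return rng.lognormvariate(mu_new, SIGMA)  # statistical layer


n = 100_000
logs_same = [math.log(draw_same_population()) for _ in range(n)]
logs_new = [math.log(draw_new_population()) for _ in range(n)]

log_sd_same = statistics.stdev(logs_same)
log_sd_new = statistics.stdev(logs_new)

print(f"log-scale spread, same population: {log_sd_same:.3f}")
print(f"log-scale spread, new population:  {log_sd_new:.3f}")
```

Under these assumptions the prediction for the new population is still lognormal, but its log-scale spread grows from `SIGMA` to the square root of `SIGMA**2 + TAU**2` (here from 0.3 to 0.5). More data from the original trial would not shrink the `TAU` part; only knowledge about the new population would.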


## Example - Water Treatment Interventions
In 2022, GiveWell revised upwards their [assessment](https://www.givewell.org/international/technical/programs/water-quality-interventions) of the cost-effectiveness of chlorination interventions to improve water quality in sub-Saharan Africa, and [funded](https://www.givewell.org/research/grants/evidence-action-dispensers-for-safe-water-January-2022) Evidence Action's Dispensers for Safe Water program in the amount of $65m.

Their analysis was based on a meta-analysis by Kremer et al. of related RCTs, estimating the effect of chlorination interventions on mortality. It is currently a working paper (the latest version I have is [here](https://bfi.uchicago.edu/wp-content/uploads/2022/03/BFI_WP_2022-26.pdf), as of July 2023), which means that the authors are still performing some analyses and it hasn't yet been peer-reviewed.

In GiveWell's analysis, they performed their own estimation, based on adjustments to Kremer et al.'s meta-analysis.

[I want to show the diamond plot, how GiveWell use it, their criticism from the competition, and Witold's reaction. All of this tells a nice story about how meshing together different types of uncertainties can be tricky. However, I don't feel like I have a good solution, so maybe I should just leave it out.]

...
