# Conclusion

This chapter covered a lot of ground. Inference is difficult and inexact.

Although there are several different approaches to inference, we presented the Bayesian approach here. We can summarize the approach as follows:

> In the forward direction, we have a model, $\theta$, and we use it to deduce the probabilities of events given a probability distribution. In the reverse direction, we have events and infer a probability distribution over possible models, $\theta$s.
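
Stated in symbols, the reverse direction is an application of Bayes Rule. Here $D$ is shorthand for the observed events (a symbol introduced only for this summary):

$$P(\theta \mid D) = \frac{P(D \mid \theta) \, P(\theta)}{P(D)}$$

The likelihood, $P(D \mid \theta)$, is the forward direction. The posterior, $P(\theta \mid D)$, is the reverse direction.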

The main reasons for using Bayesian inference are:

1. It is more general than other approaches (multiple models, more general domains, etc.)
2. In the 21st century, Bayesian approaches simply aren't difficult anymore.
3. The posterior distribution is richer to work with and makes more intuitive sense than either p-values or (Frequentist) confidence intervals.

We can always do something similar to a p-value or confidence interval if we want, but the posterior distribution leaves room for a more thorough analysis.

We described several approaches to Bayesian inference:

1. Grid Method
2. Exact Method
3. Monte Carlo Method
4. Bootstrap Method

The first three are simply the application of Bayes Rule. The equivalence of the Bootstrap and a Bayesian posterior distribution is a surprisingly convenient result if we're willing to assume a uniform prior.
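
As a reminder of how the Grid method is Bayes Rule applied point by point, here is a minimal sketch. The data (7 successes in 10 Bernoulli trials), the grid size, and the name `grid_posterior` are assumptions made for illustration only.

```python
from math import comb

def grid_posterior(successes, trials, grid_size=101):
    """Posterior over a grid of candidate p values, assuming a uniform prior."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0 / grid_size] * grid_size
    # Binomial likelihood of the observed data for each candidate p.
    likelihood = [
        comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)
        for p in grid
    ]
    # Bayes Rule: posterior is likelihood times prior, divided by the evidence.
    unnormalized = [l * pr for l, pr in zip(likelihood, prior)]
    evidence = sum(unnormalized)
    posterior = [u / evidence for u in unnormalized]
    return grid, posterior

# Hypothetical data: 7 successes in 10 trials.
grid, posterior = grid_posterior(successes=7, trials=10)
p_most_probable = grid[posterior.index(max(posterior))]
print(f"most probable p on the grid: {p_most_probable:.2f}")
```

The Exact and Monte Carlo methods replace the explicit grid with, respectively, a closed-form solution and samples, but the multiplication of likelihood by prior is the same.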

We concentrate on the Bootstrap throughout the remainder of the text because:

1. It is quick and simple.
2. It has general applicability. We can use it for A/B tests or for evaluating regression coefficients.
3. We don't need to specify or assume any particular distributional form; we use the empirical distribution directly.
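
The points above can be seen in a minimal sketch of the Bootstrap using only the Python standard library. The function names (`bootstrap`, `credible_interval`), the number of replications, and the data are assumptions for illustration, not the text's own implementation.

```python
import random

def bootstrap(data, statistic, replications=2000, seed=None):
    """Resample `data` with replacement and apply `statistic` to each resample."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(replications)]

def mean(xs):
    return sum(xs) / len(xs)

def credible_interval(replicates, mass=0.95):
    """Approximate central interval read off the sorted replicates."""
    ordered = sorted(replicates)
    tail = (1.0 - mass) / 2.0
    low = ordered[int(tail * len(ordered))]
    high = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return low, high

# Hypothetical data: 100 Bernoulli trials with 30 successes.
data = [1] * 30 + [0] * 70
replicates = bootstrap(data, mean, seed=42)
low, high = credible_interval(replicates)
print(f"theta is between {low:.2f} and {high:.2f} with roughly 95% probability")
```

No distributional form appears anywhere in the sketch. Swapping `mean` for another statistic, or the 0/1 data for real-valued data, leaves the procedure unchanged.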

However, we acknowledge a few things:

1. It is a limited form of Bayesian inference because we do not specify distributional forms or priors.
2. It must be interpreted carefully at the boundaries. Without a prior, it will give zero probabilities for things that are not, strictly speaking, impossible.
3. For more complicated or sophisticated modeling, we may need to return to Monte Carlo methods.
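
The boundary caveat is easy to reproduce. The sketch below assumes a hypothetical sample of 20 Bernoulli trials with no successes. For comparison, it also computes the posterior mean that the Exact method gives under a uniform Beta(1, 1) prior, which is $(s + 1)/(n + 2)$ for $s$ successes in $n$ trials.

```python
import random

rng = random.Random(0)
data = [0] * 20  # hypothetical: 20 trials, no successes observed

# Every resample of an all-zero sample is all zeros, so every replicate is 0.0.
replicates = [sum(rng.choices(data, k=len(data))) / len(data) for _ in range(1000)]
prob_positive = sum(r > 0 for r in replicates) / len(replicates)
print(f"Bootstrap P(theta > 0) = {prob_positive}")

# Exact method with a uniform Beta(1, 1) prior: posterior is Beta(s + 1, n - s + 1).
successes, trials = sum(data), len(data)
exact_posterior_mean = (successes + 1) / (trials + 2)
print(f"Exact posterior mean   = {exact_posterior_mean:.3f}")
```

The Bootstrap puts all of its probability on $\theta = 0$, even though a success is not impossible, whereas the explicit prior keeps some probability on positive values.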

Finally, we must acknowledge that, whether the approach is Bayesian, Frequentist, or something else, there is no magic bullet that solves the problem of induction. Inference will never be certain.

We closed out the chapter by examining problems in statistical inference that occur with great regularity:

1. Comparing two boolean $\theta$s.
2. Comparing two real-valued $\theta$s.
3. Comparing a boolean $\theta$ to a hypothesized or analytical value.
4. Comparing a real-valued $\theta$ to a hypothesized or analytical value.

We showed how to use the Bootstrap to conduct inference for those problems and that, because the Bootstrap does not require us to specify a distributional form, they are all pretty much the same. For the first two, the $\theta$s often come from either two groups in your data or, as we shall see later, two or more results when comparing two models. For the last two, the hypothesized value might come from a suggestion, a past value, or an idealized value.
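
To illustrate why the four problems are nearly the same under the Bootstrap, here is a minimal sketch. The A/B data, the hypothesized value of 0.25, and the helper name `bootstrap_mean` are assumptions for illustration.

```python
import random

def bootstrap_mean(data, replications, rng):
    """Bootstrap replicates of the mean (a proportion when the data is 0/1)."""
    n = len(data)
    return [sum(rng.choices(data, k=n)) / n for _ in range(replications)]

rng = random.Random(7)
replications = 2000

# Hypothetical A/B data: 40/200 successes in control, 60/200 in treatment.
control = [1] * 40 + [0] * 160
treatment = [1] * 60 + [0] * 140

theta_control = bootstrap_mean(control, replications, rng)
theta_treatment = bootstrap_mean(treatment, replications, rng)

# Problems 1 and 2: compare two thetas through the distribution of their difference.
difference = [t - c for t, c in zip(theta_treatment, theta_control)]
p_treatment_better = sum(d > 0 for d in difference) / replications
print(f"P(theta_treatment > theta_control) = {p_treatment_better:.3f}")

# Problems 3 and 4: compare one theta to a hypothesized value.
hypothesized = 0.25
p_above_hypothesized = sum(t > hypothesized for t in theta_treatment) / replications
print(f"P(theta_treatment > {hypothesized}) = {p_above_hypothesized:.3f}")
```

Replacing the 0/1 lists with real-valued measurements turns the boolean problems into the real-valued ones without touching the rest of the code.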

We didn't cover everything. Bayesian statistics is very deep and can be applied to a lot of different problems. The main thing we did not cover is model *checking*. We are, in essence, building a model of our $\theta$ using the posterior distribution calculated from the data. We generally want to test all of our models, even models of statistical inference.

## Review

1. Why is inference a problem?
2. How do we use probabilities in a deductive way? What kinds of problems can that solve? Is this controversial?
3. What are the two basic concepts in the Bayesian approach to inference?
4. Describe the Grid method of Bayesian inference. How is this just the application of Bayes Rule?
5. Describe the Exact method of Bayesian inference. How is this just the application of Bayes Rule? Why was the Exact method considered such a stumbling block to adoption of Bayesian inference for centuries?
6. Describe the Monte Carlo method of Bayesian inference. How does the Monte Carlo method get around the general intractability of the Exact method?
7. Summarize the Bootstrap and the steps required to get from the Bootstrap to a Bayesian posterior distribution and Bayesian inference.
8. Why is Frequentist inference often described as a *process* rather than a *result*?
9. What is the Frequentist p-value? What do people often think it means?
10. What is the Frequentist 95% confidence interval? What do people often think it means?
11. What are the four common problems in inference?

In a previous chapter, we learned how to generate synthetic data from mathematical distributions. You can use this technique to study how inference works and sometimes doesn't work. Concentrating on the first two general problems in inference (which, when we use the Bootstrap, are really the same), you can conduct the following sets of experiments:

1. Generate two samples using the same $\theta$ (for example, the same $p$ in a series of Bernoulli trials). Is the Bootstrap able to discern that the $p$ is the same? What happens as you change $p$ in ways that affect the variance? What happens when you increase the sample sizes? Use different distributions, parameters, etc.
2. Generate two samples using two different $\theta$s (you don't have to stick with Bernoulli trials; you can pick $\theta$ as a mean or variance from a Normal distribution). Here we have multiple things to investigate. Basically, how do the following interact and affect your ability to arrive at valid inferences?
    1. the distance/difference between $\theta$s,
    2. the variance, and
    3. the sample size.
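
One possible starting point for these experiments is sketched below. The helper names (`bernoulli_sample`, `prob_b_greater`, `experiment`), the parameter values, and the seed are assumptions chosen for illustration; change them freely.

```python
import random

def bernoulli_sample(p, n, rng):
    """Synthetic data: n Bernoulli trials with success probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def prob_b_greater(sample_a, sample_b, rng, replications=1000):
    """Bootstrap estimate of P(theta_b > theta_a) from two samples."""
    wins = 0
    for _ in range(replications):
        mean_a = sum(rng.choices(sample_a, k=len(sample_a))) / len(sample_a)
        mean_b = sum(rng.choices(sample_b, k=len(sample_b))) / len(sample_b)
        wins += mean_b > mean_a
    return wins / replications

def experiment(p_a, p_b, n, seed):
    """Generate two samples with known p values and run the Bootstrap on them."""
    rng = random.Random(seed)
    sample_a = bernoulli_sample(p_a, n, rng)
    sample_b = bernoulli_sample(p_b, n, rng)
    return prob_b_greater(sample_a, sample_b, rng)

# Vary the difference between the thetas and the sample size.
for p_a, p_b in [(0.3, 0.3), (0.3, 0.35), (0.3, 0.6)]:
    for n in (20, 100, 500):
        result = experiment(p_a, p_b, n, seed=1)
        print(f"p_a={p_a} p_b={p_b} n={n}: P(theta_b > theta_a) = {result:.3f}")
```

Because the true values of $p$ are known, you can judge each result against the truth. Rerunning with different seeds shows how much the answer depends on the particular sample drawn.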