# Chapter 2: Information Criteria and Bootstrap Confidence Intervals

In this chapter, we'll cover the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) for nested models, as well as the bootstrap method for confidence intervals. 

## Nested Models

Let's say we have model A with variables $\theta_1$, $\theta_2$ and $\theta_3$. Model B is nested in model A if model B contains a subset of the variables of model A, e.g. if model B has variables $\theta_1$ and $\theta_3$. We call model B a **nested model**.
Model B is **not** a nested model if it contains a variable that model A doesn't contain, e.g. if model B has variables $\theta_1$, $\theta_3$ and $\theta_4$.
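
The definition above can be illustrated with a short Python sketch. The function `is_nested` and the variable names are hypothetical and only serve as an example; they treat a model as nothing more than the set of its variables.

```python
def is_nested(sub_variables, full_variables):
    """Return True if every variable of the smaller model is also in the larger one."""
    return set(sub_variables) <= set(full_variables)


# Model A from the text, written as a set of variable names.
model_a = {"theta_1", "theta_2", "theta_3"}

# Model B with theta_1 and theta_3 is nested in model A.
print(is_nested({"theta_1", "theta_3"}, model_a))

# Model B with an extra theta_4 is not nested in model A.
print(is_nested({"theta_1", "theta_3", "theta_4"}, model_a))
```

The first call prints `True` and the second prints `False`.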

[transition from nested models to AIC and BIC, and why we look at nested models]


## Akaike Information Criterion (AIC)
The Akaike Information Criterion (AIC) is an estimator of the prediction error of a statistical model. It evaluates how well a model fits the data it was fitted to, and it is used to compare different models and determine which one fits best. The best model is the one that explains the greatest amount of variation with the least information loss, using the fewest parameters. The equation for the AIC is as follows:

<a id="eq:AIC"></a>
$$AIC = - 2\max\left[\ln(L)\right] + 2p_k,\tag{2.1}$$
It combines the maximum log-likelihood $\max\left[\ln(L)\right]$, where $L$ is the likelihood, with the number of parameters $p_k$. The best model gives the lowest AIC value. From [Eq. 2.1](#eq:AIC) you can see that the AIC discourages overfitting: every added parameter increases the penalty term $2p_k$, so an extra parameter only lowers the AIC if it improves the goodness of fit, measured by the maximum likelihood, enough to compensate.
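
As a minimal sketch of [Eq. 2.1](#eq:AIC), the Python code below computes the AIC for polynomial fits of increasing degree. Everything specific in it is an assumption made for illustration: the synthetic data, the choice of polynomial models, and the Gaussian noise model used to obtain the maximum log-likelihood from the fit residuals. The helper names `aic` and `gaussian_max_log_likelihood` are hypothetical.

```python
import numpy as np


def aic(max_log_likelihood, n_params):
    """AIC as in Eq. 2.1: -2 * max[ln(L)] + 2 * p_k."""
    return -2.0 * max_log_likelihood + 2.0 * n_params


def gaussian_max_log_likelihood(residuals):
    """Maximized log-likelihood for independent Gaussian residuals.

    Assumption: the noise is Gaussian with unknown variance, whose
    maximum-likelihood estimate is the mean squared residual.
    """
    n = residuals.size
    sigma2 = np.mean(residuals**2)
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)


# Hypothetical data: a straight line plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)

aic_by_degree = {}
for degree in (1, 2, 5):
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    # Parameters: degree + 1 polynomial coefficients plus the noise variance.
    n_params = degree + 2
    aic_by_degree[degree] = aic(gaussian_max_log_likelihood(residuals), n_params)

for degree, value in aic_by_degree.items():
    print(f"degree {degree}: AIC = {value:.2f}")
```

The polynomials of degree 1, 2 and 5 form a sequence of nested models, so the fit can only improve with the degree; whether the AIC improves depends on whether that gain outweighs the penalty $2p_k$.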

**Note**: The AIC only gives a relative comparison between models; it *does not* say anything about the absolute quality of the models themselves. So after selecting a model with the AIC, you still need to validate the quality of that model, for instance by checking the model's residuals.

### Simple usage of AIC values
Let $\mathrm{AIC}_{min}$ be the minimal AIC value among the candidate models. The following expression can be used to compare the AIC score of model $i$ ($\mathrm{AIC}_{i}$) with $\mathrm{AIC}_{min}$:
<a id="eq:AIC comparison"></a>
$$c = e^{\frac{\mathrm{AIC}_{min} - \mathrm{AIC}_{i}}{2}},\tag{2.2}$$
This quantity $c$ is proportional to the probability that model $i$ minimizes the information loss.
Say you have three models with the following values: $\mathrm{AIC}_1 = 101$, $\mathrm{AIC}_2 = 104$ and $\mathrm{AIC}_3 = 111$, so that $\mathrm{AIC}_{min} = \mathrm{AIC}_1$.
\begin{equation}
c = 
\begin{cases}
e^{\frac{101 - 104}{2}} \approx 0.223 & \text{for model 2} \\
e^{\frac{101 - 111}{2}} \approx 0.007 & \text{for model 3}
\end{cases}
\end{equation}
This means that model 2 is 0.223 times as probable as model 1 to minimize the information loss, and model 3 is 0.007 times as probable.
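
The same comparison can be reproduced with a few lines of Python. This is only a sketch of [Eq. 2.2](#eq:AIC comparison); the function name `relative_likelihoods` is a hypothetical choice.

```python
import math


def relative_likelihoods(aic_values):
    """Evaluate Eq. 2.2 for every AIC value, relative to the smallest one."""
    aic_min = min(aic_values)
    return [math.exp((aic_min - value) / 2.0) for value in aic_values]


# The three AIC values from the example in the text.
aic_values = [101.0, 104.0, 111.0]

for i, c in enumerate(relative_likelihoods(aic_values), start=1):
    print(f"model {i}: c = {c:.3f}")
```

The model with the minimal AIC always gets $c = 1$; the other models get the values 0.223 and 0.007 found above.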


## Bayesian Information Criterion (BIC)


## Bootstrap and confidence intervals

The bootstrap technique is used to estimate a confidence interval for a parameter when no ready-made expression for that interval is available. First we need to understand what a confidence interval is.

### Confidence intervals

A confidence interval is a way to state that a value can be found within an interval with a certain confidence. For example, suppose you want to determine your parameter $\theta$ with 95% confidence. It does not suffice to compute a single value of $\theta$ from your data and say that you are 95% confident that this value is the actual $\theta$. Instead, you need to find the boundaries of the interval in which you find $\theta$ 95% of the time. This gives ($\theta_{lower}$, $\theta_{higher}$): an interval in which you can say a found $\theta$ resides 95% of the time.
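
As a rough sketch of how the bootstrap can produce such an interval, the Python code below uses the percentile bootstrap: the data are resampled with replacement many times, $\theta$ is recomputed on every resample, and the interval is read off from the quantiles of those values. The percentile method is one common variant and is an assumption here, as are the synthetic sample, the choice of the mean as $\theta$, and the hypothetical function name `bootstrap_ci`.

```python
import numpy as np


def bootstrap_ci(data, statistic, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for statistic(data)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    estimates = np.empty(n_resamples)
    for i in range(n_resamples):
        # Resample the data with replacement, keeping the sample size.
        resample = rng.choice(data, size=data.size, replace=True)
        estimates[i] = statistic(resample)
    # Cut off equal tails on both sides of the bootstrap distribution.
    alpha = 1.0 - confidence
    lower, upper = np.quantile(estimates, [alpha / 2.0, 1.0 - alpha / 2.0])
    return lower, upper


# Hypothetical sample; here theta is the mean of the distribution.
rng = np.random.default_rng(1)
sample = rng.normal(loc=5.0, scale=2.0, size=100)

theta_lower, theta_higher = bootstrap_ci(sample, np.mean)
print(f"95% interval: ({theta_lower:.2f}, {theta_higher:.2f})")
```

Any function that maps a sample to a single number can be passed as `statistic`, for example `np.median`.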
