## Bootstrap


The bootstrap, introduced by Efron (1979), is an alternative to asymptotic approximations for estimation and inference in time series models in finite samples. However, the standard bootstrap resampling scheme, designed for independent and identically distributed (IID) errors, is not directly applicable to time series regressions, because the IID assumption is violated in most such settings.

The basic bootstrap approach consists of drawing repeated samples, with replacement, from the observed data.
The method assumes that the observations are IID.
In time series models the IID assumption is not satisfied,
so the method needs to be modified.

- Estimating standard errors: if the small-sample distribution is normal, the bootstrap (BS) distribution gives an estimate of the SE that agrees with the one from the asymptotic distribution.
- Confidence intervals: a CI built from the BS distribution can differ from the asymptotic CI, for example because the BS distribution is skewed.
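The first bullet can be illustrated with a minimal sketch of the IID bootstrap for the standard error of a sample mean. The helper name `bootstrap_se`, the sample size, and the number of replications are assumptions chosen for illustration; for a roughly normal sample the BS standard error should be close to the asymptotic one, $s/\sqrt{n}$.

```python
import numpy as np

def bootstrap_se(x, statistic, n_boot=2000, seed=0):
    """IID bootstrap standard error of statistic(x) (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    # Draw n_boot resamples of size n with replacement and recompute the statistic
    stats = np.array([statistic(x[rng.integers(0, n, size=n)])
                      for _ in range(n_boot)])
    return stats.std(ddof=1), stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.0, scale=1.0, size=50)  # illustrative IID data

se_boot, boot_stats = bootstrap_se(sample, np.mean)
se_asym = sample.std(ddof=1) / np.sqrt(len(sample))  # asymptotic SE of the mean
print("bootstrap SE: ", se_boot)
print("asymptotic SE:", se_asym)
```

The array `boot_stats` is the BS distribution itself, so its skewness (the point of the second bullet) can be inspected directly, for example with a histogram.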



1. Efron's confidence interval (percentile CI) for $\theta$:

$$(q_{\alpha/2}, q_{1-\alpha/2})$$

where $q_{\alpha}$ is the $\alpha$ quantile of the BS distribution of $\hat{\theta}^*$.

2. CI based on the asymptotic distribution of $\hat{\theta}$:
$$(\hat{\theta}  \pm z_{1-\alpha/2} \, se(\hat{\theta}) )$$

where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution.

3. Hall's confidence interval, based on the BS distribution of $\hat{\theta}^* - \hat{\theta}$:
$$(\hat{\theta} - z_{1-\alpha/2}^*, \ \hat{\theta} - z_{\alpha/2}^* ) $$


where $z_{\alpha}^*$ is the $\alpha$ quantile of the BS distribution of $\hat{\theta}^* - \hat{\theta}$ (we bootstrap the deviation of the estimate from the true value).
The BS distribution used is that of $\hat{\theta}^* - \hat{\theta}$,
not of $\hat{\theta}^* - \theta_0$: in the bootstrap world $\hat{\theta}$ plays the role of the true value.

4. t-percentile CI:

$$(\hat{\theta} - z_{1-\alpha/2}^* \, se(\hat{\theta}), \ \hat{\theta} - z_{\alpha/2}^* \, se(\hat{\theta})) $$

where $z_{\alpha}^*$ is now the $\alpha$ quantile of the BS distribution of the properly studentized statistic
$$\dfrac{\hat{\theta}^* - \hat{\theta}}{ \hat{\sigma}^*} $$
Here $\hat{\sigma}^*$ is the standard error recomputed on the BS sample, the BS counterpart of $\hat{\sigma}$.

To obtain a symmetric t-percentile CI (suitable for hypothesis testing),
$$(\hat{\theta} \pm   z_{1-\alpha}^* se(\hat{\theta})) $$
we bootstrap $\left|\dfrac{\hat{\theta} - {\theta}}{ \hat{\sigma}}\right|$ instead of $\dfrac{\hat{\theta} - {\theta}}{ \hat{\sigma}}$, so that $z_{1-\alpha}^*$ is the $1-\alpha$ quantile of the BS distribution of the absolute studentized statistic.
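The interval constructions above can be compared side by side in a short sketch for the mean of an IID sample. The function name `bootstrap_cis`, the choice of the mean as $\hat{\theta}$, and the use of $s/\sqrt{n}$ as $se(\hat{\theta})$ are assumptions made for illustration. The Hall and t-percentile upper bounds are computed as $\hat{\theta}$ minus the lower quantile, since the lower quantile of the recentred statistic is typically negative.

```python
import numpy as np

def bootstrap_cis(x, alpha=0.05, n_boot=2000, seed=0):
    """Efron, Hall, t-percentile and symmetric t-percentile CIs for a mean
    (illustrative helper, IID resampling assumed)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta_hat = x.mean()
    se_hat = x.std(ddof=1) / np.sqrt(n)

    theta_star = np.empty(n_boot)
    t_star = np.empty(n_boot)
    for b in range(n_boot):
        xb = x[rng.integers(0, n, size=n)]
        theta_star[b] = xb.mean()
        # Studentize with the standard error recomputed on the BS sample
        se_star = xb.std(ddof=1) / np.sqrt(n)
        t_star[b] = (theta_star[b] - theta_hat) / se_star

    lo, hi = alpha / 2, 1 - alpha / 2
    # Efron: quantiles of theta* itself
    efron = tuple(np.quantile(theta_star, [lo, hi]))
    # Hall: quantiles of theta* - theta_hat, reflected around theta_hat
    d_lo, d_hi = np.quantile(theta_star - theta_hat, [lo, hi])
    hall = (theta_hat - d_hi, theta_hat - d_lo)
    # t-percentile: quantiles of the studentized statistic
    t_lo, t_hi = np.quantile(t_star, [lo, hi])
    t_pct = (theta_hat - t_hi * se_hat, theta_hat - t_lo * se_hat)
    # Symmetric t-percentile: (1 - alpha) quantile of |t*|
    c = np.quantile(np.abs(t_star), 1 - alpha)
    t_sym = (theta_hat - c * se_hat, theta_hat + c * se_hat)
    return {"theta_hat": theta_hat, "efron": efron, "hall": hall,
            "t_percentile": t_pct, "t_symmetric": t_sym}

rng = np.random.default_rng(1)
sample = rng.exponential(scale=1.0, size=40)  # skewed data, so the CIs differ
cis = bootstrap_cis(sample)
for name, value in cis.items():
    print(name, value)
```

With skewed data the Efron and Hall intervals are mirror images of each other around $\hat{\theta}$, which makes the difference between bootstrapping $\hat{\theta}^*$ and bootstrapping $\hat{\theta}^* - \hat{\theta}$ visible.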


### The Recursive BS for stationary AR(p) model


Consider the AR(p) process:
$$ y_t = \sum_{i=1}^p a_i y_{t-i} + e_t, \quad e_t \sim N(0,\sigma^2)$$

We estimate the coefficients by OLS and obtain the estimates $(\hat{a}_1,\dots, \hat{a}_p)$ and the residuals $\hat{e}_t$.




Define the centered and scaled residuals:
$$ \tilde{e}_t = \left(\hat{e}_{t} - \frac{1}{n} \sum_t \hat{e}_{t} \right)  \left( \frac{n}{n-p}\right) ^{1/2} $$
Resample $\tilde{e}_t$ with replacement to get the BS residuals $e_t^*$.

Construct the BS sample recursively, using the observed values as initial conditions ($y_t^* = y_t$ for $t = 1, \dots, p$):

$$ y_t^* = \sum_{i=1}^p \hat{a}_i y_{t-i}^* + e_t^*$$
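A minimal sketch of the recursive scheme is given below. The helper names `fit_ar_ols` and `recursive_ar_bootstrap` are hypothetical, the model has no intercept (as in the equation above), the first $p$ observations are used as initial values, and the residual mean is taken over the $n-p$ residuals that OLS actually produces; these are assumptions of the sketch, not part of the original recipe.

```python
import numpy as np

def fit_ar_ols(y, p):
    """OLS fit of an AR(p) without intercept; returns (coefficients, residuals)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Row for time t holds (y_{t-1}, ..., y_{t-p}), for t = p, ..., n-1 (0-based)
    X = np.column_stack([y[p - i : n - i] for i in range(1, p + 1)])
    target = y[p:]
    a_hat, *_ = np.linalg.lstsq(X, target, rcond=None)
    return a_hat, target - X @ a_hat

def recursive_ar_bootstrap(y, p, n_boot=500, seed=0):
    """Recursive residual bootstrap for AR(p) coefficients (illustrative)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = len(y)
    a_hat, resid = fit_ar_ols(y, p)
    # Centre the residuals and rescale by sqrt(n / (n - p))
    e_tilde = (resid - resid.mean()) * np.sqrt(n / (n - p))
    a_star = np.empty((n_boot, p))
    for b in range(n_boot):
        e_star = rng.choice(e_tilde, size=n, replace=True)
        y_star = np.empty(n)
        y_star[:p] = y[:p]  # assumption: initial values taken from the data
        for t in range(p, n):
            # y*_t = sum_i a_hat_i * y*_{t-i} + e*_t
            y_star[t] = a_hat @ y_star[t - p : t][::-1] + e_star[t]
        a_star[b], _ = fit_ar_ols(y_star, p)  # re-estimate on the BS sample
    return a_hat, a_star

# Illustrative data: a stationary AR(2) with made-up coefficients
rng = np.random.default_rng(1)
n, a_true = 200, np.array([0.5, -0.3])
y = np.zeros(n)
for t in range(2, n):
    y[t] = a_true @ y[t - 2 : t][::-1] + rng.normal()

a_hat, a_star = recursive_ar_bootstrap(y, p=2)
print("OLS estimate:", a_hat)
print("bootstrap SE:", a_star.std(axis=0, ddof=1))
```

The rows of `a_star` form the BS distribution of $(\hat{a}_1^*, \dots, \hat{a}_p^*)$, which can be fed into any of the interval constructions from the previous section.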



### The Moving Block Bootstrap

Application of the residual-based bootstrap methods is straightforward if the error process is specified as an ARMA(p,q) process with known p and q.

However, if the structure of the serial correlation is not tractable or is misspecified, the residual-based methods give inconsistent estimates.

The block bootstrap avoids specifying the dependence structure: divide the data of $n$ observations into blocks of length $l$ and select $b$ of these blocks (with repeats allowed) to form the BS sample.


**NBB: non-overlapping block bootstrap**

> Carlstein (1986) – first discussed the idea of bootstrapping blocks of observations rather 
than the individual observations.

Number of blocks: $\frac{n}{l} = b$  

In the Carlstein scheme (non-overlapping blocks) there is a high probability that entire blocks are missing from the BS sample, so it is not often used.

**MBB: moving block bootstrap**


> Künsch (1989) and Liu and Singh (1992) independently introduced a more general BS
procedure, the moving block BS (MBB), which is applicable to stationary time series data. In this method the blocks of observations overlap.

Number of blocks: $n - l + 1$  
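The two block schemes differ only in the set of admissible block starting points, which the following sketch makes explicit. The function name `block_bootstrap_sample` and the convention of trimming the concatenated blocks to length $n$ are assumptions made for illustration.

```python
import numpy as np

def block_bootstrap_sample(x, block_len, overlapping=True, rng=None):
    """One block-bootstrap pseudo series (illustrative helper).

    overlapping=True  -> MBB: any of the n - l + 1 overlapping blocks
    overlapping=False -> NBB: only the floor(n / l) non-overlapping blocks
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    n = len(x)
    if overlapping:
        starts = np.arange(n - block_len + 1)
    else:
        starts = np.arange(0, n - block_len + 1, block_len)
    n_draw = int(np.ceil(n / block_len))  # enough blocks to cover n points
    chosen = rng.choice(starts, size=n_draw, replace=True)
    # Expand each chosen start into block_len consecutive indices, trim to n
    idx = (chosen[:, None] + np.arange(block_len)[None, :]).ravel()[:n]
    return x[idx]

x = np.arange(12)  # values equal to their positions, so blocks are easy to read
rng = np.random.default_rng(0)
print("NBB:", block_bootstrap_sample(x, 4, overlapping=False, rng=rng))
print("MBB:", block_bootstrap_sample(x, 4, overlapping=True, rng=rng))
```

Within each block the original ordering is preserved, so dependence at lags shorter than $l$ is retained; dependence across block joins is lost.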


>> IDEA: MBB for short clusterized time series

#### Problems with MBB

1. The pseudo time series generated by the moving block method is not stationary, even if the original series $\{x_t\}$ is stationary


> Politis and Romano (1994):  **A stationary bootstrap method**

The method samples blocks of random length, where the length of each block has a geometric distribution with parameter $p$. They show that the pseudo time series generated by the stationary bootstrap method is indeed stationary.

The application of the stationary bootstrap is less sensitive to the choice of $p$ than the application of the moving block bootstrap is to the choice of $l$.
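A minimal sketch of the stationary bootstrap follows. It assumes the usual description of the scheme: at each step a new block starts at a uniformly random position with probability $p$, otherwise the current block continues with the next observation, wrapping around the end of the sample. The function name `stationary_bootstrap_sample` is hypothetical.

```python
import numpy as np

def stationary_bootstrap_sample(x, p, rng=None):
    """One stationary-bootstrap pseudo series (illustrative helper).

    Block lengths are geometric with parameter p (mean block length 1 / p).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    n = len(x)
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(n)       # start a new block at a random position
        else:
            idx[t] = (idx[t - 1] + 1) % n  # continue the block, wrapping around
    return x[idx]

x = np.arange(100)  # values equal to their positions, so blocks are visible
sample = stationary_bootstrap_sample(x, p=0.1, rng=np.random.default_rng(0))
print(sample[:30])
```

Smaller $p$ gives longer blocks on average, so $1/p$ plays the role that $l$ plays in the moving block bootstrap.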

2. The mean $\bar{x}^*_n$ of the moving block bootstrap is biased in the sense that: $$E(\bar{x}^*_n | x_1, ... , x_n) \neq \bar{x}_n $$


3. The MBB estimator of the variance of $\sqrt{n} \bar{x}_n$  is also biased

> Davison and Hall (1993): **modification**

 Usual estimator: $\hat{\sigma}^2 = n^{-1}\sum^n_{i=1}(x_i - \bar{x}_n )^2$

 Modification:  $\tilde{\sigma}^2 = n^{-1}\left(\sum^n_{i=1}(x_i - \bar{x}_n )^2  + \sum^{l-1}_{k=1} \sum^{n-k}_{i=1} (x_i - \bar{x}_n ) (x_{i+k} - \bar{x}_n ) \right)$, where $l$ is the block length
 
 With this modification the bootstrap can improve substantially on the normal approximation.
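The bias of the MBB mean (problem 2 above) can be checked without any simulation. Assuming the BS sample consists of $b$ blocks drawn uniformly from the $n - l + 1$ overlapping blocks, so that its length is $b \, l$, the conditional expectation of $\bar{x}^*_n$ is the average of all block means. That average gives less weight to observations near the two ends of the series. The helper name `mbb_expected_mean` is hypothetical.

```python
import numpy as np

def mbb_expected_mean(x, block_len):
    """E(x_bar* | x_1, ..., x_n) for the MBB with whole blocks (illustrative)."""
    x = np.asarray(x, dtype=float)
    # Means of all n - l + 1 overlapping blocks, computed as a moving average
    block_means = np.convolve(x, np.ones(block_len) / block_len, mode="valid")
    return block_means.mean()

rng = np.random.default_rng(0)
x = rng.normal(size=30)  # illustrative data
print("sample mean:        ", x.mean())
print("E*[MBB mean], l = 5:", mbb_expected_mean(x, 5))
print("E*[MBB mean], l = 1:", mbb_expected_mean(x, 1))  # IID case
```

With $l = 1$ the scheme reduces to the IID bootstrap and the two means coincide; for $l > 1$ they generally differ because the first and last $l - 1$ observations appear in fewer blocks.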


#### Optimal Length of Blocks

The goal is to choose the block length that minimizes the MSE of the block bootstrap estimate
of the variance of a general statistic.

Carlstein’s rules for non-overlapping blocks: 

- As the block size increases: variance $\uparrow$,  bias  $\downarrow$

- As the dependence among the $x_i$ gets stronger, a longer block size is needed

- Optimal block size for the AR(1) model $x_t = \phi x_{t-1} + e_t$ is $l^* = \left( \dfrac{2\phi}{1-\phi^2}  \right)^{2/3} n^{1/3}$

- Carlstein optimal block size:  $ l = n^{1/3} \rho^{-2/3}$

- Künsch optimal block size: $l = \left(\tfrac{3}{2} n\right)^{1/3} \rho^{-2/3}$, where in both rules $\gamma(j)$ is the covariance of $x_t$ at lag $j$ and
 
$$ \rho  = \dfrac{\gamma(0) + 2 \sum^{\infty}_{j=1} \gamma(j) }{ \sum^{\infty}_{j=1} j \gamma(j)}$$ 

- Hall and Horowitz’s rules for AR(1): $ \rho = (1-\phi^2)/\phi$
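The rules above are easy to evaluate for an AR(1). The sketch below plugs $\rho = (1-\phi^2)/\phi$ into the Carlstein and Künsch formulas; the function names and the example values of $n$ and $\phi$ are assumptions for illustration. Taking $|\rho|$ before raising to the power $-2/3$ is also an assumption of the sketch, made so that negative $\phi$ does not produce a complex number.

```python
def rho_ar1(phi):
    """rho for an AR(1): (1 - phi^2) / phi (requires phi != 0)."""
    return (1 - phi**2) / phi

def block_length_carlstein(n, rho):
    """l = n^(1/3) * rho^(-2/3), for non-overlapping blocks."""
    return n ** (1 / 3) * abs(rho) ** (-2 / 3)

def block_length_kunsch(n, rho):
    """l = (3n/2)^(1/3) * rho^(-2/3), for overlapping blocks."""
    return (1.5 * n) ** (1 / 3) * abs(rho) ** (-2 / 3)

n = 200  # illustrative sample size
for phi in (0.2, 0.5, 0.8):
    rho = rho_ar1(phi)
    print(f"phi={phi}: rho={rho:.3f}, "
          f"Carlstein l={block_length_carlstein(n, rho):.2f}, "
          f"Kunsch l={block_length_kunsch(n, rho):.2f}")
```

In practice the result would be rounded to an integer block length. Stronger dependence (larger $\phi$) gives smaller $\rho$ and hence longer blocks, in line with the second rule in the list.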
