### R Packages for the DM test

- [`multDM`](https://cran.r-project.org/web/packages/multDM/multDM.pdf): performs the multivariate version of the Diebold-Mariano test for equal predictive ability across multiple forecasts.

- `dm.test` from [`forecast`](https://www.rdocumentation.org/packages/forecast/versions/8.5/topics/dm.test): implements the modified test proposed by Harvey, Leybourne and Newbold (1997). The null hypothesis is that the two methods have the same forecast accuracy.

### [Comparing Predictive Accuracy of two Forecasts: The Diebold-Mariano Test](http://www.phdeconomics.sssup.it/documents/Lesson19.pdf)

In empirical applications it is often the case that two or more
time series models are available for forecasting a particular
variable of interest.

Question: Are the forecasts equally good?

The loss associated with forecast $i$ is assumed to be a function
of the forecast error, $e_{it}$, and is denoted by $g(e_{it})$.

A problem with the most common loss functions (squared-error loss,
absolute-error loss) is that they are symmetric.

When it is more costly to underpredict $y_t$ than to overpredict
it, the following loss function can be used (with $\lambda > 0$ and the
forecast error defined as actual minus forecast):
$$g(e_{it}) = \exp(\lambda e_{it}) - 1 - \lambda e_{it}$$
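
To make the asymmetry concrete, here is a minimal Python sketch of this loss (often called the LINEX loss). The function name `linex_loss`, the default `lam=1.0`, and the sign convention (error = actual minus forecast, so a positive error is an underprediction) are assumptions made for illustration; they are not fixed by the notes.

```python
import math


def linex_loss(e, lam=1.0):
    """Asymmetric loss g(e) = exp(lam * e) - 1 - lam * e."""
    return math.exp(lam * e) - 1.0 - lam * e


# Assumed convention: e = actual - forecast, so e > 0 is an underprediction.
under = linex_loss(1.0)   # forecast one unit too low
over = linex_loss(-1.0)   # forecast one unit too high

print("loss when underpredicting:", under)
print("loss when overpredicting: ", over)
```

With a positive `lam`, the exponential term dominates for positive errors, so an underprediction costs more than an overprediction of the same size; a negative `lam` would reverse the asymmetry.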

We define the loss differential between the two forecasts by
$$d_t = g(e_{1t}) - g(e_{2t})$$
and say that the two forecasts have equal accuracy if and only
if the loss differential has zero expectation for all $t$.

So, we would like to test the null hypothesis
$$H_0 : E(d_t) = 0 \quad \forall t$$
versus the alternative hypothesis
$$H_1 : E(d_t) \neq 0$$
The null hypothesis is that the two forecasts have the same
accuracy. The alternative hypothesis is that the two forecasts
have different levels of accuracy.
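
As a small illustration, the loss differential and its sample mean can be computed as below. This is only a sketch: the error series `e1` and `e2` are made-up numbers, and squared-error loss is assumed as the default loss function.

```python
def loss_differential(e1, e2, loss=lambda e: e * e):
    """Return d_t = g(e1_t) - g(e2_t) for two aligned forecast-error series."""
    if len(e1) != len(e2):
        raise ValueError("error series must have the same length")
    return [loss(a) - loss(b) for a, b in zip(e1, e2)]


# Hypothetical forecast errors, for illustration only.
e1 = [0.5, -1.0, 0.3, 0.8, -0.2]
e2 = [0.4, -0.6, 0.5, 0.2, -0.1]

d = loss_differential(e1, e2)
d_bar = sum(d) / len(d)  # sample mean of the loss differential
print("d_t  :", d)
print("d_bar:", d_bar)
```

A positive `d_bar` means forecast 1 had the larger average loss in the sample; the test asks whether that difference is distinguishable from zero.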


Suppose that the forecasts are $h$-step-ahead, with $h > 1$. To
test the null hypothesis that the two forecasts have the same
accuracy, Diebold and Mariano use the following statistic, which is
asymptotically standard normal under the null:

$$DM = \frac{\bar{d}}{\sqrt{\frac{2\pi\hat{f}_d(0)}{T}}} \sim N(0, 1)$$

where

$f_d(0) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma_d(k)$ is the spectral density of the loss differential at frequency 0, and
$\gamma_d(k)$ is the autocovariance of the loss differential at lag $k$;

$\hat{f}_d(0) = \frac{1}{2\pi} \sum_{k=-(T-1)}^{T-1} I\left(\frac{k}{h-1}\right) \hat{\gamma}_d(k)$
is a consistent estimate of $f_d(0)$;

$\hat{\gamma}_d(k) = \frac{1}{T} \sum_{t=|k|+1}^{T} (d_t - \bar{d})(d_{t-|k|} - \bar{d})$;

$I\left(\frac{k}{h-1}\right) = \begin{cases} 1, & \text{if } \left|\frac{k}{h-1}\right| \leq 1 \\ 0, & \text{otherwise} \end{cases}$
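
The pieces above can be put together in a short Python sketch. It uses the fact that, with the rectangular window, $2\pi\hat{f}_d(0)$ reduces to $\hat{\gamma}_d(0) + 2\sum_{k=1}^{h-1}\hat{\gamma}_d(k)$. The function names, the sample series `d`, and the handling of a non-positive variance estimate are assumptions for illustration, not a reference implementation.

```python
import math
from statistics import NormalDist


def autocovariance(d, k):
    """Sample autocovariance of d at lag k, with divisor T."""
    T = len(d)
    d_bar = sum(d) / T
    k = abs(k)
    return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, T)) / T


def dm_statistic(d, h):
    """DM statistic with a rectangular window keeping lags |k| <= h - 1."""
    T = len(d)
    d_bar = sum(d) / T
    # 2*pi*f_hat(0) = gamma(0) + 2 * sum of gamma(k) for k = 1..h-1
    long_run_var = autocovariance(d, 0) + 2.0 * sum(
        autocovariance(d, k) for k in range(1, h)
    )
    if long_run_var <= 0:
        # The rectangular window does not guarantee a positive estimate.
        raise ValueError("estimated long-run variance is not positive")
    return d_bar / math.sqrt(long_run_var / T)


# Hypothetical loss differentials, for illustration only.
d = [0.2, 0.5, 0.4, -0.1, 0.3, 0.6, 0.1, 0.0]
dm = dm_statistic(d, h=2)
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(dm)))  # two-sided, N(0, 1)
print("DM statistic:", dm)
print("p-value     :", p_value)
```

With eight observations the normal approximation is of course unreliable; the series is kept short only so the arithmetic can be checked by hand.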


As the simulation experiments in Diebold and Mariano (1995)
show, the normal distribution can be a very poor
approximation of the DM test’s finite-sample null distribution.
Their results show that the DM test can have the wrong size,
rejecting the null too often, depending on the degree of serial
correlation among the forecast errors and the sample size, $T$.


Harvey, Leybourne, and Newbold (1997) (HLN) suggest that
improved small-sample properties can be obtained by:
1. making a bias correction to the DM test statistic, and
2. comparing the corrected statistic with a Student-t distribution with $T-1$ degrees of freedom, rather than the standard normal.

The corrected statistic is

$$HLN = DM\sqrt{\frac{T+1-2h+h(h-1)/T}{T}} \sim t_{T-1}$$
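
A minimal Python sketch of the HLN adjustment follows. It uses the correction factor $\sqrt{(T + 1 - 2h + h(h-1)/T)/T}$, where `T` is the number of loss-differential observations and `h` the forecast horizon. The function names and the input values are hypothetical.

```python
import math


def hln_correction(T, h):
    """Small-sample correction factor of Harvey, Leybourne and Newbold."""
    return math.sqrt((T + 1 - 2 * h + h * (h - 1) / T) / T)


def hln_statistic(dm, T, h):
    """Corrected statistic, to be compared with Student-t, T - 1 d.o.f."""
    return dm * hln_correction(T, h)


# Hypothetical inputs: a DM statistic of 2.1 from T = 40 two-step forecasts.
T, h = 40, 2
factor = hln_correction(T, h)
hln = hln_statistic(2.1, T, h)
print("correction factor:", factor)
print("HLN statistic    :", hln)
```

The factor is below one here, so the correction shrinks the statistic. Python's standard library has no Student-t distribution, so the p-value would have to come from a table or from a library such as SciPy (for example `scipy.stats.t.sf`).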


__A problem:__

The Diebold-Mariano test should not be applied
to situations where the competing forecasts are obtained using
two nested models.

The root of the problem is that, at the population level, if the
null hypothesis of equal predictive accuracy is true, the forecast
errors from the competing models are exactly the same and
perfectly correlated, which means that the numerator and
denominator of the Diebold-Mariano statistic both converge to
zero as the estimation sample and prediction sample grow.

However, when the size of the estimation sample remains finite
as the size of the prediction sample grows, parameter
estimates are prevented from reaching their probability limits
and the Diebold-Mariano test remains asymptotically valid
even for nested models, under some regularity assumptions
(see Giacomini and White 2003).

> ##### [Nested model](http://quantile.ru/12/12-AT.pdf) 

> When model A is a special case of model B, i.e. A can be obtained from B by imposing
certain restrictions on the parameters (and possibly by a change of variables), model A is said
to be nested in model B. If A and B cannot be reduced to one another by imposing restrictions
on the parameters, A and B are said to be non-nested; the logit and probit models are an example.
If the correctness of one model's specification is checked by comparing it with a non-nested
alternative model, one speaks of non-nested hypotheses.


> In the same model-selection context there is a similar term, encompassing. The encompassing
principle states that if a model is correctly specified, it should be able to explain the
results obtained from an alternative model. On model selection, non-nested models and
encompassing models, see Gourieroux & Monfort (1994), Pesaran &
Weeks (2001), Greene (2012).

Additional information:


- [Statistical Tests for Multiple Forecast Comparison](http://statweb.stanford.edu/~ckirby/ted/conference/Roberto%20Mariano.pdf) covers a range of tests:
    - Morgan-Granger-Newbold (MGN) test (1977)
    - Meese-Rogoff (MR) test (1988)
    - Diebold-Mariano (DM) test (1995)
    - HLN (1997): small-sample modification of the DM test
    - a multivariate test