Right now, "conditional" and "maximum likelihood" refer only to the estimation procedure. However, the distinction should also apply to the intended model given exogenous variables (even in the case of constant-only exog). When we specify exogenous variables now for an "ARMAX" model, we are estimating a regression model with ARMA errors, i.e.,
L(phi)(y_t - X_tB) = L(theta)e_t
or, equivalently,
y_t = X_tB + mu_t
L(phi)mu_t = L(theta)e_t
This is true whether the model is estimated with exact or conditional MLE. However, when we specify a conditional model, it is more correct to mean that we are modeling the conditional mean of y_t given its own past. Assuming that the conditional mean can be described as a linear combination of exogenous variables, this is
L(phi)y_t = X_tB + L(theta)e_t
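To make the difference between the two specifications explicit, here is a short derivation. It assumes the usual lag-polynomial convention L(phi) = 1 - phi_1 L - ... - phi_p L^p, which is not stated above, so treat the signs as an assumption.

```latex
\begin{aligned}
\phi(L)\,(y_t - X_t B) &= \theta(L)\,e_t \\
\phi(L)\,y_t &= \phi(L)\,X_t B + \theta(L)\,e_t \\
             &= X_t B - \sum_{i=1}^{p} \phi_i\, X_{t-i} B + \theta(L)\,e_t
\end{aligned}
```

So the regression with ARMA errors implies lagged X terms with restricted coefficients, whereas the conditional-mean model contains only X_tB. If X_t is a constant, L(phi) applied to it is just (1 - sum of phi_i) times that constant, so the two models differ only by a rescaling of the intercept.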
Right now, for a constant only, you can go back and forth between the two models: multiply the mean of the model by 1 - sum(arparams) to get the conditional constant. E.g.
from statsmodels.datasets.macrodata import load_pandas
from statsmodels.tsa.arima_model import ARMA
# first difference of CPI; drop the leading NaN created by diff()
cpi = load_pandas().data['cpi'].diff().dropna()
# conditional MLE with a constant
# the constant parameter is actually the 'exact MLE' definition of the mean given above
res = ARMA(cpi).fit(order=(4,1), method='css')
# get the conditional MLE constant (params[0] is the constant term)
res.params[0] * (1 - res.arparams.sum())
This is, however, just a special case and can't work for exog != 1.
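The constant-only conversion can be checked numerically without statsmodels. The following is a minimal sketch using only NumPy: it simulates a mean plus AR(1) errors, fits the conditional-mean form by OLS, and converts the fitted constant back to a mean. The sample size, seed, and parameter values are arbitrary choices for illustration, and OLS on the lagged series stands in for a conditional estimator; it is not what `ARMA.fit(method='css')` does internally.

```python
import numpy as np

rng = np.random.default_rng(0)
n, phi, mu = 20000, 0.6, 2.0  # illustrative values, not from the thread

# Regression with AR(1) errors and constant-only exog:
#   y_t = mu + u_t,  u_t = phi * u_{t-1} + e_t
e = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = phi * u[t - 1] + e[t]
y = mu + u

# Conditional-mean form, fitted by OLS:
#   y_t = c + phi * y_{t-1} + e_t
X = np.column_stack([np.ones(n - 1), y[:-1]])
c_hat, phi_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# Going back and forth between the two constants
mean_from_c = c_hat / (1 - phi_hat)   # should be close to mu
c_from_mean = mu * (1 - phi)          # should be close to c_hat

print(c_hat, phi_hat, mean_from_c, c_from_mean)
```

With a non-constant regressor the same trick fails, because applying the AR polynomial to X_t produces lagged X terms rather than a rescaled X_t.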
I understand the difference in the models, but don't see why it should depend on conditional versus exact MLE.
The first part is a specification of the DGP, the model that describes the data generating process, the second is an estimator.
If we have different estimators, we still want to use them on the same model. Switching models when we switch estimators sounds very confusing and won't allow comparisons.
BTW, related to #754: Is Gretl also using the same model as we do now, a linear model with ARMA errors?
Re: Yes for MLE, no for conditional. For conditional, they estimate
\phi(L)y_t = x_t \beta + \theta(L) \epsilon_t
They note in their docs that they intend to allow both specifications, but that they don't yet. This is why I don't test our conditional estimates against theirs. Cf. the discussion in their user guide, section 24.2.
It probably makes more sense to keep the estimation and the DGP assumptions separate as you say. If you want what gretl is calling conditional, then we should just have a separate model. ConditionalARIMA or something.
Just adding a piece of information to this. X12ARIMA/X13ARIMA call what we estimate a regARIMA model, or a regression with ARIMA errors model. They note that the differences between the two are usually negligible, while the parameters of the regression with ARIMA errors model are easier to interpret. I'm not sure they even allow estimation of the "typical" (depending on domain, I think) ARIMAX model.
Hyndman also has a blog article arguing in favor of the regression model with ARMA errors, because of interpretability.