Hi, the following code:
```python
__author__ = 'davide'

import datetime

import numpy as np
import pandas as pd
from statsmodels.tsa.vector_ar.var_model import VAR


def compute_coeffs(series, p, index=None):
    data = pd.DataFrame(series)
    if index is None:
        d = datetime.datetime.now()
        delta = datetime.timedelta(days=1)
        index = []
        for i in range(data.shape[0]):
            index.append(d)
            d += delta
    data.index = pd.DatetimeIndex(index)
    model = VAR(data)
    res = model.fit(p)
    return res.intercept, res.coefs


if __name__ == "__main__":
    series = np.array([[2., 2.], [1., 2.], [1., 2.], [1., 2.], [1., 2.]])
    coeffs = compute_coeffs(series, 1)
```
gives me the following error:
ValueError: total size of new array must be unchanged
This is because the add_constant function in tools.py skips adding the column when there is already a constant column in the data (per the function's documentation). The problem is that I can't estimate the coefficients of a model whose data already contains a constant column. I commented out rows 290-292 in tools.py as a workaround, but I would like to know whether this is a feature or a bug. (At the very least, change the error message, or catch it and raise a more appropriate exception.)
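To see why the silent skip surfaces as an opaque size error, here is a loose numpy illustration (the actual VAR code path differs, but the mismatch is of this kind): add_constant returns the data unchanged, and downstream code that reshapes assuming an extra column then fails. Newer numpy versions word the ValueError differently than the one quoted above.

```python
import numpy as np

# The data from the report: the second column is constant.
data = np.array([[2., 2.], [1., 2.], [1., 2.], [1., 2.], [1., 2.]])

# add_constant sees the constant column and returns the 5 x 2 array
# unchanged instead of the expected 5 x 3 (constant + 2 variables).
try:
    data.reshape(5, 3)  # a downstream reshape assuming the added column
except ValueError as err:
    print(err)  # older numpy: "total size of new array must be unchanged"
```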
I was also surprised by add_constant's behavior when .predict() failed in the middle of cross-validation with ValueError: matrices are not aligned. Maybe a flag for always adding a constant?
Thanks for the report. Sorry I didn't see this before the release. I'll look in to it.
What do you think is a reasonable solution here? I'm leaning towards just raising an error about the constant already being present rather than trying to do anything fancy.
@psarka Can you elaborate a little bit on your use case. It's not clear to me why I would want to force add a perfectly collinear column to the data.
Maybe add a parameter in order to select the desired behaviour. Personally, I don't understand the rationale behind this choice...
What choice do you mean? The choice not to allow two constants in a model? Or the choice (bug) to let this pass through without raising an error?
The reason not to allow it is that the effects of the two variables are not separately identified. The way it's implemented now, you get a (scaled) split of the constant that's more or less arbitrary across two variables just as a consequence of using the SVD to solve the least squares problem. Other approaches won't work with a RHS that's not full column rank.
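The non-identification described above can be seen directly with numpy: with two identical constant regressors, the SVD-based least squares solve returns the minimum-norm solution, which splits the intercept across the two columns, and any other split would fit equally well.

```python
import numpy as np

# Two perfectly collinear constant columns.
X = np.column_stack([np.ones(5), np.ones(5)])
y = np.full(5, 2.0)

# lstsq uses the SVD and returns the minimum-norm solution, which
# splits the intercept of 2.0 equally across the two columns.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef is approximately [1.0, 1.0]; any pair summing to 2.0 fits the
# data equally well, so the individual values are not identified.
```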
@jseabold The way I ran into this problem was while doing forward cross-validation for time series. I was splitting, training and testing, and that involved adding a constant to both the training and testing sets (unless I am doing something wrong). The test set of one of the folds was rather small (6 rows) and happened to have a constant column. There is nothing wrong with having proportional columns in the test set, so in this case I'd like add_constant to add the constant without checking for collinearity.
Now that is a tiny problem, solved by two lines, but it took me some time to realize what was going on. Raising an error fixes that.
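The "always add" behavior wanted for small test folds can be sketched in a couple of lines (this helper is an illustration, not the statsmodels API):

```python
import numpy as np

def force_add_constant(x):
    """Always prepend a column of ones, regardless of whether the fold
    already happens to contain a constant column. Hypothetical helper
    sketching the behavior wanted for small cross-validation folds."""
    x = np.asarray(x)
    return np.column_stack([np.ones(x.shape[0]), x])

# A tiny 6-row test fold whose second column happens to be constant.
fold = np.array([[1., 5.], [2., 5.], [3., 5.],
                 [4., 5.], [5., 5.], [6., 5.]])
X_test = force_add_constant(fold)  # the constant is added anyway
```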
Have a look at #2093. Basically just a better error message in this case.
Looks good to me, thanks!
I would prefer a flag to disable this behaviour.
So that's what has_constant='add' will do (that is, it will always add the constant).
Yes, and has_constant='none' (or None), should not add anything to the matrix.
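The flag semantics being discussed can be sketched as follows (a hypothetical helper, not statsmodels itself; statsmodels' add_constant did eventually gain a has_constant keyword accepting 'raise', 'skip', and 'add'):

```python
import numpy as np

def add_constant_sketch(x, has_constant='skip'):
    """Sketch of the proposed flag:
    'skip'  -- current behavior: return data unchanged if a constant
               column already exists
    'add'   -- always prepend the column of ones
    'raise' -- raise instead of silently skipping
    """
    x = np.asarray(x)
    # A column is constant when its range (max - min) is zero.
    has_const_col = np.any(np.ptp(x, axis=0) == 0)
    if has_const_col:
        if has_constant == 'skip':
            return x
        if has_constant == 'raise':
            raise ValueError("data already contains a constant column")
    return np.column_stack([np.ones(x.shape[0]), x])
```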
I don't understand why you would want a function add_constant with a flag to never add a constant. Why not just not use the function?
Your code example has no direct use of add_constant anyway, so you will not be able to provide the flag.
Sorry, there was a misunderstanding. I assumed you were talking about adding a flag to the model.fit() method, because that is the source of the exception in my code.
There is already a flag to disable the behavior, pass trend='nc' to fit. I'm not sure I want to add another keyword that is there only to disable the behavior of another keyword. In general, I try to stay away from keywords that govern the behavior of other keywords. Can you just check for a constant beforehand and change the value of trend, or alternatively, you can use a try/except to catch the error.
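The "check for a constant beforehand" suggestion takes only a couple of lines (a sketch; trend='nc' is the documented VAR.fit option, but the helper name here is made up):

```python
import numpy as np

def pick_trend(data):
    """Return 'nc' (no constant) if any column of the data is already
    constant, so VAR.fit does not try to add a second one; otherwise
    keep the default 'c'. Hypothetical helper."""
    data = np.asarray(data)
    return 'nc' if np.any(np.ptp(data, axis=0) == 0) else 'c'

series = np.array([[2., 2.], [1., 2.], [1., 2.], [1., 2.], [1., 2.]])
trend = pick_trend(series)  # 'nc', because the second column is constant
# model.fit(p, trend=trend) would then avoid the add_constant error.
```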
Ok, so I guess a better error message would do the trick.
ENH: Options for behavior if constant is already present. Closes #2043.
Backport PR #2093: Add user control over what happens if a constant is already present.
Fix for #2043.