VAR model's AIC, BIC and Log Likelihood are non-deterministic #115

Closed
Alechan opened this issue Oct 3, 2017 · 5 comments

Comments

@Alechan

Alechan commented Oct 3, 2017

Example code:

file issuegithub.py

import numpy as np
import pyflux as pf
import pandas as pd

df = pd.DataFrame(np.linspace(0,100,1000), columns=list('A'))
# Model the data using VAR modeling
model = pf.VAR(data=df, lags=2, integ=1)
# Fit the model
x = model.fit()
# Print the aic
print(x.aic)

Runs:

$ python issuegithub.py
651.7090301513672
$ python issuegithub.py
636.0988006591797
$ python issuegithub.py
-16537352142428.482

I've narrowed it down to a call to var_likelihood, that is defined in pyflux/var/var_recursions.cpython-36m-x86_64-linux-gnu.so.

It's a compiled .so file, so I can't step through it, but printing the result of the call to that function shows that it returns something different on each run.
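A minimal sketch of how the run-to-run comparison could be automated instead of eyeballing printed values. The helper names `fit_aic` and `is_repeatable` are hypothetical and not part of pyflux; the pyflux calls inside `fit_aic` are copied from the example above, and the import is done lazily so the comparison helper can be used on its own.

```python
import math


def fit_aic():
    """Fit the VAR model from the report once and return its AIC.

    pyflux is imported lazily so the helper below works without it.
    """
    import numpy as np
    import pandas as pd
    import pyflux as pf

    df = pd.DataFrame(np.linspace(0, 100, 1000), columns=list('A'))
    model = pf.VAR(data=df, lags=2, integ=1)
    return model.fit().aic


def is_repeatable(values, rel_tol=1e-9):
    """Return True if every value matches the first one within rel_tol."""
    first = values[0]
    return all(math.isclose(v, first, rel_tol=rel_tol) for v in values)


# The three AIC values printed in the runs above.
reported = [651.7090301513672, 636.0988006591797, -16537352142428.482]
print(is_repeatable(reported))
```

With pyflux installed, `is_repeatable([fit_aic() for _ in range(5)])` would repeat the fit inside one process. The runs above were separate processes, so the in-process behaviour may or may not match.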

@Alechan
Author

Alechan commented Oct 5, 2017

Edited the original comment because it used a random dataframe. The dataframe is now fixed, so every run sees the same data.

@dioh

dioh commented Oct 5, 2017

We have found a workaround that gives consistent metric values. We traced the error to the var_recursion method that is used to calculate the negative likelihood. The code is in Cython, so we couldn't debug it. When we replaced the implementation with a pure Python one, we started getting consistent metrics.

We basically added to the VAR class the following instance method:

    def var_likelihood(self, ll1, mu_shape, diff, inverse):
        # Sum the quadratic form diff[t]' * inverse * diff[t] over all time steps
        ll2 = 0.0
        for t in range(mu_shape):
            ll2 += np.dot(np.dot(diff[t].T, inverse), diff[t])

        return -(ll1 - 0.5 * ll2)

What do you think the issue may be? An old version of the library, maybe?
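For reference, the loop in the pure Python method above can also be written as a single `np.einsum` call, which makes it easy to cross-check results. This is only a sketch under the assumption that `diff` is a 2-D array of shape (T, k) with one residual vector per row and that `inverse` is a (k, k) matrix; the function names here are hypothetical and not part of pyflux.

```python
import numpy as np


def var_likelihood_loop(ll1, mu_shape, diff, inverse):
    """Loop version, same arithmetic as the workaround method."""
    ll2 = 0.0
    for t in range(mu_shape):
        ll2 += np.dot(np.dot(diff[t].T, inverse), diff[t])
    return -(ll1 - 0.5 * ll2)


def var_likelihood_vectorized(ll1, diff, inverse):
    """Same quantity, summing diff[t]' * inverse * diff[t] over t in one call."""
    ll2 = np.einsum("ti,ij,tj->", diff, inverse, diff)
    return -(ll1 - 0.5 * ll2)


# Small synthetic check with a fixed seed (assumed shapes: T=50, k=3).
rng = np.random.default_rng(0)
diff = rng.normal(size=(50, 3))
inverse = np.array([[2.0, 0.1, 0.0],
                    [0.1, 1.5, 0.2],
                    [0.0, 0.2, 1.0]])
ll1 = -12.5

loop_value = var_likelihood_loop(ll1, diff.shape[0], diff, inverse)
vec_value = var_likelihood_vectorized(ll1, diff, inverse)
print(np.isclose(loop_value, vec_value))
```

Both versions are deterministic for fixed inputs, so either can serve as a reference when checking what the compiled function returns.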

@RJT1990
Owner

RJT1990 commented Nov 21, 2017

Hi Alechan - I am trying this out now to verify.

@RJT1990
Owner

RJT1990 commented Nov 21, 2017

Hi Alechan - I have a number of problems here with this query. With your example, I get the error: LinAlgError("Singular matrix") upon initialization.

Additionally, I am trying with the base VAR example here - http://pyflux.readthedocs.io/en/latest/var.html - and cannot replicate the non-deterministic AIC response.

This may be an issue with an old version of the library (or its dependencies). I would recommend upgrading and then reporting back if the issue persists.
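One plausible reading of the `Singular matrix` error reported above (an assumption, not something confirmed in this thread) is that the example data is an exact linear ramp, so its first difference is constant and every lagged column is a multiple of the intercept column. The sketch below illustrates that with NumPy only. How pyflux actually builds its regressors is not shown here, so the design matrix `Z` is hypothetical, and `integ=1` is assumed to mean first differencing.

```python
import numpy as np

# Same series as in the report: an exact linear ramp.
y = np.linspace(0, 100, 1000)

# Assumed meaning of integ=1: difference the series once.
# For a ramp, the result is constant up to floating-point rounding.
d = np.diff(y)
spread = d.max() - d.min()

# Hypothetical design matrix for a 2-lag regression with an intercept:
# columns are [1, d[t-1], d[t-2]].
Z = np.column_stack([np.ones(len(d) - 2), d[1:-1], d[:-2]])

# With a tolerance well above rounding noise, only one column is independent.
rank = np.linalg.matrix_rank(Z, tol=1e-8)
print(spread, rank)
```

Whether this rank deficiency is related to the varying AIC values is not established in this thread.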

@RJT1990
Owner

RJT1990 commented Dec 6, 2017

Closing.

RJT1990 closed this as completed Dec 6, 2017