Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce ArmaModel to Pastas #198

Merged
merged 4 commits into from
Jun 9, 2020
Merged

Introduce ArmaModel to Pastas #198

merged 4 commits into from
Jun 9, 2020

Conversation

raoulcollenteur
Copy link
Member

@raoulcollenteur raoulcollenteur commented May 27, 2020

Short Description

This PR introduces an ARMA(1,1) noise model to Pastas. For now, it is only tested on regular time steps. The use of Numba is recommended to maintain speed, but without it it still works fine. The accompanying Notebook tests the new model and shows how to use it. The formulation is applicable for irregular time steps, but more research (outside this PR) is necessary to confirm its applicability for this type of data.

Checklist before PR can be merged:

  • new feature
  • is documented
  • PEP8 compliant code
  • tests added / passed
  • passes Travis
  • Example Notebook (for new features)

@raoulcollenteur raoulcollenteur added the development Indicates development of new features label May 27, 2020
@raoulcollenteur raoulcollenteur added this to the 0.15.0 milestone May 27, 2020
Don't run on travis because the Notebook depends on statsmodels.
@raoulcollenteur
Copy link
Member Author

What do you think of the name of the second parameter, theta? I am not sure about this, as we also used theta for the impulse response function.. It could be Phi, although that is typically used for the AR part of an ARMA model. Thoughts?

@mbakker7
Copy link
Collaborator

Couple comments:

  1. I wouldn't call the parameter theta. For the MA model it is called theta if it is equidistant, which it isn't here. I suggest to call it beta, so that we use greek letters for the noise model, like alpha.

  2. In both the notebook and in the noisemodels.py file, it states that this has only been tested for equidistant time series, but you showed an example where that is not the case last week (and that is why you have beta to begin with). So remove that and add a non-equidistant example in the notebook.

  3. It is easy to generate ARMA(1,1) data yourself. Why use statsmodels? That way we can test the notebook with Travis.

  4. In the example notebook, you still find some autocorrelation in the data, even though you use the 'correct' noise model. Worrisome? Can we do a test for autocorrelation? Does it pass the test?
    Or is the test not yet implemented in Pastas?

  5. In the Notebook you state that "It is important to understand that this noisemodel will only help in removing autocorrelations at the first time lag, but not at larger time lags". That is not apparent from the autocorrelation plot, which is much better for the ARMA(1,1) model than the AR1 model, also for larger time steps.

  6. From a research point of view: What if we take this dataset and use a measurement every 2 weeks. Does AR1 perform well enough? Howbout ARMA(1,1) (we don't need to check this before merging the pull request).

Nice job!

@raoulcollenteur
Copy link
Member Author

raoulcollenteur commented Jun 9, 2020

  1. I wouldn't call the parameter theta. For the MA model it is called theta if it is equidistant, which it isn't here. I suggest to call it beta, so that we use greek letters for the noise model, like alpha.

Good idea. I changed theta to beta.

  1. In both the notebook and in the noisemodels.py file, it states that this has only been tested for equidistant time series, but you showed an example where that is not the case last week (and that is why you have beta to begin with). So remove that and add a non-equidistant example in the notebook.

I added the irregular example at the end of the Notebook. However, it does not work for this as I did not apply weights yet (we should work on that for the future). This can be done in a future PR.

  1. It is easy to generate ARMA(1,1) data yourself. Why use statsmodels? That way we can test the notebook with Travis.

EDIT: just changed the method to generate ARMA(1,1) noise to a formula and commented out Statsmodels.

  1. In the example notebook, you still find some autocorrelation in the data, even though you use the 'correct' noise model. Worrisome? Can we do a test for autocorrelation? Does it pass the test Or is the test not yet implemented in Pastas?

I think you refer to the autocorrelation plot in step 6. There is only a significant autocorrelation (ACF > 95% conf. interval) at two time lags, which is not a problem. I will add a test showing this in the future when we implemented such tests in Pastas.

  1. In the Notebook you state that "It is important to understand that this noisemodel will only help in removing autocorrelations at the first time lag, but not at larger time lags". That is not apparent from the autocorrelation plot, which is much better for the ARMA(1,1) model than the AR1 model, also for larger time steps.

This is the result of an overestimation of the AR(1) parameter, trying to correct for the MA(1) part. This results in negative autocorrelation over the first few time lags. I added this explanation to the notebook.

  1. From a research point of view: What if we take this dataset and use a measurement every 2 weeks. Does AR1 perform well enough? Howbout ARMA(1,1) (we don't need to check this before merging the pull request).

I expect this is the case as long as it is an MA(1) process, but in real data it may be a MA(14) of course, so an ARMA(1,1) model may still help when using biweekly observations.

@raoulcollenteur raoulcollenteur merged commit 14d4de3 into dev Jun 9, 2020
@raoulcollenteur
Copy link
Member Author

I'll leave the branch open so we can work on developing a new version of the ArmaModel that applies weights and works with irregular time steps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Indicates development of new features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants