Introduce ArmaModel to Pastas #198

raoulcollenteur · 2020-05-27T07:31:24Z

Short Description

This PR introduces an ARMA(1,1) noise model to Pastas. For now, it is only tested on regular time steps. The use of Numba is recommended to maintain speed, but without it it still works fine. The accompanying Notebook tests the new model and shows how to use it. The formulation is applicable for irregular time steps, but more research (outside this PR) is necessary to confirm its applicability for this type of data.

Checklist before PR can be merged:

Don't run on travis because the Notebook depends on statsmodels.

raoulcollenteur · 2020-05-28T09:08:05Z

What do you think of the name of the second parameter, theta? I am not sure about this, as we also used theta for the impulse response function.. It could be Phi, although that is typically used for the AR part of an ARMA model. Thoughts?

mbakker7 · 2020-05-29T12:28:04Z

Couple comments:

I wouldn't call the parameter theta. For the MA model it is called theta if it is equidistant, which it isn't here. I suggest to call it beta, so that we use greek letters for the noise model, like alpha.
In both the notebook and in the noisemodels.py file, it states that this has only been tested for equidistant time series, but you showed an example where that is not the case last week (and that is why you have beta to begin with). So remove that and add a non-equidistant example in the notebook.
It is easy to generate ARMA(1,1) data yourself. Why use statsmodels? That way we can test the notebook with Travis.
In the example notebook, you still find some autocorrelation in the data, even though you use the 'correct' noise model. Worrisome? Can we do a test for autocorrelation? Does it pass the test?
Or is the test not yet implemented in Pastas?
In the Notebook you state that "It is important to understand that this noisemodel will only help in removing autocorrelations at the first time lag, but not at larger time lags". That is not apparent from the autocorrelation plot, which is much better for the ARMA(1,1) model than the AR1 model, also for larger time steps.
From a research point of view: What if we take this dataset and use a measurement every 2 weeks. Does AR1 perform well enough? Howbout ARMA(1,1) (we don't need to check this before merging the pull request).

Nice job!

raoulcollenteur · 2020-06-09T10:13:00Z

I wouldn't call the parameter theta. For the MA model it is called theta if it is equidistant, which it isn't here. I suggest to call it beta, so that we use greek letters for the noise model, like alpha.

Good idea. I changed theta to beta.

In both the notebook and in the noisemodels.py file, it states that this has only been tested for equidistant time series, but you showed an example where that is not the case last week (and that is why you have beta to begin with). So remove that and add a non-equidistant example in the notebook.

I added the irregular example at the end of the Notebook. However, it does not work for this as I did not apply weights yet (we should work on that for the future). This can be done in a future PR.

It is easy to generate ARMA(1,1) data yourself. Why use statsmodels? That way we can test the notebook with Travis.

EDIT: just changed the method to generate ARMA(1,1) noise to a formula and commented out Statsmodels.

In the example notebook, you still find some autocorrelation in the data, even though you use the 'correct' noise model. Worrisome? Can we do a test for autocorrelation? Does it pass the test Or is the test not yet implemented in Pastas?

I think you refer to the autocorrelation plot in step 6. There is only a significant autocorrelation (ACF > 95% conf. interval) at two time lags, which is not a problem. I will add a test showing this in the future when we implemented such tests in Pastas.

In the Notebook you state that "It is important to understand that this noisemodel will only help in removing autocorrelations at the first time lag, but not at larger time lags". That is not apparent from the autocorrelation plot, which is much better for the ARMA(1,1) model than the AR1 model, also for larger time steps.

This is the result of an overestimation of the AR(1) parameter, trying to correct for the MA(1) part. This results in negative autocorrelation over the first few time lags. I added this explanation to the notebook.

From a research point of view: What if we take this dataset and use a measurement every 2 weeks. Does AR1 perform well enough? Howbout ARMA(1,1) (we don't need to check this before merging the pull request).

I expect this is the case as long as it is an MA(1) process, but in real data it may be a MA(14) of course, so an ARMA(1,1) model may still help when using biweekly observations.

raoulcollenteur · 2020-06-09T15:41:46Z

I'll leave the branch open so we can work on developing a new version of the ArmaModel that applies weights and works with irregular time steps.

Introduce ArmaModel in Pastas

c64431d

raoulcollenteur added the development Indicates development of new features label May 27, 2020

raoulcollenteur added this to the 0.15.0 milestone May 27, 2020

raoulcollenteur requested review from mbakker7 and dbrakenhoff May 27, 2020 07:31

Add test and don't run notebook on Travis

575a9d2

Don't run on travis because the Notebook depends on statsmodels.

dbrakenhoff approved these changes May 27, 2020

View reviewed changes

change theta to beta

46ff758

Generate ARMA(1,1) noise without statsmodels and test notebook

db79cc1

raoulcollenteur merged commit 14d4de3 into dev Jun 9, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce ArmaModel to Pastas #198

Introduce ArmaModel to Pastas #198

raoulcollenteur commented May 27, 2020 •

edited

raoulcollenteur commented May 28, 2020

mbakker7 commented May 29, 2020

raoulcollenteur commented Jun 9, 2020 •

edited

raoulcollenteur commented Jun 9, 2020

Introduce ArmaModel to Pastas #198

Introduce ArmaModel to Pastas #198

Conversation

raoulcollenteur commented May 27, 2020 • edited

Short Description

Checklist before PR can be merged:

raoulcollenteur commented May 28, 2020

mbakker7 commented May 29, 2020

raoulcollenteur commented Jun 9, 2020 • edited

raoulcollenteur commented Jun 9, 2020

raoulcollenteur commented May 27, 2020 •

edited

raoulcollenteur commented Jun 9, 2020 •

edited