Conversation

NathanielF
Contributor

Forecasting AR Structural Timeseries models

Adding this pull request for the open issue: #450
The notebook has the broad structure of the related blog post and I'm happy to take any feedback or suggestions on streamlining it. I believe it follows the Jupyter style advice accurately.

It seems to be passing the pre-commit checks locally. I found it a bit frustrating to get jupytext to pass.
image

… notebook

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@drbenvincent
Contributor

Thanks so much for the contribution so far!

Just set the remote checks running. Will try to review over the weekend, but might drift into early next week.

@NathanielF
Contributor Author

NathanielF commented Oct 22, 2022

Thanks. Will have a look at that failing check this evening. Think it is failing on the codespell checks which were skipped for some reason when I ran it locally.

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF
Contributor Author

Tried to address the failing codespell check. Seemed to work but not entirely sure.

image

Member

@lucianopaz lucianopaz left a comment

@NathanielF, first of all, thank you for submitting such an awesome example. It touches on a subject that has been completely neglected in the past pymc documentation.
However, there is a crucial subtlety about time series forecasting that was missed in your notebook: the future AR values are conditionally dependent on the past AR values that had been learnt. With the approach you took, setting the data of the AR means you are resampling the past AR values rather than drawing them from their learnt posterior. The resampled AR will still be conditioned on the learnt coefficients, but the exact past AR values that are in the posterior will be ignored. I mentioned a way to work around this problem here, and maybe @ricardoV94 can share a gist he wrote that concatenates two random variables to do predictions without resampling what was learnt for the past values.
I would be very happy to help you out in fixing the forecasting, so let me know if my comments were clear or if you need more tips to get this to work.
As I said before and in one of my code comments, this has been lacking in the pymc documentation for a long time, and your contribution is a perfect opportunity to make things right!
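To make the distinction concrete, here is a minimal plain-Python sketch (not the notebook's PyMC code). It assumes a toy AR(1) process, and the posterior draws, the `simulate_ar1` helper, and all numbers are hypothetical. It compares a forecast that continues from each draw's learnt last value with one that keeps the learnt coefficient but restarts from a fresh initial value, which is a simplified stand-in for resampling the past.

```python
import random
import statistics


def simulate_ar1(rho, sigma, start, steps, rng):
    """Simulate `steps` future values of an AR(1) process starting from `start`."""
    values = []
    prev = start
    for _ in range(steps):
        prev = rho * prev + rng.gauss(0.0, sigma)
        values.append(prev)
    return values


rng = random.Random(0)

# Hypothetical posterior draws: each draw pairs a coefficient with the
# latent AR value at the last training time step.
posterior_draws = [
    {"rho": 0.9 + rng.uniform(-0.02, 0.02), "last_ar": 5.0 + rng.gauss(0.0, 0.1)}
    for _ in range(500)
]

sigma = 0.1  # innovation scale, assumed known for simplicity

# Conditional forecast: continue each draw from its own learnt last value.
conditional_first_step = [
    simulate_ar1(d["rho"], sigma, d["last_ar"], steps=1, rng=rng)[0]
    for d in posterior_draws
]

# Resampled forecast: reuse the learnt coefficient but restart the process
# from a fresh initial value (0 here), ignoring the learnt past.
resampled_first_step = [
    simulate_ar1(d["rho"], sigma, 0.0, steps=1, rng=rng)[0]
    for d in posterior_draws
]

mean_conditional = statistics.mean(conditional_first_step)
mean_resampled = statistics.mean(resampled_first_step)
print(f"conditional: {mean_conditional:.2f}, resampled: {mean_resampled:.2f}")
```

In this toy setup the conditional forecast stays close to where the learnt series ended, while the restarted one sits near zero even though both use the same coefficients.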

```{code-cell} ipython3
az.plot_trace(idata_ar, figsize=(10, 6), kind="rank_vlines");
```

Member

I think that it would be valuable to also plot the learnt latent AR variable over time here. You can do something like:

idata_ar.posterior.ar.mean(["chain", "draw"]).plot()

or also take advantage of arviz.hdi
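For readers without the fitted `idata_ar` at hand, the kind of per-time-step summary being suggested can be sketched in plain Python. This is only an illustration under stated assumptions: the `draws` values are made up, and the `hdi` helper below is a simple narrowest-window stand-in, not ArviZ's implementation.

```python
import math
import statistics


def hdi(samples, prob=0.94):
    """Narrowest interval containing a `prob` fraction of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(prob * n))  # number of samples inside the interval
    best_lo, best_hi = ordered[0], ordered[window - 1]
    for start in range(1, n - window + 1):
        lo, hi = ordered[start], ordered[start + window - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Hypothetical posterior samples of the latent AR: draws[i][t] is the value
# of draw i at time step t (chains and draws already flattened together).
draws = [
    [0.0, 1.0, 2.0],
    [0.2, 1.1, 2.4],
    [0.1, 0.9, 2.2],
    [0.3, 1.2, 2.1],
    [5.0, 1.0, 2.3],  # one outlying draw at t=0
]

per_time_step = list(zip(*draws))  # transpose to time-major
summary = [(statistics.mean(col), hdi(col, prob=0.8)) for col in per_time_step]
for t, (mean, (lo, hi)) in enumerate(summary):
    print(f"t={t}: mean={mean:.2f}, 80% HDI=({lo:.2f}, {hi:.2f})")
```

Plotting the mean as a line and the interval as a shaded band over time gives the view of the learnt latent AR described above.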

Contributor Author

I've added a plot here, but were you thinking I should add some exposition as well?

Member

I think that it looks nice enough not to require more explanation. Maybe @drbenvincent agrees?

NathanielF and others added 6 commits October 22, 2022 17:36
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
… to be conditional on learned posterior

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
…the predict step pattern

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
…some more text

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF NathanielF requested review from lucianopaz and removed request for drbenvincent October 25, 2022 09:35
@NathanielF
Contributor Author

NathanielF commented Oct 25, 2022

Ooops, I didn't mean to remove the request for review from @drbenvincent.

Also, didn't mean to rush you @lucianopaz. Just wondering if the "request re-review" button is the right etiquette, or would you already have seen the changes I've made since you requested them? Don't mean to put pressure on, just meant to signal that I think I've addressed the above.

@drbenvincent drbenvincent self-requested a review October 25, 2022 10:52
Member

@lucianopaz lucianopaz left a comment

This looks way better now! I've added a bunch of comments. After you iterate through those, I think that the only things that might be left are a bunch of minor stylistic best practices, and potentially also run black and isort on the notebook to have it finally ready.

@NathanielF
Contributor Author

Thanks so much for your feedback on this @lucianopaz. I will adapt the notebook as discussed above and review it for grammar and tone. Hopefully I'll push some changes later today.

…model and improved plot labels.

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF NathanielF requested review from lucianopaz and removed request for drbenvincent October 28, 2022 13:44
@NathanielF
Contributor Author

Ah, sorry @drbenvincent, I did the same thing again. No idea why it knocked you out when I requested a review.

@NathanielF
Contributor Author

Just giving this a slight nudge @lucianopaz , @drbenvincent. Think it's nearly there and it'd be cool if we could get it over the line this week?

@drbenvincent drbenvincent self-requested a review November 8, 2022 09:56
Contributor

@drbenvincent drbenvincent left a comment

Hi @NathanielF. Sorry about the delay on the review. I've been bogged down with client work.

I think this is excellent and will make a great addition.

I've added a bunch of comments. Feel free to ask if any clarification is needed.

PS. After these changes I'm happy to approve. That said, I'm not a time series expert, but as long as @lucianopaz is happy then I'm confident we can approve this soon.

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF
Contributor Author

Thanks again for your time on this @drbenvincent, I know we're all busy. Happy to wait for @lucianopaz to give the above another look.

I think I've addressed all your comments and his prior comments. In the last commit I've made the model diagram neater for all models. I've also stressed that we're adding structural components not for arbitrary reasons but because real-world data tends to have multiple influences, and Bayesian structural time-series modelling is one way of capturing these multi-faceted data generating processes.

For @lucianopaz's comments above, the main change was to revert to using the coefs pattern rather than having priors for two individual coefficients separately.

@drbenvincent
Contributor

Just saw this in a quick look now...
In the very final plot, the predicted mean seems not matched up with the shaded regions. Maybe worth a double check.

Screenshot 2022-11-08 at 13 09 04

@NathanielF
Contributor Author

@drbenvincent do you mean that at the tail end of the prediction it seems to come out of phase? Or just the degree of oscillation? If I change the prior of the beta_fourier terms I get slightly more pronounced oscillation:

image

I wasn't too concerned about it. The point of the plot was to just show that we can recover the seasonality pattern. I think it does that...

I think it's too much of a rabbit hole to go down, to try and figure out if the percentile color-gradient technique is getting the color map banding exactly right. To my mind the color mapping is there just to suggest that the probabilistic outcomes come with a graded range of plausibility...

@drbenvincent
Contributor

drbenvincent commented Nov 8, 2022

It was the phase that I noticed. Thought I'd mention it in case of anything obvious.

@NathanielF
Contributor Author

Right, yeah... honestly not sure why that is happening.

@NathanielF
Contributor Author

I added more samples to the plot, extended the prediction period, and allowed a wider sigma on the Fourier terms. Still slight phasing visible, but it doesn't seem to be a preface to doom. I don't think it's anything to worry about.

image

Member

@lucianopaz lucianopaz left a comment

@NathanielF, thanks again for all of your amazing work! And thanks for the nudge, I had completely lost track of this PR...
I've reviewed it a bit deeper to find out why the forecast plots showed strange behavior. I found 3 issues:

  1. The AR distribution you use for forecasting starts from the last time point of the training time. But then you combine it with trend and seasonality that are one time step into the future with respect to it. To fix this, I added an extra coordinate to the model that adds an extra step to the AR so that it has the last time step (used for the init), and all of the time points into the future that must be forecasted.
  2. The Fourier features for the forecast with seasonality were not starting from the last time step, but from 0. This added a phase difference that appeared as a sort of discontinuity/inflection between the forecast and the training period.
  3. The plots for the cyan lines of the forecast were using the correct future observed time steps, but the shaded areas were using a linspace, so they didn't match. Visually, this appeared like a sort of phase lag between the cyan line and the shaded areas.
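The three fixes above can be sketched in plain Python. This is only an illustration of the indexing, not the notebook's code: the `fourier_features` helper, the sizes, the period, and the placeholder AR values are all hypothetical.

```python
import math


def fourier_features(time_steps, period, n_order):
    """Sine and cosine features for each time step, up to `n_order` harmonics."""
    rows = []
    for t in time_steps:
        row = []
        for k in range(1, n_order + 1):
            angle = 2.0 * math.pi * k * t / period
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows


n_train = 50       # number of observed time steps (hypothetical)
n_forecast = 10    # forecast horizon (hypothetical)
period = 7         # seasonal period (hypothetical)

# Issue 2: future steps continue where training stopped instead of
# restarting at 0, so the seasonal phase carries over.
future_steps = list(range(n_train, n_train + n_forecast))
features_wrong = fourier_features(range(n_forecast), period, n_order=2)
features_right = fourier_features(future_steps, period, n_order=2)

# Issue 1: the AR forecast carries one extra leading step, the last training
# step, which is only used to initialise the recursion.
ar_steps = [n_train - 1] + future_steps
ar_forecast = [0.1 * t for t in ar_steps]  # placeholder values, one per AR step
ar_aligned = ar_forecast[1:]  # drop the init step to match trend and seasonality

# Issue 3: the mean line and the shaded bands should share the same x values.
x_for_plot = future_steps
```

Note that if the training length happened to be an exact multiple of the period, restarting the Fourier features at 0 would give the same phase by coincidence, which can hide this kind of bug.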

After addressing these issues locally, the last plot looks like this:

image

@NathanielF
Contributor Author

These are great observations @lucianopaz! Thanks for digging into it. I'll adjust these and re-push today.

NathanielF and others added 7 commits November 9, 2022 10:24
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
…st.md

Co-authored-by: Luciano Paz <luciano.paz.neuro@gmail.com>
…t function and adjusted prediction step AR logic

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF
Contributor Author

Thanks so much again @lucianopaz, that last observation about the plotting function was subtle. I've been looking at the function too long to have seen it! I've adjusted the notebook in the manner you suggested and indeed was able to recover a prediction plot without the phasing issues:

image

I added a small note about the AR logic in the comments to the code:

image

Do you think this is sufficient or should I add anything else?

Member

@lucianopaz lucianopaz left a comment

Thanks @NathanielF. It's almost ready. I just have two very minor nitpicks before approving

…prediction for mean and removed redundant argument from plot_fits function

Signed-off-by: Nathaniel <NathanielF@users.noreply.github.com>
@NathanielF
Contributor Author

Thanks @lucianopaz, I fixed those last two issues. It was really great working on this issue with yourself and @drbenvincent. I learned a tonne!

Member

@lucianopaz lucianopaz left a comment

Great work @NathanielF! Thank you so much for contributing this!

@NathanielF
Contributor Author

Fantastic. This has been a great experience. I hope to follow it up shortly with another pull request on the Bayesian VAR models. Thanks again for your time on this!

@lucianopaz lucianopaz merged commit d6fcd3d into pymc-devs:main Nov 9, 2022

3 participants