changed ar1 logp to use ar1 precision instead of innovation precision #3899

ljmartin · 2020-05-01T01:08:42Z

Addresses #3892

The previous logp method for the AR1 model used the precision of the innovation term as the precision for the first observation. Assuming the AR(1) process was initiated in the infinite past, the first observation should be modelled as coming from anywhere in the dispersion of the whole AR(1) process.

I made a previous PR (#3897) which I have now closed. It had duplicated code since I now realise that tau_e (the innovation precision) and tau (the AR(1) precision) had already been defined, and the fix just needed a switch from tau_e to tau for the likelihood of the first observation.
cheers

AlexAndorra

Hi,
Thanks for reporting and solving the issue!

The changes look good to me. Can you however fix the tests on AR1 please? This seems to break them, as indicated by Travis CI.

Also, this may be a trivial question, but can there be cases when the data was initiated at zero, not in the infinite past? Would the changes implemented here still be appropriate or should we also rely on the previous implementation?

ljmartin · 2020-05-02T02:58:17Z

Thanks for checking it @AlexAndorra . I'm not experienced with github and/or travis but I took a look, and it appears to me the unit test is asking if the logp function using pymc3/theano is the same as if it were calculated in scipy, is this right?

In that case, the reason it was passing before is because the unit test also used the incorrect specification of the likelihood for the first observation. Now they disagree - is it OK to change the test in order to make it pass? i.e.

from:

def AR1_logpdf(value, k, tau_e):
    return (sp.norm(loc=0, scale=1/np.sqrt(tau_e)).logpdf(value[0]) +
            sp.norm(loc=k*value[:-1], scale=1/np.sqrt(tau_e)).logpdf(value[1:]).sum())

to:

def AR1_logpdf(value, k, tau, tau_e):
    return (sp.norm(loc=0, scale=1/np.sqrt(tau)).logpdf(value[0]) +
            sp.norm(loc=k*value[:-1], scale=1/np.sqrt(tau_e)).logpdf(value[1:]).sum())

this would also require changing:

    @pytest.mark.parametrize('n', [2, 3, 4])
    def test_AR1(self, n):
        self.pymc3_matches_scipy(AR1, Vector(R, n), {'k': Unit, 'tau_e': Rplus}, AR1_logpdf)

to:

    @pytest.mark.parametrize('n', [2, 3, 4]) #does this need to be changed?
    def test_AR1(self, n):
        self.pymc3_matches_scipy(AR1, Vector(R, n), {'k': Unit, 'tau': Rplus, 'tau_e': Rplus}, AR1_logpdf)

As for time series starting at zero, as I understand it this would violate the stationarity assumption (that the mean and variance are the same across time) of an AR1 process. It's a little bit confusing because in the tutorial, infinite past is assumed, but the time series also starts at zero which is in conflict with that:
https://docs.pymc.io/notebooks/AR.html

Thanks!

AlexAndorra · 2020-05-02T10:46:28Z

The changes you laid out for the tests look sound to me (the goal is indeed to change the test so that it reflects the new AR1 parametrization and, hence, so that it passes) 👌

Good point about the example NB: I think it should be corrected to start the time series in the infinite past too and use the new AR1 -- do you feel like doing this? It would make the PR complete 😃

Also, don't forget to add a line about this PR in the release notes, under maintenance.

ljmartin · 2020-05-04T05:21:31Z

OK thanks for the advice, I'll do those things this week :)

AlexAndorra · 2020-05-04T20:04:58Z

Thank you! And tell me if you have any questions

AlexAndorra · 2020-05-11T14:48:26Z

@ljmartin, did you find some time to look into that? Do you need some help?

ljmartin · 2020-05-12T07:43:57Z

thanks for checking in @AlexAndorra, had a busy week. Will it update the PR if I just commit to my PR-branch of the forked pymc3? I just did that to fix the tests - Ill check back in a few minutes to see if they pass. And also fix up the example NB!

AlexAndorra · 2020-05-12T08:08:43Z

Thanks a lot @ljmartin ! Yeah, it'll update the PR and relauch the tests automatically if you push to your branch.
I'll review your changes when you've pushed all the changes (tests, notebook and release notes) 👌

pymc3/distributions/timeseries.py

Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>

codecov · 2020-05-12T09:18:08Z

Codecov Report

❗ No coverage uploaded for pull request base (master@7a50430). Click here to learn what that means.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##             master    #3899   +/-   ##
=========================================
  Coverage          ?   86.39%           
=========================================
  Files             ?       86           
  Lines             ?    13723           
  Branches          ?        0           
=========================================
  Hits              ?    11856           
  Misses            ?     1867           
  Partials          ?        0

Impacted Files	Coverage Δ
pymc3/distributions/timeseries.py	`71.20% <100.00%> (ø)`
pymc3/tuning/__init__.py	`100.00% <0.00%> (ø)`
pymc3/distributions/mixture.py	`87.27% <0.00%> (ø)`
pymc3/distributions/__init__.py	`100.00% <0.00%> (ø)`
pymc3/model_graph.py	`88.48% <0.00%> (ø)`
pymc3/distributions/discrete.py	`74.65% <0.00%> (ø)`
pymc3/variational/__init__.py	`100.00% <0.00%> (ø)`
pymc3/step_methods/hmc/base_hmc.py	`95.32% <0.00%> (ø)`
pymc3/distributions/multivariate.py	`78.13% <0.00%> (ø)`
pymc3/tuning/starting.py	`80.91% <0.00%> (ø)`
... and 77 more

AlexAndorra

All good on the tests part now, thanks 👍
For the NB, it's a little hidden in the source code, so here's the link. You'll need to create a new jupyter kernel with your PyMC develop in the venv (tell me if you have questions on how to do that).
Also, at the very end of the NB, can you add a last cell with these commands (after doing pip install watermark):

%load_ext watermark
%watermark -n -u -v -iv -w

And finally Black-format the NB with pip install black_nbconvert and then black_nbconvert /path/to/a/notebook.ipynb on the command line -- this will reformat the NB inplace automatically 😉
Thanks a lot in advance!

ljmartin · 2020-05-12T11:14:55Z

Okey doke, I pip-installed the PR-branch into a fresh conda environment in order to run and edit the notebook. While editing the notebook, I got the Mass matrix contains zeros on the diagonal. error. I thought it was my additions, but now I dont think so:

If I pip-install the development version of pymc3 (in a second conda env, using pip install git+https://github.com/pymc-devs/pymc3) I get the same thing - i.e. it fails with the mass matrix error.
In constrast, the stable version of pymc3 (i.e. installed by conda install -c conda-forge pymc3 in a third env) runs fine.

Can anyone try running this using the latest development version, and see if the sampling completes? It's only short - if it fails then it does so within a few seconds. To reiterate, it runs fine on the stable version of pymc3 so I don't think the model is misspecified.

import pymc3 as pm
import numpy as np
import arviz as az

print('Running on PyMC3 v{}'.format(pm.__version__))

np.random.seed(seed=42)

T = 10000
y = np.zeros((T,))
theta = 0.95

for i in range(1,T):
    y[i] = theta * y[i-1] + np.random.normal()

with pm.Model() as ar1:
    theta_ = pm.Uniform('theta',-1,1) # we assume process is stationary, so -1<k_<1
    tau_ = pm.Gamma('tau',mu=1,sd=1) # the variance of the innovation term
    center = pm.Normal('center', mu=y.mean(), sigma=5) # prior for the processs mean initially centred on the population mean
    likelihood = pm.AR1('likelihood', k=theta_, tau_e=tau_, observed=y-center)

with ar1:
    trace = pm.sample(1000, chains=2)

AlexAndorra · 2020-05-12T11:33:18Z

Yeah it looks like another example of this issue. In short, it looks like the jitter+adapt_diag initialization of the MCMC got more unstable on the master branch than on 3.8 -- TBH, this is above my league for now and I don't know where it comes from and how to fix that.

The ways forward I usually use:

Trying with pm.sample(init="adapt_diag") (note that sample defaults to 1000 tuning samples and 1000 draws now on the master branch).
Using more informative priors. Here theta_ uses a Uniform distribution; this is usually not a good idea, even when the model samples.
Passing testvals to sensitive parameters that could throw NUTS off.

Tell me if this helps 😉

ljmartin · 2020-05-12T22:48:42Z

Yes looks like the same error. Unfortunately I couldn't get that little script to work using the tips you suggested. The only thing that works is moving to Metropolis MCMC i.e.

import pymc3 as pm
import numpy as np
import arviz as az

print('Running on PyMC3 v{}'.format(pm.__version__))

np.random.seed(seed=42)

T = 10000
y = np.zeros((T,))
theta = 0.95

for i in range(1,T):
    y[i] = theta * y[i-1] + np.random.normal()

with pm.Model() as ar1:
    theta_ = pm.Uniform('theta',-1,1) # we assume process is stationary, so -1<k_<1D
    tau_ = pm.Gamma('tau',mu=1,sd=1) # the variance of the innovation term
    center = pm.Normal('center', mu=y.mean(), sigma=5) # prior for the processs mean initially centred on the population mean
    likelihood = pm.AR1('likelihood', k=theta_, tau_e=tau_, observed=y-center)

with ar1:
    step = pm.Metropolis()
    trace = pm.sample(5000, step)
    #trace = pm.sample(init="adapt_diag")

Using Metropolis the results are good as expected.

What to do? Shall I wait for the kinks in NUTS to be ironed out? Doesn't really make sense to suggest using Metropolis in the example notebooks

AlexAndorra · 2020-05-13T07:16:13Z

I agree, it doesn't seem like a good idea to suggest Metropolis. I'll check this out locally and keep you posted 😉

AlexAndorra · 2020-05-13T09:13:34Z

Ok @ljmartin, I managed to sample from your model with NUTS, by changing the init method and using more regularizing priors. The following model should work and allow you to update the NB:

with pm.Model() as ar1:
    theta_ = pm.Normal('theta', 0., 0.5) # we assume process is stationary, so -1 < k_ < 1
    tau_ = pm.Exponential('tau', 0.5) # the variance of the innovation term
    center = pm.Normal('center', mu=y.mean(), sigma=2.5) # prior for the processs mean initially centred on the population mean
    
    likelihood = pm.AR1('likelihood', k=theta_, tau_e=tau_, observed=y - center)
    
    trace = pm.sample(init="adapt_diag", random_seed=42)
    idata = az.from_pymc3(trace)
az.plot_trace(idata);

Let me know how it goes 😉

ljmartin · 2020-05-15T11:12:24Z

Thanks for the code! I merged a new version of the notebook with the watermarking and black_nbconvert formatting, as well as adding a line to the RELEASE-NOTES.md. Something went weird because there was a conflict. I'm not sure how to resolve that - I can do it if you let me know how or someone else with write access can.

AlexAndorra · 2020-05-15T12:09:29Z

Thanks @ljmartin, reviewing ASAP 😉

AlexAndorra

Thanks a lot @ljmartin, this is almost ready to merge 🎉
I just left a last batch of comments (promise 😉 ). Tell me if you have any questions!

RELEASE-NOTES.md

docs/source/notebooks/AR.ipynb

ljmartin · 2020-05-25T06:39:56Z

Hey @AlexAndorra sorry for the wait, been a busy time! Ill work on this tonight. If I 'git pull' from my PR branch, will it sync that branch with the dev version of pymc3, and avoid conflicts? Would that overwrite the changes made so far?

AlexAndorra · 2020-05-25T09:14:53Z

Hi @ljmartin, and thanks for committing to this 🙏
I don't see any conflicts with master on this branch, so I think you can just pick up where you left off -- I don't foresee any conflicts when you'll push, as you're just modifying the NB now.

If you do want to sync with the base repository's master branch though, then, on your dev branch, after committing your changes locally, you can do:

$ git fetch upstream
$ git rebase upstream/master

Then push the changes to your remote branch as usual.

AlexAndorra · 2020-05-28T18:28:34Z

Hi @ljmartin ! Did you manage in the end, or got any trouble?

AlexAndorra · 2020-06-08T19:55:59Z

Hi @ljmartin! Do you think you'll have time to make the last changes in the coming days? Otherwise, do you mind if I take this over?
We're on the last pushes for the 3.9.0 release and that'd be nice to include this 😉

twiecki · 2020-06-09T08:47:53Z

@AlexAndorra If you have time and drive you can just push this over the finish line.

AlexAndorra · 2020-06-09T20:36:45Z

Busy with #3551 for now, but I added it on my OSS to-do list for this week @twiecki 😉
I'll probably have time to deal with it at the end of the week.

michaelosthege

Black doing its thing always makes it hard to review... But the tests are green, so...

AlexAndorra · 2020-06-11T19:19:14Z

I didn't modify anything in the code files (except Blackify them) and had already reviewed and approved them, so they should be ok 😉 Most of the work was on the NB.

changed ar1 logp to use ar1 precision instead of innovation precision

50e9d13

AlexAndorra requested changes May 1, 2020

View reviewed changes

twiecki assigned AlexAndorra May 3, 2020

fixed tests for AR1

7105b49

making AR1 scipy test behave like pymc3 version

caf7181

twiecki reviewed May 12, 2020

View reviewed changes

pymc3/distributions/timeseries.py Outdated Show resolved Hide resolved

twiecki reviewed May 12, 2020

View reviewed changes

pymc3/distributions/timeseries.py Outdated Show resolved Hide resolved

ljmartin and others added 2 commits May 12, 2020 18:54

Update pymc3/distributions/timeseries.py

18c207b

Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>

Update pymc3/distributions/timeseries.py

3ed9a50

Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>

AlexAndorra reviewed May 12, 2020

View reviewed changes

AlexAndorra added this to the 3.9.0 milestone May 13, 2020

AlexAndorra added the bug label May 13, 2020

AlexAndorra linked an issue May 13, 2020 that may be closed by this pull request

AR(1) logp potentially using wrong variance #3892

Closed

ljmartin added 2 commits May 15, 2020 19:54

updated AR notebook to use new AR1 model

d18f17c

updating timeseries AR1

e6a328f

Merge branch 'master' into fix-ar1-precision-typo

e7d0d41

AlexAndorra requested changes May 15, 2020

View reviewed changes

RELEASE-NOTES.md Outdated Show resolved Hide resolved

AlexAndorra reviewed May 15, 2020

View reviewed changes

AlexAndorra added 5 commits June 11, 2020 12:28

Fixed release notes

bdceac6

Black

981a449

Updated AR NB

e6e49e5

Blackified AR NB

a75e681

Merge branch 'master' into fix-ar1-precision-typo

3b604cb

michaelosthege approved these changes Jun 11, 2020

View reviewed changes

AlexAndorra approved these changes Jun 11, 2020

View reviewed changes

AlexAndorra merged commit bca5b88 into pymc-devs:master Jun 11, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

changed ar1 logp to use ar1 precision instead of innovation precision #3899

changed ar1 logp to use ar1 precision instead of innovation precision #3899

ljmartin commented May 1, 2020

AlexAndorra left a comment •

edited

Loading

ljmartin commented May 2, 2020 •

edited

Loading

AlexAndorra commented May 2, 2020 •

edited

Loading

ljmartin commented May 4, 2020

AlexAndorra commented May 4, 2020

AlexAndorra commented May 11, 2020

ljmartin commented May 12, 2020

AlexAndorra commented May 12, 2020

codecov bot commented May 12, 2020 •

edited

Loading

AlexAndorra left a comment

ljmartin commented May 12, 2020 •

edited

Loading

AlexAndorra commented May 12, 2020 •

edited

Loading

ljmartin commented May 12, 2020 •

edited

Loading

AlexAndorra commented May 13, 2020

AlexAndorra commented May 13, 2020

ljmartin commented May 15, 2020 •

edited

Loading

AlexAndorra commented May 15, 2020

AlexAndorra left a comment

ljmartin commented May 25, 2020

AlexAndorra commented May 25, 2020

AlexAndorra commented May 28, 2020

AlexAndorra commented Jun 8, 2020

twiecki commented Jun 9, 2020

AlexAndorra commented Jun 9, 2020

michaelosthege left a comment

AlexAndorra commented Jun 11, 2020

changed ar1 logp to use ar1 precision instead of innovation precision #3899

changed ar1 logp to use ar1 precision instead of innovation precision #3899

Conversation

ljmartin commented May 1, 2020

AlexAndorra left a comment • edited Loading

Choose a reason for hiding this comment

ljmartin commented May 2, 2020 • edited Loading

AlexAndorra commented May 2, 2020 • edited Loading

ljmartin commented May 4, 2020

AlexAndorra commented May 4, 2020

AlexAndorra commented May 11, 2020

ljmartin commented May 12, 2020

AlexAndorra commented May 12, 2020

codecov bot commented May 12, 2020 • edited Loading

Codecov Report

AlexAndorra left a comment

Choose a reason for hiding this comment

ljmartin commented May 12, 2020 • edited Loading

AlexAndorra commented May 12, 2020 • edited Loading

ljmartin commented May 12, 2020 • edited Loading

AlexAndorra commented May 13, 2020

AlexAndorra commented May 13, 2020

ljmartin commented May 15, 2020 • edited Loading

AlexAndorra commented May 15, 2020

AlexAndorra left a comment

Choose a reason for hiding this comment

ljmartin commented May 25, 2020

AlexAndorra commented May 25, 2020

AlexAndorra commented May 28, 2020

AlexAndorra commented Jun 8, 2020

twiecki commented Jun 9, 2020

AlexAndorra commented Jun 9, 2020

michaelosthege left a comment

Choose a reason for hiding this comment

AlexAndorra commented Jun 11, 2020

AlexAndorra left a comment •

edited

Loading

ljmartin commented May 2, 2020 •

edited

Loading

AlexAndorra commented May 2, 2020 •

edited

Loading

codecov bot commented May 12, 2020 •

edited

Loading

ljmartin commented May 12, 2020 •

edited

Loading

AlexAndorra commented May 12, 2020 •

edited

Loading

ljmartin commented May 12, 2020 •

edited

Loading

ljmartin commented May 15, 2020 •

edited

Loading