Traceplot hangs with 3.7.rc1 #3489

Closed
clausherther opened this issue May 22, 2019 · 5 comments · Fixed by #3502

Comments

@clausherther

I'm trying to plot a trace for 2 variables from a Dirichlet-Multinomial model, where sd has 1 value, and alpha has 58 values.
With pymc3==3.6 this ran fine in about 30 seconds or less.

In 3.7.rc1 (installed via pip install pymc3==3.7rc1) this hangs at 100% CPU (longest I've let it run was 12 minutes before stopping the jupyter kernel).
I've tried with both the old varnames and the new var_names.

_ = pm.traceplot(
    trace_sku,
    var_names=["sd", "alpha"],
    divergences=False,
    combined=True,
)

Plotting the single-value sd parameter by itself works:

_ = pm.traceplot(trace_sku, var_names=["sd"], divergences=False);

However, including alpha causes the plot to hang.

# Hangs
_ = pm.traceplot(trace_sku, var_names=["alpha"], divergences=False);

The trace trace_sku was sampled with 4 chains, each with 2,000 draws and 1,000 tuning steps, so it contains 8,000 post-tuning samples.

Parameters I'm trying to plot:

trace_sku["sd"].shape
(8000,)
trace_sku["alpha"].shape
(8000, 58)

Parameters I'm not trying to plot, b/c they're too large for a traceplot:

trace_sku["sku_probability"].shape
(8000, 36, 58)
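
As a quick check on the shapes above, here is a minimal sketch of how the chain and draw counts combine into the leading dimension. The helper name combined_shape is hypothetical and is not part of PyMC3; the numbers are the ones reported in this issue.

```python
# Hypothetical helper (not a PyMC3 function): shape of a variable once
# all chains are concatenated along the sample axis.
def combined_shape(chains, draws, event_shape=()):
    return (chains * draws,) + tuple(event_shape)

# Values taken from this issue: 4 chains, 2,000 post-tuning draws each.
sd_shape = combined_shape(4, 2000)               # scalar variable
alpha_shape = combined_shape(4, 2000, (58,))     # 58-valued variable
sku_shape = combined_shape(4, 2000, (36, 58))    # 36 x 58 variable

print(sd_shape, alpha_shape, sku_shape)
```

Tuning steps are not counted because they are discarded from the trace by default.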

Versions:

seaborn          0.9.0
matplotlib.pylab 1.15.3
pandas           0.24.1
theano           1.0.4
numpy            1.15.3
pymc3            3.7.rc1 
CPython 3.6.7
@clausherther
Author

clausherther commented May 22, 2019

Looks like that's b/c ArviZ splits variables with multiple values (in this case 58) onto separate axes, vs the old behavior of plotting them all on the same axis.

Perhaps there could be a warning or error if the number of traces for a variable exceeds some threshold?
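
A rough sketch of what such a check might look like, assuming one row of axes per scalar component as described above. The names panel_count, check_panels and MAX_PANELS are hypothetical, and the threshold of 40 is an arbitrary illustration, not an ArviZ or PyMC3 setting.

```python
import math
import warnings

MAX_PANELS = 40  # hypothetical threshold, chosen only for illustration


def panel_count(shape):
    """Scalar components in a (samples, ...) array: one trace row each."""
    # math.prod of an empty tuple is 1, so a scalar variable needs one row.
    return math.prod(shape[1:])


def check_panels(name, shape, max_panels=MAX_PANELS, strict=False):
    """Warn (or raise, if strict) when a variable needs too many rows."""
    n = panel_count(shape)
    if n > max_panels:
        msg = f"{name!r} would need {n} trace rows (limit {max_panels})"
        if strict:
            raise ValueError(msg)
        warnings.warn(msg)
    return n


# Shapes reported in this issue.
check_panels("sd", (8000,))         # 1 row, no warning
check_panels("alpha", (8000, 58))   # 58 rows, emits a warning
```

With strict=True the same call raises instead of warning, which corresponds to the "error" option.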

@ColCarroll
Member

arviz is looking at allowing the "old" behavior, then pymc3 can default to that. there's also an issue there to have a global configuration for the number of subplots it will create, and to have it fail if you ask for too many (you can change the config with code).

do you think those changes will fix this?

@clausherther
Author

Hi @ColCarroll I think at least for the short/mid term it'd be great to have PyMC3 traceplots by default behave in the old way! Making the new behavior optional and configurable also sound like good ideas to me.
For context, I often have traceplots for 20-30 valued parameters that work fine on the same axis that would all essentially fail using the new behavior and really couldn't be plotted anymore.
I get that there might not be huge value in plotting 30 beta distributions on one plot, but it does give you a quick picture of how well your trace turned out. Thx!

@twiecki
Member

twiecki commented May 23, 2019

agreed, I don't really like the new default traceplot behavior. If something has too many dimensions for a single subplot it'll also have too many for multiple subplots, just in one case I get a plot while in the other it just hangs.

@ColCarroll
Member

This is addressed by arviz-devs/arviz#679 -- i'll make a PR to make this behavior default for pm.traceplot
