Odd behaviour in v2.3.4 #663

Closed
sammosummo opened this issue Dec 18, 2014 · 9 comments

Comments

@sammosummo

I'm using PyMC 2.3.4 (as recommended in another issue) and getting some really odd behaviour when creating a simple model with a non-trivial number of data points. The code below reproduces the error:

import numpy as np
import pymc

N = [[[40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40]],

     [[40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40]],

     [[40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40]],

     [[40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40]],

     [[50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [41, 42, 42, 38, 38, 38, 44, 38, 38, 41, 38, 41, 44],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40]],

     [[40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40],
      [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40]]]

S = [[[ 0,  1,  5, 11, 18, 20, 21, 21, 21, 29, 35, 38, 40],
      [ 2,  2,  8, 17, 18, 20, 19, 25, 23, 25, 37, 40, 39],
      [ 0,  5, 10, 16, 16, 21, 20, 26, 25, 23, 39, 40, 40],
      [ 5,  2, 12, 20, 18, 17, 22, 18, 24, 23, 33, 38, 40]],

     [[ 4,  4,  8, 17, 16, 15, 23, 28, 34, 32, 39, 39, 39],
      [ 1,  3, 15, 20, 24, 21, 27, 21, 27, 28, 34, 39, 39],
      [ 4, 14,  9, 15, 27, 22, 25, 18, 24, 27, 32, 33, 39],
      [ 3, 12, 14, 19, 24, 14, 23, 22, 21, 24, 28, 33, 35]],

     [[ 0,  1,  5, 13, 18, 17, 20, 21, 32, 28, 38, 39, 39],
      [ 0,  4, 13, 11, 14, 23, 19, 23, 19, 25, 30, 38, 40],
      [ 0,  6,  9, 17, 19, 17, 23, 29, 28, 27, 34, 36, 40],
      [ 2,  4, 13, 17, 17, 21, 24, 23, 24, 26, 31, 33, 40]],

     [[ 1,  0,  6, 18, 14, 19, 23, 18, 29, 27, 26, 30, 34],
      [ 2,  3, 11, 20, 19, 18, 19, 20, 23, 23, 30, 38, 38],
      [ 2,  6,  8, 14, 14, 17, 19, 19, 20, 28, 27, 31, 35],
      [ 2,  8, 13, 19, 23, 12, 28, 14, 26, 27, 26, 28, 35]],

     [[ 1,  5, 10, 15, 14, 16, 17, 20, 20, 24, 33, 50, 49],
      [ 0,  3,  5,  6, 10, 10, 16, 12, 11, 21, 27, 38, 39],
      [ 1,  1,  9, 13, 15, 13, 23, 15, 24, 25, 32, 39, 44],
      [ 3, 10, 14, 16, 20, 22, 21, 21, 28, 30, 36, 34, 40]],

     [[ 1,  3,  1,  2,  8,  9, 16, 23, 27, 35, 40, 39, 40],
      [ 1,  3,  1,  6,  6, 14, 25, 19, 30, 36, 40, 39, 40],
      [ 1,  4,  5,  5, 14, 22, 19, 23, 31, 31, 31, 40, 40],
      [ 1,  1,  4, 10, 16, 17, 23, 23, 32, 26, 34, 39, 40]]]

pymc.Binomial('S', N, 0.5, S, observed=True) 

Sometimes this code will run with no problems, but sometimes I get the following error:

pymc.Node.ZeroProbability: Stochastic S's value is outside its support,
 or it forbids its parents' current values.

which does not make sense to me since all the values are valid. The probability of encountering an error seems to increase the more times I run the script. Any ideas?
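
Because the failure is intermittent, and a crash inside a compiled extension cannot be caught with `try`/`except`, one way to measure how often it happens is to run the script in a fresh interpreter each time and record the exit status. A minimal sketch using only the standard library; the helper name `run_repeatedly` and the file name `repro.py` are placeholders, not anything from PyMC:

```python
import subprocess
import sys


def run_repeatedly(command, times):
    """Run `command` `times` times and return the list of exit codes.

    0 means a clean exit. A non-zero code means the run failed; on POSIX
    a negative code -k means the process was killed by signal k
    (for example -11 for a segmentation fault).
    """
    codes = []
    for _ in range(times):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        codes.append(result.returncode)
    return codes


# Self-check with a command that always exits cleanly.
print(run_repeatedly([sys.executable, "-c", "raise SystemExit(0)"], 3))

# Hypothetical usage, with the snippet above saved as repro.py:
#   codes = run_repeatedly([sys.executable, "repro.py"], 20)
#   print(sum(c != 0 for c in codes), "of", len(codes), "runs failed")
```

Each run here starts from a clean process, so this measures the per-run failure rate rather than any effect that builds up inside one long-lived session.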

@fonnesbeck
Member

Actually, I get a worse result than this. I get segmentation faults after trying to run the code a few times. I can narrow it to an even simpler case:

import pymc

N = [50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50]

S = [1, 5, 10, 15, 14, 16, 17, 20, 20, 24, 33, 50, 49]

pymc.Binomial('S', N, 0.5, S, observed=True)

which yields

python(45269,0x7fff7d1b1300) malloc: *** error for object 0x1020139d8: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
/Users/fonnescj/anaconda3/envs/py3/bin/python.app: line 3: 45269 Abort trap: 6           /Users/fonnescj/anaconda3/envs/py3/python.app/Contents/MacOS/python "$@"

It has something to do with the list structure, since running the same data one scalar value at a time does not reproduce the behavior.

@fonnesbeck
Member

Can you replicate the problem if you pass flat arrays for S and N? The PyMC log-likelihoods aren't really built to handle multidimensional values.

@sammosummo
Author

OK, so there are at least two separate things going on here.

First, I ran your code and got the same segmentation fault. When I run it inside a loop, the problem appears after roughly 5 to 11 iterations. Sometimes, however, the code raises no error and simply stops before completion. These could be two different things, but I think they are likely the same, or at least related. Making p an array of the same length as N and S fixes both of these issues.

The error I posted is somewhat different and, as you say, appears to be caused by passing multi-dimensional arrays to PyMC. PyMC seems to misalign observations and parameters, so that a value of S is sometimes paired with the wrong N. When that leaves s > n, it raises the pymc.Node.ZeroProbability exception.
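
To illustrate how a mispairing would put a value outside its support, the condition 0 <= s <= n can be checked element by element. A minimal sketch in plain Python using the 50-trial row from the data above; the helper name `find_violations` is illustrative and not part of PyMC:

```python
def find_violations(successes, trials):
    """Return (index, s, n) for every pair violating 0 <= s <= n.

    Both arguments are flat sequences of equal length.
    """
    if len(successes) != len(trials):
        raise ValueError("successes and trials must have the same length")
    return [
        (i, s, n)
        for i, (s, n) in enumerate(zip(successes, trials))
        if not 0 <= s <= n
    ]


# The row of S that belongs to the row of N with 50 trials.
N_row = [50] * 13
S_row = [1, 5, 10, 15, 14, 16, 17, 20, 20, 24, 33, 50, 49]

print(find_violations(S_row, N_row))      # correctly paired: []
print(find_violations(S_row, [40] * 13))  # paired with a 40-trial row
```

With the correct pairing nothing is flagged. Paired with a 40-trial row, the observations 50 and 49 (indices 11 and 12) exceed n, which is the kind of value that would be reported as outside the support.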

Based on this, would you recommend flattening ALL arrays passed to PyMC objects?

@fonnesbeck
Member

I always recommend using flat data structures in PyMC, and slicing the values that you need when you need them. I haven't yet tracked down what is causing either of the issues, however, so I can't say much more than that at this point. I know that PyMC will resize p to be the same size as S before it is passed to the Fortran log-likelihood, so what you suggest should not be happening.
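
As a sketch of the flat-structure approach, the nested lists can be flattened in row-major order, with p expanded to one value per observation, and individual rows sliced back out by index. The example uses a small 2 x 2 x 3 stand-in for the arrays in the report, and the helper name `flatten` is illustrative:

```python
def flatten(nested):
    """Flatten arbitrarily nested lists/tuples into one flat list,
    in row-major order."""
    flat = []
    for item in nested:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


# Small 2 x 2 x 3 stand-in for the 6 x 4 x 13 arrays in the report.
N_nested = [[[40, 40, 40], [40, 40, 40]], [[50, 50, 50], [40, 40, 40]]]
S_nested = [[[0, 1, 5], [2, 2, 8]], [[33, 50, 49], [27, 38, 39]]]

N_flat = flatten(N_nested)
S_flat = flatten(S_nested)
p_flat = [0.5] * len(S_flat)  # one probability per observation

# Flattening both arrays the same way keeps each s paired with its n.
assert all(0 <= s <= n for s, n in zip(S_flat, N_flat))

# Slice out a block when it is needed: second outer group, first row.
rows_per_group, row_len = 2, 3
start = (1 * rows_per_group + 0) * row_len
print(S_flat[start:start + row_len])  # [33, 50, 49]

# The flat lists would then be passed in the same form used in this thread:
#   pymc.Binomial('S', N_flat, p_flat, S_flat, observed=True)
```

For NumPy arrays, `numpy.ravel` produces the same row-major ordering by default.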

I'm pretty sure the issue occurs outside the Fortran code, however, because when I call binomial_like rather than the Binomial class, I can replicate neither issue.

pymc.binomial_like(S, N, 0.5)

This is good news, since it should be easier to debug.
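
For an independent cross-check of a binomial log-likelihood, the textbook formula can be evaluated in plain Python. This is a sketch of that formula, not PyMC's Fortran implementation, and the function name `binomial_loglike` is illustrative:

```python
import math


def binomial_loglike(successes, trials, p):
    """Sum of log Binomial(s | n, p) over paired observations.

    Returns -inf if any observation lies outside 0 <= s <= n.
    Assumes 0 < p < 1.
    """
    total = 0.0
    for s, n in zip(successes, trials):
        if not 0 <= s <= n:
            return float("-inf")
        # log of the binomial coefficient C(n, s) via log-gamma
        log_choose = (
            math.lgamma(n + 1) - math.lgamma(s + 1) - math.lgamma(n - s + 1)
        )
        total += log_choose + s * math.log(p) + (n - s) * math.log(1.0 - p)
    return total


N = [50] * 13
S = [1, 5, 10, 15, 14, 16, 17, 20, 20, 24, 33, 50, 49]

print(binomial_loglike(S, N, 0.5))          # finite: every s is within 0..n
print(binomial_loglike(S, [40] * 13, 0.5))  # -inf: 50 and 49 exceed n = 40
```

A finite result for the correctly paired data supports the view that the data themselves are fine and the fault lies in how the values reach the likelihood.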

@fonnesbeck
Member

Something crazy going on here:

904         def get_logp(self):
905             import pdb; pdb.set_trace()
906  ->         if self.verbose > 0:
907                 print_('\t' + self.__name__ + ': logp accessed.')
908             logp = self._logp.get()
909     
910             try:
911                 logp = float(logp)
(Pdb) n
> /Users/fonnescj/GitHub/pymc2/pymc/PyMCObjects.py(908)get_logp()
-> logp = self._logp.get()
(Pdb) self._logp.get()
-1.7976931348623157e+308
(Pdb) self.value
[1]    617 segmentation fault  ipython
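
The value returned by `self._logp.get()` in this session is the negative of the largest finite double-precision number. That would be consistent with it acting as a stand-in for a log-probability of negative infinity, though that reading of PyMC's internals is an assumption, not something stated in the thread. The numeric identity itself is easy to confirm:

```python
import sys

logp_from_pdb = -1.7976931348623157e+308  # value shown in the session above

# It equals the most negative finite IEEE 754 double.
print(logp_from_pdb == -sys.float_info.max)  # True

# It is still finite, so it is a sentinel value rather than a true -inf.
print(logp_from_pdb > float("-inf"))  # True
```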

fonnesbeck pushed a commit that referenced this issue Dec 20, 2014
@fonnesbeck
Member

OK, I think I got it. Please test the current 2.3 branch and see if the problem persists.

@sammosummo
Author

Do you know the conda/brew/pip command to do this? (Sorry, complete amateur.)

@fonnesbeck
Member

The easiest way is this:

pip install -U git+git://github.com/pymc-devs/pymc.git@2.3

@sammosummo
Author

Seems to do the trick!
