model.logp is nan #2066
Comments
This might be (part of) the problem:
The testval is not valid: it has to be between 0 and 1. If your master is older than a couple of hours, you might want to pull again; #2046 was merged very recently.
Also,
No, I really have the latest version ;D. Even so, if I replace the Uniform with
I have -inf as logp -.- This shouldn't be normal, should it? With pm.HalfCauchy it works, and the logp is even positive XD (40K). Something strange?
The bounded normal on its own works fine for me. Can you try
@junpenglao had a CAR implementation posted in a different issue; I recommend finding that.
@aseyboldt same, thx
NUTS prints warnings if there are NaNs in the non-tuning part of the trace (it calls them divergences); Metropolis doesn't. We have been thinking about printing messages that are more helpful, but it is a bit tricky to do that without a huge performance hit.

```python
from theano.compile.nanguardmode import NanGuardMode

# Make theano raise an error as soon as a nan or inf shows up in a computation
mode = NanGuardMode(nan_is_error=True, inf_is_error=True, big_is_error=True)

with model:
    step = pm.Metropolis(mode=mode)
    trace = pm.sample(1000, step=step)
    # Or if you want to use NUTS
    trace = pm.sample(2000, tune=1000, nuts_kwargs={'mode': mode})
```
Something else that can help is to introduce
Is there a particular reason you used Metropolis? Using NUTS with ADVI init seems to run normally.
@junpenglao this is just a toy example. My real problem is bigger (more variables), and in the bigger problem NUTS drops to ~1.5 iterations per second, which is far too slow compared to the 250 it/s of Metropolis. That is the only reason: trying more models... @aseyboldt I don't get any warning. Am I doing something wrong? thx
Do you mean that you got the FloatingPointError in the original model with the invalid testval for
@aseyboldt I've seen that mistake being made all the time. I wonder if we should add a warning when someone uses Metropolis or Slice on a continuous model, telling them they should really use NUTS.
@twiecki :) noob mistakes. Consider it! :)) @aseyboldt sorry, my bad: I forgot to put a valid testval for p :) With:
With this code I have
And un-commenting `mus = pm.Deterministic('mu', mu)`, I have:
Re-parameterization should always be the first reaction when your model is slow :)
I'm not sure we should warn people off of
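For instance, one common re-parameterization is the "non-centered" form: instead of sampling a scaled random effect directly, sample a standard-normal variable and multiply by the scale deterministically. This often removes the funnel-shaped posterior geometry that slows samplers down. A minimal numpy-only sketch of the equivalence (illustrative names and values, not from this thread):

```python
import numpy as np

rng = np.random.default_rng(0)
sd = 2.5          # scale of the random effect (illustrative value)
n = 100_000

# Centered form: draw phi directly with scale sd
phi_centered = rng.normal(0.0, sd, size=n)

# Non-centered form: draw a standard normal, then scale deterministically.
# In a model, the sampler only ever sees the unit-scale phi_raw.
phi_raw = rng.normal(0.0, 1.0, size=n)
phi_noncentered = sd * phi_raw

# Same distribution either way; only the sampler's geometry differs
assert abs(phi_centered.std() - phi_noncentered.std()) < 0.1
```

In a hierarchical model the scale `sd` is itself a random variable, which is exactly when the centered form tends to produce the funnel that NUTS struggles with.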
@denadai2 Your CAR implementation seems a bit strange. I just gave the version at http://mc-stan.org/documentation/case-studies/mbjoseph-CARStan.html a shot; it runs much faster (100 it/s) and seems to converge. But the model can't really tell what the spatial correlation should be. Maybe that is because the data doesn't provide enough information somehow, or there is a bug in the logp of CAR.

```python
import numpy as np
import scipy.sparse
import scipy.linalg
import theano.sparse
import theano.tensor as tt
import pymc3 as pm
from pymc3.distributions import Continuous
from pymc3.distributions.dist_math import bound


class CAR(Continuous):
    """Conditional autoregressive prior using a sparse adjacency matrix."""
    def __init__(self, alpha, adjacency, *args, **kwargs):
        if not isinstance(adjacency, np.ndarray):
            raise ValueError('Adjacency matrix is not an ndarray.')
        n, m = adjacency.shape
        if n != m or np.any(adjacency != adjacency.T):
            raise ValueError('Adjacency matrix must be symmetric.')
        if 'shape' in kwargs and kwargs['shape'] != n:
            raise ValueError('Invalid shape: Must match matrix dimension.')
        kwargs['shape'] = n
        super(CAR, self).__init__(*args, **kwargs)
        self.n = n
        self.alpha = tt.as_tensor_variable(alpha)
        adjacency_sparse = scipy.sparse.csr_matrix(adjacency)
        self.adjacency = theano.sparse.as_sparse_variable(adjacency_sparse)
        self.neighbors = tt.as_tensor_variable(adjacency.sum(0))
        self.mean = tt.zeros(n)
        self.median = self.mean
        # Eigenvalues of D^{-1/2} W D^{-1/2}, used for the log-determinant
        adj = adjacency.astype('d').copy()
        sqrt_neighbors = 1 / np.sqrt(adjacency.sum(0))
        adj[:] *= sqrt_neighbors[:, None]
        adj[:] *= sqrt_neighbors[None, :]
        self.eigs = scipy.linalg.eigvalsh(adj)

    def logp(self, x):
        Wx = theano.sparse.dot(self.adjacency, x.reshape((self.n, 1)))
        tau_dot_x = self.neighbors * x - self.alpha * Wx.ravel()
        logdet = tt.log(1 - self.alpha * self.eigs).sum()
        logp = 0.5 * (logdet - tt.dot(x, tau_dot_x))
        return bound(logp, self.alpha > 0, self.alpha < 1)


with pm.Model() as model:
    b0 = pm.Normal('intercept', mu=5.4, sd=2)
    b1 = pm.Cauchy('b1_mdist_daily', alpha=0, beta=2)
    # random effect precision parameter
    sd = pm.HalfCauchy('sd', beta=2)
    # strength of spatial correlation
    p = pm.Uniform('p', lower=0, upper=1)
    phi = CAR('mu_phi', alpha=p, adjacency=amat)
    mu = tt.exp(b0 + b1 * X['mdist_daily'] + sd * phi)
    alpha = pm.HalfCauchy(name='alpha', beta=2)
    home_points = pm.NegativeBinomial('home_points', mu=mu, alpha=alpha,
                                      observed=y)
    trace = pm.sample(2000, tune=1000, njobs=4)
```
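As a quick numerical sanity check on the log-determinant trick in that logp (a numpy-only sketch added for illustration, not part of the original post): for the CAR precision matrix tau = D - alpha*W, we have log|tau| = sum_i log(d_i) + sum_i log(1 - alpha*lambda_i), where lambda_i are the eigenvalues of D^{-1/2} W D^{-1/2}. The logp above keeps only the second sum; the first one depends on neither alpha nor x, so dropping it leaves sampling unaffected.

```python
import numpy as np
import scipy.linalg

# Small symmetric 0/1 adjacency matrix: a ring of 5 nodes (illustrative)
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0

D = np.diag(W.sum(0))             # number of neighbors on the diagonal
alpha = 0.7
tau = D - alpha * W               # CAR precision matrix

# Eigenvalue route, as in the CAR class above
d_inv_sqrt = 1.0 / np.sqrt(W.sum(0))
adj_scaled = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
eigs = scipy.linalg.eigvalsh(adj_scaled)
logdet_eig = np.log(np.diag(D)).sum() + np.log(1 - alpha * eigs).sum()

# Direct route for comparison
sign, logdet_direct = np.linalg.slogdet(tau)

assert sign > 0
assert np.allclose(logdet_eig, logdet_direct)
```

The payoff is that the eigenvalues are computed once at model construction, so each logp evaluation only needs a sparse matrix-vector product instead of a fresh determinant.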
@fonnesbeck @twiecki How about printing a warning if the number of dimensions is large? But I'm not sure what a reasonable cutoff would be. I vaguely remember someone saying that it usually doesn't work well if d > 20, but I don't even know where I heard that.
Nice implementation @aseyboldt! I am writing a doc on porting models from WinBUGS/JAGS/Stan to pymc3, with some additional tips and heuristics. Can I use some of your code as an example?
@junpenglao sure. But keep in mind that I haven't tested this thoroughly.
@aseyboldt I will compare it with the Stan result. Will keep you posted!
@aseyboldt wow, thanks for your code. That implementation is especially good when you have multiple CAR variables (MCAR), and it is fast. I could use it if I fail with my code :D
Still slow, but it converges. @junpenglao trying to reparametrize
Closed accidentally. Reopening.
I would be fine with a "report" at the end of sampling that includes warnings and recommendations, perhaps written to disk. Along the lines of:
@fonnesbeck I second the "report" idea. Right now all the warnings can get to be a bit much at times.
@fonnesbeck @junpenglao I don't think we should overwrite a file without being asked explicitly. But how about adding a
Many of the original issues are fixed. Closing. |
There are two strange behaviours in this model:
I have the latest version (master) of pymc3.
Logp: (output truncated in the original post)
Chains: (plots omitted)
Pickled model: bug.pkl.zip