Inconsistent loss values with & without vectorization #414

hawkrobe · 2017-10-30T20:39:24Z

Consider the vectorized and non-vectorized versions of the same hierarchical regression model. The main differences are that the vectorized version:

- uses iarange instead of irange
- uses a single observe from a 1 x batch_size dimensional distribution instead of batch_size separate observes from 1 dimensional distributions
- optimizes 1 x k dimensional params for a single intercepts sample site instead of k separate 1 dimensional params for k sample sites

Intuitively, I'd expect these to converge to the same loss; instead the vectorized version converges to ~1900 and the non-vectorized version converges to ~600. The same model written in webppl also converges to ~600, so this might indicate an issue with scaling in the vectorized version?

Incidentally, the mean-field guide sigmas also converge to different values in the vectorized version although the mu point estimates are the same. The unvectorized version matches webppl but the vectorized version has much higher certainty.

It's of course very plausible that there's a bug in my implementation of the vectorized model!


fritzo · 2017-10-30T21:08:27Z

Thanks for the report and the reproducible example! I'm looking into it.

hawkrobe · 2017-10-30T21:18:56Z

@fritzo : thanks! @jpchen has also been working with these models a fair amount and may have thoughts as well.

jpchen · 2017-10-30T21:24:36Z

yep working with @fritzo on this

fritzo · 2017-10-31T02:03:54Z

After some deep diving with @jpchen, we found that your model and guide had different parameter shapes at the pyro.sample('intercepts', Normal(...)) site. The fix is to change your model parameter shapes:

  subj_bias = pyro.sample('intercepts',
-                         Normal(b0.expand(num_subjects),
-                                sigma_subj.expand(num_subjects)))
+                         Normal(b0.expand(num_subjects, 1),
+                                sigma_subj.expand(num_subjects, 1)))

Sorry for such a difficult-to-diagnose error. We are adding an error message for this case so that debugging will be easier in the future (see #303). Let us know if this works for you.
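
Why a shape mismatch like this can go unnoticed: under the standard NumPy/PyTorch broadcasting rule, shapes `(k,)` and `(k, 1)` are compatible and combine into `(k, k)`, so elementwise arithmetic between them raises no error. The sketch below is plain Python and does not call Pyro. `broadcast_shape` and `check_site_shapes` are hypothetical helpers written for illustration, `num_subjects = 5` is an arbitrary value, and the thread does not confirm that this is the exact mechanism behind the inflated loss.

```python
import math


def broadcast_shape(a, b):
    """Return the broadcast of two shapes (NumPy/PyTorch rule, aligned from the right)."""
    n = max(len(a), len(b))
    # Pad the shorter shape with leading 1s so both have the same rank.
    a = (1,) * (n - len(a)) + tuple(a)
    b = (1,) * (n - len(b)) + tuple(b)
    result = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
    return tuple(result)


def check_site_shapes(name, model_shape, guide_shape):
    """Hypothetical strict check: fail loudly instead of letting shapes broadcast."""
    if tuple(model_shape) != tuple(guide_shape):
        raise ValueError(
            f"site '{name}': model shape {tuple(model_shape)} "
            f"!= guide shape {tuple(guide_shape)}"
        )


num_subjects = 5                     # arbitrary illustrative value
model_shape = (num_subjects,)        # b0.expand(num_subjects), before the fix
guide_shape = (num_subjects, 1)      # shape the fix moves the model to

joint = broadcast_shape(model_shape, guide_shape)
print(joint)             # (5, 5)
print(math.prod(joint))  # 25 elementwise terms instead of 5

try:
    check_site_shapes("intercepts", model_shape, guide_shape)
except ValueError as err:
    print(err)
```

If a score is computed elementwise over mismatched shapes and then summed, it would cover k x k terms rather than k, which is the kind of silent rescaling that a strict equality check is meant to catch.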

hawkrobe · 2017-10-31T05:24:32Z

@fritzo @jpchen : wow, that's super subtle (and a really surprising consequence -- I would've expected it to either throw an error or noticeably mess the whole thing up instead of just making the loss/uncertainty converge to different numbers!)

Thanks for taking the time to diagnose, and glad it's not a deeper issue!

fritzo self-assigned this Oct 30, 2017

fritzo added the bug label Oct 30, 2017

hawkrobe mentioned this issue Oct 30, 2017

Further profiling -- small tensor models #361

Closed

fritzo added usability warnings & errors and removed bug labels Oct 31, 2017

fritzo closed this as completed Oct 31, 2017

fritzo mentioned this issue Oct 31, 2017

Error if model,guide site shapes disagree #418

Merged

neerajprad mentioned this issue Oct 31, 2017

Raise errors when scoring data with unexpected dimensions #419

Merged

neerajprad mentioned this issue Nov 28, 2017

Wrap torch.distributions.Normal for use in Pyro #607

Merged
