Test failure of enum-parallel gradients after PyTorch #5776 #912

neerajprad · 2018-03-21T00:46:55Z

With the latest PyTorch master, many parallel enum tests in test_enum are failing due to a mismatch in the gradient computation.

To replicate, check out commit 5fa3aac610ee234338dbc11eb5b6d4a133cb483d in PyTorch master (pytorch/pytorch#5776), build PyTorch, and run these tests:

pytest -v --tb=short tests/infer/test_enum.py

Example of a failing test: test_elbo_iarange_iarange 2-2-None-None-parallel-None.

@fritzo, @eb8680 - I suspected an unexpected interaction between the dice elbo change and upstream PyTorch. That turns out not to be the whole story: 11 of our tests fail even before the dice elbo change, but there are more failures (79) with dice elbo. Could you take a look?

This could either be a Pyro bug or something in PyTorch upstream.
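
One way to narrow down which side is at fault is to test a reduction on its own, outside Pyro. The following is only a minimal sketch, assuming a working PyTorch build is importable: the function name `reduction_matches_reference`, the tensor shape, and the tolerance are all hypothetical choices for illustration and are not taken from the Pyro test suite or from the upstream fix. It compares the built-in `sum` along one dimension with a reference that adds slices one at a time, and passes the result through `tanh` so that a wrong forward value would also show up as a gradient mismatch.

```python
import torch


def reduction_matches_reference(shape=(64, 33), dim=0, atol=1e-5):
    """Compare Tensor.sum along `dim` with a slice-by-slice reference.

    Hypothetical helper: name, shape and tolerance are illustrative only.
    Returns (values_ok, grads_ok).
    """
    torch.manual_seed(0)
    x = torch.randn(*shape, requires_grad=True)
    w = torch.randn(*shape)  # arbitrary weights so the gradient is not constant

    # Path 1: the built-in reduction under test.
    out_builtin = (x * w).sum(dim)

    # Path 2: reference that adds slices one at a time, so the forward pass
    # uses only elementwise addition.
    slices = (x * w).unbind(dim)
    out_ref = slices[0]
    for s in slices[1:]:
        out_ref = out_ref + s

    # Backpropagate a fixed random vector through tanh(out). The tanh makes
    # the gradient depend on the forward value of the reduction.
    v = torch.randn(out_ref.shape)
    (grad_builtin,) = torch.autograd.grad(out_builtin.tanh(), x, grad_outputs=v)
    (grad_ref,) = torch.autograd.grad(out_ref.tanh(), x, grad_outputs=v)

    values_ok = torch.allclose(out_builtin, out_ref, atol=atol)
    grads_ok = torch.allclose(grad_builtin, grad_ref, atol=atol)
    return values_ok, grads_ok


if __name__ == "__main__":
    for d in (0, 1):
        print(d, reduction_matches_reference(dim=d))
```

If both flags are true for the shapes that matter, this particular reduction is probably not the culprit and the Pyro side deserves a closer look; if the values already disagree, that would point upstream. A passing result here does not rule out a PyTorch bug in other ops or other shapes.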


cpuhrsch · 2018-03-21T14:39:30Z

Hey @neerajprad, thank you for your post, I'm looking into this now.

cpuhrsch · 2018-03-21T16:54:28Z

I found the bug; I'll send a patch soon.

neerajprad · 2018-03-21T17:52:09Z

Thanks, @cpuhrsch! Curious to see where the bug was.

cpuhrsch · 2018-03-21T20:30:26Z

@neerajprad please see PR pytorch/pytorch#5926

fritzo · 2018-03-24T18:57:25Z

Fixed upstream by pytorch/pytorch#5926 and in Pyro by #917.

neerajprad added the bug label Mar 21, 2018

eb8680 added the high priority label Mar 21, 2018

fritzo self-assigned this Mar 21, 2018

fritzo changed the title ~~Elbo gradient mismatch with enum-parallel on latest pytorch master~~ Test failure of enum-parallel gradients after PyTorch #5776 Mar 21, 2018

fritzo removed their assignment Mar 21, 2018

fritzo mentioned this issue Mar 21, 2018

ATen ReduceOps Re-merge pytorch/pytorch#5776

Merged

soumith mentioned this issue Mar 21, 2018

ReduceOps are breaking Pyro test pytorch/pytorch#5921

Closed

neerajprad mentioned this issue Mar 21, 2018

Introduce validation toggle flags for pyro.infer module #913

Merged

4 tasks

cpuhrsch mentioned this issue Mar 21, 2018

parallel_for_2d fix and guarding avx/avx2 compilation pytorch/pytorch#5926

Merged

fehiepsi mentioned this issue Mar 22, 2018

Rename sparse_mvn and support batch for mvn variance #916

Merged

fritzo closed this as completed Mar 24, 2018
