ESS produces the wrong result for certain models #1633

torfjelde · 2021-06-08T06:39:30Z

torfjelde · 2021-06-08T12:15:54Z

Tests are failing because of the issue fixed in TuringLang/Bijectors.jl#184

devmotion · 2021-06-08T12:53:21Z

src/inference/ess.jl

+    for vn in Iterators.flatten(values(vns))
+        set_flag!(varinfo, vn, "del")
+    end


We should add a context that does this 😛 At least it would be more convenient than dealing with DynamicPPL internals here. I remember that I was very confused and uncertain if I did it correctly when I implemented this. It seemed to work 😬

It works sometimes because in assume we only check whether the flag is set for vns[1], but that only covers "sub-symbols". This is the line I'm referring to: https://github.com/TuringLang/DynamicPPL.jl/blob/9083299db3f623136895cae80ef5f10d7fcf8d2c/src/context_implementations.jl#L268. But this won't work if we have more than one key in vi.metadata, or if tilde_assume and the others are called with a subsumed varname, e.g. m[2].

And this won't be an issue once we have a clear separation between sampling and evaluation. These sorts of bugs show up soooo often (and I agree, it's super-confusing), so looking forward to not having those:)
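A toy sketch of why flagging only some of the variable names can appear to work. The dictionary and both helper functions below are hypothetical stand-ins for illustration, not DynamicPPL's data structures or API.

```julia
# Toy stand-in for per-variable "del" flags; NOT the DynamicPPL representation.
flags = Dict("m[1]" => true, "m[2]" => false)

# Hypothetical check that only inspects the first variable name.
flagged_first(flags, vns) = flags[first(vns)]

# Hypothetical check that inspects every variable name.
flagged_all(flags, vns) = all(vn -> flags[vn], vns)

vns = ["m[1]", "m[2]"]
first_only = flagged_first(flags, vns)       # looks fine: only m[1] is consulted
everything = flagged_all(flags, vns)         # reveals that m[2] was never flagged
subsumed   = flagged_first(flags, ["m[2]"])  # querying via m[2] directly also misses

# Flagging every name, as the loop in this diff does for the real varinfo,
# removes the dependence on which name happens to be checked.
for vn in vns
    flags[vn] = true
end
after_fix = flagged_all(flags, vns)
```

Under these assumptions, the first-entry check passes while the other two queries expose the unflagged entry.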

devmotion · 2021-06-08T12:53:42Z

src/inference/ess.jl

+function DynamicPPL.dot_tilde(ctx::DefaultContext, sampler::Sampler{<:ESS}, right, left, vi)
+    return DynamicPPL.dot_tilde(ctx, SampleFromPrior(), right, left, vi)


Was this a bug or is needed because of recent changes in DynamicPPL?

This was a bug. The old method was never actually hit. Given its arguments, I'm assuming it was intended to be a dot_tilde_observe rather than a dot_tilde_assume (which is the case when rng is passed), but because the signature doesn't match the rest of the dot_tilde_assume methods, it was never called. The result was that ESS didn't work for dotted observations.
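A minimal sketch of how a specialization with a mismatched argument list silently goes unused in Julia. All names here (`handle`, `SpecialSampler`) are hypothetical and only mirror the shape of the problem described above; this is not DynamicPPL code.

```julia
# Hypothetical sampler type used for dispatch.
struct SpecialSampler end

# Generic method: the "framework" always calls it with four arguments.
handle(rng, sampler, right, left) = :generic

# Intended specialization, but written without `rng`. It is a valid method,
# yet no four-argument call can ever dispatch to it.
handle(sampler::SpecialSampler, right, left) = :special

# The framework-style call falls through to the generic method.
result = handle(nothing, SpecialSampler(), 1.0, 2.0)

# The specialization is only reachable through its own three-argument form.
direct = handle(SpecialSampler(), 1.0, 2.0)
```

Julia raises no error or warning here, since both definitions are legitimate methods of the same function, which is why this kind of bug can go unnoticed until a test exercises the affected path.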

devmotion · 2021-06-08T12:54:03Z

src/variational/advi.jl

@@ -34,14 +34,15 @@ function Bijectors.bijector(
    end

    bs = Bijectors.bijector.(tuple(dists...))
+    rs = tuple(ranges...)


I guess this is not related to ESS?

Correct; this is just to ensure that we have the same behavior as we had before the most recent release of Bijectors.jl. I just noticed it when trying to figure out why the tests weren't passing.

test/test_utils/models.jl

devmotion · 2021-06-08T12:58:07Z

test/test_utils/numerical_tests.jl

+        # Log this so that if something goes wrong, we can identify the
+        # algorithm and model.
+        @info "Testing $(alg) on $(m.name)"


Just put it into a testset

@testset "Testing $(alg) on $(m.name)" for m in mean_of_mean_models ... end

?

Wait wat?! That works?!

Yes 😄 https://docs.julialang.org/en/v1/stdlib/Test/#Test.@testset

Sick! Did not know that:)
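A self-contained sketch of the `@testset ... for` form suggested above, using only the Test standard library. The `models` collection and `alg` value are hypothetical placeholders for the real test models and sampler.

```julia
using Test

# Hypothetical stand-ins for the models and algorithm in the test suite.
models = [(name = "model_a", value = 1.0), (name = "model_b", value = 1.0)]
alg = "toy_alg"

# Records which iterations ran, purely for illustration.
ran = String[]

# One test set is created per iteration, and the loop variable can be
# interpolated into the test set name.
@testset "Testing $(alg) on $(m.name)" for m in models
    push!(ran, m.name)
    @test m.value ≈ 1.0
end
```

Because each iteration gets its own named test set, a failure is reported under the matching algorithm and model name, which is what the `@info` line was trying to achieve.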

devmotion · 2021-06-08T12:59:43Z

test/test_utils/models.jl

+# A collection of models for which the mean-of-means for the posterior should
+# be same.


Sorry, what exactly do you mean by mean-of-means? And is the value the same as for the prior? Or between the models? And only with the default arguments or in general?

I want to have a collection of models which tries out all the combinations of *_tilde_*, but this means that we'll sometimes have univariate latent variables rather than multivariate (e.g. gdemo5 below). Therefore I compare the mean of the mean of the latent variables rather than the variables directly.

Buuuuut now that we're comparing to the true mean rather than pitting the different models against each other, I guess we don't need to do that 😅
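A small sketch of the mean-of-means idea described above, with made-up numbers. The two values below are hypothetical posterior means, one from a model with a univariate latent variable and one from a model with a multivariate latent variable.

```julia
using Statistics

# Hypothetical posterior means; these numbers are illustrative only.
posterior_mean_univariate = 1.5           # scalar latent variable
posterior_mean_multivariate = [1.4, 1.6]  # vector-valued latent variable

# Averaging over components reduces both shapes to a single number,
# so models with different latent dimensions can be compared.
mom_uni = mean(posterior_mean_univariate)      # the mean of a scalar is itself
mom_multi = mean(posterior_mean_multivariate)

comparable = isapprox(mom_uni, mom_multi; atol = 1e-8)
```

The reduction discards per-component information, so it only makes sense when the components are expected to share the same mean.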

Project.toml

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

coveralls · 2021-06-10T08:19:25Z

Pull Request Test Coverage Report for Build 924779474

3 of 5 (60.0%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+1.02%) to 79.325%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/variational/advi.jl	0	2	0.0%

Totals
Change from base Build 899504966:	1.02%
Covered Lines:	1128
Relevant Lines:	1422

💛 - Coveralls

codecov · 2021-06-10T08:19:25Z

Codecov Report

Merging #1633 (64a816a) into master (c79dce0) will increase coverage by 1.02%.
The diff coverage is 60.00%.

@@            Coverage Diff             @@
##           master    #1633      +/-   ##
==========================================
+ Coverage   78.30%   79.32%   +1.02%     
==========================================
  Files          23       23              
  Lines        1424     1422       -2     
==========================================
+ Hits         1115     1128      +13     
+ Misses        309      294      -15

Impacted Files	Coverage Δ
src/variational/advi.jl	`61.11% <0.00%> (-1.16%)`	⬇️
src/inference/ess.jl	`98.00% <100.00%> (+4.12%)`	⬆️
src/stdlib/distributions.jl	`56.98% <0.00%> (+1.07%)`	⬆️
src/modes/ModeEstimation.jl	`80.76% <0.00%> (+9.12%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

torfjelde · 2021-06-10T09:33:18Z

@devmotion You ready to give it 👍 ?

devmotion

Looks good 👍 Maybe test/Project.toml doesn't have to be modified?

test/Project.toml

yebai · 2021-06-10T11:53:43Z

thanks, @torfjelde.

* removed unnecessary exports
* updated OptimizationContext
* updated ESS smapler
* fixed #1633
* fixed bug where ESS didnt support dot_observe
* added some additional models to test against
* added test for ESS on the mean-of-mean models
* patch version bump
* added tests on mean_of_mean_models for optimization methods too
* fixed bug in bijector after recent update to Bijectors.jl
* use exact value in check_mean_of_mean_models
* fixed bug in OptimizationContext
* just use MvNormal instead of TuringDiagMvNormal in test models
* renamed the mean_of_mean models used tests
* renamed the mean_of_mean_models in tests to gdemo_models
* removed redundant testset block
* upper-bound compat entries for Libtask while we wait for bugfix
* compat entries with hyphens arent supported on Julia v1.3
* compat entries with hyphens not supported on Julia 1.3
* also test models with literal observe
* Update Project.toml Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
* forgot to bump DPPL version
* Apply suggestions from code review
* bump DPPL patch version to fix AdvancedPS samplers
* bump patch version

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

* removed unnecessary exports
* updated OptimizationContext
* updated ESS smapler
* fixed #1633
* fixed bug where ESS didnt support dot_observe
* added some additional models to test against
* added test for ESS on the mean-of-mean models
* patch version bump
* added tests on mean_of_mean_models for optimization methods too
* fixed bug in bijector after recent update to Bijectors.jl
* use exact value in check_mean_of_mean_models
* fixed bug in OptimizationContext
* just use MvNormal instead of TuringDiagMvNormal in test models
* renamed the mean_of_mean models used tests
* renamed the mean_of_mean_models in tests to gdemo_models
* removed redundant testset block
* upper-bound compat entries for Libtask while we wait for bugfix
* compat entries with hyphens arent supported on Julia v1.3
* compat entries with hyphens not supported on Julia 1.3
* also test models with literal observe
* Update Project.toml Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
* forgot to bump DPPL version
* Apply suggestions from code review
* bump DPPL patch version to fix AdvancedPS samplers
* bump patch version
* updated OptimizationContext to work with the new version of DPPL

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

torfjelde added 9 commits June 8, 2021 07:35

fixed #1633

48b8463

fixed bug where ESS didnt support dot_observe

ca81eb0

added some additional models to test against

df8bb42

added test for ESS on the mean-of-mean models

9898865

patch version bump

7351562

added tests on mean_of_mean_models for optimization methods too

20267ce

fixed bug in bijector after recent update to Bijectors.jl

4a931c1

use exact value in check_mean_of_mean_models

48030eb

just use MvNormal instead of TuringDiagMvNormal in test models

92cabdf

torfjelde requested a review from devmotion June 8, 2021 12:20

devmotion reviewed Jun 8, 2021

torfjelde added 4 commits June 8, 2021 20:02

renamed the mean_of_mean_models in tests to gdemo_models

966b724

removed redundant testset block

2cc253b

upper-bound compat entries for Libtask while we wait for bugfix

2b5c5e1

compat entries with hyphens not supported on Julia 1.3

cab751a

devmotion reviewed Jun 10, 2021

also test models with literal observe

20daa3e

torfjelde mentioned this pull request Jun 10, 2021

Update to new DPPL version #1636

Merged

Update Project.toml

169a014

Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

devmotion approved these changes Jun 10, 2021

torfjelde commented Jun 10, 2021

Apply suggestions from code review

64a816a

devmotion approved these changes Jun 10, 2021

yebai merged commit 9f52d75 into master Jun 10, 2021

delete-merged-branch bot deleted the tor/fix-1633 branch June 10, 2021 11:53
