
Is Pigeons performing correctly on this example? #201

Closed
itsdfish opened this issue Feb 8, 2024 · 5 comments
itsdfish commented Feb 8, 2024

Hello,

I have been using Pigeons for quantum models of cognition, which have multimodal posterior distributions and unusual log likelihood surfaces (see also #111). I was wondering whether someone could provide advice on the model in this script. This model has two rotation parameters, θli and θpb, which allow one to transform between different bases within the same space. The model is parameterized such that the parameter ranges are [-1,1]. In the model linked above, I use [0,1] because the predictions appear to repeat (in other models the rotation parameters seem to interact and create non-repeating oscillations).

I have a few related questions. First, it's not clear to me whether the diagnostics look good. I don't understand how the algorithm works, but the output seems potentially suspicious. For example, many of the numbers below are near 0 or 1. This is a departure from other models I have used.

────────────────────────────────────────────────────────────────────────────
  scans        Λ      log(Z₁/Z₀)   min(α)     mean(α)    min(αₑ)   mean(αₑ) 
────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2   7.11e-15  -1.42e-14          1          1          1          1 
        4   1.07e-14   1.78e-15          1          1          1          1 
        8   1.24e-14   2.66e-15          1          1          1          1 
       16   2.04e-14  -8.22e-15          1          1          1          1 
       32   3.06e-14  -8.66e-15          1          1          1          1 
       64   1.33e-14   6.66e-16          1          1          1          1 
      128    3.3e-14  -6.66e-15          1          1          1          1 
      256   1.89e-14   8.88e-16          1          1          1          1 
      512   2.21e-14     -4e-15          1          1          1          1 
 1.02e+03   2.09e-14   1.33e-15          1          1          1          1 
────────────────────────────────────────────────────────────────────────────
PT(checkpoint = false, ...)

I also noticed that NUTS gets stuck in several areas. I am not sure whether Pigeons is missing secondary modes, or whether NUTS gets stuck in these weird negative deflections and Pigeons is robust to them. I suspect the latter is true, but I don't know for sure.

[figure: log-likelihood surface]

I also noticed that the index process plot looked different from the example. As you can see, it looks like there is erratic jumping.

[figure: index process plot]

As a final, somewhat related note, I tried `Pigeons.n_tempered_restarts(pt)` and `Pigeons.n_round_trips(pt)`, as suggested in the docs, but they threw an error.
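Rereading the docs, my guess is that these statistics require enabling the round-trip recorder when running PT. A sketch of what I mean, using the docs' toy target rather than my actual model (so this is my reading of the docs, not a verified fix):

```julia
using Pigeons

# Round-trip statistics are only collected if the round_trip recorder
# is enabled for the run:
pt = pigeons(target = toy_mvn_target(100), record = [round_trip])

# With the recorder enabled, these should no longer error:
Pigeons.n_tempered_restarts(pt)
Pigeons.n_round_trips(pt)
```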

Thanks!

@alexandrebouchard
Member

Thanks for reporting this weird behaviour!

Here I strongly suspect that what happens is that the line

`Turing.@addlogprob! logpdf(model, data, n_trials; n_way)`

does not get annealed (hence PT is not able to work properly since the "reference distribution" is just the posterior).

@miguelbiron : I vaguely remember you got the @addlogprob! to work in another project, do you remember how you achieved this?

PS: when trying to replicate the runs using the provided script, I get:

ERROR: LoadError: UndefKeywordError: keyword argument `θil` not assigned
Stacktrace:
 [1] top-level scope
   @ ~/w/debug-quantum/run_pigeons.jl:23
 [2] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [3] top-level scope
   @ REPL[9]:1

@miguelbiron
Collaborator

Hi! Yes you need to control the context, like we do here

`if DynamicPPL.leafcontext(__context__) !== DynamicPPL.PriorContext()`
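To illustrate how that guard fits into a Turing model: the idea is to skip the manual log-likelihood term when the model is evaluated under the prior context, so that the prior alone serves as Pigeons' reference distribution and the likelihood can be annealed. A minimal sketch (the parameter name and the likelihood term here are placeholders, not the actual model from the script):

```julia
using Turing, DynamicPPL

@model function sketch_model(data)
    # The prior doubles as Pigeons' reference distribution.
    θ ~ Uniform(0, 1)

    # Only add the manual likelihood term when we are NOT evaluating the
    # prior; otherwise the "reference" would already be the posterior and
    # PT could not anneal between the two.
    if DynamicPPL.leafcontext(__context__) !== DynamicPPL.PriorContext()
        Turing.@addlogprob! sum(logpdf.(Normal(θ, 1), data))
    end
end
```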

@itsdfish
Contributor Author

Thank you for your replies. I forgot to update the example after changing the name of a parameter. That should be fixed now.

I will try your fix tomorrow and report back. Thanks again!

@itsdfish
Contributor Author

itsdfish commented Feb 14, 2024

I added the if statement, and updated the repo. It seems to have helped. For example, the trace looks better to me:

────────────────────────────────────────────────────────────────────────────
  scans        Λ      log(Z₁/Z₀)   min(α)     mean(α)    min(αₑ)   mean(αₑ) 
────────── ────────── ────────── ────────── ────────── ────────── ──────────
        2       1.68      -75.2   1.96e-12      0.813          1          1 
        4       3.14      -64.7    0.00503      0.651          1          1 
        8       2.89      -59.6      0.447      0.679          1          1 
       16       3.01      -59.8      0.103      0.665          1          1 
       32       2.84      -59.8      0.488      0.685          1          1 
       64       2.99      -59.6      0.354      0.668          1          1 
      128       3.18      -60.1      0.523      0.647          1          1 
      256        3.2      -60.1      0.546      0.644          1          1 
      512       3.03        -60      0.574      0.663          1          1 
 1.02e+03       3.06      -60.2      0.643      0.659          1          1 
────────────────────────────────────────────────────────────────────────────

However, I don't know the normal ranges of the various diagnostics, so maybe the values above are not reasonable. The index process shows fewer jumps, but the amount of jumping still seems like it might be high.

One thing that might help some users, such as myself, is more detail about typical ranges. Would it be worth opening a separate issue to elaborate on the diagnostics (and also on the use of `Turing.@addlogprob!`)?

@alexandrebouchard
Member

Looks better! For Λ, the printed value can be used to set the number of chains: there should be at least around 2Λ chains (see https://pigeons.run/dev/output-pt/#Global-communication-barrier). So in short, you seem to be using enough chains. For αₑ, since this is a slice sampler, higher is better, so no problems there either.
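To make the arithmetic concrete (just plugging the final Λ estimate from the trace above into the rule of thumb):

```julia
# Rule of thumb: use at least roughly 2Λ chains.
Λ = 3.06                        # final Λ estimate from the reported trace
min_chains = ceil(Int, 2 * Λ)   # = 7, below Pigeons' default of 10 chains
```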

But yes, these would be reasonable issues to open regarding documentation. Thanks!
