Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order of Array Parameters is not Preserved by get #311

Open
farr opened this issue Jun 16, 2021 · 4 comments
Open

Order of Array Parameters is not Preserved by get #311

farr opened this issue Jun 16, 2021 · 4 comments

Comments

@farr
Copy link

farr commented Jun 16, 2021

If I have a chain with array parameters, get returns them in some random order, not numeric:

c = Chains(randn(100, 13, 4), ["x[$(i)]" for i in 13:-1:1])
xs = get(c, :x).x

all(xs[1] .== c[Symbol("x[1]")]) # false
all(xs[1] .== c[Symbol("x[13]")]) # true

This is a really bad bug for people who work with arrays....

@devmotion
Copy link
Member

in some random order

They are returned in the order in which they appear in the chain. Since you constructed the chain in such a way that the first parameter belonging to x is x[13], x[13] is the first column returned by get.

@devmotion
Copy link
Member

devmotion commented Jun 17, 2021

You can use sort to reorder the parameters in natural sort order (or any other order but this is the default):

sorted_chain = sort(c)
sorted_xs = get(sorted_chain, :x).x

sorted_xs[1] == c["x[1]"] # true
sorted_xs[2] == c["x[2]"] # true
sorted_xs[11] == c["x[11]"] # true

@AnselmJeong
Copy link

This may affect the reliability of the summarystats() output.

In a multi-level model like this.

y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]
sigma = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]

@model function eight(;y=missing, σ=sigma, N=8)
    y = y===missing ? Array{Float64}(undef, N) : y
    
    μ ~ Normal(0, 5)
    τ ~ truncated(Cauchy(0, 5), 0, Inf)
    θ ~ filldist(Normal(μ, τ), N)
    
    @. y ~ Normal(θ, σ)
end

The obtained chain has θ parameters in haphazard order.

group(chain, :θ)

OUTPUT:

Iterations        = 501:1:1500
Number of chains  = 1
Samples per chain = 1000
Wall duration     = 0.5 seconds
Compute duration  = 0.5 seconds
parameters        = θ[5], θ[7], θ[8], θ[2], θ[1], θ[4], θ[6], θ[3]
internals         = 

Summary Statistics
  parameters      mean       std   naive_se      mcse        ess      rhat   e ⋯
      Symbol   Float64   Float64    Float64   Float64    Float64   Float64     ⋯

        θ[1]    7.0462    6.4852     0.2051    0.4285   268.2643    1.0079     ⋯
        θ[2]    5.3364    5.1565     0.1631    0.2275   351.0179    0.9996     ⋯
        θ[3]    3.7642    5.6877     0.1799    0.3353   375.4434    0.9991     ⋯
        θ[4]    4.8493    5.3475     0.1691    0.3289   346.8819    1.0046     ⋯
        θ[5]    3.1562    5.1128     0.1617    0.2497   351.7596    1.0020     ⋯
        θ[6]    3.8590    5.2401     0.1657    0.3042   358.8378    1.0002     ⋯
        θ[7]    7.2531    5.9002     0.1866    0.3476   262.3535    1.0127     ⋯
        θ[8]    5.1030    6.0514     0.1914    0.3020   414.4987    1.0040     ⋯

The ordering of θ parameters is indicated as θ[5], θ[7], θ[8], θ[2], θ[1], θ[4], θ[6], θ[3] so that the result of the summarystat(chain) reflect this ordering. But the actual printout of summarystat(chain) confuse the user because the parameters column lists the θ parameters as θ[1], θ[2]....
In the above example, the printout mean value of θ[1] 7.0462 is actually the man value of θ[5] (the first one in the order).

As a newbie like me, this weird behavior made me quite frustrating.

Thanks.

@knuesel
Copy link

knuesel commented Aug 4, 2021

I was also bitten by this.

By the way, a similar confusion can arise with simple chains[[...]] indexing:

julia> chains = Chains([10i+j for i in 1:2, j in 1:5], [:a, :b, :c, :d, :e]);

julia> Array(chains[[:d, :a, :e]])
2×3 Matrix{Int64}:
 11  14  15
 21  24  25

Here the matrix columns are not in the "requested" order. (Is there an easy way to get them in the desired order?)

I guess a switch to AxisKeys (as suggested in #231) would help here too...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants