Add LKJ bijector #125
Conversation
This is really great work @yiyuezhuo! Thank you so much!
I've added quite a few comments, mostly suggested style changes, but also some questions so I can understand what's going on.
I haven't looked at the correctness of the implementation; I'll do that next 👍
src/bijectors/corr.jl
Outdated
# https://mc-stan.org/docs/2_23/reference-manual/correlation-matrix-transform-section.html
# (7/30/2020) their "manageable expression" is wrong...

function upper1(AT, A)
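For context on the Stan transform linked in the comment: the correlation-matrix transform maps unconstrained values through tanh to canonical partial correlations and then builds the Cholesky factor column by column. Here is a minimal NumPy sketch of that construction (an illustration, not the PR's Julia implementation; the name inv_link_corr is made up):

```python
import numpy as np

def inv_link_corr(y, K):
    """Map K*(K-1)/2 unconstrained values to the upper Cholesky
    factor W of a correlation matrix (Stan's recursive construction)."""
    z = np.tanh(y)                 # canonical partial correlations in (-1, 1)
    W = np.zeros((K, K))
    W[0, 0] = 1.0
    idx = 0
    for j in range(1, K):
        remaining = 1.0            # remaining squared norm of column j
        for i in range(j):
            W[i, j] = z[idx] * np.sqrt(remaining)
            remaining -= W[i, j] ** 2
            idx += 1
        W[j, j] = np.sqrt(remaining)   # each column ends with unit norm
    return W

rng = np.random.default_rng(0)
K = 4
y = rng.normal(size=K * (K - 1) // 2)
W = inv_link_corr(y, K)
Sigma = W.T @ W                    # correlation matrix
print(np.allclose(np.diag(Sigma), 1.0))        # unit diagonal
print(np.all(np.linalg.eigvalsh(Sigma) > 0))   # positive definite
```

Because every column of W has unit norm by construction, W.T @ W automatically has a unit diagonal and is positive definite.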
Can we make use of UpperTriangular (which uses a view) instead of copying the matrix?
PDBijector uses a similar copy-based implementation: https://github.com/TuringLang/Bijectors.jl/blob/master/src/bijectors/pd.jl#L16 (cf. https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/triangular.jl#L164). Besides, I think I can just remove this function if https://github.com/TuringLang/Bijectors.jl/blob/master/test/transform.jl#L67 is not required.
Co-authored-by: Tor Erlend Fjelde <tor.github@gmail.com>
Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
Something seems broken 🤔; I will push the remaining modifications and add an IJulia gist to show correctness later.
src/bijectors/corr.jl
Outdated
return r + zero(x) # keep LowerTriangular until here can avoid some computation
# This dense format itself is required by a test, though I can't get the point.
# https://github.com/TuringLang/Bijectors.jl/blob/b0aaa98f90958a167a0b86c8e8eca9b95502c42d/test/transform.jl#L67
Bijectors.jl/test/transform.jl, line 66 in b0aaa98:

# This is a quirk of the current implementation, of which it would be nice to be rid.

Why the + zero(x)? This hack is really ugly and unintuitive. E.g., if we want an Array in the end (and I'm questioning that), it would be much more natural to just call Array(r) instead of some weird addition.
correlation matrix: Test Failed at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:67
Expression: typeof(x) == typeof(y)
Evaluated: Array{Float64,2} == UpperTriangular{Float64,Array{Float64,2}}
Stacktrace:
[1] single_sample_tests(::LKJ{Float64,Int64}) at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:67
[2] top-level scope at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:203
[3] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[4] top-level scope at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:201
correlation matrix: Test Failed at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:82
Expression: typeof(xs) == typeof(ys)
Evaluated: Array{Array{Float64,2},1} == Array{UpperTriangular{Float64,Array{Float64,2}},1}
Stacktrace:
[1] multi_sample_tests(::LKJ{Float64,Int64}, ::Array{Float64,2}, ::Array{Array{Float64,2},1}, ::Int64) at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:82
[2] top-level scope at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:220
[3] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[4] top-level scope at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:201
Test Summary: | Pass Fail Total
correlation matrix | 9 2 11
ERROR: LoadError: LoadError: Some tests did not pass: 9 passed, 2 failed, 0 errored, 0 broken.
in expression starting at /home/yiyuezhuo/.julia/dev/Bijectors/test/transform.jl:199
in expression starting at /home/yiyuezhuo/.julia/dev/Bijectors/test/runtests.jl:22
ERROR: Package Bijectors errored during testing
I don't know why this test exists at all. My suggestion would be to just remove (or comment out) the test. But I guess @torfjelde might know why the test exists?
I can see some benefits to asking for a dense matrix now. Type-based AD backends such as ReverseDiff and Tracker face the problem of how to place their tracked types. For example, in

function (b::CorrBijector)(x::AbstractMatrix{<:Real})
    w = cholesky(x).U
    @show typeof(w)
    r = _link_chol_lkj(w)

the backends report:

ForwardDiff: typeof(w) = UpperTriangular{ForwardDiff.Dual{ForwardDiff.Tag{typeof(b_f),Float64},Float64,9},Array{ForwardDiff.Dual{ForwardDiff.Tag{typeof(b_f),Float64},Float64,9},2}}
Zygote: typeof(w) = UpperTriangular{Float64,Array{Float64,2}}
ReverseDiff: typeof(w) = UpperTriangular{ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}}},ReverseDiff.TrackedArray{Float64,Float64,2,Array{Float64,2},Array{Float64,2}}}
Tracker: typeof(w) = TrackedArray{…,UpperTriangular{Float64,Array{Float64,2}}}

So, in fact, ReverseDiff's type can not be matched by
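The link direction mentioned here (_link_chol_lkj, i.e. Cholesky factor back to unconstrained values) is the inverse of Stan's recursive construction. A self-contained NumPy round-trip sketch, with illustrative names (link_chol / inv_link_chol are not the PR's functions):

```python
import numpy as np

def inv_link_chol(y, K):
    """Unconstrained vector -> upper Cholesky factor of a correlation
    matrix (Stan's recursive construction; illustration only)."""
    z = np.tanh(y)
    W = np.zeros((K, K))
    W[0, 0] = 1.0
    idx = 0
    for j in range(1, K):
        remaining = 1.0
        for i in range(j):
            W[i, j] = z[idx] * np.sqrt(remaining)
            remaining -= W[i, j] ** 2
            idx += 1
        W[j, j] = np.sqrt(remaining)
    return W

def link_chol(W):
    """Inverse map: upper Cholesky factor -> unconstrained vector."""
    K = W.shape[0]
    y = []
    for j in range(1, K):
        remaining = 1.0
        for i in range(j):
            y.append(np.arctanh(W[i, j] / np.sqrt(remaining)))
            remaining -= W[i, j] ** 2
    return np.array(y)

rng = np.random.default_rng(1)
K = 4
y = rng.normal(size=K * (K - 1) // 2)
W = inv_link_chol(y, K)
# Julia's cholesky(x).U corresponds to the transpose of NumPy's lower factor
U = np.linalg.cholesky(W.T @ W).T
print(np.allclose(link_chol(U), y))   # round trip recovers y
```

Note the sketch works on a plain dense array throughout, which sidesteps the wrapper-type mismatches described above; the analogous Julia code has to decide whether w stays an UpperTriangular or becomes dense.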
Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
…/Bijectors.jl into Stan_corr_impl_adjoint
While

# pd.jl
(b::PDBijector)(X::AbstractMatrix{<:Real}) = pd_link(X)
function pd_link(X)
    Y = lower(parent(cholesky(X; check = true).L))
    return replace_diag(log, Y)
end

lower(A::AbstractMatrix) = convert(typeof(A), LowerTriangular(A))

# zygote.jl
@adjoint function pd_link(X::AbstractMatrix{<:Real})
    return pullback(X) do X
        Y = cholesky(X; check = true).L
        return replace_diag(log, Y)
    end
end

@adjoint function lower(A::AbstractMatrix)
    return lower(A), Δ -> (lower(Δ),)
end

# tracker.jl
lower(A::TrackedMatrix) = track(lower, A)
@grad function lower(A::AbstractMatrix)
    Ad = data(A)
    return lower(Ad), Δ -> (lower(Δ),)
end

# reversediff.jl
lower(A::TrackedMatrix) = track(lower, A)
@grad function lower(A::AbstractMatrix)
    Ad = value(A)
    return lower(Ad), Δ -> (lower(Δ),)
end
I don't think it should matter to us (apart from dispatching, but do we actually dispatch on these types?) how the AD backends handle the tags.
I have just two minor comments left, IMO we can merge this if they are addressed and tests pass. Could you also bump the version number to 0.8.3 so that we can make a new release (after we fixed the Distribution bounds)?
Great work and thanks a lot for this implementation (and your patience with all of my comments 😄)!
@devmotion I appreciate your comments and suggestions very much and have learned many Julia best practices and numerical tricks from them. 🎉
Thanks again @yiyuezhuo!
Fix #108
Implement LKJ bijector (transform), following Stan's spec.
Since #124, I guess the CI will not pass, though all PR-related tests should pass.
Implement #108 example:
Setup:
Model definition:
Since truncated(Cauchy(0., 5.), 0., Inf) is a weakly informative prior, which sometimes gives too large a value and breaks sampling, specifying an initial value is useful.

While tests passed for ForwardDiff, ReverseDiff, Zygote, and Tracker, the above model will not work on the Tracker backend.

The NUTS sampler is not that robust for this model, because errors such as non-positive-definiteness will break the whole sampling run rather than just rejecting the current step like Stan does, so the following adaptation is proposed, according to TuringLang/Turing.jl#702.

The above code will only work for the ForwardDiff and ReverseDiff backends.