Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LKJ #1066

Merged
merged 14 commits into from
Feb 14, 2020
Merged

Add LKJ #1066

merged 14 commits into from
Feb 14, 2020

Conversation

johnczito
Copy link
Member

@johnczito johnczito commented Feb 11, 2020

This PR adds the LKJ distribution, which is a distribution over correlation matrices. It can be found in other statistical computing platforms (here, here, or here, for instance).

Note: One of the unit tests runs a hypothesis test, so this PR adds HypothesisTests.jl as an extra testing dependency in the .toml. Let me know if I have done this incorrectly, or if I should exclude this change for now.

@johnczito
Copy link
Member Author

(As I mentioned here, I'm not sure if folks will want to wait on this indefinitely until #951 is sorted out. Hopefully not...)

@codecov-io
Copy link

codecov-io commented Feb 11, 2020

Codecov Report

Merging #1066 into master will increase coverage by 1.17%.
The diff coverage is 98.94%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1066      +/-   ##
==========================================
+ Coverage   79.45%   80.62%   +1.17%     
==========================================
  Files         112      113       +1     
  Lines        5514     5611      +97     
==========================================
+ Hits         4381     4524     +143     
+ Misses       1133     1087      -46
Impacted Files Coverage Δ
src/Distributions.jl 100% <ø> (ø) ⬆️
src/matrixvariates.jl 92.15% <ø> (ø) ⬆️
src/matrix/lkj.jl 98.94% <98.94%> (ø)
src/univariate/continuous/ksonesided.jl 0% <0%> (ø) ⬆️
src/univariate/continuous/locationscale.jl 94.11% <0%> (+2.94%) ⬆️
src/univariate/continuous/ksdist.jl 67.6% <0%> (+67.6%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fb90c3d...2c010df. Read the comment docs.

@mschauer
Copy link
Member

Hi, could you post a screenshot of the docstring of LKJ?

@johnczito
Copy link
Member Author

johnczito commented Feb 11, 2020

Hi, could you post a screenshot of the docstring of LKJ?

Screen Shot 2020-02-11 at 11 45 03 PM

^Updated.

@mschauer
Copy link
Member

Do you know in which sense is this "uniform" if eta=1?

# Section 3.2 in LKJ (2009 JMA)
# 1. Initialization
R = ones(typeof(η), d, d)
β = η + 0.5d - 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe

Suggested change
β = η + 0.5d - 1
β = η + d/2 - 1

or should this have some oftype?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

β won't be promoted to whatever η is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

η = 1f0
d = 4

julia> typeof(η), typeof(η + 0.5d - 1)
(Float32, Float64)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thanks. Since Beta suffers from #960 it doesn't end up mattering, unfortunately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets fix it anyway.

# 2.
for k in 2:d - 1
# (a)
β -= 0.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cf 115

@mschauer
Copy link
Member

One could test that importance sampling is successful, e.g.

E f(R) = E f(U) c_0(eta)/c_0(1) | U | (eta-1) 

for R ~ LKJ(eta) and U ~ LKJ(1) and some functional f

@johnczito
Copy link
Member Author

Do you know in which sense is this "uniform" if eta=1?

The density is constant in R. The determinant term collapses to 1 and you just have f(R; η) = c₀.

@johnczito
Copy link
Member Author

One could test that importance sampling is successful, e.g.

Fair enough. I can add that.

@mschauer
Copy link
Member

Yes, I can see that, but I suspect there is something to say what that distribution actually is. Or in other words, what is the reference measure for the densities?

@johnczito
Copy link
Member Author

johnczito commented Feb 11, 2020

@mschauer It's a density with respect to Lebesgue measure, so uniform here just means the density is one over the Lebesgue measure of the support. The m = d(d-1)/2 free elements of a correlation matrix live in a subset of [-1, 1]ᵐ that has finite and strictly positive Lebesgue measure. So not some measure zero submanifold that requires a funky dominating measure to define a density. More here and here.

Consequently though, the log integrating constant here is currently off by a sign. Fixing that now.

@mschauer
Copy link
Member

Thank you, this is nice. What is the price we pay for the test in terms of time?

test/lkj.jl Outdated Show resolved Hide resolved
@johnczito
Copy link
Member Author

Thank you, this is nice. What is the price we pay for the test in terms of time?

0.21 seconds

@johnczito
Copy link
Member Author

Thanks for all of the help on this, as usual. Let me know what else you want to see for this to be considered for merging.

@matbesancon
Copy link
Member

Looks great to me. @johnczito maybe add a reference in the testset from the last commit to indicate where the stan test set comes from (documentation link or something)

@matbesancon
Copy link
Member

Awesome, the tests were green before, so I'll just merge it here, thanks!

@matbesancon matbesancon merged commit d521695 into JuliaStats:master Feb 14, 2020
@mschauer
Copy link
Member

I now believe that it is uniform in the constraint space of correlation matrices!

using Distributions
using Makie
K = 50000; ρ = [Point3f0(rand(LKJ(3, 1.0))[[2,3,6]]) for k in 1:K]
scatter(ρ, markersize=0.009, color=(:black, 0.2))

lkj

@johnczito
Copy link
Member Author

@mschauer Nice! And that's a code snippet I'll definitely be stealing...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants