New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MVN cdf #114
MVN cdf #114
Conversation
This function is a translation of the same Matlab function (qsimvnv), code by Prof. Alan Genz (used will permission / credit). It uses Monte Carlo integration to compute the Multivariate Normal (MVN) CDF over a rectangular region.
Types added to arguments for qsimvnv() and _chldr() ; tests needed to be updated to be consistent with these types
Note however that the package is under the MIT "Expat" license, and it is unclear if Prof. Genz is also OK with relicensing derived code. |
The tests are failing because the LinearAlgebra depency isn't in the test environment (Primes is also needed). After studying travis, Google Actions, continuous integration, ... I don't know how to fix it. It seems like CompatHelper should fix it? As for the license, StatsFuns has already released code based on Prof. Genz's work -- tvpack.jl. MATLAB includes his code in their commercial product. But I will write him directly for clarification. |
I received a reply from Professor Genz:
So we are good to go on the licensing front. Still need to figure out how to fix the LinearAlgebra depency causing the tests to fail. |
qsimvnv: Input types now AbstractArray{<:Real}, AbstractVector{<:Real}, AbstractVector{<:Real} Updated docstring to reflect that (removed AbstractPDMat) Since BigFloat is a Real, but the function won't finish if used, check for BigFloat and print a warning if found, then convert it to Float64 MVN: Corrected test JuliaStats#4 (test passed, but for wrong matrix and result) added tests 15 & 16 to check for the 2- and 3-dimensional shortcut code for Orthant probabilities added note to ignore warnings on test 5 & 6 -- the isposdef() check is wrong (test 5) or irrelevant (test 6) added test to check for BigFloat warning message
I haven't checked the implementation but IMHO this PR adds too many heavy dependencies. In particular, I am worried about the circular dependency on Distributions. Can they be avoided? Otherwise it might be better to add this functionality to Distributions or a separate package. |
The dependency on the Distributions package is the Normal() distribution, which was "moved" some time ago from StatsFuns to Distributions. Although "norm.jl" is still in StatsFuns; I could call that directly or compute it via an erfc() call; don't know how you'd feel about that. In general, I prefer to call the "Plan Of Record" (POR) code -- the one official function, as it will get maintained over time and get bug fixes. So I reference that function, I get those fixes too. The dependency on LinearAlgebra is a natural from using vectors for a multivariate distribution . The dependency on Random comes from using quasi-random integration. The dependency on Primes is due to the construction of Richtmyer generators for the integration; this could be replaced by a tuple of, say, the first 1000 prime numbers. Although I figured any statistician would probably have Primes loaded anyhow. Prior discussions on Discourse steered me away from adding it to Distributions and instead putting it into StatsFuns. That said, once qsimvnv.jl was added to StatsFuns, it would be fairly trivial to add the CDF function the MvNormal distribution. Some time ago StatsFuns was split from Distributions. @andreasnoack has discussed undoing that and moving the statistics functions back to Distributions. Perhaps MVN could be the first one -- or, more likely, participate in the R-replacement stats funs that @andreasnoack is implementing moving back to Distributions (if you decide to do that). Not my decision, but I would of course cooperate with whatever direction is set. |
FWIW, I still think that a mini-package for this specific functionality would be best. |
So @devmotion, @tpapp -- what's the call? Into StatsFuns.jl or as a separate mini-package? If it's "mini-package":
|
@blackeneth: I would recommend a specific, separate package for this. Yes, create it under your github account, then register if you want to. I can provide some code review on superficial API details once you have set up this package (just ping me), but the actual computation I am not familiar with. |
Adds function "qsimvnv" to /distrs -- this function is a Julia translation of the Matlab "qsimvnv" function, which computes the Multivariate Normal CDF over a rectangular area. The original code is by Prof. Alan Genz of the University of Washington, who allows redistribution of his code if credit is given in the source code (which has been done). A private function _chlrdr is also added, which re-arranges the integration limits and computes the Cholesky decomposition.
A test set is present in /test, called MVN.jl. runtests.jl has been modified to call it. Given that the definition of covariance matrices takes a lot of code space, it was broken out as a separate file for clarity. Total of 33 tests (all pass).
This is my first PR. Please be kind to my mistakes.