[DRAFT/RFC] Gaussian Copula Cholesky LPDF #3206
base: develop
@spinkney would you have a look at the general signature/idea here? I'm not 100% sure that is the way to go, but I'm not sure of a better alternative.
31116c0 to 8b68066
Jenkins Console Log
Machine information
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal
CPU:
G++:
Clang:
@andrjohns thanks for getting to this! In the signature
    real gaussian_copula_cholesky_lpdf(vector | tuple-of-tuples, cholesky_factor_corr)
does this imply that it is looped over for each observation, that is for each
Yeah, I see that there are a few different options with different pros and cons. I like the current implementation, but I think it would be nice to have additional signatures. Here are a few questions and options:
To make this a bit more concrete:
    data {
      int<lower=0> N;
      int<lower=1> K;           // K different marginals
      array[K] int<lower=0> P;  // number of parameters for each k marginal
      array[N] vector[K] y;     // continuous outcome and an array of vector
    }
    transformed data {
      int P_total = to_int(sum(to_vector(P)));
    }
    parameters {
      vector[P_total] theta_free;
      cholesky_factor_corr[K] L;
    }
    model {
      tuple( tuple(real, ...), ..., tuple(real, ...) ) marginal_lcdf_tuples =
        tuple(
          tuple(k1_lcdf, theta_free[1], theta_free[2], ...),
          tuple(k2_lcdf, theta_free[p], theta_free[p + 1], ...),
          ....,
          tuple(K_lcdf, ..., theta_free[P_total])
        );
      // Can the above also be any container for the parameters?
      // such as
      //
      // tuple( tuple(real, ...), ..., tuple(real, ...) ) marginal_tuples =
      //   tuple(
      //     tuple(k1_lcdf, vector),
      //     tuple(k2_lcdf, array[] real),
      //     ....,
      //     tuple(K_lcdf, ..., matrix)
      //   );
      //
      // Why not also have another input for the lpdfs?
      //
      // tuple( tuple(real, ...), ..., tuple(real, ...) ) marginal_lpdf_tuples =
      //   tuple(
      //     tuple(k1_lpdf, theta_free[1], theta_free[2], ...),
      //     tuple(k2_lpdf, theta_free[p], theta_free[p + 1], ...),
      //     ....,
      //     tuple(K_lpdf, ..., theta_free[P_total])
      //   );
      //
      y ~ gaussian_copula_cholesky(marginal_lcdf_tuples | L);
      // or optionally
      // y ~
      //   gaussian_copula_cholesky(
      //     marginal_lcdf_tuples | marginal_lpdf_tuples, L
      //   );
      //
    }
Just typing that out was tiresome and made me think of another way:
    // lcdf functors
    tuple( real, real, ... ) marginal_lcdf_tuples =
      tuple( k1_lcdf, k2_lcdf, ..., K_lcdf );
    // lpdf functors, same K number of objects in the tuple as marginal_lcdf_tuples
    tuple( real, real, ... ) marginal_lpdf_tuples =
      tuple( k1_lpdf, k2_lpdf, ..., K_lpdf );
    // parameters
    // I'm assuming that the containers in this tuple would match the
    // allowable parameters in their corresponding lcdf/lpdf functor signatures
    tuple( params_k1, ..., params_K ) marginal_params_tuple =
      tuple( k1_params, ..., K_params );
    y ~ gaussian_copula_cholesky(
          marginal_lcdf_tuples |
          marginal_params_tuple,
          marginal_lpdf_tuples, // optional; if omitted, it is assumed you are doing this somewhere else
          L);
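For reference, the role of the optional lpdf tuple can be read off the standard Gaussian copula decomposition of the joint log density. The notation below is an assumption made for illustration and does not come from the PR: F_k and f_k are the k-th marginal CDF and PDF, theta_k its parameters, Phi the standard normal CDF, and L the Cholesky factor of the correlation matrix.

```latex
\begin{aligned}
u_k &= F_k(y_k \mid \theta_k), \qquad z_k = \Phi^{-1}(u_k), \qquad k = 1, \dots, K, \\
\log c(u \mid L) &= -\sum_{k=1}^{K} \log L_{kk}
  - \tfrac{1}{2}\,\lVert L^{-1} z \rVert^2 + \tfrac{1}{2}\,\lVert z \rVert^2, \\
\log p(y \mid \theta, L) &= \log c(u \mid L) + \sum_{k=1}^{K} \log f_k(y_k \mid \theta_k).
\end{aligned}
```

Under this reading, the lcdf functors are enough to evaluate the copula term, and the lpdf functors supply the final sum; if they are omitted, that sum has to be added to the target elsewhere in the model.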
Is it legal to have a user-defined
Yes, we currently allow lpdfs in reduce_sum, and I don't think cdfs would require any different code generation than what we already have.
Summary
Opening this as a draft PR for feedback on the signature/approach.
For the Gaussian copula (and other copula families), the user needs to provide both the y variable and a means of transforming that variable to the unit scale.
The method I've gone with is to require a tuple of the same length as y, where each element is itself a tuple, with a functor for computing the LCDF as the first element and any additional args as the remaining elements.
For example, if the user wanted to model the correlation between a Gamma(2, 1) variable and an Exp(2) variable, they would pass the tuple-of-tuples:
So the final signature would be:
The current framework only supports continuous outcomes, but once we settle on a good approach I can expand to discrete outcomes by requiring that a uniform parameter for data-augmentation is also included in the tuple for that outcome.
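The tuple-of-tuples idea from the summary can be sketched in plain C++17 without any Stan Math dependency. Everything here is an illustrative assumption, not the code in this PR: the names `gaussian_copula_cholesky_sketch`, `eval_lcdf`, `to_normal_scale`, `erlang_lcdf`, and `exponential_lcdf` are hypothetical, the normal quantile is computed by simple bisection, the Gamma marginal is restricted to integer shape (Erlang) so its CDF has a closed form, and only the copula term is returned (no marginal log densities, no autodiff).

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <tuple>
#include <utility>

// Standard normal CDF via the complementary error function.
inline double std_normal_cdf(double z) {
  return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Standard normal quantile by bisection on [-10, 10]: slow but dependency-free.
inline double std_normal_qf(double u) {
  double lo = -10.0, hi = 10.0;
  for (int i = 0; i < 100; ++i) {
    const double mid = 0.5 * (lo + hi);
    if (std_normal_cdf(mid) < u) lo = mid; else hi = mid;
  }
  return 0.5 * (lo + hi);
}

// Hypothetical marginal log CDFs; the outcome is always the first argument.
inline double exponential_lcdf(double y, double rate) {
  return std::log1p(-std::exp(-rate * y));
}

// Gamma with integer shape (Erlang): CDF = 1 - exp(-r y) * sum_{n<shape} (r y)^n / n!.
inline double erlang_lcdf(double y, int shape, double rate) {
  double term = 1.0, sum = 0.0;
  for (int n = 0; n < shape; ++n) {
    sum += term;
    term *= rate * y / (n + 1);
  }
  return std::log1p(-std::exp(-rate * y) * sum);
}

// Unpack one (functor, args...) tuple and evaluate the log CDF at y.
template <typename Marginal>
double eval_lcdf(double y, const Marginal& m) {
  return std::apply(
      [y](const auto& lcdf, const auto&... args) { return lcdf(y, args...); },
      m);
}

// Map each y[k] to the standard normal scale through its own marginal.
template <std::size_t K, typename Tuple, std::size_t... I>
std::array<double, K> to_normal_scale(const std::array<double, K>& y,
                                      const Tuple& marginals,
                                      std::index_sequence<I...>) {
  std::array<double, K> z = {
      {std_normal_qf(std::exp(eval_lcdf(y[I], std::get<I>(marginals))))...}};
  return z;
}

// Copula term only: -sum(log L_kk) - 0.5 * |L^{-1} z|^2 + 0.5 * |z|^2.
template <std::size_t K, typename... Marginals>
double gaussian_copula_cholesky_sketch(
    const std::array<double, K>& y, const std::tuple<Marginals...>& marginals,
    const std::array<std::array<double, K>, K>& L) {
  static_assert(sizeof...(Marginals) == K, "one marginal tuple per outcome");
  const auto z = to_normal_scale(y, marginals, std::make_index_sequence<K>{});
  std::array<double, K> w{};
  double lp = 0.0;
  for (std::size_t i = 0; i < K; ++i) {
    // Forward substitution: row i of L w = z.
    double wi = z[i];
    for (std::size_t j = 0; j < i; ++j) wi -= L[i][j] * w[j];
    wi /= L[i][i];
    w[i] = wi;
    lp += -std::log(L[i][i]) - 0.5 * wi * wi + 0.5 * z[i] * z[i];
  }
  return lp;
}

// The Gamma(2, 1) and Exp(2) example from the summary, with correlation 0.5.
inline double copula_example() {
  const double rho = 0.5;
  const std::array<double, 2> y = {{1.5, 0.4}};
  const auto marginals = std::make_tuple(std::make_tuple(erlang_lcdf, 2, 1.0),
                                         std::make_tuple(exponential_lcdf, 2.0));
  const std::array<std::array<double, 2>, 2> L = {
      {{{1.0, 0.0}}, {{rho, std::sqrt(1.0 - rho * rho)}}}};
  return gaussian_copula_cholesky_sketch(y, marginals, L);
}
```

Packing each functor with its own arguments keeps the number and type of parameters per marginal independent, which is the property the tuple-of-tuples signature is after. With an identity Cholesky factor the sketch returns zero, as expected for independent outcomes.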
Tests
Basic prim and mix tests are added, but the fwd components of the mix tests are currently failing.
Side Effects
Extended a few of the utilities (size, vector_seq_view) for tuples.
Release notes
Replace this text with a short note on what will change if this pull request is merged in which case this will be included in the release notes.
Checklist
Copyright holder: Andrew Johnson
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- the basic tests are passing
  - ./runTests.py test/unit
  - make test-headers
  - make test-math-dependencies
  - make doxygen
  - make cpplint
- the code is written in idiomatic C++ and changes are documented in the doxygen
- the new changes are tested