Faster sampling 2D Riemannian Gaussian #198
Conversation
```diff
@@ -34,15 +34,18 @@
 samples_1 = sample_gaussian_spd(n_matrices=n_matrices,
                                 mean=mean,
                                 sigma=sigma,
-                                random_state=random_state)
+                                random_state=random_state,
+                                sampling_method='rejection')
```
I would introduce a value called "auto" for the sampling_method parameter that would default to "rejection" if dim == 2 and "slice" otherwise, so you don't have to expose these options in the tutorial and it will just become faster for users without any code change.
I'm not sure I understand your point. I have the impression that the default value "None" already fulfils the role of the auto variable you want to define.
Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>
Thx for this PR with such a detailed description! Can you update `test_sampling` too?
Co-authored-by: Quentin Barthélemy <q.barthelemy@gmail.com>
…ests with sampling_method
I would not expose the `sampling_method` in the example. Just keep the default value of the parameter.
besides LGTM
Nice! Thanks for the work. I agree with @agramfort about "auto" rather than None; it is the same idea, but I think it makes it clearer that a choice is made behind the scenes.
LGTM!
Thanks @Artim436 and all for this PR!
Introduction
Hi, everyone.
In this issue, I would like to share an improvement that @plcrodrigues and I made for `_sample_parameter_r`, which is the main bottleneck for sampling SPD matrices from a Riemannian Gaussian distribution via `sample_gaussian_spd`.

Disclaimer: this improvement is only for `n_dim=2`, but we think it will be useful for the community, since many toy models can be reduced to the 2D case.

In summary, our idea is to use rejection sampling in `_sample_parameter_r` instead of the current implementation with slice sampling. To have a first look, here is a graph of the speedup with a fixed choice of `sigma` (i.e. the dispersion of the Gaussian distribution).

Theory
**Breaking the problem in two.** We want to sample the vector $r = (r_1, \dots, r_m)$ from the probability distribution

$$p(r) \propto \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{m} r_i^2\right) \prod_{i<j} \sinh\left(\frac{|r_i - r_j|}{2}\right)$$

which has to be done via some computational method. At the moment, `pyriemann` uses a slice sampling approach [1] to obtain samples from $p(r)$, but this is rather slow. We suggest changing the implementation to a rejection sampling approach [2], exploiting certain properties of $p$ and therefore obtaining better performance.

We consider at first the simplest case for our sampling procedure, that of $m = 2$. The pdf for $r = (r_1, r_2)$ simplifies to:

$$p(r_1, r_2) \propto \exp\left(-\frac{r_1^2 + r_2^2}{2\sigma^2}\right) \sinh\left(\frac{|r_1 - r_2|}{2}\right)$$
We can see this pdf as a mixture of two components depending on a binary variable $b$

$$p(r_1, r_2) = \frac{1}{2}\, p(r_1, r_2 \mid b = 0) + \frac{1}{2}\, p(r_1, r_2 \mid b = 1)$$

with $\mathbb{P}(b = 0) = \mathbb{P}(b = 1) = 1/2$ and

$$p(r_1, r_2 \mid b = 0) \propto \exp\left(-\frac{r_1^2 + r_2^2}{2\sigma^2}\right) \sinh\left(\frac{r_1 - r_2}{2}\right) \mathbb{I}(r_1 > r_2)$$

and

$$p(r_1, r_2 \mid b = 1) \propto \exp\left(-\frac{r_1^2 + r_2^2}{2\sigma^2}\right) \sinh\left(\frac{r_2 - r_1}{2}\right) \mathbb{I}(r_2 > r_1)$$

where $\mathbb{I}$ is the indicator function.
Great, we see that to generate a sample from$p(r_1, r_2)$ we can first generate a Bernoulli variable $b \sim \mathcal{B}(1/2)$ and then sample from one of the conditional distributions. Now the question is: how do we sample from the conditional distributions?
**Sampling from the conditional distributions.** We can use rejection sampling to sample from $p(r \mid b = 0)$, and first we need to find a nice upper bound for it. Write $f(r_1, r_2) = \exp\left(-\frac{r_1^2 + r_2^2}{2\sigma^2}\right) \sinh\left(\frac{r_1 - r_2}{2}\right) \mathbb{I}(r_1 > r_2)$ for its unnormalized density. Considering the inequality, valid for all $x > 0$,

$$\sinh(x) \le \frac{e^x}{2}$$

we can write

$$f(r_1, r_2) \le \frac{1}{2} \exp\left(-\frac{r_1^2 + r_2^2}{2\sigma^2}\right) \exp\left(\frac{r_1 - r_2}{2}\right)$$

and, through some rearrangements,

$$f(r_1, r_2) \le \pi\sigma^2 \exp\left(\frac{\sigma^2}{4}\right) \cdot \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(r_1 - \sigma^2/2)^2 + (r_2 + \sigma^2/2)^2}{2\sigma^2}\right)$$

The expression above indicates that if we want to sample from $p(r \mid b = 0)$ we can do it via rejection sampling using as auxiliary distribution

$$g_+(r_1, r_2) = \mathcal{N}(r_1;\, \sigma^2/2,\, \sigma^2)\; \mathcal{N}(r_2;\, -\sigma^2/2,\, \sigma^2)$$

The algorithm goes as follows:

1. Sample $u \sim \mathcal{U}(0, 1)$ and $r \sim g_+(r)$
2. Check whether $u < \dfrac{f(r)}{M\, g_+(r)}$; if so, accept $r$, otherwise reject it and go back to step 1,

where $M = \pi\sigma^2 \exp(\sigma^2/4)$.
As you can imagine, sampling from $p(r \mid b = 1)$ follows the same logic but with a different auxiliary pdf

$$g_-(r_1, r_2) = \mathcal{N}(r_1;\, -\sigma^2/2,\, \sigma^2)\; \mathcal{N}(r_2;\, \sigma^2/2,\, \sigma^2)$$
The algorithm
Summing up, the new implementation that we propose for `_sample_parameter_r` is based on the following algorithm:

1. Sample $b \sim \mathcal{B}(1/2)$
2. If $b = 0$, sample $r$ from $p(r \mid b = 0)$; otherwise, sample $r$ from $p(r \mid b = 1)$
3. Repeat until the desired number of samples is obtained

The sampling from the conditional distributions is done following the rejection sampling procedure described above.
Why not consider more dimensions?

A natural question to ask is why we have not considered cases with more than two dimensions. Well, things can get quite complicated when $m$ increases...
For instance, suppose we have $r = (r_1, r_2, r_3)$; then the pdf of interest can be written as

$$p(r_1, r_2, r_3) \propto \exp\left(-\frac{r_1^2 + r_2^2 + r_3^2}{2\sigma^2}\right) \sinh\left(\frac{|r_1 - r_2|}{2}\right) \sinh\left(\frac{|r_1 - r_3|}{2}\right) \sinh\left(\frac{|r_2 - r_3|}{2}\right)$$
To use the same strategy from our 2D example, we would have to sample three Bernoulli random variables (one for each factor in the product) and consider the $2^3 = 8$ possible combinations of signs inside each of the $\sinh$. We have tried to implement this case, but our first results indicate an acceptance probability that is too small, making the rejection sampling algorithm impractical.
Moreover, we see that the number of conditional distributions increases as $2^{m(m-1)/2}$ (one binary variable per $\sinh$ factor), where $m$ is the dimensionality of the SPD matrices being considered. Therefore, our algorithm does not look scalable to larger matrix dimensions.
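As a quick sanity check of this counting (one binary choice per $\sinh$ factor, i.e. per pair $i < j$; the helper name is ours), the number of conditional distributions grows very fast with $m$:

```python
def n_conditionals(m):
    """Number of sign patterns: one binary choice per sinh factor,
    i.e. per pair (i, j) with i < j."""
    n_pairs = m * (m - 1) // 2
    return 2 ** n_pairs

print([n_conditionals(m) for m in range(2, 6)])  # [2, 8, 64, 1024]
```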
Final remarks
This is it: we have obtained a much faster implementation for sampling 2D Gaussian SPD matrices than what was available in `pyriemann` so far. We should mention that our implementation is based on a `while` loop that stops once we have obtained the desired number of samples. However, we can use certain properties of the rejection sampling algorithm to calculate the probability of acceptance of a sample and write code that generates several candidates up front. Such an algorithm can be even faster than the one we have implemented, but it requires approximating the normalizing constant of $p(r_1, r_2)$, which can be cumbersome. We will leave this extension for a future PR.
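For illustration only, here is a vectorized variant that draws candidate batches instead of looping one sample at a time; it sidesteps the normalizing constant with a crude oversampling heuristic, so the function name and the `oversample` factor are our assumptions, not part of the PR:

```python
import numpy as np

def sample_r_2d_batch(sigma, n_samples, rng=None, oversample=1.5):
    """Batched rejection sampling: draw many candidates at once from the
    Gaussian proposal and keep the accepted ones until enough are collected."""
    rng = np.random.default_rng(rng)
    kept = np.empty((0, 2))
    while len(kept) < n_samples:
        n = int(oversample * (n_samples - len(kept))) + 1
        r1 = rng.normal(sigma**2 / 2, sigma, size=n)
        r2 = rng.normal(-sigma**2 / 2, sigma, size=n)
        # accept with probability 1 - exp(-(r1 - r2)) on the half-plane r1 > r2
        ok = (r1 > r2) & (rng.uniform(size=n) < 1.0 - np.exp(-(r1 - r2)))
        kept = np.vstack([kept, np.column_stack([r1[ok], r2[ok]])])
    kept = kept[:n_samples]
    swap = rng.integers(2, size=n_samples).astype(bool)  # b ~ Bernoulli(1/2)
    kept[swap] = kept[swap, ::-1]  # mirror for the b = 1 component
    return kept
```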