Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add rand for ProbabilitySimplex #604

Merged
merged 10 commits into from
Jun 24, 2023
Merged

Conversation

adienes
Copy link
Contributor

@adienes adienes commented May 10, 2023

uniformly samples from probability simplex if no point given. otherwise returns a mean-zero vector of gaussians.

@codecov
Copy link

codecov bot commented May 10, 2023

Codecov Report

Merging #604 (5308add) into master (836e716) will increase coverage by 0.00%.
The diff coverage is 94.73%.

@@           Coverage Diff           @@
##           master     #604   +/-   ##
=======================================
  Coverage   99.01%   99.01%           
=======================================
  Files         104      106    +2     
  Lines       10138    10187   +49     
=======================================
+ Hits        10038    10087   +49     
  Misses        100      100           
Impacted Files Coverage Δ
src/Manifolds.jl 86.66% <ø> (ø)
src/manifolds/ProbabilitySimplexEuclideanMetric.jl 90.00% <90.00%> (ø)
src/manifolds/ProbabilitySimplex.jl 100.00% <100.00%> (ø)

... and 3 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@adienes
Copy link
Contributor Author

adienes commented May 11, 2023

sorry should have squashed all those first. had to turn off the random tangent test since isapprox has atol set to 0, and thus isapprox(_, 0) is testing for exact equality.

I think pr is ok now? let me know

Copy link
Member

@kellertuer kellertuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR – contributions are always happily welcome :)

I have two small remarks – and never mind several commits – for a PR that is always good to see also a bit the development, we squash-merge PRs anyways.

src/manifolds/ProbabilitySimplex.jl Outdated Show resolved Hide resolved
test/manifolds/probability_simplex.jl Outdated Show resolved Hide resolved
@mateuszbaran
Copy link
Member

sorry should have squashed all those first. had to turn off the random tangent test since isapprox has atol set to 0, and thus isapprox(_, 0) is testing for exact equality.

The easiest way to handle this is setting rand_tvector_atol_multiplier=1 keyword argument to test_manifold, or a bigger number if needed. I think we should start setting non-zero absolute tolerance for tangent vector tests by default but that's a separate issue 🙂 .

@kellertuer
Copy link
Member

I think we should start setting non-zero absolute tolerance for tangent vector tests by default but that's a separate issue 🙂 .

Good point to keep in mind for #548 .

Comment on lines 301 to 302
_r = sort(vcat(rand(n - 1), [0.0, 1.0]))
pX .= (circshift(_r, -1) .- _r)[1:n]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make fewer allocations and avoid the O(n*log(n)) sorting by using the fact that Dirichlet(alpha=ones(n)) is the uniform distribution on the probability simplex, so the usual Gamma distribution (see https://en.wikipedia.org/wiki/Dirichlet_distribution#Random_variate_generation) approach for drawing from Dirichlet reduces to drawing from Exponential(1):

Suggested change
_r = sort(vcat(rand(n - 1), [0.0, 1.0]))
pX .= (circshift(_r, -1) .- _r)[1:n]
Random.randexp!(pX)
LinearAlgebra.normalize!(pX, 1)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! That should also be reflected in the docs.

Comment on lines 304 to 305
_r = randn(n) .* σ
pX .= _r .- mean(_r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the allocation to r_.

Suggested change
_r = randn(n) .* σ
pX .= _r .- mean(_r)
Random.randn!(pX)
pX .= (pX .- mean(pX)) .* σ

Comment on lines 304 to 305
_r = randn(n) .* σ
pX .= _r .- mean(_r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the allocation to r_.

Suggested change
_r = randn(n) .* σ
pX .= _r .- mean(_r)
Random.randn!(pX)
pX .= (pX .- mean(pX)) .* σ

end
return pX
end
function Random.rand!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know elsewhere in this package we write separate methods for with and without rng, but this was a mistake. In general, we should implement sampling methods with rng and then the one without rng should just dispatch to this by passing Random.default_rng() as the first argument. This is what Random itself does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not aware f that we should then slowly improve our code :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make a PR that improves this.

@kellertuer
Copy link
Member

With your current tolerance factor of 1 you are still checking for machine precision zero, which might be a bit tough (that is why 1.6 failed, maybe raise the factor to 10.0?

test/manifolds/probability_simplex.jl Outdated Show resolved Hide resolved
test/manifolds/probability_simplex.jl Outdated Show resolved Hide resolved
@kellertuer
Copy link
Member

From my side this nearly looks good now, just that the doc string could maybe include a math formula or a reference where to find more information (maybe Seth's wikipedia remark)?

@sethaxen
Copy link
Member

There's a reference here I haven't read: https://stackoverflow.com/a/67202070

@sethaxen
Copy link
Member

sethaxen commented May 12, 2023

Okay, got it. The standard reference is Theorem 2.2, Chapter 5 of

Luc Devroye. Non-Uniform Random Variate Generation. Springer-Verlag, New York, 1986. doi: 10.1007/978-1-4613-8643-8

image

image

@kellertuer
Copy link
Member

Cool, maybe adding that as a footnote reference would be cool (at some point we should aim for some bib package in the docs, let's hope some package like that soon magically appears :D )

@adienes
Copy link
Contributor Author

adienes commented May 12, 2023

thank you for adding the reference. we should probably merge one of this PR or #605 first, then rebase the other? I think as-is they are slightly incompatible

@kellertuer kellertuer added the preview docs Add this label if you want to see a PR-preview of the documentation label May 12, 2023
@kellertuer
Copy link
Member

kellertuer commented May 12, 2023

Oh I might have crashed formatting, sorry for that (editing here has its disadvantages). But I‘ll activate Docs rendering as well now :)

There seems to be a space too much in the footnote (i.e. [^ D...] (the one after ^); and if you could run formatter locally that would be great: in the Manifolds.jl folder run using JuliaFormatter; format(".")

@mateuszbaran
Copy link
Member

thank you for adding the reference. we should probably merge one of this PR or #605 first, then rebase the other? I think as-is they are slightly incompatible

I think we can merge this one first, and then I will fix incompatibility in #605 .

kellertuer
kellertuer previously approved these changes May 12, 2023
Copy link
Member

@kellertuer kellertuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My points are clarified. Thanks for the contribution and the good further ideas.

@kellertuer kellertuer requested a review from sethaxen May 13, 2023 19:46
Copy link
Member

@sethaxen sethaxen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor docstring fix. Also, this should get a patch version bump. Otherwise, LGTM.

src/manifolds/ProbabilitySimplex.jl Outdated Show resolved Hide resolved
src/manifolds/ProbabilitySimplex.jl Outdated Show resolved Hide resolved
@mateuszbaran
Copy link
Member

I think I will merge this and bump version in #605 -- thank you for your contribution 🙂

@kellertuer
Copy link
Member

But we already discussed, that this is w.r.t. the „wrong” metric, i.e. not our default one but the Euclidean (with respect to which the manifold is not complete).

@mateuszbaran
Copy link
Member

Yes, it's not uniform wrt. our metric but we never claimed to always use uniform distributions (for manifolds of finite volume). And we can change it later.

@kellertuer
Copy link
Member

Sure, I would still prefer if we could switch this one to be the one with the metric manifold / euclidean metric?

@mateuszbaran
Copy link
Member

Wait... that's actually a good question. I assumed change_representer and change_metric are fine but when I look at them now they indeed seem to produce vectors that are not tangent. @kellertuer do you remember where you got the formulas from?

@adienes
Copy link
Contributor Author

adienes commented Jun 13, 2023

that commit is just a rebase on master, diff shouldn't have changed

@kellertuer
Copy link
Member

We usually merge, that is easier (for PRs we squash-merge, which makes the master branch relatively clean again).

I will check the change of metric soon.

@kellertuer
Copy link
Member

kellertuer commented Jun 13, 2023

Hm, I am now a bit confused, since the documentation of exp for example follows https://arxiv.org/pdf/1603.05285.pdf, Proposition 2.4, but the implementation – at least as far as I understand it – does something completely different.

The metric conversion seems wrong for now, I might have stayed too close to what is done on the positive numbers (element wise), will check for a correction. The change of representer seems right, though.

edit: Ah no, the change representer is also wrong, but Proposition (2.3) provides the right formula, but then I can probably derive the change metric from that as well. - eh, oh! That formula includes projection which we do before, added that note to the docs, but change representer is correct. Will check how to adapt that though.

@kellertuer
Copy link
Member

kellertuer commented Jun 13, 2023

HM, that's a tough one to solve, because it seems there are only n degrees of freedom (for the change metric) but I currently seem to have n+1 conditions.
Yes, I am convinced we have to remove that method because it is just not possible.

@kellertuer kellertuer dismissed their stale review June 13, 2023 16:12

I have to dismiss my approval since there still seems to be work to be done

@mateuszbaran
Copy link
Member

I've discussed with @kellertuer what is wrong with change_representer and change_metric. I will make a PR fixing them and then you will be able to continue 🙂 . Thank you for your question about validity of change_metric, I wouldn't notice that something is wrong without it.

@kellertuer
Copy link
Member

Yeah, thanks for the discussion it seems there might be a change metric :)

@mateuszbaran
Copy link
Member

change_metric is now fixed on master branch, so you can continue now 🙂

mateuszbaran and others added 4 commits June 16, 2023 20:52
* Add a few exports, mainly reexports from ManifoldsBase.

* Improve caching.

i.e. doing it correctly.

* 📖🩹

at least a bit,

* 📖🩹 2

* export FisherRaoMetric.
@adienes
Copy link
Contributor Author

adienes commented Jun 17, 2023

I had tried also to add some tests for the ProbabilitySimplexEuclideanMetric, but it was very hard to get any to pass since basically only rand is defined. it seems to want retract ...

@mateuszbaran
Copy link
Member

Hm, I think you can just take exp from Euclidean:

exp!(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}, q, p, X) = (q .= p .+ X)
exp!(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}, q, p, X, t::Number) = (q .= p .+ t .* X)

and it should more or less work.

@adienes
Copy link
Contributor Author

adienes commented Jun 17, 2023

g2g?

Co-authored-by: Seth Axen <seth@sethaxen.com>
@mateuszbaran mateuszbaran merged commit d24f6d2 into JuliaManifolds:master Jun 24, 2023
25 of 30 checks passed
@kellertuer
Copy link
Member

But then we should also register 0.8.71 soon, because I think we were just waiting for this PR.

@mateuszbaran
Copy link
Member

Yes, I've even forgotten we have a few changes on master without a new registered version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview docs Add this label if you want to see a PR-preview of the documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants