random dot product graph generator? #14

jovo · 2013-12-30T15:25:29Z

i didn't see one in there.
there are obviously many ways to implement such a thing, i certainly do not know the best, and we often write papers in which we generate them, presumably, everybody using a different method.
perhaps we should have a reference RDPG generator?

gaborcsardi · 2013-12-30T15:44:22Z

A little bit more specifically? Input is a vector for each vertex, and a similarity function?

jovo · 2013-12-30T16:10:49Z

that's correct, similarity function is just a dot-product.

let n be the number of vertices,
associated with each vertex v is a latent X_v \in R^d for some d (a hyperparameter)
in the undirected case, the probability of an edge between u and v is <X_u,X_v>=sum_i X_u(i) * X_v(i)
so, X \in R^{d*n}
the trouble is, if we want to ensure that X'X \in [0,1], and we want each vector X_u to be sampled independently,
then we must impose the constraint that each X_u \in R^d_+ \cap B(0,1), where B(0,1) is the unit ball centered at zero, and R^d_+ is the non-negative orthant of R^d.

a work-around of this is to just sample each X_u from a dirichlet, which more than satisfies the above constraints. however, if one samples from the dirichlet, one can never get anything whose ||X_v|| = 1 - eps, unless the vector is 1-sparse.

an alternative strategy is to sample vectors on the unit sphere. one can do this by just sampling from a multivariate gaussian, normalizing by its norm. then, just discard everything outside the non-negative orthant, and you get a set of vectors that satisfy the conditions (i imagine there are other ways). but, the "problem" with this is that you get nothing inside the unit ball, only vectors on the boundary. if i correctly understand, this gives us a degree corrected SBM, so it may be desirable in its own right, but it is not exactly what i'm requesting here.

i'm sure you can easily imagine algorithms that do sample from the proper space, some kind of rejection sampler parameterized by a multivariate gaussian i assume would do the trick (basically, a truncated gaussian). i guess the efficiency of that would depend on the parameters of the gaussian, but probably that is ok?

does that help/make sense?

dpmcsuss · 2013-12-30T16:15:20Z

I personally prefer Gabor's suggestion where the input is a vector for each vertex (or a matrix of size n by d, n=#vertices).
If there are non-standard distributions we want for the latent positions, these should be elsewhere.

There are some advantages to combining sampling latent positions and the graph together in one function but it does sacrifice flexibility, I believe.

jovo · 2013-12-30T16:24:54Z

oh, i see what you guys mean!
yes, i also prefer Gabor's suggestion then :)

for clarity, the input would then be a d-by-n matrix.

what would happen if any element of X'X is >1 or <0?
some options:

truncate to 0 or 1 as appropriate, and tell the user that this happened
throw an exception

i vote 1.

i guess another desirable function would be to generate vectors that are
sufficient? so, one function generates the graph from the vectors, another
completely separate function generates feasible vectors?

On Mon, Dec 30, 2013 at 11:15 AM, Daniel Sussman
notifications@github.comwrote:

I personally prefer Gabor's suggestion where the input is a vector for
each vertex (or a matrix of size n by d, n=#vertices).
If there are non-standard distributions we want for the latent positions,
these should be elsewhere.

There are some advantages to combining sampling latent positions and the
graph together in one function but it does sacrifice flexibility, I believe.

—
Reply to this email directly or view it on GitHubhttps://github.com//issues/14#issuecomment-31353042
.

perhaps consider allowing the quest for eudaimonia to guide you
openconnecto.me, jovo.me

gaborcsardi · 2014-01-01T19:32:44Z

Btw. if we distance function is general, then this is quadratic. If we can use some properties of the distance function (e.g. if it is a dot-product and the vectors are on the unit sphere), and the graph is sparse, then we can maybe do something quicker, but it is tricky.

jovo · 2014-01-01T20:33:08Z

quicker sounds good, can you elaborate a bit?
dot-product is a special case of a similarity function,
in particular, of xMx', where M is d-by-d.
for dot-product, M=identity. is that what you mean?

On Wed, Jan 1, 2014 at 2:32 PM, Gabor Csardi notifications@github.comwrote:

Btw. if we distance function is general, then this is quadratic. If we can
use some properties of the distance function (e.g. if it is a dot-product
and the vectors are on the unit sphere), and the graph is sparse, then
we can maybe do something quicker, but it is tricky.

—
Reply to this email directly or view it on GitHubhttps://github.com//issues/14#issuecomment-31428537
.

perhaps consider allowing the quest for eudaimonia to guide you
openconnecto.me, jovo.me

gaborcsardi · 2014-01-01T20:43:13Z

What I mean is that if the function is opaque, you have no other choice but
evaluate it for every potential edge.

If you have a special case, like a dot product, that has some structure,
then you can potentially avoid all evaluations. E.g. if you know that ||x||
< ||y|| and cos(angle(x,z)) < cos(angle(y,z)), then the probability of
connecting x to z is than connecting y to z. You can use this to avoid
evaluating all dot products. It is easier if all the vectors have norm one,
because then you don't need to deal with lengths.

This only makes sense if the graph is sparse, otherwise you have O(n^2)
edges anyway, although if the vectors are long (d>>1) you might still want
to avoid dot product evaluations.

On Wed, Jan 1, 2014 at 3:33 PM, joshua vogelstein
notifications@github.comwrote:

quicker sounds good, can you elaborate a bit?
dot-product is a special case of a similarity function,
in particular, of xMx', where M is d-by-d.
for dot-product, M=identity. is that what you mean?

On Wed, Jan 1, 2014 at 2:32 PM, Gabor Csardi notifications@github.comwrote:

Btw. if we distance function is general, then this is quadratic. If we
can
use some properties of the distance function (e.g. if it is a
dot-product
and the vectors are on the unit sphere), and the graph is sparse, then
we can maybe do something quicker, but it is tricky.

—
Reply to this email directly or view it on GitHub<
https://github.com/gaborcsardi/igraph/issues/14#issuecomment-31428537>
.

perhaps consider allowing the quest for eudaimonia to guide you
openconnecto.me, jovo.me

—

Reply to this email directly or view it on GitHubhttps://github.com//issues/14#issuecomment-31429642
.

gaborcsardi · 2014-01-13T20:46:47Z

OK, so I am ready to write the functions to sample the vectors, and then we'll see if the dot-product generation itself can be made faster than O(n^2).

So, what kind of sampler do we want?

jovo · 2014-01-15T02:44:03Z

well, i guess a couple easy ones might be:

uniform on the intersection of the sphere and non-negative orthant (perhaps with a single parameter to adjust the diameter of the sphere to be between 0 and 1.
dirichlet
uniform in the intersection of the ball & the non-negative orthant (again with a single parameter controlling the diameter of the ball)

slightly harder:

truncated multivariate normal whose mean lives in the intersection of the sphere and non-negative orthant. i would think for this one, we might want to require something like x% of the mass lives in the feasible region.

i'm not sure if others have better suggestions, or a good prioritization of the above.

gaborcsardi · 2014-01-27T20:55:12Z

OK, 1-3 are done, please reopen this issue if you have a more complete description of 4.

jovo · 2014-01-27T23:06:48Z

awesome, thanks!

On Mon, Jan 27, 2014 at 3:55 PM, Gabor Csardi notifications@github.comwrote:

OK, 1-3 are done, please reopen this issue if you have a more complete
description of 4.

—
Reply to this email directly or view it on GitHubhttps://github.com//issues/14#issuecomment-33421623
.

perhaps consider allowing the quest for eudaimonia to guide you
openconnecto.me, jovo.me

gaborcsardi added a commit that referenced this issue Jan 10, 2014

Added dot product graph generator (#14).

4b12aad

gaborcsardi added a commit that referenced this issue Jan 27, 2014

Sample surface and volume of sphere uniformly (#14).

9f7efca

gaborcsardi closed this as completed Jan 27, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

random dot product graph generator? #14

random dot product graph generator? #14

jovo commented Dec 30, 2013

gaborcsardi commented Dec 30, 2013

jovo commented Dec 30, 2013

dpmcsuss commented Dec 30, 2013

jovo commented Dec 30, 2013

gaborcsardi commented Jan 1, 2014

jovo commented Jan 1, 2014

gaborcsardi commented Jan 1, 2014

gaborcsardi commented Jan 13, 2014

jovo commented Jan 15, 2014

gaborcsardi commented Jan 27, 2014

jovo commented Jan 27, 2014

random dot product graph generator? #14

random dot product graph generator? #14

Comments

jovo commented Dec 30, 2013

gaborcsardi commented Dec 30, 2013

jovo commented Dec 30, 2013

dpmcsuss commented Dec 30, 2013

jovo commented Dec 30, 2013

gaborcsardi commented Jan 1, 2014

jovo commented Jan 1, 2014

gaborcsardi commented Jan 1, 2014

gaborcsardi commented Jan 13, 2014

jovo commented Jan 15, 2014

gaborcsardi commented Jan 27, 2014

jovo commented Jan 27, 2014