Add stochastic Kronecker graph model to networkx.generators #1031

Open
wants to merge 3 commits into base: main
Contributor

@nadesai nadesai commented Dec 11, 2013

I've added two simple implementations - one O(V^2) and one "fast" O(E) - of the stochastic Kronecker graph generation model (described in "Kronecker graphs: an approach to modeling networks" by Leskovec et al.) to random_graphs.py. I also added some tests of these generators.

This is my first pull request to NetworkX, so I'm probably doing something wrong - please let me know what I should change or add.

@chebee7i
Member

One quick comment: We try, as much as possible, to keep line widths less than 80 characters. So maybe try to reformat some of the docstrings to stay within this (soft) boundary.

@bjedwards
Member

It's been a while since I've read this paper. Isn't the definition provided the Stochastic Kronecker Graph? Having both might be nice, even if the Kronecker graph function was just a wrapper around the stochastic version. The deterministic version might instead take an initiator graph as in the original paper, which the adjacency matrix could be drawn from.


Parameters
----------
mtx : square matrix of floats
Member

Maybe use P to be consistent with the notation in the paper.

@nadesai
Contributor Author

nadesai commented Dec 11, 2013

@bjedwards Should be straightforward to add that. Where should it go?

@bjedwards
Member

That is a good question. It might be a good idea to put a new sub-module in the generators module. Maybe product_graph.py? If someone were willing, Drew Conway's Graph Motifs Model could be included there too. It's similar, but I think it uses the Cartesian product instead of the Kronecker product. However, where this should go might be better answered by one of the core devs, @hagberg or @dschult.

@bjedwards
Member

Since no one has answered in a few days, I'll make a suggestion and see how it goes over. I would create a new file, networkx/generators/product.py, and include kronecker_graph, fast_kronecker_graph, stochastic_kronecker_graph, and maybe fast_stochastic_kronecker_graph. Truthfully, if the fast version works the same as the regular version, perhaps we don't need the regular version. I know gnp and fast_gnp are both included, but that may be because the slow version is a sort of reference implementation.

@hagberg
Member

hagberg commented Dec 16, 2013

@bjedwards's proposal is good. I agree that we could just include the "fast" version if that makes sense. (The "slow" gnp generator is indeed included for its reference and historical value.)

@nadesai
Contributor Author

nadesai commented Jan 4, 2014

The "fast" and "slow" versions are not identical. If we compare their runs when generating a directed graph with inputs P [n-by-n matrix] and k:

  • "Slow" computes P Kronecker-powered k times, generating an (n^k) by (n^k) stochastic adjacency matrix. It then steps through each cell of this matrix, flipping a biased coin to determine whether the corresponding edge belongs in the final graph.
  • "Fast" computes F, the expected value of |E(G)|. Here, G is a random variable representing the space of all graphs generated by the "slow" process, E(G) is a RV representing the edge sets of such graphs, and |E(G)| is a RV representing the sizes of such edge sets. (It turns out F=S^k where S is the sum of all the elements of P.) It then uses RMAT-style recursive descent to place exactly F edges in the graph. (As a consequence, every graph generated using the "fast" strategy is constrained to have exactly F edges.)
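For concreteness, the "slow" strategy in the first bullet can be sketched with NumPy as below. This is only an illustration under stated assumptions, not the code in this pull request: the helper names `kronecker_power` and `slow_kronecker_edges` and the example initiator matrix are hypothetical. It also checks numerically that the entries of the k-th Kronecker power sum to S^k.

```python
import numpy as np


def kronecker_power(P, k):
    """Return the k-th Kronecker power of the square matrix P (k >= 1)."""
    P = np.asarray(P, dtype=float)
    result = P
    for _ in range(k - 1):
        result = np.kron(result, P)
    return result


def slow_kronecker_edges(P, k, seed=None):
    """Sample directed edges by flipping one biased coin per matrix cell."""
    rng = np.random.default_rng(seed)
    probs = kronecker_power(P, k)            # shape (n**k, n**k)
    coins = rng.random(probs.shape) < probs  # one Bernoulli trial per cell
    rows, cols = np.nonzero(coins)
    return list(zip(rows.tolist(), cols.tolist()))


# Hypothetical 2-by-2 initiator; its entries sum to S = 2.0.
P = [[0.9, 0.5], [0.5, 0.1]]
k = 3
expected_edges = kronecker_power(P, k).sum()  # matches S**k up to rounding
edges = slow_kronecker_edges(P, k, seed=0)
```

The work is proportional to the number of cells, (n^k)^2, which is where the O(V^2) cost of this variant comes from.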

I'm not sure how useful it would be to use "fast" as a replacement for "slow." The set of graphs that could be generated by "fast" is almost always a proper subset of those that could be generated by "slow." It also does not seem to generalize as well to undirected graphs (there is no analogue to the F=S^k formula in the undirected case, so currently my workaround is generating a directed graph from P and converting it to an undirected graph afterwards). The only real advantage to using "fast" is the O(F) runtime.
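A minimal sketch of the "fast" strategy, again with hypothetical names and not taken from this pull request: each edge is placed by choosing one cell of P per level, with probability proportional to its entry, for k levels. Two details are assumptions of the sketch: F is rounded to the nearest integer, and duplicate placements are discarded and retried until F distinct edges exist. The last line shows the undirected workaround mentioned above, collapsing each directed edge to an unordered pair.

```python
import random


def fast_kronecker_edges(P, k, seed=None):
    """Place round(S**k) distinct directed edges by recursive descent."""
    rng = random.Random(seed)
    n = len(P)
    cells = [(i, j) for i in range(n) for j in range(n)]
    weights = [P[i][j] for i, j in cells]
    target = round(sum(weights) ** k)  # assumption: F rounded to an integer
    edges = set()
    while len(edges) < target:
        u = v = 0
        for _ in range(k):
            # Descend one level: pick a quadrant with probability ~ P[i][j].
            i, j = rng.choices(cells, weights=weights)[0]
            u = u * n + i
            v = v * n + j
        edges.add((u, v))              # duplicates are dropped by the set
    return edges


P = [[0.9, 0.5], [0.5, 0.1]]
edges = fast_kronecker_edges(P, 3, seed=0)

# Undirected workaround: treat (u, v) and (v, u) as the same edge.
undirected = {(min(u, v), max(u, v)) for u, v in edges}
```

Because collapsing can merge (u, v) with (v, u), the undirected edge count may fall below F, which is one way to see why the directed formula does not carry over directly.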

Because of this, I think that it would make sense to either include just the "slow" implementation (which can generate the full space of possible graphs) or both "slow" and "fast". I think there might be problems if "fast" were the only Kronecker generator available. The name fast_kronecker_random_graph may also be confusing for the "fast" strategy, since it does not have the same properties as kronecker_random_graph, so perhaps we should rename it to something like kronecker2_random_graph.

@nadesai
Contributor Author

nadesai commented Jan 4, 2014

@bjedwards I'll go ahead then and move these to a separate generators/product.py file and add a wrapper function for deterministically generating Kronecker graphs from a NetworkX initiator graph and/or adjacency matrix.

@bjedwards
Member

@nadesai Thanks for the info on the difference. I see now: it is sort of like the difference between gnm_random_graph and gnp_random_graph. I agree that 'fast' is probably the wrong word, even though that is the word they use in the paper. expected_edges_kronecker_graph? That seems a bit cumbersome. Just kronecker_random_graph_2 might be right.
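The analogy can be seen with the two existing NetworkX generators: gnp_random_graph flips a coin per node pair, so its edge count varies from run to run, while gnm_random_graph always returns exactly m edges. The values of n, p, and m below are arbitrary, chosen so that m is the expected edge count of the G(n, p) model.

```python
import networkx as nx

n, p = 41, 0.1
m = 82  # expected number of edges in G(n, p): p * n * (n - 1) / 2

G_np = nx.gnp_random_graph(n, p, seed=1)  # edge count is random
G_nm = nx.gnm_random_graph(n, m, seed=1)  # edge count is always m

print(G_np.number_of_edges(), G_nm.number_of_edges())
```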

@nadesai
Contributor Author

nadesai commented Mar 5, 2014

This refactoring has been (belatedly) done.

@hagberg hagberg added this to the networkx-1.9 milestone Mar 6, 2014
@nadesai
Contributor Author

nadesai commented Mar 12, 2014

The code as it stands does not work with NumPy matrices. Since NumPy matrices seem to be the standard matrix type, I will switch to working with those.

There exists a function (networkx.algorithms.operators.product.tensor_product) to compute the Kronecker product of two graphs (not matrices), which could be leveraged for the deterministic generator. I also just found out there exists a numpy function for computing the Kronecker product of two matrices (numpy.kron), which could be leveraged for the "regular" Kronecker generator. I will work on this as well.
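As a quick sanity check of how the two functions relate, the sketch below builds the tensor product of a small graph with itself and compares its adjacency matrix with numpy.kron applied to the graph's own adjacency matrix. The choice of a 3-node path graph as initiator is arbitrary, and the node ordering passed to to_numpy_array is chosen to match the block layout that numpy.kron produces.

```python
import networkx as nx
import numpy as np

G = nx.path_graph(3)                      # arbitrary small initiator graph
nodes = list(G)
A = nx.to_numpy_array(G, nodelist=nodes)

# Graph route: nodes of the product are pairs (u, v).
GG = nx.tensor_product(G, G)
pair_order = [(u, v) for u in nodes for v in nodes]
A_graph = nx.to_numpy_array(GG, nodelist=pair_order)

# Matrix route: Kronecker product of the adjacency matrix with itself.
A_matrix = np.kron(A, A)

same = np.array_equal(A_graph, A_matrix)
print(same)
```

Repeating either route k - 1 times would give a deterministic Kronecker graph on n^k nodes.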

Add a new nx.generators.product module containing random graph generator
algorithms based on graph products. To begin with, it contains an
implementation of the stochastic Kronecker model (described in Leskovec et al.,
https://arxiv.org/abs/0812.4905).
Base automatically changed from master to main March 4, 2021 18:20

5 participants