Ambiguity in Barabási–Albert possibly worth investigating #1534

mlhetland · 2021-02-18T10:34:41Z

This is not an issue with LightGraphs in itself, but possibly worth looking at, nonetheless. A recent preprint by Stamatelatos and Efraimidis (About Weighted Random Sampling in Preferential Attachment Models) discusses an ambiguity that seems pervasive in descriptions of the Barabási–Albert model. For example, they say:

In particular, concerning the common definitions of the BA model, the expression “select node i with probability pᵢ” may be interpreted in at least 3 different ways:

Each individual node i gets a link with an independent probability pᵢ.

Each node i has a selection probability pᵢ in a scheme where the m nodes are selected one by one.

The inclusion probability of i appearing in the collection of m nodes is pᵢ.

These interpretations are just 3 examples that are not equivalent to each other but all abide by the “probability proportional to degree” requirement imposed by the BA model.

[…]

[T]he majority of the BA algorithms do not follow the BA model strictly as imprinted in the analytical formulation of the model; most of the theoretical algorithms in the literature as well as the implementations in major open source frameworks consider the draw-by-draw scheme.

They argue that one cannot say that there is a single correct sampling scheme. Rather, in their conclusion, they say:

In this paper we have identified a subtle ambiguity in the definition of the Barabasi-Albert model by taking into consideration the field of unequal probability random sampling. We showed that the definition of the model, in particular the part where m nodes are selected based on their degrees, is open to interpretation and can lead to graphs with unexpected properties. For this reason, we proposed an extension to the definition in the form of a new parameter that dictates the exact weighted random sampling scheme.

They refer to previous publications that describe “[s]everal unequal probability sampling designs”, but they limit themselves to three (conditional Poisson, draw-by-draw and strπps).

It may not be a high priority to implement multiple sampling schemes, but to begin with it might be useful to at least document the choice being made (possibly even referring to Stamatelatos and Efraimidis for a discussion about the ambiguity in the definition), describing explicitly how the random sampling is done.

sbromberger · 2021-02-18T14:05:38Z

This sounds like a good approach (clarifying in the documentation which choice was made).

sbromberger added discussion requires in-depth discussion - participation encouraged doc related to documentation labels Feb 18, 2021

gdalle mentioned this issue Nov 11, 2021

[Port] Ambiguity in Barabási–Albert possibly worth investigating JuliaGraphs/Graphs.jl#47

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ambiguity in Barabási–Albert possibly worth investigating #1534

Ambiguity in Barabási–Albert possibly worth investigating #1534

mlhetland commented Feb 18, 2021

sbromberger commented Feb 18, 2021

Ambiguity in Barabási–Albert possibly worth investigating #1534

Ambiguity in Barabási–Albert possibly worth investigating #1534

Comments

mlhetland commented Feb 18, 2021

sbromberger commented Feb 18, 2021