GraphSAGENodeGenerator in GraphSAGE #2063

Open
kakaroa opened this issue Nov 8, 2022 · 1 comment

kakaroa commented Nov 8, 2022

A question about the weighted parameter in GraphSAGENodeGenerator: if weighted=True, does the sampling process pick the first n neighbours with the highest weights? And can a weighted undirected graph be used as the original input data?

@cloudmadeofcandy

I traced the chain of calls behind the generator, specifically:

GraphSAGENodeGenerator -> _samplers = SeededPerWalk(SampledBreadthFirstWalk) -> _sample_neighbours_untyped() -> naive_weighted_choices()

This led me to the definition of "naive_weighted_choices":

def naive_weighted_choices(rs, weights, size=None):
    """
    Select indices at random, weighted by the iterator `weights` of
    arbitrary (non-negative) floats. That is, `x` will be returned
    with probability `weights[x]/sum(weights)`.

    For doing a single sample with arbitrary weights, this is much (5x
    or more) faster than numpy.random.choice, because the latter
    requires a lot of preprocessing (normalized probabilities), and
    does a lot of conversions/checks/preprocessing internally.
    """
    probs = np.cumsum(weights)
    total = probs[-1]
    if total == 0:
        # all weights were zero (probably), so we shouldn't choose anything
        return None

    thresholds = rs.random() if size is None else rs.random(size)
    idx = np.searchsorted(probs, thresholds * total, side="left")

    return idx

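So, with weighted=True the sampler does not simply take the n highest-weight neighbours. Each draw is random: the weights are normalized by their sum, and neighbour x is picked with probability weights[x]/sum(weights). Heavier edges are chosen more often, but lighter ones can still be sampled.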
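
To make this concrete, here is a minimal sketch that restates the quoted function so it runs on its own and checks the sampling frequencies empirically. The example weights, the sample count, and the use of np.random.default_rng as the random source are my own assumptions for illustration; the library passes its own random state in.

```python
import numpy as np


def naive_weighted_choices(rs, weights, size=None):
    # Restated from the definition quoted above so this sketch is standalone.
    probs = np.cumsum(weights)  # running totals, e.g. [1, 4, 10]
    total = probs[-1]
    if total == 0:
        # all weights were zero, so nothing is chosen
        return None
    # uniform draws in [0, 1), scaled to [0, total)
    thresholds = rs.random() if size is None else rs.random(size)
    # index of the first running total that is >= the scaled draw
    return np.searchsorted(probs, thresholds * total, side="left")


# Assumption: a NumPy Generator stands in for the library's random state.
rng = np.random.default_rng(0)

# Hypothetical edge weights of three neighbours of one node.
weights = [1.0, 3.0, 6.0]

draws = naive_weighted_choices(rng, weights, size=100_000)
freq = np.bincount(draws, minlength=len(weights)) / draws.size
expected = np.array(weights) / np.sum(weights)

print("empirical frequencies:", freq)
print("weights / sum(weights):", expected)
```

The empirical frequencies should land close to weights/sum(weights), and the lowest-weight neighbour still shows up in the draws, which is what distinguishes this from a "top n by weight" selection.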

Hope that this helps!!!

Quân Trương
