# Evaluating DeepWalk, Walklets and Node2Vec trained on exact and approximated random walks

In this notebook we build an evaluation pipeline to measure how the performance of DeepWalk, Walklets and Node2Vec differs, if at all, when the models are trained on exact versus approximated random walks. For DeepWalk, of course, this only applies when the considered graph is weighted.

We will run these experiments on the Homo sapiens graph from STRING.

In [1]:
from grape.edge_prediction import edge_prediction_evaluation, PerceptronEdgePrediction
from grape.embedders import DeepWalkGloVeEnsmallen, DeepWalkSkipGramEnsmallen, DeepWalkCBOWEnsmallen
from grape.embedders import Node2VecGloVeEnsmallen, Node2VecSkipGramEnsmallen, Node2VecCBOWEnsmallen
from grape.embedders import WalkletsGloVeEnsmallen, WalkletsSkipGramEnsmallen, WalkletsCBOWEnsmallen
from grape.datasets.string import HomoSapiens
import pandas as pd
from tqdm.auto import tqdm

We load the Homo sapiens graph and filter it with a minimum edge weight of `700`. The graph as it stands also has singleton nodes, which are meaningless in a task such as edge prediction, so we drop them.
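To make the two preprocessing steps concrete, here is a minimal sketch on a toy edge list in plain Python. It is not how GRAPE implements the filtering: the helper `filter_and_drop_singletons` and the toy data are hypothetical, and the threshold is assumed to be inclusive.

```python
def filter_and_drop_singletons(edges, min_edge_weight):
    """Keep edges whose weight is at least `min_edge_weight` (assumed inclusive)
    and return them with the sorted list of nodes that still have an edge."""
    kept_edges = [
        (src, dst, weight)
        for src, dst, weight in edges
        if weight >= min_edge_weight
    ]
    # Nodes that no longer appear in any edge are singletons and are dropped.
    connected_nodes = sorted({node for src, dst, _ in kept_edges for node in (src, dst)})
    return kept_edges, connected_nodes


# Hypothetical toy graph: (source, destination, weight)
toy_edges = [("A", "B", 900), ("B", "C", 650), ("C", "D", 710), ("E", "A", 300)]
kept_edges, connected_nodes = filter_and_drop_singletons(toy_edges, min_edge_weight=700)
print(kept_edges)
print(connected_nodes)
```

In this toy graph, node `E` only had an edge below the threshold, so it becomes a singleton after filtering and is removed.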

In [2]:
graph = HomoSapiens().filter_from_ids(min_edge_weight=700).remove_disconnected_nodes()
graph.enable()

Next, we build the evaluation pipeline.

In [3]:
results = pd.concat([
    edge_prediction_evaluation(
        holdouts_kwargs=dict(train_size=0.8),
        graphs=graph,
        models=[
            PerceptronEdgePrediction(
                edge_features=None,
                edge_embeddings="Hadamard"
            ),
        ],
        number_of_holdouts=10,
        node_features=EmbeddingMethod(
            # We use slightly less taxing parameters for this test.
            # With a stronger parametrization, the differences
            # between the various models would likely be even
            # less noticeable.
            epochs=10,
            window_size=5,
            iterations=3,
            max_neighbours=max_neighbours
        ),
        # !!! IMPORTANT !!!
        # Right now we have enabled the smoke test to rapidly run and
        # test that everything works. To reproduce the results,
        # do set the smoke test flag to `False`.
        smoke_test=True,
        enable_cache=True
    )
    # When the `max_neighbours` parameter is set to None or to a value
    # higher than the maximum degree of the graph, no approximation
    # is employed. When it is set to a smaller value, the neighbours
    # of each node are capped to that value using SUSS.
    # Here we use either the number of nodes in the graph or 10,
    # so as to stress the two different approaches.
    for max_neighbours in tqdm(
        (graph.get_number_of_nodes(), 10),
        desc="Approximation",
        leave=False
    )
    for EmbeddingMethod in tqdm(
        (
            Node2VecCBOWEnsmallen, Node2VecGloVeEnsmallen, Node2VecSkipGramEnsmallen,
            DeepWalkCBOWEnsmallen, DeepWalkGloVeEnsmallen, DeepWalkSkipGramEnsmallen,
            WalkletsGloVeEnsmallen, WalkletsSkipGramEnsmallen, WalkletsCBOWEnsmallen
        ),
        desc="Embedding",
        leave=False
    )
])
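The comments in the cell above describe how `max_neighbours` switches between exact and approximated walks. The sketch below illustrates the idea on a toy unweighted graph with a first-order walk. It is not the SUSS routine used by the library: plain uniform sampling without replacement is swapped in as a stand-in, and the helpers `cap_neighbours` and `random_walk` are hypothetical names.

```python
import random


def cap_neighbours(neighbours, max_neighbours, rng):
    """Return all neighbours, or a sorted random subset of `max_neighbours` of them.

    Uniform sampling without replacement is an illustrative stand-in for SUSS.
    """
    if max_neighbours is None or len(neighbours) <= max_neighbours:
        return list(neighbours)  # exact: every neighbour is a candidate
    return sorted(rng.sample(list(neighbours), max_neighbours))


def random_walk(adjacency, start, walk_length, max_neighbours=None, seed=42):
    """First-order random walk; candidates are capped when `max_neighbours` is set."""
    rng = random.Random(seed)
    walk = [start]
    for _ in range(walk_length - 1):
        candidates = cap_neighbours(adjacency[walk[-1]], max_neighbours, rng)
        if not candidates:
            break  # dead end: the current node has no neighbours
        walk.append(rng.choice(candidates))
    return walk


# Toy star graph: hub 0 is connected to 100 leaves, so its degree exceeds the cap.
adjacency = {0: list(range(1, 101))}
adjacency.update({leaf: [0] for leaf in range(1, 101)})

exact_walk = random_walk(adjacency, start=0, walk_length=5)
approximated_walk = random_walk(adjacency, start=0, walk_length=5, max_neighbours=10)
print(exact_walk)
print(approximated_walk)
```

With the cap, each step out of the hub chooses among only 10 sampled leaves instead of all 100, which bounds the per-step cost on high-degree nodes at the price of an approximate transition distribution.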
