When does sampling actually occur in NeighborLoader? #8950

rohitmujumdar · 2024-02-22T00:51:03Z

I had a conceptual question.

Let's say I am training GraphSAGE with the following code: https://mlabonne.github.io/blog/posts/2022-04-06-GraphSAGE.html, which, as shown there, uses train_loader = NeighborLoader(...) to 'perform sampling'. My question: is that call when the 'sampling' actually happens? The PyG documentation says this call "returns subgraphs where global node indices are mapped to local indices corresponding to this specific subgraph. However, often times it is desired to map the nodes of the current subgraph back to the global node indices. The :class:~torch_geometric.loader.NeighborLoader will include this mapping as part of the :obj:data object:"

From this description, it seems that the call already returns the 'sampled subgraphs' (or batches), and the train function merely has to load them one by one and send them off for training. Is that right?

Or does it merely return a queue of nodes that need to be processed, with the dynamic sampling (using the specified number of neighbors) actually happening for each batch when that batch is loaded in the training loop?

Thanks a lot!

rusty1s (Maintainer) · 2024-02-23T10:31:27Z

NeighborLoader is an iterable (sorry if the documentation is confusing in this regard). Constructing it does not sample anything; sampling happens lazily, each time a batch is requested, e.g.:

for batch in loader:

or via next(iter(loader)).
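
To make the "sampling happens when requested" point concrete, here is a minimal pure-Python sketch of a lazily sampling loader. Everything in it is an assumption for illustration: ToyNeighborLoader, its arguments, the sample_calls counter, and the dictionary-based batch are hypothetical and are not PyG's implementation. It only samples one hop, and the n_id key merely mirrors the idea of a local-to-global node mapping mentioned in the quoted documentation.

```python
import random


class ToyNeighborLoader:
    """Hypothetical stand-in for a neighbor-sampling loader (not PyG code)."""

    def __init__(self, adj, seed_nodes, num_neighbors, batch_size, seed=0):
        # Construction only stores configuration; nothing is sampled here.
        self.adj = adj
        self.seed_nodes = list(seed_nodes)
        self.num_neighbors = num_neighbors
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.sample_calls = 0  # counts how often sampling really runs

    def _sample(self, seeds):
        # One-hop sampling around the seed nodes of a single mini-batch.
        self.sample_calls += 1
        n_id = list(seeds)  # position = local index, value = global node id
        local = {g: i for i, g in enumerate(n_id)}
        edges = []
        for g in seeds:
            nbrs = self.adj.get(g, [])
            picked = self.rng.sample(nbrs, min(self.num_neighbors, len(nbrs)))
            for nb in picked:
                if nb not in local:
                    local[nb] = len(n_id)
                    n_id.append(nb)
                # Edge stored in local indices: sampled neighbor -> seed node.
                edges.append((local[nb], local[g]))
        return {"n_id": n_id, "edge_index": edges, "batch_size": len(seeds)}

    def __iter__(self):
        # Sampling is deferred until a batch is actually requested.
        for start in range(0, len(self.seed_nodes), self.batch_size):
            yield self._sample(self.seed_nodes[start:start + self.batch_size])


adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
loader = ToyNeighborLoader(adj, seed_nodes=[0, 1, 2, 3],
                           num_neighbors=2, batch_size=2)

calls_after_construction = loader.sample_calls  # no sampling has run yet
first = next(iter(loader))                      # samples exactly one batch
calls_after_first_batch = loader.sample_calls
batches = list(loader)                          # a full pass samples every batch again
```

In this sketch the counter stays at zero after construction and only increases as batches are pulled from the iterator, so each pass over the loader (each epoch) draws fresh samples rather than replaying a precomputed list of subgraphs.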
