When does sampling actually occur in NeighborLoader? #8950
rohitmujumdar
started this conversation in
General
Replies: 1 comment
I have a conceptual question.
Let's say I am implementing GraphSAGE with the code from https://mlabonne.github.io/blog/posts/2022-04-06-GraphSAGE.html and, as shown there, using
train_loader = NeighborLoader(...)
to 'perform sampling'. My question: when does the 'sampling' actually happen? The PyG documentation says this call "returns subgraphs where global node indices are mapped to local indices corresponding to this specific subgraph. However, often times it is desired to map the nodes of the current subgraph back to the global node indices. The torch_geometric.loader.NeighborLoader will include this mapping as part of the data object". From this description, it seems that the call itself returns the sampled subgraphs (batches), and the train function merely has to load them one by one and send them off for training?
Or does the call merely return a queue of nodes that need to be processed, with the dynamic sampling (using the specified number of neighbors) actually happening for each batch when that batch is loaded in the training loop?
Thanks a lot!
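For anyone landing here with the same question, the two possibilities can be told apart with a small experiment. The sketch below is a toy, pure-Python stand-in and not PyG code: the class LazyNeighborLoader, its sample_calls counter, and the adjacency dict are all hypothetical names made up for illustration. It models the second reading, which as far as I understand is how NeighborLoader behaves, since it builds on a PyTorch DataLoader over the seed node indices and runs the sampler as each batch is collated: the constructor only stores configuration, and a subgraph is sampled each time a batch is pulled in the loop.

```python
import random


class LazyNeighborLoader:
    """Toy stand-in for a neighbor-sampling loader (hypothetical, not PyG code)."""

    def __init__(self, adjacency, seed_nodes, num_neighbors, batch_size, seed=None):
        # Construction only stores configuration; no neighbors are sampled here.
        self.adjacency = adjacency
        self.seed_nodes = list(seed_nodes)
        self.num_neighbors = num_neighbors  # fan-out per hop, e.g. [2, 2]
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.sample_calls = 0  # counts how often sampling actually runs

    def _sample(self, batch):
        self.sample_calls += 1
        nodes = list(batch)      # global ids; seed nodes come first
        frontier = list(batch)
        edges = []
        for fanout in self.num_neighbors:
            next_frontier = []
            for dst in frontier:
                neighbors = self.adjacency.get(dst, [])
                picked = self.rng.sample(neighbors, min(fanout, len(neighbors)))
                for src in picked:
                    edges.append((src, dst))
                    if src not in nodes:
                        nodes.append(src)
                        next_frontier.append(src)
            frontier = next_frontier
        # Relabel global node ids to local indices within this subgraph.
        local = {n: i for i, n in enumerate(nodes)}
        return {
            "n_id": nodes,  # local index -> global node id
            "edge_index": [(local[s], local[d]) for s, d in edges],
            "batch_size": len(batch),
        }

    def __iter__(self):
        # Sampling happens here, once per batch, every time the loader is iterated.
        for start in range(0, len(self.seed_nodes), self.batch_size):
            yield self._sample(self.seed_nodes[start:start + self.batch_size])


adjacency = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
loader = LazyNeighborLoader(adjacency, seed_nodes=[0, 1, 2, 3],
                            num_neighbors=[2, 2], batch_size=2, seed=0)
print(loader.sample_calls)  # 0: nothing has been sampled yet

for epoch in range(2):
    for batch in loader:
        pass  # a training step on the sampled subgraph would go here
    print(loader.sample_calls)  # 2 after the first epoch, 4 after the second
```

In this toy, the "n_id" list plays the role of the local-to-global mapping that the quoted documentation passage describes, and every new pass over the loader draws fresh neighborhoods, so two epochs need not see the same subgraphs for a given seed node.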