An initial implementation of the LABOR-0 sampling algorithm #242
I am unfamiliar with the disjoint, directed, and temporal options, so I have removed them for now. However, LABOR-0 can be used as a drop-in replacement for neighbor sampling, so it should be possible to extend its support to those cases too. Extension to the heterogeneous case should also be straightforward.
Codecov Report
@@            Coverage Diff             @@
##           master     #242      +/-   ##
==========================================
+ Coverage   83.69%   84.70%   +1.00%
==========================================
  Files          28       29       +1
  Lines         883      987     +104
==========================================
+ Hits          739      836      +97
- Misses        144      151       +7
@rusty1s What are the next steps to seamlessly enable users to experiment with this new sampler in PyG?
for more information, see https://pre-commit.ci
Right now, only LABOR-0 is implemented. The importance sampling variants will additionally return an edge vector containing the reciprocals of the importance sampling probabilities. Should I add that vector now as an optional return value, before those variants are implemented? Or is it fine to modify the API later?
If weighted sampling is of interest to PyG users, I can contribute that too.
Contains the result of the benchmark script: labor2023-07-13 034625.678968.csv
Is there an easy way to refactor the neighbor_kernel and this file so that extra features such as temporal or disjoint sampling come for free for this implementation as well?
CPU and GPU implementations that support weighted sampling can be found here: https://docs.dgl.ai/en/latest/generated/dgl.graphbolt.LayerNeighborSampler.html#dgl.graphbolt.LayerNeighborSampler
Implements the LABOR-0 sampling algorithm described in https://arxiv.org/abs/2210.13339 and https://docs.dgl.ai/en/1.1.x/generated/dgl.dataloading.LaborSampler.html.
Sequential Poisson sampling is used so that each vertex gets a deterministic number of neighbors, fully matching the behaviour of the neighbor sampler.
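For readers unfamiliar with the technique, here is a minimal pure-Python sketch of one LABOR-0 layer with sequential Poisson sampling, as I understand it from the paper. It is an illustration only, not the C++ kernel in this PR. The names `labor0_sample_layer`, `adj`, `seeds`, and `fanout` are hypothetical, and the sketch assumes a homogeneous graph stored as a dict of duplicate-free neighbor lists. Each candidate neighbor draws one uniform variate that is shared by all seeds, and each seed then keeps the `fanout` neighbors with the smallest variates.

```python
import random


def labor0_sample_layer(adj, seeds, fanout, rng=None):
    """Sketch of one LABOR-0 layer (hypothetical helper, not the PR's API).

    adj:    dict mapping a seed vertex to a list of its neighbors
    seeds:  vertices whose neighborhoods are sampled in this layer
    fanout: maximum number of neighbors kept per seed
    """
    rng = rng or random.Random()

    # One uniform variate per candidate neighbor, shared across all seeds.
    # Sharing is what makes different seeds tend to pick the same neighbors.
    r = {}
    for s in seeds:
        for t in adj.get(s, ()):
            if t not in r:
                r[t] = rng.random()

    # Sequential Poisson step: each seed keeps the `fanout` neighbors with
    # the smallest variates, so it gets exactly min(fanout, degree) of them.
    sampled = {}
    for s in seeds:
        neighbors = adj.get(s, ())
        sampled[s] = sorted(neighbors, key=lambda t: r[t])[:fanout]
    return sampled


# Usage: seeds 0 and 1 share a neighborhood, so they select the same vertices.
adj = {0: [10, 11, 12, 13], 1: [10, 11, 12, 13], 2: [11]}
print(labor0_sample_layer(adj, seeds=[0, 1, 2], fanout=2, rng=random.Random(0)))
```

Because the variates are shared, overlapping neighborhoods yield overlapping samples, which is where the reduction in the number of sampled vertices relative to independent neighbor sampling should come from, while the per-seed neighbor count stays the same as in neighbor sampling.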
Update: The paper is to appear at NeurIPS 2023.