
Enable C++ PPR Sampling #556

Merged
mkolodner-sc merged 200 commits into main from mkolodner-sc/cpp_ppr_tracer
May 5, 2026


@mkolodner-sc mkolodner-sc commented Mar 24, 2026

Scope of work done

PPR-Based Neighbor Sampling for GiGL

Adds C++-based Personalized PageRank (PPR) neighbor sampling as an alternative to k-hop sampling in GiGL's distributed training stack, replacing the previous pure-Python implementation.

C++ Kernel (gigl-core/csrc/sampling/)

  • Implements the Forward Push algorithm (Andersen et al., 2006) as PPRForwardPushState — a pybind11-wrapped C++ class that owns all hot-loop state (scores, residuals, queue, neighbor cache).
  • Python drives the async RPC neighbor fetches; C++ handles the residual push loop, keeping the asyncio event loop unblocked via run_in_executor.
  • gigl-core is a separately installable package (uv pip install -e gigl-core/) with a py.typed marker and .pyi stub for full mypy coverage.
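For reviewers less familiar with Forward Push, the residual push loop can be sketched in plain Python as below. This is a minimal illustration of the common non-lazy variant, not the code in PPRForwardPushState: the function name, the dict-based state, and the alpha and eps defaults are all assumptions made for the example.

```python
from collections import deque


def forward_push(adj, seed, alpha=0.5, eps=1e-6):
    """Approximate PPR scores for `seed` on an adjacency dict `adj`.

    Illustrative only: names and defaults are assumptions, not the C++ API.
    """
    scores = {}                # absorbed PPR mass per node
    residuals = {seed: 1.0}    # mass that still has to be pushed
    queue = deque([seed])
    queued = {seed}            # avoids enqueueing the same node twice

    while queue:
        u = queue.popleft()
        queued.discard(u)
        neighbors = adj.get(u, [])
        deg = len(neighbors)
        r_u = residuals.get(u, 0.0)
        # Only push when the residual is large relative to the degree.
        if r_u < eps * max(deg, 1):
            continue
        # Absorb an alpha share into the score, then clear the residual.
        scores[u] = scores.get(u, 0.0) + alpha * r_u
        residuals[u] = 0.0
        if deg == 0:
            continue  # assumption: a node without neighbors drops the rest
        # Spread the remaining mass evenly over the neighbors.
        share = (1.0 - alpha) * r_u / deg
        for v in neighbors:
            residuals[v] = residuals.get(v, 0.0) + share
            threshold = eps * max(len(adj.get(v, [])), 1)
            if v not in queued and residuals[v] >= threshold:
                queue.append(v)
                queued.add(v)
    return scores, residuals
```

On a graph where every node has at least one neighbor, the scores and the leftover residuals together always sum to 1, which makes a convenient sanity check.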

Python Sampler (gigl/distributed/dist_ppr_sampler.py)

  • DistPPRNeighborSampler extends BaseGiGLSampler and integrates with the existing DistNeighborLoader / Graph Store infrastructure.
  • Output format: (seed_type, "ppr", neighbor_type) edge types with edge_index ([2, N] int64) and edge_attr ([N] float PPR scores).
  • max_fetch_iterations: Optional[int] = None caps RPC calls per batch; the algorithm continues to convergence using cached neighbor lists after the budget is exhausted.
  • num_neighbors_per_hop defaults to 1000. A high-degree hub node splits its residual across all of its neighbors, so each neighbor receives only a small share and capping the fetch is expected to have a negligible effect on PPR accuracy.
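The interaction between the fetch budget and the push loop can be sketched as follows. This is a simplified, single-seed illustration under stated assumptions: fetch_neighbors stands in for the real RPC, push_step stands in for the C++ kernel, and the ALPHA and EPS constants are made up for the example. It also assumes that a node whose neighbors were never fetched simply keeps its residual once the budget is spent.

```python
import asyncio

ALPHA, EPS = 0.5, 1e-4  # illustrative constants, not GiGL defaults


async def fetch_neighbors(graph, nodes):
    """Hypothetical stand-in for one async RPC returning neighbor lists."""
    await asyncio.sleep(0)  # yield to the event loop like a network call
    return {n: graph.get(n, []) for n in nodes}


def push_step(state, frontier):
    """Blocking push over `frontier`; returns the nodes to visit next."""
    next_frontier = set()
    for u in frontier:
        nbrs = state["cache"].get(u)
        if nbrs is None:
            continue  # never fetched: leave the residual where it is
        r_u = state["residuals"].get(u, 0.0)
        if r_u < EPS * max(len(nbrs), 1):
            continue
        state["scores"][u] = state["scores"].get(u, 0.0) + ALPHA * r_u
        state["residuals"][u] = 0.0
        if not nbrs:
            continue
        share = (1.0 - ALPHA) * r_u / len(nbrs)
        for v in nbrs:
            state["residuals"][v] = state["residuals"].get(v, 0.0) + share
            next_frontier.add(v)
    return next_frontier


async def drive(graph, seed, max_fetch_iterations=None):
    """Alternate async fetches with push steps run off the event loop."""
    loop = asyncio.get_running_loop()
    state = {"scores": {}, "residuals": {seed: 1.0}, "cache": {}}
    frontier, fetches = {seed}, 0
    while frontier:
        missing = [n for n in frontier if n not in state["cache"]]
        budget_left = (
            max_fetch_iterations is None or fetches < max_fetch_iterations
        )
        if missing and budget_left:
            state["cache"].update(await fetch_neighbors(graph, missing))
            fetches += 1
        # The push is blocking work, so it runs in the default executor.
        frontier = await loop.run_in_executor(None, push_step, state, frontier)
    return state["scores"], fetches
```

With max_fetch_iterations=None the loop fetches until every reachable frontier node is cached. With a small budget it stops issuing fetches and keeps pushing only through nodes already in the cache, which is the behavior the option describes.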

Configuration

  • PPRSamplerOptions added to sampler_options.py alongside existing KHopNeighborSamplerOptions; pass via sampler_options= on DistNeighborLoader.

C++ Tests (gigl-core/tests/ppr_forward_push_test.cpp)

  • 6 GoogleTest cases covering: initial queue drain, convergence, score absorption, residual distribution, cross-seed deduplication, and top-k limiting.
  • tests/CMakeLists.txt updated to auto-link torch and auto-discover kernel sources under csrc/ (excluding python_* entry points).
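As a rough picture of what the top-k limiting case exercises, the sketch below keeps the k highest-scoring neighbors per seed and lays them out in the edge_index / edge_attr shape described in the Python Sampler section. It uses plain lists instead of tensors so it runs without torch. The function name, the input layout, and the choice of putting seeds in the first row are assumptions, not the sampler's actual code.

```python
import heapq


def to_ppr_edges(scores_by_seed, top_k):
    """Turn per-seed PPR scores into ([2, N] edge_index, [N] edge_attr).

    Hypothetical helper: row 0 holds seed ids, row 1 holds neighbor ids.
    """
    src, dst, attr = [], [], []
    for seed, scores in scores_by_seed.items():
        # Keep only the top_k neighbors, highest score first.
        best = heapq.nlargest(top_k, scores.items(), key=lambda kv: kv[1])
        for neighbor, score in best:
            src.append(seed)
            dst.append(neighbor)
            attr.append(score)
    return [src, dst], attr


edge_index, edge_attr = to_ppr_edges(
    {0: {1: 0.3, 2: 0.5, 3: 0.1}, 7: {8: 0.9}}, top_k=2
)
```

In the real sampler these would be an int64 tensor and a float tensor stored under a (seed_type, "ppr", neighbor_type) edge type.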

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

@mkolodner-sc mkolodner-sc force-pushed the mkolodner-sc/cpp_ppr_tracer branch from 4d2db2a to cfcb8cb Compare March 25, 2026 18:59
@mkolodner-sc mkolodner-sc changed the base branch from main to mkolodner-sc/cpp-infrastructure March 25, 2026 19:27
@mkolodner-sc mkolodner-sc force-pushed the mkolodner-sc/cpp_ppr_tracer branch from 7009d70 to dd118ef Compare March 25, 2026 21:58
Comment thread gigl-core/core/sampling/ppr_forward_push.cpp Outdated
Comment thread gigl-core/core/sampling/ppr_forward_push.cpp Outdated
Comment thread gigl-core/core/sampling/ppr_forward_push.cpp Outdated
Comment thread gigl-core/core/sampling/ppr_forward_push.cpp Outdated
Comment thread gigl-core/core/sampling/ppr_forward_push.cpp Outdated
Comment thread tests/unit/distributed/dist_ppr_sampler_test.py Outdated
Comment thread Makefile
Comment thread gigl/distributed/utils/neighborloader.py Outdated
Comment thread gigl/distributed/dist_ppr_sampler.py
Comment thread gigl/distributed/dist_ppr_sampler.py Outdated

@kmontemayor2-sc kmontemayor2-sc left a comment

Thanks Matt! Just nits now. The only thing that's a bit odd is why we can't do std::optional<std::vector>.

Comment thread gigl-core/core/sampling/ppr_forward_push.cpp Outdated
Comment thread gigl-core/core/sampling/ppr_forward_push.h Outdated
Comment thread gigl-core/core/sampling/ppr_forward_push.h
Comment thread gigl-core/core/sampling/ppr_forward_push.cpp
Comment thread gigl-core/core/sampling/ppr_forward_push.cpp
Comment thread gigl-core/core/sampling/python_ppr_forward_push.cpp
Comment thread Makefile Outdated

@yliu2-sc yliu2-sc left a comment

Second stamp

@mkolodner-sc mkolodner-sc enabled auto-merge May 4, 2026 18:34
@mkolodner-sc mkolodner-sc added this pull request to the merge queue May 4, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 4, 2026
@mkolodner-sc mkolodner-sc added this pull request to the merge queue May 4, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 4, 2026
@mkolodner-sc mkolodner-sc added this pull request to the merge queue May 4, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 4, 2026

github-actions Bot commented May 4, 2026

GiGL Automation

@ 22:20:44UTC : Starting to build base images for CUDA and CPU.
This may take a while, please be patient.
The images will be pushed to the GCP Artifact Registry.
Once done, the workflow will update the gigl/dep_vars.env file with the new image names.

github-actions Bot commented May 4, 2026

GiGL Automation

@ 22:48:36UTC : Built and pushed new images:

  • CUDA base image: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-cuda-base:7d3182eeb6446ce3e35910babba990c8e003879d.109.1
  • CPU base image: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-cpu-base:7d3182eeb6446ce3e35910babba990c8e003879d.109.1
  • Dataflow base image: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-dataflow-base:7d3182eeb6446ce3e35910babba990c8e003879d.109.1
  • Builder image: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/gigl-base-images/gigl-builder:7d3182eeb6446ce3e35910babba990c8e003879d.109.1

Updated gigl/dep_vars.env with new image names.
Updated .github/cloud_builder/run_command_on_pr_cloud_build.yaml with new builder image.

@mkolodner-sc mkolodner-sc enabled auto-merge May 4, 2026 23:29
@mkolodner-sc mkolodner-sc disabled auto-merge May 4, 2026 23:32
@mkolodner-sc mkolodner-sc added this pull request to the merge queue May 5, 2026
Merged via the queue into main with commit ca4430f May 5, 2026
10 checks passed
@mkolodner-sc mkolodner-sc deleted the mkolodner-sc/cpp_ppr_tracer branch May 5, 2026 06:49