Production LLM inference on the consumer GPUs you already own — at near-zero latency overhead vs. a single big GPU. A 70B model needs ~140 GB of VRAM; an RTX 4090 has 24 GB. ComputePool shards a model layer-wise across two (or more) consumer cards, streams hidden states peer-to-peer over Gensyn AXL, settles via x402 + Superfluid on 0G Galileo testnet, and orchestrates everything through KeeperHub workflows.
The per-token network hop adds single-digit milliseconds on top of the actual forward pass — negligible next to the GPU compute itself, because AXL keeps the hidden-state transport peer-to-peer with no orchestrator round-trip.
Live on 0G Galileo testnet · Built for ETHGlobal 2026 · Hits all three sponsor tracks: 0G, Gensyn (AXL), KeeperHub.
| | |
|---|---|
| Demo | `make build && make up` → http://localhost:8000 |
| Pitch deck | `frontend/app/pitch/page.tsx` (also live at `/pitch`) |
| End-to-end script | `scripts/e2e_demo.py` |
Three deep-dive READMEs — start here:
- **0G ($15,000)** — `sponsors/0g/README.md` — Pooled-GPU SDK behind a TEE-attested orchestrator · Superfluid live on Galileo (chainId 16602) · ERC-7857 PoolINFT as access primitive
- **Gensyn — AXL ($5,000)** — `sponsors/gensyn-axl/README.md` — First production layer-pipelined LLM inference over AXL · prebuilt NVIDIA + CPU images with the AXL binary inside · native Tailscale in the same compose stack
- **KeeperHub ($5,000)** — `sponsors/keeperhub/README.md` — Two upstream PRs (#1106 Superfluid plugin, #1105 Coalition plugin) · five workflows drive the full x402 + Superfluid payment lifecycle
A single GPU can't fit a real model. We turn N small GPUs into one virtual one:
- **Sharding:** `worker/model.py` loads only the assigned layer slice from a HuggingFace model. Entry holds `embed + layers[0:mid]`, exit holds `layers[mid:N] + lm_head`.
- **P2P transport:** `worker/axl_client.py` + `worker/framing.py` carry hidden-state tensors and sampled tokens across Gensyn AXL frames.
- **Token loop:** `worker/pipeline.py` — `entry_generate` and `exit_loop` drive one forward hop per token until EOS (sketched after this list). Network overhead per token is negligible vs. the actual GPU compute — the AXL hop is single-digit ms on a Tailscale mesh, while a single transformer forward pass on a consumer card is tens of ms; sharded throughput tracks single-GPU throughput within noise.
- **Payments:** `orchestrator/x402.py` gates `/pools/{n}/infer` with HTTP 402; `orchestrator/economics.py` opens the Superfluid stream once the voucher settles.
- **Coordination:** `orchestrator/keeperhub.py` drives five KeeperHub workflows for coalition activation, pool wiring, stream start/stop, and slashing.
- **Custody / iNFT:** `orchestrator/inft/` mints a per-pool ERC-7857 INFT on 0G with encrypted intelligence metadata.
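A minimal sketch of that split and token loop, assuming batch size 1, greedy sampling, and no KV cache (the real worker keeps one in RAM). Here `embed`, `entry_layers`, `exit_layers`, `norm`, and `lm_head` mirror a standard HuggingFace decoder; `send_frame` / `recv_frame` are hypothetical stand-ins for the AXL transport, not the actual `worker/pipeline.py` API:

```python
# Illustrative only: no KV cache, greedy sampling, batch size 1.
# send_frame / recv_frame are hypothetical stand-ins for the AXL
# transport in worker/axl_client.py + worker/framing.py.
import torch

def entry_generate(input_ids, embed, entry_layers, send_frame, recv_frame,
                   eos_id, max_new_tokens=64):
    tokens = input_ids                      # (1, seq)
    for _ in range(max_new_tokens):
        h = embed(tokens)                   # (1, seq, hidden)
        for layer in entry_layers:          # layers[0:mid]
            h = layer(h)
        send_frame(h.to(torch.bfloat16))    # one AXL hop: hidden states -> exit
        next_id = recv_frame()              # exit replies with one token id
        if next_id == eos_id:
            break
        tokens = torch.cat([tokens, torch.tensor([[next_id]])], dim=1)
    return tokens

def exit_loop(exit_layers, norm, lm_head, recv_frame, send_frame):
    while True:
        h = recv_frame()                    # hidden states from the entry shard
        for layer in exit_layers:           # layers[mid:N]
            h = layer(h)
        logits = lm_head(norm(h[:, -1, :])) # project only the last position
        send_frame(int(logits.argmax(-1)))  # sampled token travels back over AXL
```

Note the asymmetry: the full hidden-state tensor travels entry → exit once per token, but only a single token id travels back, which is why the per-token network cost stays in the single-digit-ms range.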
| Sponsor | What we shipped on top | Track relevance |
|---|---|---|
| 0G | First CREATE2-deployed verified Superfluid contracts on 0G Galileo · pooled-GPU SDK lets consumer cards qualify for 0G Compute together · orchestrator runs in TEE so 0G's signing flow stays intact · live ERC-7857 INFT per pool | Best Agent Framework / Tooling and Best Autonomous Agents / iNFT |
| Gensyn — AXL | First production deployment of layer-pipelined LLM inference over AXL. Multi-node by construction (entry node ⇄ exit node). Prebuilt NVIDIA + CPU Docker images = one-line deploy. Tailscale-native — zero exposed ports. | Best Application of AXL — depth, multi-node, real utility |
| KeeperHub | Five workflows that drive the full demo · upstream Superfluid plugin (#1106) · upstream Coalition plugin (#1105) — multi-party on-chain commitments with slashing · agents pay autonomously via x402 | Best Innovative Use and Best Integration (payments + framework) |
Each sponsor folder contains the full breakdown — what we built, where it lives in the code, and how to verify it.
```
.
├── orchestrator/      FastAPI control plane + payments + INFT + KeeperHub client
│   ├── api/           /infer, /v1/chat/completions (OpenAI-compat)
│   ├── x402.py        self-hosted x402 facilitator client
│   ├── economics.py   Superfluid pool + stream lifecycle
│   ├── keeperhub.py   KeeperHub MCP / JSON-RPC client
│   ├── inft/          ERC-7857 INFT mint + 0G-Storage encrypted metadata
│   └── tee/           SGX attestation + signer
├── worker/            P2P inference shard (entry or exit)
│   ├── model.py       SplitModel — loads only assigned layer slice
│   ├── axl_client.py  HTTP wrapper around the AXL daemon
│   ├── framing.py     binary wire format (header + raw bf16 tensor; sketched below)
│   └── pipeline.py    entry_generate / exit_loop token loop
├── frontend/          Next.js 16 / React 19 dashboard + pitch deck
├── facilitator/       x402 facilitator (relayer for transferWithAuthorization)
├── contracts/         Foundry — PoolINFT.sol, deploy scripts
├── keeperhub/         Five workflow JSON exports (re-importable)
├── docker/            Multi-stage Dockerfile (Go for AXL + Python for app)
├── scripts/           e2e_demo.py, register_0g_provider.py, run-remote-worker.sh
└── sponsors/          ⭐ Per-sponsor deep-dive READMEs (start here for judging)
```
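One detail worth calling out from the tree: `worker/framing.py`'s wire format is just a small header plus the raw bf16 payload. A hedged sketch of the idea (the magic bytes and header fields here are invented for illustration, not the real layout):

```python
# Illustrative framing for a bf16 hidden-state tensor: a fixed header
# (hypothetical magic + 3 x uint32 shape) followed by the raw bytes.
import struct
import torch

MAGIC = b"HSF1"  # hypothetical frame magic, not the real one

def pack(t: torch.Tensor) -> bytes:
    assert t.dtype == torch.bfloat16 and t.dim() == 3
    b, s, h = t.shape
    header = MAGIC + struct.pack("<III", b, s, h)
    # bf16 has no numpy dtype: reinterpret as int16 to get a raw byte view
    return header + t.contiguous().view(torch.int16).numpy().tobytes()

def unpack(buf: bytes) -> torch.Tensor:
    assert buf[:4] == MAGIC
    b, s, h = struct.unpack("<III", buf[4:16])
    raw = torch.frombuffer(bytearray(buf[16:]), dtype=torch.int16)
    return raw.view(torch.bfloat16).reshape(b, s, h)
```

Shipping raw bf16 halves the bytes per hop vs. fp32 and keeps serialization overhead to a fixed-size header.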
```sh
make build   # multi-stage build: Go (AXL daemon) + Python (worker + orchestrator)
make up      # start orchestrator + node-a + node-b
make ps      # confirm three containers Up
```

Open http://localhost:8000, register a user, copy the API key, restart the workers with `OWNER_API_KEY=<key>` so they self-register, then from the dashboard: create a pool → add both nodes → initialize with a model → load → infer.
For the full payments flow (x402 + Superfluid + KeeperHub):
```sh
cp .env.example .env   # fill 0G testnet keys + KH workflow IDs + Superfluid addresses
make build && make up
docker compose up -d facilitator
DEMO_PAYER_KEY=0x... python scripts/e2e_demo.py
```

Tear down: `make clean`.
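Once a pool is loaded, any OpenAI client can talk to the orchestrator. A minimal sketch, assuming an API key from `/auth/register`; until the x402 voucher settles, the gated endpoint will answer HTTP 402:

```python
# Minimal client sketch against the OpenAI-compatible endpoint.
# Auth is via the X-API-Key header (not a Bearer token), so the
# api_key value passed to the SDK is unused.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="unused",
    default_headers={"X-API-Key": "<your key>"},
)

stream = client.chat.completions.create(
    model="Qwen/Qwen2.5-3B-Instruct",   # the default worker model
    messages=[{"role": "user", "content": "Hello from a pooled GPU!"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```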
| Model | Layers | Split (entry / exit) |
|---|---|---|
| `meta-llama/Llama-3.2-1B` | 16 | 0–7 / 8–15 |
| `meta-llama/Llama-3.2-3B` | 28 | 0–13 / 14–27 |
| `Qwen/Qwen2.5-3B-Instruct` | 36 | 0–17 / 18–35 |
| `Qwen/Qwen3-4B-Instruct-2507` | 36 | 0–17 / 18–35 |
To add a new model: append to `MODEL_LAYERS` in `orchestrator/app.py`.
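The exact shape of `MODEL_LAYERS` is internal to `orchestrator/app.py`; a hypothetical entry, consistent with the split table above (the tuple layout here is an assumption, so match the existing entries in the file):

```python
# Hypothetical structure: model id -> (total decoder layers,
# first layer owned by the exit shard). Not the real dict shape.
MODEL_LAYERS = {
    "meta-llama/Llama-3.2-1B":   (16, 8),   # entry 0-7, exit 8-15
    "Qwen/Qwen2.5-3B-Instruct":  (36, 18),  # entry 0-17, exit 18-35
    "mistralai/Mistral-7B-v0.3": (32, 16),  # example addition: split 0-15 / 16-31
}
```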
Worker container env (full table in `docs/`):

| Variable | Required | Notes |
|---|---|---|
| `NODE_ID`, `WORKER_URL`, `ORCHESTRATOR_URL`, `OWNER_API_KEY` | yes | self-registration |
| `MODEL_NAME` | no | default `Qwen/Qwen2.5-3B-Instruct` |
| `PEER_HOST` or `PEER_ADDR` | yes | the other worker (single-worker mode is not supported) |
| `AXL_API_URL` | no | default `http://localhost:9002` (in-container) |
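The required rows above are what a worker needs to self-register against the orchestrator; a hypothetical startup check mirroring the table (not the actual `worker/` code):

```python
# Hypothetical env validation matching the table above.
import os
import sys

missing = [v for v in ("NODE_ID", "WORKER_URL", "ORCHESTRATOR_URL", "OWNER_API_KEY")
           if not os.environ.get(v)]
if not (os.environ.get("PEER_HOST") or os.environ.get("PEER_ADDR")):
    missing.append("PEER_HOST or PEER_ADDR")
if missing:
    sys.exit(f"worker: missing required env: {', '.join(missing)}")

MODEL_NAME = os.environ.get("MODEL_NAME", "Qwen/Qwen2.5-3B-Instruct")
AXL_API_URL = os.environ.get("AXL_API_URL", "http://localhost:9002")
```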
Orchestrator reads `MONGODB_URI`, `MONGODB_DB`, plus 0G + Superfluid + x402 + KeeperHub env documented in `.env.example`.
All endpoints except `/`, `/health`, `/api/models`, `/auth/*` require `X-API-Key`.
| Method | Path | Purpose |
|---|---|---|
| GET | `/` | HTML dashboard |
| GET | `/api/state` | JSON cluster snapshot |
| POST | `/auth/register`, `/auth/login` | API-key issuance |
| POST | `/nodes/register` | (called by workers on startup) |
| * | `/pools/...` | full pool CRUD + initialize / load / unload / infer |
| POST | `/v1/chat/completions` | OpenAI-compatible streaming inference (gated by x402) |
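What "gated by x402" means on the wire, as a hedged sketch: the first request is answered with HTTP 402 plus payment requirements, and the client retries after settling a voucher. The request body, the requirements schema, and `make_payment` are assumptions here (the header name follows the public x402 convention); `scripts/e2e_demo.py` is the authoritative flow:

```python
# Sketch of the x402 handshake from the client side; not this
# server's exact schema. make_payment is hypothetical.
import requests

URL = "http://localhost:8000/pools/1/infer"
headers = {"X-API-Key": "<your key>"}
body = {"prompt": "hi"}                     # assumed body shape

resp = requests.post(URL, json=body, headers=headers)
if resp.status_code == 402:
    requirements = resp.json()              # price, asset, payTo, ...
    voucher = make_payment(requirements)    # hypothetical: sign transferWithAuthorization, settle via facilitator
    headers["X-PAYMENT"] = voucher          # payment payload, per the x402 convention
    resp = requests.post(URL, json=body, headers=headers)
print(resp.status_code)
```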
- Single in-flight generation per pool (no batching, no parallel requests in v1).
- KV cache lives in worker RAM; dropped on `unload` or container restart.
- Worker callbacks are unauthenticated — never expose worker ports publicly. AXL's TLS handshake protects the P2P link itself, but there is no peer allowlist yet.
- CPU-only by default — the bundled `torch` is the CPU wheel. Swap in the matching CUDA wheel in `worker/requirements.txt` for GPU.
- The AXL port is overridden to `7001` (away from the worker's `7000`); don't change it.
Architecture deep-dive: `docs/`
Sponsor judging packets: `sponsors/`