Plexus

Heterogeneous GPU mesh for verifiable LLM inference. Your N GPUs of N kinds → one high-spec GPU. Open, verifiable, vendor-neutral.

Democratize datacenter-class compute on consumer hardware.

What is Plexus?

Plexus is a distributed LLM inference mesh that lets N GPUs of different kinds be used as a single high-spec logical GPU. User A's RTX 4090, user B's Apple M3 Ultra, user C's AMD MI300X, and user D's H100 cooperate in the same pool to run inference on a single model (Llama 3.3 70B, DeepSeek V3 685B, and so on).

Core value proposition: Democratize Datacenter-Class Compute.

| What you combine | Price | Aggregate performance | Equivalent datacenter GPU | Datacenter GPU price (pool savings) |
| --- | --- | --- | --- | --- |
| 16x RTX 4090 (24GB each, 384GB total) | $16K | 2640 TFLOPS FP16, 16TB/s aggregate bandwidth | 1x H100 SXM (80GB) | $30K-40K (2-2.5x ↓) |
| 8x Apple M3 Ultra (192GB each, 1.5TB total) | $24K | 115 TFLOPS FP16, 6TB/s | 1x H200 (141GB) | $35K+ (1.5x ↓) |
| 32x Apple M3 Max (128GB each, 4TB total) | $64K | 230 TFLOPS FP16, 12TB/s | 2-3x H100 or 1x B100 | $60K-100K (1-1.5x ↓) |

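LLM inference, token-by-token decoding in particular, is typically memory-bound, so pooled VRAM and pooled memory bandwidth are treated as the practical equivalent of a datacenter GPU. By combining consumer hardware cooperatively, Plexus aims to deliver H100 / H200 / MI300X-class inference in an open-source, privacy-preserving way.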
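To make the pooling arithmetic concrete, here is a minimal sketch that sums VRAM and memory bandwidth over a pool and derives an idealized decode ceiling for a dense model. The type and function names are hypothetical and not part of Plexus. The 1000 GB/s per-GPU figure is simply the table's 16TB/s divided by 16. The ceiling assumes every weight is read once per token and ignores interconnect latency, KV-cache traffic, and load imbalance, so real throughput will be lower.

```rust
/// One group of identical devices in a pool (hypothetical type for illustration).
#[derive(Debug, Clone, Copy)]
pub struct DeviceGroup {
    pub count: u32,
    /// VRAM per device, in GB.
    pub vram_gb: f64,
    /// Memory bandwidth per device, in GB/s.
    pub bandwidth_gbps: f64,
}

/// Sum VRAM (GB) and memory bandwidth (GB/s) over every device in the pool.
pub fn pool_totals(groups: &[DeviceGroup]) -> (f64, f64) {
    groups.iter().fold((0.0, 0.0), |(vram, bw), g| {
        let n = f64::from(g.count);
        (vram + n * g.vram_gb, bw + n * g.bandwidth_gbps)
    })
}

/// Idealized decode ceiling for a dense model: each generated token reads
/// every weight once, so tokens/s <= aggregate bandwidth / model size.
pub fn decode_tokens_per_sec_ceiling(aggregate_bw_gbps: f64, model_gb: f64) -> f64 {
    aggregate_bw_gbps / model_gb
}
```

For the 16x RTX 4090 row, `pool_totals` gives 384 GB and 16000 GB/s. A 40 GB model would then have a ceiling of 400 tokens/s under these idealized assumptions.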

Built for Homelab Engineers (ADR-0004)

Plexus's primary target is homelab engineers and small-lab operators: the r/homelab and r/LocalLLaMA communities, academic labs, SMBs, and hobbyists (hundreds of thousands of potential users).

🎯 Hero Scenario: pool small VRAM → run a large model

Three 8GB-VRAM GPUs = 24GB pool → run a 24GB-class LLM

| Homelab pool | Price (used) | Total VRAM | Models it can run |
| --- | --- | --- | --- |
| 3x RTX 3060 Ti 8GB ⭐ Hero | $600 | 24GB | Llama 3 30B int4 / Mixtral 8x7B int4 / CodeLlama 34B int4 |
| 3x RTX 4060 8GB | $900 | 24GB | (same as above) |
| 3x RTX 3060 12GB | $900 | 36GB | Llama 70B int4 / Qwen 32B int8 |
| 6x RTX 3060 Ti 8GB | $1,200 | 48GB | Llama 70B int4 / DeepSeek Coder 33B fp8 |
| 3x Tesla P40 24GB (used) | $600-900 | 72GB | Llama 70B fp8 / Mixtral 8x22B int4 |
| 2x 3090 + 4x 3060 Ti (mixed!) | $2,200 | 80GB | Llama 70B fp8 / Mixtral 8x22B int4 |

All of these examples must be verified in Phase 2 (Heterogeneous Pool). Running a 24GB model that does not fit on a single 8GB GPU by having three GPUs cooperate is the real value of Plexus.
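A rough way to sanity-check a pool against a model is to compare the weight footprint (parameters times bits per weight, divided by 8) with the pooled VRAM. The sketch below is illustrative only. The function names and the headroom parameter are assumptions, KV cache and activations are covered only by that headroom, and it ignores the constraint that each layer must fit entirely on one device.

```rust
/// Approximate weight footprint in decimal GB: parameters x bits per weight / 8.
/// `params_billions` is in units of 1e9 parameters. KV cache and activations
/// are not included.
pub fn weight_footprint_gb(params_billions: f64, bits_per_weight: u32) -> f64 {
    params_billions * f64::from(bits_per_weight) / 8.0
}

/// True when the weights plus a reserved headroom fit in the pooled VRAM.
/// This is a pool-level check only; per-device fragmentation is ignored.
pub fn fits_in_pool(gpu_vram_gb: &[f64], model_gb: f64, headroom_gb: f64) -> bool {
    let total: f64 = gpu_vram_gb.iter().sum();
    model_gb + headroom_gb <= total
}
```

For example, a 34B model at int4 is about 17 GB of weights. With 2 GB of headroom it fits a 3x 8GB pool but not a single 8GB card.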

Homelab-specific design (Phase 5+):

Power budget option: plexus serve --power-budget 200W, which states the household electricity limit explicitly
Noise / fan curve: optional worker-side control
Network: optimized for 1G/10G Ethernet (assumes no InfiniBand)
Unattended operation: self-healing + Discord/email webhook alerts
No-Docker option: single binary via curl | sh + systemd unit (no Docker required)
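As an illustration of how a value such as `--power-budget 200W` might be handled, the sketch below parses the budget string and splits it across GPUs in proportion to their rated power. Both functions are hypothetical. The README does not say how Plexus parses or enforces the budget, so the proportional split is an assumption made for the example.

```rust
/// Parse a budget such as "200W" into watts. This sketch accepts only a
/// whole number immediately followed by `W`.
pub fn parse_power_budget(s: &str) -> Option<u32> {
    s.trim().strip_suffix('W')?.parse().ok()
}

/// Split a total budget across GPUs in proportion to each GPU's rated power,
/// never assigning a GPU more than its own rating. Integer division rounds
/// down, so a few watts of the budget may stay unassigned.
pub fn split_budget(total_w: u32, rated_w: &[u32]) -> Vec<u32> {
    let sum: u64 = rated_w.iter().map(|&w| u64::from(w)).sum();
    if sum == 0 {
        return vec![0; rated_w.len()];
    }
    rated_w
        .iter()
        .map(|&w| {
            let share = u64::from(total_w) * u64::from(w) / sum;
            share.min(u64::from(w)) as u32
        })
        .collect()
}
```

With a 200 W budget and two GPUs rated at 200 W each, every GPU would be capped at 100 W.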

User A (RTX 4090)   ─┐
User B (M3 Ultra)   ─┤
User C (MI300X)     ─┼── Plexus Pool ──→ 1 logical GPU
User D (H100)       ─┤                   Llama 3.1 405B
User E (CPU only)   ─┘                   Qwen3 32B
                                         DeepSeek V3 685B
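One simple way to spread a single model over unequal devices is to give each device a number of transformer layers proportional to its memory. The sketch below uses the largest-remainder method so that every layer is assigned exactly once. It illustrates the idea only and is not Plexus's scheduler, which the README does not specify; the function name is hypothetical.

```rust
/// Split `num_layers` layers across devices in proportion to each device's
/// memory (GB), using the largest-remainder method. Returns the number of
/// layers per device, in the same order as `mem_gb`.
pub fn partition_layers(num_layers: usize, mem_gb: &[u64]) -> Vec<usize> {
    let total: u64 = mem_gb.iter().sum();
    if total == 0 {
        return vec![0; mem_gb.len()];
    }
    let n = num_layers as u64;
    // Start from the rounded-down proportional share of each device.
    let mut counts: Vec<usize> = mem_gb.iter().map(|&m| (n * m / total) as usize).collect();
    // Hand the leftover layers to the devices with the largest remainders.
    // The sort is stable, so ties keep the original device order.
    let mut order: Vec<usize> = (0..mem_gb.len()).collect();
    order.sort_by_key(|&i| std::cmp::Reverse((n * mem_gb[i]) % total));
    let assigned: usize = counts.iter().sum();
    for &i in order.iter().take(num_layers - assigned) {
        counts[i] += 1;
    }
    counts
}
```

For an 80-layer model on a mixed pool of two 24GB and four 8GB devices, this yields 24, 24, 8, 8, 8, 8 layers respectively.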

Core Values

| Dimension | Plexus |
| --- | --- |
| Heterogeneous virtualization | 16+ GPU kinds as a single logical GPU. NVIDIA / Apple / AMD / Intel / CPU are equal first-class targets |
| Native Rust implementation | Everything from the Layer 0 kernels (CUDA/Metal/ROCm/SYCL/CPU) through the Layer 3 runtime (Llama/Qwen/Mistral) is implemented in-house. No dependency on Candle / vLLM / llama.cpp |
| Hybrid Tier Verifiable | `trust_budget: fast \| verified \| attested`: the user picks a trust level per inference (spot-check / TEE / future zkML) |
| Open source forever | Apache 2.0. Tokens, NFTs, paid SaaS, and a closed-source enterprise edition are all explicit non-goals |
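The three `trust_budget` values map naturally onto a small enum. The sketch below parses a value such as the one carried in the `X-Plexus-Trust` header used in the Quick Start. The type name, the case-insensitive parsing, and the ordering `Fast < Verified < Attested` are assumptions for illustration. The README does not define how each tier maps to spot-check, TEE, or zkML, so no such mapping is encoded here.

```rust
use std::str::FromStr;

/// Per-request trust level (hypothetical type; the ordering is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustBudget {
    Fast,
    Verified,
    Attested,
}

impl FromStr for TrustBudget {
    type Err = String;

    /// Accepts "fast", "verified", or "attested", ignoring case and
    /// surrounding whitespace.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "fast" => Ok(Self::Fast),
            "verified" => Ok(Self::Verified),
            "attested" => Ok(Self::Attested),
            other => Err(format!("unknown trust budget: {other}")),
        }
    }
}
```

Deriving `Ord` lets a gateway compare a worker's offered tier with the tier a request asks for, for example `offered >= requested`.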

Anti-positioning

❌ Yet another router (engine adapter wrapper): everything is implemented natively
❌ Web3 token project: pure OSS
❌ Single-vendor lock-in to Apple or NVIDIA: five backends on equal footing
❌ Dependence on the Python ecosystem: Rust 1.95.0 only

Status

Phase 0: Foundation Bootstrap (current).

| Phase | Status | Description |
| --- | --- | --- |
| 0 | `[~]` in progress | Repo scaffold, license, CoC, design docs |
| 1 | `[ ]` | Multi-Backend Single GPU (CUDA / Metal / ROCm / SYCL / CPU) |
| 2 | `[ ]` ⭐ | Heterogeneous Multi-GPU Pool (16 GPU kinds) |
| 3 | `[ ]` | LAN Multi-Node Cluster (PP + cross-node TP) |
| 4 | `[ ]` | Public Swarm + L1 Verifiable |
| 5 | `[ ]` | L2 TEE Attestation |
| 6 | `[ ]` | API + Multimodal Polish |
| 7 | `[ ]` | v1.0 Public Launch |

Details: docs/architecture/2026-05-22-plexus-design.md §10 Roadmap.

Quick Start (Future)

NOTE: These commands will work once Phase 1 is complete. The repository is currently scaffold only.

# Install
curl -fsSL https://plexus.ai/install.sh | sh

# Single-GPU inference
plexus serve --gpu cuda:0 --model llama-3.1-8b --port 8080

# Heterogeneous pool (NVIDIA + Apple + AMD at the same time)
plexus serve --pool cuda:0,mlx:0,rocm:0 --model llama-3.3-70b

# LAN cluster join
plexus serve --cluster cluster.example.com --model llama-3.1-405b

# Public swarm (verifiable)
plexus serve --swarm public --bootstrap dnsaddr/plexus.ai

# Client (OpenAI-compatible)
curl localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Plexus-Trust: verified" \
  -d '{"model": "plexus/llama-3.3-70b", "messages": [...]}'
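The `--pool cuda:0,mlx:0,rocm:0` argument above is a comma-separated list of `backend:index` pairs. A minimal parsing sketch follows. The `DeviceSpec` type and `parse_pool` function are hypothetical, and the sketch does not validate backend names because the README does not list the accepted set.

```rust
/// One entry of a pool specification, e.g. `cuda:0` (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DeviceSpec {
    pub backend: String,
    pub index: u32,
}

/// Parse a comma-separated list of `backend:index` pairs.
/// Returns an error message for the first malformed entry.
pub fn parse_pool(spec: &str) -> Result<Vec<DeviceSpec>, String> {
    spec.split(',')
        .map(|item| -> Result<DeviceSpec, String> {
            let (backend, index) = item
                .trim()
                .split_once(':')
                .ok_or_else(|| format!("expected backend:index, got {item:?}"))?;
            if backend.is_empty() {
                return Err(format!("missing backend in {item:?}"));
            }
            let index = index
                .parse::<u32>()
                .map_err(|e| format!("bad device index in {item:?}: {e}"))?;
            Ok(DeviceSpec { backend: backend.to_string(), index })
        })
        .collect()
}
```

Collecting into `Result<Vec<_>, String>` stops at the first bad entry, which keeps error reporting simple for a CLI flag.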

API Compatibility

The Plexus Gateway exposes four APIs at the same time:

MCP (JSON-RPC over stdio/HTTP): agent-native, Anthropic standard [primary]
OpenAI Chat Completions (/v1/chat/completions)
Anthropic Messages (/v1/messages)
Ollama (/api/chat, /api/generate)

Without changing your agent code, set OPENAI_API_BASE=https://plexus.example/v1 and it works immediately.
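Since the three HTTP APIs use distinct paths, a gateway can pick the wire format from the request path alone. The sketch below shows that dispatch with a hypothetical `ApiFlavor` enum. MCP is left out because the README gives no HTTP path for it, so how it is routed is not assumed here.

```rust
/// Wire format selected by request path (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiFlavor {
    OpenAi,
    Anthropic,
    Ollama,
}

/// Map a request path to the API flavor that should handle it.
/// Paths are the ones listed in this README; anything else is unknown.
pub fn api_for_path(path: &str) -> Option<ApiFlavor> {
    match path {
        "/v1/chat/completions" => Some(ApiFlavor::OpenAi),
        "/v1/messages" => Some(ApiFlavor::Anthropic),
        "/api/chat" | "/api/generate" => Some(ApiFlavor::Ollama),
        _ => None,
    }
}
```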

Architecture

┌─────────────────────────────────────────────────────┐
│  Layer 4 — API & Gateway (MCP/OpenAI/Anthropic)     │
├─────────────────────────────────────────────────────┤
│  Layer 3 — Native Inference Runtime (Llama/Qwen/..) │
├─────────────────────────────────────────────────────┤
│  Layer 2 — Compute Graph & Scheduler (TP/PP/DP/EP)  │
├─────────────────────────────────────────────────────┤
│  Layer 1 — Distributed Tensor & Collective Ops      │
├─────────────────────────────────────────────────────┤
│  Layer 0 — Kernel Backend (CUDA/Metal/ROCm/SYCL/CPU)│
└─────────────────────────────────────────────────────┘

Detailed diagrams + 9-step data flow: Design Spec §4.
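Layer 1 names collective ops without describing them. As a reference point, the sketch below is a naive in-memory all-reduce (sum), the collective that tensor parallelism typically relies on to combine partial results. It is a single-process illustration with a hypothetical function name, not Plexus's implementation. A real version would exchange data over the network, often with a ring or tree schedule.

```rust
/// Naive all-reduce (sum): after the call, every shard holds the element-wise
/// sum of all shards. Each inner `Vec` stands in for one device's buffer.
///
/// Panics if the shards do not all have the same length.
pub fn all_reduce_sum(shards: &mut [Vec<f32>]) {
    let Some(first) = shards.first() else { return };
    let len = first.len();
    assert!(
        shards.iter().all(|s| s.len() == len),
        "shards must have equal length"
    );
    // Reduce: accumulate every shard into one buffer.
    let mut total = vec![0.0f32; len];
    for shard in shards.iter() {
        for (t, x) in total.iter_mut().zip(shard) {
            *t += *x;
        }
    }
    // Broadcast: copy the reduced buffer back to every shard.
    for shard in shards.iter_mut() {
        shard.copy_from_slice(&total);
    }
}
```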

Workspace Structure

plexus/
├── crates/
│   ├── plexus-core/        # DHT, libp2p, crypto
│   ├── plexus-kernel/      # GPU kernel (CUDA/Metal/ROCm/SYCL/CPU)
│   ├── plexus-tensor/      # Distributed Tensor + collective ops
│   ├── plexus-graph/       # Compute graph + scheduler
│   ├── plexus-runtime/     # Native model architectures
│   ├── plexus-gateway/     # MCP/OpenAI/Anthropic/Ollama API
│   ├── plexus-worker/      # Block hosting + inference + attestation
│   ├── plexus-verifier/    # Spot-check + Merkle + TEE verify
│   ├── plexus-telemetry/   # Prometheus + reputation + anti-Sybil
│   └── plexus-cli/         # Single binary entry
├── docs/
│   ├── architecture/       # Design SSOT (single source of truth)
│   ├── adr/                # Decision records
│   ├── api/                # API specification
│   └── operations/         # Operations guide
├── examples/
└── tests/e2e/

Contributing

Please read CONTRIBUTING.md first.

DCO sign-off (git commit -s) is required
Conventional Commits format
PR target = dev branch
Evidence that the local gates PASS
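The first two requirements can be checked mechanically. The sketch below tests a commit header against the Conventional Commits shape `type(scope)!: description` and looks for a `Signed-off-by:` trailer, which is what `git commit -s` adds. The function names are hypothetical and the list of accepted types is an assumption based on common convention. CONTRIBUTING.md is the authority on what the project actually accepts.

```rust
/// Commonly used Conventional Commits types (an assumed list).
const TYPES: [&str; 11] = [
    "feat", "fix", "docs", "style", "refactor", "perf", "test", "build", "ci", "chore", "revert",
];

/// Check a commit header of the form `type(scope)!: description`,
/// where `(scope)` and `!` are optional.
pub fn is_conventional_header(header: &str) -> bool {
    let Some((prefix, description)) = header.split_once(": ") else {
        return false;
    };
    if description.trim().is_empty() {
        return false;
    }
    // Drop the optional breaking-change marker.
    let prefix = prefix.strip_suffix('!').unwrap_or(prefix);
    let ty = match prefix.split_once('(') {
        // A scope must be non-empty and closed by ')'.
        Some((ty, rest)) => match rest.strip_suffix(')') {
            Some(scope) if !scope.is_empty() => ty,
            _ => return false,
        },
        None => prefix,
    };
    TYPES.contains(&ty)
}

/// True when the message contains a `Signed-off-by: Name <email>` line.
pub fn has_dco_signoff(message: &str) -> bool {
    message.lines().any(|line| {
        line.strip_prefix("Signed-off-by: ")
            .map_or(false, |rest| rest.contains('<') && rest.trim_end().ends_with('>'))
    })
}
```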

Inspiration

Plexus's design was influenced by the following projects (all of them are technically independent of Plexus):

exo-explore/exo: LAN cluster + tensor parallel
bigscience-workshop/petals: global swarm + block partition
Folding@home: reciprocity credit (non-monetary)
Tor: onion routing

License

Code: Apache License 2.0
Docs: CC-BY-SA 4.0

⚠ Early Phase 0. This README is forward-looking. See the Roadmap for the features that actually work today.
