Distributed inference and LoRA training across towers via reticulum #337

@joelteply

Description

Problem

Inference and LoRA fine-tuning are locked to the local machine. You can't use the 5090 tower's GPU from a MacBook at work, and you can't distribute training across multiple towers.

Vision

`Commands.execute('ai/inference', { model: 'coder-32b' })` transparently routes to whichever tower has the model loaded and VRAM available. The same applies to `genome/train`. The user doesn't care WHERE it runs; reticulum handles the routing.
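A minimal sketch of the caller-facing shape. The `Commands` class, `Tower` interface, and handler registry below are all hypothetical illustrations of the routing idea, not the real API:

```typescript
// Hypothetical sketch: a Commands registry that resolves the target
// tower at call time, so the caller never names a machine.
type Handler = (args: { model: string }) => Promise<unknown>;

interface Tower {
  name: string;
  loadedModels: string[];
  freeVramGb: number;
  handlers: Map<string, Handler>;
}

class Commands {
  private towers: Tower[] = [];

  register(tower: Tower): void {
    this.towers.push(tower);
  }

  // Route to a tower that has the requested model loaded.
  async execute(command: string, args: { model: string }): Promise<unknown> {
    const target = this.towers.find((t) => t.loadedModels.includes(args.model));
    if (!target) throw new Error(`no tower has model ${args.model} loaded`);
    const handler = target.handlers.get(command);
    if (!handler) throw new Error(`${target.name} cannot handle ${command}`);
    return handler(args);
  }
}
```

The point of the sketch is the indirection: the call site stays identical whether the handler runs locally or on a remote tower.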

Architecture

  • Discovery: towers announce capabilities (GPU type, VRAM, loaded models) via reticulum
  • Routing: inference requests route to best available tower (lowest latency, has model, has VRAM)
  • Training: `genome/train` can target a remote tower (`--tower=5090`, or auto-select by VRAM)
  • Transport: Tailscale (Tailscale mesh network for multi-tower remote inference #323) provides the mesh network; reticulum provides the command routing
  • Streaming: inference tokens stream back to caller in real-time (not batch)
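The routing bullet above can be sketched as a pure selection function. The `TowerCaps` shape is assumed; the real capability-announcement format isn't defined yet:

```typescript
// Sketch of the routing policy: has model, has VRAM, lowest latency.
// TowerCaps is an assumed shape for announced capabilities.
interface TowerCaps {
  name: string;
  latencyMs: number;
  freeVramGb: number;
  loadedModels: string[];
}

// Filter towers that can serve the request, then prefer lowest latency.
function pickTower(
  towers: TowerCaps[],
  model: string,
  minVramGb: number,
): TowerCaps | undefined {
  return towers
    .filter((t) => t.loadedModels.includes(model) && t.freeVramGb >= minVramGb)
    .sort((a, b) => a.latencyMs - b.latencyMs)[0];
}
```

Keeping the policy as a pure function makes it easy to unit-test the load-balancing cases (e.g. routing to the Mac's Metal GPU while the 5090 is busy training) without a live mesh.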

Use cases

  1. MacBook at work → 5090 at home for inference (via Tailscale)
  2. MacBook starts LoRA training on 5090 (32GB VRAM) while continuing to use local system
  3. Multiple towers: 5090 handles training, Mac handles UI, phone handles voice
  4. Load balancing: if 5090 is training, route inference to Mac's Metal GPU

Implementation path

  1. Tailscale mesh (Tailscale mesh network for multi-tower remote inference #323) — network connectivity
  2. Tower capability announcement (GPU, models, VRAM available)
  3. `Commands.execute()` remote routing (already designed in remote command delegation)
  4. Inference streaming over WebSocket/gRPC
  5. Training job dispatch + progress events across towers
  6. Model sync: ensure the same GGUF/adapter files are available on the target tower
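Step 6 reduces to a manifest diff: compare content hashes of model files on both sides and copy only what's missing or stale. The manifest shape (filename to sha256) is an assumption for illustration:

```typescript
// Sketch of step 6: decide which files need copying to the target tower.
// Manifest maps filename -> content hash (e.g. sha256); shape is assumed.
type Manifest = Record<string, string>;

// Files present locally that are missing or have a different hash remotely.
function filesToSync(local: Manifest, remote: Manifest): string[] {
  return Object.keys(local).filter((f) => remote[f] !== local[f]);
}
```

Hashing rather than comparing timestamps matters here: GGUF files and LoRA adapters can be re-exported with identical names, so only a content hash reliably detects a stale copy.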
