
Grid health: resource envelope reporting + workload scheduling #864

@joelteply

Description


Vision

Each Continuum node reports its resource envelope to the grid, enabling intelligent workload placement.

Phases

v0 (today): 80% RAM self-policing per node. The hardcoded 4096 MB limit is replaced with dynamic detection of physical RAM, via sysctl on macOS/BSD or /proc/meminfo on Linux.
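A minimal sketch of the v0 self-policing logic, assuming the Linux path (parse MemTotal out of /proc/meminfo, take 80% of it as the node's budget). Function names here are illustrative, not the actual continuum-core API:

```python
import re

def parse_meminfo_total_mb(meminfo_text: str) -> int:
    """Extract MemTotal (reported in kB) from /proc/meminfo content; return MB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if not match:
        raise ValueError("MemTotal not found in meminfo")
    return int(match.group(1)) // 1024

def ram_budget_mb(total_mb: int, fraction: float = 0.8) -> int:
    """The 80% self-policing budget for this node."""
    return int(total_mb * fraction)

# On a real node: ram_budget_mb(parse_meminfo_total_mb(open("/proc/meminfo").read()))
```

On macOS/BSD the same budget would come from `sysctl hw.memsize` instead of /proc/meminfo.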

v1: Each node reports its resource envelope (RAM, VRAM, CPU, disk, network) to the grid coordinator. The coordinator assigns per-node budgets based on fleet-wide capacity.
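One possible shape for the v1 envelope report; the field names and JSON encoding are assumptions, not a defined Continuum wire format:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class ResourceEnvelope:
    # Hypothetical report fields covering the five axes listed above.
    node_id: str
    ram_mb: int
    vram_mb: int
    cpu_cores: int
    disk_gb: int
    net_mbps: int

def to_report(env: ResourceEnvelope) -> str:
    """Serialize the envelope for transmission to the grid coordinator."""
    return json.dumps(asdict(env), sort_keys=True)
```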

v2: The grid coordinator IS the scheduler: it places workloads by capacity. A 48 GB node gets a 27B model; an 8 GB node gets a 0.8B one. Like ECS task placement, but for AI inference.
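A hedged sketch of that placement rule: pick the largest model tier whose footprint fits inside the node's 80% RAM budget. The tier list and the rough GB-per-billion-params factor are illustrative assumptions chosen to reproduce the 48 GB / 8 GB example above, not measured numbers:

```python
MODEL_TIERS_B = [0.8, 7, 13, 27]  # candidate model sizes, billions of params (assumed)
GB_PER_B_PARAMS = 1.4             # assumed footprint incl. runtime overhead

def place_model(node_ram_gb: float, budget_fraction: float = 0.8) -> float:
    """Return the largest model tier that fits this node's RAM budget."""
    budget_gb = node_ram_gb * budget_fraction
    fitting = [b for b in MODEL_TIERS_B if b * GB_PER_B_PARAMS <= budget_gb]
    if not fitting:
        raise RuntimeError("no model tier fits this node")
    return max(fitting)
```

With these assumed constants, a 48 GB node lands on the 27B tier and an 8 GB node on the 0.8B tier, matching the placement example above.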

v3: Containerized forge jobs are dispatched to nodes with available GPU capacity. The Factory becomes a distributed job scheduler across the mesh.
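The v3 dispatch step could look something like the sketch below: filter nodes by free VRAM and route the job to the one with the most headroom. The data shapes and the best-fit policy are assumptions, not the Factory's actual algorithm:

```python
from typing import Optional

def pick_gpu_node(nodes: dict[str, int], required_vram_mb: int) -> Optional[str]:
    """nodes maps node_id -> free VRAM in MB; returns the chosen node or None."""
    candidates = {n: free for n, free in nodes.items() if free >= required_vram_mb}
    if not candidates:
        return None  # no GPU headroom anywhere: queue the job until one frees up
    # Widest-headroom placement; a real scheduler might prefer tightest-fit instead.
    return max(candidates, key=candidates.get)
```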

Context

The 80% RAM guard was added to fix MEMLEAK FATAL killing continuum-core on machines with more than 4 GB of RAM (issue #603). This is the local-only v0; grid coordination is the long-term architecture.

Related: Foreman (per-node executor) and Ares (centralized authority) role split.
