Topics: adaptive-runtime inference model-loading batching resource-scheduling fleet-layer cocapn
The medium agents breathe through — AIR is the runtime layer that makes fleet intelligence operational.
AIR (Adaptive Intelligence Runtime) is the runtime interface layer between AI agents and their compute environment. It provides dynamic model loading, adaptive batch sizing, and resource-aware scheduling — so agents can focus on thinking, not managing hardware.
Part of the Cocapn fleet — lighthouse keeper architecture.
AIR sits between the agent and the hardware. When an agent needs to run inference, AIR handles:
- Dynamic model loading — Load models on-demand, hot-swap between model sizes based on task complexity
- Adaptive batch sizing — Group requests into batches sized to the available GPU memory (small for RTX 4050, large for A100)
- Resource-aware scheduling — Route based on device capability, current load, and budget constraints
Think of it as the air traffic control tower for inference requests. Every agent submits its request; AIR routes it to the right runway.
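To make the adaptive-batching idea concrete, here's a minimal sketch, assuming a simple memory-headroom heuristic. `pick_batch_size`, the per-request memory cost, and the thresholds are illustrative assumptions, not AIR internals:

```python
# Illustrative sketch only -- pick_batch_size, the per-request memory
# cost, and the 20% headroom are assumptions, not AIR's internals.
def pick_batch_size(free_vram_gb: float, per_request_gb: float = 0.5,
                    max_batch: int = 64) -> int:
    """Size a batch to fit free GPU memory, keeping ~20% headroom."""
    usable_gb = free_vram_gb * 0.8
    return max(1, min(max_batch, int(usable_gb // per_request_gb)))

# Small batches on a 6 GB RTX 4050, large ones on an 80 GB A100.
print(pick_batch_size(6.0))    # -> 9
print(pick_batch_size(80.0))   # -> 64
```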
Install and run:

```bash
pip install cocapn-air
```

```python
import air

# Initialize the runtime
runtime = air.Runtime(
    model="fleet/default",      # Model path or registry ID
    adaptive_batching=True,     # Dynamically size batches
    resource_policy="auto"      # Let AIR decide based on device
)

# Execute inference
result = runtime.execute("What is the current fleet status?")
print(f"Response: {result['text']}")
print(f"Model: {result['model']}")
print(f"Latency: {result['latency_ms']}ms")
print(f"Batch size: {result['batch_size']}")
```

The full set of configuration options:

```python
runtime = air.Runtime(
model="glm-5.1", # Specific model
adaptive_batching=True, # True = auto-size batches
batch_size=8, # Fixed batch size (overrides adaptive)
resource_policy="jetson", # Target device: "auto" | "jetson" | "cloud" | "cpu"
max_tokens=2048, # Generation limit
temperature=0.7, # Sampling temperature
)AIR/
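As a rough illustration of what `resource_policy="auto"` might do, the sketch below maps the policy to one of the accepted targets. The detection logic (including the use of `torch` for it) is an assumption, not AIR's actual selection code:

```python
# Hypothetical sketch of resolving resource_policy="auto"; the torch-based
# detection below is an assumption, not AIR's actual selection logic.
def resolve_policy(policy: str) -> str:
    """Map a resource_policy value to a concrete target device."""
    if policy != "auto":
        return policy  # explicit target: "jetson", "cloud", or "cpu"
    try:
        import torch
        if torch.cuda.is_available():
            vram_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
            # Small VRAM suggests an edge board; otherwise assume cloud GPU.
            return "jetson" if vram_gb < 16 else "cloud"
    except ImportError:
        pass
    return "cpu"

print(resolve_policy("auto"))
```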
Repository layout:

```
AIR/
├── README.md
├── CHARTER.md
├── DOCKSIDE-EXAM.md
├── LICENSE
└── tests/
    └── test_air_docs.py    # Documentation contract tests
```
| Component | Role |
|---|---|
| Runtime | Main entry point: initializes models, manages batching, routes requests |
| Model Loader | Loads and hot-swaps models based on device capability |
| Batch Scheduler | Groups concurrent requests into GPU-efficient batches |
| Resource Monitor | Tracks GPU memory, CPU load, latency percentiles |
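For the Resource Monitor row, a minimal sketch of rolling latency-percentile tracking might look like this. The class name and fields are hypothetical; a real monitor would also track GPU memory and CPU load, as the table notes:

```python
# Hypothetical Resource Monitor sketch -- the class and fields are
# illustrative assumptions, not AIR's actual API.
from collections import deque
import statistics

class LatencyMonitor:
    """Track latency percentiles over a rolling window of samples."""
    def __init__(self, window: int = 1000):
        self.samples: deque = deque(maxlen=window)

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def percentile(self, p: int) -> float:
        # statistics.quantiles with n=100 yields the 1st..99th percentiles.
        return statistics.quantiles(self.samples, n=100)[p - 1]

monitor = LatencyMonitor()
for ms in (120, 234, 310, 95, 1247):
    monitor.record(ms)
print(f"p50: {monitor.percentile(50):.0f} ms")
```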
Requests flow through these components as follows:

```
Agent Request
      │
      ▼
Batch Scheduler ── are there other pending requests? ── Yes ──► Group into batch
      │                                                                │
      │ No                                                             ▼
      ▼                                                         Execute on GPU
Resource Monitor                                                       │
      │                                                                ▼
      ▼                                                         Unbundle results
Return Result ────────────────────────────────────────────────────► Agent
```
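The diagram's control flow translates roughly into the sketch below. The pending queue, the `gpu_execute` callback, and the unbundling step are illustrative assumptions, and the Resource Monitor hop on the "No" path is elided:

```python
# Hypothetical scheduler sketch mirroring the diagram's control flow; the
# pending queue, gpu_execute callback, and unbundling are assumptions.
import queue

def schedule(request, pending: queue.Queue, max_batch: int, gpu_execute):
    """Group a request with any pending ones, execute, and unbundle."""
    batch = [request]
    # "Are there other pending requests?" -- Yes: group into one batch.
    while not pending.empty() and len(batch) < max_batch:
        batch.append(pending.get_nowait())
    # Execute the whole batch in a single GPU call (batch of 1 if none).
    results = gpu_execute(batch)
    # Unbundle: each agent gets back its own result.
    return dict(zip(batch, results))

pending = queue.Queue()
pending.put("Check fleet health")
out = schedule("What is the current fleet status?", pending, max_batch=8,
               gpu_execute=lambda b: [f"response to {r!r}" for r in b])
print(out)
```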
Since AIR is documentation-first (the implementation is in progress), here's a simulated demo showing the expected API:
```python
import air

# Simulate the runtime behavior
runtime = air.Runtime(
    model="fleet/default",
    adaptive_batching=True,
    resource_policy="auto"
)

# Example: Fleet status check
result = runtime.execute("Check fleet health")
# Expected output:
# {
#     'text': 'Fleet is healthy. 12 agents active, 2 services down.',
#     'model': 'glm-5.1',
#     'latency_ms': 234,
#     'batch_size': 3,
#     'device': 'jetson-orin'
# }

# Example: Complex reasoning task
result = runtime.execute("Design a routing algorithm for fleet orchestration")
# Expected output:
# {
#     'text': '[detailed response...]',
#     'model': 'glm-5.1',
#     'latency_ms': 1247,
#     'batch_size': 1,
#     'device': 'cloud-a100'
# }
```
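Note the pattern in the expected outputs: the lightweight status check is batched with other requests on the edge device, while the heavyweight reasoning task runs unbatched on a cloud A100.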
Part of the Cocapn fleet. Related repos:
| Repo | Role |
|---|---|
| JetsonClaw1-vessel | Edge-native agent case study |
| Equipment-Swarm-Coordinator | Multi-agent orchestration |
| Equipment-Consensus-Engine | Multi-agent deliberation |
| plato-sdk | Agent communication protocol |
| cudaclaw | GPU-accelerated agent orchestration |
AIR is in active development. The README and documentation describe the intended API; the implementation follows the spec in DOCKSIDE-EXAM.md.
For implementation status, see STATE.md (if present) or check the fleet-status live endpoints.
🦐 Cocapn fleet — lighthouse keeper architecture