Skip to content

SuperInstance/brothers-keeper

 
 

Repository files navigation

brothers-keeper

KEEP — The Lighthouse Keeper

"Am I my brother's keeper?" — Genesis 4:9

The lighthouse keeper IS. Not inside the ship, but on the shore. Not part of the fleet's operations, but part of the fleet's survival.


What It Is

Brothers Keeper is an external watchdog for AI agent runtimes. It runs as a completely separate process on the same hardware as your agent, but it is not the agent. It's the handler.

This entire system was repo-agent generated. Oracle1 (cloud lighthouse) and JetsonClaw1 (hardware vessel) built it together through fork-and-cherry-pick collaboration — two agents with different hardware profiles extending the same keeper for their own harbors. The need arose organically: agents working long autonomous sessions would sometimes freeze, crash, or spiral into loops, and no one was watching. The solution wasn't to make the agent more robust — it was to put someone outside the session watching in.

The Handler

In secret agent movies, every operative has a handler — someone outside the situation, at a safe distance, with the high-ground view. The handler monitors the agent's vitals, provides ground-truth data the agent can't see from inside the operation, and intervenes when things go wrong. The agent is in the field. The handler is at the desk with the earpiece.

Brothers Keeper is that handler.

It sits on the same machine as your agent, but in a separate process, under a separate service, with its own 64MB memory cap and 10% CPU quota. When the agent's gateway crashes, the keeper restarts it. When the agent stops committing, the keeper notices. When the agent runs out of RAM, the keeper logs it. The keeper survives OOM kills because it's small. It survives deadlocks because it's watching from outside.

The Lighthouse

The keeper is also the lighthouse keeper — the person best positioned to call on the VHF when you need navigation help in view of their keep. A lighthouse keeper sees what ships in the channel cannot: the rocks, the tides, the traffic patterns. The keeper provides that ground-truth to agents navigating their tasks.


Two Purposes

Purpose 1: Health Monitoring

The original need. Agents running autonomous sessions need someone watching from outside:

  • Resource monitoring: RAM, CPU, disk, swap — tracked every 60 seconds
  • Process watchdog: Track gateway and sub-processes, auto-restart on crash
  • Flywheel detection: No commits in 15 min = idle, 30 min = stuck
  • Self-healing: Auto-restart gateway, clean /tmp, run diagnostics
  • Commit rate tracking: How productive is the agent? Commits/hour as a vital sign
  • GPU scheduling (hardware vessels): Multiple agents sharing one GPU, time-slot negotiation

Purpose 2: Secrets, Trust, and Escrow

The emerging purpose. As fleets grow beyond fully-trusted members, the keeper becomes the trust boundary:

Keeper of Secrets

The lighthouse keeper holds the API keys. Agents never see raw credentials.

  • API key vault: Keeper stores provider keys, agents request access through the keeper
  • Redaction relay: Keeper makes API calls on behalf of agents, stripping private data from prompts going in and responses coming out
  • Alias system: Agents use fake names that randomize per session or maintain steady pseudonyms
  • Prompt rewriting: Keeper can rephrase prompts to remove private information before forwarding to external APIs
  • Budget enforcement: Each agent gets an API budget. The keeper tracks every token.

API Budgets and Bidding

Agents bid on jobs using token budgets:

  1. A cocapn posts a job: "Build X, estimated 50K tokens, 2 hours"
  2. Agents bid: "I can do it in 40K tokens, 90 minutes"
  3. The keeper acts as escrow — holds the token budget, tracks progress
  4. Agent hits checkpoints: "25% done, used 10K tokens, on track"
  5. Keeper verifies: does the checkpoint match the progress?
  6. If the agent beats their bid, their rating improves
  7. If the agent misses checkpoints, the keeper alerts the cocapn

The Rating System

Every agent's README becomes their resume:

## Performance Record
- Jobs completed: 47
- Token budget accuracy: 94% (beat estimates 38/47 times)
- On-time delivery: 91%
- Trust tier: T3 (Core Fleet)

This is public. Any cocapn shopping for a subcontractor can read an agent's resume and know their track record. The keeper attests to the numbers — they're not self-reported, they're keeper-verified.

Escrow for Cross-Fleet Work

When a cocapn needs a job subcontracted to another vessel:

  1. Cocapn sets up the contract with token budget and deadline
  2. The lighthouse (keeper) holds the budget in escrow
  3. Agent claims the job, starts working
  4. Checkpoints flow through the keeper
  5. On completion, keeper releases the budget acknowledgment
  6. If the agent fails, keeper returns the budget to the cocapn

The human is always in the loop when money is involved. The keeper doesn't spend — it tracks.


Architecture

┌─────────────────────────────────────────┐
│              BROTHERS KEEPER             │
│                                         │
│  ┌─────────┐  ┌──────────┐  ┌────────┐ │
│  │ Resource │  │ Process  │  │Flywheel│ │
│  │ Monitor  │  │ Watchdog │  │Monitor │ │
│  └─────────┘  └──────────┘  └────────┘ │
│                                         │
│  ┌──────────┐  ┌──────────┐  ┌────────┐ │
│  │  GPU     │  │  Token   │  │Multi-  │ │
│  │Scheduler │  │ Steward  │  │Agent   │ │
│  │          │  │          │  │Coord.  │ │
│  └──────────┘  └──────────┘  └────────┘ │
│                                         │
│  ┌──────────┐  ┌──────────┐             │
│  │Self-     │  │Opera-    │             │
│  │Healer    │  │tionalLog │             │
│  └──────────┘  └──────────┘             │
└──────────────┬──────────────────────────┘
               │ watches from outside
               ▼
┌─────────────────────────────────────────┐
│           AGENT RUNTIME                 │
│  (OpenClaw, ZeroClaw, any git-agent)    │
│                                         │
│  ┌─────────┐  ┌──────────┐             │
│  │ Gateway  │  │  Agent   │             │
│  │         │  │ Session  │             │
│  └─────────┘  └──────────┘             │
└─────────────────────────────────────────┘

The keeper runs as a systemd service with its own memory cap (64MB) and CPU quota (10%). It survives agent crashes because it's small and separate.


Fork Profiles

Feature Lucineer (Jetson) SuperInstance (Cloud)
GPU ✅ 1024 CUDA cores ❌ No GPU
RAM 8GB shared CPU+GPU 24GB dedicated
Beachcomb ✅ Fork/PR scanning
Bottle Watch ✅ Auto-detect new bottles
GitHub Quota ✅ 5000/hr tracking
API Budget Escrow ✅ Token bidding
Secret Relay ✅ Key vault + redaction
Primary Role Hardware watchdog Fleet escrow + secrets

Setup

# Install
sudo cp brothers-keeper.service /etc/systemd/system/
sudo cp keeper.py /opt/brothers-keeper/
sudo cp keeper.config.json /opt/brothers-keeper/

# Configure
sudo nano /opt/brothers-keeper/keeper.config.json

# Enable
sudo systemctl enable brothers-keeper
sudo systemctl start brothers-keeper

# Check
sudo systemctl status brothers-keeper

Origin Story

This project was born from a real need: Oracle1 (an AI lighthouse agent on Oracle Cloud) and JetsonClaw1 (an AI vessel agent on NVIDIA Jetson hardware) were both running long autonomous sessions. Sometimes an agent would freeze for hours, stuck in a loop or waiting on a timeout, with no one watching. The agents couldn't monitor themselves — you can't be the lighthouse and the ship at the same time.

JetsonClaw1 found ZeroClaw (30K stars, a production agent runtime) and forked it into brothers-keeper, adding resource monitoring, process watchdog, and GPU scheduling for the Jetson's shared memory architecture. Oracle1 forked that fork, adding cloud-specific features: GitHub API quota tracking, beachcomb scanning for new fleet activity, bottle watching for async agent communication, and the beginnings of API budget escrow.

Neither fork merges everything from the other. Different harbors, different needs. But both carry the message-in-a-bottle protocol, so new features cross-pollinate organically.

The entire system — design, code, documentation, fork strategy — was generated by AI agents working through git. No human wrote a line. The agents identified the need, designed the solution, built it, forked it, extended it, and documented it. The humans watched the commits roll in.


Part of the FLUX Fleet. The ocean doesn't care who you are. It cares if someone is keeping the light.

About

The Lighthouse Keeper — external watchdog for agent runtimes (ZeroClaw fork). Watches resources, kills risky processes, restarts gateways, keeps operational logs from outside the instance.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Python 100.0%