~5 MB binary. Zero dependencies. Zero 429s. A lightweight Go proxy that silently absorbs rate limit errors and keeps your AI agent running.
You're in the middle of an agentic session — OpenClaw is halfway through a task, Cline is on a roll, your agent is doing things — and then:
Error: 429 Too Many Requests
The loop breaks. Context is lost. You're staring at a spinner.
If you use free-tier providers like NVIDIA NIM, this happens constantly. Free keys cap around 40 RPM. One productive session burns through that in seconds.
Alvus sits between your agent and the upstream API. You give it a pool of keys. It handles everything else — round-robin distribution, per-key cooldowns, automatic retries, streaming passthrough. Your agent never sees a 429.
Any OpenAI-compatible agent or IDE
│
▼
┌─────────────────────┐
│ Alvus │ ← localhost:3000
│ │
│ [key1] ✅ ready │
│ [key2] ✅ ready │ ──→ NVIDIA NIM / any OpenAI-compatible API
│ [key3] ❄️ cooling │
└─────────────────────┘
3 keys × 40 RPM = 120 effective RPM. The math is simple. The setup is simpler.
Idle RAM usage: ~2 MB. Alvus is a single static binary with no runtime. It won't compete with your models for memory.
If it speaks an OpenAI-compatible API, it works with Alvus.
| Tool | Type | Setup |
|---|---|---|
| OpenClaw | AI agent | Set base URL in provider config |
| PicoClaw | Lightweight agent | Set api_base in config.json |
| Nanobot | Lightweight agent | Set api_base in config.yaml |
| Cline | VS Code agent | OpenAI Compatible provider |
| Cursor | IDE | Base URL override in settings |
| Aider | CLI agent | --openai-api-base flag |
| Any OpenAI-compatible client | — | Point at http://localhost:3000/v1 |
| Feature | Description |
|---|---|
| 🔑 Key pool | Multiple keys, one endpoint. Distribute load transparently |
| 🔄 Round-robin | Even distribution across all healthy keys |
| 🚫 Silent retry on 429 | Failed key enters cooldown, request retries instantly with the next |
| ⏱️ Retry-After support | Respects upstream Retry-After headers — no blind fixed waits |
| 🔑 Auto-disable on 401/403 | Invalid or revoked keys are permanently removed from the pool |
| 📡 Streaming passthrough | SSE and chunked responses piped with zero buffering overhead |
| ❤️ Health endpoint | GET /health shows live key status and cooldown timers |
| 🪶 Zero dependencies | Pure Go stdlib. One file. One binary |
| 🔧 .env support | Built-in parser — no godotenv, no extras |
| 🖥️ Runs anywhere | linux/amd64, arm64, arm, 386 — including Pi Zero and older x86 hardware |
| 💾 ~2 MB idle RAM | Static binary, no runtime, won't compete with your models for memory |
Build from source (requires Go 1.21+):
git clone https://github.com/YOUR_USERNAME/alvus.git
cd alvus
go build -o alvus main.go

Cross-compile for a remote server (e.g. Raspberry Pi Zero, 32-bit x86):
# Pi Zero / older ARM
GOOS=linux GOARCH=arm CGO_ENABLED=0 go build -o alvus main.go
# 32-bit x86 (Atom, old netbooks, salvaged hardware)
GOOS=linux GOARCH=386 CGO_ENABLED=0 go build -o alvus main.go

The binary is fully static — drop it on the machine and run it. No runtime, no dependencies, no install step.
Download a prebuilt release:
Go to Releases and grab the binary for your platform.
Create .env in the same directory as the binary:
# Your API keys, comma-separated
API_KEYS=nvapi-xxxxxxxxxxxx,nvapi-yyyyyyyyyyyy,nvapi-zzzzzzzzzzzz
# Port to listen on (default: 3000)
PORT=3000
# Upstream API base URL (default: NVIDIA NIM)
TARGET_BASE_URL=https://integrate.api.nvidia.com/v1
# Seconds to cool down a key after a 429 (default: 60)
COOLDOWN_SEC=60

Real environment variables take precedence over .env — useful for systemd or containers.
./alvus

⚡ Alvus started on :3000
Target : https://integrate.api.nvidia.com/v1
Keys : 3 loaded
Cooldown: 60s per key on 429
{
"models": {
"providers": {
"nim": {
"baseUrl": "http://localhost:3000/v1",
"apiKey": "sk-proxy-dummy"
}
},
"defaults": {
"provider": "nim",
"model": "deepseek-ai/deepseek-r1"
}
}
}

{
"model_name": "deepseek-r1",
"model": "openai/deepseek-ai/deepseek-r1",
"api_base": "http://localhost:3000/v1",
"api_keys": ["sk-proxy-dummy"]
}| Setting | Value |
|---|---|
| API Provider | OpenAI Compatible |
| Base URL | http://localhost:3000/v1 |
| API Key | sk-proxy-dummy (any string) |
| Model ID | deepseek-ai/deepseek-r1 |
Settings → Models → set base URL to http://localhost:3000/v1, any dummy key.
aider --openai-api-base http://localhost:3000/v1 --openai-api-key sk-dummy

1. Request arrives from your agent or IDE
2. Body is buffered (needed for retry replay)
3. Round-robin picks the next available key
4. Request forwarded upstream with that key injected
│
├── ✅ 2xx/3xx → headers + body streamed back, done
├── ❄️ 429 → key enters cooldown, retry with next key
├── 🔑 401/403 → key permanently removed from pool
└── ⚠️ other 4xx/5xx → passed through as-is
Your agent sees a clean stream or a final error. Never a 429.
curl http://localhost:3000/health

{
"status": "ok",
"keys": 3,
"pool": "[0]:ready [1]:cooling(42s) [2]:ready"
}

TARGET_BASE_URL is all you need to change:
# OpenRouter
TARGET_BASE_URL=https://openrouter.ai/api/v1
# Together AI
TARGET_BASE_URL=https://api.together.xyz/v1
# Groq
TARGET_BASE_URL=https://api.groq.com/openai/v1
# Any other OpenAI-compatible endpoint
TARGET_BASE_URL=https://your-provider.com/v1

[Unit]
Description=Alvus
After=network.target
[Service]
ExecStart=/usr/local/bin/alvus
WorkingDirectory=/etc/alvus
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target

Put your .env in /etc/alvus/. Reload and start:
sudo systemctl daemon-reload
sudo systemctl enable --now alvus

Do I need Go installed to run this?
No. Download a prebuilt binary from Releases.
Are my keys safe?
Keys live in .env on your machine and are only ever sent to the upstream provider. Alvus logs key indices, never key values.
What if ALL keys are cooling?
Alvus waits for the soonest key to become available and retries, up to 10 times. If everything stays exhausted, it returns 503. In practice, with 3 keys and a 60s window this is very hard to trigger.
Can I reload keys without restarting?
Not yet — planned for a future release. For now, restart the binary after editing .env. With systemd: sudo systemctl restart alvus.
Does it work on a Raspberry Pi Zero / 32-bit hardware?
Yes. Prebuilt binaries include linux/arm and linux/386. The binary is fully static — no runtime needed.
How much memory does it use?
Around 2 MB at idle. It's a single static Go binary with no runtime overhead — you won't notice it sitting next to a running model.
- Hot-reload when .env changes (no restart needed)
- Per-key request counters in /health
- Web dashboard (opt-in, zero-dep binary stays the same)
PRs welcome. This project lives in a single file with zero external dependencies — keep it that way. If a feature needs an import beyond stdlib, it doesn't belong in main.go. Open an issue first and we'll figure out the right shape for it.
MIT.
Built at 2am when an OpenClaw task hit its fifth 429 in a row.