An OpenAI-compatible API layer for AI Horde that enables opencode and any OpenAI-compatible client to use AI Horde's distributed GPU network for text generation.
AI Horde is a crowdsourced distributed cluster of text and image generation workers. This interposer translates OpenAI chat completion requests to AI Horde's native async format and handles the polling workflow.
```
┌─────────────┐    ┌──────────────────┐    ┌─────────────┐
│  opencode   │───▶│    Interposer    │───▶│  AI Horde   │
│   or any    │    │  (this project)  │    │     API     │
│ OpenAI SDK  │◀───│                  │◀───│             │
└─────────────┘    └──────────────────┘    └─────────────┘
                            │
                            ▼
                   ┌───────────────┐
                   │ Model Registry│
                   │ (from workers)│
                   └───────────────┘
```
- OpenAI-compatible endpoints: `/v1/chat/completions` and `/v1/models`
- Automatic request translation: converts OpenAI format to AI Horde format
- Async polling: handles the submit/poll/retrieve workflow automatically
- Model discovery: fetches model capabilities from `/v2/workers?type=text`
- Instruct format support: ChatML, Mistral, and Alpaca prompt formats
- OpenCode integration: auto-updating `opencode.json` with available models
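To make the instruct-format support concrete, here is a minimal sketch of how OpenAI-style messages map onto a ChatML prompt. The function name is hypothetical and the details are an illustration, not the project's actual `translate.py` implementation:

```python
# Illustrative sketch: OpenAI chat messages -> ChatML prompt string.
# The helper name and exact formatting are assumptions for this example.

def to_chatml(messages: list[dict]) -> str:
    """Render OpenAI-style chat messages as a ChatML prompt."""
    parts = []
    for msg in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Tell me a joke."},
])
print(prompt)
```

The Mistral and Alpaca formats follow the same idea with different turn delimiters.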
Install the package:

```
pip install -e .
```

Set your API key (requests fall back to the low-priority anonymous queue if it is left empty):

```
SET AI_HORDE_API_KEY=your_api_key
```

Start the server:

```
uvicorn horde_openai.server:app --host 0.0.0.0 --port 8080
```

You WILL randomly hit 403 errors if the API key is not set; this is expected behavior, not a bug.
Send a chat completion request:

```
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "awesome_engine/splendid_model",
    "messages": [{"role": "user", "content": "Hello! How are you?"}],
    "max_tokens": 50
  }'
```

List the available models:

```
curl http://localhost:8080/v1/models
```

Generate the OpenCode configuration once:

```
python update_opencode_models.py --once
```

This creates an `opencode.json` with:
- All available AI Horde text models
- Proper OpenCode provider format
- Model-specific context/output limits
- Default model set to most available
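As a rough picture of what the updater produces, the sketch below builds a provider-style config from worker data. The field names and schema here are assumptions for illustration; the real `opencode.json` format is defined by OpenCode, and the worker entry would come from `/v2/workers?type=text`:

```python
# Sketch only: hypothetical opencode.json shape, NOT the verified schema.
import json

horde_models = [  # in practice, fetched from /v2/workers?type=text
    {"name": "koboldcpp/Fimbulvetr-11B-v2",
     "max_context_length": 4096, "max_length": 512},
]

config = {
    "provider": {
        "aihorde": {  # provider id is an assumption
            "options": {"baseURL": "http://localhost:8080/v1"},
            "models": {
                m["name"]: {
                    # per-model context/output limits from worker data
                    "limit": {"context": m["max_context_length"],
                              "output": m["max_length"]},
                }
                for m in horde_models
            },
        }
    }
}
print(json.dumps(config, indent=2))
```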
```
# Run continuously with 5-minute refresh (default)
python update_opencode_models.py

# Custom refresh interval
python update_opencode_models.py --interval 600
```

```
HordeStreaming/
├── src/horde_openai/
│   ├── __init__.py              # Package exports
│   ├── client.py                # AI Horde HTTP client with async polling
│   ├── models.py                # Model registry from /v2/workers
│   ├── translate.py             # Request/response translation
│   └── server.py                # FastAPI server with OpenAI endpoints
├── tests/
│   └── test_interposer.py       # 26 unit tests
├── docs/
│   └── INTERPOSER_SPEC.md       # API specification
├── opencode.json                # OpenCode provider config (auto-generated)
├── pyproject.toml               # Package configuration
└── update_opencode_models.py    # Model updater script
```
POST `/v1/chat/completions`

```
{
  "model": "koboldcpp/Fimbulvetr-11B-v2",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Tell me a joke."}
  ],
  "temperature": 0.7,
  "max_tokens": 100
}
```

GET `/v1/models`
Returns all available text generation models with their capabilities.
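The response follows the standard OpenAI model-list shape (`{"object": "list", "data": [...]}`), so any OpenAI client can consume it. A small sketch of picking a model out of such a response; the sample entries below are made up:

```python
# Extracting model ids from a /v1/models-style response.
# The entries in `sample` are illustrative, not real Horde output.
sample = {
    "object": "list",
    "data": [
        {"id": "koboldcpp/Fimbulvetr-11B-v2", "object": "model",
         "owned_by": "ai-horde"},
        {"id": "aphrodite/splendid_model", "object": "model",
         "owned_by": "ai-horde"},
    ],
}

model_ids = [m["id"] for m in sample["data"]]
print(model_ids)
```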
- Request received at `/v1/chat/completions`
- Translate OpenAI messages to the AI Horde prompt format
- Submit to `/api/v2/generate/text/async` → get a job ID
- Poll `/api/v2/generate/text/status/{id}` until done
- Translate the response back to OpenAI format
- Return the completion response
Run the tests:

```
pytest tests/ -v
```

| Variable | Description | Default |
|---|---|---|
| `AI_HORDE_API_KEY` | AI Horde API key | `0000000000` (anonymous) |
Even though the anonymous API key works, it is still suggested that you obtain your own key (and contribute back to the Horde).
- No true streaming: AI Horde's async API doesn't support real-time streaming
- Latency: 2-30 seconds depending on queue
- Token limit: Maximum 4096 tokens per generation
- Availability: Depends on volunteer workers
MIT