LLM Providers
CortexPrism supports 24 LLM providers out of the box. Each provider is implemented in src/llm/ and exposes a common LLMProvider interface.
| Provider | File | Type | Notes |
|---|---|---|---|
| Anthropic Claude | `anthropic.ts` | Native SDK | SSE streaming, extended thinking |
| OpenAI | `openai.ts` | Native SDK | GPT-4o, o-series models, reasoning effort |
| Google Gemini | `google.ts` | Native SDK | Native vision, thinking budget |
| Mistral AI | `mistral.ts` | OpenAI-compatible | Mistral API |
| Groq | `groq.ts` | OpenAI-compatible | Ultra-fast inference |
| DeepSeek | `deepseek.ts` | OpenAI-compatible | Chat + Reasoner models |
| OpenRouter | `openrouter.ts` | OpenAI-compatible | Routes to 200+ models |
| xAI Grok | `xai.ts` | OpenAI-compatible | Grok-2 and newer |
| Together AI | `together.ts` | OpenAI-compatible | 100+ open models |
| AWS Bedrock | `bedrock.ts` | AWS SDK | Converse API (Claude, Llama, Titan) |
| Cohere | `cohere.ts` | Native v2 API | Command R+ models |
| Ollama | `ollama.ts` | Native | Local models, NDJSON streaming |
| Cerebras | `cerebras.ts` | OpenAI-compatible | Cerebras inference |
| Fireworks | `fireworks.ts` | OpenAI-compatible | Fireworks AI |
| Perplexity | `perplexity.ts` | OpenAI-compatible | Online LLM with citations |
| NVIDIA NIM | `nvidia.ts` | OpenAI-compatible | NVIDIA inference |
| Moonshot (Kimi) | `moonshot.ts` | OpenAI-compatible | Moonshot AI |
| Novita AI | `novita.ts` | OpenAI-compatible | Novita inference |
| LM Studio | `lmstudio.ts` | OpenAI-compatible | Local LLM server |
| LiteLLM | `litellm.ts` | OpenAI-compatible | Multi-provider proxy |
| Hugging Face | `huggingface.ts` | OpenAI-compatible | HF Inference Router |
| Alibaba (Qwen) | `alibaba.ts` | OpenAI-compatible | Alibaba Cloud |
| Venice AI | `venice.ts` | OpenAI-compatible | Privacy-focused LLMs |
| Kilo AI | `kilo.ts` | OpenAI-compatible | Kilo-hosted models |
All providers implement:
```typescript
interface LLMProvider {
  readonly name: string;
  readonly defaultModel: string;
  complete(options: CompletionOptions): Promise<CompletionResult>;
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk>;
}
```

Some providers expose provider-specific parameters in config:
| Provider | Field | Purpose |
|---|---|---|
| Anthropic, OpenAI, Google | `reasoningEffort` | Extended thinking budget (low/medium/high) |
| OpenRouter | `httpReferer`, `xTitle` | HTTP headers for ranking |
| Perplexity | `searchRecencyFilter` | Recency filter (month/week/day/hour) |
| Together, Fireworks, Novita | `repetitionPenalty` | Token repetition penalty |
| Ollama, LM Studio | `numCtx`, `keepAlive` | Context window size, keep-alive duration |
| LiteLLM | `dropParams` | Drop unsupported parameters silently |
| Venice AI | `includeVeniceSystemPrompt` | Venice system prompt toggle |
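
One way to keep these provider-specific fields type-safe is a discriminated union keyed on `kind`. The sketch below is illustrative only: the field names come from the table above and `kind` from the sample configuration, but the type names, the value types, and the helper `describeExtras` are assumptions rather than CortexPrism's actual schema.

```typescript
// Hypothetical config types; CortexPrism's real schema may differ.
interface BaseProviderConfig {
  model: string;
  apiKey?: string;
}

interface AnthropicConfig extends BaseProviderConfig {
  kind: "anthropic";
  reasoningEffort?: "low" | "medium" | "high";
}

interface OpenRouterConfig extends BaseProviderConfig {
  kind: "openrouter";
  httpReferer?: string;
  xTitle?: string;
}

interface PerplexityConfig extends BaseProviderConfig {
  kind: "perplexity";
  searchRecencyFilter?: "month" | "week" | "day" | "hour";
}

interface OllamaConfig extends BaseProviderConfig {
  kind: "ollama";
  numCtx?: number;
  keepAlive?: string; // assumed here to be a duration string
}

type ProviderConfig =
  | AnthropicConfig
  | OpenRouterConfig
  | PerplexityConfig
  | OllamaConfig;

// Lists the provider-specific fields that are set on a config.
// Switching on `kind` lets the compiler narrow to the right variant.
function describeExtras(config: ProviderConfig): string[] {
  const extras: string[] = [];
  switch (config.kind) {
    case "anthropic":
      if (config.reasoningEffort) extras.push(`reasoningEffort=${config.reasoningEffort}`);
      break;
    case "openrouter":
      if (config.httpReferer) extras.push(`httpReferer=${config.httpReferer}`);
      if (config.xTitle) extras.push(`xTitle=${config.xTitle}`);
      break;
    case "perplexity":
      if (config.searchRecencyFilter) extras.push(`searchRecencyFilter=${config.searchRecencyFilter}`);
      break;
    case "ollama":
      if (config.numCtx !== undefined) extras.push(`numCtx=${config.numCtx}`);
      if (config.keepAlive) extras.push(`keepAlive=${config.keepAlive}`);
      break;
  }
  return extras;
}
```

With this shape, setting `searchRecencyFilter` on an `ollama` entry would be a compile-time error instead of a silently ignored field.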
Providers with native vision/image support:
| Provider | Image Support | Document Support |
|---|---|---|
| Anthropic | Native image blocks | Native document blocks |
| Google Gemini | `inlineData` parts | — |
| OpenAI | `image_url` parts | — |
| Ollama | `images` array | — |
| Bedrock | Text extraction | Text extraction |
For text-only models, uploaded files are saved to disk and a descriptive message is appended to the prompt.
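
The differences between these image formats can be sketched as a small mapping function. The payload shapes below follow each vendor's public API as commonly documented; how CortexPrism builds them internally is not shown here, so `toImagePart`, `ImageInput`, `describeUpload`, and the wording of the fallback message are all hypothetical.

```typescript
// Providers with native image support, per the table above.
type VisionProvider = "anthropic" | "google" | "openai" | "ollama";

interface ImageInput {
  mimeType: string; // e.g. "image/png"
  base64: string;   // base64-encoded bytes, without a "data:" prefix
}

// Converts one image into the shape each provider's API expects.
function toImagePart(provider: VisionProvider, image: ImageInput): Record<string, unknown> {
  switch (provider) {
    case "anthropic":
      // Image content block with a base64 source.
      return {
        type: "image",
        source: { type: "base64", media_type: image.mimeType, data: image.base64 },
      };
    case "google":
      // Gemini inlineData part.
      return { inlineData: { mimeType: image.mimeType, data: image.base64 } };
    case "openai":
      // image_url part carrying a data URL.
      return {
        type: "image_url",
        image_url: { url: `data:${image.mimeType};base64,${image.base64}` },
      };
    case "ollama":
      // Ollama takes base64 strings in an images array on the message itself,
      // so this object is meant to be merged into the message.
      return { images: [image.base64] };
  }
}

// Fallback for text-only models: mention the saved file in the prompt.
// The message wording is an assumption for illustration.
function describeUpload(prompt: string, savedPath: string): string {
  return `${prompt}\n\n[Attached file saved to ${savedPath}]`;
}
```

Keeping the conversion in one place means the rest of the pipeline can pass around a single `ImageInput` and defer provider-specific formatting to the last step.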
The model router wraps providers with intelligent routing strategies:
- Cascade: Tries cheapest provider first, escalates on low confidence
- Threshold (RouteLLM-style): Scores prompt complexity, routes to strong or weak model
Both routers implement LLMProvider, making them drop-in replacements.
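
To illustrate why a router can be a drop-in replacement, here is a minimal sketch of a cascade that itself implements `LLMProvider`. It is not CortexPrism's implementation: the shapes of `CompletionOptions`, `CompletionResult`, and `CompletionChunk` are assumed (this page names them but does not define them), and the `confidence` field, the `minConfidence` threshold, and the class name `CascadeRouter` are hypothetical.

```typescript
// Assumed shapes; the real types are not defined on this page.
interface CompletionOptions { prompt: string; }
interface CompletionResult { text: string; confidence?: number; }
interface CompletionChunk { text: string; }

// Interface as given earlier on this page.
interface LLMProvider {
  readonly name: string;
  readonly defaultModel: string;
  complete(options: CompletionOptions): Promise<CompletionResult>;
  stream(options: CompletionOptions): AsyncIterable<CompletionChunk>;
}

// Hypothetical cascade: tiers are ordered cheapest first.
class CascadeRouter implements LLMProvider {
  readonly name = "cascade";
  readonly defaultModel: string;
  private readonly tiers: LLMProvider[];
  private readonly minConfidence: number;

  constructor(tiers: LLMProvider[], minConfidence = 0.7) {
    if (tiers.length === 0) {
      throw new Error("CascadeRouter needs at least one provider");
    }
    this.tiers = tiers;
    this.minConfidence = minConfidence;
    this.defaultModel = tiers[0].defaultModel;
  }

  async complete(options: CompletionOptions): Promise<CompletionResult> {
    let last: CompletionResult | undefined;
    for (const provider of this.tiers) {
      last = await provider.complete(options);
      // A result without a confidence score is treated as confident.
      if ((last.confidence ?? 1) >= this.minConfidence) return last;
    }
    // Every tier fell below the threshold: return the last tier's answer.
    return last!;
  }

  async *stream(options: CompletionOptions): AsyncIterable<CompletionChunk> {
    // Confidence is only known once an answer is complete, so this sketch
    // routes first and then emits the chosen answer as a single chunk.
    const result = await this.complete(options);
    yield { text: result.text };
  }
}
```

Because `CascadeRouter` satisfies the same interface as its tiers, calling code that accepts an `LLMProvider` does not need to know whether it is talking to one model or several. A threshold router could follow the same pattern, scoring the prompt before choosing a single tier.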
The Model Quartermaster (MQM) is a learning-based model selection engine that dynamically routes requests based on:
- Historical performance by task category
- Episodic memory hits and quality signals
- Cost optimization
- Trajectory patterns (recent model usage)
- Reflection feedback
MQM provides three arbiter strategies: conservative, balanced, and aggressive.
Configure in ~/.cortex/config.json:
```json
{
  "defaultProvider": "anthropic",
  "providers": {
    "anthropic": {
      "kind": "anthropic",
      "model": "claude-sonnet-4-5",
      "apiKey": "sk-ant-..."
    }
  },
  "modelSelection": {
    "enabled": true,
    "mode": "balanced",
    "observeThreshold": 50
  }
}
```

- Configuration — Full config reference
- Model Routing — Router strategies in detail
- Model Quartermaster — MQM architecture
CortexPrism — Open-source agentic AI harness · MIT License · Built with Deno 2.x + TypeScript