🔀 Multi-Provider Routing

Note

👋 Hey there! Siyarix is a personal passion project built by a single developer and is under active development. Some of the architectural components and features described on this page might currently be Planned, Work in Progress, or basic implementations. Stay tuned as it evolves! 🚀

Siyarix supports 25 AI providers (24 cloud/local + 1 offline registry), all accessible through a unified, OpenAI-compatible adapter located in openai_compat.py.

At the heart of this system is the ProviderManager singleton. Think of it as the air traffic controller for your AI requests—it handles provider registration, credential pooling, seamless failover, exponential-backoff cooldowns, and smart multi-model ensemble decisions.

🌐 Supported Providers

Siyarix integrates with a wide array of top-tier AI providers. Here's a quick look at what's supported out of the box:

Provider	Type	Env Variable	Default Model	Base URL
OpenAI	Cloud	`OPENAI_API_KEY`	gpt-5.5	(default)
Anthropic	Cloud	`ANTHROPIC_API_KEY`	claude-sonnet-4-6	(via openai compat)
Google Gemini	Cloud	`GEMINI_API_KEY`	gemini-3.1-pro-preview	`generativelanguage.googleapis.com/v1beta/openai/`
DeepSeek	Cloud	`DEEPSEEK_API_KEY`	deepseek-v4-flash	`api.deepseek.com`
xAI (Grok)	Cloud	`XAI_API_KEY`	grok-4.1	`api.x.ai`
Perplexity	Cloud	`PERPLEXITY_API_KEY`	sonar-pro	`api.perplexity.ai`
Groq	Cloud	`GROQ_API_KEY`	llama-4-scout	`api.groq.com/openai/v1`
Together AI	Cloud	`TOGETHER_API_KEY`	Llama-4-Scout	`api.together.xyz/v1`
OpenRouter	Cloud	`OPENROUTER_API_KEY`	openai/gpt-5.5	`openrouter.ai/api/v1`
Cerebras	Cloud	`CEREBRAS_API_KEY`	gpt-oss-120b	`api.cerebras.ai/v1`
Fireworks AI	Cloud	`FIREWORKS_API_KEY`	kimi-k2.6	`api.fireworks.ai/inference/v1`
Mistral AI	Cloud	`MISTRAL_API_KEY`	(from profile)	`api.mistral.ai`
Z.AI	Cloud	`ZAI_API_KEY`	glm-5.1	`api.z.ai/api/paas/v4`
MiniMax	Cloud	`MINIMAX_API_KEY`	MiniMax-M3	`api.minimax.io/v1`
Moonshot	Cloud	`MOONSHOT_API_KEY`	kimi-k2.6	`api.moonshot.ai/v1`
NVIDIA NIM	Cloud	`NVIDIA_API_KEY`	Nemotron-3-Super	`integrate.api.nvidia.com/v1`
HuggingFace	Cloud	`HUGGINGFACE_API_KEY`	(varies)	`api-inference.huggingface.co/v1`
Azure OpenAI	Cloud	`AZURE_OPENAI_API_KEY`	gpt-5.5	(user-configured)
OpenCodeZen	Cloud	`OPENCODE_API_KEY`	deepseek-v4-flash	`opencode.ai/zen/v1`
Ollama	Local	—	llama3.1	`localhost:11434/v1`
LM Studio	Local	—	(varies)	`localhost:1234/v1`
llama.cpp	Local	—	(varies)	`localhost:18080/v1`
vLLM	Local	—	(varies)	`localhost:8000/v1`
LocalAI	Local	—	(varies)	`localhost:8080/v1`
Registry	Heuristic	—	—	—

Tip

Local providers like Ollama, LM Studio, and others typically don't require an API key environment variable, which is why their Env Variable column is empty. Siyarix lets them proceed without a key.

🏗️ Architecture

Understanding how Siyarix routes a user request can help you debug and optimize your configuration. Here's a simplified flow:

User Input → _execute_instruction()
                │
                ▼
         ProviderManager.select_provider(preferred)
                │
                ▼
         ┌──────────────┐
         │  Provider A  │ ← preferred (user config or auto-detect)
         │  (primary)   │
         └──────┬───────┘
                │
        ┌─── Success ────→ Return result
        │
        └─── Failure ────→ ProviderManager.classify_error()
                           ├── AUTH → mark credential "dead"
                           ├── RATE_LIMIT → exponential backoff
                           ├── TIMEOUT → retry with backoff
                           ├── CONTEXT_OVERFLOW → compact and retry
                           ├── MODEL_NOT_FOUND → fallback model
                           └── SERVER_ERROR → record_failure with cooldown
                                    │
                                    ▼
                           ProviderStateManager.record_failure()
                           (persistent cooldown across restarts via JSON)

Note

The ProviderStateManager ensures that failure cooldowns persist even if you restart Siyarix, preventing endless retry loops on failing APIs.
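The flow above can be sketched as a plain failover loop. This is a minimal illustration, not the Siyarix implementation: `run_with_failover`, `flaky_send`, and the `on_failure` callback are hypothetical names, and a real router would classify each error before deciding whether to retry or move on.

```python
from typing import Callable


def run_with_failover(
    candidates: list[str],
    send: Callable[[str], str],
    on_failure: Callable[[str, Exception], None],
) -> tuple[str, str]:
    """Try each provider in order; return (provider, result) from the first success."""
    failed: list[str] = []
    for name in candidates:
        try:
            return name, send(name)
        except Exception as exc:  # a real router would classify the error here
            on_failure(name, exc)
            failed.append(name)
    raise RuntimeError(f"all providers failed: {failed}")


def flaky_send(name: str) -> str:
    # Hypothetical transport: provider "a" always times out.
    if name == "a":
        raise TimeoutError("upstream timed out")
    return f"reply from {name}"


failures: list[str] = []
print(run_with_failover(["a", "b"], flaky_send, lambda n, e: failures.append(n)))
print(failures)
```

Here the first candidate fails, the failure is reported through the callback, and the second candidate answers the request.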

🎛️ Provider Manager (Singleton)

The ProviderManager is a thread-safe singleton, meaning there's only ever one instance running, and it safely handles requests from multiple threads. It centralizes all provider logic.

from siyarix.providers import ProviderManager

pm = ProviderManager.get_instance()
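For readers curious how a thread-safe `get_instance()` is commonly built, here is a minimal sketch using double-checked locking. `ProviderManagerSketch` is a hypothetical stand-in; the real ProviderManager may be implemented differently.

```python
import threading


class ProviderManagerSketch:
    """Illustrative stand-in; not the real ProviderManager."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self) -> None:
        self.profiles: dict[str, object] = {}

    @classmethod
    def get_instance(cls) -> "ProviderManagerSketch":
        # Double-checked locking: only the first call pays for the lock.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance


a = ProviderManagerSketch.get_instance()
b = ProviderManagerSketch.get_instance()
print(a is b)  # True
```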

📝 Registration

All 25 providers are registered using individual profile files located in src/siyarix/providers/profiles/. This modular approach makes it super easy to add new providers in the future.

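# Import path assumed from src/siyarix/providers/types.py
from siyarix.providers.types import CostTier, ProviderProfile, ProviderType

pm.register(ProviderProfile(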
    name="openai",
    display_name="OpenAI",
    default_model="gpt-5.5",
    api_key_env="OPENAI_API_KEY",
    base_url="",
    supports_streaming=True,
    supports_tools=True,
    supports_vision=True,
    cost_tier=CostTier.HIGH,
    provider_type=ProviderType.CLOUD,
    priority=10,
    docs_url="https://platform.openai.com/docs/models",
))

Each profile defines its supported models using the ModelInfo dataclass, ensuring Siyarix knows exactly what each model is capable of:

ModelInfo(
    name="gpt-5.5",
    supports_vision=True,
    supports_structured_output=True,
    supports_function_calling=True,
    context_window=1050000,
    cost_tier=CostTier.HIGH,
)

🕵️ Auto-Detect

If you set model_provider = "auto", ProviderManager.auto_detect_provider() walks the profiles in priority order and returns the first one that has a resolvable API key or, for local providers, a configured base URL. Note that the excerpt below checks only that a local base URL is set, not that the endpoint is reachable.

def auto_detect_provider(self) -> str | None:
    for profile in self.list_profiles():
        if resolve_api_key(profile.name, profile.api_key_env):
            return profile.name
        if profile.provider_type == ProviderType.LOCAL and profile.base_url:
            return profile.name
    return None

⚖️ Preference Ordering

You can control the priority of providers via your settings.toml file. The list_profiles() method respects this configuration.

provider_priority = "openai, gemini, anthropic, groq"

Providers are sorted first by their index in your priority list, and then by their default priority score.
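The two-level sort described above can be sketched as follows. `Profile`, `parse_priority`, and `order_profiles` are hypothetical names, the priority numbers are made up for the demo, and the sketch assumes that a lower `priority` value means "try earlier" and that providers missing from the list sort after the listed ones.

```python
from dataclasses import dataclass


@dataclass
class Profile:  # hypothetical, trimmed-down stand-in for ProviderProfile
    name: str
    priority: int


def parse_priority(raw: str) -> list[str]:
    """Split the comma-separated provider_priority string into names."""
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


def order_profiles(profiles: list[Profile], raw_priority: str) -> list[Profile]:
    preferred = parse_priority(raw_priority)
    rank = {name: i for i, name in enumerate(preferred)}
    # Listed providers sort by list index; unlisted ones come after them.
    # Assumption: a lower `priority` number means "try earlier".
    return sorted(
        profiles,
        key=lambda p: (rank.get(p.name, len(preferred)), p.priority),
    )


profiles = [
    Profile("groq", 30),
    Profile("ollama", 50),
    Profile("openai", 10),
    Profile("gemini", 20),
    Profile("deepseek", 25),
]
ordered = order_profiles(profiles, "openai, gemini, anthropic, groq")
print([p.name for p in ordered])
# ['openai', 'gemini', 'groq', 'deepseek', 'ollama']
```

Names in the priority string that have no registered profile (here "anthropic") are simply ignored.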

📦 Provider Data Models

To keep things organized, all data structures representing providers and models are stored in src/siyarix/providers/types.py.

ProviderProfile

This defines everything Siyarix needs to know about a provider.

@dataclass
class ProviderProfile:
    name: str                          # Internal identifier (e.g. "openai")
    display_name: str                  # Human-readable name
    models: list[ModelInfo]            # Supported models with capability metadata
    default_model: str                 # Fallback model for this provider
    api_key_env: str                   # Environment variable for API key
    base_url: str                      # API base URL
    supports_streaming: bool           # Streaming support
    supports_tools: bool               # Function/tool calling
    supports_vision: bool              # Image input support
    supports_structured_output: bool   # JSON structured output mode
    sdk_dependency: str                # Optional SDK package requirement
    max_tokens: int                    # Max output tokens
    max_context_tokens: int            # Max context window size
    priority: int                      # Preference ordering
    cost_tier: CostTier                # FREE / LOW / MEDIUM / HIGH
    provider_type: ProviderType        # CLOUD or LOCAL
    fallback_models: list[str]         # Alternative models to try on failure
    docs_url: str                      # Link to provider documentation

ProviderCredential

This keeps track of API keys, URLs, and the current health status of the credential.

@dataclass
class ProviderCredential:
    provider: str
    api_key: str = ""
    base_url: str = ""
    status: str = "active"             # "active", "dead", or "cooldown"
    cooldown_until: float = 0.0
    failure_count: int = 0
    last_used: float = 0.0

    @property
    def is_available(self) -> bool:
        # True unless dead, in cooldown, or missing both key and URL
        ...

ModelInfo

Details the specific capabilities of an individual model.

@dataclass
class ModelInfo:
    name: str
    supports_vision: bool = False
    supports_tools: bool = True
    supports_structured_output: bool = False
    supports_function_calling: bool = True
    context_window: int = 8192
    cost_tier: CostTier = CostTier.MEDIUM

🏷️ Enums

We use standardized enums to keep categories consistent:

FailoverReason: AUTH, RATE_LIMIT, BILLING, TIMEOUT, SERVER_ERROR, CONTEXT_OVERFLOW, MODEL_NOT_FOUND, UNKNOWN
CostTier: FREE, LOW, MEDIUM, HIGH
ProviderType: CLOUD, LOCAL

ClassifiedError

When something goes wrong, it's classified into an actionable format.

@dataclass
class ClassifiedError:
    reason: FailoverReason
    retryable: bool = True
    should_rotate_credential: bool = False
    should_fallback: bool = False
    should_compress: bool = False
    message: str = ""

🛡️ Error Classification & Failover

Robust error handling is critical when working with external APIs. Siyarix is designed to handle hiccups gracefully.

Classification Strategy

When an API call fails, ProviderManager.classify_error() kicks into action using a multi-pass strategy:

HTTP status code: Quickly maps standard errors (like 429 for rate limits) to a FailoverReason.
Error message text: Scans the error response for keywords (e.g., "rate limit", "timeout", "401").
Credential rotation hints: Detects auth or billing failures to trigger credential rotation.

Failover Reasons & Actions

Reason	HTTP Status	Retryable	Action
`AUTH`	401, 403	No	Mark credential dead, rotate
`RATE_LIMIT`	429	Yes	Exponential backoff (10s→20s→40s→...→3600s)
`BILLING`	402	No	Mark credential dead
`TIMEOUT`	408	Yes	Backoff (5s→10s→...→300s)
`SERVER_ERROR`	500, 502, 503, 504, 529	Yes	Backoff (5s→10s→...→300s)
`CONTEXT_OVERFLOW`	—	Yes	Compact history, retry
`MODEL_NOT_FOUND`	404	No	Fall back to alternative model
`UNKNOWN`	—	No	Propagate error
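A minimal sketch of the first two classification passes, using the status codes from the table above. The function name `classify` and the "context length" keyword are assumptions for illustration; the real classify_error() returns a ClassifiedError and may use different keywords.

```python
from __future__ import annotations

from enum import Enum


class FailoverReason(Enum):
    AUTH = "auth"
    RATE_LIMIT = "rate_limit"
    BILLING = "billing"
    TIMEOUT = "timeout"
    SERVER_ERROR = "server_error"
    CONTEXT_OVERFLOW = "context_overflow"
    MODEL_NOT_FOUND = "model_not_found"
    UNKNOWN = "unknown"


R = FailoverReason
STATUS_TO_REASON = {
    401: R.AUTH, 403: R.AUTH, 402: R.BILLING, 404: R.MODEL_NOT_FOUND,
    408: R.TIMEOUT, 429: R.RATE_LIMIT,
    500: R.SERVER_ERROR, 502: R.SERVER_ERROR, 503: R.SERVER_ERROR,
    504: R.SERVER_ERROR, 529: R.SERVER_ERROR,
}
KEYWORD_TO_REASON = [
    ("rate limit", R.RATE_LIMIT),
    ("timeout", R.TIMEOUT),
    ("401", R.AUTH),
    ("context length", R.CONTEXT_OVERFLOW),  # assumed keyword
]
NON_RETRYABLE = {R.AUTH, R.BILLING, R.MODEL_NOT_FOUND, R.UNKNOWN}


def classify(status: int | None, message: str = "") -> tuple[FailoverReason, bool]:
    """Return (reason, retryable): status code first, then message keywords."""
    reason = STATUS_TO_REASON.get(status)
    if reason is None:
        lowered = message.lower()
        reason = next((r for kw, r in KEYWORD_TO_REASON if kw in lowered), R.UNKNOWN)
    return reason, reason not in NON_RETRYABLE


print(classify(429))
print(classify(None, "Request timeout after 30s"))
```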

Warning

If a credential fails due to AUTH or BILLING issues, Siyarix marks it as "dead" to prevent burning through retries and instantly pivots to a fallback provider.

Failure Recording (Circuit Breaking)

ProviderManager.record_failure() is Siyarix's built-in circuit breaker:

AUTH/BILLING: Immediate halt. No further attempts with this credential.
RATE_LIMIT: Calculates an exponential backoff time (up to an hour) to let the API recover.
TIMEOUT/SERVER_ERROR: Uses a shorter backoff curve (up to 5 minutes) since these are often temporary glitches.

pm.record_failure(provider, classified.reason)
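The two backoff curves can be reproduced with a doubling formula capped at a maximum. This is a sketch under the assumption that the delay doubles on each consecutive failure, which matches the sequences listed in the table (10s→20s→40s and 5s→10s); `backoff_seconds` and `CURVES` are hypothetical names.

```python
from __future__ import annotations

# (first delay, cap) in seconds, taken from the failover table.
CURVES = {
    "RATE_LIMIT": (10.0, 3600.0),
    "TIMEOUT": (5.0, 300.0),
    "SERVER_ERROR": (5.0, 300.0),
}


def backoff_seconds(reason: str, failure_count: int) -> float | None:
    """Cooldown for the Nth consecutive failure (1-based); None means 'dead'."""
    if reason in ("AUTH", "BILLING"):
        return None  # no cooldown: the credential is not retried at all
    base, cap = CURVES[reason]
    # Assumption: the delay doubles with each consecutive failure.
    return min(base * 2 ** (max(failure_count, 1) - 1), cap)


print([backoff_seconds("RATE_LIMIT", n) for n in range(1, 5)])
# [10.0, 20.0, 40.0, 80.0]
print(backoff_seconds("RATE_LIMIT", 10))  # 3600.0
print(backoff_seconds("TIMEOUT", 7))      # 300.0
```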

Per-Session "Skip-Known-Bad" Cache

Nobody likes waiting for the same failing model over and over. ProviderStateManager keeps a short-term memory (5 minutes) of failing (provider, model) combos to skip them entirely.

state_manager.mark_skip_candidate(session_id, "openai", "gpt-5.5")
state_manager.is_candidate_skipped(session_id, "openai", "gpt-5.5")  # True for 5 min
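One way such a short-lived cache could work is a dictionary of expiry times keyed by (session, provider, model). The sketch below borrows the two method names from the calls above, but `SkipCache` itself and its injectable `clock` parameter are hypothetical, added so the expiry can be demonstrated without waiting five minutes.

```python
import time


class SkipCache:
    """Illustrative TTL cache; not the real ProviderStateManager."""

    def __init__(self, ttl: float = 300.0, clock=time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._expires: dict[tuple[str, str, str], float] = {}

    def mark_skip_candidate(self, session_id: str, provider: str, model: str) -> None:
        self._expires[(session_id, provider, model)] = self._clock() + self._ttl

    def is_candidate_skipped(self, session_id: str, provider: str, model: str) -> bool:
        key = (session_id, provider, model)
        expires = self._expires.get(key)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._expires[key]  # entry has aged out
            return False
        return True


now = [0.0]  # fake clock so the demo runs instantly
cache = SkipCache(clock=lambda: now[0])
cache.mark_skip_candidate("s1", "openai", "gpt-5.5")
print(cache.is_candidate_skipped("s1", "openai", "gpt-5.5"))  # True
now[0] = 301.0
print(cache.is_candidate_skipped("s1", "openai", "gpt-5.5"))  # False
```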

Availability Checks

Need to know who's ready to work?

pm.get_available_providers(preferred=["openai", "gemini"])
# Returns only non-cooldown providers, with preferred ones at the top of the list

💾 Provider State Manager

API state shouldn't be lost when you restart the app. The ProviderStateManager persists cooldown and failure states to a lightweight JSON file (provider_state.json).

COOLDOWN_STEPS = [30.0, 60.0, 300.0]
MAX_COOLDOWN = 300.0

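This ensures that a cooldown still in effect when you quit is honored after a restart, so restarting Siyarix won't accidentally hammer the API again. With the values above, a persisted cooldown is capped at MAX_COOLDOWN (300 seconds). It tracks: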

disabled: Timestamps for when cooldowns expire.
failure_counts: How many times a provider has failed consecutively.
last_fail_time: When the most recent failure happened.

state_manager.record_failure(provider, reason)  # Saves to JSON automatically
state_manager.record_success(provider)           # Clears cooldown status
state_manager.is_disabled(provider)              # Checks if still in cooldown
state_manager.cooldown_remaining(provider)       # Time left until ready
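To show how the persistence might fit together, here is a minimal sketch that stores the three tracked tables in a JSON file and reloads them on startup. `StateSketch` is hypothetical, it omits the `reason` argument for brevity, and it assumes the Nth consecutive failure uses the Nth entry of COOLDOWN_STEPS, clamped to the last one.

```python
from __future__ import annotations

import json
import tempfile
import time
from pathlib import Path

COOLDOWN_STEPS = [30.0, 60.0, 300.0]


class StateSketch:
    """Illustrative JSON-backed cooldown store; not the real ProviderStateManager."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.state = {"disabled": {}, "failure_counts": {}, "last_fail_time": {}}
        if path.exists():
            self.state.update(json.loads(path.read_text()))

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.state))

    def record_failure(self, provider: str, now: float | None = None) -> float:
        # Wall-clock time is used because it stays meaningful across restarts.
        now = time.time() if now is None else now
        count = self.state["failure_counts"].get(provider, 0) + 1
        # Assumption: Nth failure uses the Nth step, clamped to the last step.
        cooldown = COOLDOWN_STEPS[min(count, len(COOLDOWN_STEPS)) - 1]
        self.state["failure_counts"][provider] = count
        self.state["last_fail_time"][provider] = now
        self.state["disabled"][provider] = now + cooldown
        self._save()
        return cooldown

    def record_success(self, provider: str) -> None:
        for table in self.state.values():
            table.pop(provider, None)
        self._save()

    def cooldown_remaining(self, provider: str, now: float | None = None) -> float:
        now = time.time() if now is None else now
        return max(0.0, self.state["disabled"].get(provider, 0.0) - now)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "provider_state.json"
    StateSketch(path).record_failure("openai", now=1000.0)
    reloaded = StateSketch(path)  # simulates a restart
    print(reloaded.cooldown_remaining("openai", now=1010.0))  # 20.0
```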

🔑 Credential Resolution

Finding the right API key is handled by resolve_api_key(). It uses a smart, three-tier fallback approach:

Credential Store: Checks the secure CredentialStore (CredentialStore.retrieve(provider, "api_key")).
Environment Variable: Looks for standard env vars like OPENAI_API_KEY.
No Key: Returns None (matching the signature below), which allows local providers (like Ollama) to proceed without a key.

def resolve_api_key(provider: str, env_var: str | None = None) -> str | None:
    # 1. Try credential store
    # 2. Try environment variable
    # 3. Return None
    ...
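The three tiers can be sketched with a plain dictionary standing in for the credential store. `resolve_api_key_sketch` and its `store` and `environ` parameters are hypothetical; the real function uses CredentialStore.retrieve(provider, "api_key") for the first tier.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_api_key_sketch(
    provider: str,
    env_var: str | None = None,
    store: Mapping[tuple[str, str], str] | None = None,
    environ: Mapping[str, str] = os.environ,
) -> str | None:
    # 1. Credential store (a dict stands in for the secure CredentialStore)
    key = (store or {}).get((provider, "api_key"))
    if key:
        return key
    # 2. Environment variable, if the profile names one
    if env_var:
        key = environ.get(env_var)
        if key:
            return key
    # 3. Nothing found: local providers can still proceed without a key
    return None


fake_env = {"OPENAI_API_KEY": "sk-from-env"}
fake_store = {("openai", "api_key"): "sk-from-store"}
print(resolve_api_key_sketch("openai", "OPENAI_API_KEY", fake_store, fake_env))
print(resolve_api_key_sketch("openai", "OPENAI_API_KEY", None, fake_env))
print(resolve_api_key_sketch("ollama", None, None, fake_env))
```

The store wins over the environment, and a provider with neither resolves to None.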

🪪 Model ID Normalization

Model names change, and standardizing them is crucial. model_aliases.py ensures that no matter what the user types, Siyarix knows the correct internal name.

from siyarix.model_aliases import normalize_model_id, resolve_alias, list_aliases, register_alias

model = normalize_model_id("anthropic", "claude-opus-4.8")  # → "claude-opus-4-8"
model = normalize_model_id("gemini", "gemini-3-pro")        # → "gemini-3.1-pro-preview"
model = normalize_model_id("deepseek", "deepseek-v4")       # → "deepseek-v4-flash"

🦙 Ollama Utilities

Working with local models should be frictionless. ollama_utils.py provides helpers to ensure Ollama is running when you need it.

from siyarix.providers.ollama_utils import ensure_ollama_running

# Launches Ollama in background if configured and not already running
ensure_ollama_running()

Tip

Siyarix can automatically launch Ollama if model_provider is set to "ollama" or if _start_ollama_on_launch is enabled in your settings!

🎯 Provider Selection

Need to ask an AI a question? Here's how Siyarix decides who gets the job.

# Auto-detect the first available provider
provider, model = pm.select_provider(preferred=None)

# Explicitly request a specific provider
provider, model = pm.select_provider(preferred="openai")

Capability-Based Filtering

You can also ask Siyarix for providers that meet specific criteria:

# Get all cloud providers supporting function calling
providers = pm.get_providers_by_capability(function_calling=True, local=False, free=False)

# Get only free-tier local providers
free_local = pm.get_providers_by_capability(free=True, local=True)

# Get vision-capable providers
vision_providers = pm.get_providers_by_capability(vision=True)
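Capability filtering of this kind usually treats an omitted argument as "don't care". The sketch below shows that pattern with hypothetical names (`Profile`, `by_capability`) and made-up providers; the real method works on ProviderProfile objects and may derive `local` and `free` from provider_type and cost_tier.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Profile:  # hypothetical, flattened stand-in for ProviderProfile
    name: str
    supports_tools: bool
    supports_vision: bool
    is_local: bool
    is_free: bool


def by_capability(
    profiles: list[Profile],
    *,
    function_calling: bool | None = None,
    vision: bool | None = None,
    local: bool | None = None,
    free: bool | None = None,
) -> list[Profile]:
    # None means "don't care"; True/False must match the profile exactly.
    wanted = {
        "supports_tools": function_calling,
        "supports_vision": vision,
        "is_local": local,
        "is_free": free,
    }
    return [
        p for p in profiles
        if all(v is None or getattr(p, attr) == v for attr, v in wanted.items())
    ]


catalog = [
    Profile("cloud-a", True, True, False, False),
    Profile("cloud-b", True, False, False, True),
    Profile("local-c", False, False, True, True),
]
print([p.name for p in by_capability(catalog, function_calling=True, local=False, free=False)])
print([p.name for p in by_capability(catalog, free=True, local=True)])
print([p.name for p in by_capability(catalog, vision=True)])
```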

📊 Usage Tracking

Keep an eye on your API costs! The UsageTracker (found in usage.py) monitors token consumption and estimates costs per provider.

from siyarix.providers import UsageTracker
from siyarix.providers.types import CostTier  # path assumed from src/siyarix/providers/types.py

tracker = UsageTracker()
tracker.record_call("openai", "gpt-5.5", input_tokens=500, output_tokens=150, cost_tier=CostTier.HIGH)
print(tracker.summary())
# LLM calls: 1 | Tokens: 500↑ 150↓ | Est. cost: $0.0086

Important

Usage statistics are persisted to JSON, allowing you to track costs and token limits across multiple sessions.

🩺 Health Check

Wondering if your providers are online? Run the health check command:

siyarix health

This command pings all configured providers and reports back on their availability, latency, and any recent errors.

📈 Provider Statistics

For programmatic access to provider health:

stats = pm.stats()
# {
#     "total_providers": 25,
#     "credentials": {"openai": 1, "anthropic": 0, ...},
#     "error_counts": {"openai": 3},
# }

📁 Related Modules

Want to dive deeper into the code? Here is where everything lives:

Module	Path	Purpose
`ProviderManager`	`src/siyarix/providers/manager.py`	Singleton provider registry, failover, ensemble, stats
`ProviderStateManager`	`src/siyarix/providers/state.py`	Persistent cooldown state (JSON-based), skip-known-bad cache
`UsageTracker`	`src/siyarix/providers/usage.py`	Token usage and cost estimation
`ProviderProfile` / `ModelInfo`	`src/siyarix/providers/types.py`	Data models for provider metadata
`openai_compat.py`	`src/siyarix/chat/openai_compat.py`	Universal OpenAI-compatible adapter
`normalize_model_id`	`src/siyarix/model_aliases.py`	Model ID normalization and alias resolution
`ensure_ollama_running`	`src/siyarix/providers/ollama_utils.py`	Ollama background launcher
`profiles/`	`src/siyarix/providers/profiles/`	25 individual provider profiles

Note

👋 Welcome to Siyarix! This is a personal passion project built by a single developer. It's currently under active development and growing fast. Expect rough edges, but lots of love! ❤️

🗺️ Siyarix Documentation Map

Welcome to the Siyarix Documentation Map! This page serves as your master compass for navigating the extensive documentation we have built for the platform.

Whether you are a brand new user, a seasoned security operator, or a developer looking to contribute to the core engine, you can find exactly what you need here.

🧭 Quick Navigation

Not sure where to start? Pick the path that best describes you:

🌱 For New Users

Just getting started? We highly recommend following these guides in order:

Installation Guide — Get Siyarix running on your machine.
Onboarding Wizard — Let our interactive wizard help you set up your API keys and environment.
Setup & Configuration — A deeper dive into customizing your setup.
Your First Run — A gentle walkthrough of your very first Siyarix command.

🛡️ For Security Operators

Ready to put Siyarix to work? Dive into our operational guides:

Interactive Chat (REPL) — Learn how to use the powerful interactive terminal.
Security Workflows — Best practices for recon, vulnerability assessment, and incident response.
Cloud & IaC Scanning — How to secure your cloud environments and infrastructure code.
Compliance Frameworks — Map your scans to SOC 2, HIPAA, ISO 27001, and more.

💻 For Developers & Contributors

Looking under the hood or wanting to write some code? Start here:

Contribution Guide — Our workflow, standards, and how you can help!
Codebase Overview — A comprehensive map of our 82+ source modules.
Testing Standards — How we ensure reliability with pytest and CI/CD.
Module Architecture — Component design and responsibilities.

📂 The Complete Documentation Tree

If you prefer to browse the raw structure, here is a complete layout of the docs/ folder:

docs/
├── 🚀 getting-started/       # Installation, onboarding, and configuration
│   ├── installation.md       # Multi-platform install (pip, brew, winget, docker)
│   ├── onboarding.md         # The interactive 11-step setup wizard
│   ├── setup.md              # Managing API keys, credentials, and settings
│   ├── first-run.md          # A walkthrough of your first session
│   ├── configuration.md      # A deep-dive into advanced settings
│   └── troubleshooting.md    # Common issues and how to fix them instantly
│
├── 📖 user/                  # Daily operations and workflows
│   ├── cli-commands.md       # Reference for 50+ CLI commands across 12 groups
│   ├── interactive-chat.md   # Mastering the AI REPL and 54+ slash commands
│   ├── security-workflows.md # Recon, vulnerability assessment, incident response
│   ├── cloud-scanning.md     # Multi-cloud security scanning (under development)
│   ├── compliance.md         # Framework mapping (SOC 2, NIST, GDPR, PCI-DSS)
│   ├── threat-intelligence.md# Integrations with OTX, NVD, and MITRE ATT&CK
│   ├── playbooks.md          # Building automated YAML-based IR playbooks
│   ├── workflow-files.md     # DAG workflow reference (programmatic API)
│   ├── reporting.md          # Multi-format report generation
│   ├── offline-registry.md   # Running without AI (Offline/Registry execution mode)
│   └── ai-workflows.md       # Advanced AI-driven autonomous operations
│
├── 💻 developer/             # Building, testing, and extending Siyarix
│   ├── codebase-overview.md  # Full module structure mapping
│   ├── contribution-guide.md # How to submit PRs and our coding standards
│   ├── module-architecture.md# Component design and responsibilities
│   ├── testing.md            # Writing tests (pytest), coverage, and CI/CD
│   └── building.md           # Packaging, distribution, and Docker builds
│
├── 🏗️ architecture/          # System design and core internals
│   ├── overview.md           # High-level data flow and layered orchestration
│   ├── ai-agent-pipeline.md  # The AgentCore reasoning and execution pipeline
│   ├── provider-abstraction.md# How we unify 26 different AI providers
│   ├── execution-engine.md   # Plan-based step orchestration
│   ├── memory-and-state.md   # Knowledge graph, session persistence, and learning
│   ├── security-model.md     # The Permission Gate, DLP, audit logging, and OPSEC
│   └── intent-routing.md     # Semantic intent classification and routing
│
├── 🧠 ai/                    # Deep dive into the AI provider & agent systems
│   ├── routing.md            # Managing 26 providers, failovers, and circuit breakers
│   ├── persona-system.md     # Overview of our 10 security personas
│   ├── agent-reasoning.md    # The Observe-Reason-Act loop and tool call repair
│   ├── tool-execution.md     # The tool registry, capability graph, and parsers
│   ├── ensemble.md           # Parallel LLM voting strategies
│   ├── multi-wave.md         # Iterative goal execution with context carry-over
│   ├── prompt-architecture.md# System prompt design and management
│   └── safety.md             # Our rigorous 8-layer hallucination mitigation system
│
├── 🛡️ security/              # Safety, ethics, and threat models
│   ├── reporting.md          # How to safely report vulnerabilities to us
│   ├── threat-model.md       # System threat model and our mitigations
│   ├── operational-security.md# TOR routing, stealth modes, and OPSEC controls
│   ├── ethical-policy.md     # Mandatory rules of engagement for all users
│   └── abuse-prevention.md   # How we prevent misuse of the AI engine
│
└── ⚖️ legal/                 # Licensing and governance
    ├── agpl-guide.md         # A plain-English overview of the AGPL-3.0-or-later license
    ├── why-agpl.md           # The philosophy behind our license choice
    ├── trademark-policy.md   # Branding and trademark guidelines
    ├── responsible-ai.md     # Our framework for ethical AI usage
    ├── disclaimer.md         # Important legal disclaimers
    └── plugin-exception.md   # The license exception for building custom plugins

📖 Key Terminology

As you read through the documentation, you might encounter some specific terms. Here is a quick cheat sheet:

Term	What It Means
Provider	The backend AI engine powering Siyarix (e.g., OpenAI, Anthropic, Ollama).
Tool	A traditional security executable installed on your system (e.g., `nmap`, `nuclei`).
Plan	A step-by-step sequence of tool commands intelligently generated by the AI.
Workflow	A hardcoded, predefined execution path (usually defined in YAML/JSON) that doesn't require AI generation.
Persona	A specialized behavioral profile given to the AI (e.g., instructing it to act specifically as a "Network Recon Specialist").
Knowledge Graph	Siyarix's internal memory where it stores findings (like IP addresses, open ports) to contextually inform future steps.

Need help finding something specific? Feel free to use the search bar at the top of the documentation site, or open a discussion on our GitHub!
