
ai multi model ensemble

MD MUFTHAKHERUL ISLAM MIRAZ edited this page Jun 24, 2026 · 2 revisions

🧠 Multi-Model Ensemble

Note

👋 Hey there! Siyarix is a personal passion project built by a single developer that is growing and under active development. Some of the architectural components and features described on this page might currently be Planned, Work in Progress, or basic implementations. Stay tuned as it evolves! 🚀

Ever wish you could ask a panel of experts a question and go with the majority opinion? That's exactly what the ProviderManager.ensemble_decide() method does!

The method sends a single query to several AI providers concurrently and returns the most common response. This approach gives your application three main benefits:

  • Hallucination Resistance: An outlier answer from one model is outvoted when the other models agree.
  • Consensus Validation: Builds confidence when multiple top-tier models agree.
  • Graceful Degradation: Keeps your app running smoothly even if an individual provider fails or times out.

Note

Currently, this is a lightweight, functional implementation embedded directly in ProviderManager rather than a standalone class. We have an exciting roadmap for a more feature-rich ensemble, including weighted voting strategies and advanced hallucination scoring!


🏗️ Architecture

Here is a high-level look at how a user task flows through the ensemble:

User Task
    │
    ▼
┌──────────────────────────────────────────────┐
│       ProviderManager.ensemble_decide()      │
│                                              │
│  ┌──────────┐  ┌──────────┐  ┌────────────┐  │
│  │  OpenAI  │  │  Gemini  │  │ Anthropic  │  │
│  │(gpt-4o)  │  │(gemini)  │  │ (claude)   │  │
│  └────┬─────┘  └────┬─────┘  └──────┬─────┘  │
│       │             │               │        │
│       ▼             ▼               ▼        │
│  ┌────────────────────────────────────────┐  │
│  │        Majority Vote (Counter)         │  │
│  └────────────────────────────────────────┘  │
│                      │                       │
│                      ▼                       │
│              Selected Response               │
└──────────────────────────────────────────────┘

⚙️ How It Works

Behind the scenes, we use asynchronous Python so that the total wait is roughly that of the slowest provider rather than the sum of all the calls. Here is the signature:

async def ensemble_decide(
    self, system_prompt: str, user_prompt: str, providers: list[str]
) -> str:

The 5-Step Process

  1. Concurrent Execution: Every provider in your list is called concurrently using asyncio.gather.
  2. Fault Tolerance: If one provider crashes, it doesn't bring down the ship. Because gather runs with return_exceptions=True, errors come back as values and are skipped.
  3. Data Extraction: The system normalizes responses, pulling out the core text whether the API returns a dictionary, an object, or a raw string.
  4. Tallying the Votes: A classic Python collections.Counter finds the most common response.
  5. Declaring a Winner: The majority response is returned.

Warning

If all providers fail to return a valid response, the method will raise a RuntimeError. Always ensure you have reliable fallback providers in your list!

Here is the core logic in action:

# 1 & 2: Call all providers concurrently, ignoring individual failures
responses = await asyncio.gather(
    *[self.complete(p, self.select_provider(p)[1], system_prompt, user_prompt) for p in providers],
    return_exceptions=True,
)

# 3: Extract valid text content
valid = []
for r in responses:
    if isinstance(r, Exception):
        continue
    if isinstance(r, dict) and "content" in r:
        valid.append(r["content"])
    elif hasattr(r, "content"):
        valid.append(r.content)
    elif isinstance(r, str):
        valid.append(r)

# Guard against total failure
if not valid:
    raise RuntimeError("All ensemble providers failed")

# 4 & 5: Find and return the most common answer
most_common = Counter(valid).most_common(1)[0][0]
return most_common
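To try the flow end to end without API keys, here is a minimal, self-contained sketch of the same five steps. The fake_complete function and the provider names "a", "b", "c", and "flaky" are made up for illustration; they stand in for real provider calls and are not part of Siyarix.

```python
import asyncio
from collections import Counter


async def fake_complete(provider: str) -> object:
    """Hypothetical stand-in for a provider call (no network involved)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if provider == "flaky":
        raise TimeoutError("provider timed out")
    # Two providers agree (in different response shapes); one disagrees.
    answers = {"a": {"content": "80, 443"}, "b": "80, 443", "c": "22"}
    return answers[provider]


async def ensemble_decide(providers: list[str]) -> str:
    # return_exceptions=True turns failures into values instead of raising
    responses = await asyncio.gather(
        *(fake_complete(p) for p in providers), return_exceptions=True
    )
    valid = []
    for r in responses:
        if isinstance(r, BaseException):
            continue  # skip failed providers
        if isinstance(r, dict) and "content" in r:
            valid.append(r["content"])
        elif isinstance(r, str):
            valid.append(r)
    if not valid:
        raise RuntimeError("All ensemble providers failed")
    return Counter(valid).most_common(1)[0][0]


winner = asyncio.run(ensemble_decide(["a", "b", "c", "flaky"]))
print(winner)  # 80, 443
```

The flaky provider is dropped, and the two matching answers outvote the third. Passing only ["flaky"] raises the RuntimeError described in the warning.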

🗳️ Voting Strategy

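Right now, we use a straightforward plurality vote: whichever response text occurs most often across your selected providers wins, even without an outright majority. Votes are counted on exact string equality, so two answers that differ by a single character count as different votes. When no two responses match, Counter.most_common(1) returns the first valid response in provider order.

Because AI ensemble decision-making is an emerging field, we've focused heavily on creating a simple, reliable foundation: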

| Aspect | Behavior |
| --- | --- |
| Strategy | Majority (plurality): the most common identical response wins. |
| Speed | Maximum efficiency! All providers are queried concurrently. |
| Resilience | Individual API timeouts or errors are completely absorbed. |
| Flexibility | Automatically parses dict, object, and plain string response formats. |
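Free-form model output rarely matches character for character, so one option is to normalize the text before counting. The sketch below is not part of Siyarix; normalize and plurality are hypothetical helpers that group answers differing only in case or whitespace.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Collapse whitespace runs, trim the ends, and ignore case."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def plurality(responses: list[str]) -> str:
    """Return the original text of the first response in the winning group."""
    if not responses:
        raise ValueError("no responses to vote on")
    keys = [normalize(r) for r in responses]
    winning_key = Counter(keys).most_common(1)[0][0]
    return responses[keys.index(winning_key)]


answers = ["Port 22", "Ports 80 and 443", "ports  80 and 443\n"]

# Exact matching sees three different strings and falls back to the first.
print(Counter(answers).most_common(1)[0][0])  # Port 22

# Normalized matching groups the last two answers together.
print(plurality(answers))  # Ports 80 and 443
```

This only helps with cosmetic differences. Answers that mean the same thing in different words would still split the vote.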

🎯 Selecting Providers

You rarely want to hardcode your providers. Instead, use ProviderManager.get_providers_by_capability() to dynamically select the best models for the job based on what they can do:

# Get all cloud providers that support function calling
providers = pm.get_providers_by_capability(
    function_calling=True,
    local=False,
    free=False,
)

# On a budget? Get only free-tier providers!
free_providers = pm.get_providers_by_capability(free=True)

Capability Filters

| Parameter | What it filters for |
| --- | --- |
| vision | Providers that can "see" and process image inputs. |
| free | Models where the cost tier is explicitly set to FREE. |
| local | Privacy-first models running locally on your machine. |
| function_calling | Providers capable of executing tools and structured functions. |

Tip

Mixing local and cloud providers is a great way to maintain high availability while managing costs!
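To make the filter semantics concrete, here is a rough sketch of how such a capability filter could work. Everything in it is an assumption: Profile is a hypothetical stand-in for ProviderProfile, the provider names and flags are invented, and treating None as "don't care" is a guess at the behavior rather than the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:  # hypothetical stand-in for ProviderProfile
    name: str
    vision: bool = False
    free: bool = False
    local: bool = False
    function_calling: bool = False


def filter_by_capability(
    profiles: list[Profile],
    vision: Optional[bool] = None,
    free: Optional[bool] = None,
    local: Optional[bool] = None,
    function_calling: Optional[bool] = None,
) -> list[str]:
    """Return names of profiles matching every filter that is not None."""
    wanted = {
        "vision": vision,
        "free": free,
        "local": local,
        "function_calling": function_calling,
    }
    return [
        p.name
        for p in profiles
        if all(getattr(p, k) == v for k, v in wanted.items() if v is not None)
    ]


# Invented catalog, for illustration only
catalog = [
    Profile("cloud_a", vision=True, function_calling=True),
    Profile("local_b", free=True, local=True),
    Profile("cloud_c", function_calling=True),
]
print(filter_by_capability(catalog, function_calling=True, local=False, free=False))
# ['cloud_a', 'cloud_c']
print(filter_by_capability(catalog, free=True))
# ['local_b']
```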


🚀 Usage Example

Ready to put it to the test? Here is a complete example of how to use the ensemble in your code:

import asyncio

from siyarix.providers import ProviderManager


async def main() -> None:
    pm = ProviderManager.get_instance()

    # Hand-pick your dream team
    providers = ["openai", "gemini", "anthropic"]

    # Raises RuntimeError if every provider fails
    result = await pm.ensemble_decide(
        system_prompt="You are a senior security analyst.",
        user_prompt="What ports are typically open on a standard web server?",
        providers=providers,
    )

    print(f"Ensemble consensus: {result}")


# ensemble_decide is a coroutine, so it needs a running event loop
asyncio.run(main())
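Since ensemble_decide raises a RuntimeError when every provider fails, callers may want a small wrapper that also caps the total wait. The sketch below is one possible pattern, not a Siyarix API: decide_with_fallback, its timeout_s and default parameters, and the FailingManager stub are all hypothetical.

```python
import asyncio


async def decide_with_fallback(
    pm,
    system_prompt: str,
    user_prompt: str,
    providers: list[str],
    timeout_s: float = 30.0,
    default: str = "(no consensus available)",
) -> str:
    """Run the ensemble, returning `default` on total failure or timeout."""
    try:
        return await asyncio.wait_for(
            pm.ensemble_decide(system_prompt, user_prompt, providers),
            timeout=timeout_s,
        )
    except (RuntimeError, asyncio.TimeoutError):
        return default


class FailingManager:  # hypothetical stub standing in for ProviderManager
    async def ensemble_decide(self, system_prompt, user_prompt, providers):
        raise RuntimeError("All ensemble providers failed")


result = asyncio.run(
    decide_with_fallback(FailingManager(), "sys", "user", ["openai"])
)
print(result)  # (no consensus available)
```

Whether a default string, a retry with different providers, or a re-raised error is the right reaction depends on how the result is used downstream.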

💬 Chat Engine Integration

The ensemble concept isn't just for raw API calls. The chat engine (chat/engine.py) uses a lightweight MultiModelEnsemble stub to bring this power directly to user conversations. It applies a weighted voting strategy and gives you a neat little consensus dashboard:

┌──────────────────────────────────────────────┐
│ 🔮 Multi-Model Ensemble                      │
│                                              │
│ Ensemble: Weighted consensus across 3 models │
│ Providers: openai, gemini, anthropic         │
│ Consensus: 67%  Hallucination risk: 33%      │
└──────────────────────────────────────────────┘
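This page does not spell out how the stub weighs votes, so here is a hedged sketch of one common approach: each provider's vote counts with a weight, and the answer with the largest total wins. The weighted_vote function and the weights are hypothetical. With equal weights and two of three models agreeing, it reproduces the 67% shown in the dashboard.

```python
from collections import defaultdict


def weighted_vote(
    answers: dict[str, str], weights: dict[str, float]
) -> tuple[str, float]:
    """Return (winning answer, its share of the total weight)."""
    if not answers:
        raise ValueError("no answers to vote on")
    totals: dict[str, float] = defaultdict(float)
    for provider, answer in answers.items():
        totals[answer] += weights.get(provider, 1.0)  # unlisted providers weigh 1.0
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


answers = {"openai": "A", "gemini": "A", "anthropic": "B"}

# Equal weights: plain plurality
winner, share = weighted_vote(answers, weights={})
print(winner, f"{share:.0%}")  # A 67%

# A heavier (made-up) weight lets a single provider outvote the other two
winner_w, share_w = weighted_vote(answers, weights={"anthropic": 3.0})
print(winner_w, f"{share_w:.0%}")  # B 60%
```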

🕵️ Hallucination Detection (Emerging Feature)

One of the coolest things about querying multiple models is that disagreement between them gives you a measurable signal that an answer may be "hallucinated" (made up). We estimate this by measuring how much their answers differ:

  • Low Variance: Everyone agrees, so the answer is more likely to be reliable. (Low Hallucination Risk)
  • High Variance: The models are giving wildly different answers. Flag this for human review! (High Hallucination Risk)

Our EnsembleResult dataclass tracks all of this metadata for you:

from dataclasses import dataclass

@dataclass
class EnsembleResult:
    task: str
    responses: list[dict]         # Every provider's raw answer
    selected_plan: str            # The winning response
    voting_strategy: str          # e.g., 'majority', 'weighted'
    consensus_level: float        # Score from 0.0 to 1.0
    hallucination_risk: float     # Score from 0.0 to 1.0 (Higher = bad)
    total_cost: float             # Cumulative cost of all API calls
    total_latency_ms: float       # Total wall-clock time
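Here is a minimal sketch of how these fields could be filled in from a set of responses. It rests on several assumptions: the summarize helper and the per-response keys ("content", "cost", "latency_ms") are hypothetical, hallucination_risk is taken as 1 minus consensus_level (which matches the 67% / 33% dashboard above), and wall-clock latency is taken as the slowest call because the providers run concurrently.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class EnsembleResult:
    task: str
    responses: list[dict]
    selected_plan: str
    voting_strategy: str
    consensus_level: float
    hallucination_risk: float
    total_cost: float
    total_latency_ms: float


def summarize(task: str, responses: list[dict]) -> EnsembleResult:
    """Build an EnsembleResult from successful responses (assumed dict shape)."""
    if not responses:
        raise ValueError("no responses to summarize")
    counts = Counter(r["content"] for r in responses)
    selected, votes = counts.most_common(1)[0]
    consensus = votes / len(responses)  # share of providers backing the winner
    return EnsembleResult(
        task=task,
        responses=responses,
        selected_plan=selected,
        voting_strategy="majority",
        consensus_level=consensus,
        hallucination_risk=1.0 - consensus,  # assumption: simple complement
        total_cost=sum(r["cost"] for r in responses),
        total_latency_ms=max(r["latency_ms"] for r in responses),
    )


# Made-up numbers, for illustration only
sample = [
    {"provider": "openai", "content": "A", "cost": 0.001, "latency_ms": 800.0},
    {"provider": "gemini", "content": "A", "cost": 0.001, "latency_ms": 650.0},
    {"provider": "anthropic", "content": "B", "cost": 0.005, "latency_ms": 900.0},
]
summary = summarize("demo task", sample)
print(f"{summary.consensus_level:.0%} / {summary.hallucination_risk:.0%}")  # 67% / 33%
```

Agreement is only a proxy: models can agree on a wrong answer, so a low risk score is not a guarantee of correctness.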

💰 Cost Tiers

Running queries across multiple providers means costs can add up quickly. Thankfully, the UsageTracker monitors everything per-call based on our defined tiers:

Caution

Remember that an ensemble multiplies your API costs by the number of paid providers you include. Use FREE and LOW tier providers strategically!

| Cost Tier | Rate (per output token) | Example Providers |
| --- | --- | --- |
| FREE | $0.000000 | Ollama, LM Studio, llama.cpp |
| LOW | $0.00000015 | Groq, Perplexity, Cerebras |
| MEDIUM | $0.000002 | OpenAI, Together, OpenRouter |
| HIGH | $0.00001 | Anthropic (certain premium models) |

Internal rate card implementation:

rates = {
    CostTier.FREE: 0.0,
    CostTier.LOW: 0.15e-6,
    CostTier.MEDIUM: 2.0e-6,
    CostTier.HIGH: 10.0e-6,
}
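To get a feel for what an ensemble call costs, you can multiply each provider's output tokens by its tier rate and add them up. The sketch below reuses the rates above; the CostTier enum values, the ensemble_cost helper, and the token counts are hypothetical, and input tokens are ignored because the rates are per output token.

```python
from enum import Enum


class CostTier(Enum):  # hypothetical definition mirroring the tier names above
    FREE = "free"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


RATES = {
    CostTier.FREE: 0.0,
    CostTier.LOW: 0.15e-6,
    CostTier.MEDIUM: 2.0e-6,
    CostTier.HIGH: 10.0e-6,
}


def ensemble_cost(calls: list[tuple[CostTier, int]]) -> float:
    """Sum rate * output_tokens over every provider call in the ensemble."""
    return sum(RATES[tier] * tokens for tier, tokens in calls)


# One MEDIUM, one HIGH, and one FREE provider, 500 output tokens each
calls = [(CostTier.MEDIUM, 500), (CostTier.HIGH, 500), (CostTier.FREE, 500)]
cost = ensemble_cost(calls)
print(f"${cost:.6f}")  # $0.006000
```

In this example the HIGH tier provider accounts for most of the total, which is why swapping in FREE or LOW tier providers has such a large effect.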

🔗 Related Modules

Want to dive deeper into the codebase? Check out these related files:

| Module | Location | What it does |
| --- | --- | --- |
| ProviderManager.ensemble_decide | src/siyarix/providers/manager.py:302 | The core production ensemble logic. |
| ProviderManager.get_providers_by_capability | src/siyarix/providers/manager.py:240 | Helper for filtering and selecting providers. |
| UsageTracker | src/siyarix/providers/usage.py | Calculates and tracks your token costs. |
| ProviderProfile | src/siyarix/providers/types.py | Metadata and capability flags for each AI. |
| MultiModelEnsemble | src/siyarix/chat/stubs.py | UI/Chat integration for displaying consensus. |

Note

👋 Welcome to Siyarix! This is a personal passion project built by a single developer. It's currently under active development and growing fast. Expect rough edges, but lots of love! ❤️

🗺️ Siyarix Documentation Map

Welcome to the Siyarix Documentation Map! This page serves as your master compass for navigating the extensive documentation we have built for the platform.

Whether you are a brand new user, a seasoned security operator, or a developer looking to contribute to the core engine, you can find exactly what you need here.


🧭 Quick Navigation

Not sure where to start? Pick the path that best describes you:

🌱 For New Users

Just getting started? We highly recommend following these guides in order:

  1. Installation Guide — Get Siyarix running on your machine.
  2. Onboarding Wizard — Let our interactive wizard help you set up your API keys and environment.
  3. Setup & Configuration — A deeper dive into customizing your setup.
  4. Your First Run — A gentle walkthrough of your very first Siyarix command.

🛡️ For Security Operators

Ready to put Siyarix to work? Dive into our operational guides:

💻 For Developers & Contributors

Looking under the hood or wanting to write some code? Start here:


📂 The Complete Documentation Tree

If you prefer to browse the raw structure, here is a complete layout of the docs/ folder:

docs/
├── 🚀 getting-started/       # Installation, onboarding, and configuration
│   ├── installation.md       # Multi-platform install (pip, brew, winget, docker)
│   ├── onboarding.md         # The interactive 11-step setup wizard
│   ├── setup.md              # Managing API keys, credentials, and settings
│   ├── first-run.md          # A walkthrough of your first session
│   ├── configuration.md      # A deep-dive into advanced settings
│   └── troubleshooting.md    # Common issues and how to fix them instantly
│
├── 📖 user/                  # Daily operations and workflows
│   ├── cli-commands.md       # Reference for 50+ CLI commands across 12 groups
│   ├── interactive-chat.md   # Mastering the AI REPL and 54+ slash commands
│   ├── security-workflows.md # Recon, vulnerability assessment, incident response
│   ├── cloud-scanning.md     # Multi-cloud security scanning (under development)
│   ├── compliance.md         # Framework mapping (SOC 2, NIST, GDPR, PCI-DSS)
│   ├── threat-intelligence.md # Integrations with OTX, NVD, and MITRE ATT&CK
│   ├── playbooks.md          # Building automated YAML-based IR playbooks
│   ├── workflow-files.md     # DAG workflow reference (programmatic API)
│   ├── reporting.md          # Multi-format report generation
│   ├── offline-registry.md   # Running without AI (Offline/Registry execution mode)
│   └── ai-workflows.md       # Advanced AI-driven autonomous operations
│
├── 💻 developer/             # Building, testing, and extending Siyarix
│   ├── codebase-overview.md  # Full module structure mapping
│   ├── contribution-guide.md # How to submit PRs and our coding standards
│   ├── module-architecture.md # Component design and responsibilities
│   ├── testing.md            # Writing tests (pytest), coverage, and CI/CD
│   └── building.md           # Packaging, distribution, and Docker builds
│
├── 🏗️ architecture/          # System design and core internals
│   ├── overview.md           # High-level data flow and layered orchestration
│   ├── ai-agent-pipeline.md  # The AgentCore reasoning and execution pipeline
│   ├── provider-abstraction.md # How we unify 26 different AI providers
│   ├── execution-engine.md   # Plan-based step orchestration
│   ├── memory-and-state.md   # Knowledge graph, session persistence, and learning
│   ├── security-model.md     # The Permission Gate, DLP, audit logging, and OPSEC
│   └── intent-routing.md     # Semantic intent classification and routing
│
├── 🧠 ai/                    # Deep dive into the AI provider & agent systems
│   ├── routing.md            # Managing 26 providers, failovers, and circuit breakers
│   ├── persona-system.md     # Overview of our 10 security personas
│   ├── agent-reasoning.md    # The Observe-Reason-Act loop and tool call repair
│   ├── tool-execution.md     # The tool registry, capability graph, and parsers
│   ├── ensemble.md           # Parallel LLM voting strategies
│   ├── multi-wave.md         # Iterative goal execution with context carry-over
│   ├── prompt-architecture.md # System prompt design and management
│   └── safety.md             # Our rigorous 8-layer hallucination mitigation system
│
├── 🛡️ security/              # Safety, ethics, and threat models
│   ├── reporting.md          # How to safely report vulnerabilities to us
│   ├── threat-model.md       # System threat model and our mitigations
│   ├── operational-security.md # TOR routing, stealth modes, and OPSEC controls
│   ├── ethical-policy.md     # Mandatory rules of engagement for all users
│   └── abuse-prevention.md   # How we prevent misuse of the AI engine
│
└── ⚖️ legal/                 # Licensing and governance
    ├── agpl-guide.md         # A plain-English overview of the AGPL-3.0-or-later license
    ├── why-agpl.md           # The philosophy behind our license choice
    ├── trademark-policy.md   # Branding and trademark guidelines
    ├── responsible-ai.md     # Our framework for ethical AI usage
    ├── disclaimer.md         # Important legal disclaimers
    └── plugin-exception.md   # The license exception for building custom plugins

📖 Key Terminology

As you read through the documentation, you might encounter some specific terms. Here is a quick cheat sheet:

| Term | What It Means |
| --- | --- |
| Provider | The backend AI engine powering Siyarix (e.g., OpenAI, Anthropic, Ollama). |
| Tool | A traditional security executable installed on your system (e.g., nmap, nuclei). |
| Plan | A step-by-step sequence of tool commands intelligently generated by the AI. |
| Workflow | A hardcoded, predefined execution path (usually defined in YAML/JSON) that doesn't require AI generation. |
| Persona | A specialized behavioral profile given to the AI (e.g., instructing it to act specifically as a "Network Recon Specialist"). |
| Knowledge Graph | Siyarix's internal memory where it stores findings (like IP addresses, open ports) to contextually inform future steps. |

Need help finding something specific? Feel free to use the search bar at the top of the documentation site, or open a discussion on our GitHub!
