Runtime-Adaptive Orchestration for LLM-Native Multi-Agent Systems in Operational Environments
A research project under the OrionAI research lab
Mahboub Parhizkar, PhD — Google Scholar · LinkedIn · ORCID
Research metrics: 200+ citations · h-index: 5 · i10-index: 4
How can LLM-native multi-agent systems dynamically select, adapt, and re-plan their orchestration strategies at runtime based on operational context signals — and what are the measurable effects on coordination quality, efficiency, and reliability?
Today, orchestration strategy in multi-agent systems — which agents run, in what order, with what fallbacks, under what coordination pattern — is decided at design time by the developer. It is hardcoded into graphs, prompt templates, or workflow definitions. When runtime conditions change (task complexity increases, an agent fails, intermediate results shift the problem), the system cannot adapt its coordination approach. It continues executing the strategy it was designed with, regardless of whether that strategy still fits.
ContextFlow investigates a different model: orchestration-as-a-reasoning-problem. The system observes runtime context signals and uses them to select, modify, or re-plan its coordination strategy dynamically — treating orchestration decisions as first-class reasoning outputs, not fixed infrastructure.
Current LLM-native orchestration frameworks (LangGraph, AutoGen, CrewAI) have solved the mechanics of agent coordination well. They provide:
- Graph-based agent state machines
- Role-based task decomposition
- Tool-augmented reasoning loops
- Conversational agent interaction
What they have not solved is adaptive strategy selection. All of these frameworks require the developer to specify the coordination pattern upfront. The framework executes it. There is no mechanism for the system to reason: "given what I know now about this task, I should coordinate my agents differently than originally planned."
This gap becomes critical in operational environments — enterprise systems where tasks are variable, conditions change, agents fail, and the cost of wrong coordination is real. Insurance claim processing, banking transaction orchestration, healthcare workflow automation — these environments expose the brittleness of static coordination strategies constantly.
| ID | Contribution |
|---|---|
| C1 | A taxonomy of runtime context signals relevant to multi-agent orchestration decisions (task complexity, agent state, intermediate result quality, latency budget, failure history) |
| C2 | A framework for strategy-aware orchestration — a coordination layer that reasons about how to coordinate agents, not just what to execute |
| C3 | An evaluation methodology for multi-agent coordination quality beyond task success rate, covering coordination efficiency, adaptation latency, and reliability under agent failure |
| C4 | Empirical findings comparing static vs. adaptive orchestration across operational scenarios of varying complexity and failure rate |
┌──────────────────────────────────────────────────────────────┐
│ API Gateway Layer │
│ FastAPI · Auth · Request Routing │
└────────────────────────────┬─────────────────────────────────┘
│
┌────────────────────────────▼─────────────────────────────────┐
│ Context Observation Layer │
│ Task Complexity · Agent State · Latency · Failure Log │
└────────────────────────────┬─────────────────────────────────┘
│ Runtime context signals
┌────────────────────────────▼─────────────────────────────────┐
│ Strategy Reasoning Engine ◄── CORE │
│ LLM-based meta-planner · Strategy selection & re-planning │
└──────┬──────────────────┬──────────────────┬─────────────────┘
│ │ │
┌──────▼──────┐ ┌────────▼───────┐ ┌──────▼──────────────┐
│ Planning │ │ Execution │ │ Retrieval │
│ Agent │ │ Agent │ │ Agent │
│ LangGraph │ │ LangGraph │ │ Qdrant + RAG │
└─────────────┘ └────────────────┘ └─────────────────────┘
│
┌────────────────────────────▼─────────────────────────────────┐
│ Distributed Execution Layer │
│ Temporal · Kafka · Docker · Celery │
└────────────────────────────┬─────────────────────────────────┘
│
┌────────────────────────────▼─────────────────────────────────┐
│ Observability Layer │
│ Prometheus · Grafana · Structured Logs │
└──────────────────────────────────────────────────────────────┘
The Strategy Reasoning Engine is the novel component. It sits between context observation and agent execution, using an LLM-based meta-planner to select or modify the coordination strategy based on what it observes at runtime. All other layers use proven, production-grade technologies.
contextflow/
├── research/
│ ├── gap-analysis.md # Formal research gap and literature review
│ ├── hypotheses.md # Formal hypotheses and expected findings
│ ├── literature-notes/ # Annotated summaries of key papers
│ └── evaluation-design.md # Metrics framework and experiment design
├── architecture/
│ ├── system-design.md # Architecture decisions and rationale
│ ├── strategy-reasoning.md # Core component design
│ └── diagrams/ # System and component diagrams
├── contextflow/
│ ├── gateway/ # FastAPI gateway
│ ├── context_observer/ # Runtime signal collection
│ ├── strategy_engine/ # Core: LLM meta-planner + strategy selection
│ ├── agents/
│ │ ├── planning_agent/
│ │ ├── execution_agent/
│ │ └── retrieval_agent/
│ ├── workflow/ # Temporal workflow definitions
│ └── infrastructure/ # Kafka, Docker configurations
├── experiments/
│ ├── scenarios/ # Operational scenario definitions
│ ├── baselines/ # Static orchestration baselines
│ ├── adaptive/ # Adaptive strategy experiments
│ └── results/ # Raw and processed results
├── paper/
│ └── drafts/ # Working paper sections
└── README.md
| Concern | Technology | Rationale |
|---|---|---|
| Agent coordination | LangGraph | Graph-based agent state control, Python-native |
| LLM backend | OpenAI API + Ollama | Interchangeable — reproducible in any environment |
| Workflow engine | Temporal | Python-native, production-grade, LLM-era aligned |
| Event streaming | Apache Kafka | Real-time context signal propagation |
| Vector memory | Qdrant | Fast similarity search for retrieval agent |
| API gateway | FastAPI | Lightweight, async-first |
| Deployment | Docker Compose | Fully reproducible research environment |
| Observability | Prometheus + Grafana | Latency and throughput benchmarking |
To ensure generalisability, adaptive vs. static orchestration will be evaluated across three operational scenario types:
| Scenario | Description | Variable factor |
|---|---|---|
| S1 — Linear workflow | Sequential task execution, low complexity | Agent failure rate |
| S2 — Branching workflow | Conditional task routing, medium complexity | Task complexity variance |
| S3 — Dynamic re-planning | Mid-execution context shift requiring strategy change | Context signal volatility |
Metrics collected: coordination accuracy, adaptation latency, agent utilisation, failure recovery time, end-to-end throughput.
- Literature survey: LLM-native orchestration, adaptive planning, multi-agent coordination
- Formal research gap statement (
/research/gap-analysis.md) - System architecture design and component specifications
- Repository scaffolding with research-quality documentation
- FastAPI gateway + context observer implementation
- Static orchestration baseline (LangGraph + Temporal)
- Three baseline agents (planning, execution, retrieval)
- Evaluation metrics framework
- LLM meta-planner design and implementation
- Strategy selection and re-planning logic
- Kafka-based runtime context signal streaming
- Integration with agent coordination layer
- Comparative experiments across three scenarios
- Results analysis and visualisation
- First paper draft
- Research findings published as LinkedIn technical series
- IEEE Transactions on Services Computing
- Journal of Artificial Intelligence Research (JAIR)
- IEEE Intelligent Systems
- AAMAS 2026 — International Conference on Autonomous Agents and Multi-Agent Systems
- ICSOC 2026 — International Conference on Service-Oriented Computing
ContextFlow is positioned as operational AI systems research — the study of intelligent systems that act, coordinate, and adapt in real enterprise environments.
- Not a tutorial implementation of LangGraph or AutoGen
- Not a chatbot or conversational AI application
- Not a benchmark of existing LLM capabilities
- Is an investigation of a novel architectural problem: how multi-agent systems should reason about their own coordination strategy at runtime
This framing aligns with postdoctoral research tracks in autonomous agents, intelligent systems, AI infrastructure, and distributed AI at both university research labs and industry research groups.
OrionAI is an independent AI research initiative focused on operational intelligent systems, distributed AI infrastructure, and LLM-native orchestration architectures for real-world enterprise environments.
🔬 Active research — Phase 1 in progress
Research inquiries and collaboration proposals welcome via LinkedIn