EnVision-Research/StreamMA

StreamMA

Streaming Communication in Multi-Agent Reasoning

Zhen Yang¹ · Xiaogang Xu³ · Wen Wang³ · Cong Chen³ · Xander Xu²* · Ying-Cong Chen¹,⁴*

¹HKUST(GZ) · ²Alibaba Group · ³ZJU · ⁴HKUST
*Co-corresponding authors


⚡ Quick Start

pip install openai
python StreamMA.py

Or in your own script:

import asyncio
from StreamMA import (
    StreamMA, RunLogger,
    PROMPT_A_CHAIN, PROMPT_B_CHAIN, PROMPT_C_CHAIN,
)

# Chain topology: Agent_A streams to Agent_B, which streams to Agent_C.
# Agent_C has no "next" entry, so it is the final agent.
config = {
    "Agent_A": {"system_prompt": PROMPT_A_CHAIN, "next": ["Agent_B"]},
    "Agent_B": {"system_prompt": PROMPT_B_CHAIN, "next": ["Agent_C"]},
    "Agent_C": {"system_prompt": PROMPT_C_CHAIN},
}

async def main():
    # Every "***" is a placeholder: replace it with your own value.
    logger = RunLogger(input="***", logger_path="logger.json")
    logger.start()
    await StreamMA.run(
        config, "your problem here",
        api_key="***", base_url="***", model="***",
        logger=logger,
    )
    # Finalise the run log (see "Logger output" for the recorded fields).
    logger.finish(config)

asyncio.run(main())

🔧 Customising the topology

The DAG is defined entirely by the config dict. To change the topology, edit each agent's next: [...] list, and give every agent its own system_prompt. As in the Quick Start, a terminal agent simply omits next:

# Chain   A → B → C
{"A": {..., "next": ["B"]}, "B": {..., "next": ["C"]}, "C": {...}}

# Tree    A → {B, C}
{"A": {..., "next": ["B", "C"]}, "B": {...}, "C": {...}}

# Graph   A → B → C, with shortcut A → C
{"A": {..., "next": ["B", "C"]}, "B": {..., "next": ["C"]}, "C": {...}}
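
Because the topology lives in a plain dict, it may be worth checking that a hand-edited config is still a DAG before launching a run. The sketch below is not part of StreamMA: topological_order is a hypothetical helper that assumes only the config shape shown above (agent names as keys, an optional "next" list of agent names) and applies Kahn's algorithm to detect unknown agents and cycles.

```python
from collections import deque


def topological_order(config):
    """Return agent names in a valid execution order, or raise ValueError.

    Hypothetical helper, not part of StreamMA. It relies only on the
    config shape from the README: {name: {"next": [names...], ...}}.
    """
    # Every name listed under "next" must itself be a key of the config.
    for name, spec in config.items():
        for child in spec.get("next", []):
            if child not in config:
                raise ValueError(f"{name!r} points to unknown agent {child!r}")

    # Count incoming edges for each agent.
    indegree = {name: 0 for name in config}
    for spec in config.values():
        for child in spec.get("next", []):
            indegree[child] += 1

    # Kahn's algorithm: repeatedly emit agents with no unprocessed parents.
    ready = deque(name for name, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for child in config[name].get("next", []):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    # Agents left over can only be part of a cycle.
    if len(order) != len(config):
        raise ValueError("config contains a cycle, so it is not a DAG")
    return order


# The "Graph" example above: A → B → C, with shortcut A → C.
graph = {
    "A": {"system_prompt": "...", "next": ["B", "C"]},
    "B": {"system_prompt": "...", "next": ["C"]},
    "C": {"system_prompt": "..."},
}
print(topological_order(graph))
```

Calling the helper on a config just before StreamMA.run would surface a mistyped agent name or an accidental back-edge as a ValueError instead of a stalled run.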

📈 Logger output

RunLogger records per-agent token counts, KV-cache hits, API time, and an ASCII timeline of streaming segments:

{
  "summary": {
    "agents": {
      "Agent_A": {"segments": 1, "prefill_tokens": 190,   "cached_tokens": 0,     "kv_cache_hit_ratio": 0.0,    "decode_tokens": 4084,  "api_time": 123.30},
      "Agent_B": {"segments": 4, "prefill_tokens": 30373, "cached_tokens": 10624, "kv_cache_hit_ratio": 0.3498, "decode_tokens": 10175, "api_time": 308.74},
      "Agent_C": {"segments": 4, "prefill_tokens": 42755, "cached_tokens": 23040, "kv_cache_hit_ratio": 0.5389, "decode_tokens": 7197,  "api_time": 221.08}
    },
    "agent_count": 3,
    "total_prefill_tokens": 73318,
    "total_cached_tokens":  33664,
    "total_decode_tokens":  21456,
    "total_tokens":         94774,
    "speedup_analysis": {
      "api_time": 653.12,
      "wall_time": 376.02,
      "speedup": 1.74,
      "total_kv_cache_hit_ratio": 0.4592,
      "critical_path_time": 653.12,
      "streaming_speedup": 1.74,
      "timeline": [
        "[Timeline]",
        "  Agent Agent_A:█████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 123.3s",
        "  Agent Agent_B:░░░░░███████████████████████████████████████████████░░░ 308.7s",
        "  Agent Agent_C:░░░░░░░░░░░░░████████████░░░░░░███████████████████ 221.1s",
        "            ──────────────────────────────────────────────────",
        "            │    │       │  │       │      │      │   │   │  │          ",
        "            0.0s 40.0s   98.0s      183.7s 240.6s 288.6s  353.1s        ",
        "",
        "  Legend: █ = processing, ░ = idle"
      ]
    }
  }
}
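
The derived fields in this sample can be reproduced from the per-agent entries, which helps when reading the numbers. The sketch below is an inference from the sample output, not RunLogger's actual code: summarise is a hypothetical function, and the formulas (hit ratio = cached / prefill, total tokens = prefill + decode, speedup = summed API time / wall time) are assumptions that happen to match the values shown above.

```python
def summarise(agents, wall_time):
    """Recompute the summary totals from per-agent entries.

    Hypothetical helper: the formulas are inferred from the sample
    output and are not confirmed to be what RunLogger does internally.
    """
    prefill = sum(a["prefill_tokens"] for a in agents.values())
    cached = sum(a["cached_tokens"] for a in agents.values())
    decode = sum(a["decode_tokens"] for a in agents.values())
    api_time = sum(a["api_time"] for a in agents.values())
    return {
        "total_prefill_tokens": prefill,
        "total_cached_tokens": cached,
        "total_decode_tokens": decode,
        # Assumption: cached tokens are a subset of prefill tokens,
        # so they are not added again here.
        "total_tokens": prefill + decode,
        "total_kv_cache_hit_ratio": round(cached / prefill, 4) if prefill else 0.0,
        "api_time": round(api_time, 2),
        # Assumption: speedup compares the summed API time of all agents
        # with the observed wall-clock time of the streamed run.
        "speedup": round(api_time / wall_time, 2),
    }


# Per-agent values copied from the sample output above.
agents = {
    "Agent_A": {"prefill_tokens": 190, "cached_tokens": 0,
                "decode_tokens": 4084, "api_time": 123.30},
    "Agent_B": {"prefill_tokens": 30373, "cached_tokens": 10624,
                "decode_tokens": 10175, "api_time": 308.74},
    "Agent_C": {"prefill_tokens": 42755, "cached_tokens": 23040,
                "decode_tokens": 7197, "api_time": 221.08},
}
summary = summarise(agents, wall_time=376.02)
print(summary)
```

Under these assumptions, the three agents spend 653.12 s in API calls in total but finish in 376.02 s of wall time because their work overlaps, which gives the reported speedup of 1.74.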

📚 Citation

If you find StreamMA useful, please cite:

@article{yang2026streamma,
  title={Streaming Communication in Multi-Agent Reasoning},
  author={Yang, Zhen and Xu, Xiaogang and Wang, Wen and Chen, Cong and Xu, Xander and Chen, Ying-Cong},
  journal={arXiv preprint arXiv:2606.05158},
  year={2026}
}

About

Official implementation of "Streaming Communication in Multi-Agent Reasoning"
