Compact text-surface protocol for AI-agent state. Lives inside markdown fenced blocks. Cross-vendor ratified across 6+ LLM families.
Multi-agent workflows need to pass state between LLM turns. JSON-RPC is verbose. Plain prose drifts. Every team invents its own slot syntax. MLLANG is a small shared convention that fits inside markdown fenced blocks and uses 50–70% fewer tokens than JSON-RPC for the same state.
# Refactor auth module
Workflow: switch sessions → JWT, gate on test=pass, rollback ready.
```mllang
V:0.1.r1; I:auth-001; G:{task=refactor_auth, from=sessions, to=jwt};
S:{users_db=preserve, api_compat=hard}; D:[migration_plan, rollback_path];
R:[downtime, token_leak]; N:@K -> implement; H:test=pass; P:0.85;
EN: Refactor auth sessions → JWT, preserve DB + API compat.
```
One-line workflow summary above the fenced block, full state inside the packet, EN: line for human skim. No content duplication. Agents parse the block; humans glance at the summary. Long prose is opt-in (mode="verbose") for human-authored docs.
See docs/markdown-embedded.md for the full pattern.
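To make the two-channel idea concrete, here is a minimal stdlib-only sketch of pulling an `mllang` block out of a markdown document and splitting it into slots. It is not the reference parser: `extract_block` and `split_slots` are hypothetical helpers, and the splitting rule (semicolon-separated `KEY:value` pairs, with `EN:` on its own line) is an assumption read off the example above rather than taken from the spec.

```python
import re
from typing import Dict, Optional

# Built programmatically so this example nests cleanly inside markdown.
FENCE = "`" * 3
BLOCK_RE = re.compile(re.escape(FENCE) + r"mllang\n(.*?)" + re.escape(FENCE), re.DOTALL)


def extract_block(markdown: str) -> Optional[str]:
    """Return the body of the first mllang fenced block, or None if absent."""
    match = BLOCK_RE.search(markdown)
    return match.group(1).strip() if match else None


def split_slots(block: str) -> Dict[str, str]:
    """Split a packet body into {slot_key: raw_value} (assumed grammar, see above)."""
    slots: Dict[str, str] = {}
    for line in block.splitlines():
        line = line.strip()
        if line.startswith("EN:"):
            # The EN line is free text, so it is not split on semicolons.
            slots["EN"] = line[3:].strip()
            continue
        for part in line.split(";"):
            part = part.strip()
            if part:
                key, _, value = part.partition(":")
                slots[key.strip()] = value.strip()
    return slots


doc = "\n".join([
    "# Refactor auth module",
    "Workflow: switch sessions → JWT, gate on test=pass, rollback ready.",
    FENCE + "mllang",
    "V:0.1.r1; I:auth-001; G:{task=refactor_auth, from=sessions, to=jwt};",
    "R:[downtime, token_leak]; N:@K -> implement; H:test=pass; P:0.85;",
    "EN: Refactor auth sessions → JWT, preserve DB + API compat.",
    FENCE,
])

slots = split_slots(extract_block(doc))
print(slots["N"], "|", slots["H"], "|", slots["P"])
```

For real use, prefer the package's own `extract_summary_and_packet`, which presumably handles nested braces and validation that this sketch ignores.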
```python
from mllang import Packet, embed_in_markdown, extract_summary_and_packet

# Build a packet
p = Packet(
    version="0.1.r1",
    thread_id="auth-001",
    goal={"task": "refactor_auth", "to": "jwt"},
    state={"users_db": "preserve"},
    next_agent="@K -> implement",
    halt="test=pass",
    confidence=0.85,
    en_shadow="Refactor auth sessions → JWT, preserve DB.",
)

# Embed in markdown (summary mode = default, tight)
md = embed_in_markdown(
    p,
    title="Refactor auth module",
    summary="Workflow: switch sessions → JWT, gate on test=pass.",
)

# Pull both channels back out
summary, packet = extract_summary_and_packet(md)
print(summary)            # the workflow summary
print(packet.next_agent)  # @K -> implement
print(packet.halt)        # test=pass
```

Three modes: summary (default, agent-loaded files), verbose (long prose for human-authored docs), packet_only (pure agent pipelines).
MLLANG composes with existing standards rather than replacing them:
| Layer | Standard | Role |
|---|---|---|
| Doc / agent rules | AGENTS.md, llms.txt, Cursor rules | host markdown containing MLLANG blocks |
| Tool calls | MCP (Model Context Protocol) | carry MLLANG packets as tool payload |
| Agent transport | A2A, ACP | carry MLLANG packets as message body |
| Runtime state | MLLANG | THIS LAYER — slot grammar, halt enum, confidence |
MLLANG sits inside MCP/A2A as payload, inside markdown files as runtime state. Doesn't compete with them.
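As an illustration of "packet as payload", the sketch below wraps a packet string in a JSON-RPC 2.0 request of the shape MCP uses for tool calls (`tools/call` with `name` and `arguments`). The tool name `mllang_parse` comes from this README; the argument key `packet` is an assumption, so check docs/mcp.md for the actual input schema.

```python
import json

packet_text = "V:0.1.r1; I:auth-001; N:@K -> implement; H:test=pass; P:0.85;"

# MCP tool calls travel as JSON-RPC 2.0 requests.
# NOTE: the "packet" argument name is hypothetical, not taken from the MLLANG docs.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "mllang_parse",
        "arguments": {"packet": packet_text},
    },
}

wire = json.dumps(request)

# The receiving side recovers the packet text unchanged from the envelope.
received = json.loads(wire)["params"]["arguments"]["packet"]
print(received == packet_text)
```

The envelope belongs to the transport; the packet stays an opaque string inside it, which is why the two layers do not compete.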
RecursiveMAS (arXiv 2604.25917) solves multi-agent collaboration by exchanging latent vectors between recursive agent rounds — elegant when you control the model weights, but it requires direct activation access (Qwen / Llama / Gemma / DeepSeek self-hosted) and cannot drive closed APIs like Claude / GPT / Gemini.
MLLANG is the text-surface approximation of the same goal. Loses the latent-recursion benefit. Gains universal compatibility — runs everywhere RecursiveMAS can't (closed APIs, mixed-vendor loops, paste-based workflows). A team running RecursiveMAS internally can still emit MLLANG packets as the text log of their latent rounds. Orthogonal layers; not competing.
| | JSON-RPC | YAML frontmatter | LangGraph state | MLLANG |
|---|---|---|---|---|
| Format | JSON-RPC envelope | YAML | Python dict | Compact ASCII |
| Single-line | No | No | No | Yes |
| Token cost | High | High | High | 50–70% lower |
| Embed in markdown | No | header only | No | YES (fenced block) |
| Human readable | Verbose | Yes | No | 5-min learn |
| Confidence built-in | No | No | No | P: slot |
| Halt enum | No | No | No | 9-way native |
| Multi-agent handoff | Manual | Manual | Framework-locked | N: slot |
MLLANG v0.1 was tested across these LLM families before locking:
| Family | Model | Status |
|---|---|---|
| Anthropic | Claude Sonnet 4.6 / Opus 4.7 | ✅ ratified |
| OpenAI | GPT-5.5 Thinking, Codex (gpt-5.5) | ✅ ratified |
| Google | Gemini Pro web, Gemini Flash CLI | ✅ ratified |
| Google (local) | Gemma-4 26B (AIR backend) | ✅ ratified |
| Google (local) | Gemma-4 4B (local, ~4B params) | ✅ ratified — small-model compactness test PASS |
| DeepSeek | DeepSeek V3 | ✅ ratified (P:0.92, 1 round) |
| Moonshot | Kimi K2.6 | ✅ ratified (P:0.88, 1 round) |
| Alibaba | Qwen 3.6 | ✅ ratified (P:0.85, 1 round) |
Mean ratification confidence: 0.93 (count-weighted: 0.95 across the original 6 vendors, 0.883 across the Chinese 3-vendor pass). Spec locked 2026-05-19.
Audit trail: examples/05_negotiation_trace.jsonl (original 6), examples/ratification_chinese_2026-05-20.jsonl (Chinese pass).
The Chinese-family pass converged in 1 round per model (vs 3 rounds for the original spec-locking pass) because the spec was already locked — each backend only needed to acknowledge and emit a valid packet, not negotiate the grammar.
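The reported means can be reproduced from the figures above. The sketch assumes the overall 0.93 is a mean weighted by vendor count; that is one reading that matches the number, not something the audit trail is quoted as stating.

```python
# Per-model confidences from the Chinese-family rows of the table above.
chinese_pass = {"DeepSeek V3": 0.92, "Kimi K2.6": 0.88, "Qwen 3.6": 0.85}
chinese_mean = sum(chinese_pass.values()) / len(chinese_pass)

# Reported mean and vendor count for the original spec-locking pass.
original_mean, original_count = 0.95, 6

# Assumption: the overall figure weights each pass by its number of vendors.
total = original_mean * original_count + sum(chinese_pass.values())
overall = total / (original_count + len(chinese_pass))

print(round(chinese_mean, 3), round(overall, 2))  # 0.883 0.93
```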
```
mllang/
├── spec/                    Spec (v0.1 locked + v0.2 draft)
├── parser/                  Python reference parser (pure stdlib)
├── examples/                5 generic packet examples + markdown-embedded demo
├── bootstrap/               Role prompts (orchestrator, critic, implementer, synthesizer)
├── conformance/             Test suite: any parser implementation can score against it
├── codex-cli-integration/   Conservative Codex CLI session parser + report wrapper
├── docs/                    GitHub Pages site
├── rfc/                     Quarterly RFC proposals
└── .github/workflows/       CI: conformance test on PR
```
```bash
pip install mllang-protocol
```

PyPI distribution name is mllang-protocol (the bare mllang name was already held by an unrelated 2021 ML library on PyPI). The Python import path is still mllang:

```python
from mllang import Packet, parse, embed_in_markdown, sanitize
```

Editable install from the repo also works:

```bash
git clone https://github.com/jakeliu/mllang.git
cd mllang
pip install -e parser/
```

- Specification (v0.1 locked)
- Quickstart
- Markdown-embedded usage
- Bootstrap — 3-layer setup
- MCP server
- shim-engine — observability
- Telemetry & privacy
- FAQ
- Conformance tests
- RFC process
- Security policy
See CONTRIBUTING.md. Quarterly RFC review windows. Multi-vendor test required for any spec change.
Same repo, separate PyPI listing. Local-first observability for LLM-agent loops — token / latency / cost / outcome distribution in a JSONL log, aggregated by a small CLI. Works with any model. Optional MLLANG awareness for 5x more signal.
```bash
pip install shim-engine               # standalone (any LLM)
pip install 'shim-engine[mllang]'     # auto-extract halt / confidence / agent code from MLLANG responses
pip install 'mllang-protocol[shim]'   # same as above, reverse install order
```

Killer demo:

```
$ shim-engine-report calls.jsonl
200 packets logged
mean token reduction: 56.8% vs JSON-RPC equivalent (200/200 packets MLLANG-tagged)
estimated tokens saved: 9,536
p50 latency: 4.32s p99: 7.92s
mean P: 0.81
halt distribution: test=pass 62% | <=> 18% | accept 9% | escalate@H 7% | risk!high 4%
model distribution: gpt-5.5-thinking 42% | claude-opus 32% | gemini-pro 13% | gemma-26b 10% | deepseek-v3 4%
agent distribution: @G 23% | @X 21% | @K 20% | @M 19% | @C 17%
```
Full guide: docs/shim.md and shim/README.md.
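For intuition about what the report aggregates, here is a small sketch that computes a halt distribution and mean latency from JSONL rows. The field names (`model`, `halt`, `latency_s`) and the sample rows are hypothetical; the real shim-engine log schema is documented in docs/shim.md and may differ.

```python
import json
from collections import Counter

# Hypothetical log rows, one JSON object per line as in a JSONL file.
log_lines = [
    '{"model": "claude-opus", "halt": "test=pass", "latency_s": 4.1}',
    '{"model": "gpt-5.5-thinking", "halt": "test=pass", "latency_s": 5.0}',
    '{"model": "gemini-pro", "halt": "accept", "latency_s": 3.2}',
    '{"model": "claude-opus", "halt": "escalate@H", "latency_s": 7.9}',
]

rows = [json.loads(line) for line in log_lines]

# Share of calls ending in each halt value.
halt_counts = Counter(row["halt"] for row in rows)
halt_share = {halt: count / len(rows) for halt, count in halt_counts.items()}

mean_latency = sum(row["latency_s"] for row in rows) / len(rows)

# Print the distribution, most frequent halt first.
for halt, share in sorted(halt_share.items(), key=lambda item: -item[1]):
    print(f"{halt}: {share:.0%}")
print(f"mean latency: {mean_latency:.2f}s")
```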
One-keystroke MLLANG inside Claude Code (Anthropic's official CLI). The skill loads the bootstrap, instruments sub-agent (Agent tool) calls via a PostToolUse hook, and lets you check live token-savings at any time.
Install:
```bash
curl -sL https://raw.githubusercontent.com/jakeliu/mllang/main/claude-code-skill/install.sh | bash
pip install 'shim-engine[mllang]'   # optional, enables /mllang report
```

Then in Claude Code:

```
/mllang               # load bootstrap, ratify session
/mllang load critic   # adopt the critic role's conventions
/mllang report        # token-savings report from sub-agent calls this session
/mllang status        # one-liner summary
/mllang spec          # 5-line MLLANG v0.1 summary
```
The sub-agent log is privacy-redacted (slot SHAPES only, no slot values) and lives at ~/.claude/mllang-shim/session.jsonl. Full guide: claude-code-skill/README.md.
Conservative Path 1 for Codex CLI. No hooks are installed. The integration parses Codex's persisted session JSONL under ~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl, records one shim-engine row per completed turn, and reports real tokens in/out from Codex token_count events.
Install:
```bash
bash codex-cli-integration/install.sh
pip install 'shim-engine[mllang]'   # optional, enables reports
```

Use:

```bash
python3 ~/.codex/mllang-integration/scripts/codex_session_parser.py --latest
bash ~/.codex/mllang-integration/scripts/codex-mllang-report.sh
```

Budget source is tagged honestly: B-slot when the final MLLANG packet self-reports B:{tokens_in=..., tokens_out=..., time=...}, otherwise session-token_count, otherwise none. Capture boundary is top-level Codex turn, not sub-agent. Full guide: codex-cli-integration/README.md.
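The budget-source precedence described above can be sketched as a simple fallback chain. This is an illustration only: `parse_b_slot` and `pick_budget` are hypothetical names, the regex is a simplification of the B-slot shape, and the sample values are invented. The tag strings are the ones this README names.

```python
import re
from typing import Dict, Optional, Tuple

# Simplified match for a self-reported budget slot: B:{key=value, ...}
B_SLOT_RE = re.compile(r"\bB:\{([^}]*)\}")


def parse_b_slot(packet_text: str) -> Optional[Dict[str, str]]:
    """Return the B-slot fields as raw strings, or None if the slot is missing."""
    match = B_SLOT_RE.search(packet_text)
    if not match:
        return None
    fields: Dict[str, str] = {}
    for pair in match.group(1).split(","):
        key, sep, value = pair.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields or None


def pick_budget(packet_text: str, session_token_count: Optional[dict]) -> Tuple[str, Optional[dict]]:
    """Prefer the packet's own B-slot, then session token counts, then nothing."""
    b_slot = parse_b_slot(packet_text)
    if b_slot is not None:
        return "B-slot", b_slot
    if session_token_count is not None:
        return "session-token_count", session_token_count
    return "none", None


# Invented sample data for illustration.
session = {"tokens_in": 1500, "tokens_out": 400}
with_slot = "H:test=pass; B:{tokens_in=1200, tokens_out=340, time=4.2s};"
without_slot = "H:test=pass; P:0.85;"

print(pick_budget(with_slot, session)[0])     # B-slot
print(pick_budget(without_slot, session)[0])  # session-token_count
print(pick_budget(without_slot, None)[0])     # none
```

Tagging the source alongside the numbers lets a report separate self-reported budgets from measured ones instead of silently mixing them.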
```bash
pip install 'mllang-protocol[mcp]'
mllang-mcp-server
```

Or drop into claude_desktop_config.json:

```json
{
  "mcpServers": {
    "mllang": { "command": "mllang-mcp-server" }
  }
}
```

Exposes 7 tools (mllang_parse / mllang_compose / mllang_validate / mllang_embed_in_markdown / mllang_extract_summary / mllang_sanitize / mllang_spec). End-to-end test in conformance/test_mcp_server.py drives every tool over real MCP stdio. Full guide: docs/mcp.md.
MLLANG ships with a built-in sanitize() function. Telemetry is opt-in only, off by default, and the library never auto-enables it.
```bash
# default: nothing sent
export MLLANG_TELEMETRY=off
# opt-in levels:
export MLLANG_TELEMETRY=shape        # slot presence + halt + confidence (recommended)
export MLLANG_TELEMETRY=structured   # add map keys + verb names + counts
export MLLANG_TELEMETRY=full         # add redacted values (research consent only)
```

```python
from mllang import sanitize, sanitize_to_json

payload = sanitize(packet)        # None unless env var set
line = sanitize_to_json(packet)   # None or one-line JSON
```

Slot shapes are public; slot values stay private. Thread ids are hashed (<I:hash:<sha256_12>>), file paths and tool args are never logged, and a leak-detector refuses payloads that still contain emails / paths / API-key patterns / long quoted strings.
Full slot-by-slot rules and before/after examples: docs/telemetry.md. Disclosure policy and IP-leak bug bounty: .github/SECURITY.md.
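To show the kind of redaction described above, here is a rough sketch of thread-id hashing and a pattern-based leak check. It is not the library's `sanitize()` implementation: the function names are hypothetical, reading `sha256_12` as "first 12 hex characters of the SHA-256 digest" is an assumption, and the regexes are deliberately simple stand-ins for whatever rules docs/telemetry.md specifies.

```python
import hashlib
import re
from typing import List


def hash_thread_id(thread_id: str) -> str:
    """Replace a thread id with a short digest (assumed: first 12 hex chars of SHA-256)."""
    digest = hashlib.sha256(thread_id.encode("utf-8")).hexdigest()[:12]
    return f"<I:hash:{digest}>"


# Illustrative patterns only; a production leak detector would need to be stricter.
LEAK_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "path": re.compile(r"(?:~|\.{1,2})?/[\w.-]+(?:/[\w.-]+)+"),
    "api_key": re.compile(r"\b(?:sk|pk|ghp)[-_][A-Za-z0-9_-]{16,}"),
    "long_quoted": re.compile(r"\"[^\"]{40,}\"|'[^']{40,}'"),
}


def find_leaks(payload: str) -> List[str]:
    """Return the names of every leak pattern found in the payload."""
    return [name for name, pattern in LEAK_PATTERNS.items() if pattern.search(payload)]


token = hash_thread_id("auth-001")
print(token)

print(find_leaks("slots=V,I,G,H,P halt=test=pass P=0.85"))  # []
print(find_leaks("contact alice@example.com"))              # ['email']
print(find_leaks("read /home/alice/.ssh/id_rsa"))           # ['path']
```

A payload would be refused whenever `find_leaks` returns a non-empty list, which errs on the side of dropping telemetry rather than shipping a value.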
- v0.1 spec: LOCKED 2026-05-19
- v0.2 spec: draft, RFC window open
- Parser: pure Python 3, no external deps
- Conformance: 160-packet test suite (parse / halt / roundtrip / markdown extract / sanitize / embed)
- Telemetry: opt-in sanitize() with 4 levels + leak-detector defense
Apache 2.0. See LICENSE.
If you use MLLANG in research or production, please cite:
MLLANG: Markdown Language for AI Agents (2026)
github.com/jakeliu/mllang