EvoScientist v0.1.0 — Async Sub-Agents, Official Docker Image, Personal WeChat & Sessions DB Compaction

G'Day! v0.1.0 is the biggest release since launch. The headline is async sub-agents — writing-agent and data-analysis-agent now run on a managed langgraph dev subprocess, so 30-180s long-runners no longer block your chat, and you get proactive completion notifications without polling. We also ship an official multi-arch Docker image, a personal-WeChat (iLink) channel backend with QR login, the new /model-fallback command, a PruningCheckpointer that compacts multi-GB legacy sessions.db files in place, and a critical HITL fix that unblocks parallel sub-agents calling execute.

🚀 Async Sub-Agents (default-on)

writing-agent and data-analysis-agent typically run 30-180s. With sync delegation, your main chat was blocked the whole time. v0.1.0 transparently deploys the main agent and any sub-agent flagged async: true onto an auto-managed langgraph dev subprocess; the supervisor returns immediately with a task_id and keeps talking (#200).

Single source of truth — per-agent YAML in EvoScientist/subagents/ (replaces the monolithic subagent.yaml); both the in-process sync path and the deployed async path read from the same file. To add a new async sub-agent, set async: true in its YAML, add a binding in langgraph_dev/graphs.py, and register it in langgraph_dev/langgraph.json.
Auto-managed subprocess — langgraph_dev/manager.py handles start / health-check / stop. Uses filelock to serialize concurrent CLI invocations and psutil to detect stale PID files. Default port 6174 (Kaprekar's constant); onboarding has a "LangGraph Port" step that rejects already-occupied ports unless reused by EvoSci itself.
Workspace sync widget — cli/widgets/workspace_sync_widget.py shows live progress when CLI workspace files sync to the langgraph dev subprocess.
Graceful fallback — if langgraph dev fails to start, async sub-agents fall back to in-process synchronous delegation transparently.
Toggle — set enable_async_subagents: false in ~/.config/evoscientist/config.yaml to keep everything sync.
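
The port check in the onboarding step can be approximated with a plain socket bind. This is a minimal sketch only: is_port_free is a hypothetical helper, not the code in langgraph_dev/manager.py, and it leaves out the documented exception for a port that EvoSci itself already holds.

```python
import socket

DEFAULT_PORT = 6174  # default port named in these notes (Kaprekar's constant)


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to host:port right now.

    Hypothetical helper for illustration. A successful bind only proves the
    port was free at the moment of the check; the probe socket is closed
    again before returning.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: something else is already bound here.
            return False
    return True
```

A wizard step could call is_port_free(DEFAULT_PORT) and ask for another port when it returns False.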

🔔 Async Sub-Agent Auto-Notifications

The main agent now learns about async completions automatically — no more "check status" polling (#214).

╭── ✦ Agent Teams ✦ ──────────────────╮
│ ✔ writing-agent       success       │
│ ✗ data-analysis-agent error         │
╰─────────────────────────────────────╯

Watcher pattern — every async launch spawns a background asyncio task subscribed to client.runs.join_stream(thread_id, run_id) (SDK-native SSE long-poll). Terminal state pushes an AsyncTaskNotification onto a thread-safe queue; CLI / TUI / serve poll loops drain, dedup, batch, and inject as a synthetic HumanMessage to wake the supervisor.
Token-efficient — LLM input gets compact JSON per task line ({"agent": ..., "status": ..., "task_id": ...}); the decorative frame with colored ✔/✗/⚠ icons lives in a separate render path.
Update support — update_async_task (continuing a conversation with a sub-agent) creates a new run_id on the same thread_id. A _watcher_by_thread dict with replace semantics ensures the new watcher takes over: the old watcher is cancelled before the new run is awaited, so a stale "success" cannot race the new run.
Per-thread routing — notifications are scoped to their originating CLI thread, so /new between sub-agent launch and completion no longer injects updates into the wrong thread.
False-positive watcher fix — a bug observed in production: the SSE long-poll could close cleanly while the run was still alive on the server (httpx timeout, HTTP/2 GOAWAY, proxy idle keep-alive), which produced fake ✅ success notifications. The watcher now uses a bounded reconnect loop and verifies the run state via runs.get after every clean close (#216).
/model propagation — /model switches now reach async sub-agents via RunnableConfig.configurable. Previously the deployed graphs froze the boot-time model, so async agents kept the old one forever (#217).
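
The drain, dedup, and batch step described above can be sketched with the standard library. Everything here is illustrative: the field names on AsyncTaskNotification and the drain_notifications helper are assumptions, since the notes only give the class name and the three JSON keys.

```python
import json
import queue
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class AsyncTaskNotification:
    # Assumed fields; only the class name and the JSON keys
    # (agent, status, task_id) appear in the release notes.
    thread_id: str
    task_id: str
    agent: str
    status: str


def drain_notifications(q: "queue.Queue", thread_id: str) -> Optional[str]:
    """Drain the queue and build the text of one synthetic human message.

    Duplicate task_ids are dropped, each remaining task becomes one compact
    JSON line, and notifications that belong to other threads are put back
    so they are never injected into the wrong conversation.
    """
    seen: Set[str] = set()
    lines: List[str] = []
    others: List[AsyncTaskNotification] = []
    while True:
        try:
            note = q.get_nowait()
        except queue.Empty:
            break
        if note.thread_id != thread_id:
            others.append(note)
            continue
        if note.task_id in seen:
            continue
        seen.add(note.task_id)
        lines.append(
            json.dumps({"agent": note.agent, "status": note.status, "task_id": note.task_id})
        )
    for note in others:
        q.put(note)
    return "\n".join(lines) if lines else None
```

A poll loop would call drain_notifications on each tick and, when the result is not None, wrap it in a HumanMessage to wake the supervisor.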

🐳 Official Docker Image

Multi-arch (amd64 / arm64) image at ghcr.io/evoscientist/evoscientist, built via GitHub Actions (#198 by @din0s, closes #175).

docker run -it --rm \
  -v evosci-data:/home/evosci/.evoscientist \
  -v "$(pwd)/workspace:/workspace" \
  ghcr.io/evoscientist/evoscientist

Multi-stage build off python:3.11-slim-bookworm, uv-managed venv with the all-channels extra installed via uv sync --frozen --no-dev --no-editable.
Node.js 24 LTS (copied from node:24-bookworm-slim) so npx works for the majority of MCP servers.
uv binary so the MCP registry can install Python MCP servers on demand at runtime.
One-volume persistence — /home/evosci/.evoscientist holds sessions DB, global skills, memories, and config (XDG_CONFIG_HOME points inside it; UV_TOOL_DIR / UV_TOOL_BIN_DIR redirect uv-tool artifacts there too). MCP servers installed during onboarding survive docker run --rm.
Non-root user evosci for shell sandboxing.
See the README's Docker section for derivation recipes for unbundled extras (stt, oauth, TinyTeX) and proxy / cert handling.

💬 Personal WeChat (iLink) Channel Backend

A third WeChat backend — personal — rides Tencent's iLink Bot long-poll gateway, letting a personal WeChat account act as a bot alongside the existing WeCom and Official Account backends (#212 by @MuXinCG2004).

No app-id / secret — credentials come from a QR-code scan (python -m EvoScientist.channels.wechat.serve --qr-login) and are persisted under DATA_DIR/wechat_personal/accounts/. The wizard can drive the QR scan inline.
Backend picker in onboard — first-time users pick WeCom / 微信公众号 / 个人微信 without leaving the wizard.
Group delivery is opt-in and disabled by default — iLink rarely delivers group messages for QR-login bot identities.
Media transfers use the AES-128-ECB CDN protocol against novac2c.cdn.weixin.qq.com.
New deps: qrcode>=7.4 and certifi>=2024.0 added to the wechat and all-channels extras.

Why personal vs the existing backends: WeCom and Official Account both require a registered organisation / app and issue stable credentials — fine for production bots, awkward for hobbyists. The iLink personal backend lowers that barrier: scan a QR with a personal WeChat, get a session token, run the bot.

📱 QQ Bot QR-Code Onboarding

The config wizard can now auto-fill qq_app_id and qq_app_secret after the developer scans a QR code with a bound QQ account — no more manually copying credentials from q.qq.com (#213 by @MuXinCG2004).

qr_register() flow drives q.qq.com's create_bind_task / poll_bind_result APIs.
AES-256-GCM with a client-generated key — the bot's client_secret never travels in plaintext.
Manual fallback on scan failure / cancel — no behavior change for users who'd rather paste credentials.
The bot must already be registered at q.qq.com; scanning binds your QQ account to the existing app.

🔄 `/model-fallback` Command

A new /model-fallback command (#196 by @mooshroom4422, closes #67) appends a model to the session-level fallback chain. If the primary model errors out (rate limit, transient 5xx, etc.), EvoSci automatically retries with the next model in the chain.

Same picker as /model — pick fallback models from any registered provider.
/model-fallback save persists the chain to ~/.config/evoscientist/config.yaml.
/model-fallback help lists all sub-commands (add / remove / list / save).
Real-time status messages during fallback attempts.
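
The retry behaviour of a fallback chain can be sketched as a loop over model names. This is an illustration under stated assumptions: call_with_fallback, AllModelsFailed, and the invoke callback are hypothetical names, and a real implementation would likely retry only on errors it considers retryable rather than on every exception.

```python
from typing import Callable, List, Sequence


class AllModelsFailed(RuntimeError):
    """Raised when the primary model and every fallback have errored."""


def call_with_fallback(
    prompt: str,
    chain: Sequence[str],
    invoke: Callable[[str, str], str],
) -> str:
    """Try each model in order and return the first successful reply.

    chain[0] is the primary model; the rest are fallbacks in the order
    they were added. invoke(model, prompt) is a stand-in for the real
    provider call.
    """
    errors: List[str] = []
    for model in chain:
        try:
            return invoke(model, prompt)
        except Exception as exc:  # sketch only: treats every error as retryable
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))
```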

🗜️ PruningCheckpointer — Multi-GB `sessions.db` Compaction

LangGraph's AsyncSqliteSaver writes a full state snapshot per super-step. EvoScientist's resume / HITL / sub-agent paths only ever read the latest checkpoint per (thread_id, checkpoint_ns), so every intermediate snapshot is dead weight. In one real-world case, a single user's sessions.db grew to 2.6 GB with 8,337 checkpoints across 153 threads; one thread alone held 1,086 checkpoints (#194).

Automatic per-write pruning — PruningCheckpointer (a subclass of AsyncSqliteSaver) overrides aput() to retain at most checkpoint_keep_per_thread (default 10) rows per (thread_id, checkpoint_ns). An outer asyncio.Lock makes super().aput() plus the prune atomic, so the just-written row is always kept even under concurrent writers.

One-time migration sweep — gated by PRAGMA user_version and a 100 MB size threshold. It prunes every existing (thread_id, checkpoint_ns) pair down to checkpoint_keep_per_thread rows and schedules a VACUUM via atexit. A live progress bar shows the ETA:

· Compacting sessions DB (14.53 GB, 99 thread-namespace pairs, est. ~2m 15s)
⠹ Compacting 27% (27/99 pairs · 38s elapsed · ~1m 41s remaining)
· ✓ Compaction done in 2m 18s

Safe across providers — filters by metadata.agent_name so co-located non-EvoSci agents are never touched.
Race fix (#195) — sweep now runs synchronously inside get_checkpointer() before yielding the saver, so concurrent channel aput()s no longer race the sweep for SQLite's EXCLUSIVE lock and trigger Error: database is locked.
Diagnostic — new EvoSci sessions stats CLI command surfaces DB state.
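
The keep-latest-N rule can be illustrated with plain sqlite3. This is a sketch, not PruningCheckpointer itself: the table below is a simplified stand-in for the real checkpoint schema, prune_thread is a hypothetical helper, and it assumes checkpoint_id values sort lexicographically in creation order.

```python
import sqlite3

KEEP_PER_THREAD = 10  # mirrors the documented default of checkpoint_keep_per_thread

# Simplified stand-in schema (assumption); the real table has more columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS checkpoints (
    thread_id     TEXT NOT NULL,
    checkpoint_ns TEXT NOT NULL,
    checkpoint_id TEXT NOT NULL,
    PRIMARY KEY (thread_id, checkpoint_ns, checkpoint_id)
)
"""


def prune_thread(
    conn: sqlite3.Connection,
    thread_id: str,
    checkpoint_ns: str,
    keep: int = KEEP_PER_THREAD,
) -> int:
    """Delete all but the newest `keep` rows for one (thread_id, checkpoint_ns).

    Returns the number of rows deleted. Rows of other threads or
    namespaces are never touched.
    """
    cur = conn.execute(
        """
        DELETE FROM checkpoints
        WHERE thread_id = ? AND checkpoint_ns = ?
          AND checkpoint_id NOT IN (
              SELECT checkpoint_id FROM checkpoints
              WHERE thread_id = ? AND checkpoint_ns = ?
              ORDER BY checkpoint_id DESC
              LIMIT ?
          )
        """,
        (thread_id, checkpoint_ns, thread_id, checkpoint_ns, keep),
    )
    conn.commit()
    return cur.rowcount
```

Deleting rows does not shrink the file on its own, which is why the sweep described above also schedules a VACUUM.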

🛡️ HITL Fix — Parallel Sub-Agents No Longer Crash

When the main agent dispatched 2+ parallel sub-agents (via task) that each called execute, LangGraph used to raise:

When there are multiple pending interrupts, you must specify the interrupt id when resuming.

Root cause: DeepAgents 0.5.5 propagated the top-level interrupt_on={"execute": True} we passed to create_deep_agent(...) into every declarative sub-agent and the auto-injected general-purpose sub-agent. Parallel sub-agents then produced multiple pending interrupts in the same checkpoint, which a single resume decision could not disambiguate.

Fix (#202): stop passing interrupt_on= to create_deep_agent. Instead, append HumanInTheLoopMiddleware directly to the main agent's stack in both _get_default_agent() and create_cli_agent(). Sub-agent shell commands remain protected by CustomSandboxBackend static checks (blocks sudo / chmod / dd / path traversal, 300s timeout, 100KB cap). Semantic model: user authorizes the high-level task goal → sub-agent executes within sandbox limits.

🪟 Context Window Patch Table

A small per-model context-window patch table (llm/context_window.py) for the newest models that LangChain provider packages haven't registered profile data for yet — claude-opus-4-7 (1M), gpt-5.5 (1.05M), kimi-k2.6 (262K), deepseek-v4-pro (1.05M), qwen3.6-flash (1M), mimo-v2.5 (1.05M), glm-5 (203K) (#191).

Without this, three things were broken for those models:

Status bar percentage showed <used>/200K instead of the real <used>/1M (the only fallback was DEFAULT_CONTEXT_WINDOW_FALLBACK = 200_000).
deepagents SummarizationMiddleware triggered far too early: its hardcoded 170K-token fallback meant a 1M-context model summarized at ~17% of its window instead of the ~85% that 170K represents on a 200K window.
ContextEditingMiddleware clear-tool-uses trigger fell back to a fixed 100K, similarly pessimistic.
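
The lookup order and the percentages quoted above can be checked with a few lines. This is a sketch under assumptions: resolve_context_window and CONTEXT_WINDOW_PATCHES are hypothetical names, the single table entry uses 1M rounded to 1,000,000 tokens, and the real llm/context_window.py may be organized differently.

```python
from typing import Optional

DEFAULT_CONTEXT_WINDOW_FALLBACK = 200_000  # fallback constant named in the notes
SUMMARIZATION_FALLBACK_TOKENS = 170_000    # hardcoded deepagents fallback per the notes

# One illustrative row; "1M" is rounded to 1_000_000 tokens (assumption).
CONTEXT_WINDOW_PATCHES = {"claude-opus-4-7": 1_000_000}


def resolve_context_window(model: str, profile_window: Optional[int] = None) -> int:
    """Pick a context window: upstream profile first, then patch table, then default."""
    if profile_window:
        # Upstream profile data wins, so stale patch rows are harmless.
        return profile_window
    return CONTEXT_WINDOW_PATCHES.get(model, DEFAULT_CONTEXT_WINDOW_FALLBACK)


def trigger_percent(trigger_tokens: int, window: int) -> int:
    """Express a token threshold as a rounded percentage of the window."""
    return round(100 * trigger_tokens / window)


# 170K on a 200K window is 85%; the same 170K on a 1M window is only 17%.
late = trigger_percent(SUMMARIZATION_FALLBACK_TOKENS, DEFAULT_CONTEXT_WINDOW_FALLBACK)
early = trigger_percent(SUMMARIZATION_FALLBACK_TOKENS, resolve_context_window("claude-opus-4-7"))
```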

The table is meant to be trimmed over time — once an entry shows up upstream in sst/models.dev, the corresponding row can be removed (the profile-reading layer wins automatically).

🔧 DeepSeek Cross-Provider Reasoning Fix

Fixes a 400 from DeepSeek V4 Pro thinking mode (deepseek-chat) when the conversation history contains assistant messages from another provider, from DeepSeek Flash, or from a session that predates the v0.0.9 reasoning_content capture patch (#192).

Trigger scenarios (any one is enough to break):

User chats with Anthropic / OpenAI for a few turns, then switches to DeepSeek V4 Pro
/resume of a thread previously driven by a non-thinking model
Resuming a thread created on an older EvoSci version

Fix: drop the is_reasoner gate in _patch_deepseek_reasoning_passback. Empty-string fallback now applies to every assistant message that lacks reasoning_content, regardless of which DeepSeek model is making the call. Safe because the patch is mounted only when provider == "deepseek", so all callers are guaranteed DeepSeek endpoints.
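
The empty-string fallback can be sketched on plain message dicts. This is illustrative only: add_reasoning_fallback is a hypothetical helper, it assumes OpenAI-style dicts with a "role" key, and the real _patch_deepseek_reasoning_passback works on whatever message objects the provider layer uses.

```python
from typing import Any, Dict, List


def add_reasoning_fallback(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Give every assistant message a reasoning_content field.

    Assistant messages that lack the field (or carry None) get an empty
    string; existing reasoning and all non-assistant messages are left
    unchanged. The input list and its dicts are not mutated.
    """
    patched: List[Dict[str, Any]] = []
    for msg in messages:
        if msg.get("role") == "assistant" and msg.get("reasoning_content") is None:
            msg = {**msg, "reasoning_content": ""}
        patched.append(msg)
    return patched
```

Because no model check is involved, the same call covers history produced by another provider, by a non-thinking model, or by an older session.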

🧹 Refactors & Cleanup

ChannelRuntime replaces cli/channel.py module globals (#197 by @din0s, closes #182). _cli_agent / _cli_thread_id were poked directly by /model, /channel, _auto_start_channel, and the Rich/TUI session loops; tests needed _reset_channel_globals autouse fixtures to keep state from leaking. Now: a single ChannelRuntime() per session, threaded through CommandContext. /model rebinds via ctx.channel_runtime.agent = new_agent. Behavior unchanged.
Skill install manifest (#199 by @din0s). Onboarding's "install skills" step kept showing already-installed skills as still-installable. skills_manager now writes a per-tier .installed.yaml sidecar mapping installed dir name → original install source (URL, shorthand, or local path). _step_skills checks both the workspace and global tiers AND the manifest, so packs that explode into many child dirs (e.g. EvoScientist/EvoSkills@skills → paper-writing/, evo-memory/, …) are correctly detected.
Dead MessageBus.dispatch_outbound removed (#205 by @MuXinCG2004). Two implementations shared the same bus.outbound queue: MessageBus.dispatch_outbound (subscriber-based) and ChannelManager._dispatch_outbound (registry lookup). Only the latter was wired up in production — the former was a footgun for anyone reading MessageBus in isolation. Net change: +14 / −139 across 3 files. No behavior change.
Markdown ATX heading spacing (#201) — ensure proper spacing for ATX headings in Markdown rendering so streaming output renders correctly.

📦 Dependency Bumps

deepagents 0.5.5 → 0.5.7 (#211)
- 0.5.6 — CompiledSubAgent now propagates lc_agent_name into metadata (langchain-ai/deepagents#3045). Direct improvement for our 6-level sub-agent name resolution chain (priority 1 reads metadata["lc_agent_name"]).
- 0.5.7 — auto-added GP sub-agent inherits parent permissions (langchain-ai/deepagents#3131). Preparatory for future FilesystemPermission adoption.
langchain-openrouter 0.2.1 → 0.2.3 — includes upstream _merge_reasoning_details() (langchain-ai/langchain#36401) which consolidates streaming-fragmented reasoning_details before serialization. Our local _patch_openrouter_reasoning_details workaround is now removed (it dropped fragments lossily; upstream merges them losslessly).
Transitive bumps: langchain-anthropic 1.4.2 → 1.4.3, langsmith 0.7.38 → 0.8.0.

What's Changed

Feat/llm context window patch table by @X-iZhang in #191
fix(deepseek): add empty-string fallback for reasoning_content in cro… by @X-iZhang in #192
Implement PruningCheckpointer for efficient checkpoint management and… by @X-iZhang in #194
Fix/sessions migration sweep race by @X-iZhang in #195
refactor(cli): replace channel module globals with ChannelRuntime by @din0s in #197
fix(onboard): detect installed skill packs via install manifest by @din0s in #199
feat(docker): official image with all runtime deps pre-installed by @din0s in #198
fix/cli hitl by @X-iZhang in #202
fix(markdown): ensure proper spacing for ATX headings in Markdown ren… by @X-iZhang in #201
Refactor sub-agent architecture and introduce async support by @X-iZhang in #200
chore(deps): bump deepagents 0.5.7 and langchain-openrouter 0.2.3, drop redundant patch by @X-iZhang in #211
refactor(channels): remove dead MessageBus dispatcher by @MuXinCG2004 in #205
feat(cmd): add /model-fallback command by @mooshroom4422 in #196
feat(wechat): add personal-WeChat (iLink) backend with QR login by @MuXinCG2004 in #212
feat: Implement async sub-agent auto-notification system by @X-iZhang in #214
fix: Improve watcher logic to prevent false-positive notifications on… by @X-iZhang in #216
feat(qq): add QR-code scan-to-configure onboarding for QQ Bot by @MuXinCG2004 in #213
Fix/async subagent model switch by @X-iZhang in #217

Full Changelog: v0.0.9...v0.1.0
