v0.1.0
EvoScientist v0.1.0: Async Sub-Agents, Official Docker Image, Personal WeChat & Sessions DB Compaction
G'Day! v0.1.0 is the biggest release since launch. The headline is async sub-agents: writing-agent and data-analysis-agent now run on a managed langgraph dev subprocess, so 30-180s long-runners no longer block your chat, and you get proactive completion notifications without polling. We also ship an official multi-arch Docker image, a personal-WeChat (iLink) channel backend with QR login, the new /model-fallback command, a PruningCheckpointer that compacts multi-GB legacy sessions.db files in place, and a critical HITL fix that unblocks parallel sub-agents calling execute.
Async Sub-Agents (default-on)
writing-agent and data-analysis-agent typically run 30-180s. With sync delegation, your main chat was blocked the whole time. v0.1.0 transparently deploys the main agent and any sub-agent flagged async: true onto an auto-managed langgraph dev subprocess; the supervisor returns immediately with a task_id and keeps talking (#200).
- Single source of truth: per-agent YAML in EvoScientist/subagents/ (replaces the monolithic subagent.yaml); both the in-process sync path and the deployed async path read from the same file. To add a new async sub-agent: flip async: true, add a binding in langgraph_dev/graphs.py, and register it in langgraph_dev/langgraph.json.
- Auto-managed subprocess: langgraph_dev/manager.py handles start / health-check / stop. It uses filelock to serialize concurrent CLI invocations and psutil to detect stale PID files. Default port 6174 (Kaprekar's constant); onboarding has a "LangGraph Port" step that rejects already-occupied ports unless the port is already in use by EvoSci itself.
- Workspace sync widget: cli/widgets/workspace_sync_widget.py shows live progress when CLI workspace files sync to the langgraph dev subprocess.
- Graceful fallback: if langgraph dev fails to start, async sub-agents fall back to in-process synchronous delegation transparently.
- Toggle: set enable_async_subagents: false in ~/.config/evoscientist/config.yaml to keep everything sync.
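The occupied-port check in the "LangGraph Port" onboarding step can be approximated with the standard library. This is a minimal sketch, not EvoScientist's actual code: the helper name port_is_free is hypothetical, and the exception for a port already held by EvoSci's own subprocess is left out.

```python
import socket

DEFAULT_LANGGRAPH_PORT = 6174  # default port named in these release notes


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to (host, port).

    Hypothetical helper: it tries to bind the port and treats any OSError
    (typically "address already in use") as "occupied".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True
```

A wizard step could call port_is_free(DEFAULT_LANGGRAPH_PORT) and ask for another port when it returns False. A bind probe is only a point-in-time check, so the subprocess start itself still has to handle a port that gets taken in between.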
Async Sub-Agent Auto-Notifications
The main agent now learns about async completions automatically, with no more "check status" polling (#214).
╭── ✦ Agent Teams ✦ ───────────────────╮
│ ✓ writing-agent        success       │
│ ✗ data-analysis-agent  error         │
╰──────────────────────────────────────╯
- Watcher pattern: every async launch spawns a background asyncio task subscribed to client.runs.join_stream(thread_id, run_id) (SDK-native SSE long-poll). A terminal state pushes an AsyncTaskNotification onto a thread-safe queue; the CLI / TUI / serve poll loops drain, dedup, batch, and inject the result as a synthetic HumanMessage to wake the supervisor.
- Token-efficient: LLM input gets compact JSON per task line ({"agent": ..., "status": ..., "task_id": ...}); the decorative frame with colored status icons lives in a separate render path.
- Update support: update_async_task (continuing a conversation with a sub-agent) creates a new run_id on the same thread_id. The _watcher_by_thread dict plus replace semantics ensure the new watcher takes over; the old watcher is cancelled before the new run is awaited (no stale "success" race).
- Per-thread routing: notifications are scoped to their originating CLI thread, so /new between sub-agent launch and completion no longer injects updates into the wrong thread.
- False-positive watcher fix: a production-observed bug, where the SSE long-poll closed cleanly while the run was still alive on the server (httpx timeout, HTTP/2 GOAWAY, proxy idle keep-alive), caused fake "success" notifications. The watcher now uses a bounded reconnect loop and verifies via runs.get after every clean close (#216).
- /model propagation: /model switches now reach async sub-agents via RunnableConfig.configurable. Previously the deployed graphs froze the boot-time model, so async agents kept the old one forever (#217).
Official Docker Image
Multi-arch (amd64 / arm64) image at ghcr.io/evoscientist/evoscientist, built via GitHub Actions (#198 by @din0s, closes #175).
docker run -it --rm \
  -v evosci-data:/home/evosci/.evoscientist \
  -v $(pwd)/workspace:/workspace \
  ghcr.io/evoscientist/evoscientist

- Multi-stage build off python:3.11-slim-bookworm, uv-managed venv with the all-channels extra installed via uv sync --frozen --no-dev --no-editable.
- Node.js 24 LTS (copied from node:24-bookworm-slim) so npx works for the majority of MCP servers.
- uv binary so the MCP registry can install Python MCP servers on demand at runtime.
- One-volume persistence: /home/evosci/.evoscientist holds sessions DB, global skills, memories, and config (XDG_CONFIG_HOME points inside it; UV_TOOL_DIR / UV_TOOL_BIN_DIR redirect uv-tool artifacts there too). MCP servers installed during onboarding survive docker run --rm.
- Non-root user evosci for shell sandboxing.
- See the README's Docker section for derivation recipes for unbundled extras (stt, oauth, TinyTeX) and proxy / cert handling.
Personal WeChat (iLink) Channel Backend
A third WeChat backend, personal, rides Tencent's iLink Bot long-poll gateway, letting a personal WeChat account act as a bot alongside the existing WeCom and Official Account backends (#212 by @MuXinCG2004).
- No app-id / secret: credentials come from a QR-code scan (python -m EvoScientist.channels.wechat.serve --qr-login) and are persisted under DATA_DIR/wechat_personal/accounts/. The wizard can drive the QR scan inline.
- Backend picker in onboard: first-time users pick WeCom / 微信公众号 (Official Account) / 个人微信 (personal WeChat) without leaving the wizard.
- Group delivery is opt-in and disabled by default: iLink rarely delivers group messages for QR-login bot identities.
- AES-128-ECB CDN media protocol against novac2c.cdn.weixin.qq.com.
- New deps: qrcode>=7.4 and certifi>=2024.0 added to the wechat and all-channels extras.
Why personal vs the existing backends: WeCom and Official Account both require a registered organisation / app and issue stable credentials, which is fine for production bots but awkward for hobbyists. The iLink personal backend lowers that barrier: scan a QR with a personal WeChat, get a session token, run the bot.
QQ Bot QR-Code Onboarding
The config wizard can now auto-fill qq_app_id and qq_app_secret after the developer scans a QR code with a bound QQ account, so there is no more manual copying of credentials from q.qq.com (#213 by @MuXinCG2004).
- qr_register() flow drives q.qq.com's create_bind_task / poll_bind_result APIs.
- AES-256-GCM with a client-generated key: the bot's client_secret never travels in plaintext.
- Manual fallback on scan failure / cancel: no behavior change for users who'd rather paste credentials.
- The bot must already be registered at q.qq.com; scanning binds your QQ account to the existing app.
/model-fallback Command
A new /model-fallback command (#196 by @mooshroom4422, closes #67) appends a model to the session-level fallback chain. If the primary model errors out (rate limit, transient 5xx, etc.), EvoSci automatically retries with the next model in the chain.
- Same picker as /model: pick fallback models from any registered provider.
- /model-fallback save persists the chain to ~/.config/evoscientist/config.yaml.
- /model-fallback help lists all sub-commands (add / remove / list / save).
- Real-time status messages during fallback attempts.
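The retry behavior of a fallback chain can be illustrated with a small, self-contained sketch. It is not EvoSci's implementation: call_with_fallback, AllModelsFailed, and the invoke callable are hypothetical stand-ins, and the real code may restrict which errors trigger a fallback.

```python
from typing import Callable, Sequence


class AllModelsFailed(RuntimeError):
    """Raised when the primary model and every fallback errored out."""


def call_with_fallback(
    prompt: str,
    chain: Sequence[str],
    invoke: Callable[[str, str], str],
    on_status: Callable[[str], None] = print,
) -> str:
    """Try each model in order and return the first successful reply.

    `chain` is [primary, fallback_1, ...]; `invoke(model, prompt)` is a
    hypothetical callable standing in for the real model call.
    """
    errors = []
    for model in chain:
        try:
            return invoke(model, prompt)
        except Exception as exc:  # assumption: any error moves on to the next model
            errors.append(f"{model}: {exc}")
            on_status(f"{model} failed: {exc}")
    raise AllModelsFailed("; ".join(errors))
```

The on_status hook plays the role of the real-time status messages: each failed attempt is reported before the next model in the chain is tried.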
PruningCheckpointer: Multi-GB sessions.db Compaction
LangGraph's AsyncSqliteSaver writes a full state snapshot per super-step. EvoScientist's resume / HITL / sub-agent paths only ever read the latest checkpoint per (thread_id, checkpoint_ns), so every intermediate snapshot is dead weight. Real-world observation: a single user's sessions.db grew to 2.6 GB with 8,337 checkpoints across 153 threads; one thread alone held 1,086 checkpoints (#194).
- Automatic per-write pruning: PruningCheckpointer (subclass of AsyncSqliteSaver) overrides aput() to retain at most checkpoint_keep_per_thread (default 10) rows per (thread_id, checkpoint_ns). An outer asyncio.Lock makes super().aput() + prune atomic; the just-written row is always kept under concurrent writers.
- One-time migration sweep: gated by PRAGMA user_version and a 100 MB size threshold. Prunes every existing (thread, ns) pair to N and schedules VACUUM via atexit. Live progress bar with ETA:
  · Compacting sessions DB (14.53 GB, 99 thread-namespace pairs, est. ~2m 15s)
  ⠹ Compacting 27% (27/99 pairs · 38s elapsed · ~1m 41s remaining)
  · ✓ Compaction done in 2m 18s
- Safe across providers: filters by metadata.agent_name so co-located non-EvoSci agents are never touched.
- Race fix (#195): the sweep now runs synchronously inside get_checkpointer() before yielding the saver, so concurrent channel aput()s no longer race the sweep for SQLite's EXCLUSIVE lock and trigger Error: database is locked.
- Diagnostic: new EvoSci sessions stats CLI command surfaces DB state.
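The core "keep the newest N rows per (thread_id, checkpoint_ns)" step can be sketched in plain sqlite3. This is a simplified sketch, not the PruningCheckpointer code: it assumes a reduced checkpoints table whose checkpoint_id sorts chronologically, the function name prune_checkpoints is hypothetical, and the metadata.agent_name filter and the asyncio.Lock are omitted. It relies on SQLite window functions (SQLite 3.25 or newer).

```python
import sqlite3


def prune_checkpoints(conn: sqlite3.Connection, keep: int = 10) -> int:
    """Keep only the `keep` newest rows per (thread_id, checkpoint_ns).

    Assumes a simplified `checkpoints` table whose `checkpoint_id` sorts
    chronologically. Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM checkpoints
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY thread_id, checkpoint_ns
                           ORDER BY checkpoint_id DESC
                       ) AS rn
                FROM checkpoints
            )
            WHERE rn > ?
        )
        """,
        (keep,),
    )
    conn.commit()
    return cur.rowcount
```

Deleting rows does not shrink the file by itself, which is why the notes pair the sweep with a VACUUM.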
HITL Fix: Parallel Sub-Agents No Longer Crash
When the main agent dispatched 2+ parallel sub-agents (via task) that each called execute, LangGraph used to raise:
When there are multiple pending interrupts, you must specify the interrupt id when resuming.
Root cause: DeepAgents 0.5.5 propagated the top-level interrupt_on={"execute": True} we passed to create_deep_agent(...) into every declarative sub-agent and the auto-injected general-purpose sub-agent. Parallel sub-agents then produced multiple pending interrupts in the same checkpoint, which a single resume decision could not disambiguate.
Fix (#202): stop passing interrupt_on= to create_deep_agent. Instead, append HumanInTheLoopMiddleware directly to the main agent's stack in both _get_default_agent() and create_cli_agent(). Sub-agent shell commands remain protected by CustomSandboxBackend static checks (blocks sudo / chmod / dd / path traversal, 300s timeout, 100KB cap). Semantic model: the user authorizes the high-level task goal; the sub-agent then executes within sandbox limits.
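To make "static checks" concrete, here is a deliberately crude sketch of a pre-execution command filter. It is purely illustrative and is not CustomSandboxBackend: the function name is_command_allowed and the matching rules are assumptions, and the timeout and output cap are not modeled.

```python
import shlex

BLOCKED_COMMANDS = {"sudo", "chmod", "dd"}  # the commands listed in the notes


def is_command_allowed(command: str) -> bool:
    """Reject commands that contain a blocked word or a '..' path segment."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes etc.: refuse rather than guess
        return False
    for token in tokens:
        if token in BLOCKED_COMMANDS:
            return False
        if ".." in token.split("/"):  # naive path-traversal check
            return False
    return bool(tokens)  # an empty command is not allowed either
```

This filter is conservative on purpose: it also rejects harmless uses such as a blocked word passed as a plain argument. A production check would need to understand shell syntax far better than token matching does.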
Context Window Patch Table
A small per-model context-window patch table (llm/context_window.py) for the newest models that LangChain provider packages haven't registered profile data for yet: claude-opus-4-7 (1M), gpt-5.5 (1.05M), kimi-k2.6 (262K), deepseek-v4-pro (1.05M), qwen3.6-flash (1M), mimo-v2.5 (1.05M), glm-5 (203K) (#191).
Without this, three things were broken for those models:
- Status bar percentage showed <used>/200K instead of the real <used>/1M (the only fallback was DEFAULT_CONTEXT_WINDOW_FALLBACK = 200_000).
- deepagents SummarizationMiddleware triggered way too early: its hardcoded 170K tokens fallback meant a 1M-context model summarized at ~17% instead of ~85%.
- ContextEditingMiddleware clear-tool-uses trigger fell back to a fixed 100K, similarly pessimistic.
The table is meant to be trimmed over time: once an entry shows up upstream in sst/models.dev, the corresponding row can be removed (the profile-reading layer wins automatically).
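The lookup order and the "~17% instead of ~85%" arithmetic can be sketched as follows. This is an assumption-laden illustration, not the contents of llm/context_window.py: the dictionary name, the function names, and the rounded token counts are made up for the example; only DEFAULT_CONTEXT_WINDOW_FALLBACK and the model sizes come from the notes.

```python
from typing import Optional

DEFAULT_CONTEXT_WINDOW_FALLBACK = 200_000

# Rounded illustrative values taken from the sizes quoted in the notes;
# the real table may store exact token counts.
CONTEXT_WINDOW_PATCHES = {
    "claude-opus-4-7": 1_000_000,
    "gpt-5.5": 1_050_000,
    "kimi-k2.6": 262_000,
    "glm-5": 203_000,
}


def resolve_context_window(model: str, profile_window: Optional[int] = None) -> int:
    """Upstream profile data wins; then the patch table; then the fallback."""
    if profile_window is not None:
        return profile_window
    return CONTEXT_WINDOW_PATCHES.get(model, DEFAULT_CONTEXT_WINDOW_FALLBACK)


def trigger_fraction(trigger_tokens: int, window: int) -> float:
    """Fraction of the window in use when a fixed token trigger fires."""
    return trigger_tokens / window
```

With a 1M window, trigger_fraction(170_000, 1_000_000) is 0.17, which matches the ~17% early-summarization figure quoted for the hardcoded 170K fallback.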
DeepSeek Cross-Provider Reasoning Fix
Fixes a 400 from DeepSeek V4 Pro thinking mode (deepseek-chat) when the conversation history contains assistant messages from another provider, from DeepSeek Flash, or from a session that predates the v0.0.9 reasoning_content capture patch (#192).
Trigger scenarios (any one is enough to break):
- User chats with Anthropic / OpenAI for a few turns, then switches to DeepSeek V4 Pro
- /resume of a thread previously driven by a non-thinking model
- Resuming a thread created on an older EvoSci version
Fix: drop the is_reasoner gate in _patch_deepseek_reasoning_passback. Empty-string fallback now applies to every assistant message that lacks reasoning_content, regardless of which DeepSeek model is making the call. Safe because the patch is mounted only when provider == "deepseek", so all callers are guaranteed DeepSeek endpoints.
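The empty-string fallback can be pictured on plain message dictionaries. This is a sketch under assumptions, not _patch_deepseek_reasoning_passback itself: the function name fill_missing_reasoning_content and the dict-shaped messages are illustrative, and the real patch operates on the provider's request payload.

```python
from typing import Any


def fill_missing_reasoning_content(
    messages: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Return a copy in which every assistant message carries reasoning_content.

    Messages that already have a value are left untouched; the input list
    and its dictionaries are not mutated.
    """
    patched = []
    for message in messages:
        if (
            message.get("role") == "assistant"
            and message.get("reasoning_content") is None
        ):
            message = {**message, "reasoning_content": ""}
        patched.append(message)
    return patched
```

Because no model check is involved, assistant turns produced by another provider or by an older session get the same empty-string value as any other message without captured reasoning.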
Refactors & Cleanup
- ChannelRuntime replaces cli/channel.py module globals (#197 by @din0s, closes #182). _cli_agent / _cli_thread_id were poked directly by /model, /channel, _auto_start_channel, and the Rich/TUI session loops; tests needed _reset_channel_globals autouse fixtures to keep state from leaking. Now: a single ChannelRuntime() per session, threaded through CommandContext. /model rebinds via ctx.channel_runtime.agent = new_agent. Behavior unchanged.
- Skill install manifest (#199 by @din0s). Onboarding's "install skills" step kept showing already-installed skills as still-installable. skills_manager now writes a per-tier .installed.yaml sidecar mapping installed dir name → original install source (URL, shorthand, or local path). _step_skills checks both the workspace and global tiers AND the manifest, so packs that explode into many child dirs (e.g. EvoScientist/EvoSkills@skills → paper-writing/, evo-memory/, …) are correctly detected.
- Dead MessageBus.dispatch_outbound removed (#205 by @MuXinCG2004). Two implementations shared the same bus.outbound queue: MessageBus.dispatch_outbound (subscriber-based) and ChannelManager._dispatch_outbound (registry lookup). Only the latter was wired up in production; the former was a footgun for anyone reading MessageBus in isolation. Net change: +14 / -139 across 3 files. No behavior change.
- Markdown ATX heading spacing (#201): ensure proper spacing for ATX headings in Markdown rendering so streaming output renders correctly.
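The ChannelRuntime refactor follows a common pattern: move module-level state into an object that is passed around explicitly. The sketch below shows the shape of that pattern only; the field names, the CommandContext layout, and handle_model_switch are assumptions rather than the project's actual definitions.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ChannelRuntime:
    """Per-session state that previously lived in module globals (assumed fields)."""

    agent: Optional[Any] = None
    thread_id: Optional[str] = None


@dataclass
class CommandContext:
    # Each context owns its own runtime, so sessions cannot leak into each other.
    channel_runtime: ChannelRuntime = field(default_factory=ChannelRuntime)


def handle_model_switch(ctx: CommandContext, new_agent: Any) -> None:
    """Mirror of the rebind quoted in the notes."""
    ctx.channel_runtime.agent = new_agent
```

Since every CommandContext carries its own ChannelRuntime, tests can build a fresh context per case instead of resetting globals through an autouse fixture.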
Dependency Bumps
- deepagents 0.5.5 → 0.5.7 (#211)
  - 0.5.6: CompiledSubAgent now propagates lc_agent_name into metadata (langchain-ai/deepagents#3045). Direct improvement for our 6-level sub-agent name resolution chain (priority 1 reads metadata["lc_agent_name"]).
  - 0.5.7: auto-added GP sub-agent inherits parent permissions (langchain-ai/deepagents#3131). Preparatory for future FilesystemPermission adoption.
- langchain-openrouter 0.2.1 → 0.2.3: includes upstream _merge_reasoning_details() (langchain-ai/langchain#36401), which consolidates streaming-fragmented reasoning_details before serialization. Our local _patch_openrouter_reasoning_details workaround is now removed (it dropped fragments lossily; upstream merges them losslessly).
- Transitive bumps: langchain-anthropic 1.4.2 → 1.4.3, langsmith 0.7.38 → 0.8.0.
What's Changed
- Feat/llm context window patch table by @X-iZhang in #191
- fix(deepseek): add empty-string fallback for reasoning_content in cro… by @X-iZhang in #192
- Implement PruningCheckpointer for efficient checkpoint management and… by @X-iZhang in #194
- Fix/sessions migration sweep race by @X-iZhang in #195
- refactor(cli): replace channel module globals with ChannelRuntime by @din0s in #197
- fix(onboard): detect installed skill packs via install manifest by @din0s in #199
- feat(docker): official image with all runtime deps pre-installed by @din0s in #198
- fix/cli hitl by @X-iZhang in #202
- fix(markdown): ensure proper spacing for ATX headings in Markdown ren… by @X-iZhang in #201
- Refactor sub-agent architecture and introduce async support by @X-iZhang in #200
- chore(deps): bump deepagents 0.5.7 and langchain-openrouter 0.2.3, drop redundant patch by @X-iZhang in #211
- refactor(channels): remove dead MessageBus dispatcher by @MuXinCG2004 in #205
- feat(cmd): add /model-fallback command by @mooshroom4422 in #196
- feat(wechat): add personal-WeChat (iLink) backend with QR login by @MuXinCG2004 in #212
- feat: Implement async sub-agent auto-notification system by @X-iZhang in #214
- fix: Improve watcher logic to prevent false-positive notifications on… by @X-iZhang in #216
- feat(qq): add QR-code scan-to-configure onboarding for QQ Bot by @MuXinCG2004 in #213
- Fix/async subagent model switch by @X-iZhang in #217
Full Changelog: v0.0.9...v0.1.0