What Changed
- May 8, 2026 (v3.05.78): F-2/F-3 follow-ups + CI unblock (`feature/fix-f2`). Main has been red since `9c01237d` (the trading-agent #99 merge) because `tests/test_packaging.py::test_required_module_imports[modular.trading.ml]` (the issue #97 regression test) caught that `modular/trading/ml/features.py` and `modular/trading/portfolio.py` import numpy at module top while numpy is only in the `[trading]` extra, so `pip install .` shipped a broken wheel and #100 / #101 inherited the red. Two-commit fix on top of #101:
  (a) `fix(ci)`: drop the dead numpy import from `features.py`; defer numpy to inside `stacker.py:train()` / `predict_proba()`, past their early-return paths; gate `portfolio.py`'s numpy behind `try/except`; add `pytest.mark.skipif` on the optimizer / managed-portfolio / ML-training / factor-scan tests so lean-install CI skips them cleanly. Verified: a clean venv with only `[web,autosuggest]` (the exact CI install) gives 1075 passed, 11 skipped; full extras give 1086 passed, no regressions.
  (b) `fix(daemon)`: five F-2/F-3 follow-ups: move `monitor.scheduler.start(...)` past the listener bind in `cc_daemon/cli.py:cmd_serve` (so a misconfigured fetch/deliver can't fail before the daemon is reachable); add a `_foreign_daemon_running()` step-aside check at every scheduler loop tick to close the race where REPL `/monitor start` fires before the daemon writes its discovery file (both schedulers would otherwise race on `last_run_at`); flip `cc_daemon/schema.py` to `PRAGMA synchronous=NORMAL` (safe under WAL, 8× faster `EventBus.publish`: 305 μs/event → 39 μs/event, important for streaming agent output); clarify in `jobs.py` / `monitor/store.py` / `docs/architecture.md` that the JSON→SQLite migration is one-way (PR #101's wording implied a fallback read path that doesn't exist); update `docs/RFC/0002-daemon-foundation-roadmap.md` F-2/F-3 status from OPEN → MERGED.
  Branch: `feature/fix-f2`.
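
The `fix(ci)` change above relies on keeping an optional dependency out of module import time. A minimal sketch of that pattern follows; the function names and the sigmoid body are hypothetical stand-ins, not the code in `stacker.py` or `portfolio.py`.

```python
import importlib.util

# True only when the optional dependency is installed (no import side effects).
HAS_NUMPY = importlib.util.find_spec("numpy") is not None


def predict_proba(rows):
    """Hypothetical function with an early-return path ahead of the numpy import."""
    if not rows:
        return []              # early return: numpy is never touched
    import numpy as np         # deferred import, only paid when actually needed
    arr = np.asarray(rows, dtype=float)
    return (1.0 / (1.0 + np.exp(-arr))).tolist()


def safe_predict(rows):
    """Gate the numpy-backed path behind try/except so a lean install degrades cleanly."""
    try:
        return predict_proba(rows)
    except ImportError:
        return None


# In a test module, the same flag could gate numpy-only tests, e.g.
#   @pytest.mark.skipif(not HAS_NUMPY, reason="needs the [trading] extra")
```

With this shape, importing the module succeeds on a lean install, and only the code paths that truly need numpy can fail.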
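
For the `PRAGMA synchronous=NORMAL` follow-up above, this is roughly how the setting is applied with the standard-library `sqlite3` module. The file name and the `open_db` helper are assumptions for illustration; the real schema setup lives in `cc_daemon/schema.py` and may differ.

```python
import os
import sqlite3
import tempfile


def open_db(path):
    """Open a SQLite file in WAL mode with synchronous=NORMAL (hypothetical helper)."""
    conn = sqlite3.connect(path)
    # journal_mode returns the mode actually in effect ("wal" on a normal local file).
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # NORMAL syncs at WAL checkpoints instead of on every commit.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn, mode


tmpdir = tempfile.mkdtemp()
conn, mode = open_db(os.path.join(tmpdir, "events.db"))
# PRAGMA synchronous reads back as an integer: 0=OFF, 1=NORMAL, 2=FULL, 3=EXTRA.
sync_level = conn.execute("PRAGMA synchronous").fetchone()[0]
```

`synchronous` is a per-connection setting, so every connection that publishes events would need to issue the pragma after connecting.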
- Research lab Phase A: autonomous multi-day research; WeChat smart-reply + `/draft` semi-auto reply; reliability + UX hardening across the lab pipeline. Three surfaces shipped together:
  (a) The research lab is no longer single-shot. `/lab resume <run_id> [<stage>]` reconstructs `LabState` from SQLite to continue or rewind a run; `/lab iterate <run_id>` runs a 3-reviewer self-review on the final report (novelty / rigor / clarity / evidence, each scored 1-10), routes the lowest-scoring dimension to the corresponding stage (novelty→QUESTIONING, rigor→IMPLEMENTATION, clarity→DRAFTING, evidence→EXPERIMENT), rewinds and re-runs, and loops until `target_score` / `max_iterations` / plateau / budget stops it; `/lab backlog add <topic> --iterate --target=N --max=N --prio=N` queues many topics, and `/lab daemon start` runs them 24/7 in a single-worker loop with crash recovery (`reset_running_backlog` unsticks stale rows on next start); `/lab models` prints the effective per-role model, shows which API key drove each pick, and warns when reviewers span <N families (homogeneous review = no meta-loop signal); `/lab migrate-paths [--apply]` renames legacy `lab_xxx/` output dirs to the human-readable `<date>_<time>_<topic-slug>_<run_id_short>` form (e.g. `2026-05-08_14-30_post-transformer-architectures-survey_b16036de/`).
  (b) WeChat smart-reply panel. When a whitelisted contact sends an inbound message, an auxiliary cheap model drafts 3 candidate replies and pushes them as a panel to your `filehelper` (文件传输助手); reply with `1`/`2`/`3`/`AA 1` to send, freeform text to customise, `x` to skip, `q` for queue. State is SQLite-persisted at `~/.cheetahclaws/wx_smart_reply.db` (in-memory fallback on init failure); the contacts JSON at `~/.cheetahclaws/wx_contacts.json` is mtime-hot-reloaded; the bot-owner self-uid is auto-recorded on first inbound and excluded from smart-reply unconditionally, so your own messages always reach the agent regardless of whitelist contents.
  (c) `/draft <message>` slash command: a semi-automatic reply suggestion path for cases where the bot can't intercept the inbound directly (bot account ≠ user main account on iLink ClawBot).
  Three candidates are drafted via the auxiliary model, optionally tone-conditioned via `@<contact_uid_or_label>` against `wx_contacts.json`; when invoked from a bridge channel (WeChat / Telegram / Slack), candidates are also echoed back to the originating uid and stashed in `bridges.draft_cache`, so a digit-only reply (`1`/`2`/`3`) consumes the chosen text one-shot: no agent invocation, no smart-reply panel triggered.
  Reliability hardening on top of #88's MCP work: `research/http.py` now uses 429-aware backoff (10/30/60/120s vs 0.5/1/2/4s for 5xx) and honours `Retry-After` headers (capped at 180s); the lab surveyor stage grounds in real `research.aggregator.research()` hits before invoking the LLM (top-30 academic+tech results passed as context, persisted as a `survey_search_hits` artifact for replay), and the fabricated-citation rate drops sharply on tested topics; `_dedupe_self_repeat()` trims cheap-model degenerate sampling (`text == text+text`) before storage so reviewer prompts don't see doubled inputs; `_extract_numbered` dedupes by content (a questioner emitting `1..5\n1..5` keeps 5, not 10); the citation verifier now has a per-citation 30s `concurrent.futures` hard wall-clock limit (kills slow-loris sockets that urllib's socket timeout ignores) plus a 5-min stage-level cap with progress callbacks surfaced to `/lab logs` (the 11-min hang we saw in the field is gone).
  REPL ergonomics: `/lab daemon start` and `/lab start` now print the eventual `report.md` path up front and live-stream stage transitions to the terminal as they happen; `/lab status <run_id>` shows both new and legacy paths so the user can find old reports too; `/config` parses JSON-style values (lists, dicts, signed numbers, quoted strings), so `/config wechat_smart_reply_whitelist=["wxid_..."]` is no longer silently saved as a literal string; leading whitespace before `/` is now stripped before slash-dispatch (so a paste with a stray space still hits the dispatcher, not the agent).
  Tests: 884 passing (842 unit/integration + 22 e2e), zero regressions; ~80 new pytest cases covering iteration scoring, state reconstruction, backlog atomicity, verifier hard-timeout, slug edge cases, dedupe patterns, self-uid bypass.
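
The `/lab iterate` routing described above can be sketched as a small decision function. The dimension→stage table comes from the entry itself; averaging across reviewers, using the mean as the overall score, and the `plateau_eps` threshold are assumptions made for illustration, and the budget check is omitted.

```python
# Dimension -> stage mapping as listed in the changelog entry.
STAGE_FOR_DIMENSION = {
    "novelty": "QUESTIONING",
    "rigor": "IMPLEMENTATION",
    "clarity": "DRAFTING",
    "evidence": "EXPERIMENT",
}


def mean_scores(reviews):
    """Average each dimension across reviewers (assumed aggregation rule)."""
    return {d: sum(r[d] for r in reviews) / len(reviews) for d in STAGE_FOR_DIMENSION}


def next_action(reviews, iteration, target_score, max_iterations,
                prev_overall=None, plateau_eps=0.1):
    """Return ("stop", reason) or ("rewind", stage) for one iterate step."""
    scores = mean_scores(reviews)
    overall = sum(scores.values()) / len(scores)
    if overall >= target_score:
        return ("stop", "target_score")
    if iteration >= max_iterations:
        return ("stop", "max_iterations")
    if prev_overall is not None and overall - prev_overall < plateau_eps:
        return ("stop", "plateau")   # hypothetical plateau rule
    weakest = min(scores, key=scores.get)
    return ("rewind", STAGE_FOR_DIMENSION[weakest])


reviews = [
    {"novelty": 6, "rigor": 7, "clarity": 8, "evidence": 5},
    {"novelty": 7, "rigor": 6, "clarity": 8, "evidence": 4},
    {"novelty": 6, "rigor": 7, "clarity": 9, "evidence": 5},
]
```

Here evidence has the lowest mean, so a run that has not yet hit a stop condition would rewind to EXPERIMENT.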
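
The `<date>_<time>_<topic-slug>_<run_id_short>` directory form used by `/lab migrate-paths` could be produced along these lines. The slug rules, the 60-character limit, and the 8-character run-id prefix are assumptions inferred from the example path, not the project's actual implementation.

```python
import re
from datetime import datetime


def topic_slug(topic, max_len=60):
    """Lowercase, collapse every non-alphanumeric run to '-', trim to max_len."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    # Topics with no ASCII letters or digits would slug to "", so fall back.
    return slug[:max_len].rstrip("-") or "untitled"


def run_dir_name(topic, run_id, started_at):
    """Build '<date>_<time>_<topic-slug>_<run_id_short>' (short id = first 8 chars, assumed)."""
    return f"{started_at:%Y-%m-%d_%H-%M}_{topic_slug(topic)}_{run_id[:8]}"


example = run_dir_name(
    "Post-Transformer Architectures: Survey",
    "b16036de9f",
    datetime(2026, 5, 8, 14, 30),
)
```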
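
The 429-aware backoff described above reduces to picking a delay schedule by status code and letting a capped `Retry-After` override it. The sketch below uses the delays and cap quoted in the entry; the function name, the 0-based attempt index, and applying `Retry-After` to 5xx responses as well are assumptions, and only the delta-seconds form of the header is handled.

```python
RATE_LIMIT_DELAYS = (10.0, 30.0, 60.0, 120.0)   # 429 schedule from the changelog
SERVER_ERROR_DELAYS = (0.5, 1.0, 2.0, 4.0)      # 5xx schedule from the changelog
RETRY_AFTER_CAP = 180.0                         # seconds


def retry_delay(status, attempt, retry_after=None):
    """Seconds to wait before retry number `attempt` (0-based), or None to stop retrying."""
    if status == 429:
        schedule = RATE_LIMIT_DELAYS
    elif 500 <= status < 600:
        schedule = SERVER_ERROR_DELAYS
    else:
        return None                  # not a retryable status in this sketch
    if attempt >= len(schedule):
        return None                  # schedule exhausted
    if retry_after is not None:
        try:
            # Honour the server's hint, but never wait longer than the cap.
            return min(max(float(retry_after), 0.0), RETRY_AFTER_CAP)
        except ValueError:
            pass                     # HTTP-date form: fall back to the schedule
    return schedule[attempt]
```

A caller would sleep for the returned value between attempts and give up when it gets `None`.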
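
The two dedupe fixes above can be illustrated with a pair of small helpers. These are sketches of the described behaviour, not the real `_dedupe_self_repeat()` / `_extract_numbered`; the `min_half_len` guard, the whitespace tolerance, and the numbered-line regex are assumptions.

```python
import re


def dedupe_self_repeat(text, min_half_len=16):
    """If text is one block emitted twice back to back, return the block once."""
    stripped = text.strip()
    mid = len(stripped) // 2
    first, second = stripped[:mid].strip(), stripped[mid:].strip()
    # The length guard avoids collapsing short legitimate repeats such as "haha".
    if len(first) >= min_half_len and first == second:
        return first
    return text


_NUMBERED = re.compile(r"^\s*(\d+)[.)]\s+(.*\S)\s*$")


def extract_numbered(text):
    """Collect numbered-list items, keeping only the first occurrence of each content."""
    seen, items = set(), []
    for line in text.splitlines():
        match = _NUMBERED.match(line)
        if not match:
            continue
        content = match.group(2)
        key = " ".join(content.lower().split())   # case/whitespace-insensitive key
        if key not in seen:
            seen.add(key)
            items.append(content)
    return items
```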
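
A per-item hard wall-clock limit with `concurrent.futures`, as mentioned for the citation verifier above, might look like the following. This is a sketch under stated assumptions: `run_with_deadline` and the two demo checks are hypothetical, and a Python thread cannot be forcibly killed, so the stuck worker is abandoned rather than terminated.

```python
import concurrent.futures
import time


def run_with_deadline(fn, arg, timeout_s):
    """Run fn(arg) in a worker thread; return (status, value) within about timeout_s."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, arg)
    try:
        return ("ok", future.result(timeout=timeout_s))
    except concurrent.futures.TimeoutError:
        return ("timeout", None)
    except Exception:
        return ("error", None)
    finally:
        # wait=False: return to the caller immediately instead of joining the worker.
        pool.shutdown(wait=False)


def fast_check(x):
    """Demo stand-in for a citation lookup that answers promptly."""
    return x * 2


def slow_check(x):
    """Demo stand-in for a lookup stuck on a slow socket."""
    time.sleep(1.0)
    return x
```

The deadline bounds how long the caller waits regardless of what the underlying socket does, which is the property a plain socket timeout does not give when bytes keep trickling in.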
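
The `/config` value parsing described above can be approximated with `json.loads` plus a string fallback. The helper names are hypothetical, and the handling of a leading `+` is an assumption added because strict JSON rejects it.

```python
import json


def parse_config_value(raw):
    """Parse a JSON-style value; fall back to the raw string when it is not valid JSON."""
    raw = raw.strip()
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    if raw.startswith("+"):
        # JSON has no unary plus, so retry without it and accept only numbers.
        try:
            value = json.loads(raw[1:])
        except json.JSONDecodeError:
            return raw
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return value
    return raw


def parse_config_arg(arg):
    """Split 'key=value' and parse the value part."""
    key, sep, raw = arg.partition("=")
    if not sep:
        raise ValueError("expected key=value")
    return key.strip(), parse_config_value(raw)
```

With this approach a list literal becomes a real list, while an unquoted word such as a bare uid still round-trips as a plain string.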