Releases: hoolulu/deep-research
v4.3.5 — Prompt optimization, pipeline improvements, config tuning | v4.3.5 Prompt 优化、流水线改进、配置调整
What's New
🔧 Prompt & Pipeline Optimization
Redesigned the chapter writing workflow to eliminate common quality issues at the source. Task 1 oracle now verifies outline completeness before proceeding. Task 3 chapter agent follows a structured subsection model (claim → evidence → analysis → counterpoint) and avoids redundant "In summary" filler. Per-section depth is now enforced naturally via tighter prompt constraints rather than post-hoc review.
🐛 Serial Progress Guard
Added a checkpoint file (serial_progress.txt) to the macOS serial mode workflow. When the main agent resumes after user interruption, it reads the file to determine the correct chapter to continue from — preventing accidental parallel dispatch or restart.
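A minimal sketch of how such a checkpoint could be read on resume. Only the file name `serial_progress.txt` comes from the release note; the file content (a single integer naming the last completed chapter) and both helper names are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def next_chapter(progress_file: Path, total_chapters: int) -> Optional[int]:
    """Return the 1-based chapter to write next, or None when all are done.

    Assumed format: the file holds the number of the last completed chapter.
    A missing or unreadable file means nothing has been written yet.
    """
    try:
        last_done = int(progress_file.read_text(encoding="utf-8").strip())
    except (FileNotFoundError, ValueError):
        last_done = 0
    candidate = last_done + 1
    return candidate if candidate <= total_chapters else None


def mark_done(progress_file: Path, chapter: int) -> None:
    """Record a finished chapter so an interrupted run can resume after it."""
    progress_file.write_text(str(chapter), encoding="utf-8")
```

Because the resume point is derived from the file rather than from the agent's memory of the conversation, an interrupted run continues at the next chapter instead of restarting or dispatching several chapters at once.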
📏 Depth Balance Check
New depth-balance script command detects chapters with significantly fewer lines than average. Integrated into the QA pipeline as an informational check — flags thin chapters without blocking assembly.
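The check can be pictured roughly as follows. This is a sketch, not the project's script: the function name and the 0.6 ratio used to decide what counts as "significantly fewer" are assumptions.

```python
from statistics import mean


def thin_chapters(line_counts: dict[str, int], ratio: float = 0.6) -> list[str]:
    """Return chapters whose line count is below ratio * average (assumed rule)."""
    if not line_counts:
        return []
    threshold = mean(line_counts.values()) * ratio
    return [name for name, count in line_counts.items() if count < threshold]
```

Since the check is informational, a caller would log the returned names as warnings and continue with assembly.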
📐 Character Count Alignment
Updated profiles.json with realistic character targets across all three modes (quick 15K / standard 25K / deep 45K). Chapter agents now write once without iterative trimming — a major source of latency. The stripped-char metric is maintained as the universal cross-language unit.
🧹 Word Count Display Fix
Fixed a bug where the final summary table displayed an incorrect word count because the agent extracted the wrong JSON field. The agent now reads `checks.word_count.count` from the qa-report JSON output.
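To illustrate the field path, here is a small sketch. The path `checks.word_count.count` is from the release note; the surrounding report structure and the function name are hypothetical.

```python
import json


def word_count_from_report(report_json: str) -> int:
    """Read the count from checks.word_count.count, not from any other field."""
    report = json.loads(report_json)
    return report["checks"]["word_count"]["count"]
```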
🇨🇳 中文说明
更新内容
🔧 Prompt 与流水线优化
从源头消除常见的报告质量问题。Task 1 oracle 现先生成大纲后自动检查覆盖面完整性。Task 3 章节 agent 遵循结构化子节模型(结论→数据→因果→反方),不再写"综上所述"式重述。各节深度通过更精准的 prompt 约束保障,而非事后 Review。
🐛 串行进度文件保护
macOS 串行模式新增 checkpoints 文件。主 agent 中断恢复后读取进度文件确定从哪章继续,消除意外并行派发或重头开始的问题。
📏 章节深度均衡检查
新增 depth-balance 脚本命令,自动检测行数显著低于平均值的章节。集成到 QA 流水线中,告警但不阻断。
📐 字数目标合理化
更新 profiles.json 三档字数目标(quick 15K / standard 25K / deep 45K)。章节 agent 一次性写完,不再反复修剪——大幅降低耗时。去空格字符数作为全局统一标尺保持不变。
🧹 字数显示 bug 修复
修复最终汇报表中字数显示错误的问题——agent 原先误读了 JSON 的顶层字段。现已改为从 checks.word_count.count 读取。
v4.3.0 — Parallel chapter dispatch, report quality fixes, new reports | v4.3.0 Task 3 并行派发、报告质量修复、新增报告
What's New
🔧 Parallel Chapter Dispatch
Task 3 now supports parallel chapter writing on non-macOS platforms using OpenCode's native task(run_in_background=true). macOS automatically falls back to serial mode via platform detection. Task IDs are persisted to a JSON file for recovery after interruption.
🔍 Dual-Engine Search with Quality Triggered Fallback
SearXNG and Exa are now detected in parallel — both search simultaneously when available. Results are merged and deduplicated. When quality is insufficient (URL count, freshness, source diversity), Layer 3 free sources (DuckDuckGo, Bing, domestic Chinese sources) are automatically triggered as a fallback.
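A rough sketch of the merge-and-fallback logic described above. The result shape (dicts with a `url` key), the function names, and both thresholds are assumptions; the freshness criterion is left out for brevity.

```python
from urllib.parse import urlsplit


def merge_results(*result_lists: list[dict]) -> list[dict]:
    """Merge engine results, keeping the first hit for each URL."""
    seen: set[str] = set()
    merged = []
    for results in result_lists:
        for item in results:
            # Simplified dedup key: ignore a trailing slash and letter case.
            key = item["url"].rstrip("/").lower()
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged


def needs_fallback(results: list[dict], min_urls: int = 10, min_domains: int = 4) -> bool:
    """True when URL count or source diversity is below the assumed thresholds."""
    domains = {urlsplit(r["url"]).netloc for r in results}
    return len(results) < min_urls or len(domains) < min_domains
```

When `needs_fallback` returns true, the caller would query the Layer 3 free sources and merge their results the same way.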
📊 Final Report Table
The completion summary now displays as a structured table with lang-adaptive labels. Includes quick-insights row from outline chapter 1, freshness data, and a search strategy description. The data_limited badge is inlined into the Data row.
📁 Report Management
Promo footer added to all 59 existing reports in one batch update. Showcase table in README lists featured reports with clickable titles. /research-update command now supports direct git pull and displays project Star History.
🐛 Quality Fixes
Fixed language contamination where non-English characters leaked into report metadata headers. Repaired TOC anchors in non-zh reports to match GitHub's auto-generated heading IDs. Restored missing promo footer on GenAI Enterprise report.
🇨🇳 中文说明
更新内容
🔧 Task 3 并行派发
支持非 macOS 平台并行撰写各章,使用 OpenCode 原生 task(run_in_background=true)。macOS 通过平台检测自动切换串行模式。task-id 持久化到 JSON 文件,中断后可恢复。
🔍 双引擎搜索 + 质量触发补强
SearXNG 和 Exa 并行检测,同时可用时同时搜索,结果合并去重。搜索结果质量不足时(URL 数量、时效性、来源多样性)自动触发 Layer 3 免费源补强。
📊 最终汇报表格
汇报总结改为结构化表格,标签按语言自适应。包含第 1 章观点速览行、数据新鲜度、搜索策略描述。data_limited 徽标并入 Data 行。
📁 报告管理
一次性为全部 59 份报告添加推广尾链。README 新增展示表陈列精选报告。/research-update 命令支持 git pull 直拉,显示项目 Star History。
🐛 质量修复
修复非英文字符泄漏到报告元数据的问题。修复非中文报告的 TOC 锚点与 GitHub 自动生成 ID 不匹配的问题。恢复 GenAI Enterprise 报告缺失的推广尾链。
v4.2.2 — platform-aware chapter dispatch (macOS serial workaround)
v4.2.2
What's Changed
- Platform detection: Task 3 now runs `uname -s` before dispatching chapters
  - macOS (Darwin): serial dispatch, `task(run_in_background=false)` in a loop, one chapter at a time. No dependency on oh-my-openagent's background-task notification system, which has known bugs on macOS (issues #4721, #4874, #5089 in code-yeongyu/oh-my-openagent)
  - Other platforms (Windows, Linux, Codex CLI): parallel dispatch, `task(run_in_background=true)` as before, keeps full speed
- New report: added 1 new Chinese research report (AI programming tools trends 2026)
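The platform check above can be sketched in Python as follows. The skill itself runs `uname -s`; using `platform.system()` here and the function name `dispatch_mode` are assumptions made for a portable example.

```python
import platform
from typing import Optional


def dispatch_mode(system: Optional[str] = None) -> str:
    """Return "serial" on macOS (Darwin) and "parallel" everywhere else."""
    # platform.system() reports "Darwin" on macOS, matching `uname -s` there.
    system = system or platform.system()
    return "serial" if system == "Darwin" else "parallel"
```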
v4.2.1 — update command, star history, minor fix
v4.2.1
What's Changed
- update command: now directly pulls latest code from main branch — no version check, no confirmation required
- README: added Star History chart at the bottom
- Reports: added 2 new Chinese research reports (toy industry, Douyin personal branding)
- AGENTS.md: added bilingual commit message requirement (English first, Chinese second)
Known Issue — macOS (oh-my-openagent only)
On macOS, due to a known bug in oh-my-openagent (issues #4721, #4874, #5089), after all chapters are written in parallel (Task 3), the main agent may not automatically proceed to assembly. Workaround: type 继续 or continue once to resume.
This only affects oh-my-openagent on macOS. Other platforms (Codex CLI, Windows omo) are not affected.
更新内容
- update 命令:改为直接从 main 分支拉取最新代码,不再比对版本号,无需用户确认
- README:底部添加 Star History 星标轨迹图
- 新报告:新增 2 篇中文调研报告(玩具行业、抖音个人自媒体生态)
- AGENTS.md:新增 git commit 双语提交要求(英文在前,中文在后)
已知问题 — macOS(仅 oh-my-openagent)
在 macOS 上,由于 oh-my-openagent 的已知 bug(#4721、#4874、#5089),章节并行撰写完毕后,主 agent 可能不会自动进入装配环节。临时解决方案:回复 继续 或 continue 即可恢复。
此问题仅影响 macOS 版 oh-my-openagent。其他平台(Codex CLI、Windows omo)不受影响。
v4.2.0
What's Changed
Fix: Task 3 switched from parallel background dispatch to serial synchronous chapter writing to bypass OpenCode v1.16 background notification regression. Chapters are now written one at a time (run_in_background=false) — slightly slower but 100% reliable.
New Reports (all generated in this session):
- 8 new English reports: AI Chip, Cross-border E-commerce, EV Battery, GenAI Adoption, Data Center Energy, Autonomous Vehicles, Quantum Computing, Edge Computing
- 10 new non-English reports: Hindi/Indonesian/Italian/Dutch/Polish/Swedish/Thai/Turkish/Vietnamese — covering agriculture, digital economy, luxury fashion, semiconductors, IT outsourcing, biotech, tourism, energy, electronics
Other:
- data-pool.json is now written with Python `json.dump` (bypasses the write tool's size limit)
- Offline mode (Step 0.5) merged into main workflow
- Language output guard strengthened
- Documentation updated (parallel → serial references)
v4.2.0
改动内容
修复: Task 3 从并行后台派发改为串行同步撰写章节,绕过 OpenCode v1.16 后台通知回归问题。逐章同步写入(run_in_background=false)——稍慢但 100% 可靠。
新增报告(本次会话生成):
- 8 篇英文报告:AI 芯片、跨境电商、EV 电池、GenAI 采用、数据中心能耗、自动驾驶、量子计算、边缘计算
- 10 篇多语言报告:印地语/印尼语/意大利语/荷兰语/波兰语/瑞典语/泰语/土耳其语/越南语——覆盖农业、数字经济、奢侈品、半导体、IT外包、生物科技、旅游、能源、电子
其他优化:
- data-pool.json 改用 Python json.dump 写入(绕过 write 工具大小限制)
- 离线模式(Step 0.5)合并到主流程
- 语言输出加固
- 文档更新(并行→串行表述)
v4.1.0 — Offline Mode for Local File Research
What's New
🔥 Offline Mode
- The skill can now work with local files only — no internet connection required
- Automatically detects user intent from the first prompt (mentions of local files, "don't search", "offline mode") via LLM semantic understanding
- Supports `.md`, `.txt`, `.pdf`, `.docx` file formats
- Auto-installs PyPDF2 for PDF text extraction when the model cannot read PDF natively
- Auto-installs python-docx for DOCX parsing
How it works
When a user includes local file paths and indicates no online search:
- Detects offline intent via LLM semantic understanding (Step 0.5)
- Skips all web search, scraping, and data collection (Task 2 offline branch)
- Reads local files directly via read/glob tools
- Builds the same structured data-pool.json as online mode
- Reuses existing Task 3 (chapter writing) and Task 4 (assembly/QA) unchanged
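The offline branch can be pictured with a short sketch. Everything structural here is an assumption: the real `data-pool.json` schema is not shown in these notes, so the keys below are hypothetical. PDF and DOCX extraction is deliberately left out.

```python
from pathlib import Path

TEXT_SUFFIXES = {".md", ".txt"}  # formats readable as plain text


def build_offline_pool(paths: list[str]) -> dict:
    """Collect local text files into a hypothetical data-pool structure."""
    sources, skipped = [], []
    for raw in paths:
        path = Path(raw)
        if path.suffix.lower() in TEXT_SUFFIXES:
            # utf-8-sig also accepts files that start with a BOM.
            text = path.read_text(encoding="utf-8-sig")
            sources.append({"path": str(path), "chars": len(text), "text": text})
        else:
            # PDF/DOCX and unsupported types are not handled in this sketch.
            skipped.append(str(path))
    return {"mode": "offline", "sources": sources, "skipped": skipped}
```

Producing the same structure as the online branch is what lets the later tasks run unchanged.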
Bug Fixes
- Fix duplicate report files with same title in reports/ directory
- Fix README cost estimates with DeepSeek pricing source URL
- Fix English README unit error in cost section
🇨🇳 中文说明
v4.1.0 - 本地文件离线调研模式
新增功能
🔥 离线模式
- Skill 现支持纯本地文件调研,无需联网
- 自动通过 LLM 语义理解识别用户意图(提到本地文件、要求不联网等),非关键词匹配
- 支持 `.md`、`.txt`、`.pdf`、`.docx` 四种文件格式
- AI 自动安装 PyPDF2 提取 PDF 文本(无需用户操作)
- AI 自动安装 python-docx 解析 DOCX 文件
工作原理
用户输入本地文件路径并指示不联网后,系统:
- 识别离线意图(Step 0.5,LLM 语义理解)
- 跳过所有搜索、抓取流程(Task 2 离线分支)
- 读取本地文件
- 构建标准 data-pool.json(与在线模式格式完全一致)
- 复用现有的章节撰写(Task 3)和装配质检(Task 4),零改动
Bug 修复
- 修复 reports/ 目录同标题报告重复问题
- 修复 README 成本估算数据,补充 DeepSeek 定价来源 URL
- 修复英文版 README 成本单位错误
Full Changelog: v4.0.0...v4.1.0
v4.0.0 — Multilingual pipeline, language-aware search, reports reorganization
Changes from v3.0.2
🌍 Fully multilingual final report
- All summary labels (outline/data/report/chapters/sources/facts/lines/chars/min) now dynamically translate to `$LANG`: zh→中文, en→English, fr→Français, ja→日本語, ru→Русский, etc.
- Chapter list heading also translated. Previously only zh/en were supported; now all 19 languages.
🔍 Language-aware search source filtering
- Non-Chinese research (`$LANG != "zh"`) now skips Chinese-only search engines (cn.bing.com, sogou, 360) and all B-class Chinese sources (zhihu, 36kr, CSDN, etc.) to eliminate irrelevant results.
- Generic search engines now get locale parameters: Brave `&country={COUNTRY}`, Mojeek `&lang={LANG}`.
- Regional engines added: Yandex for Russian (`ru`), Yahoo JP for Japanese (`ja`).
- LANG→COUNTRY mapping table added to SKILL.md for Task 2 variable replacement.
📁 Reports organized by language
- Reports now saved to `reports/$LANG/` subdirectories (e.g., `reports/zh/`, `reports/en/`, `reports/fr/`).
- Existing 38 reports classified and moved into their respective language directories.
🔧 Windows compatibility improvements
- Filename sanitization (dr_gen.py): Windows-invalid characters (`<>:"/\|?*`) replaced with `-`; trailing dots/spaces trimmed.
- Zero-byte file cleanup: Task 4 deletes stale 0-byte stubs before assembly to prevent silent failures.
- `os.makedirs(dirname(output), exist_ok=True)` added as safety net in dr_gen.py.
- `.gitignore` updated: `tmp/`, `language.txt`, `start_time.txt` now ignored.
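A minimal sketch of the sanitization and the directory safety net, assuming the rules exactly as listed above. The function names are hypothetical and this is not the code in `dr_gen.py`.

```python
import os
import re

_INVALID = re.compile(r'[<>:"/\\|?*]')  # characters Windows rejects in file names


def sanitize_filename(name: str) -> str:
    """Replace Windows-invalid characters with '-' and trim trailing dots/spaces."""
    return _INVALID.sub("-", name).rstrip(". ")


def write_report(output: str, text: str) -> None:
    """Create the parent directory if needed, then write the report as UTF-8."""
    os.makedirs(os.path.dirname(output) or ".", exist_ok=True)
    with open(output, "w", encoding="utf-8") as fh:
        fh.write(text)
```

The `or "."` guard covers a bare file name, for which `os.path.dirname` returns an empty string that `os.makedirs` would reject.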
🔄 Pipeline restructuring
- Setup phase extracted: TMPDIR creation, TOOLSDIR/PROMPTSDIR detection, and file reading now happen before Step 0 (language detection). Previously Step 0 referenced `{TMPDIR}` before it was created, causing `language.txt` to be written to the wrong location on first run.
- Language detection now announces result: `🌐 Language detected: en` after completion.
- "禁止" (prohibited) rule updated: clarifies handoff file reads (outline.json, manifest.json) are allowed between tasks; only search calls and data processing must stay within sub-agents.
✅ QA improvements
- TOC heading whitelist expanded in `dr_check.py`: now includes all 19 language variants (目次/목차/Índice/Table des matières/Inhaltsverzeichnis/etc.); previously it covered only English, Chinese, and German.
- Final report template requires all labels to be in `$LANG` (no more Chinese labels appearing in French research output).
Files changed
- `SKILL.md`: Setup phase, search source filtering table, `reports/$LANG/`, final report multilingual template, updated variable mappings
- `prompts/task2_data_collection.md`: Search source language filtering, regional engines, LANG/COUNTRY variables
- `prompts/task4_assembly.md`: Output path changed to `reports/{LANG}/`
- `tools/dr_gen.py`: Filename sanitization, `os.makedirs` safety net
- `tools/dr_check.py`: TOC heading whitelist expanded to 19 languages
- `.gitignore`: New ignores for tmp/ and temp files
- `VERSION`: 3.0.2 → 4.0.0
- `reports/`: Existing 38 files reorganized by language subdirectory
v4.0.0 — 全链路多语言、搜索源按语言过滤、报告按语言分类
相对于 v3.0.2 的变更
🌍 最终汇报完全多语言化
- 所有摘要标签(大纲/数据/报告/章/来源/事实/行/字/分钟)根据 `$LANG` 动态翻译:zh→中文、en→English、fr→Français、ja→日本語、ru→Русский 等。
- 章节列表标题同步翻译。此前仅支持 zh/en 两种,现覆盖全部 19 种语言。
🔍 搜索源按语言过滤
- 非中文调研(`$LANG != "zh"`)跳过中文专用搜索引擎(cn.bing.com、搜狗、360)和 B 类中文源(知乎、36氪、CSDN 等),避免噪音结果。
- 通用搜索引擎加 locale 参数:Brave `&country={COUNTRY}`、Mojeek `&lang={LANG}`。
- 新增区域引擎:俄语用 Yandex,日语用 Yahoo JP。
- 在 SKILL.md 中添加 LANG→COUNTRY 映射表用于 Task 2 变量替换。
📁 报告按语言分类
- 报告保存到 `reports/$LANG/` 子目录(如 `reports/zh/`、`reports/en/`、`reports/fr/`)。
- 38 份现有报告已分类移入对应语言目录。
🔧 Windows 兼容性改进
- 文件名净化(dr_gen.py):Windows 非法字符(`<>:"/\|?*`)替换为 `-`;尾部句点和空格去除。
- 零字节残留清理:Task 4 在装配前删除所有 0 字节文件,防止静默失败。
- dr_gen.py 写文件前加 `os.makedirs(dirname(output), exist_ok=True)` 兜底。
- `.gitignore` 更新:新增 `tmp/`、`language.txt`、`start_time.txt`。
🔄 流程重构
- 分离出 Setup 阶段:TMPDIR 创建、TOOLSDIR/PROMPTSDIR 确定、文件读取,现在都在 Step 0 语言判定之前完成。此前 Step 0 引用 `{TMPDIR}` 时目录还未创建,导致 `language.txt` 第一次被写到错误位置。
- 语言判定后向用户公告结果:`🌐 Language detected: en`。
- 更新"禁止"规则:明确 Task 间 handoff 文件读取(outline.json、manifest.json)不受限;只有搜索引擎调用和数据处理必须在子 agent 内完成。
✅ QA 改进
- `dr_check.py` 的 TOC 标题白名单扩展到 19 种语言(目次/목차/Índice/Table des matières/Inhaltsverzeichnis 等),此前只有英文/中文/德语三项。
- 最终汇报模板强制全部标签按 `$LANG` 翻译(不再出现法语调研结果显示中文标签的问题)。
变更文件
- `SKILL.md`:Setup 阶段、搜索源过滤表、`reports/$LANG/`、最终汇报多语言模板、变量映射更新
- `prompts/task2_data_collection.md`:搜索源语言过滤、区域引擎、LANG/COUNTRY 变量
- `prompts/task4_assembly.md`:输出路径改为 `reports/{LANG}/`
- `tools/dr_gen.py`:文件名净化、`os.makedirs` 兜底
- `tools/dr_check.py`:TOC 标题白名单扩展到 19 种语言
- `.gitignore`:新增 tmp/ 和临时文件忽略
- `VERSION`:3.0.2 → 4.0.0
- `reports/`:38 份现有报告按语言子目录重组
v3.0.2
v3.0.2 — Encoding fixes & faster language detection
Changes from v3.0.1
🔧 Encoding (Windows CP936 pipe fix)
- Task 1 oracle output no longer passes through a shell pipe, which corrupts non-ASCII text on Windows. Instead, the oracle outputs JSON in its response → the main agent uses the `write` tool to create the file, bypassing the CP936 encoding issue entirely.
- This affects Task 1 (outline generation). Task 2/3/4 already used file-based I/O and were not affected.
⚡ Language detection speedup
- Replaced `detect_lang.py` (213-line Python script) with LLM-based language detection. The main agent now determines the language directly from the topic text and writes `language.txt` via the `write` tool.
- Eliminates the 1-3 second `python3` → `python` fallback delay on Windows.
- Supports 19 languages with "en" default fallback.
📝 README improvements
- Non-OpenCode installation prompt updated: step 3 now adapts per-tool (Codex CLI → skill, Claude Code → Hook, Cursor → custom commands, others → auto-detect).
- Both Chinese and English READMEs updated.
Files changed
- `SKILL.md`: Step 0 rewritten (LLM detection, removed `detect_lang.py`); Task 1 flow fixed (oracle → response → main agent write); cross-platform encoding section updated.
- `prompts/task1_oracle.md`: Reverted to oracle outputting JSON in response body; added `{LANG}` variable.
- `README.md` / `README_EN.md`: Non-OpenCode install prompts adapted per tool.
- `tools/detect_lang.py`: Removed (replaced by LLM-based detection).
- `VERSION`: 3.0.0 → 3.0.2
v3.0.2 — 编码修复 & 语言检测提速
相对于 v3.0.1 的变更
🔧 编码修复(Windows CP936 pipe 问题)
- Task 1 oracle 的 JSON 不再经 shell pipe 回传(经 pipe 回传时,俄语/日语等非 ASCII 字符会被 Windows CP936 编码损坏)。改为 oracle 在 response body 中输出 JSON → 主 agent 用 `write` 工具写文件,完全绕过 pipe。
- Task 2/3/4 原本就走文件 I/O,无此问题。
⚡ 语言检测提速
- 删除 `detect_lang.py`(213 行 Python 脚本),改为主 agent 直接根据主题文本判定语言,用 `write` 工具写 `language.txt`。
- 消除了 Windows 上 `python3` → `python` 回退的 1-3 秒延迟。
- 支持 19 种语言,不确定时默认英文。
📝 README 改进
- 非 OpenCode 安装提示词按工具分路线(Codex CLI → skill、Claude Code → Hook、Cursor → 自定义指令)。
- 中英文版同步更新。
变更文件
- `SKILL.md`:Step 0 重写(LLM 判定、删除 detect_lang.py);Task 1 流程修复(oracle → response → 主 agent write);跨平台编码章节更新。
- `prompts/task1_oracle.md`:恢复 oracle 输出 JSON 到回答 body;添加 `{LANG}` 变量。
- `README.md` / `README_EN.md`:非 OpenCode 安装提示词按工具分路线。
- `tools/detect_lang.py`:已删除(LLM 判定替代)。
- `VERSION`:3.0.0 → 3.0.2
v3.0.1
v3.0.1 — Windows CP936 Encoding Compatibility + BOM Tolerance
🐛 Bug fix: Windows non-English language encoding crash (core fix)
Root cause: the Windows PowerShell 5.1 console encoding, CP936 (GBK), cannot represent 18 non-English languages (Cyrillic, CJK, Arabic, Thai, etc.), causing:
- `detect_lang.py` receives non-ASCII topics via argv in garbled form
- PowerShell `Set-Content -Encoding UTF8` adds a BOM → Python `json.load()` crashes
- Shell redirect/pipe encodes via CP936, corrupting non-ASCII text
Fix (three layers of defense):
| Layer | Fix | File |
|---|---|---|
| Input | `detect_lang.py` gains `--file`/`--output` args, bypassing shell argv entirely | `tools/detect_lang.py` |
| Read | All Python file reads use `encoding='utf-8-sig'` (BOM auto-stripped) | `tools/dr_tools.py`, `dr_check.py`, `dr_gen.py` |
| Output | `sys.stdout.reconfigure(encoding='utf-8')` for UTF-8 stdout | `tools/detect_lang.py` |
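The read and output layers can be illustrated with a short sketch. The two calls (`encoding='utf-8-sig'` and `sys.stdout.reconfigure`) are the ones named in the fix; the helper names are hypothetical and this is not the project's code.

```python
import json
import sys


def load_json(path: str):
    """Read JSON that may start with a UTF-8 BOM (e.g. written by PowerShell 5.1)."""
    # utf-8-sig strips a leading BOM if present and otherwise behaves like utf-8.
    with open(path, encoding="utf-8-sig") as fh:
        return json.load(fh)


def force_utf8_stdout() -> None:
    """Emit UTF-8 regardless of the console code page (Python 3.7+)."""
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")
```

Opening the same file with plain `utf-8` leaves the BOM in the decoded text, and `json.load` then raises a `JSONDecodeError`, which is the crash described in the root cause.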
📝 Documentation
- SKILL.md Step 0 updated to cross-platform `--file` + `--output` workflow
- New SKILL.md section 9: Cross-platform Encoding Standards (5 hard rules)
✅ Verified
10 languages end-to-end: zh/ja/ko/ru/ar/th/de/es/en/uk — all pass
All pass: True
Full changelog: git log v3.0.0..v3.0.1
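v3.0.0: Multilingual support + parallel validation + data quality hardening
v3.0.0: Multilingual support · Parallel validation · Data quality hardening · Full fault tolerance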
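🆕 Multilingual support (core new capability)
- Step 0 language detection: `detect_lang.py` automatically detects the topic language, and all output follows `$LANG`
- 19 languages supported: the `lang_config.py` configuration table covers Chinese, Japanese, Korean, English, and other major languages
- Bilingual docs: added `README_EN.md` (English version)
- Search keyword language matches the topic: Chinese topics use Chinese keywords, English topics use English keywords
- Chapter agents do not hard-code a language: they follow the `$LANG` variable and make no language assumptions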
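⚡ Parallel validation & data quality
- `validate-all-chapters`: ThreadPoolExecutor validates all chapters in parallel, replacing chapter-by-chapter serial validation
- `detect-engine`: auto-detects three engines: SearXNG / Exa / Web Sources
- `check-datapool`: returns structured `source_count` / `fact_count` statistics
- Data-limited detection: automatically identifies data-limited scenarios and inserts a notice into the report
- Prompts enforce script-based statistics plus a fallback, to avoid omissions or empty data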
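Parallel validation with `ThreadPoolExecutor` can be sketched as below. The per-chapter check is a stand-in invented for this example (heading plus body), and the function signatures are assumptions; the real checks in `dr_tools.py` are not shown in these notes.

```python
from concurrent.futures import ThreadPoolExecutor


def validate_chapter(text: str) -> dict:
    """Stand-in check: a chapter passes if it has a heading and some body."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ok = bool(lines) and lines[0].startswith("#") and len(lines) > 1
    return {"ok": ok, "lines": len(lines)}


def validate_all_chapters(chapters: dict[str, str], workers: int = 4) -> dict[str, dict]:
    """Validate every chapter concurrently and return results keyed by name."""
    names = list(chapters)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        results = pool.map(validate_chapter, (chapters[n] for n in names))
        return dict(zip(names, results))
```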
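🛡 Full fault tolerance
- Task 2 data collection automatically retries once after failure
- When a step fails, leftover artifacts are deleted before it is re-run
- Every script/command call has a fallback path (switch to `sys.executable` / check the path / direct Python implementation)
- After three failures, the specific problem is reported to the user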
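The clean-up-then-retry pattern above might look like this in outline. The function name, the single-artifact signature, and the printed message are assumptions for illustration.

```python
from pathlib import Path
from typing import Callable


def run_with_retry(step: Callable[[], None], artifact: Path, max_attempts: int = 3) -> bool:
    """Run a step; on failure delete its leftover artifact and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            step()
            return True
        except Exception as exc:
            # Remove partial output so the next attempt starts clean.
            artifact.unlink(missing_ok=True)
            print(f"attempt {attempt} failed: {exc}")
    return False
```

A `False` return corresponds to the point where the specific problem is reported to the user.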
📁 Reports directory reorganization
- `案例报告/` → `reports/` (all 20 reports migrated, git history preserved)
- 3 new multilingual reports:
  - 🇬🇧 Quantum Computing Market Outlook 2026
  - 🇯🇵 日本のアニメ産業のグローバル市場戦略
  - 🇰🇷 한국 반도체 산업의 글로벌 경쟁력 분석
- Removed the outdated Global SaaS ERP Market report
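🛠 Toolchain refactor
- `dr_gen.py`: command parsing refactored, parallel validation integrated
- `dr_check.py`: new quality-check commands (check-citations, check-headers, check-sections, etc.)
- `dr_tools.py`: new high-level commands `validate-all-chapters` / `detect-engine` / `qa-report`
- New `json-get` command; inline code is strictly forbidden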
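📝 Docs & config
- `SKILL.md`: fully rewritten, bilingual description, complete Step 0 language detection flow
- `README.md`: FAQ explains the version update policy, command list converted to a table, flowchart improved
- `RULES.md`: headings made more accessible, stale references fixed (Search Router → SearXNG)
- `TYPES.md`: classification criteria and numbering conventions updated
- `command/research.md` / `update.md`: synced with the latest workflow
- `.gitignore`: added `.DS_Store`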
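🧹 Cleanup
- Prompts uniformly numbered (`task1_oracle` / `task2_data_collection` / `task3_chapter_agent` / `task4_assembly`)
- Batch optimizations: merged yearly searches, removed the redundant manifest, corrected numbering, fixed citation format
- Chapter headings are generated uniformly from the outline; TOC anchors are clickable
- Citation format changed to clickable `(N)`; all old reports fixed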
🔗 Full changelog
See the commit log: `git log v2.1.1..v3.0.0`