v0.33.0
Features
- New `pensieve-search` agent skill ships in the repo by @arkohut
  - `skills/pensieve-search/SKILL.md` teaches Claude Code (and any agent that loads Skills) how to query `/api/search`, anchor on stable ids, and route citations through the entity detail URL
  - Previously lived under gitignored `.claude/skills`; now distributed as the source of truth so users get the same search recipes the maintainer uses
  - Encodes an "explore-first" strategy: run one wide query and inspect the `active_app` / `url` / `window` distribution before narrowing, since the topic the user names rarely shows up as its literal string in the captured corpus
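The "explore-first" step above amounts to tallying a field across the hits of one wide query before adding filters. A minimal sketch, assuming each hit is a dict carrying an `active_app` key; the hit shape, the sample data, and the helper name `field_distribution` are illustrative assumptions, not the documented `/api/search` response format:

```python
from collections import Counter


def field_distribution(hits, field):
    """Count how often each value of `field` appears across the hits."""
    # Missing or empty values are grouped under a placeholder label.
    return Counter(hit.get(field) or "(missing)" for hit in hits)


# Hypothetical hits from one wide query; the real response shape may differ.
sample_hits = [
    {"id": 1, "active_app": "Chrome", "url": "https://example.com/a"},
    {"id": 2, "active_app": "Chrome", "url": "https://example.com/b"},
    {"id": 3, "active_app": "Terminal", "url": None},
]

# The most common values suggest which app or site to narrow on next.
print(field_distribution(sample_hits, "active_app").most_common())
```

The dominant values in the tally are candidates for the next, narrower query.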
- Search now returns the real total result count by @arkohut
  - `/api/search` emits a real `out_of` count via a new abstract `count_full_text_matches` on `SearchProvider` (SQLite + Postgres implementations) instead of only reporting the current page slice
  - Hybrid search and the total count run in parallel on the server; collection-size count cached for 10s to keep facet rendering snappy
  - FTS scan, ranking, and stats aggregation unified at a 5000-row cap; the web UI renders "5,000+" once the cap clips so totals stay honest
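The capped count and the "5,000+" rendering can be illustrated with an in-memory SQLite table. This is a sketch under stated assumptions, not the project's `count_full_text_matches`: the `docs` table, the `LIKE` match standing in for FTS, and the helper names `capped_count` and `format_total` are all hypothetical, and treating "count reached the cap" as "clipped" is an assumption about the exact threshold.

```python
import sqlite3

CAP = 5000  # row cap taken from the release note


def capped_count(conn, needle, cap=CAP):
    """Count matching rows, scanning at most `cap` of them."""
    row = conn.execute(
        "SELECT COUNT(*) FROM (SELECT 1 FROM docs WHERE body LIKE ? LIMIT ?)",
        (f"%{needle}%", cap),
    ).fetchone()
    return row[0]


def format_total(count, cap=CAP):
    """Render an exact total, or '<cap>+' once the count has hit the cap."""
    return f"{cap:,}+" if count >= cap else f"{count:,}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?)",
    [("hello world",)] * 6000 + [("other",)] * 10,
)

print(format_total(capped_count(conn, "hello")))  # 5,000+
print(format_total(capped_count(conn, "other")))  # 10
```

Bounding the inner `SELECT` with `LIMIT` keeps the count cheap on very large hit sets, at the price of reporting only a lower bound once the cap is reached.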
- Search UX upgrades by @arkohut
  - Chinese queries are jieba-segmented to match the FTS tokenizer's indexed terms
  - Default time window narrowed to the last 3 months for faster, more focused results
  - New adaptive date-bucket facet: picks day / week / month buckets based on the result span
  - Stats facets now aggregate over the full FTS hit set (FTS-only, excluding vector neighbors) for stable counts, with the sample ordered by recency
  - `hybrid_search` exposes sub-phase timings, and OCR/VLM blobs are stripped from hit payloads to keep responses small
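The adaptive date-bucket facet above picks a granularity from the time span the results cover. A minimal sketch of that idea; the function name `pick_bucket` and the 14-day and 90-day thresholds are assumptions chosen for illustration, since the release note does not state the actual cut-offs:

```python
from datetime import datetime, timedelta


def pick_bucket(timestamps,
                day_max=timedelta(days=14),
                week_max=timedelta(days=90)):
    """Choose 'day', 'week', or 'month' buckets from the span of the results."""
    if not timestamps:
        return "day"  # nothing to bucket; fall back to the finest granularity
    span = max(timestamps) - min(timestamps)
    if span <= day_max:
        return "day"
    if span <= week_max:
        return "week"
    return "month"


print(pick_bucket([datetime(2024, 1, 1), datetime(2024, 1, 5)]))   # day
print(pick_bucket([datetime(2024, 1, 1), datetime(2024, 3, 1)]))   # week
print(pick_bucket([datetime(2024, 1, 1), datetime(2024, 12, 1)]))  # month
```

Keying the choice on the span keeps the number of facet buckets roughly bounded whether the hits cover a few days or most of a year.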
- Web UI rework by @arkohut
  - Repainted to a clean neutral + indigo accent palette; theme/language controls promoted to the header, slimmer footer, project slogan back in the home hero
  - Entity navigation: arrow keys step through search results (position shown in the toolbar); neighbors are prefetched, image swaps fade to kill nav flash, and the previous image stays visible until the next one decodes
  - Figure modal removed: every entity click now routes through `/entities/$id` for a stable URL
  - Search summary distinguishes vector-only hits, surfaces in-flight queries with loading affordances, and folds near-duplicate adjacent hits into card stacks
  - Entity viewer: pane toggles collapsed into a single layout dropdown with symmetric controls, responsive at narrow widths, compact OCR rendering with a text/table toggle, text-typed JSON metadata pretty-printed, long string values rendered as block layout
  - Esc / Home explicitly navigate to home (restoring the prior search); search-session indicator stays stable across temporal detours
- Apple Vision OCR `language_preference` is configurable by @andy
  - Lets Apple Vision OCR honor a user-chosen language preference instead of the hard-coded default
- `structured_vlm` writes entries as `data_type=json` by @arkohut
  - The structured-VLM payload is now stored as JSON so it can be queried and pretty-printed instead of treated as opaque text
- Library `kind` discriminator (record / static) by @arkohut
  - Distinguishes time-series record libraries from static-document libraries so future features can branch on library shape
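Branching on the library `kind` discriminator could look like the sketch below. Only the two values `record` and `static` come from the release note; the `LibraryKind` enum and the `supports_timeline` helper are hypothetical names used for illustration and are not taken from the project's code.

```python
from enum import Enum


class LibraryKind(str, Enum):
    """The two library shapes named in the release note."""
    RECORD = "record"  # time-series captures
    STATIC = "static"  # static documents


def supports_timeline(kind):
    """Hypothetical feature gate: only record libraries have a timeline."""
    return LibraryKind(kind) is LibraryKind.RECORD


print(supports_timeline("record"))  # True
print(supports_timeline("static"))  # False
```

Parsing the stored string through the enum makes an unknown `kind` fail loudly with a `ValueError` instead of silently taking a default branch.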
Bug Fixes
- Half-open time windows no longer silently drop the filter by @arkohut
  - Time-range filters with only one bound (e.g. only `start` or only `end`) previously slipped through unapplied
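Handling a half-open window means applying each bound on its own rather than requiring both. A minimal sketch of that behavior, not the project's actual fix: the helper name `time_window_clause`, the `created_at` column, and the inclusive-start / exclusive-end convention are assumptions.

```python
def time_window_clause(start=None, end=None, column="created_at"):
    """Build a SQL condition and parameters, applying each bound independently."""
    conditions, params = [], []
    if start is not None:
        conditions.append(f"{column} >= ?")
        params.append(start)
    if end is not None:
        conditions.append(f"{column} < ?")
        params.append(end)
    # An empty string means "no time filter at all".
    return " AND ".join(conditions), params


print(time_window_clause(start="2024-01-01"))
print(time_window_clause(end="2024-02-01"))
print(time_window_clause(start="2024-01-01", end="2024-02-01"))
```

Checking `is not None` for each bound separately ensures that a window with only `start` or only `end` still produces a filter.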
- Fresh-install migrations load the simple tokenizer and use batch ALTER by @arkohut
  - Fixes a fresh-install failure where the FTS5 simple tokenizer wasn't loaded before the FTS table was created; also batches per-column ALTERs into a single migration
- Search spinner rotates instead of bouncing by @arkohut
  - CSS animation fix so the loading affordance is a steady rotation
Chores
- Drop dead `auth_username` / `auth_password` from config by @arkohut
  - Unused settings still listed in the config file; removed to avoid confusion
Full Changelog: v0.32.0...v0.33.0