Feat/fs count api #1989
Adds a dedicated `count` endpoint that returns the exact number of files and sub-directories under a directory by traversing the filesystem, distinct from `stat`'s vector-index-based estimate. Wired through VikingFS, FSService, HTTP router and sync/async/local SDK clients.
Wires the new fs.count HTTP endpoint into the Rust CLI. Adds `-r/--recursive` and `-a/--all` flags. Documentation updated with CLI usage examples.
qin-ctx
left a comment
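Review verdict: REQUEST_CHANGES.
[Bug] (blocking) The CI lint / lint job fails: ruff check reports an unsorted import block at openviking/storage/viking_fs.py:15. That location is not among the lines added in the PR diff, so it cannot be anchored as an inline comment, but it should be fixed before merging: run ruff check --fix openviking/storage/viking_fs.py or tidy the imports manually.
The remaining issues are marked as inline comments on the relevant added code.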
    try:
        entries = self._ls_entries(current_path)
    except Exception:
        return
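[Bug] (blocking) During recursive traversal, count() swallows every exception raised by _ls_entries(current_path) and simply returns, so a failed subtree is reported as 0 nodes inside an otherwise successful result.
The API documentation promises an exact count based on a real filesystem traversal. If reading a sub-directory fails, or there is a transient backend error, a mount failure, or a permission error, the caller receives a result that looks successful but whose files/dirs/total are too low. Suggest not swallowing the exception: at minimum, convert current_path to a URI, map the error through map_exception(), and raise it so the caller knows the count failed.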
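The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the VikingFS code: `count_tree`, the injected `ls_entries` callable, and the `CountError` type are hypothetical stand-ins (the real fix would presumably go through `map_exception()` as suggested), and entries are assumed to be `(name, is_dir)` pairs.

```python
from typing import Callable, Dict, List, Tuple

Entry = Tuple[str, bool]  # (name, is_dir); an assumed shape, not the real entry type


class CountError(RuntimeError):
    """Hypothetical error type standing in for whatever map_exception() would produce."""

    def __init__(self, path: str, cause: Exception):
        super().__init__(f"count failed at {path}: {cause}")
        self.path = path


def count_tree(root: str, ls_entries: Callable[[str], List[Entry]]) -> Dict[str, int]:
    """Recursively count files and directories, failing loudly on any listing error."""
    files = dirs = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            entries = ls_entries(current)
        except Exception as exc:
            # Propagate instead of returning: an unreadable subtree must not
            # silently contribute 0 entries to an "exact" count.
            raise CountError(current, exc) from exc
        for name, is_dir in entries:
            if is_dir:
                dirs += 1
                stack.append(f"{current}/{name}")
            else:
                files += 1
    return {"files": files, "dirs": dirs, "total": files + dirs}
```

With this shape, a caller either gets a complete count or an error that names the path where traversal stopped; there is no partially counted success case.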
    entry_uri = self._path_to_uri(f"{current_path}/{name}", ctx=ctx)
    if not self._is_accessible(entry_uri, real_ctx):
        continue
    if not is_dir and name.startswith(".") and not show_all_hidden:
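[Design] (non-blocking) The implementation of show_all_hidden does not fully match the interface description. The router/docs say "include hidden files/directories", but names are filtered here only when not is_dir, so hidden directories such as .git and .cache are counted by default, and with recursive=True their contents are counted as well.
Suggest making the semantics explicit: if the parameter should control all hidden entries the way ls -a does, skip both files and directories when name.startswith(".") and not show_all_hidden; if it is meant to control hidden files only, update the interface description and docs so callers are not misled.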
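A small sketch of the first option (`ls -a`-like semantics) may make the difference concrete. The helper names `skip_hidden` and `visible_entries` are hypothetical, and entries are again assumed to be `(name, is_dir)` pairs; the only point is that the predicate no longer depends on `is_dir`.

```python
from typing import List, Tuple

Entry = Tuple[str, bool]  # (name, is_dir); assumed shape for illustration


def skip_hidden(name: str, show_all_hidden: bool) -> bool:
    """True when a dot-prefixed entry should be skipped, regardless of its type."""
    return name.startswith(".") and not show_all_hidden


def visible_entries(entries: List[Entry], show_all_hidden: bool = False) -> List[Entry]:
    """Filter a directory listing so hidden files AND hidden directories are dropped."""
    return [(name, is_dir) for name, is_dir in entries
            if not skip_hidden(name, show_all_hidden)]


listing = [(".git", True), (".env", False), ("src", True), ("README.md", False)]
default_view = visible_entries(listing)       # hidden file and hidden dir both dropped
full_view = visible_entries(listing, True)    # everything kept
```

Because a skipped directory is never pushed onto the traversal stack, filtering it at this point also keeps its contents out of a recursive count.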
        return all_entries

    async def count(
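[Suggestion] (non-blocking) This is a new public API, but the PR adds no regression tests. Suggest covering at least: the error for a non-directory, a missing URI, recursive vs. non-recursive, hidden files vs. hidden directories, and consistency of the returned structure between the HTTP and embedded SDK clients.
These cases lock in the exact-count semantics of count() and keep it from later being confused with the vector-index estimate semantics of stat.count.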
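As a rough illustration of what such cases could look like, the sketch below runs them against a local stand-in rather than the real API. `count_dir` is a hypothetical function over the local filesystem, its error types and `ls -a`-like hidden handling are assumptions, and the HTTP/SDK consistency case is left out because it needs the real clients.

```python
import os
import tempfile


def count_dir(path, recursive=False, show_all_hidden=False):
    """Stand-in for count(); NOT the OpenViking implementation."""
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    if not os.path.isdir(path):
        raise NotADirectoryError(path)
    files = dirs = 0
    stack = [path]
    while stack:
        current = stack.pop()
        with os.scandir(current) as it:
            for entry in it:
                # Assumed semantics: hidden files and hidden directories are both skipped.
                if entry.name.startswith(".") and not show_all_hidden:
                    continue
                if entry.is_dir(follow_symlinks=False):
                    dirs += 1
                    if recursive:
                        stack.append(entry.path)
                else:
                    files += 1
    return {"files": files, "dirs": dirs, "total": files + dirs}


def build_fixture(root):
    """Create root/a.txt, root/.hidden, root/sub/b.txt and root/.cache/c.txt."""
    os.mkdir(os.path.join(root, "sub"))
    os.mkdir(os.path.join(root, ".cache"))
    for rel in ("a.txt", ".hidden", os.path.join("sub", "b.txt"),
                os.path.join(".cache", "c.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("x")


def run_cases():
    """Run the regression cases; returns True if every assertion holds."""
    with tempfile.TemporaryDirectory() as root:
        build_fixture(root)
        # Non-recursive, hidden entries skipped: a.txt and sub/.
        assert count_dir(root) == {"files": 1, "dirs": 1, "total": 2}
        # Recursive adds sub/b.txt; .cache and its contents stay excluded.
        assert count_dir(root, recursive=True) == {"files": 2, "dirs": 1, "total": 3}
        # Hidden files and hidden directories are included together.
        assert count_dir(root, recursive=True, show_all_hidden=True) == {
            "files": 4, "dirs": 2, "total": 6}
        # A file path is rejected rather than counted as an empty directory.
        try:
            count_dir(os.path.join(root, "a.txt"))
            raise AssertionError("expected NotADirectoryError")
        except NotADirectoryError:
            pass
        # A missing path is an error, not a zero count.
        try:
            count_dir(os.path.join(root, "missing"))
            raise AssertionError("expected FileNotFoundError")
        except FileNotFoundError:
            pass
    return True
```

In the real suite the same fixture would presumably be exercised once through the HTTP endpoint and once through the embedded SDK, comparing the two returned structures key by key.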
* feat(fs): add count API for directory entry counting
  Adds a dedicated `count` endpoint that returns the exact number of files and sub-directories under a directory by traversing the filesystem, distinct from `stat`'s vector-index-based estimate. Wired through VikingFS, FSService, HTTP router and sync/async/local SDK clients.
* feat(cli): add `ov count` command for directory entry counting
  Wires the new fs.count HTTP endpoint into the Rust CLI. Adds `-r/--recursive` and `-a/--all` flags. Documentation updated with CLI usage examples.
* fix
---------
Co-authored-by: dingben.db@bytedance.com <dingben.db@bytedance.com@bytedance.com>
Adds a count API that can be used to count files.