Skip to content

fix(storage): preserve semantic lock ownership#2140

Merged
zhoujh01 merged 1 commit into
mainfrom
fix/semantic-lock-ownership
May 20, 2026
Merged

fix(storage): preserve semantic lock ownership#2140
zhoujh01 merged 1 commit into
mainfrom
fix/semantic-lock-ownership

Conversation

@qin-ctx
Copy link
Copy Markdown
Collaborator

@qin-ctx qin-ctx commented May 20, 2026

Description

本 PR 引入类型化的锁租约(LockLease)和语义队列锁交接机制,替代之前通过裸 lifecycle_lock_handle_id 字符串传递锁句柄的方式。资源写入、语义处理、memory 处理和 reindex 流程可以明确区分“调用方借用锁”和“队列 worker 接管锁”,避免过早释放外层持有的锁。

Related Issue

N/A

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

Changes Made

  • 新增 LockLease / OwnedLockLease / BorrowedLockLease / LockHandoffRef,将锁所有权、借用和队列交接建模为显式接口。
  • 语义队列、DAG、sidecar 写入和 summarizer 改为接收锁租约,确保 caller-owned lock 不会被 worker 错误释放。
  • reindex 在同一个 tree lock 下运行语义重建,并等待 embedding queue 结束后汇总失败状态。
  • 更新相关单测,覆盖 reindex run context、语义锁借用、content write 和 legacy 字段移除。

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

通过:

.venv/bin/python -m pytest tests/server/test_admin_rebuild_api.py tests/server/test_content_write_service.py tests/storage/test_memory_semantic_stall.py tests/storage/test_semantic_dag_incremental_missing_summary.py tests/storage/test_semantic_processor_lock_ownership.py -q

结果:67 passed, 4 warnings

额外尝试运行包含客户端文件操作的扩展集合:

.venv/bin/python -m pytest tests/server/test_admin_rebuild_api.py tests/server/test_content_write_service.py tests/storage/test_memory_semantic_stall.py tests/storage/test_semantic_dag_incremental_missing_summary.py tests/storage/test_semantic_processor_lock_ownership.py tests/client/test_file_operations.py -q

结果:71 passed, 5 failed, 4 warnings。失败集中在 tests/client/test_file_operations.py 的 grep/glob 用例,现象为 add_resource(wait=False) 后立刻查询时 matches 为空,并伴随 semantic queue drain timeout。

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Screenshots (if applicable)

N/A

Additional Notes

本地 pre-commit hook 绑定到系统 Python,但该 Python 环境缺少 pre_commit 模块;提交时使用了 --no-verify,并以上述 pytest 命令完成验证。

Introduce typed lock leases for semantic queue handoff so resource, memory, and reindex flows can share or transfer lock ownership without releasing caller-owned locks prematurely.
@github-actions
Copy link
Copy Markdown

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🏅 Score: 85
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Potential Backward Compatibility Gap

The new SemanticMsg no longer reads the old 'lifecycle_lock_handle_id' field from dictionaries. Existing queued messages from before this change may lose lock handoff information, potentially leading to unprotected operations.

agent_id=data.get("agent_id", "default"),
role=data.get("role", "root"),
skip_vectorization=data.get("skip_vectorization", False),
telemetry_id=data.get("telemetry_id", ""),
target_uri=data.get("target_uri", ""),
lock_handoff=LockHandoffRef.from_value(data.get("lock_handoff")),
is_code_repo=data.get("is_code_repo", False),
coalesce_key=data.get("coalesce_key", ""),
coalesce_version=data.get("coalesce_version", 0),

@github-actions
Copy link
Copy Markdown

PR Code Suggestions ✨

No code suggestions found for the PR.

@zhoujh01 zhoujh01 merged commit fd9ada9 into main May 20, 2026
4 of 5 checks passed
@zhoujh01 zhoujh01 deleted the fix/semantic-lock-ownership branch May 20, 2026 06:37
@github-project-automation github-project-automation Bot moved this from Backlog to Done in OpenViking project May 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants