fix(queuefs): vectorize at target URIs during incremental updates#746

Open
gilangjavier wants to merge 2 commits into volcengine:main from gilangjavier:main

gilangjavier commented Mar 18, 2026

When adding resources with incremental update (temp_path), directory-level L0/L1 vectors were indexed under the temporary URI. After SyncDiff moved the files to the final target URI, the vectors remained at the temp path, so searches under the target directory returned 0 results.

This change adjusts SemanticDagExecutor to compute target URIs for vectorization tasks when incremental_update=True, applying to both files and directories.
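
The URI mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `resolve_vector_uri`, its parameters, and the example URIs are all assumptions made for the sketch. It rewrites a URI under the temporary root to the equivalent URI under the target root, and falls back to the original URI when incremental update is off or the URI lies outside the temporary tree.

```python
def resolve_vector_uri(uri, temp_root, target_root, incremental_update):
    """Return the URI a vector should be indexed under (hypothetical helper)."""
    if not incremental_update or not temp_root or not target_root:
        return uri  # fallback: index where the content currently lives

    temp_root = temp_root.rstrip("/")
    target_root = target_root.rstrip("/")

    if uri == temp_root:
        return target_root  # the directory itself

    prefix = temp_root + "/"
    if uri.startswith(prefix):
        # Keep the path relative to the temp root, re-rooted at the target.
        return target_root + "/" + uri[len(prefix):]

    return uri  # fallback: URI is outside the temp tree


# Example URIs are made up for illustration.
file_uri = resolve_vector_uri(
    "viking://temp/abc/docs/a.md",
    "viking://temp/abc",
    "viking://resources/proj",
    incremental_update=True,
)
```

Matching on `temp_root + "/"` rather than on the bare root avoids rewriting a sibling such as `viking://temp/abcd`, which merely shares a string prefix with the temp root.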

Fixes #743

…ncremental updates

When adding resources with incremental update (e.g., using temp_path), the semantic DAG previously vectorized directory-level L0/L1 under the temporary URI. After SyncDiff moves files to the final target URI, the vectors remained attached to the temp path, causing searches under the actual resource directory to return no results.

Adjust SemanticDagExecutor to compute target URIs for vectorization tasks when incremental_update=True. This applies to both file and directory vectorization, ensuring vectors are indexed at the final destination.

Also add a test to verify that target URIs are used in incremental mode.

Fixes issue volcengine#743 in volcengine/OpenViking.

Signed-off-by: Hephaestus <hephestus@openclaw>

CLAassistant commented Mar 18, 2026

CLA assistant check
All committers have signed the CLA.

qin-ptr (Contributor) left a comment

Review Summary

This PR correctly fixes a vector indexing bug in incremental update scenarios. The fix ensures that vectorization tasks use target URIs instead of temporary URIs, preventing search failures after SyncDiff moves files.

Key points:

  • ✅ Root cause correctly identified and addressed
  • ✅ Both file and directory vectorization logic updated
  • ✅ Test coverage added for incremental update mode
  • ✅ Safe fallback logic in place

Only one minor readability suggestion below.

🤖 I am a bot owned by @qin-ctx.

)
monkeypatch.setattr(
    "openviking.storage.transaction.get_lock_manager",
    lambda: MagicMock(),

[Suggestion] (non-blocking)

The original test had a helpful comment # Mock lock layer: LockContext as no-op passthrough at line 85 that explained why these mocks were needed. After extracting _mock_transaction_layer, this context was lost.

Consider adding a docstring to _mock_transaction_layer:

def _mock_transaction_layer(monkeypatch):
    """Mock lock layer: LockContext as no-op passthrough."""
    ...

Or keep a brief comment at the call site:

# Mock transaction layer as no-op
_mock_transaction_layer(monkeypatch)
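
For readers unfamiliar with the pattern, a no-op passthrough can be sketched with the standard library alone. Only `get_lock_manager` appears in the diff context above; `NoOpLockContext`, `make_lock_manager_stub`, and `guarded_work` are hypothetical names. The real `LockContext` interface is not shown in this thread, so the stand-in below supports both the sync and async context-manager protocols as a hedge.

```python
import asyncio
from unittest.mock import MagicMock


class NoOpLockContext:
    """Stand-in that acquires nothing and lets the guarded body run unchanged."""

    def __init__(self, *args, **kwargs):
        pass  # accept and ignore whatever the real context would take

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # never swallow exceptions

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False  # never swallow exceptions


def make_lock_manager_stub():
    """Hypothetical factory: a MagicMock accepts any call the code makes."""
    return MagicMock()


async def guarded_work():
    # The arguments are placeholders; the stand-in ignores them.
    async with NoOpLockContext(make_lock_manager_stub(), ["some/path"]):
        return "ran"


result = asyncio.run(guarded_work())
```

In a pytest test, such a class would be installed with `monkeypatch.setattr` at whatever dotted path the code under test imports `LockContext` from, in the same way the diff patches `get_lock_manager`.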

gilangjavier (Author) commented

Thanks for the suggestion — addressed in b63244a.

I added a docstring to _mock_transaction_layer:

  • "Mock lock layer: LockContext as no-op passthrough."

qin-ctx requested a review from myysy on March 19, 2026 03:54

qin-ctx (Collaborator) commented Mar 19, 2026

cc @myysy

myysy (Collaborator) commented Mar 19, 2026

Thank you for your PR. To keep the overall incremental update mechanism consistent, indexing directly to the target URI in the DAG stage is not suitable. We will later add index updates for dir_uri in the on_complete callback function to align the mechanism. This PR will add you as a contributor.
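
The alternative described here can be pictured roughly as follows. This is a speculative sketch, not the maintainers' planned implementation: the in-memory `VectorIndex`, its `reindex_prefix` method, the shape of `on_complete`, and the example URIs are all invented for illustration. The DAG stage keeps indexing under the temporary URI, and a completion callback re-keys those entries to the target URI once the files have been moved.

```python
class VectorIndex:
    """Toy in-memory index keyed by URI (hypothetical, for illustration only)."""

    def __init__(self):
        self.entries = {}

    def upsert(self, uri, vector):
        self.entries[uri] = vector

    def reindex_prefix(self, old_root, new_root):
        """Re-key every entry at or under old_root so it sits under new_root."""
        old_root, new_root = old_root.rstrip("/"), new_root.rstrip("/")
        moved = 0
        for uri in list(self.entries):  # copy keys: the dict changes in the loop
            if uri == old_root or uri.startswith(old_root + "/"):
                new_uri = new_root + uri[len(old_root):]
                self.entries[new_uri] = self.entries.pop(uri)
                moved += 1
        return moved


def make_on_complete(index, temp_root, target_root):
    """Build a callback to run once files have been moved to target_root."""
    def on_complete():
        return index.reindex_prefix(temp_root, target_root)
    return on_complete


index = VectorIndex()
index.upsert("viking://temp/abc", [0.1, 0.2])       # directory-level vector
index.upsert("viking://temp/abc/a.md", [0.3, 0.4])  # file-level vector
moved = make_on_complete(index, "viking://temp/abc", "viking://resources/proj")()
```

Compared with indexing at the target URI up front, this keeps the DAG stage unaware of the final destination and concentrates the temp-to-target transition in one place.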

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

[Bug]: Directory vectors (L0/L1) retain temp URIs after SyncDiff, causing search to return 0 results

5 participants