Skip to content

fix: reload full records when updating uri mapping#2165

Merged
qin-ctx merged 1 commit into
mainfrom
fix/reload-full-records-uri-mapping
May 21, 2026
Merged

fix: reload full records when updating uri mapping#2165
qin-ctx merged 1 commit into
mainfrom
fix/reload-full-records-uri-mapping

Conversation

@zhoujh01
Copy link
Copy Markdown
Collaborator

@zhoujh01 zhoujh01 commented May 21, 2026

Description

Related Issue

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

Changes Made

-update_uri_mapping 不再直接拿查询出来的精简记录做 URI 重写,而是先收集记录 id,再通过
self.get(...) 拉取完整记录后再 upsert。另外补了两层保护:
如果没查到有效 id,直接告警并返回;
如果完整记录里缺少 dense vector,就跳过该条并打 warning。
它解决的问题是:URI 改名时,原先用的是字段不全的记录,可能缺少向量等关键数据,导致新记录写入失败、写入残
缺数据,或者 URI 映射迁移不完整。现在改成先补全原记录再写,能保证 URI 映射更新基于完整数据执行,并且在关
键字段缺失时显式失败或跳过,不再静默出错。

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes

@github-actions
Copy link
Copy Markdown

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🏅 Score: 92
🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ No major issues detected

@github-actions
Copy link
Copy Markdown

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Check for missing fetched records

Verify that all requested record IDs were successfully retrieved from storage. If
any IDs are missing, log a warning with the missing IDs to avoid silent partial
updates that could leave data in an inconsistent state.

openviking/storage/viking_vector_index_backend.py [980-989]

 full_records = await self.get(record_ids, ctx=ctx)
 if not full_records:
     logger.warning(
         "update_uri_mapping failed to fetch full records: uri=%s new_uri=%s account_id=%s ids=%s",
         canonical_uri,
         canonical_new_uri,
         ctx.account_id,
         record_ids,
     )
     return False
+fetched_ids = {str(rec["id"]) for rec in full_records if rec.get("id")}
+missing_ids = set(record_ids) - fetched_ids
+if missing_ids:
+    logger.warning(
+        "update_uri_mapping missing some full records: uri=%s new_uri=%s account_id=%s missing_ids=%s",
+        canonical_uri,
+        canonical_new_uri,
+        ctx.account_id,
+        missing_ids,
+    )
Suggestion importance[1-10]: 7

__

Why: Adds a check for missing fetched record IDs, preventing silent partial updates and improving data consistency.

Medium

@qin-ctx qin-ctx merged commit 7e30f65 into main May 21, 2026
5 checks passed
@qin-ctx qin-ctx deleted the fix/reload-full-records-uri-mapping branch May 21, 2026 09:06
@github-project-automation github-project-automation Bot moved this from Backlog to Done in OpenViking project May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants