fix: clamp inf/nan similarity scores at source to prevent JSON crash #882
Merged
zhoujh01 merged 1 commit into volcengine:main on Mar 23, 2026
The C++ vector engine can produce infinity values from inner product overflow on non-normalized vectors (distance_type=ip with NormalizeVector=False). These inf scores propagate through the adapter and retriever layers, ultimately crashing JSON serialization with ValueError.
Fix applied at two layers:
1. CollectionAdapter.query(): clamp non-finite scores to 0.0 immediately after reading from the C++ engine result, before they enter the JSON-serializable record chain.
2. HierarchicalRetriever: clamp _score values at all three entry points (_merge_starting_points, _prepare_initial_candidates, _recursive_search) before score propagation arithmetic can amplify inf into downstream computations.
The existing isfinite guard in _convert_to_matched_contexts catches scores only at the final conversion step, which is too late: inf values already cause serialization failures in intermediate API responses and vectordb service endpoints.
Closes volcengine#871
zhoujh01 approved these changes on Mar 23, 2026
Problem
Closes #871
The C++ vector engine (bruteforce.h) can produce inf float values when computing inner product similarity on non-normalized vectors (distance_type=ip with NormalizeVector=False). The raw dot product of high-dimensional vectors with large component values overflows float32 to +inf.
These inf scores propagate through the Python adapter and retriever layers untouched until JSON serialization fails with a ValueError. This crashes the entire server process, requiring a manual restart.
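The failure mode can be reproduced in isolation. The sketch below is not taken from the PR: it assumes the serving layer encodes responses with a strict JSON encoder (equivalent to the standard library's allow_nan=False), uses float("inf") as a stand-in for an overflowed engine score, and the helper name try_serialize is hypothetical.

```python
import json


def try_serialize(record):
    """Return the JSON text, or the error message if strict encoding fails."""
    try:
        # allow_nan=False makes the encoder reject inf, -inf, and nan,
        # which have no representation in standard JSON.
        return json.dumps(record, allow_nan=False)
    except ValueError as exc:
        return f"ValueError: {exc}"


# Stand-in for a record whose score overflowed in the engine (assumption).
overflowed = {"id": "doc-1", "_score": float("inf")}
healthy = {"id": "doc-2", "_score": 0.73}

print(try_serialize(overflowed))
print(try_serialize(healthy))
```

A single non-finite score is enough to make the whole response unserializable, which is why the score has to be sanitized before it reaches any record that will be encoded.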
Root Cause
- CollectionAdapter.query() reads item.score from the C++ engine result and assigns it directly to record["_score"], with no finite check.
- HierarchicalRetriever._recursive_search() reads _score and applies score propagation: alpha * inf + (1 - alpha) * current_score = inf, amplifying the problem.
- The existing isfinite guard in _convert_to_matched_contexts() catches scores only at the final conversion step. That is too late, because inf has already reached JSON serialization in intermediate API responses.
Fix
Clamp non-finite scores to 0.0 at the two layers where they enter the Python pipeline:
Layer 1: openviking/storage/vectordb_adapters/base.py
- CollectionAdapter.query(): check math.isfinite() immediately after reading from the engine result, before the score enters any JSON-serializable record
Layer 2: openviking/retrieve/hierarchical_retriever.py
Clamp _score at all three entry points where scores are read from search results:
- _merge_starting_points(): global search scores
- _prepare_initial_candidates(): initial candidate scores
- _recursive_search(): per-directory search scores
This ensures inf values never propagate into score arithmetic or JSON responses.
Why not PR #880?
PR #880 applies SafeJSONResponse only to error handlers. This prevents the server crash, but the search endpoint still returns an HTTP 500 error instead of results, so the user gets nothing. This PR fixes the actual scores so search returns correct results.
Changes
- openviking/storage/vectordb_adapters/base.py: 1 import, 3 lines changed
- openviking/retrieve/hierarchical_retriever.py: 9 lines changed (3 locations)
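For readers without the diff at hand, the clamping described under Fix can be sketched roughly as follows. This is an illustration, not the merged code: clamp_score and propagate are hypothetical names, alpha=0.5 is an arbitrary value, and only the math.isfinite() check, the 0.0 fallback, and the propagation formula come from the description above.

```python
import math


def clamp_score(raw):
    """Map inf, -inf, and nan to 0.0; pass finite scores through unchanged."""
    score = float(raw)
    return score if math.isfinite(score) else 0.0


def propagate(child_score, current_score, alpha=0.5):
    """Blend scores using the propagation formula quoted under Root Cause."""
    return alpha * child_score + (1 - alpha) * current_score


# Hypothetical raw scores as they might arrive from the engine.
engine_scores = [0.82, float("inf"), float("nan"), -0.1]
clamped = [clamp_score(s) for s in engine_scores]

# Without clamping, a single inf survives the blend; with clamping it does not.
unclamped_result = propagate(float("inf"), 0.4)
clamped_result = propagate(clamp_score(float("inf")), 0.4)

print(clamped, unclamped_result, clamped_result)
```

Clamping to 0.0 keeps the affected record in the result set with a neutral score rather than dropping it. Whether 0.0 is the right sentinel for a given ranking scheme is a design choice the PR makes, and this sketch does not evaluate it.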