fix: clamp inf/nan similarity scores at source to prevent JSON crash #882

Merged: zhoujh01 merged 1 commit into volcengine:main from snemesh:fix/clamp-inf-nan-scores on Mar 23, 2026

snemesh commented Mar 23, 2026

Problem

Closes #871

The C++ vector engine (bruteforce.h) can produce inf float values when computing inner product similarity on non-normalized vectors (distance_type=ip with NormalizeVector=False). The raw dot product of high-dimensional vectors with large component values overflows float32 to +inf.
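
The overflow can be reproduced without the C++ engine. The sketch below emulates float32 accumulation in pure Python by rounding every intermediate value through `struct`. The helper names `to_float32` and `dot_float32`, the dimension, and the component values are illustrative assumptions, not code taken from bruteforce.h.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float to float32, mapping overflow to +/-inf."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        # struct refuses finite values outside the float32 range;
        # IEEE 754 arithmetic would produce a signed infinity here.
        return math.copysign(math.inf, x)


def dot_float32(a, b):
    """Inner product with every intermediate result rounded to float32."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_float32(acc + to_float32(x * y))
    return acc


dim = 1024
large = [1e18] * dim        # non-normalized vector with large components
unit = [1.0 / 32.0] * dim   # same direction, scaled to unit length

overflowed = dot_float32(large, large)  # partial sums exceed the float32 maximum
normalized = dot_float32(unit, unit)    # stays well inside the float32 range
print(overflowed, normalized)
```

Each individual product (1e36) still fits in float32; it is the running sum that crosses the limit of roughly 3.4e38, which is why the problem only shows up with high-dimensional, non-normalized inputs.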

These inf scores propagate through the Python adapter and retriever layers untouched, until JSON serialization fails:

ValueError: Out of range float values are not JSON compliant: inf

This crashes the entire server process, requiring a manual restart.
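
The serialization failure itself is easy to reproduce with the standard library. This is a minimal sketch that assumes the server encodes responses in strict mode (`allow_nan=False`), which is consistent with the error text above; the record contents other than `_score` are made up for illustration.

```python
import json

# Hypothetical search record carrying an overflowed score.
record = {"uri": "doc-1", "_score": float("inf")}

# The default encoder accepts inf but emits the non-standard token Infinity.
lenient = json.dumps(record)
print(lenient)

# A strict encoder rejects the value instead of emitting invalid JSON.
try:
    json.dumps(record, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc
print(type(strict_error).__name__)
```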

Root Cause

  1. CollectionAdapter.query() reads item.score from the C++ engine result and assigns it directly to record["_score"] — no finite check
  2. HierarchicalRetriever._recursive_search() reads _score and applies score propagation: alpha * inf + (1 - alpha) * current_score = inf — amplifying the problem
  3. The existing isfinite guard in _convert_to_matched_contexts() catches scores only at the final conversion step — too late, because inf already reaches JSON serialization in intermediate API responses
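
The propagation step in item 2 can be checked in isolation. The function name `propagate` and the sample values of `alpha` and `current_score` below are assumptions chosen for illustration; only the formula comes from the description above.

```python
import math


def propagate(alpha: float, child_score: float, current_score: float) -> float:
    """Blend a search score into the current score, as in item 2."""
    return alpha * child_score + (1 - alpha) * current_score


# Any positive weight on an inf score makes the blended score inf.
blended = propagate(0.5, math.inf, 0.3)

# With alpha == 0 the product 0 * inf is nan, so the result is nan instead.
edge = propagate(0.0, math.inf, 0.3)

print(blended, edge)
```

Either way the blended score is non-finite, so a single bad score contaminates every value derived from it.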

Fix

Clamp non-finite scores (inf, -inf, and nan, all of which fail math.isfinite()) to 0.0 at the two layers where they enter the Python pipeline:

Layer 1 — openviking/storage/vectordb_adapters/base.py

  • CollectionAdapter.query(): check math.isfinite() immediately after reading from engine result, before the score enters any JSON-serializable record
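
A minimal sketch of the Layer 1 pattern, not the actual diff: `EngineItem`, `finite_or_zero`, and `to_records` are hypothetical names standing in for the engine result type and the record-building code inside CollectionAdapter.query().

```python
import math
from dataclasses import dataclass


@dataclass
class EngineItem:
    """Hypothetical stand-in for one result item returned by the C++ engine."""
    id: str
    score: float


def finite_or_zero(score: float) -> float:
    """Return the score unchanged if finite, otherwise 0.0."""
    return score if math.isfinite(score) else 0.0


def to_records(items):
    """Build JSON-serializable records, clamping scores as they are read."""
    return [{"id": item.id, "_score": finite_or_zero(item.score)} for item in items]


records = to_records(
    [EngineItem("a", 0.82), EngineItem("b", math.inf), EngineItem("c", math.nan)]
)
print(records)
```

Clamping at the point where the score is first read means no later consumer of the record needs its own finite check.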

Layer 2 — openviking/retrieve/hierarchical_retriever.py

  • Clamp _score at all three entry points where scores are read from search results:
    • _merge_starting_points() — global search scores
    • _prepare_initial_candidates() — initial candidate scores
    • _recursive_search() — per-directory search scores

This ensures inf values never propagate into score arithmetic or JSON responses.
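
The combined effect can be sketched end to end: clamp when the score is read, then run the propagation arithmetic, then serialize strictly. The names `read_score` and `propagate`, the sample results, and the values of `alpha` and `parent_score` are assumptions for illustration and do not mirror the retriever's real code.

```python
import json
import math


def read_score(result: dict) -> float:
    """Read _score from a search result, clamping non-finite values to 0.0."""
    score = result.get("_score", 0.0)
    return score if math.isfinite(score) else 0.0


def propagate(alpha: float, child_score: float, parent_score: float) -> float:
    """Blend a child score with its parent's score."""
    return alpha * child_score + (1 - alpha) * parent_score


# Hypothetical per-directory search results, one with an overflowed score.
results = [{"uri": "dir/a", "_score": math.inf}, {"uri": "dir/b", "_score": 0.6}]
alpha, parent_score = 0.5, 0.4

final = [
    {"uri": r["uri"], "_score": propagate(alpha, read_score(r), parent_score)}
    for r in results
]

# Strict serialization now succeeds because every score is finite.
payload = json.dumps(final, allow_nan=False)
print(payload)
```

One consequence of clamping to 0.0 is that a result whose score overflowed contributes nothing of its own to the blend and ends up with only the parent's share.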

Why not PR #880?

PR #880 applies SafeJSONResponse only to error handlers. That prevents the server crash, but the search endpoint still returns an HTTP 500 error instead of results, so the user gets nothing. This PR fixes the scores themselves, so search returns correct results.

Changes

  • openviking/storage/vectordb_adapters/base.py — 1 import, 3 lines changed
  • openviking/retrieve/hierarchical_retriever.py — 9 lines changed (3 locations)

The C++ vector engine can produce infinity values from inner product
overflow on non-normalized vectors (distance_type=ip with
NormalizeVector=False). These inf scores propagate through the adapter
and retriever layers, ultimately crashing JSON serialization with
ValueError.

Fix applied at two layers:

1. CollectionAdapter.query() — clamp non-finite scores to 0.0
   immediately after reading from the C++ engine result, before
   they enter the JSON-serializable record chain.

2. HierarchicalRetriever — clamp _score values at all three entry
   points (_merge_starting_points, _prepare_initial_candidates,
   _recursive_search) before score propagation arithmetic can
   amplify inf into downstream computations.

The existing isfinite guard in _convert_to_matched_contexts catches
scores only at the final conversion step, which is too late — inf
values already cause serialization failures in intermediate API
responses and vectordb service endpoints.

Closes volcengine#871

CLAassistant commented Mar 23, 2026

CLA assistant check
All committers have signed the CLA.

zhoujh01 merged commit 2a93185 into volcengine:main on Mar 23, 2026 (2 checks passed)
github-project-automation bot moved this from Backlog to Done in OpenViking project on Mar 23, 2026

Development

Successfully merging this pull request may close these issues.

Server crashes with ValueError: Out of range float values are not JSON compliant: inf
