Skip to content

Update docstrings to reflect reranking endpoint change#1808

Merged
ChrisJar merged 3 commits intoNVIDIA:mainfrom
ChrisJar:rerank-endpoint
Apr 8, 2026
Merged

Update docstrings to reflect reranking endpoint change#1808
ChrisJar merged 3 commits intoNVIDIA:mainfrom
ChrisJar:rerank-endpoint

Conversation

@ChrisJar
Copy link
Copy Markdown
Collaborator

@ChrisJar ChrisJar commented Apr 7, 2026

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • If adjusting docker-compose.yaml environment variables have you ensured those are mimicked in the Helm values.yaml file.

@ChrisJar ChrisJar requested review from a team as code owners April 7, 2026 16:40
@ChrisJar ChrisJar requested a review from edknv April 7, 2026 16:40
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 7, 2026

Greptile Summary

This PR updates docstrings across the reranking module to reflect the already-implemented switch from the OpenAI-compatible /rerank endpoint to the NIM-specific /v1/ranking endpoint (with rankings/logit response shape and query: {text} / passages: [{text}] payload). Tests are also updated to assert the new payload fields (truncate, passages[0]) and response structure.

Confidence Score: 5/5

Safe to merge — all findings are minor P2 docstring polish with no functional impact.

The PR is a documentation-only change with corresponding test fixture updates. Both remaining findings are P2 style suggestions (stale top_n wording and a vLLM version qualifier), neither of which affects runtime behavior.

nemo_retriever/src/nemo_retriever/rerank/rerank.py — two minor docstring nits in the module header and _rerank_via_endpoint Returns section.

Vulnerabilities

No security concerns identified. The PR is documentation-only with aligned test fixture updates; no credential handling, input validation, or network logic was changed.

Important Files Changed

Filename Overview
nemo_retriever/src/nemo_retriever/rerank/init.py Docstring-only update: /rerank/v1/ranking; correct and complete.
nemo_retriever/src/nemo_retriever/rerank/rerank.py Docstring updates align with the /v1/ranking endpoint and rankings/logit response shape; stale top_n reference in the Returns section and potentially misleading vLLM version hint remain.
nemo_retriever/src/nemo_retriever/retriever.py One-line docstring update for reranker_endpoint; accurate and complete.
nemo_retriever/tests/test_nemotron_rerank_v2.py Test fixtures updated to rankings/logit response shape; new assertions for truncate and passages[0] fields added; all changes consistent with implementation.

Sequence Diagram

sequenceDiagram
    participant caller as Caller
    participant rh as rerank_hits()
    participant ep as _rerank_via_endpoint()
    participant nim as NIM Server

    caller->>rh: query, hits, invoke_url
    rh->>ep: query, documents, endpoint
    ep->>ep: normalize URL → /v1/ranking
    ep->>nim: POST /v1/ranking {model, query:{text}, passages:[{text}], truncate:END}
    nim-->>ep: {rankings:[{index, logit}]}
    ep-->>rh: List[float] scores
    rh-->>caller: hits sorted by _rerank_score
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/rerank/rerank.py
Line: 114-115

Comment:
**Stale `top_n` reference in Returns docstring**

The phrase "Documents not returned by `top_n` truncation receive `-inf`" is stale. The payload no longer includes `top_n` — server-side truncation was removed when the endpoint changed. The `-inf` sentinel now means only that the server didn't return a score for that index (e.g. the server returned fewer items than sent), not that it was `top_n`-truncated.

```suggestion
    List[float]
        Scores aligned with *documents* (higher = more relevant).
        Documents for which the server returns no score receive ``-inf``.
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/rerank/rerank.py
Line: 14-17

Comment:
**Potentially misleading vLLM version hint after endpoint change**

The module docstring still opens with "calls a vLLM (>=0.14) or NIM server". The vLLM 0.14 reference was meaningful when the endpoint was `/rerank` (vLLM's OpenAI-compatible rerank API). The new `/v1/ranking` endpoint is NIM-specific; it's worth clarifying whether vLLM also exposes this path, or dropping the vLLM qualifier to avoid implying it does.

```suggestion
When ``invoke_url`` is set the actor/function calls a NIM (or compatible vLLM)
server that exposes the NIM ranking REST API. The helper accepts
either a fully qualified ``.../reranking`` URL or a base URL and appends
``/v1/ranking`` automatically::
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (3): Last reviewed commit: "Address greptile comment" | Re-trigger Greptile

@ChrisJar ChrisJar requested a review from charlesbluca April 7, 2026 17:31
Copy link
Copy Markdown
Collaborator

@charlesbluca charlesbluca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - mind applying the remaining Greptile suggestion?

@ChrisJar ChrisJar merged commit acedb85 into NVIDIA:main Apr 8, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants