feat(reranker): support alibaba qwen3-rerank by quicklyfast · Pull Request #1501 · vectorize-io/hindsight

quicklyfast · 2026-05-07T08:44:04Z

Adds support for Alibaba's qwen3-rerank model as a reranker provider.

nicoloboschi · 2026-05-08T07:39:22Z

A few things before this can land:

1. Reuse _CohereCompatibleRerankClient instead of re-implementing it

The DashScope compatible endpoint speaks the same wire format as the existing helper at hindsight_api/engine/cross_encoder.py:530 ({model, query, documents, top_n} → {results: [{index, relevance_score}]}). _rerank_compatible duplicates that logic. Compose the helper the way SiliconFlowCrossEncoder (line 768) and ZeroEntropyCrossEncoder (line 726) do — you get the query-grouping and request shape for free.
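As a rough illustration of the wire format and query-grouping described above, here is a minimal, network-free sketch. It is not the actual `_CohereCompatibleRerankClient`; every function name below (`group_pairs_by_query`, `build_payload`, `scatter_scores`, `rerank_pairs`) and the `send` callable are hypothetical, and setting `top_n` to the number of documents is an assumption made so that every document gets a score back.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (query, document)


def group_pairs_by_query(pairs: List[Pair]) -> Dict[str, List[Tuple[int, str]]]:
    """Group pairs by query, remembering each pair's original position."""
    groups: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for position, (query, document) in enumerate(pairs):
        groups[query].append((position, document))
    return dict(groups)


def build_payload(model: str, query: str, documents: List[str]) -> Dict[str, Any]:
    """Request body in the shape quoted in the review: {model, query, documents, top_n}."""
    # Assumption: ask for all documents so none is dropped from the response.
    return {"model": model, "query": query, "documents": documents, "top_n": len(documents)}


def scatter_scores(response: Dict[str, Any], positions: List[int], scores: List[float]) -> None:
    """Write each relevance_score back to the slot of the pair it belongs to."""
    for item in response["results"]:
        scores[positions[item["index"]]] = float(item["relevance_score"])


def rerank_pairs(
    pairs: List[Pair],
    model: str,
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> List[float]:
    """Score every pair with one request per distinct query.

    `send` is a hypothetical stand-in for the HTTP call; it takes the request
    body and returns the parsed {results: [{index, relevance_score}]} body.
    """
    scores = [0.0] * len(pairs)
    for query, members in group_pairs_by_query(pairs).items():
        positions = [position for position, _ in members]
        documents = [document for _, document in members]
        response = send(build_payload(model, query, documents))
        scatter_scores(response, positions, scores)
    return scores
```

Because the response carries an `index` into the submitted `documents` list, results can come back in any order (typically sorted by relevance) and still be mapped to the right input pair.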

2. Drop _COMPATIBLE_MODELS

Hardcoding frozenset({"qwen3-rerank"}) means any new variant (qwen3-rerank-plus, future qwen3-reranker-*, etc.) silently routes to the native endpoint and fails with a shape mismatch. Pick one:

- Only support the compatible endpoint and remove the native path entirely (simplest; qwen3-rerank is the headline model here).
- Or expose endpoint selection as an explicit config flag rather than inferring it from the model name.
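For the second option, a minimal sketch of what an explicit endpoint flag could look like. The names `RerankEndpoint`, `AlibabaRerankerConfig`, and `parse_endpoint` are hypothetical and not taken from the PR; defaulting to the compatible endpoint and to `qwen3-rerank` is an assumption for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RerankEndpoint(str, Enum):
    """Which API surface to call (hypothetical names)."""

    COMPATIBLE = "compatible"
    NATIVE = "native"


@dataclass(frozen=True)
class AlibabaRerankerConfig:
    """Hypothetical config object; the endpoint never depends on the model name."""

    api_key: str
    model: str = "qwen3-rerank"  # assumed default
    endpoint: RerankEndpoint = RerankEndpoint.COMPATIBLE  # assumed default


def parse_endpoint(raw: Optional[str]) -> RerankEndpoint:
    """Parse a user-supplied flag value, failing loudly on unknown values."""
    if raw is None or not raw.strip():
        return RerankEndpoint.COMPATIBLE
    try:
        return RerankEndpoint(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(endpoint.value for endpoint in RerankEndpoint)
        raise ValueError(
            f"unknown reranker endpoint {raw!r}; expected one of: {allowed}"
        ) from None
```

With this shape, a new variant such as qwen3-rerank-plus uses whichever endpoint the operator configured, and a typo in the flag raises at startup instead of surfacing later as a response shape mismatch.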

3. Documentation is missing

- hindsight-docs/docs/developer/configuration.md:556: add alibaba to the HINDSIGHT_API_RERANKER_PROVIDER value list, plus rows for HINDSIGHT_API_RERANKER_ALIBABA_API_KEY / _MODEL next to the SiliconFlow block (~line 583).
- hindsight-docs/docs/developer/models.mdx:475: add a provider table entry and an example env-var block (mirror the SiliconFlow one at line 553).

Also: the class docstring says auth comes from DASHSCOPE_API_KEY or HINDSIGHT_API_RERANKER_ALIBABA_API_KEY, but from_env only reads the latter — either drop the claim or add the fallback (Cohere does it at config.py:1581).
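If the fallback route is chosen, the lookup could be sketched roughly as below. This is not the code at config.py:1581; the function name `resolve_alibaba_api_key` and the optional `env` parameter are hypothetical, and only the two environment variable names come from the docstring under review.

```python
import os
from typing import Mapping, Optional

PRIMARY_KEY_VAR = "HINDSIGHT_API_RERANKER_ALIBABA_API_KEY"
FALLBACK_KEY_VAR = "DASHSCOPE_API_KEY"


def resolve_alibaba_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty key, preferring the Hindsight-specific variable.

    `env` defaults to os.environ; passing a plain dict makes the lookup testable.
    """
    source = os.environ if env is None else env
    for name in (PRIMARY_KEY_VAR, FALLBACK_KEY_VAR):
        value = source.get(name, "").strip()
        if value:
            return value
    raise ValueError(
        f"no Alibaba reranker API key found; set {PRIMARY_KEY_VAR} or {FALLBACK_KEY_VAR}"
    )
```

Checking the Hindsight-specific variable first keeps an explicit setting authoritative, while a machine that already exports DASHSCOPE_API_KEY works without extra configuration.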

quicklyfast · 2026-05-19T10:34:27Z

@nicoloboschi Please review it.

nicoloboschi

LGTM

Notable upstream additions pulled in:
- feat(api): clear endpoint for mental model content (vectorize-io#1706)
- feat(api): per-operation LLM concurrency caps (vectorize-io#1738)
- feat(typescript-client): concrete generated types (replace Promise<any>)
- feat(reranker): Alibaba Qwen3-Rerank support (vectorize-io#1501)
- feat: opencode-go LLM provider (vectorize-io#1652)
- feat(extensions): OperationValidator.precheck pre-body-parse hook (vectorize-io#1548)
- feat(right-agent): new Right Agent integration (vectorize-io#1599)
- fix(ollama): ollama-cloud provider + native API auth (vectorize-io#1734)
- fix(reflect): hide disabled tools from agent system prompt (vectorize-io#1740)
- fix(retain): split oversized single items in batch retain (vectorize-io#1736)
- fix: escape literal braces in user-supplied prompt fields (vectorize-io#1728)
- fix(mental-models): full refresh pending delta baselines (vectorize-io#1684)
- fix(api): lazy load reflect tiktoken encoding (vectorize-io#1654)
- fix(api): reject blank retain content (vectorize-io#1685)
- fix(api): auto-refresh openai-codex OAuth access_token (vectorize-io#1637)
- fix(api): gzip middleware for graph payloads (vectorize-io#1731)
- fix(reranker): detect pre-normalized scores; rank-based fallback (vectorize-io#1512)

Conflicts: only package-lock.json files (took upstream, npm install verified)

Fork customizations verified intact (all 14 checks):
- duplicate_checker_fn streaming Phase 1.5 in orchestrator
- FallbackLLMProvider + CircuitBreaker (fallback_llm.py)
- Single-fact consolidation mode (is_fallback_active routing)
- recallExp + Jaccard dedup + compact memory formatter (plugin)
- Codex 5.1-codex-mini reasoning guard
- Infinity reranker /models fallback in cross_encoder.py
- diversity.py + deduplication.py fork-only modules retained

Tests:
- openclaw vitest: 267/267 pass
- ruff: clean
- tsc --noEmit: clean
- pytest: pre-existing env-config flakes (need HINDSIGHT_API_LLM_API_KEY); upstream commit 90cb145 acknowledged as pre-existing CI flakes

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(reranker): support alibaba qwen3-rerank

25ecaa0

quicklyfast force-pushed the main branch from 002b068 to 25ecaa0 · May 8, 2026 01:20

quicklyfast and others added 3 commits May 18, 2026 09:19

feat(reranker): support alibaba qwen3-rerank

aa60cf0

Merge branch 'vectorize-io:main' into main

948a7f2

Fix formatting of Alibaba API key export line

278c04a

nicoloboschi approved these changes May 25, 2026


nicoloboschi merged commit b83bb87 into vectorize-io:main May 25, 2026

nicoloboschi merged 4 commits into vectorize-io:main from quicklyfast:main
