feat(reranker): support alibaba qwen3-rerank #1501

Merged
nicoloboschi merged 4 commits into vectorize-io:main from quicklyfast:main on May 25, 2026

Conversation

@quicklyfast
Contributor

Support Alibaba qwen3-rerank.

@nicoloboschi
Collaborator

A few things before this can land:

1. Reuse _CohereCompatibleRerankClient instead of re-implementing it

The DashScope compatible endpoint speaks the same wire format as the existing helper at hindsight_api/engine/cross_encoder.py:530 (request {model, query, documents, top_n} → response {results: [{index, relevance_score}]}). _rerank_compatible duplicates that logic. Compose the helper the way SiliconFlowCrossEncoder (line 768) and ZeroEntropyCrossEncoder (line 726) do; you get the query-grouping and request shape for free.
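
For illustration, the composition pattern could look roughly like the sketch below. It is self-contained and does not use the real helper: `CohereCompatibleClientStub`, its constructor parameters, the injected `post` callable, `AlibabaCrossEncoderSketch`, and the placeholder URL are all hypothetical stand-ins, and the actual `_CohereCompatibleRerankClient` signature in cross_encoder.py may differ. Only the request and response shapes are taken from the comment above.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical transport: (url, headers, json_body) -> decoded JSON response.
PostFn = Callable[[str, dict, dict], dict]


@dataclass
class CohereCompatibleClientStub:
    """Stand-in for the shared helper; NOT the real _CohereCompatibleRerankClient."""

    url: str
    api_key: str
    model: str
    post: PostFn

    def rerank_pairs(self, pairs: list) -> list:
        # Group document positions by query so each distinct query costs one request.
        groups: dict = {}
        for pos, (query, _doc) in enumerate(pairs):
            groups.setdefault(query, []).append(pos)

        scores = [0.0] * len(pairs)
        headers = {"Authorization": f"Bearer {self.api_key}"}
        for query, positions in groups.items():
            docs = [pairs[p][1] for p in positions]
            body = {"model": self.model, "query": query,
                    "documents": docs, "top_n": len(docs)}
            data = self.post(self.url, headers, body)
            # Each result's index points into `docs`; map it back to its pair position.
            for item in data["results"]:
                scores[positions[item["index"]]] = float(item["relevance_score"])
        return scores


class AlibabaCrossEncoderSketch:
    """Provider that composes the shared client instead of re-implementing it."""

    def __init__(self, api_key: str, post: PostFn, model: str = "qwen3-rerank",
                 url: str = "https://example.invalid/rerank"):  # placeholder URL
        self._client = CohereCompatibleClientStub(
            url=url, api_key=api_key, model=model, post=post)

    def predict(self, pairs: list) -> list:
        # All wire-format handling lives in the shared client.
        return self._client.rerank_pairs(pairs)
```

With this shape, the provider class carries only its own defaults (model name, URL, credentials), which is the duplication the review asks to remove.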

2. Drop _COMPATIBLE_MODELS

Hardcoding frozenset({"qwen3-rerank"}) means any new variant (qwen3-rerank-plus, future qwen3-reranker-*, etc.) silently routes to the native endpoint and fails with a shape mismatch. Pick one:

  • Only support the compatible endpoint and remove the native path entirely (simplest — qwen3-rerank is the headline model here).
  • Or expose endpoint selection as an explicit config flag rather than inferring it from the model name.
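
A minimal sketch of the second option, assuming the endpoint is chosen by an explicit setting rather than by model name. `RerankEndpoint`, `AlibabaRerankerConfigSketch`, and `parse_endpoint` are hypothetical names, and defaulting to the compatible endpoint is an assumption of this sketch, not something the PR specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RerankEndpoint(str, Enum):
    """Hypothetical explicit endpoint choice."""

    COMPATIBLE = "compatible"
    NATIVE = "native"


@dataclass(frozen=True)
class AlibabaRerankerConfigSketch:
    # The model name no longer influences routing; any variant uses `endpoint`.
    model: str = "qwen3-rerank"
    endpoint: RerankEndpoint = RerankEndpoint.COMPATIBLE


def parse_endpoint(raw: Optional[str]) -> RerankEndpoint:
    """Parse a raw config value, failing loudly on unknown input."""
    if raw is None or raw.strip() == "":
        return RerankEndpoint.COMPATIBLE  # assumed default
    try:
        return RerankEndpoint(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(e.value for e in RerankEndpoint)
        raise ValueError(
            f"unknown rerank endpoint {raw!r}; expected one of: {allowed}"
        ) from None
```

Because an unrecognized value raises at configuration time, a new model variant cannot silently land on the wrong endpoint and fail later with a shape mismatch.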

3. Documentation is missing

  • hindsight-docs/docs/developer/configuration.md:556 — add alibaba to the HINDSIGHT_API_RERANKER_PROVIDER value list, plus rows for HINDSIGHT_API_RERANKER_ALIBABA_API_KEY / _MODEL next to the SiliconFlow block (~line 583).
  • hindsight-docs/docs/developer/models.mdx:475 — provider table entry and an example env-var block (mirror the SiliconFlow one at line 553).

Also: the class docstring says auth comes from DASHSCOPE_API_KEY or HINDSIGHT_API_RERANKER_ALIBABA_API_KEY, but from_env only reads the latter — either drop the claim or add the fallback (Cohere does it at config.py:1581).
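
If the fallback route is taken, the lookup could be sketched as follows. The two environment variable names come from the comment above; the function name `resolve_alibaba_api_key`, the precedence order (provider-specific variable first), and the error message are assumptions of this sketch and are not copied from config.py.

```python
import os
from typing import Mapping

PRIMARY_KEY_VAR = "HINDSIGHT_API_RERANKER_ALIBABA_API_KEY"
FALLBACK_KEY_VAR = "DASHSCOPE_API_KEY"


def resolve_alibaba_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the first non-blank key, preferring the provider-specific variable."""
    for name in (PRIMARY_KEY_VAR, FALLBACK_KEY_VAR):
        value = env.get(name, "").strip()
        if value:
            return value
    raise ValueError(
        f"Alibaba reranker API key not set; define {PRIMARY_KEY_VAR} "
        f"or {FALLBACK_KEY_VAR}"
    )
```

Passing the environment as a mapping keeps the function testable without mutating `os.environ`, for example `resolve_alibaba_api_key({"DASHSCOPE_API_KEY": "sk-test"})`.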

@quicklyfast
Contributor Author

@nicoloboschi Please review it.

Collaborator

@nicoloboschi left a comment

LGTM

@nicoloboschi nicoloboschi merged commit b83bb87 into vectorize-io:main May 25, 2026
r0gig0r added a commit to r0gig0r/hindsight that referenced this pull request May 26, 2026
Notable upstream additions pulled in:
- feat(api): clear endpoint for mental model content (vectorize-io#1706)
- feat(api): per-operation LLM concurrency caps (vectorize-io#1738)
- feat(typescript-client): concrete generated types (replace Promise<any>)
- feat(reranker): Alibaba Qwen3-Rerank support (vectorize-io#1501)
- feat: opencode-go LLM provider (vectorize-io#1652)
- feat(extensions): OperationValidator.precheck pre-body-parse hook (vectorize-io#1548)
- feat(right-agent): new Right Agent integration (vectorize-io#1599)
- fix(ollama): ollama-cloud provider + native API auth (vectorize-io#1734)
- fix(reflect): hide disabled tools from agent system prompt (vectorize-io#1740)
- fix(retain): split oversized single items in batch retain (vectorize-io#1736)
- fix: escape literal braces in user-supplied prompt fields (vectorize-io#1728)
- fix(mental-models): full refresh pending delta baselines (vectorize-io#1684)
- fix(api): lazy load reflect tiktoken encoding (vectorize-io#1654)
- fix(api): reject blank retain content (vectorize-io#1685)
- fix(api): auto-refresh openai-codex OAuth access_token (vectorize-io#1637)
- fix(api): gzip middleware for graph payloads (vectorize-io#1731)
- fix(reranker): detect pre-normalized scores; rank-based fallback (vectorize-io#1512)

Conflicts: only package-lock.json files (took upstream, npm install verified)

Fork customizations verified intact (all 14 checks):
- duplicate_checker_fn streaming Phase 1.5 in orchestrator
- FallbackLLMProvider + CircuitBreaker (fallback_llm.py)
- Single-fact consolidation mode (is_fallback_active routing)
- recallExp + Jaccard dedup + compact memory formatter (plugin)
- Codex 5.1-codex-mini reasoning guard
- Infinity reranker /models fallback in cross_encoder.py
- diversity.py + deduplication.py fork-only modules retained

Tests:
- openclaw vitest: 267/267 pass
- ruff: clean
- tsc --noEmit: clean
- pytest: pre-existing env-config flakes (need HINDSIGHT_API_LLM_API_KEY);
  upstream commit 90cb145 acknowledged as pre-existing CI flakes

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 participants