
feat: Try with reranker #19

Open
GoodbyePlanet wants to merge 1 commit into main from feat/reranking-eval-harness

Conversation


@GoodbyePlanet GoodbyePlanet commented May 2, 2026

What was built:

  • eval/ — a new evaluation framework with a query dataset (14 queries across Go/HTML/CSS), a runner that hits the search API, and metrics (Recall@K, MRR)
  • server/rerank/ — pluggable reranker abstraction with a tei_reranker.py implementation (calls a Text Embeddings Inference service) and a noop.py passthrough
  • Docker Compose update to wire in the reranker service
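The two metrics used by the eval runner are standard; a minimal sketch of how they might be computed (function names and signatures are assumptions, not the actual `eval/` code):

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of relevant chunks that appear in the top-k results."""
    top_k = set(ranked_ids[:k])
    hits = sum(1 for r in relevant_ids if r in top_k)
    return hits / len(relevant_ids)

def mrr(queries):
    """Mean Reciprocal Rank over (ranked_ids, relevant_ids) pairs:
    1/rank of the first relevant result, averaged across queries."""
    total = 0.0
    for ranked_ids, relevant_ids in queries:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(queries)
```

Recall@5 = 100% then simply means every query's correct chunk had a reciprocal-rank hit within the first five results.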

The bi-encoder (jina-embeddings-v2-base-code) already achieves Recall@5 = 100% on this codebase, meaning the correct result is always in the top 5 without any reranking. Adding a cross-encoder on top made things worse.
The root cause: both rerankers are trained on natural language pairs, not code. They don't understand code semantics well enough to improve on a purpose-built code embedding model.
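The pluggable abstraction described above might look roughly like this (a sketch only; the class names, the default URL, and the exact TEI response handling are assumptions, not the actual `server/rerank/` code — TEI's `/rerank` endpoint takes a query plus candidate texts and returns per-candidate scores):

```python
import json
import urllib.request
from abc import ABC, abstractmethod

class Reranker(ABC):
    @abstractmethod
    def rerank(self, query: str, candidates: list[str]) -> list[str]:
        """Return candidates reordered by relevance to the query."""

class NoopReranker(Reranker):
    """Passthrough: keeps the bi-encoder's original order."""
    def rerank(self, query, candidates):
        return list(candidates)

class TEIReranker(Reranker):
    """Scores candidates via a Text Embeddings Inference /rerank endpoint."""
    def __init__(self, url="http://localhost:8080/rerank"):  # assumed URL
        self.url = url

    def rerank(self, query, candidates):
        payload = json.dumps({"query": query, "texts": candidates}).encode()
        req = urllib.request.Request(
            self.url, data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            scores = json.load(resp)  # e.g. [{"index": 0, "score": 0.9}, ...]
        order = sorted(scores, key=lambda s: s["score"], reverse=True)
        return [candidates[s["index"]] for s in order]
```

Swapping `TEIReranker` for `NoopReranker` is what makes the comparison in the eval harness a one-line change.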

Reranking would be worth revisiting if:

  • The codebase grows significantly and Recall@5 starts dropping
  • You collect real usage logs and fine-tune a cross-encoder on your own (query, correct chunk) pairs
  • You're willing to use an LLM (e.g. Claude) to rerank the top-5 candidates — expensive per query but genuinely code-aware
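The third option could be sketched with the model call stubbed out (the prompt format and the `pick_best` callback are hypothetical; in practice it would wrap a real LLM API call):

```python
def llm_rerank(query: str, candidates: list[str], pick_best) -> list[str]:
    """Rerank by asking an LLM which candidate best answers the query.

    `pick_best(prompt)` is a stand-in for an LLM call that returns the
    1-based index of the best candidate. Only the winner is promoted;
    the bi-encoder's order is kept for the rest.
    """
    prompt = "\n".join(
        [f"Query: {query}",
         "Which numbered code chunk best matches the query?"]
        + [f"{i}. {c}" for i, c in enumerate(candidates, start=1)]
    )
    best = pick_best(prompt) - 1
    return [candidates[best]] + [c for i, c in enumerate(candidates) if i != best]
```

One LLM call per query over only the top-5 candidates keeps the cost bounded, which is what makes this expensive-but-code-aware trade-off plausible.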

For now, the bi-encoder alone is the right choice.

NOTE: This will not be merged; it's just here for reference.

