
feat: Try with reranker #19

Open
GoodbyePlanet wants to merge 1 commit into main from feat/reranking-eval-harness

Conversation


@GoodbyePlanet GoodbyePlanet commented May 2, 2026

What was built:

  • eval/ — a new evaluation framework with a query dataset (14 queries across Go/HTML/CSS), a runner that hits the search API, and metrics (Recall@K, MRR)
  • server/rerank/ — pluggable reranker abstraction with a tei_reranker.py implementation (calls a Text Embeddings Inference service) and a noop.py passthrough
  • Docker Compose update to wire in the reranker service
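The two metrics used by the eval runner are standard; a minimal sketch of how they might be computed (function names and signatures are assumptions, not the actual `eval/` code):

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of relevant chunks that appear in the top-k results."""
    top_k = set(ranked_ids[:k])
    hits = sum(1 for r in relevant_ids if r in top_k)
    return hits / len(relevant_ids)

def mrr(queries):
    """Mean Reciprocal Rank over (ranked_ids, relevant_ids) pairs:
    1/rank of the first relevant result, averaged across queries."""
    total = 0.0
    for ranked_ids, relevant_ids in queries:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(queries)
```

Recall@5 = 100% then simply means every query's correct chunk had a reciprocal-rank hit within the first five results.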

The bi-encoder (jina-embeddings-v2-base-code) already achieves Recall@5 = 100% on this codebase, meaning the correct result is always in the top 5 without any reranking. Adding a cross-encoder on top made things worse.
The root cause: both rerankers are trained on natural language pairs, not code. They don't understand code semantics well enough to improve on a purpose-built code embedding model.
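The pluggable abstraction described above might look roughly like this (a sketch only; the class names, the default URL, and the exact TEI response handling are assumptions, not the actual `server/rerank/` code — TEI's `/rerank` endpoint takes a query plus candidate texts and returns per-candidate scores):

```python
import json
import urllib.request
from abc import ABC, abstractmethod

class Reranker(ABC):
    @abstractmethod
    def rerank(self, query: str, candidates: list[str]) -> list[str]:
        """Return candidates reordered by relevance to the query."""

class NoopReranker(Reranker):
    """Passthrough: keeps the bi-encoder's original order."""
    def rerank(self, query, candidates):
        return list(candidates)

class TEIReranker(Reranker):
    """Scores candidates via a Text Embeddings Inference /rerank endpoint."""
    def __init__(self, url="http://localhost:8080/rerank"):  # assumed URL
        self.url = url

    def rerank(self, query, candidates):
        payload = json.dumps({"query": query, "texts": candidates}).encode()
        req = urllib.request.Request(
            self.url, data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            scores = json.load(resp)  # e.g. [{"index": 0, "score": 0.9}, ...]
        order = sorted(scores, key=lambda s: s["score"], reverse=True)
        return [candidates[s["index"]] for s in order]
```

Swapping `TEIReranker` for `NoopReranker` is what makes the comparison in the eval harness a one-line change.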

Reranking would be worth revisiting if:

  • The codebase grows significantly and Recall@5 starts dropping
  • You collect real usage logs and fine-tune a cross-encoder on your own (query, correct chunk) pairs
  • You're willing to use an LLM (e.g. Claude) to rerank the top-5 candidates — expensive per query but genuinely code-aware
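The third option could be sketched with the model call stubbed out (the prompt format and the `pick_best` callback are hypothetical; in practice it would wrap a real LLM API call):

```python
def llm_rerank(query: str, candidates: list[str], pick_best) -> list[str]:
    """Rerank by asking an LLM which candidate best answers the query.

    `pick_best(prompt)` is a stand-in for an LLM call that returns the
    1-based index of the best candidate. Only the winner is promoted;
    the bi-encoder's order is kept for the rest.
    """
    prompt = "\n".join(
        [f"Query: {query}",
         "Which numbered code chunk best matches the query?"]
        + [f"{i}. {c}" for i, c in enumerate(candidates, start=1)]
    )
    best = pick_best(prompt) - 1
    return [candidates[best]] + [c for i, c in enumerate(candidates) if i != best]
```

One LLM call per query over only the top-5 candidates keeps the cost bounded, which is what makes this expensive-but-code-aware trade-off plausible.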

For now, the bi-encoder alone is the right choice.

NOTE: This will not be merged; it's just here for reference.

