Added reranker api#36

Merged
SearchSavior merged 6 commits into SearchSavior:1.0.6 from mwrothbe:1.0.6
Oct 22, 2025

Conversation

@mwrothbe
Contributor

@mwrothbe mwrothbe commented Oct 13, 2025

Adds a '/v1/rerank' reranking service. The API can be used in RAG flows to refine document retrieval ahead of the LLM. Tested with Qwen3-Reranker, but it should support other models that can be used with optimum. A few things to note:

  • The API accepts an optional PreTrainedTokenizerConfig input. Initially I thought this would provide maximum flexibility for other models that might require different config options. As it turned out, the tokenization is done in two stages with differing config options, so the options are currently hard-coded. If the hard-coded options are universal across models, we can remove the PreTrainedTokenizerConfig input, but I left it in for now in case it ends up being needed for more flexibility.
  • The API accepts optional 'prefix' and 'suffix' inputs, which I believe are model-specific instructions. The default strings (set by RerankerConfig in models\optimum.py) are for the Qwen3-Reranker model. If you don't want that as the default assumption, they could default to empty strings, with the parameters made required for the user.
  • The 'task' parameter is also optional, but the default is generic enough that it should work with most models.
  • Metric collection is not yet implemented.
  • This PR includes a CLI option to load all models in the config.
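For context, here is a minimal client-side sketch of what a request body for the new endpoint might look like. The field names (query, documents, task, prefix, suffix) and the helper function are assumptions drawn from the description above, not the actual schema:

```python
import json

def build_rerank_request(query, documents, task=None, prefix=None, suffix=None):
    """Assemble a hypothetical /v1/rerank request body.

    Optional fields are omitted when unset so the service can fall back
    to its model-specific defaults (e.g. the Qwen3-Reranker strings).
    """
    body = {"query": query, "documents": documents}
    if task is not None:
        body["task"] = task
    if prefix is not None:
        body["prefix"] = prefix
    if suffix is not None:
        body["suffix"] = suffix
    return body

payload = build_rerank_request(
    "What does the reranker endpoint do?",
    ["A passage about CPUs.", "A passage about reranking in RAG."],
)
print(json.dumps(payload))
```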

@SearchSavior
Owner

@mwrothbe

Found an issue with some discussion of how tokenizers are implemented in transformers:

huggingface/transformers#31375

Zooming out, the inheritance pattern extends into model definitions as well, which explains how some optimum model PRs can be so small:

huggingface/optimum-intel#1401

I don't think it's feasible to scope out model requirements and build them into the codebase before we have support targets using your PreTrainedTokenizerConfig approach.

Instead, I think we should just add models as we go and take care of flexibility once it becomes a problem, if it ever does. Notes in this review will reflect this.

Maybe using the Tokenizers Rust bindings directly could be a good approach?

https://github.com/huggingface/tokenizers/tree/main/bindings/python/py_src/tokenizers

Please open a discussion thread on this and give some motivation for your approach. In the meantime, AutoTokenizer takes many keyword arguments that act as overrides; maybe explore those.
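As a sketch of that suggestion: the overrides could live in a small config object whose fields are forwarded as keyword arguments to AutoTokenizer.from_pretrained. The RerankTokenizerOverrides name and the default values here are hypothetical, though padding_side, truncation_side, and model_max_length are real tokenizer arguments in transformers:

```python
from dataclasses import dataclass, asdict

@dataclass
class RerankTokenizerOverrides:
    """Hypothetical subset of tokenizer overrides; the actual RerankerConfig
    in this PR may hold different fields and defaults."""
    padding_side: str = "left"
    truncation_side: str = "left"
    model_max_length: int = 8192

    def kwargs(self):
        # Forward the fields as plain keyword arguments.
        return asdict(self)

# Usage (not executed here; requires transformers and model weights):
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-Reranker-0.6B",
#                                       **RerankTokenizerOverrides().kwargs())
print(RerankTokenizerOverrides().kwargs())
```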

@SearchSavior SearchSavior self-assigned this Oct 20, 2025
@SearchSavior
Owner

@mwrothbe Ok, to get this merged before the release target later this week, please:

  • use your ReRankConfig with AutoTokenizer for now. Maybe revise emb while you are in there. Target Qwen embedding and rerank for simplicity, at least for now.
  • for ease, don't commit the CLI file, and the merge should be clean.
  • came up with a solution for queuing model loads; will add notes to Auto load models at start #34 and merge tonight.

Great work on a cool feature, appreciate your time!! If you won't be able to make changes this week, let me know and I can make them so we can push 2.0 with rerank.

@mwrothbe
Contributor Author

Just looking at this now. Trying to figure out how to undo the CLI commit...

@mwrothbe
Contributor Author

Alright, well, I couldn't figure out how to undo the CLI commit, so I just added a new commit to my fork that reverted the original CLI code. Looking at the PR code changes, that seems to have the same effect. Hope this works OK for you; if not, let me know.

Copied and pasted the CLI from main 1.0.6 into my fork. Essentially, I've used a screwdriver as a chisel.

@SearchSavior
Owner

Ok, I'm making some quick changes after resolving conflicts.

@SearchSavior
Owner

@mwrothbe what is --task for rerank?

@mwrothbe
Contributor Author

> @mwrothbe what is --task for rerank?

Ah. It's an optional input that forms part of the prompt instruction given to the model. The 'task' is handed to the format_instruction function in optimum_rr.py (line 37) as 'instruction'. I probably should have kept the naming consistent there, but at the time it made sense to give the interface a more descriptive name (even if it isn't). The default is "Given a search query, retrieve relevant passages that answer the query", and the idea behind making this an input is that you might want to tweak the instruction for performance or for model-specific requirements.
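To make the flow concrete, here is a minimal sketch of how 'task' might feed the prompt. The template mimics the Qwen3-Reranker instruction format and may differ from the actual format_instruction in optimum_rr.py:

```python
# Default matches the 'task' default described above.
DEFAULT_TASK = ("Given a search query, retrieve relevant passages "
                "that answer the query")

def format_instruction(instruction, query, document):
    """Build a reranker prompt; 'instruction' is the API's 'task' input."""
    if instruction is None:
        instruction = DEFAULT_TASK
    return (f"<Instruct>: {instruction}\n"
            f"<Query>: {query}\n"
            f"<Document>: {document}")

print(format_instruction(None, "what is RAG?",
                         "RAG combines retrieval with generation."))
```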

@SearchSavior SearchSavior mentioned this pull request Oct 22, 2025
SearchSavior added a commit that referenced this pull request Oct 22, 2025
@SearchSavior SearchSavior merged commit 0839688 into SearchSavior:1.0.6 Oct 22, 2025