
Conversation

tastelikefeet
Collaborator

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Describe the details of this PR.

Experiment results

Paste your experiment results here (if needed).

gemini-code-assist
Contributor

Summary of Changes

Hello @tastelikefeet, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the system's capability to perform reranker inference by integrating support across both PyTorch and vLLM inference engines. It introduces a new argument for controlling activation post-scoring, adds specific model and template support for BGE rerankers, and refines data handling for reranker tasks, thereby broadening the range of supported models and improving flexibility for sequence classification and scoring applications.

Highlights

  • Reranker Activation Control: Introduced a new command-line argument reranker_use_activation to control whether a sigmoid activation function is applied after the reranker score calculation during inference. This provides flexibility for different reranker models.
  • PyTorch Engine Reranker Support: The PyTorch inference engine (PtEngine) now fully supports reranker models, including handling reranker and generative_reranker task types and applying the specified activation.
  • vLLM Engine Reranker Integration: Extended the vLLM inference engine (VllmEngine) to support reranker models. This includes passing relevant task types ('classify', 'score') and activation parameters to vLLM's PoolingParams for efficient reranker inference.
  • BGE Reranker Model and Template: Added bge_reranker as a new LLMModelType and a corresponding template, enabling direct support for BAAI's BGE reranker models within the system.
  • Qwen Reranker vLLM Overrides: Updated the Qwen template to provide specific hf_overrides for vLLM when using Qwen reranker models, ensuring correct architecture and classifier token handling.
  • Unified Reranker Data Handling: Modified the _reranker_encode and _reranker_data_collator methods in the base template to differentiate between training and inference modes, optimizing data preparation for reranker tasks.
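The activation behavior described in the first highlight can be sketched in plain Python. The flag name `reranker_use_activation` comes from the PR description; the helper function itself is a hypothetical illustration, not the PR's actual implementation:

```python
import math

def postprocess_reranker_scores(raw_scores, use_activation=True):
    """Optionally squash raw reranker logits into (0, 1) with a sigmoid.

    Mirrors the `reranker_use_activation` flag described in the PR:
    when True, a sigmoid is applied after the score calculation;
    when False, the raw logits are returned unchanged (useful for
    models whose scores are already calibrated or compared only by rank).
    """
    if not use_activation:
        return list(raw_scores)
    return [1.0 / (1.0 + math.exp(-s)) for s in raw_scores]

# A logit of 0.0 maps to 0.5; large positive logits approach 1.0.
print(postprocess_reranker_scores([0.0, 4.0], use_activation=True))
```

Note that the sigmoid is monotonic, so toggling the flag never changes the relative ranking of documents, only the scale of the scores.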
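The training/inference split mentioned in the last highlight for `_reranker_encode` might look like the following simplified sketch. The field names (`query`, `positive`, `negative`, `documents`) and the mode handling are assumptions chosen for illustration, not the template's actual schema:

```python
def reranker_encode(example, mode="train"):
    """Toy sketch of mode-dependent reranker data preparation.

    - train: pair the query with a positive and a negative document and
      attach binary labels for a classification-style loss.
    - infer: pair the query with each candidate document; no labels are
      needed, since the engine only has to produce scores.
    """
    query = example["query"]
    if mode == "train":
        pairs = [(query, example["positive"]), (query, example["negative"])]
        return {"pairs": pairs, "labels": [1, 0]}
    # Inference: score every candidate document against the query.
    pairs = [(query, doc) for doc in example["documents"]]
    return {"pairs": pairs}
```

Separating the two modes keeps the data collator free of label handling at inference time, which is what allows the same template to serve both training and the new inference path.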


@gemini-code-assist bot left a comment


Code Review

This pull request adds support for reranker inference to both the PyTorch and vLLM backends. This includes new command-line arguments, model types (bge_reranker), and templates. The changes are quite extensive and touch upon inference engines, model registration, and data processing logic.

My review focuses on a few key areas:

  • Configuration Management: I've pointed out several places where os.environ.get is used for configuration. This is not a good practice as it relies on implicit global state. I've recommended moving these to more explicit configuration mechanisms.
  • Correctness: I found a potential batch processing bug in the PyTorch engine for generative_reranker models and suggested a fix.
  • Maintainability: I've suggested refactoring some repetitive code blocks in the vLLM engine and argument parsing to improve code clarity and reduce duplication.
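The configuration point raised above — replacing `os.environ.get` lookups with explicit arguments — can be illustrated with a minimal before/after sketch. The names used here (`RERANKER_USE_ACTIVATION`, `RerankerConfig`) are hypothetical, not taken from the PR:

```python
import os
from dataclasses import dataclass

# Before: implicit global state. Callers cannot see this dependency,
# and tests must mutate the process environment to exercise both paths.
def build_engine_implicit():
    use_activation = os.environ.get("RERANKER_USE_ACTIVATION", "1") == "1"
    return {"use_activation": use_activation}

# After: the option is an explicit, discoverable parameter with a default,
# so it shows up in signatures, docs, and argument parsing.
@dataclass
class RerankerConfig:
    use_activation: bool = True

def build_engine_explicit(config: RerankerConfig):
    return {"use_activation": config.use_activation}

print(build_engine_explicit(RerankerConfig(use_activation=False)))
```

The explicit variant also composes naturally with a command-line flag such as the PR's `reranker_use_activation`, since the parsed value can be passed straight into the config object.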

Overall, this is a solid contribution that significantly expands the framework's capabilities. The suggested changes should improve the robustness and maintainability of the new functionality.

@Jintao-Huang
Collaborator

/gemini review

@tastelikefeet tastelikefeet merged commit 8887789 into modelscope:main Sep 20, 2025
2 checks passed
