Encoder model support for the Transformers backend #25174

hmellor · 2025-09-18T14:06:10Z

Adds support for encoder models to the Transformers backend.

Depends on

I have gated this feature on the Transformers version. As soon as the following dependency is merged, we can merge this PR, and users installing vLLM and Transformers from main can use the feature:

🔴[Attention] Bert-based Models Attention Refactor huggingface/transformers#38301 - so that encoder models use the ALL_ATTENTION_FUNCTIONS interface necessary for the Transformers backend
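A minimal sketch of what gating on the Transformers version could look like. The threshold below is a hypothetical placeholder, since the PR text does not say which release will contain huggingface/transformers#38301, and the helper names are illustrative rather than taken from vLLM. The sketch deliberately ignores pre-release suffixes so that a development build installed from main passes the gate.

```python
import re

# Hypothetical threshold: the PR does not state the actual minimum version.
MIN_TRANSFORMERS_FOR_ENCODERS = (4, 57, 0)


def release_tuple(version: str) -> tuple[int, ...]:
    """Parse the leading numeric release, e.g. '4.57.0.dev0' -> (4, 57, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    # Pad short versions such as '4.57' so tuple comparison behaves as expected.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def encoder_support_available(installed: str) -> bool:
    """Return True if the installed version meets the (assumed) threshold."""
    return release_tuple(installed) >= MIN_TRANSFORMERS_FOR_ENCODERS
```

In real code, `packaging.version.Version` is the more robust choice for comparing versions. The stdlib-only parser here just keeps the sketch self-contained.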

Changes

Use EncoderOnlyAttention if an encoder model is detected
Skip position_ids buffers if they're found in the checkpoint because vLLM always passes position_ids anyway
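The second change can be illustrated with a small filter over checkpoint entries. This is a sketch under assumptions, not the code from this PR: the function name and the `(name, tensor)` pair format are hypothetical, and the match rule (a name equal to or ending in `.position_ids`) is one plausible way to identify the buffer.

```python
from typing import Any, Iterable, Iterator


def skip_position_ids(
    weights: Iterable[tuple[str, Any]],
) -> Iterator[tuple[str, Any]]:
    """Yield checkpoint entries, dropping stored position_ids buffers.

    The buffers are redundant when the caller always supplies position_ids
    at forward time, which is the reason given in the PR description.
    """
    for name, tensor in weights:
        if name == "position_ids" or name.endswith(".position_ids"):
            continue  # buffer from the checkpoint, not a learned weight
        yield name, tensor
```

Matching on the full final name component keeps learned parameters such as `position_embeddings.weight` in the stream.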

Testing

python examples/offline_inference/basic/embed.py --model BAAI/bge-base-en-v1.5 --model-impl transformers

pytest tests/models/test_transformers.py -k test_embed_correctness

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

gemini-code-assist

Code Review

This pull request adds support for encoder models to the Transformers backend. The main changes include detecting encoder models by checking for is_causal=False and using EncoderOnlyAttention accordingly. It also adds logic to skip loading position_ids from checkpoints for encoder models, as vLLM handles this. The changes look good, but I have a critical suggestion to improve the type hint for create_attention_instances to reflect that it can return EncoderOnlyAttention instances, which will improve code clarity and maintainability.
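To make the review point concrete, here is a rough sketch of detection via `is_causal=False` together with a return type that admits both attention classes. The stub classes, the `build_attention_instances` name, and the assumption that the flag lives on a config object and defaults to causal when absent are all hypothetical. vLLM's real `Attention` and `EncoderOnlyAttention` classes take different arguments.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Union


@dataclass
class DecoderAttentionStub:
    """Hypothetical stand-in for a causal attention layer."""
    layer_idx: int


@dataclass
class EncoderOnlyAttentionStub:
    """Hypothetical stand-in for a bidirectional (encoder-only) attention layer."""
    layer_idx: int


# The union makes explicit that either class may be returned.
AttentionLike = Union[DecoderAttentionStub, EncoderOnlyAttentionStub]


def is_encoder_model(config: Any) -> bool:
    # Assumption: a missing flag means the model is causal.
    return getattr(config, "is_causal", True) is False


def build_attention_instances(
    config: Any, num_layers: int
) -> dict[int, AttentionLike]:
    """Create one attention instance per layer, keyed by layer index."""
    cls = EncoderOnlyAttentionStub if is_encoder_model(config) else DecoderAttentionStub
    return {idx: cls(layer_idx=idx) for idx in range(num_layers)}


encoder_layers = build_attention_instances(SimpleNamespace(is_causal=False), 2)
decoder_layers = build_attention_instances(SimpleNamespace(), 2)
```

Later commits in this PR ("Use attn_type argument for Attention") suggest the final code selects the behaviour through an argument rather than a separate class, so treat the two-class layout as illustrative only.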

vllm/model_executor/models/transformers.py


Isotr0py

Just a nit. Otherwise LGTM!

vllm/model_executor/models/transformers.py


hmellor added 2 commits September 18, 2025 16:00

Ignore position_ids in encoder model checkpoints

a214aa9

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Use EncoderOnlyAttention for encoder models

b240c6a

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

hmellor marked this pull request as draft September 18, 2025 14:06

gemini-code-assist bot reviewed Sep 18, 2025

vllm/model_executor/models/transformers.py (outdated, resolved)

hmellor added 4 commits September 18, 2025 16:48

Add test

60bd49f

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Guard feature on new Transformers version

93dff01

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Review comment

e2d605e

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Update doc

d6945db

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

hmellor requested a review from Isotr0py September 18, 2025 15:05

hmellor marked this pull request as ready for review September 18, 2025 15:05

hmellor requested review from DarkLight1337 and ywang96 as code owners September 18, 2025 15:05

mergify bot added the "documentation" label (Improvements or additions to documentation) Sep 18, 2025

per-commit

b2a2c14

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Isotr0py approved these changes Sep 18, 2025

vllm/model_executor/models/transformers.py (outdated, resolved)

vllm/model_executor/models/transformers.py (outdated, resolved)

hmellor added 2 commits September 18, 2025 18:58

Use attn_type argument for Attention

a2b14c4

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Update doc on how what's needed for encoder-model support

8098c12

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

hmellor requested review from WoosukKwon, zhuohan123, youkaichao, alexm-redhat, comaniac and njhill as code owners September 18, 2025 17:05

Merge branch 'main' into transformers-backend-encoders

47c138c

Isotr0py added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Sep 18, 2025

hmellor added 3 commits September 19, 2025 15:51

Move attn_type check to TransformersModel

5f3d1d1

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Merge branch 'main' into transformers-backend-encoders

1c1c056

Skip test if correct transformers version not installed

b629a29

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

hmellor merged commit 12aed7e into vllm-project:main Sep 19, 2025
50 checks passed

hmellor deleted the transformers-backend-encoders branch September 19, 2025 18:15

debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025

Encoder model support for the Transformers backend (vllm-project#25174)

2a2aca9

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

hmellor added this to Transformers backend Sep 24, 2025

hmellor moved this to Done in Transformers backend Sep 24, 2025

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

Encoder model support for the Transformers backend (vllm-project#25174)

6f4f45b

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025

Encoder model support for the Transformers backend (vllm-project#25174)

bd54073

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: charlifu <charlifu@amd.com>
