Add apply_chat_template to HF vllm Ray deployment#581
Merged
Conversation
Abhishree (Contributor, Author) commented:
Signed-off-by: Abhishree <abhishreetm@gmail.com>
/ok to test 1c76096
Abhishree (Contributor, Author) commented:
Signed-off-by: Abhishree <abhishreetm@gmail.com>
/ok to test db80f60
oyilmaz-nvidia approved these changes on Feb 4, 2026.

Abhishree (Contributor, Author) commented:
/ok to test b1876c3
ko3n1g pushed a commit that referenced this pull request on Feb 4, 2026.
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
#575 missed adding the apply_chat_template step explicitly to ray_infer_fn in the vLLMExporter class, which led to incorrect evaluation accuracy on chat benchmarks for HF deployments served with Ray via the vLLM backend. This PR fixes that.
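For illustration, here is a minimal sketch of the kind of fix described: formatting chat messages with the HF tokenizer's chat template before handing the prompt to vLLM. This is not the exact NeMo ray_infer_fn or vLLMExporter code; the function name, model name, and sampling parameters are assumptions for the example.

```python
# Sketch only: apply the HF chat template before vLLM generation.
# Model name, ray_infer_fn signature, and sampling params are illustrative.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
llm = LLM(model=MODEL)

def ray_infer_fn(messages: list[dict]) -> str:
    # Without this step, raw message dicts reach the model unformatted,
    # which is the source of the incorrect chat-benchmark accuracy.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    outputs = llm.generate([prompt], SamplingParams(max_tokens=256))
    return outputs[0].outputs[0].text

print(ray_infer_fn([{"role": "user", "content": "What is 2 + 2?"}]))
```

The key point is that the chat template must be applied explicitly in the inference path; vLLM's generate call does not apply it for you when given a plain string prompt.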