Skip to content

Add apply_chat_template to HF vllm Ray deployment#581

Merged
athitten merged 3 commits intomainfrom
athitten/fix_hf_vllm_chat
Feb 4, 2026
Merged

Add apply_chat_template to HF vllm Ray deployment#581
athitten merged 3 commits intomainfrom
athitten/fix_hf_vllm_chat

Conversation

@athitten
Copy link
Contributor

@athitten athitten commented Feb 4, 2026

#575 missed to add apply_chat_template functionality explicitly to ray_infer_fn in vLLMExporter class which led to incorrect eval accuracy on chat benchmarks for HF deployment with Ray via vllm backend. This PR fixes it.

Signed-off-by: Abhishree <abhishreetm@gmail.com>
@copy-pr-bot
Copy link

copy-pr-bot bot commented Feb 4, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@athitten
Copy link
Contributor Author

athitten commented Feb 4, 2026

/ok to test 1c76096

@athitten athitten added the r0.4.0 Cherry-pick PR to r0.4.0 release branch label Feb 4, 2026
Signed-off-by: Abhishree <abhishreetm@gmail.com>
@github-actions github-actions bot added the deploy label Feb 4, 2026
@athitten
Copy link
Contributor Author

athitten commented Feb 4, 2026

/ok to test db80f60

@athitten
Copy link
Contributor Author

athitten commented Feb 4, 2026

/ok to test b1876c3

@athitten athitten merged commit 68c46b2 into main Feb 4, 2026
26 checks passed
@athitten athitten deleted the athitten/fix_hf_vllm_chat branch February 4, 2026 23:14
ko3n1g pushed a commit that referenced this pull request Feb 4, 2026
Signed-off-by: Abhishree <abhishreetm@gmail.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deploy export r0.4.0 Cherry-pick PR to r0.4.0 release branch tests vLLM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants