
Conversation

namanlalitnyu
Contributor

@namanlalitnyu namanlalitnyu commented Sep 10, 2025

Changes:

  • Adds the facebook/opt-125m model to the benchmarking tests (serving, latency, and throughput) for both the vLLM and SGLang frameworks.
  • Removes the serving tests for the Qwen model from SGLang benchmarking, as agreed in our discussion.
  • Fixes an issue that was causing the SGLang benchmarks to fail.
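
As a rough illustration of the first two changes, here is a minimal Python sketch of adding a model to every benchmark type and then dropping one model's serving tests. It is not the repository's actual config or code: the field names (`test_name`, `parameters`, `model`), the helpers `add_model` and `remove_model_tests`, and the `Qwen/placeholder-model` entry are all hypothetical.

```python
# Hypothetical sketch: the schema and helper names below are assumptions for
# illustration only, not the benchmark repository's real config format.
TEST_TYPES = ("serving", "latency", "throughput")


def add_model(config, model, short_name):
    """Append one illustrative test entry per benchmark type for `model`."""
    for test_type in TEST_TYPES:
        config.setdefault(test_type, []).append(
            {
                "test_name": f"{test_type}_{short_name}",
                "parameters": {"model": model},
            }
        )
    return config


def remove_model_tests(config, test_type, model_substring):
    """Drop entries of `test_type` whose model name contains `model_substring`."""
    config[test_type] = [
        entry
        for entry in config.get(test_type, [])
        if model_substring not in entry["parameters"]["model"].lower()
    ]
    return config


# Placeholder starting point: a single Qwen serving test (model name is made up).
sglang_config = {
    "serving": [
        {
            "test_name": "serving_qwen",
            "parameters": {"model": "Qwen/placeholder-model"},
        }
    ]
}

sglang_config = add_model(sglang_config, "facebook/opt-125m", "opt125m")
sglang_config = remove_model_tests(sglang_config, "serving", "qwen")
```

After these two calls, the sketch's SGLang config holds one facebook/opt-125m entry for each of the three test types and no Qwen serving entry.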

Testing:

  • GitHub Action for SGLang benchmarking containing the facebook model: link
  • GitHub Action for vLLM benchmarking containing the facebook model: link
  • HUD Dashboard also showing the facebook model benchmarking results: link
[Two screenshots attached, dated 2025-09-10]

@meta-cla meta-cla bot added the cla signed label Sep 10, 2025
@namanlalitnyu
Contributor Author

One of the GitHub Actions for the Qwen model is failing, but it is unrelated to our changes: this PR doesn't touch the vllm-benchmark workflow, and the test fails because the vLLM server does not start. Hence, we are good to merge these changes.

@namanlalitnyu namanlalitnyu merged commit 774075f into main Sep 12, 2025
76 of 80 checks passed

4 participants