[NVIDIA] Set EP_SIZE = 1 for B200 measurements by kaixih · Pull Request #242 · SemiAnalysisAI/InferenceX

kaixih · 2025-11-17T18:55:40Z

This PR disables EP for B200 measurements.

Copilot

Pull Request Overview

This PR disables expert parallelism (EP) for B200 measurements by setting EP_SIZE to 1 across all configurations. The change affects both the FP4 and FP8 test configurations for the B200 platform using sglang.

- Changed EP values from matching TP values (4 or 8) to a uniform value of 1
- Applied consistently across all input/output sequence length combinations

cquil11 · 2025-11-18T04:44:27Z

does this lead to perf improvement?

kaixih · 2025-11-18T17:55:55Z

> does this lead to perf improvement?

I think a previous commit accidentally enabled EP. This PR restores the earlier setting.

functionstackx · 2025-11-18T18:00:08Z

@kaixih wasn't EP always enabled?

Before PR https://github.com/InferenceMAX/InferenceMAX/pull/204, --enable-ep-moe and --enable-flashinfer-trtllm-moe were set.

According to the SGLang docs, the --enable-ep-moe flag has been removed, and the equivalent is to set --ep-size: "Please set --ep-size to the same value as --tp-size instead."
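
The flag migration quoted above can be sketched as a small helper that rewrites a legacy argument list. This is a hypothetical illustration, not code from the repository: the function name `migrate_ep_flag`, the list-of-strings argument format, and the fallback of 1 when `--tp-size` is absent are all assumptions. Only the flag names `--enable-ep-moe`, `--ep-size`, and `--tp-size` come from the thread.

```python
def migrate_ep_flag(args: list[str]) -> list[str]:
    """Replace the removed --enable-ep-moe flag with an explicit --ep-size.

    Hypothetical helper: assumes each flag and its value are separate list
    items (e.g. ["--tp-size", "8"]) and that TP falls back to 1 when
    --tp-size is not given.
    """
    if "--enable-ep-moe" not in args:
        return list(args)  # nothing to migrate

    # Look up the tensor-parallel size so EP can be set to the same value.
    tp_size = "1"
    if "--tp-size" in args:
        idx = args.index("--tp-size")
        if idx + 1 >= len(args):
            raise ValueError("--tp-size given without a value")
        tp_size = args[idx + 1]

    # Drop the legacy flag and add --ep-size unless it is already present.
    migrated = [a for a in args if a != "--enable-ep-moe"]
    if "--ep-size" not in migrated:
        migrated += ["--ep-size", tp_size]
    return migrated
```

For example, `migrate_ep_flag(["--tp-size", "8", "--enable-ep-moe"])` yields an argument list ending in `--ep-size 8`, while a list without the legacy flag is returned unchanged.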

Copilot

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

kaixih · 2025-11-18T19:50:47Z

@functionstackx Right, I re-enabled EP for the FP4 workloads. We noticed that EP had been off only for FP8 and was accidentally enabled by InferenceMAX/InferenceMAX@d8fe8f7.

functionstackx · 2025-11-18T19:58:42Z

> @functionstackx Right, I re-enabled the EP for the fp4 workloads. We noticed that the EP was off only for FP8 and was accidentally enabled from this d8fe8f7.

@kaixih ah gotcha. Previously only fp4 had --enable-ep-moe; fp8 didn't.

I see now
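
The outcome the thread converges on (EP kept for FP4, EP disabled for FP8) can be summarized in a short sketch. The helper name `ep_size_for` and its signature are hypothetical; the repository's actual configuration files are not shown in this conversation, so this only restates the rule as discussed.

```python
def ep_size_for(precision: str, tp_size: int) -> int:
    """Return the EP size the thread settles on for a B200 sglang run.

    Hypothetical helper: FP4 keeps EP equal to TP (it previously ran with
    --enable-ep-moe), while FP8 runs with EP disabled (EP size 1).
    """
    precision = precision.lower()
    if precision == "fp4":
        return tp_size
    if precision == "fp8":
        return 1
    raise ValueError(f"unexpected precision: {precision!r}")


# TP values of 4 and 8 are the ones mentioned in the review summary.
for precision, tp in [("fp4", 4), ("fp4", 8), ("fp8", 8)]:
    print(f"{precision} tp={tp} -> ep={ep_size_for(precision, tp)}")
```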

functionstackx

lgtm

Set ep = 1 for b200

f307282

kaixih requested a review from a team as a code owner November 17, 2025 18:55

Copilot AI review requested due to automatic review settings November 17, 2025 18:55

Copilot AI reviewed Nov 17, 2025

cquil11 added b200_dsr1 and removed b200-trt_dsr1 labels Nov 18, 2025

Re-enable EP for fp4

7dfc7e4

kaixih force-pushed the disable_b200_ep_size branch from 1585a4a to 7dfc7e4 Compare November 18, 2025 19:49

Copilot AI review requested due to automatic review settings November 18, 2025 19:49

Copilot AI reviewed Nov 18, 2025

functionstackx approved these changes Nov 18, 2025

functionstackx merged commit 189ae51 into SemiAnalysisAI:main Nov 18, 2025
1 of 7 checks passed

functionstackx added this to InferenceMAX Board Dec 7, 2025

github-project-automation Bot moved this to Done in InferenceMAX Board Dec 7, 2025

cquil11 added the NVIDIA label Apr 8, 2026


functionstackx merged 2 commits into SemiAnalysisAI:main from kaixih:disable_b200_ep_size
