Adjust MiniMax MI355X block size for TP8 EP8#1228

Closed
jiacao-amd wants to merge 3 commits into SemiAnalysisAI:main from jiacao-amd:minimax-block16-tp8ep8-block32
Conversation

@jiacao-amd
Collaborator

Summary

  • default MiniMax MI355X vLLM runs use block size 16 with the shuffled KV cache layout enabled
  • special-case TP8/EP8 to disable the shuffled KV cache layout and use block size 32
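The special case above amounts to a branch in the benchmark script's configuration. A minimal sketch of that logic follows; `TP_SIZE`, `EP_SIZE`, `USE_SHUFFLED_KV_CACHE`, and `BLOCK_SIZE` are illustrative names for this example, not the variables actually used in `benchmarks/single_node/minimaxm2.5_fp8_mi355x.sh`:

```shell
#!/usr/bin/env bash
# Sketch of the TP8/EP8 special case described in the summary.
# Variable names here are hypothetical, chosen for illustration only.
TP_SIZE="${TP_SIZE:-8}"
EP_SIZE="${EP_SIZE:-8}"

if [ "${TP_SIZE}" -eq 8 ] && [ "${EP_SIZE}" -eq 8 ]; then
  # TP8/EP8: disable the shuffled KV cache layout and use block size 32
  USE_SHUFFLED_KV_CACHE=0
  BLOCK_SIZE=32
else
  # Default path: shuffled KV cache layout enabled, block size 16
  USE_SHUFFLED_KV_CACHE=1
  BLOCK_SIZE=16
fi

echo "shuffled_kv=${USE_SHUFFLED_KV_CACHE} block_size=${BLOCK_SIZE}"
```

With no overrides, the script takes the TP8/EP8 branch; the `bash -n` check in the Testing section would only verify the syntax of such a script, not execute it.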

Testing

  • bash -n benchmarks/single_node/minimaxm2.5_fp8_mi355x.sh

Contributor

@claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@jiacao-amd force-pushed the minimax-block16-tp8ep8-block32 branch from c01e0b6 to d66409b on April 29, 2026 16:41
@github-actions
Contributor

@jiacao-amd Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25121853688
Command: test-config --config-files .github/configs/amd-master.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm
Pinned ref: d66409b
Approval: not required (trusted collaborator).

@jiacao-amd
Collaborator Author

/sweep test-config --config-files .github/configs/amd-master.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm

@github-actions
Contributor

github-actions Bot commented May 4, 2026

@jiacao-amd Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25330781481
Command: test-config --config-files .github/configs/amd-master.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm
Pinned ref: a34fb25
Approval: not required (trusted collaborator).

@jiacao-amd force-pushed the minimax-block16-tp8ep8-block32 branch from 9c6bfd2 to 37962d4 on May 4, 2026 18:32
@jiacao-amd
Collaborator Author

/sweep test-config --config-files .github/configs/amd-master.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm

@github-actions
Contributor

github-actions Bot commented May 4, 2026

@jiacao-amd Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25336346089
Command: test-config --config-files .github/configs/amd-master.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm
Pinned ref: 37962d4
Approval: not required (trusted collaborator).

@jiacao-amd
Collaborator Author

Superseded by #1276. The replacement PR carries the same MiniMax MI355X vLLM scheduling change, but its branch is pushed directly to SemiAnalysisAI/InferenceX rather than the fork, so CI automation avoids the fork-PR permission issues.
