[None][doc] Refactor blog18 #13956
📝 Walkthrough
This PR updates a technical blog post documenting NVIDIA's NVLink one-sided AlltoAll optimization for MoE communication. Changes include renaming concepts (push/pull instead of dispatch/combine), introducing an expanded explanation of the raw-token data layout, rewriting the performance benchmarking with detailed methodology and updated metrics, adding post-quantization dispatch analysis, and restructuring the future work guidance.
Changes: MoE Communication Blog Article Update
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@docs/source/blogs/tech_blog/blog18_Optimizing_MoE_Communication_with_One_Sided_AlltoAll_Over_NVLink.md`:
- Around lines 175-177: The fenced code block containing the formula "bandwidth =
batch_size × min(ep_size, top_k) × bytes_per_token / latency" lacks a language
identifier. Update the opening fence from `` ``` `` to one that names a language
(e.g., `` ```text `` or `` ```python ``) so that Markdown lint (MD040) passes and
syntax highlighting works for the formula. Leave the formula line, with its
variables bandwidth, batch_size, ep_size, top_k, bytes_per_token, and latency,
unchanged.
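For reviewers who want to sanity-check numbers against the formula quoted above, here is a minimal Python sketch of it. The function name `alltoall_bandwidth`, the parameter `latency_s`, and the choice of bytes and seconds as units are assumptions made for illustration; the review comment does not state units, and this is not code from the blog or the PR. The `min(ep_size, top_k)` factor is carried over verbatim from the formula. One plausible reading, not confirmed by this page, is that it caps the number of ranks each token is sent to.

```python
def alltoall_bandwidth(batch_size, ep_size, top_k, bytes_per_token, latency_s):
    """Evaluate the quoted formula; returns bytes per second.

    Assumption: latency_s is in seconds and bytes_per_token is in bytes.
    The source formula does not state its units.
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    # Factor taken verbatim from the formula: min(ep_size, top_k).
    copies_per_token = min(ep_size, top_k)
    return batch_size * copies_per_token * bytes_per_token / latency_s


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the function;
    # they are not measurements from the blog.
    bw = alltoall_bandwidth(
        batch_size=1024,
        ep_size=8,
        top_k=8,
        bytes_per_token=14336,
        latency_s=0.5,
    )
    print(f"{bw / 1e9:.3f} GB/s")
```

Keeping the formula as a small function makes it easy to see that, for a fixed latency, the result stops growing with `top_k` once `top_k` exceeds `ep_size`.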
⛔ Files ignored due to path filters (9)
- `docs/source/blogs/media/tech_blog18_bandwidth.png` is excluded by `!**/*.png`
- `docs/source/blogs/media/tech_blog18_dispatch_moe_combine.png` is excluded by `!**/*.png`
- `docs/source/blogs/media/tech_blog18_dispatch_moe_combine_R0.png` is excluded by `!**/*.png`
- `docs/source/blogs/media/tech_blog18_one_sided_vs_two_sided.png` is excluded by `!**/*.png`
- `docs/source/blogs/media/tech_blog18_post_quant_dispatch.png` is excluded by `!**/*.png`
- `docs/source/blogs/media/tech_blog18_quant_formats.png` is excluded by `!**/*.png`
- `docs/source/blogs/media/tech_blog18_rank_major_vs_expert_major.png` is excluded by `!**/*.png`
- `docs/source/blogs/media/tech_blog18_raw_tokens_vs_permuted_tokens.png` is excluded by `!**/*.png`
- `docs/source/blogs/media/tech_blog18_token_major_vs_expert_major.png` is excluded by `!**/*.png`
📒 Files selected for processing (1)
docs/source/blogs/tech_blog/blog18_Optimizing_MoE_Communication_with_One_Sided_AlltoAll_Over_NVLink.md
6e62b8c to ea43673
/bot run
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
ea43673 to 59590cc
PR_Github #47610 [ run ] triggered by Bot. Commit:
PR_Github #47610 [ run ] completed with state
Summary
Restructures the Performance Benchmark section of blog18 into focused subsections (Methodology / Scaling With EP Size / Post-Quant Dispatch / Latency Floor / Reproduction) and adds new MXFP8 and NVFP4 results so the post-quant story is no longer hypothetical.
Bandwidth chart re-rendered as a landscape side-by-side panel using BF16 byte counts throughout. Adds reference figures for quant formats and the dispatch-MoE-combine R0 detail; re-renders the rank-major vs expert-major figure.
Test plan
Summary by CodeRabbit