…scade complete on main

Promotes status PROPOSED → ACTIVE_ALGORITHM_LEVEL after all 5 cascade PRs land. M-MOE-SUB-1, M-MOE-SUB-2 (a + b + c + c.gpu), M-MOE-SUB-3 (harness) status: PENDING → SHIPPED. M-MOE-SUB-4 stays PENDING (optional, only needed if M-MOE-SUB-3's diff doesn't pinpoint at MoeRouter / MoeFfnOut granularity).

Cited PRs (chronological)
=========================
- #1507 — moe_ffn_forward_layer_with_router (CPU helper, step c)
- #1516 — forward_qwen3_moe_traced_with_plan (CPU body, step a)
- #1521 — apr trace --save-tensor GGUF MoE CLI wireup (step a CLI)
- #1522 — moe_ffn_forward_layer_cuda_with_router (GPU helper, step c.gpu)
- #1523 — forward_qwen3_moe_cuda_traced (GPU body, step b)
- #1524 — heavy diff harness (M-MOE-SUB-3)

What's left
===========
- Operator-dispatched run of `falsify_moe_sub_002_cpu_gpu_traced_per_stage_diff` on lambda-vector RTX 4090 + cached 17.3 GB Qwen3-Coder GGUF (~30-60 min wall) → produces layer-by-layer divergence table.
- M-MOE-SUB-3 ALGORITHM_LEVEL → FUNCTIONAL upon operator run.
- FALSIFY-MOE-SUB-003 → DISCHARGED gated on M-GPU-MOE-1.4 root-cause fix.

Refs: contracts/trace-moe-gpu-sub-stages-v1.yaml
Refs: qwen3-moe-forward-gpu-v1 v1.4.0 M-GPU-MOE-1.4

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
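For orientation, the routing step that the `_with_router` helpers factor out can be sketched generically. This is an illustrative top-k MoE routing sketch, not this repo's implementation; the expert count, `k`, and function name are all hypothetical:

```python
# Generic top-k MoE routing sketch (hypothetical, not this repo's code):
# softmax over router logits, keep the top-k experts per token, and
# renormalize the kept weights so they sum to 1.
import numpy as np

def moe_route(router_logits, k=2):
    """router_logits: (tokens, experts) → (indices (tokens, k), weights (tokens, k))."""
    # Numerically stable softmax over the expert axis.
    z = router_logits - router_logits.max(axis=-1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=-1, keepdims=True)
    # Top-k expert ids per token, highest probability first.
    idx = np.argsort(p, axis=-1)[:, ::-1][:, :k]
    w = np.take_along_axis(p, idx, axis=-1)
    # Renormalize over only the k experts that were kept.
    w /= w.sum(axis=-1, keepdims=True)
    return idx, w

logits = np.array([[2.0, 0.1, -1.0, 0.5]])
idx, w = moe_route(logits, k=2)
print(idx, w)  # → experts [[0 3]] with weights summing to 1
```

A per-stage trace contract like the one promoted here would then capture the router outputs (`idx`, `w` above) separately from the combined FFN output, so a CPU/GPU diff can localize divergence to one of the two.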
Summary
Promotes contract `trace-moe-gpu-sub-stages-v1` status from PROPOSED → ACTIVE_ALGORITHM_LEVEL after all five cascade PRs land. Records SHIPPED status for M-MOE-SUB-1, M-MOE-SUB-2 (a + b + c + c.gpu), and M-MOE-SUB-3 (harness). M-MOE-SUB-4 stays PENDING (optional).

Cited PRs (all merged)
- `moe_ffn_forward_layer_with_router` (CPU helper)
- `forward_qwen3_moe_traced_with_plan` (CPU body)
- `apr trace --save-tensor` GGUF MoE CLI wireup
- `moe_ffn_forward_layer_cuda_with_router` (GPU helper)
- `forward_qwen3_moe_cuda_traced[_with_plan]` (GPU body)

What's left
- `falsify_moe_sub_002_cpu_gpu_traced_per_stage_diff` on lambda-vector RTX 4090 + cached 17.3 GB Qwen3-Coder GGUF (~30-60 min wall) → produces the layer-by-layer divergence table.

Test plan
- `pv validate contracts/trace-moe-gpu-sub-stages-v1.yaml` → 0 errors, 0 warnings
- `gh pr view <id> --json mergedAt`

🤖 Generated with Claude Code
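The layer-by-layer divergence table the harness run is expected to produce can be sketched as follows. This is a toy illustration under assumed inputs, not the actual `falsify_moe_sub_002_cpu_gpu_traced_per_stage_diff` harness; the stage names, tolerance, and trace layout are all hypothetical:

```python
# Illustrative per-stage CPU/GPU diff sketch (hypothetical names throughout):
# given per-layer, per-stage tensors captured from a CPU and a GPU forward
# pass, compute max|Δ| for each (layer, stage) and flag entries above atol.
import numpy as np

STAGES = ["router_logits", "expert_weights", "ffn_out"]  # hypothetical stage names

def per_stage_diff(cpu_trace, gpu_trace, atol=1e-4):
    """cpu_trace/gpu_trace: {layer: {stage: np.ndarray}} → [(layer, stage, diff, diverged)]."""
    rows = []
    for layer in sorted(cpu_trace):
        for stage in STAGES:
            d = float(np.abs(cpu_trace[layer][stage] - gpu_trace[layer][stage]).max())
            rows.append((layer, stage, d, d > atol))
    return rows

# Toy traces: layer 1's ffn_out is given an injected divergence.
rng = np.random.default_rng(0)
cpu = {L: {s: rng.normal(size=(4, 8)) for s in STAGES} for L in (0, 1)}
gpu = {L: {s: cpu[L][s].copy() for s in STAGES} for L in (0, 1)}
gpu[1]["ffn_out"] += 1e-2  # injected divergence

for layer, stage, diff, bad in per_stage_diff(cpu, gpu):
    print(f"layer {layer:2d}  {stage:14s}  max|abs diff|={diff:.2e}  {'DIVERGED' if bad else 'ok'}")
```

The point of the per-stage granularity is visible in the toy output: the router stages stay clean while `ffn_out` diverges, which is exactly the localization M-MOE-SUB-4 would otherwise have to provide.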