What is your request?
The MoE layer does not support quantized weights. The main missing piece appears to be a quantized version of the grouped_matmul_ragged kernel.
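To make the request concrete, here is a rough pure-Python reference of what a quantized ragged grouped matmul could compute. This is only a sketch under assumptions: the function names, the argument layout, and the symmetric per-column int8 scheme are hypothetical and chosen for brevity. They do not reflect the actual signature of grouped_matmul_ragged or the GPTQ storage format, which uses group-wise low-bit packing.

```python
def quantize_per_column(w):
    """Symmetric int8 quantization of a [K][N] matrix, one scale per output column.

    Hypothetical helper for illustration only (not the GPTQ format).
    """
    k, n = len(w), len(w[0])
    scales = []
    for j in range(n):
        max_abs = max(abs(w[i][j]) for i in range(k))
        scales.append(max_abs / 127.0 if max_abs > 0 else 1.0)
    q = [[int(round(w[i][j] / scales[j])) for j in range(n)] for i in range(k)]
    return q, scales


def grouped_matmul_ragged_quant_ref(x, q_weights, scales, group_sizes):
    """Reference semantics (assumed): rows of x are sorted by expert, and the
    next group_sizes[e] rows are multiplied by expert e's dequantized weights.
    """
    assert sum(group_sizes) == len(x)
    out = []
    start = 0
    for e, size in enumerate(group_sizes):
        q, s = q_weights[e], scales[e]
        for row in x[start:start + size]:
            # Accumulate against the integer weights, then apply the column scale once.
            out.append([
                s[j] * sum(row[i] * q[i][j] for i in range(len(row)))
                for j in range(len(s))
            ])
        start += size
    return out
```

A real kernel would fuse the dequantization into the tiled matmul instead of looping row by row, but the per-expert slicing by group size is the part that the quantized variant would need to keep.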
What is your motivation for this change?
An MoE layer with quantization support would make it possible to implement popular MoE models such as Qwen3-30B-A3B-GPTQ (Text, Coder & Omni).
Any other details?
No response