Hi authors,
Thanks for sharing this codebase! In your paper you mention:
"we design a specialized computation kernel for expert processing on the CPU using the AVX512_BF16 instruction set, which is not supported in the native PyTorch implementation."
However, in the released repository I only see the following Python function:
def run_expert_at_cpu(self, i_layer, i_expert, inps, routing_weights):
    """Run the expert at CPU"""
    return self.model.layers[i_layer].block_sparse_moe.experts[i_expert](
        inps, routing_weights
    )
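As far as I can tell, this just calls the expert module through PyTorch, and I could not find the specialized kernel. May I ask when AVX512_BF16 support will be released?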
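In case it is useful to others trying to reproduce the CPU path, below is a minimal sketch for checking whether the host CPU advertises the `avx512_bf16` flag. It assumes a Linux x86 machine that exposes `/proc/cpuinfo`; the helper name `has_cpu_flag` is my own and is not part of this repository.

```python
from pathlib import Path


def has_cpu_flag(cpuinfo_text: str, flag: str = "avx512_bf16") -> bool:
    """Return True if `flag` appears in the first 'flags' line of cpuinfo text.

    Hypothetical helper: assumes the Linux /proc/cpuinfo layout, where each
    logical CPU has a line of the form 'flags : fpu vme ...'.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are whitespace-separated tokens; match the whole token.
            return flag in value.split()
    return False


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # On systems without /proc/cpuinfo, fall back to an empty string.
    text = path.read_text() if path.exists() else ""
    print("avx512_bf16 advertised:", has_cpu_flag(text))
```

This only reports what the kernel advertises for the CPU; it says nothing about whether the released code actually uses those instructions.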