Since it's a transformer model, the activations are 3D, but the kernel seems to support only 2D matmul. Am I missing something, or have I misunderstood?
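
To make the question concrete, here is a minimal NumPy sketch of what I would expect a caller to do if the kernel really is 2D-only: flatten the batch and sequence dimensions into one, run the 2D matmul, then restore the 3D shape. This is only my assumption about how it might work, not something taken from the kernel's code. The function name `linear_via_2d_matmul` and the shapes are hypothetical, and `@` stands in for the actual kernel call.

```python
import numpy as np

def linear_via_2d_matmul(x, w):
    """Apply a 2D-only matmul to a 3D activation (hypothetical helper).

    x: activation of shape (batch, seq_len, hidden)
    w: weight of shape (hidden, out_features)
    """
    batch, seq_len, hidden = x.shape
    # Collapse (batch, seq_len) into a single row dimension: (batch*seq_len, hidden)
    x2d = x.reshape(batch * seq_len, hidden)
    # Plain 2D matmul; a real 2D kernel would be called here instead of `@`
    y2d = x2d @ w
    # Restore the leading dimensions: (batch, seq_len, out_features)
    return y2d.reshape(batch, seq_len, w.shape[1])

# Small example with made-up shapes
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4))   # (batch=2, seq_len=3, hidden=4)
w = rng.standard_normal((4, 5))      # (hidden=4, out_features=5)
y = linear_via_2d_matmul(x, w)
print(y.shape)
```

If this is what happens somewhere upstream of the kernel, the 2D-only support would make sense, since each token's row is multiplied by the same weight matrix independently.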