This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

Integer multiply-add instructions #224

Open
bjacob opened this issue May 11, 2020 · 0 comments

bjacob commented May 11, 2020

Unlike the float case, where the fused-vs-unfused issue creates complications (PR #79), the integer case has no downside to using a single-instruction multiply-add: integer arithmetic has no intermediate rounding, so the fused and unfused forms produce the same result. These instructions are vital to getting above 50% of peak performance in key use cases such as matrix multiplication.
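To make the "no downside" point concrete, here is a minimal scalar sketch in Rust. It assumes wrapping 32-bit lanes; the function names are made up for illustration and do not correspond to any proposed instruction. It compares a separate multiply and add against a model of a single multiply-add that computes the exact result and then truncates.

```rust
/// Unfused form: separate multiply and add, each wrapping modulo 2^32.
fn mul_then_add(acc: u32, a: u32, b: u32) -> u32 {
    // The intermediate product needs a register of its own.
    let product = a.wrapping_mul(b);
    acc.wrapping_add(product)
}

/// Model of a single-instruction multiply-add (hypothetical): compute the
/// exact value in 64 bits, then keep only the low 32 bits.
fn fused_mul_add(acc: u32, a: u32, b: u32) -> u32 {
    // acc + a * b is at most 2^64 - 2^32, so it always fits in a u64.
    let exact = acc as u64 + (a as u64) * (b as u64);
    exact as u32
}
```

Both functions return the same value for every input, because truncation to 32 bits commutes with wrapping addition and multiplication. The float case has no equivalent guarantee, since rounding the intermediate product changes the result.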

In general, these instructions would support different combinations of bit widths for the accumulator and for the multiplication operands.
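As one illustration of a mixed-width combination, the sketch below models a widening multiply-add with 16-bit operands and 32-bit accumulator lanes. The lane count, the name `mul_add_i16_to_i32`, and the wrapping behaviour of the accumulator are assumptions made for this example, not part of any proposal.

```rust
/// Hypothetical widening multiply-add: 16-bit operands, 32-bit accumulators.
fn mul_add_i16_to_i32(acc: [i32; 4], a: [i16; 4], b: [i16; 4]) -> [i32; 4] {
    let mut out = acc;
    for i in 0..4 {
        // The product is formed at 32-bit width, so it cannot overflow:
        // its magnitude is at most 2^30 for 16-bit inputs.
        let product = (a[i] as i32) * (b[i] as i32);
        // Assumption: the accumulator wraps on overflow.
        out[i] = out[i].wrapping_add(product);
    }
    out
}
```

Note that a product such as 300 * 300 = 90000 does not fit in 16 bits, which is why the accumulator is wider than the operands.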

One variant is the dot-product instructions discussed in PR #127. We need both those dot-product instructions and general element-wise integer multiply-add.
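The difference between the two shapes can be sketched in scalar Rust. This assumes a pairwise 16-bit to 32-bit dot product and a same-width 32-bit element-wise multiply-add; both function names are hypothetical, and the exact semantics discussed in PR #127 may differ from this model.

```rust
/// Pairwise dot product: eight 16-bit lanes in, four 32-bit lanes out.
/// Each output lane sums the products of two adjacent input lanes.
fn dot_i16x8(a: [i16; 8], b: [i16; 8]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        let lo = (a[2 * i] as i32) * (b[2 * i] as i32);
        let hi = (a[2 * i + 1] as i32) * (b[2 * i + 1] as i32);
        // The sum wraps only when all four inputs are -32768.
        out[i] = lo.wrapping_add(hi);
    }
    out
}

/// Element-wise multiply-add: output lane i depends only on lane i
/// of the accumulator and of the two operands.
fn mul_add_i32x4(acc: [i32; 4], a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut out = acc;
    for i in 0..4 {
        out[i] = out[i].wrapping_add(a[i].wrapping_mul(b[i]));
    }
    out
}
```

The dot product reduces across adjacent lanes and halves the lane count, while the element-wise form keeps lanes independent. Neither can be expressed cheaply in terms of the other.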

Note that these instructions are often used in kernels that occupy nearly all available SIMD registers. For that reason, leaving multiply-add instructions out of WebAssembly and relying on the compiler to fuse separate operations back together would often cause unwanted register spills. In fact, the source code is often tailored to use a specific number of SIMD registers in the first place; forcing it to use separate Mul and Add with intermediate registers, rather than offering a multiply-add instruction, would undermine that.
