Multiplying Matrices with AVX. For Fun.
Multiplying Matrices with AVX. For Fun*

Having previously tinkered only very briefly in assembly, I was keen to try my hand at more.

I do best with a practical, well-defined problem to solve. I have used more or less the same unrolled-loop implementation of a 4x4 matrix multiplication since I wrote it at university, so it seemed a good candidate for a 21st-century update using Advanced Vector Extensions (AVX), which first shipped with Sandy Bridge processors in 2011. Non-trivial, but tractable.
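To make the comparison concrete, here is a minimal sketch of both approaches in C++. It is not the code in this repository: it uses compiler intrinsics rather than hand-written assembly, and the `Mat4f` type, the row-major layout, and the function names are assumptions made for illustration. The AVX path is only compiled when the compiler defines `__AVX__` (for example with `-mavx` or `/arch:AVX`); otherwise the scalar version is used.

```cpp
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical row-major 4x4 float matrix used only for this sketch.
struct Mat4f { float m[16]; };

// Scalar baseline: the dot product over k is unrolled by hand.
inline Mat4f mat4f_mul_scalar(const Mat4f& a, const Mat4f& b) {
    Mat4f r;
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            r.m[4 * i + j] = a.m[4 * i + 0] * b.m[0 + j]
                           + a.m[4 * i + 1] * b.m[4 + j]
                           + a.m[4 * i + 2] * b.m[8 + j]
                           + a.m[4 * i + 3] * b.m[12 + j];
        }
    }
    return r;
}

#if defined(__AVX__)
// AVX sketch: each 256-bit register holds two rows of the result.
// Row i of the product is a[i][0]*B0 + a[i][1]*B1 + a[i][2]*B2 + a[i][3]*B3,
// where Bk is row k of b.
inline Mat4f mat4f_mul_avx(const Mat4f& a, const Mat4f& b) {
    // Copy each row of b into both 128-bit lanes of a 256-bit register.
    const __m256 b0 = _mm256_broadcast_ps(reinterpret_cast<const __m128*>(&b.m[0]));
    const __m256 b1 = _mm256_broadcast_ps(reinterpret_cast<const __m128*>(&b.m[4]));
    const __m256 b2 = _mm256_broadcast_ps(reinterpret_cast<const __m128*>(&b.m[8]));
    const __m256 b3 = _mm256_broadcast_ps(reinterpret_cast<const __m128*>(&b.m[12]));

    Mat4f r;
    for (int h = 0; h < 2; ++h) {
        // Load rows 2h and 2h+1 of a, one row per lane.
        const __m256 rows = _mm256_loadu_ps(&a.m[8 * h]);
        // The shuffles splat element k of each row across its own lane.
        __m256 acc = _mm256_mul_ps(_mm256_shuffle_ps(rows, rows, 0x00), b0);
        acc = _mm256_add_ps(acc, _mm256_mul_ps(_mm256_shuffle_ps(rows, rows, 0x55), b1));
        acc = _mm256_add_ps(acc, _mm256_mul_ps(_mm256_shuffle_ps(rows, rows, 0xAA), b2));
        acc = _mm256_add_ps(acc, _mm256_mul_ps(_mm256_shuffle_ps(rows, rows, 0xFF), b3));
        _mm256_storeu_ps(&r.m[8 * h], acc);
    }
    return r;
}
#endif

// Uses the AVX path when it was compiled in, the scalar path otherwise.
inline Mat4f mat4f_mul(const Mat4f& a, const Mat4f& b) {
#if defined(__AVX__)
    return mat4f_mul_avx(a, b);
#else
    return mat4f_mul_scalar(a, b);
#endif
}
```

The unaligned load and store keep the sketch free of alignment requirements on `Mat4f`; an implementation tuned for speed would more likely align the matrices to 32 bytes and use the aligned variants.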

*Performance was never a motivation for this side project (the problem is too small), but there wouldn't be much point if the output were slower. And it isn't: on my Ivy Bridge MacBook Pro it executes in half as many cycles as my previous unrolled-loop implementation, and on a Haswell Ultrabook in slightly more than two-thirds as many.

But not faster than XMMatrixMultiply.

P.S. The built executable has a dependency on the Visual C++ 2012 Update 4 runtime and does not check that the host CPU supports AVX instructions.
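For anyone who wants to add the missing check, the following is a rough sketch of one way to do it, not something the project currently contains. The function name `cpu_supports_avx` is hypothetical. On MSVC it reads CPUID leaf 1 and then XCR0 to confirm that the operating system saves the YMM registers; on GCC and Clang it defers to `__builtin_cpu_supports`. On any other compiler or architecture it conservatively reports no support.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>     // __cpuid
#include <immintrin.h>  // _xgetbv
#endif

// Hypothetical helper: returns true only when both the CPU and the OS
// allow AVX instructions to be executed.
inline bool cpu_supports_avx() {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int info[4];
    __cpuid(info, 1);
    const bool osxsave = (info[2] & (1 << 27)) != 0;  // OS uses XSAVE/XRSTOR
    const bool avx     = (info[2] & (1 << 28)) != 0;  // CPU implements AVX
    if (!osxsave || !avx) {
        return false;
    }
    // XCR0 bits 1 and 2: XMM and YMM state are saved on context switch.
    const unsigned long long xcr0 = _xgetbv(0);
    return (xcr0 & 0x6) == 0x6;
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;  // unknown toolchain or non-x86 target: assume no AVX
#endif
}
```

Calling this once at startup and exiting with a message (or falling back to a scalar routine) when it returns `false` would avoid an illegal-instruction crash on older hardware.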