This project performs highly parallelized matrix multiplication using Intel SIMD intrinsics and OpenMP. It is about 45 times faster than the naïve version (throughput rose from 1.2 gigaFLOPS to 55 gigaFLOPS). I wrote it in C from scratch, without skeleton code.
tanner-wauchope/Parallelized-Matrix-Multiplier
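
The repository's own kernel is not reproduced here. As a rough illustration of how SIMD intrinsics and OpenMP combine in a matrix multiply, the following is a minimal sketch under stated assumptions: square, row-major, single-precision matrices, 128-bit SSE vectors, and unaligned loads. The function names `matmul_naive` and `matmul_simd_omp` are hypothetical and chosen only for this example. A tuned implementation would likely add cache blocking and loop unrolling, which this sketch omits.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h> /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_mul_ps, ... */
#endif

/* Reference triple loop: C = A * B, all n x n, row-major. */
void matmul_naive(size_t n, const float *A, const float *B, float *C) {
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
    }
}

/* Same product (hypothetical sketch): rows of C are split across OpenMP
 * threads, and four adjacent columns of C are computed per SSE operation.
 * Without -fopenmp the pragma is ignored and the loop runs serially. */
void matmul_simd_omp(size_t n, const float *A, const float *B, float *C) {
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        size_t row = (size_t)i * n;
        size_t j = 0;
#if defined(__SSE__)
        /* Vector part: C[i][j..j+3] += A[i][k] * B[k][j..j+3] for every k. */
        for (; j + 4 <= n; j += 4) {
            __m128 acc = _mm_setzero_ps();
            for (size_t k = 0; k < n; k++) {
                __m128 a = _mm_set1_ps(A[row + k]);    /* broadcast A[i][k] */
                __m128 b = _mm_loadu_ps(&B[k * n + j]); /* 4 floats of row k */
                acc = _mm_add_ps(acc, _mm_mul_ps(a, b));
            }
            _mm_storeu_ps(&C[row + j], acc);
        }
#endif
        /* Scalar tail for columns left over when n is not a multiple of 4
         * (or for every column when SSE is unavailable). */
        for (; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += A[row + k] * B[k * n + j];
            C[row + j] = sum;
        }
    }
}
```

With GCC, a build along the lines of `gcc -O2 -fopenmp` enables the threading. Each thread writes a disjoint set of rows of `C`, so the parallel loop needs no locking.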