Gemmology, version 0.1
Gemmology is a rewrite of intgemm with a focus on 8-bit integer matrix multiplication, using xsimd as an abstract vector instruction set when possible.
The original algorithm and API are left mostly untouched, apart from a few namespace changes.
Gemmology consists of a single header file. Drop it into your project, then mostly follow the intgemm API:
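#include "gemmology.h"
/* alpha bounds the absolute value of the inputs; quant_mult maps [-alpha, alpha] onto [-127, 127] */
float alpha = 25;
float quant_mult = 127/alpha;
/* Quantize and prepare both operands */
gemmology::Shift::PrepareA(A, A_prepared, quant_mult, A_rows, width);
gemmology::PrepareB(B, B_prepared, quant_mult, width, B_cols);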
/* Prepare the bias (inplace) */
float unquant_mult_forprep = (-1)*(alpha)*(alpha)/(127.0f);
gemmology::Shift::PrepareBias(B_prepared, width, B_cols,
callbacks::UnquantizeAndAddBiasAndWrite(unquant_mult_forprep, inputBias, inputBias));
/* Multiply */
gemmology::Shift::Multiply(A_prepared, B_prepared, A_rows, width, B_cols,
callbacks::UnquantizeAndAddBiasAndWrite(unquant_mult_forprep, bias, C));
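To make the scaling in the example above concrete, here is a minimal scalar sketch of the arithmetic. It assumes symmetric quantization with round-to-nearest and saturation at ±127. The helpers quantize_scalar and unquantize_scalar are hypothetical names used for illustration only; they are not Gemmology functions, and the library's vectorized routines may differ in rounding details.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of Gemmology): quantize one float to int8.
// Assumes quant_mult = 127 / alpha, where alpha bounds |x|.
inline std::int8_t quantize_scalar(float x, float quant_mult) {
    assert(quant_mult > 0.0f);
    // Scale, then round using the current rounding mode (round-to-nearest by default).
    float scaled = std::nearbyint(x * quant_mult);
    // Saturate to the symmetric 8-bit range [-127, 127].
    scaled = std::max(-127.0f, std::min(127.0f, scaled));
    return static_cast<std::int8_t>(scaled);
}

// Hypothetical helper: map an int32 accumulator of quantized products
// back to float by undoing both quantization multipliers.
inline float unquantize_scalar(std::int32_t acc, float quant_mult_a, float quant_mult_b) {
    return static_cast<float>(acc) / (quant_mult_a * quant_mult_b);
}
```

With alpha = 25, an input equal to alpha maps to 127, and the product of two such quantized values unquantizes back to approximately alpha * alpha.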
Differences from intgemm
Gemmology only handles quantized matrices of 8-bit integers.
Gemmology provides an SSE2 implementation of the original algorithm, while intgemm stops at SSSE3. The SSE2 version is roughly 2.5 times slower than the SSSE3 version.
Gemmology provides a suboptimal implementation using NEON instructions for arm32 and aarch64.
All Gemmology functions are parametrized by a target architecture (e.g. xsimd::sse4_2), which defaults to the best one available at compile time. It's up to the user to handle dynamic dispatch (possibly using xsimd's generic mechanism to do so).
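The pattern described above can be sketched without any dependency as follows. This is an illustration under stated assumptions, not Gemmology's or xsimd's actual code: the tags arch_baseline and arch_wide are hypothetical stand-ins for xsimd architecture types, dot_i8 is a hypothetical kernel, and the cpu_has_wide flag is a placeholder for a real runtime CPU feature query.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical architecture tags standing in for types such as xsimd::sse2 or xsimd::avx2.
struct arch_baseline {};
struct arch_wide {};

// Compile-time default: the best tag allowed by the current build flags.
#if defined(__AVX2__)
using arch_default = arch_wide;
#else
using arch_default = arch_baseline;
#endif

// Hypothetical kernel parametrized by an architecture tag. The body is scalar
// for every tag here; a real kernel would pick different SIMD instructions per tag.
template <class Arch = arch_default>
std::int32_t dot_i8(const std::int8_t* a, const std::int8_t* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < n; ++i)
        acc += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
    return acc;
}

using dot_fn = std::int32_t (*)(const std::int8_t*, const std::int8_t*, std::size_t);

// User-side dynamic dispatch: choose one instantiation from a runtime check.
// cpu_has_wide is a placeholder for an actual CPU feature query.
inline dot_fn select_dot(bool cpu_has_wide) {
    if (cpu_has_wide)
        return &dot_i8<arch_wide>;
    return &dot_i8<arch_baseline>;
}
```

Calling dot_i8<>(a, b, n) uses the compile-time default, while select_dot returns a function pointer that can be chosen once at startup and reused. In a real build, each architecture-specific instantiation would typically be compiled in its own translation unit with matching compiler flags.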
All tests live in the test directory. A sample test invocation (provided xsimd and sde64 are available on your system):
make -C test XSIMD_INCLUDE_DIR=~/source/xsimd/include SDE64=~/Downloads/sde-external-9.14.0-2022-10-25-lin/sde64
This is mostly a port of intgemm to xsimd, so big thanks to the intgemm authors for the original work.