mozilla/gemmology


                                      _
                                     | |
  __ _  ___ _ __ ___  _ __ ___   ___ | | ___   __ _ _   _
 / _` |/ _ \ '_ ` _ \| '_ ` _ \ / _ \| |/ _ \ / _` | | | |
| (_| |  __/ | | | | | | | | | | (_) | | (_) | (_| | |_| |
 \__, |\___|_| |_| |_|_| |_| |_|\___/|_|\___/ \__, |\__, |
  __/ |                                        __/ | __/ |
 |___/                                        |___/ |___/

                                                version 0.1

Small Integer Matrix Multiply

Gemmology is a rewrite of intgemm that focuses on 8-bit integer matrix multiplication and uses xsimd as an abstract vector instruction set when possible.

The original algorithm and API are left mostly untouched, apart from a few namespace changes.

Usage

Gemmology consists of a single header file. Drop it into your project, then mostly follow the intgemm API:

#include "gemmology.h"

// A, B, inputBias, bias, C and the dimensions are assumed to be defined by the caller.
// Map floats in [-alpha, alpha] onto the 8-bit range [-127, 127].
float alpha = 25;
float quant_mult = 127/alpha;
gemmology::Shift::PrepareA(A, A_prepared, quant_mult, A_rows, width);
gemmology::PrepareB(B, B_prepared, quant_mult, width, B_cols);

/* Prepare the bias (in place) */
float unquant_mult_forprep = (-1)*(alpha)*(alpha)/(127.0f);
gemmology::Shift::PrepareBias(B_prepared, width, B_cols,
                              gemmology::callbacks::UnquantizeAndAddBiasAndWrite(unquant_mult_forprep, inputBias, inputBias));
/* Multiply */
gemmology::Shift::Multiply(A_prepared, B_prepared, A_rows, width, B_cols,
                           gemmology::callbacks::UnquantizeAndAddBiasAndWrite(unquant_mult_forprep, bias, C));
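The scaling factors in the snippet above can be checked by hand without the library. The following is a minimal sketch, not gemmology code: it assumes that quantization multiplies by quant_mult, rounds and clamps to [-127, 127], and that the Shift variant adds 127 to the quantized values of A so that they become unsigned. Under those assumptions the shift is undone by subtracting 127 times the column sum of B, which after unquantization gives the factor -alpha*alpha/127. The names quantize and shifted_dot are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of gemmology): scale, round, clamp to [-127, 127].
inline std::int8_t quantize(float value, float quant_mult) {
  float scaled = std::nearbyint(value * quant_mult);
  return static_cast<std::int8_t>(std::max(-127.0f, std::min(127.0f, scaled)));
}

// Dot product of one row of A with one column of B on quantized values.
// A is shifted by +127 (assumed behaviour of the Shift variant); the shift
// is then corrected and the result scaled back to float.
inline float shifted_dot(const std::vector<float>& a,
                         const std::vector<float>& b, float alpha) {
  const float quant_mult = 127.0f / alpha;
  const float unquant_mult = 1.0f / (quant_mult * quant_mult);
  std::int32_t acc = 0;    // sum of (a_q + 127) * b_q
  std::int32_t b_sum = 0;  // sum of b_q, needed to undo the shift
  for (std::size_t i = 0; i < a.size(); ++i) {
    const std::int32_t a_q = quantize(a[i], quant_mult);
    const std::int32_t b_q = quantize(b[i], quant_mult);
    acc += (a_q + 127) * b_q;
    b_sum += b_q;
  }
  // -127 * unquant_mult equals -alpha * alpha / 127, the value assigned to
  // unquant_mult_forprep in the usage snippet.
  const float correction = -127.0f * unquant_mult * static_cast<float>(b_sum);
  return static_cast<float>(acc) * unquant_mult + correction;
}
```

Calling shifted_dot on two short vectors with alpha = 25 should return a value close to their floating-point dot product, with an error that depends on how coarse the 8-bit grid is for the chosen alpha.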

Difference with intgemm

Gemmology only handles matrices quantized to 8-bit integers.

Gemmology provides an SSE2 implementation of the original algorithm, whereas intgemm goes no lower than SSSE3. The SSE2 version is roughly 2.5 times slower than the SSSE3 version.

Gemmology provides a suboptimal implementation using NEON instructions for arm32 and aarch64.

All Gemmology functions are parametrized by a target architecture (e.g. xsimd::sse4_2), which defaults to the best one available at compile time. Runtime dispatch is left to the user, possibly through xsimd's generic dispatch mechanism.
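The pattern of a compile-time architecture parameter combined with user-side runtime dispatch can be sketched without any dependency. This is only an illustration under stated assumptions: baseline_arch and wide_arch are hypothetical stand-ins for xsimd architecture types, dot and select_dot are not gemmology functions, and the boolean flag replaces real CPU feature detection, for which xsimd has its own facilities.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical architecture tags, standing in for types such as xsimd::sse2.
struct baseline_arch {};
struct wide_arch {};

// "Best available at compile time", decided here by a compiler-defined macro.
#if defined(__SSE4_2__)
using default_arch = wide_arch;
#else
using default_arch = baseline_arch;
#endif

// Baseline kernel: plain scalar loop.
inline std::int32_t dot_impl(baseline_arch, const std::int8_t* a,
                             const std::int8_t* b, std::size_t n) {
  std::int32_t acc = 0;
  for (std::size_t i = 0; i < n; ++i) acc += a[i] * b[i];
  return acc;
}

// "Wide" kernel: four elements per iteration, a placeholder for real SIMD code.
inline std::int32_t dot_impl(wide_arch, const std::int8_t* a,
                             const std::int8_t* b, std::size_t n) {
  std::int32_t acc = 0;
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4)
    acc += a[i] * b[i] + a[i + 1] * b[i + 1] + a[i + 2] * b[i + 2] + a[i + 3] * b[i + 3];
  for (; i < n; ++i) acc += a[i] * b[i];  // leftover elements
  return acc;
}

// Entry point parametrized by architecture, defaulting to the compile-time best.
template <class Arch = default_arch>
std::int32_t dot(const std::int8_t* a, const std::int8_t* b, std::size_t n) {
  return dot_impl(Arch{}, a, b, n);
}

using DotFn = std::int32_t (*)(const std::int8_t*, const std::int8_t*, std::size_t);

// User-side runtime dispatch: choose an instantiation once, then call through it.
// cpu_has_wide is a placeholder for an actual CPU feature query.
inline DotFn select_dot(bool cpu_has_wide) {
  return cpu_has_wide ? &dot<wide_arch> : &dot<baseline_arch>;
}
```

Resolving the function pointer once at start-up keeps the feature check out of the hot path. Every architecture-specific instantiation must be compiled with flags that allow its instructions, which is why the dispatch stays on the user's side.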

Testing

All tests live in the test directory. A sample invocation, provided xsimd and sde64 are available on your system:

make -C test XSIMD_INCLUDE_DIR=~/source/xsimd/include SDE64=~/Downloads/sde-external-9.14.0-2022-10-25-lin/sde64

Acknowledgments

This is mostly a port of intgemm to xsimd, so big thanks to the intgemm authors for the original work.
