
Integer matrices #24

Open
SuperFluffy opened this issue Nov 20, 2018 · 9 comments
@SuperFluffy (Contributor)

Would you consider also implementing matrix multiplication for integer matrices, or do you want to keep this purely floating point?

@bluss (Owner) commented Nov 21, 2018

It's pretty far from what we are focusing on, maybe it's simple to plug into the existing code?

@bluss (Owner) commented Nov 22, 2018

@SuperFluffy do you have any good docs on integer gemm? It seems a bit fraught, especially the wraparound problems with large matrices; there must be many good reasons it's not often implemented.

@SuperFluffy (Contributor, Author) commented Nov 22, 2018

@bluss Here is the doc for the cblas_gemm_* routines: https://software.intel.com/en-us/mkl-developer-reference-c-cblas-gemm-1#2A58B860-609A-44CC-9812-E47BD01810CC The implementation details are at the bottom.

One of the few documents talking about it is this here: http://www.netlib.org/utk/people/JackDongarra/WEB-PAGES/Batched-BLAS-2017/talk12-gurney.pdf

Two relevant implementation details are probably (all from page 11/15):

  • They implement only GEMM_S16S16S32 and GEMM_S16S16S16, with S16 = i16 and S32 = i32, respectively.
  • Internal summation is done with at least 16 bits (that's probably quite important!).

They note:

Only saturation variants are implemented

And then on page 13/15:

Saturate instead of overflowing or underflowing

The arraymancer library for nim has implemented integer gemm here: mratsim/Arraymancer@654c89e. Discussions can be found here: mratsim/Arraymancer#25, mratsim/Arraymancer#6. They also have integer gemv here: mratsim/Arraymancer@a5e79d9

EDIT: Intel MKL implements cblas_gemm_s8u8s32 and cblas_gemm_s16s16s32.

Note, that's a u8 in the first function!

@bluss (Owner) commented Nov 22, 2018

Oh saturation! Good to know. Thanks for the details!

@SuperFluffy (Contributor, Author) commented Nov 22, 2018

Note the comment at the bottom of the API doc (emphasis mine):

After computing these four multiplication terms separately, they are summed from left to right. The results from the matrix-matrix product and the C matrix are scaled with alpha and beta floating-point values respectively using double-precision arithmetic. Before storing the results to the output c array, the floating-point values are rounded to the nearest integers. In the event of overflow or underflow, the results depend on the architecture. The results are either unsaturated (wrapped) or saturated to maximum or minimum representable integer values for the data type of the output matrix.

When using cblas_gemm_s8u8s32 with row-major layout, the data types of A and B must be swapped. That is, you must provide an 8-bit unsigned integer array for matrix A and an 8-bit signed integer array for matrix B.

Intermediate integer computations in cblas_gemm_s8u8s32 on 64-bit Intel® Advanced Vector Extensions 2 (Intel® AVX2) and Intel® Advanced Vector Extensions 512 (Intel® AVX-512) architectures without Vector Neural Network Instructions (VNNI) extensions can saturate. This is because only 16-bits are available for the accumulation of intermediate results. You can avoid integer saturation by maintaining all integer elements of A or B matrices under 8 bits.

Also, I edited my comment above: Intel only supports s8u8s32 (i8, u8(!), i32) and s16s16s32 (i16, i16, i32).

@bluss (Owner) commented Nov 22, 2018

What a bunch of hacks upon hacks

@SuperFluffy (Contributor, Author)

I have found mention of integer gemm in the context of BLIS, but it looks like nothing came of it: https://groups.google.com/forum/#!topic/blis-devel/qA00lB2yGY0

@SolidTux
SolidTux commented Oct 2, 2019

Would it be possible to make just the fallback implementation available for more types as a first step?
