add int8 matmul support to CUDA backend #3508

verstatx · 2023-10-04T10:52:33Z

Description

Adds support for int8 matmul in the CUDA backend using cublasGemmEx functions. This changes the gemm functions' API so that the output array can have a different type from the input arrays, so all backends were modified.

This PR depends on s8 support: #3507
Fixes: #1656
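
To illustrate why the output array type needs to differ from the input type, here is a minimal CPU reference sketch. It is not the code from this PR and does not call cuBLAS. The function name `matmul_s8_reference`, the column-major layout, and the 32-bit integer accumulator are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference, not the PR's implementation:
// computes C = A * B for column-major int8 inputs, where the output
// element type Out is chosen independently of the int8 input type.
// A is m x k, B is k x n, C is m x n.
template <typename Out>
std::vector<Out> matmul_s8_reference(const std::vector<std::int8_t>& a,
                                     const std::vector<std::int8_t>& b,
                                     std::size_t m, std::size_t k,
                                     std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<Out> c(m * n);
    for (std::size_t col = 0; col < n; ++col) {
        for (std::size_t row = 0; row < m; ++row) {
            // Accumulate in 32 bits: a single product of two int8 values
            // can already exceed the int8 range.
            std::int32_t acc = 0;
            for (std::size_t i = 0; i < k; ++i) {
                acc += static_cast<std::int32_t>(a[row + i * m]) *
                       static_cast<std::int32_t>(b[i + col * k]);
            }
            c[row + col * m] = static_cast<Out>(acc);
        }
    }
    return c;
}
```

With a 1x2 matrix `{100, 100}` multiplied by a 2x1 matrix `{100, 100}`, the single result element is 20000, which does not fit in an int8. This is one way to see why an int8 matmul needs an output type wider than its inputs.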

Checklist

Rebased on latest master with signed 8-bit integer support #3507
Code compiles
Tests pass
Functions documented

TODO squash before merge

int8 matmul support in cuda backend

8f1f5c9

verstatx marked this pull request as draft October 4, 2023 10:52

verstatx added 2 commits October 14, 2023 16:52

fix formatting

d17a1b2

TODO squash before merge

add notes about s8 matmul support

6580141
