How can I create a matmul primitive with an A16W8 (16-bit activations, 8-bit weights) configuration? #1895
$ ./examples/tutorials-matmul-inference-int8-matmul-cpp gpu
Hi @Teaonly, here is an example: https://github.com/oneapi-src/oneDNN/blob/main/examples/tutorials/matmul/weights_decompression_matmul.cpp (or https://oneapi-src.github.io/oneDNN/page_weights_decompression_matmul_cpp.html#doxid-weights-decompression-matmul-cpp). For more information, please review the discussion on the same topic: #1893
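For reference, here is a minimal sketch of the attribute setup that the linked weights decompression example relies on. It assumes a recent oneDNN (v3.5+, where `set_fpmath_mode` takes an `apply_to_int` argument); the `1 << 1` scales mask and the choice of `fpmath_mode::f16` are illustrative assumptions, not taken from this issue:

```cpp
// Hypothetical sketch of an A16W8 (f16 activations, s8 weights) matmul
// primitive descriptor, modeled on weights_decompression_matmul.cpp.
#include "oneapi/dnnl/dnnl.hpp"
using namespace dnnl;

matmul::primitive_desc make_a16w8_matmul_pd(
        engine &eng, memory::dim M, memory::dim N, memory::dim K) {
    memory::desc a_md({M, K}, memory::data_type::f16, {K, 1}); // M x K
    memory::desc b_md({K, N}, memory::data_type::s8, {N, 1});  // K x N
    memory::desc c_md({M, N}, memory::data_type::f16, {N, 1}); // M x N

    primitive_attr attr;
    // Per-output-channel weight scales: mask selects dim 1 (N) of {K, N}.
    attr.set_scales_mask(DNNL_ARG_WEIGHTS, 1 << 1);
    // Request f16 math and allow the library to up-convert integer
    // weights; apply_to_int = true is what enables weight decompression.
    attr.set_fpmath_mode(fpmath_mode::f16, true);

    return matmul::primitive_desc(eng, a_md, b_md, c_md, attr);
}
```

Without the `set_fpmath_mode(..., true)` call, a mixed f16/s8 combination is treated as an (unsupported) quantization configuration, which would be consistent with the "unimplemented" status reported below.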
The configuration for creating a primitive_desc for matrix multiplication:
```cpp
memory::desc a_md({M, K}, memory::data_type::f16, {K, 1}); // M x K layout
memory::desc b_md({K, N}, memory::data_type::s8, {N, 1});  // K x N layout
memory::desc c_md({M, N}, memory::data_type::f16, {N, 1}); // M x N layout
primitive_attr attr;
attr.set_scales_mask(DNNL_ARG_WEIGHTS, 1); // per-channel quantized int8 weights
// Create a MatMul primitive descriptor
auto pd = matmul::primitive_desc(eng, a_md, b_md, c_md, attr);
```
This code causes an unimplemented exception:
"Message: could not create a primitive descriptor for a matmul primitive"
How can I create a matmul with A16W8?