I have access to a Sapphire Rapids CPU and I would like to test libxsmm's GEMM performance using bfloat16 as input and float32 as output, so that AMX-BF16/AVX512-BF16 instructions are used. However, the documentation only includes the following example:
#include <libxsmm.h>
#include <vector>

int main(int argc, char* argv[]) {
  typedef double T;
  int batchsize = 1000, m = 13, n = 5, k = 7;
  std::vector<T> a(batchsize * m * k), b(batchsize * k * n), c(m * n, 0);
  /* C/C++ and Fortran interfaces are available */
  typedef libxsmm_mmfunction<T> kernel_type;
  /* generates and dispatches a matrix multiplication kernel (C++ functor) */
  kernel_type kernel(LIBXSMM_GEMM_FLAG_NONE, m, n, k, 1.0 /*alpha*/, 1.0 /*beta*/);
  assert(kernel);
  for (int i = 0; i < batchsize; ++i) { /* initialize input */
    for (int ki = 0; ki < k; ++ki) {
      for (int j = 0; j < m; ++j) a[i * j * ki] = static_cast<T>(1) / ((i + j + ki) % 25);
      for (int j = 0; j < n; ++j) b[i * j * ki] = static_cast<T>(7) / ((i + j + ki) % 75);
    }
  }
  /* kernel multiplies and accumulates matrices: C += Ai * Bi */
  for (int i = 0; i < batchsize; ++i) kernel(&a[i * m * k], &b[i * k * n], &c[0]);
}
How should I modify the above example so that libxsmm performs a mixed-precision bfloat16/float32 gemm?
Generally speaking, it would be helpful if the documentation had more examples.
Thank you very much!
We are currently preparing the version 2.0 release, and the C++ interface no longer covers mixed precision, as the low-precision types are not defined in the language. We haven't had a chance to update the documentation yet... sorry :-(
As @stefan0re mentioned, samples/xgemm has example apps for GEMM, and samples/eltwise has them for eltwise unary/binary operations. These small apps aim to serve as the documentation for the C-only API, since the simple C codes show what the highly optimized implementations of libxsmm do mathematically.
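To make "what the implementations do mathematically" concrete for the bfloat16-input/float32-output case asked about here, the following is a minimal reference sketch in plain C++. It does not use or reproduce the libxsmm API: the names `bf16_t`, `float_to_bf16`, `bf16_to_float`, and `ref_gemm_bf16_f32` are hypothetical helpers introduced only for illustration. It assumes plain column-major storage with tight leading dimensions and ignores any hardware-specific packed input layout that an optimized kernel may expect.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of this is libxsmm API.
// A bfloat16 value is stored as the upper 16 bits of an IEEE-754 float.
using bf16_t = std::uint16_t;

// float -> bfloat16 with round-to-nearest-even (NaN handling omitted for brevity).
inline bf16_t float_to_bf16(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  bits += 0x7FFFu + ((bits >> 16) & 1u);  // rounding bias, ties go to even
  return static_cast<bf16_t>(bits >> 16);
}

// bfloat16 -> float is exact: place the 16 bits in the upper half of a float.
inline float bf16_to_float(bf16_t h) {
  const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  float x;
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}

// Reference GEMM: C += A * B with bfloat16 inputs and float32 accumulation/output.
// A is m x k, B is k x n, C is m x n, all column-major with tight leading dimensions.
void ref_gemm_bf16_f32(int m, int n, int k,
                       const bf16_t* a, const bf16_t* b, float* c) {
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < m; ++i) {
      float acc = c[j * m + i];
      for (int p = 0; p < k; ++p) {
        acc += bf16_to_float(a[p * m + i]) * bf16_to_float(b[j * k + p]);
      }
      c[j * m + i] = acc;
    }
  }
}
```

A reference like this is mainly useful for validating the output of an optimized kernel on small shapes. The summation order of an optimized kernel may differ, so results should be compared with a tolerance rather than bit for bit.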
As for the samples/xgemm codes: as someone who is new to the library, I find them indecipherable. They have no comments explaining what they do, and there is no matching documentation with examples. Since these are all GEMMs, shouldn't it be relatively easy to write a few easy-to-understand examples?