Summary
As noted in #106, performance with the cuBLAS backend is low and can be improved if the CUDA context is cached. Opening this issue as a tracker.
Version
Appears in latest.
Environment
cuBLAS backend
Steps to reproduce
See #106
Observed behavior
Running the cuBLAS backend through oneMKL is slower than calling cuBLAS directly.
Expected behavior
Running the cuBLAS backend through oneMKL should match, or come very close to, the performance of pure cuBLAS.