Summary
As noted in #106, performance with the cuBLAS backend is low and can be improved if the CUDA context is cached. Opening this issue as a tracker.
Version
Appears in latest.
Environment
cuBLAS backend
Steps to reproduce
See #106
Observed behavior
Running the cuBLAS backend through oneMKL is slower than calling cuBLAS directly.
Expected behavior
Running the cuBLAS backend through oneMKL should match, or come very close to, the performance of pure cuBLAS.