CNugteren / CLBlast Public

Issues: CNugteren/CLBlast

31 Open 280 Closed

Issues list

Accuracy problem on Apple M1 and Intel(R) UHD Graphics 770 correctness

#542 opened May 17, 2024 by fengyuentau

Banded matrices required buffer size calculated incorrectly (GBMV, HBMV, SBMV & TBMV) correctness

#538 opened Apr 12, 2024 by BadgerKing7

SGEMM broken with 1.6.2 on Intel ARC correctness

#533 opened Feb 20, 2024 by 0cc4m

tunner transpose fails on various specific sizes correctness

#532 opened Feb 10, 2024 by baryluk

Unparsed options to tunner are ignored, and better handling of platform/device options feature request

#528 opened Feb 10, 2024 by baryluk

ruby numo-linalg + clblast: OpenCL error: clCreateContext: -6 question

#524 opened Jan 16, 2024 by dpblnt

Consider add SVM Buffer interface support? feature request

#523 opened Jan 15, 2024 by engineer1109

Segmentation fault with OpenCL 3.0 CUDA (CUDA 12.3) correctness

#521 opened Jan 15, 2024 by fengyuentau

gemm performance downgrade for small size M and big size N&K performance

#520 opened Jan 10, 2024 by abcdrm

HGEMM performance in Adreno(tm) 740 is not faster than SGEMM performance

#513 opened Oct 31, 2023 by cunyangwei

Do I have to cross-compile both opencl and clblast for android? question

#511 opened Oct 25, 2023 by DavdGao

Is it a good idea to use GCN cross lane instruction for optimization? performance

#510 opened Oct 24, 2023 by fancyIX

Inconsistent GEMM Tuner Execution Times question

#428 opened Nov 27, 2021 by huttered40

How to implement a new routine CLBlastSDgemm() to get the float and double type of Matrix C question

#427 opened Nov 10, 2021 by TaihuLight

Variable Length Matrix Multiplication Batched Processing feature request

#409 opened Dec 22, 2020 by akumaburn

Would it be possible to upgrade newer minimum required cmake version ? like 3.0 or 3.5 question

#407 opened Nov 6, 2020 by 9prady9

Performance sluggish on AMD RX-Vega 64 performance

#403 opened Oct 13, 2020 by JSav87

Does anyone have Snapdragon 865 or 855 GEMM results help wanted question

#390 opened May 27, 2020 by tingxingdong

Batched operations for Python bindings? feature request

#384 opened May 4, 2020 by ethanhs

Cannot build with NDK 21 help wanted question

#380 opened Apr 7, 2020 by abrarmatin

Huge performance degradation when calling matrix multiply in a loop performance

#370 opened Sep 21, 2019 by blueberry

Extending CLBlast to include BLAS symbols expected by Armadillo feature request

#365 opened Jul 19, 2019 by stevepur

Sub-optimal performance with Vega FE in FP32 SGEMM performance

#350 opened Feb 14, 2019 by SandboChang

New routine for the stride of 0 for C in CLBlastSgemmStridedBatched() is need feature request

#347 opened Jan 9, 2019 by TaihuLight

Enqueue several calls feature request

#337 opened Nov 21, 2018 by cabezabuque
