Insights: ROCm/hipBLASLt
Overview
- 19 Merged pull requests
- 17 Open pull requests
- 1 Closed issue
- 0 New issues
19 Pull requests merged by 13 people
- code-gen: improved tail loop and edge tile of swizzled A
  #1421 merged Dec 20, 2024
- Fix incorrect type casting for alpha and beta in f16 compute type
  #1462 merged Dec 20, 2024
- Support UseSgprForGRO for dtva
  #1437 merged Dec 19, 2024
- Update gfx942 NT/TN/NN FP8/BF8/BF16 Equality
  #1463 merged Dec 19, 2024
- gfx942_80cu BBS NN NT TN Tuning Release
  #1459 merged Dec 18, 2024
- Revert "Fix incorrect type casting for alpha and beta in f16 compute type"
  #1461 merged Dec 18, 2024
- Fix invalid string printed when running hipblaslt-test
  #1428 merged Dec 17, 2024
- Fix incorrect type casting for alpha and beta in f16 compute type
  #1454 merged Dec 17, 2024
- Update solutions for hstu bmm 3 sizes.
  #1453 merged Dec 17, 2024
- Update gfx942_80cu NT/TN/NN f16/f32 Equality
  #1452 merged Dec 17, 2024
- gfx942 38cu HSS/BSS NN TN NT grid tune
  #1448 merged Dec 16, 2024
- gfx942 38cu SGEMM NN TN NT grid tune
  #1439 merged Dec 16, 2024
- gfx942 38cu F8HS NN TN NT grid tune
  #1440 merged Dec 16, 2024
- Modified the Sparse Test in Tensile Lite
  #1450 merged Dec 16, 2024
- [Hotfix] correct occupancy calculation
  #1451 merged Dec 16, 2024
- Patch client writer
  #1447 merged Dec 16, 2024
- Bump rocm-docs-core from 1.10.0 to 1.11.0 in /docs/sphinx
  #1427 merged Dec 14, 2024
- Remove functions for generating file paths
  #1402 merged Dec 13, 2024
- Update gfx942 BBS NN/NT/TN Equality/GridBased yamls for 1204 bbs bmm dynamic
  #1446 merged Dec 13, 2024
17 Pull requests opened by 14 people
- correct USE_HIP_FP8_DEF define condition
  #1455 opened Dec 16, 2024
- Fix garbage value of bias_type
  #1457 opened Dec 17, 2024
- Bump rocm-docs-core from 1.11.0 to 1.12.0 in /docs/sphinx
  #1458 opened Dec 17, 2024
- Use archVGPR when accVGPR is not enough.
  #1460 opened Dec 18, 2024
- gfx12 - enlarging tolerance is not needed
  #1464 opened Dec 18, 2024
- Add BBS support for find_exact.py
  #1465 opened Dec 19, 2024
- Create README.md for TensileLite
  #1466 opened Dec 19, 2024
- Update BBS NN/NT/TN equality tuning for gfx942_80cu
  #1467 opened Dec 19, 2024
- Update README.md
  #1468 opened Dec 19, 2024
- Provide unsurprising behavior for tensile threads
  #1469 opened Dec 19, 2024
- Aquavanjaram 20CU equality GEMM tuning updates for TF32 NN and TN data type
  #1470 opened Dec 19, 2024
- Changes to exclude StreamK by default
  #1471 opened Dec 19, 2024
- Build prototype
  #1472 opened Dec 19, 2024
- Tune Aquavanjaram942 Grid sizes for BBS & HHS NN
  #1473 opened Dec 19, 2024
- Gfx942 80cu grid based and equality tuning for HHS NN
  #1474 opened Dec 19, 2024
- Update rocm-docs-core to 1.12.0
  #1475 opened Dec 19, 2024
- update 942 7 range F8HS Gridebase
  #1476 opened Dec 20, 2024
1 Issue closed by 1 person
- [Issue]: "unused" dependencies in requirements.txt
  #1441 closed Dec 16, 2024
12 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [Issue]: [Documentation] Document gfx908 support
  #1299 commented on Dec 19, 2024 • 0 new comments
- [Issue]: how to make sure the gemm operation using hipblasLtMatmul()
  #1442 commented on Dec 16, 2024 • 0 new comments
- dot2 fp16 mac kernel for gfx942
  #1258 commented on Dec 14, 2024 • 0 new comments
- Added multiple devices support for matrix transform
  #1338 commented on Dec 17, 2024 • 0 new comments
- Remove merge-files option
  #1407 commented on Dec 19, 2024 • 0 new comments
- Update gfx942_80cu TN/NN f16 gridbased
  #1419 commented on Dec 20, 2024 • 0 new comments
- Update gfx942_80cu BF16 TN equality
  #1420 commented on Dec 20, 2024 • 0 new comments
- Optimize generator
  #1424 commented on Dec 19, 2024 • 0 new comments
- Add reject states for failing streamk params
  #1425 commented on Dec 20, 2024 • 0 new comments
- Support conjugate-transpose as equivalent to transpose
  #1429 commented on Dec 19, 2024 • 0 new comments
- [Draft] Distributed tuning
  #1435 commented on Dec 19, 2024 • 0 new comments
- Tcl cleanup
  #1444 commented on Dec 19, 2024 • 0 new comments