Conversation

@jeffdaily (Collaborator) commented Oct 24, 2024

Fixes #138532.

This brings hipBLAS behavior in line with cuBLAS behavior with respect to drawing the workspace from an allocation made by the caching allocator, as well as honoring the env var HIPBLAS_WORKSPACE_CONFIG.
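
For readers who want to try it, here is a minimal usage sketch (not part of this PR's diff). It assumes HIPBLAS_WORKSPACE_CONFIG follows the same `:<size in KiB>:<count>` syntax as CUBLAS_WORKSPACE_CONFIG and that it must be set before the first hipBLAS handle is created:

```python
import os

# Assumed to mirror CUBLAS_WORKSPACE_CONFIG, e.g. ":4096:8" = eight 4 MiB buffers.
# Set it before importing torch so it is in place before the first hipBLAS handle
# (i.e. the first GPU BLAS call) is created.
os.environ.setdefault("HIPBLAS_WORKSPACE_CONFIG", ":4096:8")

import torch

a = torch.randn(1024, 1024, device="cuda")  # ROCm GPUs are exposed as "cuda"
b = torch.randn(1024, 1024, device="cuda")
c = a @ b  # the GEMM workspace now comes from PyTorch's caching allocator
print(c.sum().item())
```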

cc @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

@jeffdaily added the module: rocm (AMD GPU support for PyTorch), release notes: rocm (mandatory label), rocm (this tag is for PRs from the ROCm team), rocm priority (high priority ROCm PRs from performance or other aspects), and ciflow/rocm (trigger "default" config CI on ROCm) labels on Oct 24, 2024

pytorch-bot bot commented Oct 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138791

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8f09561 with merge base 8aedc64:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@naromero77amd (Collaborator) left a comment

I verified this fix on an MI300.
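
For anyone else verifying, a hypothetical sanity check (not the exact test used on the MI300) could look like the sketch below. It only assumes that, with this change, the hipBLAS workspace is drawn from and reused by PyTorch's caching allocator rather than growing with each call:

```python
import torch

# Hypothetical check: reserved GPU memory should plateau instead of growing
# on every GEMM, as was reported in pytorch#138532.
a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")

(a @ b).sum().item()  # warm-up: allocates the workspace once
baseline = torch.cuda.memory_reserved()
print(f"reserved after warm-up: {baseline / 2**20:.1f} MiB")

for _ in range(100):
    (a @ b).sum().item()

print(f"reserved after 100 GEMMs: {torch.cuda.memory_reserved() / 2**20:.1f} MiB")
# With the fix, the two numbers should be nearly identical; the linked issue
# reported memory growth leading to OOM.
```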

@jeffdaily added the ciflow/trunk (trigger trunk jobs on your pull request) label on Oct 25, 2024
@soulitzer requested review from malfet and ngimel on October 28, 2024 15:03
@soulitzer added the triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label on Oct 28, 2024
@naromero77amd (Collaborator) commented

We have some urgency in getting this upstream since it's a showstopper bug for some users. Pinging a couple of upstream maintainers: @malfet @eqy.

I have already reviewed and approved it.

@malfet (Contributor) left a comment

LGTM, but please consider formatting changes

jeffdaily and others added 2 commits October 28, 2024 14:40
nit: missed lint

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
@jeffdaily (Collaborator, Author) commented

@pytorchbot merge

@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

rahulsingh-intel pushed a commit to rahulsingh-intel/pytorch that referenced this pull request Oct 29, 2024
Fixes pytorch#138532.

This brings hipblas behavior in line with cublas behavior with respect to setting the workspace to an allocation from the caching allocator as well as the env var HIPBLAS_WORKSPACE_CONFIG.

Pull Request resolved: pytorch#138791
Approved by: https://github.com/naromero77amd, https://github.com/eqy, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
rahulsingh-intel pushed a commit to rahulsingh-intel/pytorch that referenced this pull request Nov 5, 2024
jeffdaily added a commit to ROCm/pytorch that referenced this pull request Nov 14, 2024
jeffdaily added a commit to ROCm/pytorch that referenced this pull request Nov 14, 2024
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Nov 15, 2024
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Nov 15, 2024
jithunnair-amd pushed a commit to ROCm/pytorch that referenced this pull request Mar 17, 2025

Labels

ciflow/rocm (trigger "default" config CI on ROCm)
ciflow/trunk (trigger trunk jobs on your pull request)
Merged
module: rocm (AMD GPU support for PyTorch)
open source
release notes: rocm (mandatory label)
rocm priority (high priority ROCm PRs from performance or other aspects)
rocm (this tag is for PRs from the ROCm team)
triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ROCm] MI300X Tunable ops causes 100GB of Memory Leak leading to OOM

7 participants