Conversation

@jeffdaily (Collaborator) commented Oct 24, 2024

Fixes #138532.

This brings hipBLAS behavior in line with cuBLAS behavior with respect to drawing the workspace from an allocation made by the caching allocator, as well as honoring the env var HIPBLAS_WORKSPACE_CONFIG.
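
For readers who want to try it, here is a minimal usage sketch (not part of this PR's diff). It assumes HIPBLAS_WORKSPACE_CONFIG follows the same `:<size in KiB>:<count>` syntax as CUBLAS_WORKSPACE_CONFIG and that it must be set before the first hipBLAS handle is created:

```python
import os

# Assumed to mirror CUBLAS_WORKSPACE_CONFIG, e.g. ":4096:8" = eight 4 MiB buffers.
# Set it before importing torch so it is in place before the first hipBLAS handle
# (i.e. the first GPU BLAS call) is created.
os.environ.setdefault("HIPBLAS_WORKSPACE_CONFIG", ":4096:8")

import torch

a = torch.randn(1024, 1024, device="cuda")  # ROCm GPUs are exposed as "cuda"
b = torch.randn(1024, 1024, device="cuda")
c = a @ b  # the GEMM workspace now comes from PyTorch's caching allocator
print(c.sum().item())
```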

cc @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

@jeffdaily added the module: rocm (AMD GPU support for PyTorch), release notes: rocm (mandatory label), rocm (this tag is for PRs from the ROCm team), rocm priority (high priority ROCm PRs from performance or other aspects), and ciflow/rocm (trigger "default" config CI on ROCm) labels on Oct 24, 2024

pytorch-bot bot commented Oct 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138791

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8f09561 with merge base 8aedc64:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@naromero77amd (Collaborator) left a comment

I verified this fix on an MI300.
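
For anyone else verifying, a hypothetical sanity check (not the exact test used on the MI300) could look like the sketch below. It only assumes that, with this change, the hipBLAS workspace is drawn from and reused by PyTorch's caching allocator rather than growing with each call:

```python
import torch

# Hypothetical check: reserved GPU memory should plateau instead of growing
# on every GEMM, as was reported in pytorch#138532.
a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")

(a @ b).sum().item()  # warm-up: allocates the workspace once
baseline = torch.cuda.memory_reserved()
print(f"reserved after warm-up: {baseline / 2**20:.1f} MiB")

for _ in range(100):
    (a @ b).sum().item()

print(f"reserved after 100 GEMMs: {torch.cuda.memory_reserved() / 2**20:.1f} MiB")
# With the fix, the two numbers should be nearly identical; the linked issue
# reported memory growth leading to OOM.
```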

@jeffdaily added the ciflow/trunk (trigger trunk jobs on your pull request) label on Oct 25, 2024
@soulitzer requested review from malfet and ngimel on October 28, 2024 15:03
@soulitzer added the triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label on Oct 28, 2024
@naromero77amd (Collaborator) commented

We have some urgency in getting this upstream since it's a showstopper bug for some users. Pinging a couple of upstream maintainers: @malfet @eqy.

I have already reviewed and approved it.

@malfet (Contributor) left a comment

LGTM, but please consider formatting changes

jeffdaily and others added 2 commits October 28, 2024 14:40
nit: missed lint

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
@jeffdaily (Collaborator, Author) commented

@pytorchbot merge

@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

rahulsingh-intel pushed a commit to rahulsingh-intel/pytorch that referenced this pull request Oct 29, 2024
Fixes pytorch#138532.

This brings hipblas behavior in line with cublas behavior with respect to setting the workspace to an allocation from the caching allocator as well as the env var HIPBLAS_WORKSPACE_CONFIG.

Pull Request resolved: pytorch#138791
Approved by: https://github.com/naromero77amd, https://github.com/eqy, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
rahulsingh-intel pushed a commit to rahulsingh-intel/pytorch that referenced this pull request Nov 5, 2024
jeffdaily added a commit to ROCm/pytorch that referenced this pull request Nov 14, 2024
jeffdaily added a commit to ROCm/pytorch that referenced this pull request Nov 14, 2024
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Nov 15, 2024
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Nov 15, 2024
jithunnair-amd pushed a commit to ROCm/pytorch that referenced this pull request Mar 17, 2025

Labels

ciflow/rocm (trigger "default" config CI on ROCm)
ciflow/trunk (trigger trunk jobs on your pull request)
Merged
module: rocm (AMD GPU support for PyTorch)
open source
release notes: rocm (mandatory label)
rocm priority (high priority ROCm PRs from performance or other aspects)
rocm (this tag is for PRs from the ROCm team)
triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ROCm] MI300X Tunable ops causes 100GB of Memory Leak leading to OOM

7 participants