[ROCm] hipGraph support for pytorch mainline #88202

Closed
wants to merge 1 commit into from

Conversation

dllehr-amd
Contributor

@dllehr-amd dllehr-amd commented Nov 1, 2022

With the release of ROCm 5.3, HIP now supports a hipGraph implementation.

All necessary backend work and hipification is done to support the same functionality as cudaGraph.
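
For readers unfamiliar with the graph API in question, here is a minimal sketch of capture and replay through `torch.cuda.CUDAGraph`, which a ROCm build with this support would presumably back with hipGraph. The helper name `capture_and_replay_demo` and the tensor sizes are illustrative assumptions, not part of this PR. The function returns `None` when PyTorch or a GPU is unavailable.

```python
try:
    import torch
except ImportError:  # PyTorch not installed: the demo becomes a no-op
    torch = None


def capture_and_replay_demo():
    """Hypothetical helper: capture `x * 2` into a graph and replay it once."""
    if torch is None or not torch.cuda.is_available():
        return None

    static_input = torch.zeros(5, device="cuda")

    # Warm up on a side stream before capture, as the PyTorch docs recommend.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side):
        for _ in range(3):
            static_output = static_input * 2
    torch.cuda.current_stream().wait_stream(side)

    # Capture: kernels are recorded into the graph, not executed.
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph):
        static_output = static_input * 2

    # Refill the captured input in place, then replay the recorded work.
    static_input.fill_(3.0)
    graph.replay()
    return static_output.tolist()
```

The same Python code is meant to run unchanged on CUDA and ROCm builds, since the backend difference is handled by hipification rather than by a separate user-facing API.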

Unit tests are modified to support a new TEST_GRAPH feature, which allows us to create a single check for graph support instead of attempting to gather the CUDA level in annotations for every graph test.
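
As a rough illustration of the single-check idea, the sketch below computes one boolean from version strings instead of repeating a version test in each test's annotation. The function name `graph_tests_supported`, the version-string formats, and the CUDA 11.0 threshold are assumptions made for this example. Only the ROCm 5.3 threshold comes from the description above, and the actual TEST_GRAPH definition in the PR may differ.

```python
import re
from typing import Optional, Tuple


def _major_minor(version: Optional[str]) -> Optional[Tuple[int, int]]:
    """Parse the leading 'major.minor' of a version string, or return None."""
    if not version:
        return None
    match = re.match(r"(\d+)\.(\d+)", version)
    return (int(match.group(1)), int(match.group(2))) if match else None


def graph_tests_supported(
    cuda_version: Optional[str],
    hip_version: Optional[str],
    min_cuda: Tuple[int, int] = (11, 0),  # assumed threshold for CUDA graphs
    min_rocm: Tuple[int, int] = (5, 3),   # ROCm 5.3, per the PR description
) -> bool:
    """Hypothetical stand-in for a TEST_GRAPH-style flag."""
    hip = _major_minor(hip_version)
    if hip is not None:  # ROCm build: only the HIP version matters
        return hip >= min_rocm
    cuda = _major_minor(cuda_version)
    return cuda is not None and cuda >= min_cuda


# Computed once; each graph test can then use a single skip condition.
TEST_GRAPH_EXAMPLE = graph_tests_supported("11.7", None)
```

A test could then be decorated with something like `@unittest.skipIf(not TEST_GRAPH_EXAMPLE, "graph support required")` rather than re-deriving the version check inline.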

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport

@pytorch-bot

pytorch-bot bot commented Nov 1, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88202

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Failures

As of commit 394a1b6:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla

linux-foundation-easycla bot commented Nov 1, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: dllehr-amd / name: Douglas Lehr (55e2dff)

@pruthvistony
Collaborator

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict pull/88202/head returned non-zero exit code 1

Rebasing (1/1)
Auto-merging c10/cuda/CUDACachingAllocator.cpp
CONFLICT (content): Merge conflict in c10/cuda/CUDACachingAllocator.cpp
Auto-merging cmake/Dependencies.cmake
Auto-merging test/test_cuda.py
Auto-merging torch/utils/hipify/cuda_to_hip_mappings.py
error: could not apply 55e2dff055... hipGraph support for pytorch mainline
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply 55e2dff055... hipGraph support for pytorch mainline

Raised by https://github.com/pytorch/pytorch/actions/runs/3408082704

@dllehr-amd dllehr-amd marked this pull request as ready for review November 21, 2022 17:42
@pruthvistony pruthvistony added labels Nov 21, 2022: module: rocm (AMD GPU support for Pytorch); ciflow/trunk (Trigger trunk jobs on your pull request); ciflow/periodic (Trigger jobs ran periodically on master (periodic.yml) on the PR); rocm (This tag is for PRs from ROCm team); rocm priority (high priority ROCm PRs from performance or other aspects)
@pruthvistony pruthvistony changed the title hipGraph support for pytorch mainline [ROCm] hipGraph support for pytorch mainline Nov 21, 2022
@zou3519 zou3519 added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Nov 28, 2022
Collaborator

@jithunnair-amd jithunnair-amd left a comment

These changes have been tested in our fork of PyTorch with ROCm 5.3.

pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Dec 9, 2022
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Jan 7, 2023
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request Jan 16, 2023
@jithunnair-amd jithunnair-amd marked this pull request as draft January 24, 2023 22:43
@jithunnair-amd
Collaborator

@dllehr-amd Let's take this PR back out of draft when the CI is green and it's ready to be reviewed and merged by PyTorch reviewers.

jithunnair-amd pushed a commit to jithunnair-amd/pytorch that referenced this pull request Feb 1, 2023
@dllehr-amd dllehr-amd force-pushed the dllehr_graph branch 2 times, most recently from 1caa540 to 6ffcf62 Compare February 6, 2023 19:27
@dllehr-amd dllehr-amd marked this pull request as ready for review February 6, 2023 19:30
@pruthvistony
Collaborator

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased dllehr_graph onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout dllehr_graph && git pull --rebase)

@dllehr-amd
Contributor Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here

With the release of ROCm 5.3, HIP now supports a hipGraph implementation.

All necessary backend work and hipification are done to support the
same functionality as cudaGraph.

Unit tests are modified to support a new TEST_GRAPH feature,
which allows us to create a single check for graph support instead of
attempting to gather the CUDA level in annotations for every graph test.
@pytorchmergebot
Collaborator

Successfully rebased dllehr_graph onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout dllehr_graph && git pull --rebase)

@jithunnair-amd
Collaborator

jithunnair-amd commented Feb 13, 2023

@jithunnair-amd
Collaborator

@pytorchbot merge -f "Unrelated CI failures"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

Labels: ciflow/periodic (Trigger jobs ran periodically on master (periodic.yml) on the PR); ciflow/trunk (Trigger trunk jobs on your pull request); Merged; module: rocm (AMD GPU support for Pytorch); open source; rocm priority (high priority ROCm PRs from performance or other aspects); rocm (This tag is for PRs from ROCm team); triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
7 participants