update triton pin for release 2.3 #121781

Open
wants to merge 7 commits into
base: release/2.3

Conversation

shunting314
Contributor

@shunting314 shunting314 commented Mar 13, 2024

The current status of the PR:

  • CI is clean. The only failures unrelated to building a wheel also happen on the base commit.
  • The perf test is mostly neutral (link). There is one extra failed model in torchbench, but the error type 'eager_two_runs_differ' indicates that it is not related to the triton upgrade. We've double-checked that the perf test job is using the correct triton pin (thanks to @huydhn and @atalman).
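
For context on the 'eager_two_runs_differ' status, here is a minimal sketch of the idea. It assumes the status means that two eager (uncompiled) runs of the same model on the same inputs disagree with each other. The names `classify_accuracy`, `outputs_match`, the toy models, and the other status labels are hypothetical and are not taken from the benchmark harness. The sketch uses only the Python standard library.

```python
import math
import random
from typing import Callable, Sequence


def outputs_match(a: Sequence[float], b: Sequence[float], tol: float = 1e-6) -> bool:
    """Return True when both output vectors agree element-wise within tol."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=tol, abs_tol=tol) for x, y in zip(a, b)
    )


def classify_accuracy(
    run_eager: Callable[[], Sequence[float]],
    run_compiled: Callable[[], Sequence[float]],
) -> str:
    """Hypothetical classifier: validate the eager baseline before blaming the compiler."""
    first = run_eager()
    second = run_eager()
    if not outputs_match(first, second):
        # The baseline itself is not reproducible, so the compiled run
        # (and therefore the triton version) is never consulted.
        return "eager_two_runs_differ"
    return "pass" if outputs_match(first, run_compiled()) else "fail_accuracy"


def deterministic_model() -> list:
    rng = random.Random(0)  # reseeded on every call, so runs repeat exactly
    return [rng.random() for _ in range(4)]


_shared_rng = random.Random(0)  # state carries over between calls


def nondeterministic_model() -> list:
    return [_shared_rng.random() for _ in range(4)]
```

In this sketch, `classify_accuracy(nondeterministic_model, deterministic_model)` returns the eager-mismatch label without ever calling the compiled path, which is why such a failure would not point at the compiler backend.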

I'll now clean up the PR to keep only the PyTorch-side changes needed to make it work with triton 2.2.x. Other hacks added to run the tests (e.g. pointing to my triton fork) will be removed.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler @amjames @desertfire @chauhang


pytorch-bot bot commented Mar 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/121781

Note: Links to docs will display an error until the docs builds have been completed.

❌ 21 New Failures, 7 Unrelated Failures

As of commit d0933e6 with merge base 86a2d67:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Mar 13, 2024
@shunting314 shunting314 added ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request labels Mar 13, 2024
@shunting314 shunting314 force-pushed the shunting-update-triton-release branch from 5e36c05 to b81ea2a Compare March 13, 2024 21:51
@shunting314 shunting314 force-pushed the shunting-update-triton-release branch 2 times, most recently from 1747abd to 87b3a3a Compare March 14, 2024 00:05
@shunting314 shunting314 requested a review from a team as a code owner March 14, 2024 00:05
@shunting314
Contributor Author

cc @atalman for the error regarding building triton wheels

@shunting314
Contributor Author

We also see a bunch of GLIBCXX related failures. I'm worried that they may hide other triton related issues. @atalman

@peterbell10
Collaborator

GLIBCXX error is fixed by #121556
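
One way to inspect this kind of mismatch is to list the `GLIBCXX_*` version tags embedded in a given `libstdc++.so.6`, similar in spirit to running `strings` on the library and filtering for `GLIBCXX`. The sketch below is an illustration only: the function names are assumptions and nothing here is taken from #121556.

```python
import re
from pathlib import Path

# Version tags look like GLIBCXX_3.4 or GLIBCXX_3.4.30 inside the binary.
_GLIBCXX_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(blob: bytes) -> list:
    """Return the distinct GLIBCXX versions found in blob, sorted numerically."""
    found = {m.group(1).decode("ascii") for m in _GLIBCXX_TAG.finditer(blob)}
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def missing_versions(required: bytes, provided: bytes) -> list:
    """Versions referenced by one binary that another libstdc++ does not define."""
    have = set(glibcxx_versions(provided))
    return [v for v in glibcxx_versions(required) if v not in have]


def glibcxx_versions_in_file(path: str) -> list:
    """Convenience wrapper that reads a shared library from disk."""
    return glibcxx_versions(Path(path).read_bytes())
```

Calling `glibcxx_versions_in_file` on the system `libstdc++.so.6` and on the copy shipped in a conda environment (both paths vary by machine) shows which one exposes the newer tags; `missing_versions` then reports the tags a binary expects that the loaded library lacks.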

@atalman atalman added the ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR label Mar 14, 2024
shunting314 and others added 5 commits March 14, 2024 13:35
…release (#115379) (#121901)

Cherry pick of #115379 from Release 2.2 that should be applied to main and Release 2.3 as well

Pull Request resolved: #121901
Approved by: https://github.com/DanilBaibak, https://github.com/jeanschmidt
Without this we get mismatches between the GLIBC and GLIBCXX ABI used
by conda packages vs pytorch.
Pull Request resolved: #121556
Approved by: https://github.com/isuruf, https://github.com/malfet
@shunting314 shunting314 force-pushed the shunting-update-triton-release branch from 87b3a3a to e34c79e Compare March 14, 2024 20:38
@shunting314 shunting314 force-pushed the shunting-update-triton-release branch 2 times, most recently from 2f83d1c to 748343a Compare March 15, 2024 05:48
@shunting314 shunting314 force-pushed the shunting-update-triton-release branch from 748343a to d0933e6 Compare March 15, 2024 05:50
@shunting314
Contributor Author

CI is finally clean! The only failures unrelated to building a wheel also happen on the base commit. I'll check how perf looks next.

@shunting314
Contributor Author

I've created a separate PR (#122139) that includes only the PyTorch-side changes needed to work with the triton release. I'll keep this PR around, since all the tests were run here.

Labels
ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR ciflow/inductor ciflow/rocm ciflow/trunk Trigger trunk jobs on your pull request module: inductor topic: not user facing topic category

4 participants