[Inductor Cutlass backend] Set INDUCTOR_TEST_DISABLE_FRESH_CACHE in test setup #124574

kadeng · 2024-04-21T22:51:00Z

Stack from ghstack (oldest at bottom):

The diff #122661 introduced an automatic cache refresh mechanism that runs for all inductor-derived test cases.

However, this refresh mechanism does not appear to work properly across process boundaries, specifically when autotune_in_subproc is used, which many tests in test_cutlass_backend.py rely on.

Solution: Set the env var INDUCTOR_TEST_DISABLE_FRESH_CACHE=1 early during test setup in test_cutlass_backend.py.

Test Plan:
This is a change to unit tests only.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

[ghstack-poisoned]

pytorch-bot · 2024-04-21T22:51:03Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124574


✅ You can merge normally! (3 Unrelated Failures)

As of commit 69dcc1d with merge base failed to retrieve merge base, please contact dev infra:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

inductor / cuda12.1-py3.10-gcc9-sm86 / test (dynamic_inductor_timm, 2, 2, linux.g5.4xlarge.nvidia.gpu) (gh)
sebotnet33ts_256
pull / linux-focal-py3.11-clang10 / test (default, 1, 3, linux.2xlarge) (gh)
RuntimeError: profiler/test_profiler 1/1 failed
trunk / linux-focal-cuda12.1-py3.10-gcc9 / test (nogpu_AVX512, 1, 1, linux.2xlarge) (gh)
RuntimeError: profiler/test_profiler 1/1 failed

This comment was automatically generated by Dr. CI and updates every 15 minutes.

aakhundov · 2024-04-21T23:18:09Z

test/inductor/test_cutlass_backend.py

+        # interacts badly with persistent subprocesses during
+        # autotuning. So we need to disable automatic cache refresh
+        # before calling setUp() on the parent class.
+        os.environ["INDUCTOR_TEST_DISABLE_FRESH_CACHE"] = "1"


Should we revert this after the call to super().setUp()? Otherwise, the env var will stick around and may affect other tests running in the same process in CI?

Yes, makes sense.
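The suggestion above (set the variable before the parent's setUp() and undo it afterwards) can be sketched as follows. This is a minimal illustration, not the code that was merged: `BaseTestCase` and `CutlassLikeTest` are hypothetical stand-ins for the real inductor test classes, and whether restoring the variable immediately after `super().setUp()` is sufficient depends on when the base class reads it, which this thread does not state.

```python
import os
import unittest
from unittest import mock

ENV_VAR = "INDUCTOR_TEST_DISABLE_FRESH_CACHE"


class BaseTestCase(unittest.TestCase):
    """Hypothetical stand-in for the inductor test base class."""

    def setUp(self):
        super().setUp()
        # Record what the parent setUp() observed (for illustration only).
        self.seen_in_parent_setup = os.environ.get(ENV_VAR)


class CutlassLikeTest(BaseTestCase):
    """Hypothetical test class mirroring the pattern discussed above."""

    def setUp(self):
        # Remember the value from before this test touched the environment.
        self.original_value = os.environ.get(ENV_VAR)
        # patch.dict snapshots os.environ on entry and restores it on exit,
        # so the variable does not leak into other tests in this process.
        with mock.patch.dict(os.environ, {ENV_VAR: "1"}):
            super().setUp()

    def test_env_var_is_scoped_to_setup(self):
        # The parent setUp() saw the variable set to "1" ...
        self.assertEqual(self.seen_in_parent_setup, "1")
        # ... and afterwards the environment is back to its prior state.
        self.assertEqual(os.environ.get(ENV_VAR), self.original_value)
```

Running this with `python -m unittest` exercises the single test. Using `mock.patch.dict` rather than manual assignment and deletion means the previous value is restored even if the parent's setUp() raises.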

[ghstack-poisoned]

kadeng · 2024-04-23T18:09:00Z

@pytorchbot merge

pytorchmergebot · 2024-04-23T18:11:47Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2024-04-23T21:24:11Z

Merge failed

Reason: 3 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team

Raised by workflow job

Failing merge rule: Core Maintainers

kadeng · 2024-04-24T13:56:29Z

@pytorchbot merge -f "Failing profiler tests are known to be broken"

pytorchmergebot · 2024-04-24T13:58:11Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.


…emm_choices (#124575)

Clean up CUTLASSGemmTemplate.add_cutlass_gemm_choices, removing code that became unnecessary after the removal of EVT-based epilogue fusion.

Test Plan: Already covered by test_cutlass_backend.py

Pull Request resolved: #124575
Approved by: https://github.com/jansel
ghstack dependencies: #121497, #123930, #123932, #121734, #124107, #124574

…est setup (pytorch#124574)

The diff pytorch#122661 introduces a new automatic cache refresh mechanism during all inductor-derived test cases.

But this refresh mechanism seems not to work properly across process boundaries, specifically when using autotune_in_subproc, which many tests in test_cutlass_backend.py rely on.

Solution: Set the env var INDUCTOR_TEST_DISABLE_FRESH_CACHE=1 early during test setup within test_cutlass_backend.py

Test Plan: This is a change to unit tests only.

Pull Request resolved: pytorch#124574
Approved by: https://github.com/aakhundov
ghstack dependencies: pytorch#121497, pytorch#123930, pytorch#123932, pytorch#121734, pytorch#124107


Update

1808119

[ghstack-poisoned]

pytorch-bot bot added module: inductor topic: not user facing topic category labels Apr 21, 2024

kadeng marked this pull request as ready for review April 21, 2024 22:59

kadeng requested review from aakhundov, eellison and peterbell10 April 21, 2024 23:00

aakhundov reviewed Apr 21, 2024

View reviewed changes

kadeng added the ciflow/inductor label Apr 22, 2024

kadeng added 2 commits April 22, 2024 11:30

Update

79a4b41

[ghstack-poisoned]

Update

69dcc1d

[ghstack-poisoned]

aakhundov approved these changes Apr 22, 2024

View reviewed changes

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Apr 23, 2024

pytorchmergebot added the merging label Apr 23, 2024

pytorchmergebot removed the merging label Apr 23, 2024

pytorchmergebot added the merging label Apr 24, 2024

pytorchmergebot added the Merged label Apr 24, 2024

pytorchmergebot closed this in a47f425 Apr 24, 2024

pytorchmergebot removed the merging label Apr 24, 2024

github-actions bot deleted the gh/kadeng/53/head branch June 2, 2024 02:01
