
Conversation

coconutruben
Contributor

@coconutruben commented Dec 6, 2024

Summary:

Why

  • sampling the same config multiple times is wasteful, especially with exhaustive autotuning
  • for AMD we rewrite the configs to use a specific number of stages, which can make the same config appear multiple times

What

cast the configs, already defined as a tuple, through a set to remove duplicates
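A minimal sketch of this dedup, assuming the configs are plain (BLOCK_M, BLOCK_N, BLOCK_K, num_stages, num_warps) tuples as in the test plan below; the variable names are illustrative, not the exact ones in the file:

```
# Hypothetical config tuples: (BLOCK_M, BLOCK_N, BLOCK_K, num_stages, num_warps).
configs = (
    (16, 32, 64, 0, 4),
    (16, 32, 64, 0, 4),  # duplicate, e.g. after num_stages was rewritten to 0
    (32, 64, 64, 0, 8),
)
# Round-tripping through a set drops exact duplicates (ordering is not preserved).
deduped = tuple(set(configs))
assert sorted(deduped) == [(16, 32, 64, 0, 4), (32, 64, 64, 0, 8)]
```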

Test Plan:
taken from the `mm_kernel_configs` logic in the same file

```
>>> import itertools
>>> mm_kernel_configs = [
...     {"config": (BLOCK_M, BLOCK_N, BLOCK_K, num_stages, num_warps), "cond": True}
...     for BLOCK_M, BLOCK_N, BLOCK_K in itertools.product(
...         [16, 32, 64, 128, 256], repeat=3
...     )
...     for num_stages in [1, 2, 3, 4, 5]
...     for num_warps in [2, 4, 8]
... ]
>>> configs = [c['config'] for c in mm_kernel_configs]
>>> a = tuple((c[0], c[1], c[2], 0, c[4]) for c in configs)
>>> len(set(a))
375
>>> len(a)
1875
```
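For reference, pinning num_stages to 0 collapses the 5 stage choices, so the 5³ = 125 block-shape combinations × 3 num_warps values leave 375 unique configs out of the original 1875.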

Differential Revision: D66893774

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov


pytorch-bot bot commented Dec 6, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/142254

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit 5cc9e03 with merge base 960a81f:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D66893774


netlify bot commented Dec 6, 2024

Deploy Preview for chimerical-cranachan-793287 ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | facf5c62ab249dfaeddb85587ce66d1d04f5839e |
| 🔍 Latest deploy log | https://app.netlify.com/sites/chimerical-cranachan-793287/deploys/675381620f558b00089283ea |
| 😎 Deploy Preview | https://deploy-preview-142254--chimerical-cranachan-793287.netlify.app |

coconutruben added a commit to coconutruben/pytorch that referenced this pull request Dec 6, 2024
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D66893774

@coconutruben added the topic: not user facing label Dec 6, 2024
coconutruben added a commit to coconutruben/pytorch that referenced this pull request Dec 6, 2024
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D66893774

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D66893774

@pytorch-bot bot added the ciflow/trunk label Dec 9, 2024
@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: Check the merge workflow status here.
