Redo add parallel_for to torch/csrc/stable #166695

mikaylagawarecki · 2025-10-31T04:48:27Z

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

pytorch-bot · 2025-10-31T04:48:30Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166695

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

ROCm failures during provisioning step due to network issues

❌ 1 New Failure

As of commit b8cb398 with merge base 4295a9a:

NEW FAILURE - The following job has failed:

Check Labels / Check labels (gh)
RuntimeError: Error checking labels: PR does not have required labels

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: aecb681 Pull Request resolved: #166695

github-actions · 2025-10-31T04:49:12Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with `release notes:`.

If not, please add the `topic: not user facing` label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

[ghstack-poisoned]

ghstack-source-id: a1dab1c Pull Request resolved: #166695

[ghstack-poisoned]

ghstack-source-id: 0bc9afd Pull Request resolved: #166695

[ghstack-poisoned]

ghstack-source-id: c918b82 Pull Request resolved: #166695

[ghstack-poisoned]

ghstack-source-id: f439b80 Pull Request resolved: #166695

mikaylagawarecki · 2025-11-03T21:35:52Z

torch/csrc/stable/ops.h

+    const int64_t end,
+    const int64_t grain_size,
+    const F& f) {
+  if (end - begin < grain_size) {


@malfet, I want to double-check that this matches what you asked for in chat (I took you to mean `end - start < grain_size`):

> Maybe the logic we want to preserve in the header is only this: if the range is trivial (i.e. end - start < 1), we just run this thing sequentially.

If I misunderstood and you actually meant that we want to preserve the following in the header, I'll redo this:

pytorch/aten/src/ATen/Parallel-inl.h

Lines 23 to 31 in 11f73d7

    const bool use_parallel =
        (numiter > grain_size && numiter > 1 && !at::in_parallel_region() &&
         at::get_num_threads() > 1);
    if (!use_parallel) {
      internal::ThreadIdGuard tid_guard(0);
      c10::ParallelGuard guard(true);
      f(begin, end);
      return;
    }
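To make the two candidate checks easier to compare, here is a minimal sketch that restates them as pure predicates. The function names are hypothetical, and the two runtime queries (`at::in_parallel_region()` and `at::get_num_threads()`) are passed in as plain values so the sketch has no ATen dependency. It is an illustration of the conditions quoted in this thread, not the code in the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical restatement of the condition quoted from Parallel-inl.h.
// in_parallel_region and num_threads stand in for the ATen runtime queries.
inline bool should_run_parallel(
    int64_t begin,
    int64_t end,
    int64_t grain_size,
    bool in_parallel_region,
    int num_threads) {
  const int64_t numiter = end - begin;
  return numiter > grain_size && numiter > 1 && !in_parallel_region &&
      num_threads > 1;
}

// Hypothetical restatement of the header-only check from the diff hunk:
// run f(begin, end) inline when the range is shorter than one grain.
inline bool header_only_fallback(
    int64_t begin,
    int64_t end,
    int64_t grain_size) {
  return end - begin < grain_size;
}
```

Under these definitions the two checks disagree only at the boundary: when `end - begin == grain_size`, the quoted condition stays sequential while the header-only check does not take the inline path.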

mikaylagawarecki · 2025-11-03T21:42:15Z

torch/csrc/stable/ops.h

+    const int64_t grain_size,
+    const F& f) {
+  if (end - begin < grain_size) {
+    f(begin, end);


Also I didn't add this as I wasn't sure if it's actually needed

    internal::ThreadIdGuard tid_guard(0);
    c10::ParallelGuard guard(true);

Let me know if I should shim and add these here
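For context on what shimming these would involve, guards of this kind usually follow a save-and-restore RAII pattern over thread-local state. The sketch below shows that pattern with hypothetical names (`sketch::ThreadIdxGuard`, `sketch::InParallelGuard`, `sketch::run_inline`) and its own thread-local variables. It is an assumption-laden illustration, not the ATen or c10 implementation.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical thread-local state standing in for the runtime's own.
thread_local int tls_thread_idx = 0;
thread_local bool tls_in_parallel = false;

// Sets the thread index for the current scope and restores it on exit.
class ThreadIdxGuard {
 public:
  explicit ThreadIdxGuard(int new_idx) : old_(tls_thread_idx) {
    tls_thread_idx = new_idx;
  }
  ~ThreadIdxGuard() {
    tls_thread_idx = old_;
  }
  ThreadIdxGuard(const ThreadIdxGuard&) = delete;
  ThreadIdxGuard& operator=(const ThreadIdxGuard&) = delete;

 private:
  int old_;
};

// Marks the current scope as a parallel region and restores the flag on exit.
class InParallelGuard {
 public:
  explicit InParallelGuard(bool state) : old_(tls_in_parallel) {
    tls_in_parallel = state;
  }
  ~InParallelGuard() {
    tls_in_parallel = old_;
  }
  InParallelGuard(const InParallelGuard&) = delete;
  InParallelGuard& operator=(const InParallelGuard&) = delete;

 private:
  bool old_;
};

// Sequential path: the callback sees thread index 0 and "in parallel" state.
template <class F>
void run_inline(int64_t begin, int64_t end, const F& f) {
  ThreadIdxGuard tid_guard(0);
  InParallelGuard guard(true);
  f(begin, end);
}

} // namespace sketch
```

Without guards like these, a callback on the sequential path would observe whatever thread index and region flag the caller already had, which may or may not matter for the kernels this API targets.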

swolchok

this looks fine to me except possibly the TODO. leaving for @malfet since it's in response to a comment of his IIUC

swolchok · 2025-11-03T21:56:44Z

test/cpp_extensions/libtorch_agnostic_extension/libtorch_agnostic/csrc/kernel.cpp

+  int64_t sizes[] = {size};
+  int64_t strides[] = {1};


you might find that you get internal lint complaints about C arrays because of this. Since you specifically need an array of size 1, you can just not make these arrays:

    int64_t stride = 1;
    aoti_torch_empty_strided(
        1, &size, &stride, // ...
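The suggestion works because a pointer to a single scalar is a valid pointer to a one-element sequence. A minimal sketch of the idea follows, using a hypothetical function `storage_extent` that takes `(ndim, sizes, strides)` in the same C style. It is a stand-in for illustration only and does not reproduce the `aoti_torch_empty_strided` signature.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical C-style API taking a dimension count plus size and stride
// pointers. Returns how many elements of storage the layout spans.
inline int64_t storage_extent(
    int64_t ndim,
    const int64_t* sizes,
    const int64_t* strides) {
  int64_t extent = 1;
  for (int64_t d = 0; d < ndim; ++d) {
    if (sizes[d] == 0) {
      return 0; // an empty dimension means no storage is needed
    }
    extent += (sizes[d] - 1) * strides[d];
  }
  return extent;
}

// One-dimensional case without C arrays: pass the addresses of the scalars.
inline int64_t extent_1d(int64_t size) {
  int64_t stride = 1;
  return storage_extent(1, &size, &stride);
}
```

The scalars only need to outlive the call, which holds here because both live in the enclosing function's scope.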

malfet · 2025-11-05T21:25:05Z

torch/csrc/stable/c/shim.h

+
+// Get the current thread index in a parallel region
+// Returns 0 if not in a parallel region
+AOTI_TORCH_EXPORT int32_t torch_get_thread_idx();


Is there a document on signed vs. unsigned types? I.e., what would thread_idx == -1 mean?
Is there an expectation that it should be in the range [0, torch_get_max_threads())?
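One reason the range question matters is that callers often use the thread index to address per-thread scratch space. The sketch below shows a defensive version of that pattern, assuming a buffer sized by the maximum thread count. The helper `add_partial` is hypothetical, and the thread index and thread count are supplied by the caller rather than read from the shim.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: validates a signed thread index before using it to
// index per-thread partial results. Returns false if the index is outside
// [0, partials.size()), leaving the buffer untouched.
inline bool add_partial(
    std::vector<int64_t>& partials,
    int32_t thread_idx,
    int64_t value) {
  if (thread_idx < 0 ||
      static_cast<std::size_t>(thread_idx) >= partials.size()) {
    return false;
  }
  partials[static_cast<std::size_t>(thread_idx)] += value;
  return true;
}
```

If the API documented that the index always lies in [0, max threads), callers could drop the check; with a signed return type and no stated contract, the check is the conservative choice.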

malfet · 2025-11-05T21:29:11Z

torch/csrc/stable/ops.h

+    const int64_t end,
+    const int64_t grain_size,
+    const F& f) {
+  if (end - begin < grain_size) {


IMO this is an optimization that could land later, and it may not be needed here: when a dev calls at::parallel_for, their regions are usually reasonably large.

Redo add parallel_for to torch/csrc/stable

a5f9534

[ghstack-poisoned]

mikaylagawarecki mentioned this pull request Oct 30, 2025

Add torch::stable::Device #166579

Open

mikaylagawarecki mentioned this pull request Oct 31, 2025

Add stable::Tensor.device() #166694

Open

mikaylagawarecki added a commit that referenced this pull request Oct 31, 2025

Redo add parallel_for to torch/csrc/stable

1030055

ghstack-source-id: aecb681 Pull Request resolved: #166695

Update on "Redo add parallel_for to torch/csrc/stable"

88e0dcc

[ghstack-poisoned]

mikaylagawarecki added a commit that referenced this pull request Oct 31, 2025

Redo add parallel_for to torch/csrc/stable

615ba37

ghstack-source-id: a1dab1c Pull Request resolved: #166695

Update on "Redo add parallel_for to torch/csrc/stable"

ff2d70b

[ghstack-poisoned]

mikaylagawarecki added a commit that referenced this pull request Oct 31, 2025

Redo add parallel_for to torch/csrc/stable

64f8d99

ghstack-source-id: 0bc9afd Pull Request resolved: #166695

Update on "Redo add parallel_for to torch/csrc/stable"

7cdbecb

[ghstack-poisoned]

mikaylagawarecki added a commit that referenced this pull request Oct 31, 2025

Redo add parallel_for to torch/csrc/stable

35e009e

ghstack-source-id: c918b82 Pull Request resolved: #166695

Update on "Redo add parallel_for to torch/csrc/stable"

b8cb398

[ghstack-poisoned]

mikaylagawarecki added a commit that referenced this pull request Nov 3, 2025

Redo add parallel_for to torch/csrc/stable

ab4d892

ghstack-source-id: f439b80 Pull Request resolved: #166695

mikaylagawarecki commented Nov 3, 2025

mikaylagawarecki mentioned this pull request Nov 3, 2025

Add stable parallel_for #161320

Closed

mikaylagawarecki requested review from malfet and swolchok November 3, 2025 21:45

mikaylagawarecki marked this pull request as ready for review November 3, 2025 21:45

mikaylagawarecki requested a review from janeyx99 as a code owner November 3, 2025 21:45

swolchok reviewed Nov 3, 2025

malfet approved these changes Nov 5, 2025
