[pallas backend] Implementing Strided/Scatter Access #167426
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167426
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit 5b7b216 with merge base 04a85b4. UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This could have performance implications but may not be of concern right now. (Sorry, posted at the wrong place. Please ignore.)
yarongmu-google left a comment:
There seems to be more focus on strided access here, but less on scatter?
```diff
 for inp in input_params:
-    code.writeline(f"{inp}_jax = jax.dlpack.from_dlpack({inp})")
+    code.writeline(
+        f"{inp}_jax = jax.dlpack.from_dlpack({inp}.contiguous())"
```
This may have performance implications but may not be of concern right now.
Yep, agreed. I'm prioritizing passing all the unit tests first (there are a lot!!) and then we can look at perf.
You can see me enabling unit tests one by one at the end of the test file.
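For context, a minimal standalone sketch (not the inductor codegen itself; the tensor names are illustrative) of why the `.contiguous()` call matters for perf: on a non-contiguous view it materializes a copy before the otherwise zero-copy DLPack hand-off.

```python
import torch
import jax.dlpack

x = torch.arange(12, dtype=torch.float32).reshape(3, 4)
view = x[:, ::2]                 # strided, non-contiguous view
assert not view.is_contiguous()

# .contiguous() allocates a fresh dense buffer and copies into it...
dense = view.contiguous()
# ...so the DLPack import below is zero-copy only w.r.t. that copy.
view_jax = jax.dlpack.from_dlpack(dense)
```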
torch/_inductor/codegen/pallas.py (Outdated)
```python
dtype_map = {
    torch.float32: "jnp.float32",
    torch.float64: "jnp.float64",
    torch.float16: "jnp.float16",
    torch.bfloat16: "jnp.bfloat16",
    torch.int32: "jnp.int32",
    torch.int64: "jnp.int64",
    torch.int16: "jnp.int16",
    torch.int8: "jnp.int8",
    torch.uint8: "jnp.uint8",
    torch.bool: "jnp.bool_",
}
jax_dtype = dtype_map.get(dtype, f"jnp.{dtype}")
```
Create a helper to map torch types to jnp.*.
I think you had a similar dict in the prior PR (in the output code); let's combine this into one place.
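A minimal sketch of what that shared helper could look like (the function name and module placement are hypothetical):

```python
import torch

# Single source of truth for the torch -> jnp dtype mapping used in codegen.
_TORCH_TO_JNP_DTYPE = {
    torch.float32: "jnp.float32",
    torch.float64: "jnp.float64",
    torch.float16: "jnp.float16",
    torch.bfloat16: "jnp.bfloat16",
    torch.int32: "jnp.int32",
    torch.int64: "jnp.int64",
    torch.int16: "jnp.int16",
    torch.int8: "jnp.int8",
    torch.uint8: "jnp.uint8",
    torch.bool: "jnp.bool_",
}


def torch_dtype_to_jnp(dtype: torch.dtype) -> str:
    """Return the jnp.* name to emit into generated Pallas source."""
    # Fallback strips the "torch." prefix, e.g. torch.complex64 -> "jnp.complex64".
    return _TORCH_TO_JNP_DTYPE.get(dtype, f"jnp.{str(dtype).replace('torch.', '')}")
```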
torch/_inductor/codegen/pallas.py (Outdated)
```python
# Get iteration variables from range_tree_nodes (these are the actual symbols used in indices)
iter_vars = (
    OrderedSet(self.range_tree_nodes.keys())
    if hasattr(self, "range_tree_nodes")
```
When will this be false?
```python
# Find which iteration variable(s) are used
used_vars = free_symbols & iter_vars

if len(used_vars) == 0:
    # No iteration variables, this is a constant index
    return str(index)
elif len(used_vars) == 1:
    # Single iteration variable - try to extract stride and offset
    var = next(iter(used_vars))

    # Expand and collect terms
    expanded = sympy.expand(index)

    # Try to extract coefficient (stride) and constant (offset)
    # index = stride*var + offset
    stride = expanded.coeff(var, 1)
    offset = expanded.coeff(var, 0)

    if stride is not None:
        stride_val = stride
        offset_val = offset if offset is not None else 0

        # Generate JAX slice notation
        if stride_val == 1 and offset_val == 0:
            # Contiguous access
            return "..."
        elif offset_val == 0:
            # Pure stride: ::stride
            return f"::{stride_val}"
        else:
            # Offset + stride: offset::stride
            return f"{offset_val}::{stride_val}"
elif len(used_vars) > 1:
    # Multi-dimensional indexing - need to generate proper index arrays
    # For patterns like 2*x0 + 30*x1, we need to reshape and use advanced indexing
    # For now, we'll use ellipsis which works for contiguous multi-dim access
    # and fall back to error for truly strided multi-dim cases

    # Check if all coefficients are 1 (contiguous multi-dim access)
    all_unit_stride = True
    for var in used_vars:
        coeff = index.coeff(var, 1)
        if coeff != 1:
            all_unit_stride = False
            break

    if all_unit_stride:
        # Contiguous multi-dimensional access
        return "..."
    else:
        # Strided multi-dimensional access - requires advanced indexing
        # For now, use ellipsis which may work for many cases
        # TODO: Implement proper multi-dimensional strided indexing
        return "..."
```
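To make the single-variable branch concrete, here is a standalone toy (a hypothetical helper, not the actual kernel code path) showing how the sympy coefficient extraction turns an affine index expression into slice notation:

```python
import sympy


def index_to_slice(index: sympy.Expr, var: sympy.Symbol) -> str:
    # index = stride*var + offset  ->  "offset::stride" slice notation
    expanded = sympy.expand(index)
    stride = expanded.coeff(var, 1)  # linear coefficient (the stride)
    offset = expanded.coeff(var, 0)  # constant term (the offset)
    if stride == 1 and offset == 0:
        return "..."                 # contiguous access
    if offset == 0:
        return f"::{stride}"         # pure stride
    return f"{offset}::{stride}"     # offset + stride


x0 = sympy.Symbol("x0")
print(index_to_slice(x0, x0))          # ...
print(index_to_slice(2 * x0, x0))      # ::2
print(index_to_slice(2 * x0 + 5, x0))  # 5::2
```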
I don't think you need to rewrite this logic from scratch. The block_ptr handling in the Triton backend is doing something very similar. Can we reuse that?
I shared a bunch of things via BlockPatternMatcher, but I'm not sure what more can be shared, since we are not using arange for broadcasting in Pallas.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Pull Request resolved: #167493
Approved by: https://github.com/jansel
ghstack dependencies: #167426
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben