
[pull] master from tensorflow:master #387

Merged
pull[bot] merged 6 commits into Systemmatrix555:master from
tensorflow:master
May 26, 2025

@pull pull bot commented May 26, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

olegshyshkov and others added 6 commits May 26, 2025 04:54
Move a bit of logic to LaunchTypedKernel to make function arguments simpler.

PiperOrigin-RevId: 763377863
While the symbol replacements were correct, OptimizeRTVar did not correctly update the value ranges, e.g. for the HLO in the test

```
p1 = s64[] parameter(1)
c42 = s64[] constant(42)
add = s64[] add(c42, p1)
ROOT dynamic-slice = s32[10] dynamic-slice(s32[4096] p0, s64[] add),
  dynamic_slice_sizes={10}
```

the indexing map of the dynamic slice operand p0, `(d0){rt0} -> (d0 + rt0)`, was replaced with `(d0){rt0} -> (d0 + rt0 + 42)`, with rt0 now pointing to p1.
But the range of rt0 was kept as [0, 4086] instead of being updated to [-42, 4044].
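
The range arithmetic above can be sketched as follows. This is only an illustration in Python, not the XLA implementation: `Interval`, `dynamic_slice_offset_range`, and `shift_range_for_constant_add` are hypothetical names. It assumes the original runtime variable is bounded by the dynamic-slice offset range [0, dim - slice_size], and that replacing `add = c42 + p1` by `p1` means the new variable equals the old one minus the constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [lower, upper] (hypothetical helper type)."""
    lower: int
    upper: int


def dynamic_slice_offset_range(dim_size: int, slice_size: int) -> Interval:
    # Valid start offsets for a slice of `slice_size` elements
    # taken from a dimension of `dim_size` elements.
    return Interval(0, dim_size - slice_size)


def shift_range_for_constant_add(old_range: Interval, constant: int) -> Interval:
    # If old_rt = new_rt + constant and old_rt lies in [lo, hi],
    # then new_rt lies in [lo - constant, hi - constant].
    return Interval(old_range.lower - constant, old_range.upper - constant)


# Numbers from the HLO above: s32[4096] operand, slice size 10, constant 42.
old_range = dynamic_slice_offset_range(4096, 10)
new_range = shift_range_for_constant_add(old_range, 42)
print(old_range)  # Interval(lower=0, upper=4086)
print(new_range)  # Interval(lower=-42, upper=4044)
```

Keeping the stale range [0, 4086] for the rewritten variable would describe offsets `rt0 + 42` in [42, 4128], which no longer matches the slice bounds.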

Also, we didn't check whether a constant value satisfies the constraints. So for the HLO

```
p0 = s32[100] parameter(0)
offset = s64[] constant(99)
ROOT dynamic-slice = s32[10]
  dynamic-slice(p0, offset), dynamic_slice_sizes={10}
```

the indexing of p0 became `(d0) -> (d0 + 99)` without any additional constraints.
Now we keep such constants as variables to postpone the handling until runtime.
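
To see why folding the constant is unsafe here, the sketch below checks the offset against the valid range. It is a Python illustration with hypothetical helper names (`offset_in_range`, `clamp_offset`), and it assumes the usual dynamic-slice behavior of clamping start indices to [0, dim - slice_size].

```python
def offset_in_range(offset: int, dim_size: int, slice_size: int) -> bool:
    # True if a slice of `slice_size` elements starting at `offset`
    # stays inside a dimension of `dim_size` elements.
    return 0 <= offset <= dim_size - slice_size


def clamp_offset(offset: int, dim_size: int, slice_size: int) -> int:
    # Assumed runtime behavior: out-of-range start indices are clamped.
    return max(0, min(offset, dim_size - slice_size))


# Numbers from the HLO above: s32[100] operand, slice size 10, constant 99.
print(offset_in_range(99, 100, 10))  # False
print(clamp_offset(99, 100, 10))     # 90
```

Under that assumption, the folded map `(d0) -> (d0 + 99)` would address elements 99 through 108 for d0 in [0, 9], while the slice actually reads elements 90 through 99.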

We might want to make OptimizeRTVar handle more cases (again) later, but that would require updating the value ranges.

PiperOrigin-RevId: 763379261
This reduces the memory usage by 1 KB per delegated elementwise quantized op.

PiperOrigin-RevId: 763405481
@pull pull bot added the ⤵️ pull label May 26, 2025
@pull pull bot merged commit 976eae4 into Systemmatrix555:master May 26, 2025