🚀 Feature
Dynamic constraints would require inserting executor-specific checks into the prologue trace, given that backends might have different dynamic constraints.
A quick example: given a program to be compiled where we would expect the reduction axis to be a direct input to the program, the compute trace contains an `ltorch.sum` whose reduction axis is only known at runtime.
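Such a program might look like the following (a hypothetical sketch; the function name `foo` is illustrative and not from the original issue):

```python
import torch

def foo(x: torch.Tensor, dim: int) -> torch.Tensor:
    # `dim`, the reduction axis, is a direct input to the program,
    # so its value is only known at call time.
    return torch.sum(x, dim)
```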
Which checks are needed depends on which backend claims the `ltorch.sum`. For torchex, since aten can handle an arbitrary `reducedim` given at runtime, we can re-use the cached program and there's no need to insert any check on `arg[1]`.

On the contrary, nvfuserex would require the program to bake in the reduction axis as a compile-time constant. So we'd want to insert that as part of the prologue trace checks.
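To make the contrast concrete, here is a hedged pure-Python sketch (the function names and the example shape `(4, 8)` are illustrative, not thunder's actual prologue trace output) of the checks each backend's prologue would need:

```python
def prologue_torchex(x_shape, dim):
    # torchex: aten handles an arbitrary reduction dim at runtime,
    # so the prologue only re-validates the tensor metadata.
    assert x_shape == (4, 8), "shape mismatch; recompile"

def prologue_nvfuser(x_shape, dim, baked_dim):
    # nvfuserex: the reduction axis was baked in at compile time,
    # so the prologue must also check that `dim` still matches it.
    assert x_shape == (4, 8), "shape mismatch; recompile"
    assert dim == baked_dim, "reduction axis changed; recompile"
```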
Alternatives
Alternative 0: we can converge on the most conservative backends and apply a simpler caching strategy at the primitive level. In the example above, we'd just require the reduction axis to stay a compile-time constant across the board, even though the cache entry could be re-used for some executors. This would unfortunately give us some spurious cache misses but would be easier to plumb through.
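A minimal sketch of Alternative 0 (hypothetical names, not thunder's actual cache machinery): the cache key always bakes in the reduction axis, matching the most conservative backend, so changing `dim` forces a recompile even when torchex alone could have re-used the entry.

```python
_cache = {}

def get_compiled(x_shape, dim, compile_fn):
    # The reduction axis is part of the key across the board.
    key = (x_shape, dim)
    if key not in _cache:
        _cache[key] = compile_fn(x_shape, dim)
    return _cache[key]
```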
Alternative 1: Thunder as a system can establish a caching strategy. When a backend sees a cache requirement on a certain op that it cannot fulfill, the backend could simply reject the operation.
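A minimal sketch of Alternative 1 (a hypothetical API, not thunder's actual claiming mechanism): the system attaches a cache requirement to an op, and a backend that cannot fulfill it declines to claim the op rather than forcing a stricter cache policy on everyone.

```python
def backend_claims(backend_constraints, op_cache_requirements):
    # The backend claims the op only if it supports every requirement,
    # e.g. treating the reduction axis as a runtime value.
    return op_cache_requirements <= backend_constraints

TORCHEX = {"dynamic_reduce_dim"}   # aten reduces over any dim at runtime
NVFUSEREX = set()                  # reduction axis must be a compile-time constant
```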