[ET-VK] Fix force_fp16 texture bias being silently rejected for CONTIGUOUS_ANY ops #18770
…GUOUS_ANY ops The `force_fp16` path in `TagMemoryMetaPass` applies `ANY_TEXTURE` to bias ops toward texture storage. However, `try_constrain_with_arg_repset` has a packed-dim compatibility check that requires ALL of the source repset's PDIs to exist in the output repset. `ANY_TEXTURE` has 3 texture layouts (WP, HP, CP) but `CONTIGUOUS_ANY` outputs only support WP, so the check fails and the texture bias is silently dropped. Without the bias, buffer storage cascades from ops that must use buffer (e.g. embedding with vocab exceeding texture limits) into downstream ops that could use texture, causing unnecessary buffer↔texture transitions. Fix: check PDI compatibility against the intersection of arg and source repsets (what would actually be applied) rather than the raw source. The intersection of `ANY_TEXTURE ∩ CONTIGUOUS_ANY` = `WIDTH_PACKED_TEXTURE`, which IS compatible with the output. Authored by Claude. Differential Revision: [D100004702](https://our.internmc.facebook.com/intern/diff/D100004702/) [ghstack-poisoned]
Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18770.
Note: links to docs will display an error until the docs builds have been completed.
CI status as of commit f8c5861 with merge base 4afd7f9: 1 cancelled job, 2 unrelated failures.
CANCELLED JOB: one job was cancelled and should be retried.
BROKEN TRUNK: the failing jobs were already failing on the merge base. Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
85199dd into gh/SS-JIA/516/base
…GUOUS_ANY ops Pull Request resolved: #18770 The `force_fp16` path in `TagMemoryMetaPass` applies `ANY_TEXTURE` to bias ops toward texture storage. However, `try_constrain_with_arg_repset` has a packed-dim compatibility check that requires ALL of the source repset's PDIs to exist in the output repset. `ANY_TEXTURE` has 3 texture layouts (WP, HP, CP) but `CONTIGUOUS_ANY` outputs only support WP, so the check fails and the texture bias is silently dropped. Without the bias, buffer storage cascades from ops that must use buffer (e.g. embedding with vocab exceeding texture limits) into downstream ops that could use texture, causing unnecessary buffer↔texture transitions. Fix: check PDI compatibility against the intersection of arg and source repsets (what would actually be applied) rather than the raw source. The intersection of `ANY_TEXTURE ∩ CONTIGUOUS_ANY` = `WIDTH_PACKED_TEXTURE`, which IS compatible with the output. Authored by Claude. ghstack-source-id: 364280901 @exported-using-ghexport Differential Revision: [D100004702](https://our.internmc.facebook.com/intern/diff/D100004702/)
The `force_fp16` path in `TagMemoryMetaPass` applies `ANY_TEXTURE` to bias ops toward texture storage. However, `try_constrain_with_arg_repset` has a packed-dim compatibility check that requires ALL of the source repset's PDIs to exist in the output repset. `ANY_TEXTURE` has 3 texture layouts (WP, HP, CP) but `CONTIGUOUS_ANY` outputs only support WP, so the check fails and the texture bias is silently dropped.

Without the bias, buffer storage cascades from ops that must use buffer (e.g. embedding with vocab exceeding texture limits) into downstream ops that could use texture, causing unnecessary buffer↔texture transitions.

Fix: check PDI compatibility against the intersection of arg and source repsets (what would actually be applied) rather than the raw source. The intersection `ANY_TEXTURE ∩ CONTIGUOUS_ANY` = `WIDTH_PACKED_TEXTURE`, which IS compatible with the output.
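
To make the difference between the old and new checks concrete, here is a minimal sketch. It is not the ExecuTorch implementation. It assumes a repset can be modeled as a frozenset of `(storage, packed_dim)` pairs, and that `CONTIGUOUS_ANY` means width-packed buffer or width-packed texture. The helpers `packed_dims`, `constrain_old`, and `constrain_new` are hypothetical names invented for illustration.

```python
# Illustrative model only: the constant names mirror the PR text, not the real
# ExecuTorch classes. A "repset" here is a frozenset of (storage, packed_dim).
BUFFER, TEXTURE = "buffer", "texture"

ANY_TEXTURE = frozenset({(TEXTURE, "WP"), (TEXTURE, "HP"), (TEXTURE, "CP")})
WIDTH_PACKED_TEXTURE = frozenset({(TEXTURE, "WP")})
# Assumption: CONTIGUOUS_ANY allows buffer or texture, width-packed only.
CONTIGUOUS_ANY = frozenset({(BUFFER, "WP"), (TEXTURE, "WP")})


def packed_dims(repset):
    """Return the set of packed dims that appear in a repset."""
    return {dim for _storage, dim in repset}


def constrain_old(arg, source, out):
    """Old check: every packed dim of the raw source must exist in out."""
    if not packed_dims(source) <= packed_dims(out):
        return arg  # constraint rejected, arg is left unchanged
    return (arg & source) or arg


def constrain_new(arg, source, out):
    """New check: test only what would actually be applied (arg & source)."""
    applied = arg & source
    if not applied or not packed_dims(applied) <= packed_dims(out):
        return arg
    return applied


arg = out = CONTIGUOUS_ANY
# Old behavior: HP and CP are missing from out, so the bias is dropped.
print(constrain_old(arg, ANY_TEXTURE, out) == CONTIGUOUS_ANY)
# New behavior: only the intersection is checked, so the arg narrows to texture.
print(constrain_new(arg, ANY_TEXTURE, out) == WIDTH_PACKED_TEXTURE)
```

Under these assumptions both comparisons print `True`: the old check leaves the arg at `CONTIGUOUS_ANY` (buffer still allowed), while the new check narrows it to `WIDTH_PACKED_TEXTURE`.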
Authored by Claude.
Differential Revision: D100004702