PropagateDevicePass inserts H2D/D2H copy ops at delegate boundaries (#19921)
Gasoonjia wants to merge 1 commit into
Helpful Links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19921
Note: Links to docs will display an error until the docs builds have been completed.
No failures, 1 pending as of commit 7743dde with merge base eeb0646. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@Gasoonjia has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99636777.
JacobSzwejbka left a comment:
Review automatically exported from Phabricator review in Meta.
digantdesai left a comment:
Review automatically exported from Phabricator review in Meta.
…19921)

Summary:
Extend PropagateDevicePass to insert explicit et_copy._h2d_copy and et_copy._d2h_copy ops at delegate boundaries, making the graph functional by explicitly transferring data between CPU and device memory.

Key changes:
- Inserts _h2d_copy before each delegate input, _d2h_copy after each output
- Original input nodes stay CPU; h2d_copy output tagged as device
- Getitem nodes inherit device; d2h_copy output tagged as CPU
- Skip-copy optimizations via skip_h2d_for_method_inputs/skip_d2h_for_method_outputs
- _parse_device_spec_value: lowercases string, raises ValueError for unknown types
- _program.py passes config flags to PropagateDevicePass constructor

Reviewed By: JacobSzwejbka
Differential Revision: D99636777
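To make the boundary rewrite in the summary above concrete, here is a minimal toy sketch in plain Python. It is not the code from this PR and does not use the ExecuTorch or torch.fx APIs: the `Node` class, the `insert_boundary_copies` function, the `"cuda"` device string, and the generated `_h2d`/`_d2h` node names are all hypothetical. It also assumes that the two skip flags mean "do not copy when the delegate input is a method input" and "do not copy when the delegate output is a method output", which the summary does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a graph node (not an ExecuTorch type)."""
    name: str
    op: str  # "placeholder", "delegate", "getitem", "output", ...
    inputs: List[str] = field(default_factory=list)
    device: str = "cpu"


def insert_boundary_copies(
    nodes: List[Node],
    device: str = "cuda",  # assumed device label, for illustration only
    skip_h2d_for_method_inputs: bool = False,
    skip_d2h_for_method_outputs: bool = False,
) -> List[Node]:
    out: List[Node] = []
    seen: Dict[str, Node] = {}
    rewired: Dict[str, str] = {}  # getitem name -> name of its d2h copy
    method_outputs = {i for n in nodes if n.op == "output" for i in n.inputs}

    def emit(n: Node) -> None:
        out.append(n)
        seen[n.name] = n

    for node in nodes:
        # Downstream users read the CPU copy instead of the device getitem.
        inputs = [rewired.get(i, i) for i in node.inputs]
        if node.op == "delegate":
            dev_inputs = []
            for src in inputs:
                if skip_h2d_for_method_inputs and seen[src].op == "placeholder":
                    dev_inputs.append(src)  # assumed meaning of the flag
                    continue
                # The original input stays on CPU; the copy is tagged as device.
                copy = Node(f"{src}_h2d", "_h2d_copy", [src], device)
                emit(copy)
                dev_inputs.append(copy.name)
            emit(Node(node.name, "delegate", dev_inputs, device))
        elif node.op == "getitem" and inputs and seen[inputs[0]].op == "delegate":
            # Getitem nodes inherit the delegate's device.
            emit(Node(node.name, "getitem", inputs, device))
            if skip_d2h_for_method_outputs and node.name in method_outputs:
                continue  # assumed meaning of the flag
            copy = Node(f"{node.name}_d2h", "_d2h_copy", [node.name], "cpu")
            emit(copy)
            rewired[node.name] = copy.name
        else:
            emit(Node(node.name, node.op, inputs, node.device))
    return out


graph = [
    Node("x", "placeholder"),
    Node("lowered", "delegate", ["x"]),
    Node("y", "getitem", ["lowered"]),
    Node("out", "output", ["y"]),
]
rewritten = insert_boundary_copies(graph)
skipped = insert_boundary_copies(
    graph, skip_h2d_for_method_inputs=True, skip_d2h_for_method_outputs=True
)
for n in rewritten:
    print(n.name, n.op, n.inputs, n.device)
```

In this toy model the default run places an `_h2d_copy` in front of the delegate and a `_d2h_copy` behind its getitem, while the run with both flags set leaves the four original nodes in place and only retags the delegate and getitem as device. The sketch ignores cases a real pass has to handle, such as one value feeding several delegates or a delegate output that is both a method output and an input to a CPU op.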
85895ed to 7743dde