[cpu-offloading] Implement prefetching for cpu offloading pass #3166
mlazos wants to merge 5 commits into gh/mlazos/2/base
[ghstack-poisoned]
@claude can you review this and ensure that the prefetching per layer does not miss any edge cases and that the tests do not mask any behavior
Claude finished @mlazos's task in 2m 44s
Review: CPU Offload Prefetching Implementation
Prefetching Edge Cases
1.
[ghstack-poisoned]
@claude can you review to see if we are missing any overlap opportunities or edge cases?
Claude finished @mlazos's task in 3m 41s
Review: Overlap Opportunities & Edge Cases
The core design is sound: decoupling
Missed Overlap Opportunity
Highest offloaded layer gets zero prefetch benefit (
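To illustrate the "zero prefetch benefit" point, here is a minimal sketch of a per-layer prefetch schedule. It is not the PR's implementation: the function names and the scheduling rule are assumptions made for illustration. The assumed rule is that the reload for an offloaded layer is issued when the backward of the next higher offloaded layer starts, so the highest offloaded layer has no earlier backward to hide its reload behind.

```python
from typing import Dict, Iterable, List, Optional


def build_prefetch_schedule(offloaded_layers: Iterable[int]) -> Dict[int, Optional[int]]:
    """Map each offloaded layer to the layer whose backward triggers its reload.

    Assumption (hypothetical rule, not taken from the PR): the reload for
    layer k is issued when the backward of the next higher offloaded layer
    starts. A value of None means no such trigger exists.
    """
    ordered = sorted(offloaded_layers)
    schedule: Dict[int, Optional[int]] = {}
    for idx, layer in enumerate(ordered):
        has_higher = idx + 1 < len(ordered)
        schedule[layer] = ordered[idx + 1] if has_higher else None
    return schedule


def layers_without_overlap(schedule: Dict[int, Optional[int]]) -> List[int]:
    """Return layers whose reload cannot be overlapped with another backward."""
    return [layer for layer, trigger in schedule.items() if trigger is None]


if __name__ == "__main__":
    schedule = build_prefetch_schedule([0, 1, 2, 5])
    print(schedule)
    print(layers_without_overlap(schedule))
```

Under this assumed rule, only the highest offloaded layer is reported as lacking overlap. One possible mitigation is to trigger its reload from a later non-offloaded region of the backward pass, if one exists.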
    f"Available joint passes: {list(AVAILABLE_JOINT_PASSES.keys())}"
)

# Skip inductor_decomposition - it's handled in joint_graph_builder
Not sure if this should be skipped
don't need this here.
)

def cpu_offload_pass(
we can remove this pass now.
tag_all_offloadable_activations is now run via tag_with_memory_policy_pass
continue
existing = node.meta.get("recompute")

if sac_active:
don't need to detect sac_active?
just do defensive programming, and assume some tagging has been applied
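One way to read this suggestion is sketched below. It is an illustration under stated assumptions, not the code in this PR: `FakeNode`, `tag_offloadable`, and the `"cpu_offload"` meta key are hypothetical names, and only the `node.meta.get("recompute")` lookup comes from the diff context above. Instead of branching on a global `sac_active` flag, the loop checks each node for an existing tag and leaves tagged nodes alone.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class FakeNode:
    """Hypothetical stand-in for a graph node that carries a `meta` dict."""
    name: str
    meta: Dict[str, Any] = field(default_factory=dict)


def tag_offloadable(nodes: Iterable[FakeNode]) -> List[str]:
    """Tag nodes for offloading without checking whether SAC is active.

    Defensive assumption: some earlier pass may already have written a
    "recompute" tag, so any node that has one is skipped.
    """
    tagged = []
    for node in nodes:
        existing = node.meta.get("recompute")
        if existing is not None:
            # A prior policy already decided this node; do not override it.
            continue
        node.meta["cpu_offload"] = True  # hypothetical key, for illustration only
        tagged.append(node.name)
    return tagged


if __name__ == "__main__":
    nodes = [FakeNode("act_0"), FakeNode("act_1", {"recompute": "MUST_SAVE"})]
    print(tag_offloadable(nodes))
```

The per-node check behaves the same whether or not any tagging pass ran earlier, which is why a separate `sac_active` detection step may be unnecessary.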
…pass" [ghstack-poisoned]
…pass" [ghstack-poisoned]
…pass" [ghstack-poisoned]
Stack from ghstack (oldest at bottom):