Give`wait_tensor()` a schema that reflects its side effect #126773

bdhirsh · 2024-05-21T13:44:03Z

wait_tensor currently has a schema that advertises as functional/pure (it takes in a tensor, returns a fresh tensor) - but this is not really true:

(1) it has the side effect of waiting for a pending collective to finish
(2) it returns its input directly : https://github.com/pytorch/pytorch/blob/main/torch/csrc/distributed/c10d/Functional.cpp#L307

One reason this is error prone is that e.g. if you use this API as if it had a side effect (and discard the result, e.g. the bug here), then we will DCE it during tracing (unless we land this PR).

The more proper thing to do would be to: either:

(1) tell functionalization to explicitly not functionalize the op.

(2) update the schema of this op to mark it as mutating its input (to reflect the side effect), and add a functional variant of the op so we can functionalize it, and "de-functionalize" it in inductor (although to be fair, this all amounts to a giant no-op).

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k

The text was updated successfully, but these errors were encountered:

zou3519 · 2024-05-21T19:14:59Z

What are the chances effect tokens would help here?

wconstab · 2024-05-21T19:40:09Z

Just curious, is there such a notion today of a schema that marks a side effect? What would that look like?

wanchaol · 2024-05-21T20:42:48Z

should this just be wait_tensor(Tensor(a!) self) -> Tensor(a!)?

zou3519 · 2024-05-22T10:03:12Z

There are a couple of options for option 2 "update the schema of this op to mark it as mutating its input (to reflect the side effect)":

a) wait_tensor(Tensor(a!) self) -> Tensor(a!). This requires us to add a functional version of the op and reinplacing logic to inductor for it.
b) wait_tensor(Tensor(a!) self) -> (). This one doesn't require us to add a functional op (we will automatically generate a functional op
c) we've been adding new "print" operators that have arbitrary side effects that automatically get functionalized (@angelayi @yanboliang). There's a side table where we actually mark the operators as having side effects.

pytorch/torch/_higher_order_ops/effects.py

Lines 20 to 22 in 2dd2699

    
           SIDE_EFFECTS: Dict[torch._ops.OpOverload, _EffectType] = { 
        
               torch.ops.aten._print.default: _EffectType.ORDERED, 
        
           }

bdhirsh added the oncall: distributed Add this issue/PR to distributed oncall triage queue label May 21, 2024

bdhirsh mentioned this issue May 21, 2024

add wait_tensor to ops to skip in DCE pass #126541

Open

LucasLLC self-assigned this Jun 5, 2024

LucasLLC added triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module labels Jun 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Give`wait_tensor()` a schema that reflects its side effect #126773

Give`wait_tensor()` a schema that reflects its side effect #126773

bdhirsh commented May 21, 2024 •

edited

zou3519 commented May 21, 2024

wconstab commented May 21, 2024

wanchaol commented May 21, 2024

zou3519 commented May 22, 2024

Givewait_tensor() a schema that reflects its side effect #126773

Givewait_tensor() a schema that reflects its side effect #126773

Comments

bdhirsh commented May 21, 2024 • edited

zou3519 commented May 21, 2024

wconstab commented May 21, 2024

wanchaol commented May 21, 2024

zou3519 commented May 22, 2024

Give`wait_tensor()` a schema that reflects its side effect #126773

Give`wait_tensor()` a schema that reflects its side effect #126773

bdhirsh commented May 21, 2024 •

edited