@dulinriley
Contributor

Summary:
This panic is very common because CommActor message forwarding obscures the
return address.

When it happens, emit a clearer message that also includes the destination,
so people aren't confused.

Reviewed By: pablorfb-meta

Differential Revision: D84952942

@meta-cla meta-cla bot added the "CLA Signed" label Oct 17, 2025

meta-codesync bot commented Oct 17, 2025

@dulinriley has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84952942.

dulinriley added a commit to dulinriley/monarch that referenced this pull request Oct 17, 2025
…orch#1606)

Summary:

Fix a panic in PythonActor::handle_undeliverable_message when the "sender" is the comm
actor. We need to update the sender back to the original "self" actor by using the headers
set in the envelope.
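
The sender-restoration step described above can be sketched in isolation. This is a minimal illustration, not the code from this change: the `Envelope` struct, the `ORIGINAL_SENDER` header key, and both function names are hypothetical stand-ins, since the actual envelope type and header name are not shown in this conversation.

```rust
use std::collections::HashMap;

// Hypothetical header key; the real header name is not shown in this PR.
const ORIGINAL_SENDER: &str = "original_sender";

// Hypothetical, simplified stand-in for an undeliverable message envelope.
#[derive(Debug, Clone, PartialEq)]
struct Envelope {
    sender: String,
    dest: String,
    headers: HashMap<String, String>,
    error: Option<String>,
}

/// If a forwarding actor recorded the original sender in the headers,
/// put it back so the envelope looks as if it was returned directly.
fn restore_original_sender(mut envelope: Envelope) -> Envelope {
    if let Some(original) = envelope.headers.get(ORIGINAL_SENDER).cloned() {
        envelope.sender = original;
    }
    envelope
}

/// Build the error to report for a returned message instead of asserting.
/// Returns `Err` if the envelope still does not belong to `this_actor`.
fn describe_undeliverable(this_actor: &str, envelope: Envelope) -> Result<String, String> {
    let envelope = restore_original_sender(envelope);
    if envelope.sender != this_actor {
        return Err(format!(
            "undeliverable message from {} was returned to {}",
            envelope.sender, this_actor
        ));
    }
    Ok(format!(
        "a message from {} to {} was undeliverable and returned: {:?}",
        envelope.sender, envelope.dest, envelope.error
    ))
}
```

In this sketch the sender check becomes a recoverable error path, which mirrors the goal of surfacing a supervision error rather than a panic.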

After this fix, instead of a panic we get a supervision error:
```
SupervisionError(
Actor ...wrapper_1xYczTZiTdb1[0] exited because of the following reason:
<PyActorSupervisionEvent: ...wrapper_1xYczTZiTdb1[0]: failed: serving
...wrapper_1xYczTZiTdb1[0]: processing error: a message from ...wrapper_1xYczTZiTdb1[0] to
...fail_1am38hE5fnus[0] was undeliverable and returned: Some("send error: channel closed;
multicast error: comm actor comm_1JgvjFbdpnUf[0] failed to deliver the cast message to the
dest actor; return to its original sender's port ...wrapper_1xYczTZiTdb1[0]
)
```

Not very terse, but better than a panic! This also allows any custom override of
handle_undeliverable_message to work.

Reviewed By: pablorfb-meta

Differential Revision: D84952942
@dulinriley dulinriley changed the title from "Enhance common panic in undeliverable with destination port" to "Fix assert with undeliverable message from comm actor" Oct 17, 2025
@dulinriley
Contributor Author

Updated to actually fix the problem instead of just enhancing the assert.


meta-codesync bot commented Oct 18, 2025

This pull request has been merged in 1a3442a.

AlirezaShamsoshoara pushed a commit to AlirezaShamsoshoara/monarch that referenced this pull request Oct 30, 2025
)

Summary:
Pull Request resolved: meta-pytorch#1606

fbshipit-source-id: b2cb36600ca89a03e7a85cfccd46ce6bcf2487cf
@dulinriley dulinriley deleted the export-D84952942 branch November 4, 2025 23:20
Labels: CLA Signed, fb-exported, Merged, meta-exported
