Skip to content

Conversation

@pzhan9
Copy link
Contributor

@pzhan9 pzhan9 commented Oct 27, 2025

Summary:
We need to have a way to associate the channel address to why it was served. This diff adds a reason parameter to channel::serve and log it along with the address.

We probably do not need this diff in the future if we include "reason" or something similar in the channel's name directly. But before that, this diff is required otherwise we cannot debug channel errors in many cases.

Reviewed By: shayne-fletcher

Differential Revision: D85529357

Summary:

We found this log not helpful when debugging S576170. But it causes log spew and makes it hard to find other useful logs.

Downgrade to TRACE instead of deleting it just in case we might need it in the future debugging (at that time we can manually bump its level).

Reviewed By: shayne-fletcher

Differential Revision: D85533659
Summary:
Two changes:

1. Include the timeout value in the log.
2. Include port id and message type name, or hash if not registered, in the undelivered message error. It it will look like the following in a returned message log:

> metatls:twshared13737.02.gtn2.facebook.com:46155,service,agent[0]: actor failure: serving metatls:twshared13737.02.gtn2.facebook.com:46155,service,agent[0]: processing error: a message from metatls:twshared13737.02.gtn2.facebook.com:46155,service,agent[0] to metatls:twshared13736.02.gtn2.facebook.com:40835,mesh_root_client_proc,client[0][430914] was undeliverable and returned: Some(**"address not routable: port not bound in mailbox; port id: 430914; message type: unregistered type hash 16038717096654025819**; broken link: message was undeliverable")

Reviewed By: shayne-fletcher

Differential Revision: D85537948
Summary:
We need to have a way to associate the channel address to why it was served. This diff adds a reason parameter to `channel::serve` and log it along with the address.

We probably do not need this diff in the future if we include "reason" or something similar in the channel's name directly. But before that, this diff is required otherwise we cannot debug channel errors in many cases.

Reviewed By: shayne-fletcher

Differential Revision: D85529357
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 27, 2025
@meta-codesync
Copy link

meta-codesync bot commented Oct 27, 2025

@pzhan9 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85529357.

@meta-codesync
Copy link

meta-codesync bot commented Oct 27, 2025

This pull request has been merged in 1ee6502.

AlirezaShamsoshoara pushed a commit to AlirezaShamsoshoara/monarch that referenced this pull request Oct 30, 2025
Summary:
Pull Request resolved: meta-pytorch#1674

We need to have a way to associate the channel address to why it was served. This diff adds a reason parameter to `channel::serve` and log it along with the address.

We probably do not need this diff in the future if we include "reason" or something similar in the channel's name directly. But before that, this diff is required otherwise we cannot debug channel errors in many cases.

Reviewed By: shayne-fletcher

Differential Revision: D85529357

fbshipit-source-id: d75146766ff93834d5781ee67833084ca5932854
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot. fb-exported Merged meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants