@pzhan9 pzhan9 commented Nov 17, 2025

Summary:
If I understand D84962784 correctly, the v1 mesh no longer relies on the allocator heartbeat. We still get noise from the allocator's heartbeat timeout, though, because the default of 5 seconds is too aggressive for a large mesh.

This diff bumps the default to 300 seconds.
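
To illustrate why a short timeout is noisy, here is a minimal sketch that counts how many heartbeat gaps would trip a given timeout. It is not the allocator's code: the function name, the constants, and the gap values are all hypothetical, and only the 5 and 300 second figures come from this PR.

```python
# Hypothetical illustration; not the allocator's actual implementation.
OLD_TIMEOUT_SECS = 5.0    # previous default, per this PR
NEW_TIMEOUT_SECS = 300.0  # new default, per this PR


def count_timeouts(gaps_secs, timeout_secs):
    """Return how many gaps between consecutive heartbeats exceed the timeout."""
    return sum(1 for gap in gaps_secs if gap > timeout_secs)


# Made-up gaps (seconds) standing in for delayed heartbeats on a busy mesh.
# These are illustrative values, not measurements.
example_gaps = [1.0, 4.0, 7.5, 12.0, 30.0]

old_noise = count_timeouts(example_gaps, OLD_TIMEOUT_SECS)
new_noise = count_timeouts(example_gaps, NEW_TIMEOUT_SECS)
```

With these assumed gaps, three of the five exceed the 5 second timeout and none exceed 300 seconds, which is the kind of spurious reporting the larger default is meant to avoid.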

Reviewed By: mariusae

Differential Revision: D86678878

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 17, 2025

meta-codesync bot commented Nov 17, 2025

@pzhan9 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D86678878.

…a-pytorch#1906)


meta-codesync bot commented Nov 17, 2025

This pull request has been merged in a25c3ed.

Labels

CLA Signed This label is managed by the Meta Open Source bot. fb-exported Merged meta-exported
