@d4l3k d4l3k commented Jul 15, 2025

Summary:
Improve error messages in the TCP transport when a collective mismatch occurs. Currently, a mismatch surfaces as an opaque enforce failure such as:

gloo::EnforceNotMet: [enforce fail at fbcode/gloo/transport/tcp/pair.cc:456] op.preamble.length <= op.nbytes. 194478720 vs 4

This change updates the error message to state that the failure is a size mismatch and is likely caused by a bug in user code.
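For illustration only, here is a minimal sketch of the kind of check and message described above. It is not the code from this diff: the helper names `describeSizeMismatch` and `enforceIncomingFits` are hypothetical, the message wording is invented, and it assumes that `op.preamble.length` is the byte count announced by the peer and `op.nbytes` is the size of the local buffer.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not gloo's API): formats a readable explanation for
// the case where the peer announces more bytes than the local buffer holds.
std::string describeSizeMismatch(std::size_t incoming, std::size_t expected) {
  std::ostringstream os;
  os << "size mismatch: peer is sending " << incoming
     << " bytes but the local operation expects at most " << expected
     << " bytes; ranks are likely calling collectives with different sizes "
     << "(probable bug in user code)";
  return os.str();
}

// Hypothetical stand-in for the enforce on `op.preamble.length <= op.nbytes`.
// Throws std::runtime_error with the descriptive message when the incoming
// length exceeds the local buffer size; otherwise does nothing.
void enforceIncomingFits(std::size_t incoming, std::size_t expected) {
  if (incoming > expected) {
    throw std::runtime_error(describeSizeMismatch(incoming, expected));
  }
}
```

With the values from the report, `enforceIncomingFits(194478720, 4)` would throw with a message that names both sizes and points at mismatched collectives, rather than only printing the failed comparison.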

Meta:

User post: gloo::EnforceNotMet: [enforce fail at fbcode/gloo/transport/tcp/pair.cc:456] op.preamble.length <= op.nbytes. 194478720 vs 4

Differential Revision: D78377800

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D78377800

@d4l3k d4l3k force-pushed the export-D78377800 branch from 2ca151b to 5bc2943 Compare July 15, 2025 23:04

@facebook-github-bot facebook-github-bot merged commit 306281a into pytorch:main Jul 16, 2025
9 of 10 checks passed