Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCT/TCP: Fix PUT protocol #5936

Merged
merged 2 commits into from
Nov 23, 2020
Merged

UCT/TCP: Fix PUT protocol #5936

merged 2 commits into from
Nov 23, 2020

Conversation

dmitrygx
Copy link
Member

What

Fix PUT protocol

Why ?

The following happens on UCP when TCP and MM are used (TCP for PUT, MM for AM):

  • MM sends RTS from sender
  • MM sends RTR from receiver
  • TCP does PUT and completes it in-place, but some data still is sending by TCP/IP stack
  • MM sends ATP immediately from sender
  • Receiver verifies the data, it is incomplete

How ?

  1. Always return UCS_INPROGRESS from TCP PUT operation and add user's completion to a queue of PUT completions that are waiting for ACK.
  2. Allocate PUT completion from iface TX mpool instead of heap by doing calloc().
  3. Update UCP TAG match RNDV tests to run over TCP/MM, it reproduces the bug.

@dmitrygx
Copy link
Member Author

@yosefe this PR based on your patch for TCP with some changes related to PUT return value (it is always UCS_INPROGRESS now) and using mpool elements fotr PUT completions instead of allocation from heap

@dmitrygx
Copy link
Member Author

@brminich @hoopoepg could you review pls?
the io_demo failure is unrelated

@@ -263,7 +263,7 @@ typedef struct uct_tcp_ep_ctx {
* buffer from TCP EP context
*/
typedef struct uct_tcp_ep_zcopy_tx {
uct_tcp_am_hdr_t super; /* UCT TCP AM header */
uct_tcp_am_hdr_t super; /* UCT TCP AM header */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra space

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -795,3 +795,4 @@ UCS_TEST_P(test_ucp_tag_match_rndv, bidir_multi_exp_post, "RNDV_THRESH=0") {
}

UCP_INSTANTIATE_TEST_CASE(test_ucp_tag_match_rndv)
UCP_INSTANTIATE_TEST_CASE_TLS(test_ucp_tag_match_rndv, mm_tcp, "posix,sysv,tcp")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
did u manage to repro the issue with this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
did u manage to repro the issue with this?

yes, the tests check the data consistency, and those tests are failed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[ RUN      ] mm_tcp/test_ucp_tag_match_rndv.sync_send_unexp/0 <posix,sysv,tcp/rndv_auto>
ucp/test_ucp_tag_match.cc:521: Failure
Value of: recvbuf
  Actual: { '\x8A' (138), '\xAC' (172), '\x80' (128), 'd' (100, 0x64), '\0', '\0', '\0', '\0', 'u' (117, 0x75), '\xBD' (189), '\x6' (6), '\xED' (237), '\x3' (3), '\0', '\0', '\0', '\xA3' (163), 'f' (102, 0x66), 'C' (67, 0x43), 'B' (66, 0x42), '\'' (39, 0x27), '\0', '\0', '\0', 'o' (111, 0x6F), '\x2' (2), '\xA2' (162), '\x96' (150), '\x88' (136), '\x1' (1), '\0', '\0', ... }
Expected: sendbuf
Which is: { '\x8A' (138), '\xAC' (172), '\x80' (128), 'd' (100, 0x64), '\0', '\0', '\0', '\0', 'u' (117, 0x75), '\xBD' (189), '\x6' (6), '\xED' (237), '\x3' (3), '\0', '\0', '\0', '\xA3' (163), 'f' (102, 0x66), 'C' (67, 0x43), 'B' (66, 0x42), '\'' (39, 0x27), '\0', '\0', '\0', 'o' (111, 0x6F), '\x2' (2), '\xA2' (162), '\x96' (150), '\x88' (136), '\x1' (1), '\0', '\0', ... }

@dmitrygx
Copy link
Member Author

#5938 failure

@dmitrygx
Copy link
Member Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dmitrygx
Copy link
Member Author

bot:pipe:retest

@dmitrygx
Copy link
Member Author

@yosefe ok to merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants