[core] on plasma server, ref count fds to each client, and request unmaps on release. #40370

rynewang · 2023-10-16T19:37:43Z

Plasma memory sharing works as follows: the plasma server creates a temp file and mmaps it; then, on a plasma client's Get, the server sends the fd to the client, which mmaps that fd. The client can later Release an object, and once all clients have released their refs to an fd, the server unmaps it. See the missing piece? The plasma client never unmaps.

This is normally not a problem because we don't want to unmap the main memory in /dev/shm anyway. Under memory pressure, however, we make fallback allocations (mmaps to disk files in /tmp/ray), and the plasma client leaks those mmaps by never unmapping them, even when nobody is using the mmap files. To make things worse, the raylet itself has a plasma client, so the leak persists even after the core workers exit.

A good place for a plasma client to unmap is at Release, after which it may no longer read or write the object ID. However, an mmap region may be used by more than one object (this is NOT the case today for fallback allocations, but we want to be future-proof). Also, if the client unmaps a region on its own and later needs to map it again, the plasma client fails, because the plasma server does not know the client unmapped it and hence will not send the fd.

This PR allows the plasma server to ask a plasma client to unmap. The server maintains a per-client ref count table, {object ID -> mmap fd}, and if a Release request leaves a certain fd no longer referenced, the server replies with a boolean "should_unmap" that orders the client to unmap it. The client MUST unmap.

If the same fd later needs to be mapped again in a Get request, that is fine: the server knows the client no longer mmaps that fd and sends the fd again, and the client has forgotten that fd, so it receives the fd and maps it again.
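The per-client bookkeeping described above could look roughly like the sketch below. It is only an illustration, not the code in this PR: `ClientFdRefTable`, `OnGet`, `OnRelease`, and the `ObjectId`/`MmapFd` aliases are hypothetical names, and the sketch counts every fd, whereas the commit list for this PR indicates that only fallback-allocated mmaps are ref counted.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration; these are not Ray's actual types.
using ObjectId = std::string;
using MmapFd = int;

// Server-side table, one instance per connected client.
class ClientFdRefTable {
 public:
  // Called on Get. Returns true if the server must send the fd, i.e. this
  // client holds no other reference to that fd right now.
  bool OnGet(const ObjectId &id, MmapFd fd) {
    if (!object_to_fd_.emplace(id, fd).second) {
      return false;  // Idempotent: this object is already counted.
    }
    return ++fd_refs_[fd] == 1;
  }

  // Called on Release. Returns should_unmap: true only when this release
  // dropped the client's last reference to the fd.
  bool OnRelease(const ObjectId &id) {
    auto it = object_to_fd_.find(id);
    if (it == object_to_fd_.end()) {
      return false;  // Unknown or already released object.
    }
    const MmapFd fd = it->second;
    object_to_fd_.erase(it);
    auto ref = fd_refs_.find(fd);
    assert(ref != fd_refs_.end() && ref->second > 0);
    if (--ref->second > 0) {
      return false;  // Another object of this client still uses the fd.
    }
    fd_refs_.erase(ref);
    return true;
  }

 private:
  std::unordered_map<ObjectId, MmapFd> object_to_fd_;
  std::unordered_map<MmapFd, int> fd_refs_;
};
```

Because the server forgets the fd once `OnRelease` returns true, a later `OnGet` on the same fd returns true again, which matches the "send the fd again and let the client remap" behavior described above.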

Fixes #39229

rkooo567

Generally LGTM. I will approve it after addressing comments (small)

src/ray/object_manager/plasma/protocol.cc

src/ray/object_manager/test/get_request_queue_test.cc

src/ray/object_manager/plasma/connection.h

rkooo567 · 2023-10-19T12:37:43Z

src/ray/object_manager/plasma/client.cc

+    bool should_unmap;
+    RAY_RETURN_NOT_OK(ReadReleaseReply(
+        buffer.data(), buffer.size(), &released_object_id, &should_unmap));
+    if (should_unmap) {


Can you add a comment that this is always false if the mmap is not fallback allocated? (It'd be nice to have an assertion here, but I assume there's no way for the client to check whether an mmap is fallback allocated and add a RAY_CHECK, right?)

Yeah, there's no way. In fact, this is the main reason we need to transfer this bool in the first place (the client can't tell).
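To make the client side of that exchange concrete, here is a minimal sketch, assuming a POSIX system. `ClientMmapTable`, `LookupOrMmap`, `HandleReleaseReply`, and `MakeTempFd` are hypothetical names and not the plasma client's actual API; in particular, the real reply carries the released object ID, so looking up the fd from it is assumed to happen elsewhere.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdio>
#include <unordered_map>

// Hypothetical client-side table of mapped regions, keyed by the store's fd.
class ClientMmapTable {
 public:
  ~ClientMmapTable() {
    for (auto &entry : regions_) {
      munmap(entry.second.base, entry.second.size);
    }
  }

  // Maps the fd on first use; later calls return the cached base address.
  // Returns nullptr if mmap fails.
  void *LookupOrMmap(int fd, size_t size) {
    auto it = regions_.find(fd);
    if (it != regions_.end()) {
      return it->second.base;
    }
    void *base = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) {
      return nullptr;
    }
    regions_[fd] = Region{base, size};
    return base;
  }

  // Applies the server's decision. The client cannot tell whether a region is
  // fallback allocated, so it simply obeys should_unmap.
  void HandleReleaseReply(int fd, bool should_unmap) {
    if (!should_unmap) {
      return;
    }
    auto it = regions_.find(fd);
    if (it == regions_.end()) {
      return;
    }
    const int rc = munmap(it->second.base, it->second.size);
    assert(rc == 0);
    (void)rc;
    regions_.erase(it);  // Forget the fd so a later Get maps it afresh.
  }

  bool IsMapped(int fd) const { return regions_.count(fd) > 0; }

 private:
  struct Region {
    void *base;
    size_t size;
  };
  std::unordered_map<int, Region> regions_;
};

// Hypothetical helper for trying out the sketch: an unnamed temp file of
// `size` bytes. Returns -1 on failure.
int MakeTempFd(size_t size) {
  FILE *f = tmpfile();
  if (f == nullptr) {
    return -1;
  }
  const int fd = dup(fileno(f));
  fclose(f);
  if (fd >= 0 && ftruncate(fd, static_cast<off_t>(size)) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}
```

The table deliberately has no "is this fallback allocated?" query, which mirrors the point made in the reply: only the server has that information, so the bool has to travel over the wire.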

src/ray/object_manager/plasma/store.cc

jjyao · 2023-10-24T18:53:59Z

src/ray/object_manager/plasma/client.cc

+    RAY_RETURN_NOT_OK(
+        PlasmaReceive(store_conn_, MessageType::PlasmaReleaseReply, &buffer));
+    ObjectID released_object_id;


We need to verify the perf impact.

will monitor release perf tests after this PR.

I don't think any of the existing tests can find a perf diff from this change (it only applies to fallback allocation). We should write a new test. I feel like it is probably okay not to do this.

No, this applies to all object releases; it's just that should_unmap == true only for fallback allocations.

src/ray/object_manager/plasma/connection.h

rkooo567 · 2023-10-24T22:10:19Z

@rynewang can you handle the premerge failures? Also, please double-check whether the data tests failing in this CI are also flaky on master (since they could be related).

rkooo567 · 2023-10-24T22:10:32Z

Let's merge this ASAP since it is a high-risk change.

rynewang · 2023-10-24T23:22:58Z

> @rynewang can you handle the premerge failures? Also, please double-check whether the data tests failing in this CI are also flaky on master (since they could be related).

OK. I think it's due to the branch being too far behind master, plus a linter complaint. Updated.

jjyao · 2023-10-25T05:44:18Z

Premerge failed. Can you take a look?

rkooo567 · 2023-10-25T11:29:53Z

The get_request_queue test failure on Windows seems related.

rynewang · 2023-10-26T00:49:52Z

premerge is green now

rkooo567

The workflow failure is a real issue, and we shouldn't work around it by modifying a test.

rynewang · 2023-10-26T22:08:22Z

Premerge passed; the CI civ1 test failed on test_memory_pressure, which is already failing on master.

rkooo567 · 2023-10-27T02:03:23Z

This was a really awesome investigation! Great job!

…quest unmaps on release. (ray-project#40370)" This reverts commit d3567e0.

… fallback allocated. (#41842) This is a performance fix for #40370. Previously, the plasma client sent a PlasmaReleaseRequest and did not wait for a reply, so the client never knew when it needed to unmap a fallback-allocated mmap. #40370 fixed that by adding back the PlasmaReleaseReply, which carries should_unmap so the client can unmap. However, this is in the hot path of object release: most object releases are on main memory yet still pay for the extra RTT. This PR fixes that by sharing more info: at Get/Create time, the server notifies the client that the object is fallback_allocated if it lives on such an mmap. Then, at Release time, the reply is sent only if the object is fallback_allocated. In the hot path (main memory release), the reply is skipped, so we no longer pay for the RTT. Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
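The follow-up's idea, remembering at Get/Create time whether an object is fallback allocated and blocking for a release reply only in that case, can be sketched as below. `ReleasePlanner`, `OnGetReply`, and `Release` are hypothetical names for illustration; the sketch only models the decision and counts round trips, it does not speak the plasma protocol.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for illustration; not Ray's actual type.
using ObjectId = std::string;

// Client-side record of which in-use objects are fallback allocated.
class ReleasePlanner {
 public:
  // At Get/Create time the server says whether the object lives on a
  // fallback-allocated mmap; the client just remembers the flag.
  void OnGetReply(const ObjectId &id, bool fallback_allocated) {
    fallback_allocated_[id] = fallback_allocated;
  }

  // Returns true if the client must wait for a release reply (which may carry
  // should_unmap). Main-memory objects skip the reply and its round trip.
  bool Release(const ObjectId &id) {
    auto it = fallback_allocated_.find(id);
    if (it == fallback_allocated_.end()) {
      return false;  // Not in use by this client.
    }
    const bool wait_for_reply = it->second;
    fallback_allocated_.erase(it);
    if (wait_for_reply) {
      ++round_trips_;
    }
    return wait_for_reply;
  }

  int round_trips() const { return round_trips_; }

 private:
  std::unordered_map<ObjectId, bool> fallback_allocated_;
  int round_trips_ = 0;
};
```

With this split, a workload that only touches main-memory objects never increments the round-trip counter, which is the hot-path saving the commit message describes.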

on plasma server, ref count fds to each client, and request unmaps on…

19dfccc

… refcnt == 0. Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

rynewang mentioned this pull request Oct 16, 2023

[Core] Data jobs able to trigger persistent mmap file leak in between job runs #39229

Closed

rynewang added 4 commits October 16, 2023 13:06

idempotent ref counting

72b7190

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

only refcnt for the fallback-allocated mmaps

c77f10b

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

fix bugs, move pointer_ to void*

c3119a9

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

fix cpp unit tests

5bc82aa

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

rynewang mentioned this pull request Oct 16, 2023

[core] Releases mmap sections in plasma client when releasing objects #40342

Closed

rynewang assigned rkooo567 Oct 16, 2023

rynewang added 2 commits October 16, 2023 17:28

Add tests on number of mmaps in core worker.

5dcb3c1

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

fix test

6082395

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

rynewang marked this pull request as ready for review October 17, 2023 19:02

rkooo567 reviewed Oct 19, 2023

rkooo567 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Oct 19, 2023

rynewang and others added 3 commits October 23, 2023 12:26

Update src/ray/object_manager/plasma/connection.h

fbf8409

Co-authored-by: SangBin Cho <rkooo567@gmail.com> Signed-off-by: Ruiyang Wang <56065503+rynewang@users.noreply.github.com>

Update src/ray/object_manager/plasma/connection.h

2368d39

Co-authored-by: SangBin Cho <rkooo567@gmail.com> Signed-off-by: Ruiyang Wang <56065503+rynewang@users.noreply.github.com>

add test, rename vars

69ca397

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

rynewang removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Oct 23, 2023

jjyao approved these changes Oct 24, 2023

update comment and checking

b1b06c9

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

rynewang and others added 2 commits October 24, 2023 16:21

make linter happy

6636068

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

Merge branch 'master' into plasma_server_request_unmap

f81bf23

Signed-off-by: Ruiyang Wang <56065503+rynewang@users.noreply.github.com>

rynewang added 2 commits October 25, 2023 10:18

fix windows compatibility

2ca1438

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

patch the workflow test with a FIXME

6275f98

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

rynewang requested review from ericl and fishbone as code owners October 25, 2023 22:25

rynewang requested review from stephanie-wang and suquark as code owners October 25, 2023 22:25

rkooo567 requested changes Oct 26, 2023

Disconnect the client from raylet

a5f2d48

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>

rynewang force-pushed the plasma_server_request_unmap branch from 4c3a638 to a5f2d48 Compare October 26, 2023 17:18

Merge branch 'master' into plasma_server_request_unmap

84243a8

Signed-off-by: Ruiyang Wang <56065503+rynewang@users.noreply.github.com>

rkooo567 approved these changes Oct 27, 2023

rkooo567 merged commit d3567e0 into ray-project:master Oct 27, 2023
39 of 44 checks passed

rynewang deleted the plasma_server_request_unmap branch October 28, 2023 05:03

rkooo567 pushed a commit to rkooo567/ray that referenced this pull request Oct 30, 2023

Revert "[core] on plasma server, ref count fds to each client, and re…

3212d39

…quest unmaps on release. (ray-project#40370)" This reverts commit d3567e0.

rkooo567 mentioned this pull request Oct 30, 2023

Revert "[core] on plasma server, ref count fds to each client, and re… #40785

Closed

8 tasks

z4y1b2 mentioned this pull request Nov 13, 2023

[memory leak] worker can not release all memory after ray job finished #41047

Closed

jjyao mentioned this pull request Dec 12, 2023

[Core] Perf regression #41695

Closed

rynewang mentioned this pull request Dec 12, 2023

[core][performance] Only do PlasmaReleaseReply when client knows it's fallback allocated. #41842

Merged

rynewang mentioned this pull request Dec 14, 2023

[core][performance] Only do PlasmaReleaseReply when client knows it's… #41925

Merged
