Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] on plasma server, ref count fds to each client, and request unmaps on release. #40370

Merged
merged 17 commits into from
Oct 27, 2023

Conversation

rynewang
Copy link
Contributor

@rynewang rynewang commented Oct 16, 2023

Plasma memory sharing works this way: the plasma server creates a temp file and mmaps it; then upon plasma client's Get, the server sends the fd to client who mmaps that fd. Then client can Release an object, and if all clients released refs to an fd, the server unmaps it. See the missing piece? the plasma client never unmaps.

This is normally not a problem because we don't want to unmap the main memory in /dev/shm anyway; but on memory pressure when we do fallback allocations (mmaps to disk files in /tmp/ray), we will leak mmaps by never unmapping them in plasma client, even if nobody is using those mmap files. To make things worse, raylet itself has a plasma client so even if the core workers exit we are still leaking.

A good place for a plasma client to unmap is at Release, after which it may no longer read or write an object ID. However a mmap region may be used by more than 1 object (this is NOT the case today for fallback allocations, but we want to be future proof); also if a mmap region is unmapped and mapped again, the plasma client fails, because the plasma server did not know the client unmapped it and hence would not send the fd.

This PR allows the plasma server to ask a plasma client to unmap. The server maintains a per-client ref count table: {object ID -> mmap fd}, and if a certain fd is no longer referenced by a Release request, the server sends a boolean "should_unmap" which orders the client to do that. The client MUST unmap.

If some time later the same fd needs to be mapped again in a Get request, it's fine, because the server knows the client no longer mmaps that fd, and would send the fd; and the client had removed the knowledge of that fd so it receives the fd and maps again.

Fixes #39229

… refcnt == 0.

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
@rynewang rynewang marked this pull request as ready for review October 17, 2023 19:02
Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. I will approve it after addressing comments (small)

src/ray/object_manager/plasma/protocol.cc Outdated Show resolved Hide resolved
src/ray/object_manager/plasma/connection.h Outdated Show resolved Hide resolved
src/ray/object_manager/plasma/connection.h Outdated Show resolved Hide resolved
src/ray/object_manager/plasma/connection.h Outdated Show resolved Hide resolved
bool should_unmap;
RAY_RETURN_NOT_OK(ReadReleaseReply(
buffer.data(), buffer.size(), &released_object_id, &should_unmap));
if (should_unmap) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment this is always false if the mmap is not fallback allocated? (it'd be nice to have assertion here, but I assume there's no way to check if mmap is fallback allocated vs not in client and add RAY_CHECK right?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah there's no way. In fact this is the main reason why we need to transfer this bool in the first place (because client can't tell)

@rkooo567 rkooo567 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Oct 19, 2023
rynewang and others added 3 commits October 23, 2023 12:26
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Signed-off-by: Ruiyang Wang <56065503+rynewang@users.noreply.github.com>
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Signed-off-by: Ruiyang Wang <56065503+rynewang@users.noreply.github.com>
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
@rynewang rynewang removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Oct 23, 2023
src/ray/object_manager/plasma/store.cc Outdated Show resolved Hide resolved
Comment on lines +602 to +604
RAY_RETURN_NOT_OK(
PlasmaReceive(store_conn_, MessageType::PlasmaReleaseReply, &buffer));
ObjectID released_object_id;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to verify the perf impact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will monitor release perf tests after this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any of existing test can find perf diff from this change (it is only applied to fallbcak allocation). We should write a new test. I feel like it is probably okay not to do this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no this applies to all object releases, just for fallback allocations the bool should_unmap == true.

src/ray/object_manager/plasma/connection.h Outdated Show resolved Hide resolved
src/ray/object_manager/plasma/connection.h Outdated Show resolved Hide resolved
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
@rkooo567
Copy link
Contributor

@rynewang can you handle the premerge failures? Also plz double check if the data tests failing in this CI is also flaky in the master (since they could be related )

@rkooo567
Copy link
Contributor

Let's merge this asap since it is high risk change

rynewang and others added 2 commits October 24, 2023 16:21
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <56065503+rynewang@users.noreply.github.com>
@rynewang
Copy link
Contributor Author

@rynewang can you handle the premerge failures? Also plz double check if the data tests failing in this CI is also flaky in the master (since they could be related )

ok. I think it's due to too far behind the master && a linter complaint. updated

@jjyao
Copy link
Contributor

jjyao commented Oct 25, 2023

Premerge failed. Can you take a look?

@rkooo567
Copy link
Contributor

get_request_queue test in windows failure seem related

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
@rynewang
Copy link
Contributor Author

premerge is green now

Copy link
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workflow failure is a real issue and we shouldn't modify by fixing a test.

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Signed-off-by: Ruiyang Wang <56065503+rynewang@users.noreply.github.com>
@rynewang
Copy link
Contributor Author

Premerge passed; CI civ1 test failed on test_memory_pressure which is already failing on master:

image

@rkooo567
Copy link
Contributor

This was a really awesome investigation! Great job!

@rkooo567 rkooo567 merged commit d3567e0 into ray-project:master Oct 27, 2023
39 of 44 checks passed
@rynewang rynewang deleted the plasma_server_request_unmap branch October 28, 2023 05:03
rkooo567 pushed a commit to rkooo567/ray that referenced this pull request Oct 30, 2023
@jjyao jjyao mentioned this pull request Dec 12, 2023
jjyao pushed a commit that referenced this pull request Dec 14, 2023
… fallback allocated. (#41842)

This is a performance fix for #40370.

Previously the plasma client sends a PlasmaReleaseRequest and does not wait for a reply. This causes the client to never know when it needs to unmap a fallback-allocated mmap. #40370 fixed it by adding back the PlasmaReleaseReply that says should_unmap and client unmaps. However this is in hot path of an object release, and most object releases are on main memory but still pays for this extra RTT.

This PR fixes by sharing more info: at Get/Create time, server notifies the client that this object is fallback_allocated if it lives on such a mmap. Then at Release time, the reply only happens if object is fallback_allocated. In hot path (main memory release), the reply is skipped so we no longer pay for the RTT.

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
rynewang added a commit to rynewang/ray that referenced this pull request Dec 14, 2023
… fallback allocated. (ray-project#41842)

This is a performance fix for ray-project#40370.

Previously the plasma client sends a PlasmaReleaseRequest and does not wait for a reply. This causes the client to never know when it needs to unmap a fallback-allocated mmap. ray-project#40370 fixed it by adding back the PlasmaReleaseReply that says should_unmap and client unmaps. However this is in hot path of an object release, and most object releases are on main memory but still pays for this extra RTT.

This PR fixes by sharing more info: at Get/Create time, server notifies the client that this object is fallback_allocated if it lives on such a mmap. Then at Release time, the reply only happens if object is fallback_allocated. In hot path (main memory release), the reply is skipped so we no longer pay for the RTT.

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
architkulkarni pushed a commit that referenced this pull request Dec 15, 2023
… fallback allocated. (#41842) (#41925)

This is a performance fix for #40370.

Previously the plasma client sends a PlasmaReleaseRequest and does not wait for a reply. This causes the client to never know when it needs to unmap a fallback-allocated mmap. #40370 fixed it by adding back the PlasmaReleaseReply that says should_unmap and client unmaps. However this is in hot path of an object release, and most object releases are on main memory but still pays for this extra RTT.

This PR fixes by sharing more info: at Get/Create time, server notifies the client that this object is fallback_allocated if it lives on such a mmap. Then at Release time, the reply only happens if object is fallback_allocated. In hot path (main memory release), the reply is skipped so we no longer pay for the RTT.

Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Core] Data jobs able to trigger persistent mmap file leak in between job runs
3 participants