Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpc: free buffer after client disconnect #7378

Closed
wants to merge 1 commit into from

Conversation

chraac
Copy link

@chraac chraac commented May 19, 2024

In PR #6829, @rgerganov add support to rpc backend, after using it for several days, I have noticed an issue:

  • When the client closes the connection, the server does not free the memory it has allocated.

Upon investigating the source code, I discovered that instead of releasing the memory, we simply exit the inner loop and immediately wait for a new connection (ggml-rpc.cpp#L1027).

So here I create this PR, which monitor the ALLOC_BUFFER and FREE_BUFFER command, maintaining a list of allocated buffers, then free the remind buffer after client disconnect.

@@ -934,7 +938,7 @@ static void rpc_serve_client(ggml_backend_t backend, sockfd_t sockfd, size_t fre
}
switch (cmd) {
case ALLOC_BUFFER: {
rpc_alloc_buffer(backend, input, output);
allocated_buffers.push_back(rpc_alloc_buffer(backend, input, output));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the allocated buffer into list.

@@ -950,7 +954,7 @@ static void rpc_serve_client(ggml_backend_t backend, sockfd_t sockfd, size_t fre
break;
}
case FREE_BUFFER: {
rpc_free_buffer(input);
allocated_buffers.remove(rpc_free_buffer(input));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the freed buffer from list


for (auto buff: allocated_buffers) {
ggml_backend_buffer_free(buff);
}
Copy link
Author

@chraac chraac May 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

free the reminding buffers.

@mofosyne mofosyne added bugfix fixes an issue or bug Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix labels May 20, 2024
@rgerganov
Copy link
Collaborator

rgerganov commented May 20, 2024

I think a better approach would be to track allocated buffers with std::unordered_set, so we can also perform input validation in SET_TENSOR, `GET_TENSOR, etc. I will submit a patch shortly and add you as a reviewer.

@chraac
Copy link
Author

chraac commented May 20, 2024

I think a better approach would be to track allocated buffers with std::unordered_set, so we can also perform input validation in SET_TENSOR, `GET_TENSOR, etc. I will submit a patch shortly and add you as a reviewer.

emm, just forgot the SET_TENSOR and GET_TENSOR case, see there's also a tensor allocation in there.

also, for the std::unordered_set, thought the std::list will be okay here, since we won't have too many tensor, so this will no impact the free operation's performance.

but agree that std::unordered_set is a better alternative, let use it instead.

@chraac
Copy link
Author

chraac commented May 20, 2024

close this PR and wait for @rgerganov 's fix, related discussion: #7407

@chraac chraac closed this May 20, 2024
@chraac chraac mentioned this pull request May 20, 2024
@chraac chraac deleted the dev-fix-rpc-mem-alloc branch June 1, 2024 03:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes an issue or bug Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants