Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/memory leaks #846

Merged
merged 24 commits into from
Mar 29, 2024
Merged

Conversation

DenisBiryukov91
Copy link
Contributor

@DenisBiryukov91 DenisBiryukov91 commented Mar 19, 2024

Addresses a series of memory leaks detected by valgrind (Closes #695 ):

due to cyclic references in Runtime and its dependents
due to cyclic references in Session and its dependents
due to cyclic references in Resource and its children
due to cyclic reference in Mux->Face->State->Mux
due to non-terminated tasks after Session.close()
Valgrind still reports leaks, related to static variable ZRUNTIME_POOL. This is because rust never calls drop on static variables by design. This leaks can be removed by calling zruntime_pool_free() or using ZRuntimePoolGuard.

Also adds a valgrind memory leaks test into ci.

Copy link
Contributor

@milyin milyin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for me. But I don't have enough experience with zenoh deep internals and with tokyo, so it would be good to have this PR reviewed by someone from protocol team. @OlivierHecart ?
The only concern: I'd prefer graceful failure instead of unwrap() panics. In most cases it's possible to just return empty result as far as I see

commons/zenoh-task/src/lib.rs Outdated Show resolved Hide resolved
commons/zenoh-task/src/lib.rs Outdated Show resolved Hide resolved
zenoh/src/net/routing/dispatcher/resource.rs Show resolved Hide resolved
zenoh/tests/routing.rs Outdated Show resolved Hide resolved
@YuanYuYuan
Copy link
Contributor

YuanYuYuan commented Mar 26, 2024

@DenisBiryukov91 , I'd like to leave some comments since you're addressing some issues that I wanted to solve a few weeks ago on Tokio porting. Here're some suggestions

  • It's better not to mix an additional executor with Tokio. Any Tokio spawn , spawn_blocking, etc. require to be inside a Tokio context. The tests used for valgrind don't fail since we use #[tokio::main]. It would panic if we run it with different runtime or in a blocking manner. It's suggested to either explicitly use our ZRuntime or use std::thread.
  • Regarding the Drop of ZRUNTIME_POOL , we could use fewer unsafe codes and tackle the runtime properly. Can you have a look at this PR please?
  • I once tried to implement something similar to TaskControllerbefore but gave up then. The reasons are
    1. The variance of the using scope. Wrapping every task with a CancellationToken and hooking it to a TaskTracker is not always the optimal solution. Sometimes, we may want to collect the results of spawning tasks, then we need to use JoinSet instead of TaskTracker.
    2. You may find that some implementations in zenoh-links doesn't fit the scenario of TaskTracker and we simply use a JoinHandle to manage it. This is more efficient for handling the tasks for that specific zenoh-link.
      Anyway, I do not object to the additional zenoh-task crate. But it might not be a complete solution and will suffer to change in the future. A better one is to review and improve our async usage. Like do we really need to spawn this task that never awaits on a ZRuntime? https://github.com/DenisBiryukov91/zenoh/blob/fix/memory_leaks/zenoh/src/session.rs#L1510-L1537

DenisBiryukov91 and others added 3 commits March 26, 2024 18:07
- terminate more tasks
- make TaskController::terminate_all[_async] accept timeout duration
@milyin milyin changed the base branch from main to protocol_changes March 29, 2024 11:39
@milyin milyin changed the base branch from protocol_changes to main March 29, 2024 11:40
@milyin milyin merged commit 888d340 into eclipse-zenoh:main Mar 29, 2024
8 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leaks cleanup
4 participants