Skip to content

Fix wslservice lifetime callback teardown to stop long-run threadpool resource buildup#40486

Draft
Copilot wants to merge 5 commits into
masterfrom
copilot/fix-wslservice-memory-leak
Draft

Fix wslservice lifetime callback teardown to stop long-run threadpool resource buildup#40486
Copilot wants to merge 5 commits into
masterfrom
copilot/fix-wslservice-memory-leak

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 10, 2026

Summary of the Pull Request

On long-running WSL1 workloads, wslservice.exe can accumulate large commit size and thread count. This change tightens LifetimeManager teardown so per-client threadpool waits/timers are explicitly canceled/unregistered when callbacks are removed or globally cleared.

PR Checklist

  • Communication: I've discussed this with core contributors already. If work hasn't been agreed, this work might be rejected
  • Tests: Added/updated if needed and all pass
  • Localization: All end user facing strings can be localized
  • Dev docs: Added/updated if needed
  • Documentation updated: If checked, please file a pull request on our docs repo and link it here: #xxx

Detailed Description of the Pull Request / Additional comments

  • Scoped fix: LifetimeManager::ClearCallbacks

    • Cancels callback timers before releasing ownership.
    • Unregisters threadpool waits (SetThreadpoolWait(..., nullptr, nullptr)) before releasing ownership.
    • Moves released wait/timer handles into local WIL RAII containers so teardown completes outside the lock.
  • Scoped fix: LifetimeManager::RemoveCallback

    • Applies the same explicit timer cancellation and wait unregistration for single-callback removal.
    • Drains released threadpool objects outside the lock to avoid lock-held teardown and reduce residual callback resources.
  • Behavioral intent

    • Prevent stale callback-associated threadpool resources from persisting across long service lifetimes.
if (oldClient.timer)
{
    SetThreadpoolTimer(oldClient.timer.get(), nullptr, 0, 0);
    timers.emplace_back(oldClient.timer.release());
}

for (auto& process : oldClient.clientProcesses)
{
    SetThreadpoolWait(process.terminationWait.get(), nullptr, nullptr);
    waits.emplace_back(process.terminationWait.release());
}

Validation Steps Performed

  • Focused static/code-path validation of LifetimeManager callback removal and global clear flows.
  • No user-facing/API contract changes; no localization or documentation deltas introduced.

Copilot AI requested review from Copilot and removed request for Copilot May 10, 2026 15:06
Copilot AI linked an issue May 10, 2026 that may be closed by this pull request
2 tasks
Copilot AI requested review from Copilot and removed request for Copilot May 10, 2026 15:10
Agent-Logs-Url: https://github.com/microsoft/WSL/sessions/b6855d2b-63ea-4edf-ba7c-81eb71f49268

Co-authored-by: benhillis <17727402+benhillis@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot May 10, 2026 15:13
Copilot AI changed the title [WIP] Fix memory leaks in wslservice.exe Fix wslservice lifetime callback teardown to stop long-run threadpool resource buildup May 10, 2026
Copilot AI requested a review from benhillis May 10, 2026 15:14
Replace open-coded SetThreadpoolWait/SetThreadpoolTimer null calls in ClearCallbacks and RemoveCallback with the existing ClientCallback::CancelTimer helper and a new symmetric OwnedProcess::CancelWait helper.

Expand the ClearCallbacks comment block to explain the full strategy: why the per-client members use the _nowait wil variants (chain-of-waits in m_lastCallbackWait/m_lastTimerWait), why teardown re-homes them into the cancel-and-wait variants, and how in-flight callbacks are drained safely outside m_lock by the local container destructors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 11, 2026 16:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens LifetimeManager teardown in wslservice.exe to prevent long-running WSL1 scenarios from accumulating threadpool wait/timer resources by explicitly canceling/unregistering per-client threadpool objects and draining them outside m_lock.

Changes:

  • Update LifetimeManager::ClearCallbacks() to cancel timers and unregister waits, then re-home threadpool handles into draining WIL RAII containers outside the lock.
  • Update LifetimeManager::RemoveCallback() to apply the same cancel/unregister + drain behavior for single-client removal.
  • Add OwnedProcess::CancelWait() helper to centralize unregistering a threadpool wait.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
src/windows/service/exe/Lifetime.h Adds OwnedProcess::CancelWait() declaration to support explicit wait unregistration during teardown.
src/windows/service/exe/Lifetime.cpp Implements cancel/unregister + out-of-lock draining for waits/timers in ClearCallbacks and RemoveCallback, and defines OwnedProcess::CancelWait().

Comment thread src/windows/service/exe/Lifetime.cpp Outdated
Comment thread src/windows/service/exe/Lifetime.cpp Outdated
Comment thread src/windows/service/exe/Lifetime.cpp Outdated
Comment thread src/windows/service/exe/Lifetime.cpp Outdated
Address Copilot reviewer feedback: release()-then-emplace_back leaks the raw PTP_WAIT/PTP_TIMER if the vector grows and allocation throws (the source has already lost ownership before emplace_back can construct the new element).

Wrap each released handle in a named wil RAII temporary first, then push_back(std::move(...)). The temporary owns the handle while push_back may throw, ensuring the handle is properly closed via the destructor on failure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

}

void LifetimeManager::OwnedProcess::CancelWait() const
{
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wslservice.exe leaks memory and threads

3 participants