fixing race condition in worker shutdown #9738

Merged
merged 2 commits into dev from brettsam/worker_shutdown_fix on Dec 13, 2023

Conversation

@brettsam brettsam commented Dec 12, 2023

Issue describing the changes in this PR

In the code here:

if (_workerChannels.TryGetValue(language, out Dictionary<string, TaskCompletionSource<IRpcWorkerChannel>> rpcWorkerChannels)
&& rpcWorkerChannels.TryRemove(workerId, out TaskCompletionSource<IRpcWorkerChannel> value))

... the call to TryRemove() is not thread-safe because the inner collection is a plain Dictionary rather than a ConcurrentDictionary. Multiple shutdown requests can flow through this code concurrently and each eventually try to start up a new worker. Because we would then be creating more workers than we shut down, this can lead to the error message "Number of initialized language workers exceeded".

This change switches to ConcurrentDictionary. The test that I've added only ever failed while debugging; I was unable to get it to fail while running normally, but I'll leave it in as an extra check.
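As a rough illustration of why the switch helps (a minimal sketch, not the code from this PR): with a ConcurrentDictionary, TryRemove is atomic, so when several shutdown requests race on the same worker ID only one of them sees `true`. The class name `ShutdownRaceSketch`, the method `CountSuccessfulRemovals`, the key `"worker-1"`, and the use of `TaskCompletionSource<string>` in place of `TaskCompletionSource<IRpcWorkerChannel>` are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ShutdownRaceSketch
{
    // Hypothetical stand-in for the per-language map of worker IDs to channels.
    // The real values are TaskCompletionSource<IRpcWorkerChannel>; a string
    // payload is used here so the sketch compiles on its own.
    public static int CountSuccessfulRemovals(int concurrentRequests)
    {
        var channels = new ConcurrentDictionary<string, TaskCompletionSource<string>>();
        channels["worker-1"] = new TaskCompletionSource<string>();

        int removed = 0;

        // Simulate several shutdown requests for the same worker arriving at once.
        Parallel.For(0, concurrentRequests, i =>
        {
            // ConcurrentDictionary.TryRemove is atomic: at most one caller gets 'true'.
            if (channels.TryRemove("worker-1", out _))
            {
                // In the scenario described above, only this caller would go on
                // to start a replacement worker.
                Interlocked.Increment(ref removed);
            }
        });

        return removed;
    }

    public static void Main()
    {
        // Exactly one of the 16 simulated requests removes the entry.
        Console.WriteLine(CountSuccessfulRemovals(16));
    }
}
```

With a plain Dictionary, concurrent removal is not supported and its behavior under a race is undefined, which is consistent with the added test failing only intermittently.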

Pull request checklist

  • My changes do not require documentation changes
    • Otherwise: Documentation issue linked to PR
  • My changes should not be added to the release notes for the next release
    • Otherwise: I've added my notes to release_notes.md
  • My changes do not need to be backported to a previous version
    • Otherwise: Backport tracked by issue/PR #issue_or_pr
  • My changes do not require diagnostic events changes
    • Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
  • I have added all required tests (Unit tests, E2E tests)

Additional information

Additional PR information

@brettsam brettsam requested a review from a team as a code owner December 12, 2023 21:39
@brettsam brettsam changed the title from "adding test to check for a race" to "(don't review...) adding test to check for a race" Dec 12, 2023
@brettsam brettsam changed the title from "(don't review...) adding test to check for a race" to "fixing race condition in worker shutdown" Dec 12, 2023
@brettsam brettsam merged commit d3150b3 into dev Dec 13, 2023
9 checks passed
@brettsam brettsam deleted the brettsam/worker_shutdown_fix branch December 13, 2023 19:48
v-imohammad added a commit that referenced this pull request Dec 13, 2023
* Removing conditional exception handling from FileLogger (#9739)

* fixing race condition in worker shutdown (#9738)

* updating common prop

---------

Co-authored-by: Mathew Charles <mathewc@microsoft.com>
Co-authored-by: Brett Samblanet <brettsam@microsoft.com>
brettsam added a commit that referenced this pull request Dec 14, 2023
v-imohammad pushed a commit that referenced this pull request Dec 14, 2023
* Removing conditional exception handling from FileLogger (#9739)

* fixing race condition in worker shutdown (#9738)

---------

Co-authored-by: Mathew Charles <mathewc@microsoft.com>
Co-authored-by: Brett Samblanet <brettsam@microsoft.com>