dnsdist: Fix a hang when removing a server with more than one socket #9900

Merged
rgacogne merged 1 commit into PowerDNS:master from ddist-rmserver-lock
Jan 5, 2021

rgacogne
Member

Short description

There was a lock starvation issue when removing a server with more than one socket in use (`sockets` greater than 1 on the corresponding `newServer` directive): the responder thread never released the mutex protecting the sockets array for long enough to let the thread stopping the server acquire it.
This commit fixes that by marking the server as stopped right away, before acquiring the lock, and by making sure that the responder thread is woken up regularly (every second, even without any query to process) and checks whether the server has been stopped as soon as it wakes up.

The issue was introduced in be55a20, and backported to 1.5.1 in f0d4831.
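
The pattern described above can be illustrated with a minimal, self-contained sketch. It is not the dnsdist code: `Backend`, `responderLoop`, `stop` and `removeServerDemo` are hypothetical names, the wait on the sockets is replaced by a plain sleep, and it assumes the responder holds the lock while it waits. The timeout is a parameter so the demo can run faster than the one-second wake-up mentioned in the description.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a backend with several sockets (not dnsdist's type).
struct Backend {
  std::mutex socketsLock;            // protects the (omitted) sockets array
  std::atomic<bool> stopped{false};  // set by the thread removing the server
  std::atomic<unsigned> wakeups{0};  // number of times the responder woke up

  // Responder: waits for responses with a bounded timeout while holding the
  // lock, then checks the stopped flag right after waking up.
  void responderLoop(std::chrono::milliseconds timeout) {
    for (;;) {
      {
        std::lock_guard<std::mutex> guard(socketsLock);
        // Assumption: stands in for a poll on the sockets that returns after
        // at most `timeout`, even when there is nothing to process.
        std::this_thread::sleep_for(timeout);
      }
      ++wakeups;
      if (stopped.load()) {
        return;  // exit instead of immediately re-acquiring the lock
      }
    }
  }

  // Stopping thread: mark the server as stopped *before* taking the lock, so
  // the responder sees the flag on its next wake-up and gets out of the way.
  void stop() {
    stopped.store(true);
    std::lock_guard<std::mutex> guard(socketsLock);
    // The sockets would be closed here.
  }
};

// Runs one responder, stops the backend, and reports whether both finished.
bool removeServerDemo() {
  Backend backend;
  std::thread responder([&backend] {
    backend.responderLoop(std::chrono::milliseconds(50));
  });
  std::this_thread::sleep_for(std::chrono::milliseconds(120));
  backend.stop();
  responder.join();
  return backend.stopped.load() && backend.wakeups.load() >= 1;
}
```

In this sketch, `stop()` can wait at most about one timeout for the lock: once the flag is set, the responder releases the mutex at its next wake-up and returns rather than looping again. Without the flag check, the responder would re-take the lock almost immediately, which is the starvation the description refers to.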

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)

@rgacogne rgacogne added this to the dnsdist-1.6.0 milestone Dec 24, 2020
@rgacogne rgacogne merged commit bf04a19 into PowerDNS:master Jan 5, 2021
@rgacogne rgacogne deleted the ddist-rmserver-lock branch January 5, 2021 09:46
@rgacogne rgacogne mentioned this pull request Mar 10, 2021