Fix deadlock for queries with no candidate partitions #1947

Merged
dominiklohmann merged 2 commits into master from topic/query-worker-availability on Nov 3, 2021

@dominiklohmann dominiklohmann commented Nov 3, 2021

With the introduction of the query backlog, we changed how we handle the available query workers. This introduced a subtle bug: queries from anonymous senders, and queries for which the meta index yielded no candidate partitions, never returned their query worker, leaving it permanently unavailable. As a result, VAST could no longer export data after handling exactly as many such queries as there are query workers, resulting in a deadlock.
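To make the failure mode concrete, here is a minimal toy model in Python. It is not VAST's actual C++ code: the `Dispatcher` class, the `return_worker_on_early_exit` flag, and the query dictionaries are all hypothetical names invented for illustration. It only sketches the pattern described above, assuming a fixed set of workers, a backlog, and an early-exit path that forgets to hand the worker back.

```python
from collections import deque


class Dispatcher:
    """Hypothetical model of a fixed set of query workers plus a backlog."""

    def __init__(self, num_workers, return_worker_on_early_exit):
        self.idle_workers = deque(range(num_workers))
        self.backlog = deque()
        # False models the buggy behavior, True models the fixed behavior.
        self.return_worker_on_early_exit = return_worker_on_early_exit

    def submit(self, query):
        self.backlog.append(query)
        self._schedule()

    def _schedule(self):
        # Pair queued queries with idle workers until one side runs out.
        while self.backlog and self.idle_workers:
            worker = self.idle_workers.popleft()
            query = self.backlog.popleft()
            self._handle(worker, query)

    def _handle(self, worker, query):
        if query["sender"] is None or not query["candidates"]:
            # Early exit: anonymous sender or no candidate partitions.
            if self.return_worker_on_early_exit:
                self.idle_workers.append(worker)
            # Otherwise the worker is never handed back (the leak).
            return
        # Regular path (simplified to run synchronously): do the work,
        # then hand the worker back.
        query["done"] = True
        self.idle_workers.append(worker)


def run(fixed):
    """Send one early-exit query per worker, then a regular export."""
    dispatcher = Dispatcher(num_workers=2, return_worker_on_early_exit=fixed)
    for _ in range(2):
        dispatcher.submit({"sender": "client", "candidates": [], "done": False})
    export = {"sender": "client", "candidates": ["partition-1"], "done": False}
    dispatcher.submit(export)
    return export["done"], len(dispatcher.idle_workers)


print(run(fixed=False))  # export stays in the backlog, no idle workers left
print(run(fixed=True))   # export completes, both workers are idle again
```

In the buggy configuration, the export query sits in the backlog forever because every worker was consumed by an early-exit query. Returning the worker on every exit path lets the export proceed.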

📝 Checklist

  • All user-facing changes have changelog entries.
  • The changes are reflected on docs.tenzir.com/vast, if necessary.
  • The PR description contains instructions for the reviewer, if necessary.

🎯 Review Instructions

I reproduced the deadlock locally before this change and could no longer reproduce it afterwards. It probably makes sense to follow the logic of the handlers that receive a query worker and a query individually, which makes this fix quite obvious in hindsight.

@dominiklohmann dominiklohmann added the bug Incorrect behavior label Nov 3, 2021

@tobim tobim left a comment

Great find!

@dominiklohmann dominiklohmann merged commit ae91d00 into master Nov 3, 2021
@dominiklohmann dominiklohmann deleted the topic/query-worker-availability branch November 3, 2021 13:20
tobim added a commit that referenced this pull request Nov 8, 2021
This reverts commit ae91d00, reversing
changes made to 3a9ef86.