New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce a query backlog in the index #1896
Conversation
@tobim and I collaborated on the earlier. The code looks good to me, but it needs testing (and performance testing!) before merging, and also a changelog entry for the changed behavior of |
Before this change the index would push queries back into its own message queue when no worker was available. This means we don't have to skip lots of queries any more whenever the index receives a new message. The change also introduces a stricter restriction on the number of parallel queries. The old version would interleave the work for all queries, i.e. a worker would switch to a random work package from the pending queries map when it was done with the one before, but now a worker gets assigned to a query until that is completely done.
8219d62
to
73f399d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can safely be merged. In limited testing, I did not notice a major overall performance impact, although the changed behavior can easily be observed: When issuing thousands of queries at around the same time, the average response time for earlier queries is lower compared to before this change, while it is higher for later queries.
I think additional logging makes sense on the verbose level. We should notify the user when …
- … a query is pushed onto the backlog (trigger: new query, no worker available)
- … a query is popped from the backlog (trigger: new worker, query backlog not empty)
- … a query is executed immediately because workers were available (trigger: new query, worker available)
- … a query has finished, i.e., a worker goes back into the pool of available workers (trigger: new worker, query backlog empty)
I am approving this as-is, but please make the requested changes before merging.
... and demote another one to DEBUG.
📔 Description
Before this change the index would push queries back into its own message queue when no worker was available. This means we don't have to skip lots of queries any more whenever the index receives a new message.
The change also introduces a stricter restriction on the number of parallel queries. The old version would interleave the work for all queries, i.e. a worker would switch to a random work package from the pending queries map when it was done with the one before, but now a worker gets assigned to a query until that is completely done.
📝 Checklist
🎯 Review Instructions