Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a query backlog in the index #1896

Merged
merged 6 commits into from Oct 19, 2021
Merged

Conversation

tobim
Copy link
Member

@tobim tobim commented Oct 1, 2021

📔 Description

Before this change the index would push queries back into its own message queue when no worker was available. This means we don't have to skip lots of queries any more whenever the index receives a new message.

The change also introduces a stricter restriction on the number of parallel queries. The old version would interleave the work for all queries, i.e. a worker would switch to a random work package from the pending queries map when it was done with the one before, but now a worker gets assigned to a query until that is completely done.

📝 Checklist

  • All user-facing changes have changelog entries.
  • The PR description contains instructions for the reviewer, if necessary.

🎯 Review Instructions

@tobim tobim added the maintenance Tasks for keeping up the infrastructure label Oct 1, 2021
@tobim tobim requested a review from a team October 1, 2021 13:58
@tobim tobim changed the title Story/ch28702/query backlog Introduce a query backlog in the index Oct 1, 2021
@dominiklohmann
Copy link
Member

@tobim and I collaborated on the earlier. The code looks good to me, but it needs testing (and performance testing!) before merging, and also a changelog entry for the changed behavior of vast.max-queries if this turns out to be a performance-relevant change.

Before this change the index would push queries back into its own
message queue when no worker was available. This means we don't
have to skip lots of queries any more whenever the index receives
a new message.

The change also introduces a stricter restriction on the number of
parallel queries. The old version would interleave the work for all
queries, i.e. a worker would switch to a random work package from
the pending queries map when it was done with the one before, but
now a worker gets assigned to a query until that is completely done.
@tobim tobim force-pushed the story/ch28702/query-backlog branch from 8219d62 to 73f399d Compare October 11, 2021 10:01
Copy link
Member

@dominiklohmann dominiklohmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can safely be merged. In limited testing, I did not notice a major overall performance impact, although the changed behavior can easily be observed: When issuing thousands of queries at around the same time, the average response time for earlier queries is lower compared to before this change, while it is higher for later queries.

I think additional logging makes sense on the verbose level. We should notify the user when …

  • … a query is pushed onto the backlog (trigger: new query, no worker available)
  • … a query is popped from the backlog (trigger: new worker, query backlog not empty)
  • … a query is executed immediately because workers were available (trigger: new query, worker available)
  • … a query has finished, i.e., a worker goes back into the pool of available workers (trigger: new worker, query backlog empty)

I am approving this as-is, but please make the requested changes before merging.

libvast/src/system/index.cpp Outdated Show resolved Hide resolved
libvast/src/system/index.cpp Outdated Show resolved Hide resolved
@dominiklohmann dominiklohmann added the performance Improvements or regressions of performance label Oct 12, 2021
@tobim tobim enabled auto-merge October 19, 2021 07:46
@tobim tobim merged commit 62248a1 into master Oct 19, 2021
@tobim tobim deleted the story/ch28702/query-backlog branch October 19, 2021 08:46
tobim added a commit that referenced this pull request Nov 8, 2021
This reverts commit 62248a1, reversing
changes made to 3f602a2.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Tasks for keeping up the infrastructure performance Improvements or regressions of performance
Projects
None yet
2 participants