New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vmselect: introduce search.skipSlowReplicas
cmd-line flag
#4538
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
f41gh7
approved these changes
Jun 29, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
vmselect has two logical conditions during request processing when `-replicationFactor` cmd-line flag is set: 1. If at least `len(storageNodes) - replicationFactor` responded, it could skip waiting for the rest of nodes to respond. This could lead to problems described here #1207. 2. Mark response as partial if less than `len(storageNodes) - replicationFactor` responded without an error. The P1 showed itself error-prone and became the main reason why `-replicationFactor` wasn't recommended to use at vmselect level. However, this optimization could be still very useful in situations when there are slow and fast replicas in cluster. But P2 remains viable and important conditionless. Hiding P1 behind the feature-flag `search.skipSlowReplicas` should make `-replicationFactor` flag usable again. And let users choose whether they want P1 to be respected. Related issues #1207 #711 Signed-off-by: hagen1778 <roman@victoriametrics.com>
hagen1778
force-pushed
the
vmselect-partial-response
branch
from
July 7, 2023 09:41
c4e1698
to
cbbb555
Compare
hagen1778
changed the title
vmselect: wait for all vmstorage nodes to respond
vmselect: introduce Jul 7, 2023
search.skipSlowReplicas
cmd-line flag
f41gh7
approved these changes
Jul 7, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: hagen1778 <roman@victoriametrics.com>
valyala
reviewed
Jul 9, 2023
valyala
pushed a commit
that referenced
this pull request
Jul 9, 2023
* vmselect: introduce `search.skipSlowReplicas` cmd-line flag vmselect has two logical conditions during request processing when `-replicationFactor` cmd-line flag is set: 1. If at least `len(storageNodes) - replicationFactor` responded, it could skip waiting for the rest of nodes to respond. This could lead to problems described here #1207. 2. Mark response as partial if less than `len(storageNodes) - replicationFactor` responded without an error. The P1 showed itself error-prone and became the main reason why `-replicationFactor` wasn't recommended to use at vmselect level. However, this optimization could be still very useful in situations when there are slow and fast replicas in cluster. But P2 remains viable and important conditionless. Hiding P1 behind the feature-flag `search.skipSlowReplicas` should make `-replicationFactor` flag usable again. And let users choose whether they want P1 to be respected. Related issues #1207 #711 Signed-off-by: hagen1778 <roman@victoriametrics.com> * docs: update changelog Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>
FYI, this pull request has been included in VictoriaMetrics v1.91.3 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
vmselect has two logical conditions during request processing when
-replicationFactor
cmd-line flag is set:len(storageNodes) - replicationFactor
responded, it could skipwaiting for the rest of nodes to respond. This could lead to problems described
here is not recommand enable vmselect replicationfactor in cluster with data #1207.
len(storageNodes) - replicationFactor
respondedwithout an error.
The P1 showed itself error-prone and became the main reason why
-replicationFactor
wasn't recommended to use at vmselect level.However, this optimization could be still very useful in situations
when there are slow and fast replicas in cluster.
But P2 remains viable and important conditionless.
Hiding P1 behind the feature-flag
search.skipSlowReplicas
should make
-replicationFactor
flag usable again. And let userschoose whether they want P1 to be respected.
Related issues
#1207
#711
Signed-off-by: hagen1778 roman@victoriametrics.com