Add node-level/Disk-level merge rate throttling #51140

Open
kkewwei opened this issue Jan 17, 2020 · 3 comments
Labels
:Distributed/Engine Anything around managing Lucene and the Translog in an open shard. Team:Distributed Meta label for distributed team

Comments

@kkewwei
Contributor

kkewwei commented Jan 17, 2020

ES_VERSION: 7.3.1
Description of the problem including actual behavior:
Elasticsearch throttles merges at the shard level, using two controls:

  1. maxThreadCount
  2. IO write throttling

This gives good control over the merge rate within a single shard. However, a node usually hosts many shards, and each shard throttles its merges independently of the others. It is quite likely that two shards on the same node, stored on the same disk, run IO-heavy merges at the same time. Disk IO utilization then reaches 100%, the node becomes a write bottleneck, and cluster performance suffers. We have seen this many times in production.

Expected behavior:
Could merge IO also be throttled at the node level, not just at the shard level? Or could it be throttled at the disk level?
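
To make the request concrete, here is a minimal sketch of what a shared limit could look like, assuming a single byte budget per disk (or per node) that every merging shard draws from. `SharedMergeBudget` and `pause_for` are hypothetical names used only for illustration; they are not Elasticsearch or Lucene APIs, and the real throttling code may work quite differently.

```python
class SharedMergeBudget:
    """Hypothetical byte budget shared by all shards on one disk (or node).

    Illustration only: this is a simple token bucket, not the throttling
    logic that Elasticsearch or Lucene actually use.
    """

    def __init__(self, max_bytes_per_sec):
        if max_bytes_per_sec <= 0:
            raise ValueError("max_bytes_per_sec must be positive")
        self.max_bytes_per_sec = float(max_bytes_per_sec)
        # Start with one second's worth of budget available.
        self._available = self.max_bytes_per_sec
        self._last_refill = 0.0

    def _refill(self, now):
        # Add budget for the elapsed time, capped at one second's worth.
        elapsed = max(0.0, now - self._last_refill)
        self._available = min(
            self.max_bytes_per_sec,
            self._available + elapsed * self.max_bytes_per_sec,
        )
        self._last_refill = now

    def pause_for(self, num_bytes, now):
        """Return how long (seconds) a merge should pause before writing.

        `now` is a timestamp in seconds supplied by the caller, which keeps
        the sketch deterministic and easy to test.
        """
        self._refill(now)
        self._available -= num_bytes
        if self._available >= 0:
            return 0.0
        # The budget is overdrawn: wait until the deficit is paid back.
        return -self._available / self.max_bytes_per_sec
```

With a budget of 100 bytes per second, a first shard writing 100 bytes at `now=0.0` is not paused, while a second shard on the same disk writing 50 bytes at the same instant is asked to pause for 0.5 seconds. With only independent shard-level limits, both writes would go through immediately and their combined rate could exceed what the disk can sustain.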

@jkakavas jkakavas added the :Core/Infra/Core Core issues without another label label Jan 17, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Core)

@kkewwei
Contributor Author

kkewwei commented Mar 2, 2020

Hi @jkakavas, could anyone help confirm whether we should add this throttling?

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@rjernst rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020
@pgomulka
Contributor

@kkewwei the issue you described might indeed exist. I am not sure we should be implementing it though.
adding @elastic/es-distributed for help

@pgomulka pgomulka added :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Core/Infra/Core Core issues without another label labels Dec 18, 2020
@elasticmachine elasticmachine added Team:Distributed Meta label for distributed team and removed Team:Core/Infra Meta label for core/infra team labels Dec 18, 2020
@jimczi jimczi removed the needs:triage Requires assignment of a team area label label Jan 12, 2021
6 participants