Add node-level/Disk-level merge rate throttling #51140

kkewwei · 2020-01-17T10:07:31Z

ES_VERSION: 7.3.1
Description of the problem including actual behavior:
ES throttles the merge rate at the shard level, based on two factors:

maxThreadCount
IO write throttling

This controls the merge rate within a single shard well, but a node hosts many shards, and their merge rates are throttled independently of each other. It is quite likely that two shards on the same node, stored on the same disk, merge at the same time with high IO usage and push that disk's IO utilization to 100%. The node then becomes a write bottleneck and the performance of the whole cluster suffers. This has happened many times in our production environment.

Expected behavior:
Could we also throttle merge IO at the node level, not just the shard level, while segments are merging? Or could we throttle IO usage at the disk level?
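
To make the request concrete, here is a minimal sketch of a disk-level throttle, written in Python for brevity. It is not Elasticsearch or Lucene code: the class `DiskMergeThrottle`, its method `pause_needed`, and the byte budget are all hypothetical names chosen for illustration. The idea is that every shard merging on the same disk draws from one shared token bucket, instead of each shard being limited on its own.

```python
import threading
import time


class DiskMergeThrottle:
    """Hypothetical token bucket shared by all shards merging on one disk."""

    def __init__(self, max_bytes_per_sec, clock=time.monotonic):
        self.rate = float(max_bytes_per_sec)
        self.capacity = float(max_bytes_per_sec)  # allow at most one second of burst
        self.tokens = self.capacity
        self.clock = clock  # injectable so the behavior can be checked without sleeping
        self.last = clock()
        self.lock = threading.Lock()

    def pause_needed(self, nbytes):
        """Record nbytes of merge writes; return how many seconds the caller should pause."""
        with self.lock:
            now = self.clock()
            # Refill the budget for the time elapsed since the last call.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            # The balance may go negative; the debt is paid off by pausing.
            self.tokens -= nbytes
            return max(0.0, -self.tokens / self.rate)
```

In this sketch, each merge thread would call `pause_needed(len(chunk))` before writing a chunk and sleep for the returned duration. Because the bucket is shared, a second shard that starts merging while the first has used up the budget is slowed down right away, which is the behavior asked for above.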

elasticmachine · 2020-01-17T13:49:22Z

Pinging @elastic/es-core-infra (:Core/Infra/Core)

kkewwei · 2020-03-02T11:36:35Z

Hi @jkakavas, could anyone help confirm whether we should add this throttling?

pgomulka · 2020-12-18T07:27:02Z

@kkewwei the issue you described might indeed exist. I am not sure we should implement it, though.
Adding @elastic/es-distributed for help.

jkakavas added the :Core/Infra/Core Core issues without another label label Jan 17, 2020

rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020

rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020

pgomulka added :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Core/Infra/Core Core issues without another label labels Dec 18, 2020

elasticmachine added Team:Distributed Meta label for distributed team and removed Team:Core/Infra Meta label for core/infra team labels Dec 18, 2020

jimczi removed the needs:triage Requires assignment of a team area label label Jan 12, 2021
