Allow more control of index.merge.policy.deletes_pct_allowed #77270

Closed
n0othing opened this issue Sep 3, 2021 · 3 comments
Labels
:Distributed/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement Team:Distributed Meta label for distributed team

Comments

@n0othing
Member

n0othing commented Sep 3, 2021

Today, the default merge policy lets you control the maximum percentage of deleted documents tolerated in an index via `index.merge.policy.deletes_pct_allowed`. The setting defaults to 33% and can be configured anywhere between 20% and 50%.

The 20% floor is in place to avoid overwhelming nodes with excessive CPU usage and disk I/O. However, at scale, this 20% could translate to terabytes of wasted space. Some users have high-performance hardware and would be OK trading extra CPU and disk I/O for more free disk space, so it would be useful to allow values below 20%.
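To make the trade-off concrete, here is a minimal Python sketch, not an official client. It builds the JSON body for an index settings update (`PUT /<index>/_settings`) and estimates how much space the tolerated deletes could occupy. The 20% to 50% bounds and the 33% default are taken from this issue and may differ in other versions. The helper names are hypothetical, and the estimate assumes deleted documents take up space in proportion to their count, which is only a rough approximation.

```python
import json

# Bounds and default as described in this issue; other versions may differ.
MIN_PCT = 20.0
MAX_PCT = 50.0
DEFAULT_PCT = 33.0


def deletes_pct_settings_body(pct=DEFAULT_PCT):
    """Return a JSON body for PUT /<index>/_settings (hypothetical helper)."""
    if not MIN_PCT <= pct <= MAX_PCT:
        raise ValueError(
            f"deletes_pct_allowed must be between {MIN_PCT} and {MAX_PCT}, got {pct}"
        )
    return json.dumps({"index.merge.policy.deletes_pct_allowed": pct})


def tolerated_deleted_bytes(index_bytes, pct):
    """Rough estimate of space held by deleted docs at the tolerated percentage.

    Assumes deleted documents occupy space in proportion to their count.
    """
    return index_bytes * pct / 100.0


if __name__ == "__main__":
    print(deletes_pct_settings_body(25))
    # 100 TB of index data at the 20% floor
    print(tolerated_deleted_bytes(100 * 10**12, 20) / 10**12, "TB")
```

Under these assumptions, 100 TB of index data at the 20% floor could leave roughly 20 TB occupied by deleted documents, which is the scale of waste described above.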

@n0othing added the >enhancement and needs:triage labels Sep 3, 2021
@gwbrown added the :Distributed/Engine label and removed the needs:triage label Sep 3, 2021
@elasticmachine added the Team:Distributed label Sep 3, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

jimczi added a commit to jimczi/elasticsearch that referenced this issue Sep 9, 2021
This commit changes the es merge policy to apply the maximum segment size
on force merges that only expunge deletes (forceMergeDeletes).
This option is useful for read-write use cases that want to reclaim deleted docs
more aggressively than `index.merge.policy.deletes_pct_allowed` allows.

Relates elastic#77270
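
For context, a force merge that only expunges deletes is requested through the `_forcemerge` endpoint with the `only_expunge_deletes=true` query parameter, sent as a `POST`. The following is a small sketch that only builds the request URL; the helper name, host, and index name are placeholders, not part of the commit.

```python
from urllib.parse import quote, urlencode


def expunge_deletes_url(base_url, index):
    """Build the URL for a force merge that only expunges deletes.

    Hypothetical helper: base_url and index are supplied by the caller.
    The request itself should be sent with the POST method.
    """
    query = urlencode({"only_expunge_deletes": "true"})
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/_forcemerge?{query}"


if __name__ == "__main__":
    print(expunge_deletes_url("http://localhost:9200", "my-index"))
```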
jimczi added a commit that referenced this issue Sep 14, 2021
…rge (#77478)

This commit changes the es merge policy to apply the maximum segment size
on force merges that only expunge deletes (forceMergeDeletes).
This option is useful for read-write use cases that want to reclaim deleted docs
more aggressively than `index.merge.policy.deletes_pct_allowed` allows.

Closes #61764
Relates #77270
jimczi added a commit that referenced this issue Sep 14, 2021
…rge (#77478) (#77692)

This commit changes the es merge policy to apply the maximum segment size
on force merges that only expunge deletes (forceMergeDeletes).
This option is useful for read-write use cases that want to reclaim deleted docs
more aggressively than `index.merge.policy.deletes_pct_allowed` allows.

Closes #61764
Relates #77270
@AlexanderGunnarssonMW

We are also interested in this. We are currently scaling up our cluster based on disk usage and have spare CPU and would benefit from a value lower than 20%.

@n0othing
Member Author

Closing thanks to #93188!
