Reduce merging in PersistedClusterStateService #79793

DaveCTurner
Contributor

When writing the cluster state index we flush a segment every 2000 docs
or so, which sometimes triggers merging in the middle of the write
process. This merging is often unnecessary since many of the segments
being merged would have ended up containing no live docs at the end of
the process and hence could have just been deleted.

With this commit we adjust the merge policy to be much more relaxed
about merging, permitting up to 100 segments per tier, since we only
read this index very rarely and not on any hot paths. We also disable
merging completely during the write process, checking just before commit
to see if any merging should be done.
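
To make the reasoning above concrete, here is a toy model. It is not the code from this commit and it does not use Lucene. It assumes a write that replaces every doc of the previous state, flushing a new segment every 2000 docs, and it treats a merge as "copy every live doc of every segment into one new segment". The class and method names, the all-into-one merge rule, and the eager budget of 10 segments are hypothetical choices made for illustration; only the 2000-docs-per-flush and 100-segments-per-tier figures come from the description.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model only: hypothetical names, no Lucene, not the PersistedClusterStateService code.
class DeferredMergeSketch {

    static final int DOCS_PER_FLUSH = 2000; // "a segment every 2000 docs or so"

    static final class Segment {
        int stale; // live docs from the previous state, all replaced by the end of the write
        int fresh; // docs written by the current write
        Segment(int stale, int fresh) { this.stale = stale; this.fresh = fresh; }
        int live() { return stale + fresh; }
    }

    /** Returns how many docs were copied by merges over one full rewrite. */
    static long simulate(int oldSegmentCount, int segmentsPerTier, boolean mergeDuringWrite) {
        List<Segment> segments = new ArrayList<>();
        for (int i = 0; i < oldSegmentCount; i++) {
            segments.add(new Segment(DOCS_PER_FLUSH, 0));
        }
        long copied = 0;
        for (int flush = 0; flush < oldSegmentCount; flush++) {
            // Writing 2000 new docs replaces 2000 old ones, then a new segment is flushed.
            replaceStaleDocs(segments, DOCS_PER_FLUSH);
            segments.add(new Segment(0, DOCS_PER_FLUSH));
            if (mergeDuringWrite) {
                copied += mergeIfOverBudget(segments, segmentsPerTier);
            }
        }
        // Single check just before "commit", as in the description.
        copied += mergeIfOverBudget(segments, segmentsPerTier);
        return copied;
    }

    /** Deletes stale docs oldest-first; a segment with no live docs is simply dropped. */
    static void replaceStaleDocs(List<Segment> segments, int count) {
        Iterator<Segment> it = segments.iterator();
        while (it.hasNext() && count > 0) {
            Segment s = it.next();
            int deleted = Math.min(s.stale, count);
            s.stale -= deleted;
            count -= deleted;
            if (s.live() == 0) {
                it.remove();
            }
        }
    }

    /** Toy merge: if over budget, copy all live docs into one segment and report the cost. */
    static long mergeIfOverBudget(List<Segment> segments, int segmentsPerTier) {
        if (segments.size() <= segmentsPerTier) {
            return 0;
        }
        int stale = 0;
        int fresh = 0;
        for (Segment s : segments) {
            stale += s.stale;
            fresh += s.fresh;
        }
        segments.clear();
        segments.add(new Segment(stale, fresh));
        return (long) stale + fresh;
    }

    public static void main(String[] args) {
        // 20 old segments, tight budget, merging allowed mid-write: prints 80000
        System.out.println(simulate(20, 10, true));
        // Same write, relaxed budget of 100, merging deferred to commit: prints 0
        System.out.println(simulate(20, 100, false));
    }
}
```

In this model the eager run copies 80000 docs, many of them stale docs that are deleted a few flushes later, while the deferred run copies none because the old segments empty out and are dropped before the pre-commit check. The check still matters: `simulate(150, 100, false)` ends with 150 fresh segments, exceeds the budget, and merges once at the end.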

Relates #77466

@DaveCTurner DaveCTurner added >bug :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.0.0 v7.16.1 labels Oct 26, 2021
@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Oct 26, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Member

@original-brownbear original-brownbear left a comment

LGTM thanks David!

@DaveCTurner DaveCTurner added the auto-backport-and-merge Automatically create backport pull requests and merge when ready label Oct 26, 2021
@DaveCTurner DaveCTurner merged commit 5ee9bde into elastic:master Oct 26, 2021
@DaveCTurner DaveCTurner deleted the 2021-10-26-less-merging-in-cluster-state branch October 26, 2021 12:02
@elasticsearchmachine
Collaborator

💔 Backport failed

| Status | Branch | Result |
|---|---|---|
| | 7.16 | Commit could not be cherrypicked due to conflicts |

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 79793

DaveCTurner added a commit that referenced this pull request Oct 26, 2021
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Oct 26, 2021
* upstream/master: (209 commits)
  Enforce license expiration (elastic#79671)
  TSDB: Automatically add timestamp mapper (elastic#79136)
  [DOCS]  `_id` is required for bulk API's `update` action (elastic#79774)
  EQL: Add optional fields and limit joining keys on non-null values only (elastic#79677)
  [DOCS] Document range enrich policy (elastic#79607)
  [DOCS] Fix typos in 8.0 security migration (elastic#79802)
  Allow listing older repositories (elastic#78244)
  [ML] track inference model feature usage per node (elastic#79752)
  Remove IncrementalClusterStateWriter & related code (elastic#79738)
  Reuse previous indices lookup when possible (elastic#79004)
  Reduce merging in PersistedClusterStateService (elastic#79793)
  SQL: Adjust JDBC docs to use milliseconds for timeouts (elastic#79628)
  Remove endpoint for freezing indices (elastic#78918)
  [ML] add timeout parameter for DELETE trained_models API (elastic#79739)
  [ML] wait for .ml-state-write alias to be readable (elastic#79731)
  [Docs] Update edgengram-tokenizer.asciidoc (elastic#79577)
  [Test][Transform] fix UpdateTransformActionRequestTests failure (elastic#79787)
  Limit CS Update Task Description Size (elastic#79443)
  Apply the reader wrapper on can_match source (elastic#78988)
  [DOCS] Adds new transform limitation item and a note to the tutorial (elastic#79479)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/IndexMode.java
#	server/src/test/java/org/elasticsearch/index/TimeSeriesModeTests.java
lockewritesdocs pushed a commit to lockewritesdocs/elasticsearch that referenced this pull request Oct 28, 2021
Labels
auto-backport-and-merge Automatically create backport pull requests and merge when ready >bug :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed Meta label for distributed team v7.16.0 v8.0.0-beta1
6 participants