
Avoid unnecessary persistence of retention leases #42299

Conversation

jasontedor
Member

Today we persist the retention leases at least every thirty seconds via a scheduled background sync. This sync causes an fsync to disk, and when a large number of shards are allocated to slow disks, these fsyncs can pile up and severely impact the system. This commit addresses this by persisting and fsyncing the retention leases only if they have changed since the last time we persisted and fsynced them.

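The change described above boils down to a dirty-check before the fsync: track a version that bumps whenever the leases change, and let the periodic sync skip the disk write when the version is unchanged. The sketch below illustrates that idea only; the class and member names (`RetentionLeasePersister`, `lastPersistedVersion`, `persistIfChanged`) are hypothetical and are not the actual Elasticsearch implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the "persist only when changed" idea from this PR.
// Names are illustrative; the real logic lives elsewhere in Elasticsearch.
class RetentionLeasePersister {
    private final AtomicLong currentVersion = new AtomicLong(0); // bumped on lease changes
    private long lastPersistedVersion = -1; // -1: nothing persisted yet
    private int fsyncCount = 0;

    // A lease add, renew, or removal bumps the version.
    void onLeasesChanged() {
        currentVersion.incrementAndGet();
    }

    // Called by the scheduled background sync (every ~30s).
    // Returns true only if an fsync was actually performed.
    synchronized boolean persistIfChanged() {
        long version = currentVersion.get();
        if (version == lastPersistedVersion) {
            return false; // unchanged since last persist: skip the fsync
        }
        fsyncCount++; // stand-in for writing the state file and fsyncing it
        lastPersistedVersion = version;
        return true;
    }

    synchronized int fsyncCount() {
        return fsyncCount;
    }
}
```

With this structure, repeated background syncs on a shard whose leases are idle perform no disk writes at all, which is what prevents the fsyncs from piling up on slow disks.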
@jasontedor jasontedor added >bug :Distributed/Distributed v8.0.0 v7.2.0 v6.8.1 v7.1.1 labels May 21, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

…retention-lease-persistence

* elastic/master:
  Mute all ml_datafeed_crud rolling upgrade tests
Contributor

@DaveCTurner DaveCTurner left a comment


LGTM

@jasontedor jasontedor merged commit 5f9c8ba into elastic:master May 21, 2019
jasontedor added a commit that referenced this pull request May 21, 2019
jasontedor added a commit that referenced this pull request May 21, 2019
jasontedor added a commit that referenced this pull request May 21, 2019
@jasontedor jasontedor deleted the avoid-unnecessary-retention-lease-persistence branch May 21, 2019 18:10
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 21, 2019
* master: (176 commits)
  Avoid unnecessary persistence of retention leases (elastic#42299)
  [ML][TEST] Fix limits in AutodetectMemoryLimitIT (elastic#42279)
  [ML Data Frame] Persist and restore checkpoint and position (elastic#41942)
  mute failing filerealm hash caching tests (elastic#42304)
  Safer Wait for Snapshot Success in ClusterPrivilegeTests (elastic#40943)
  Remove 7.0.2 (elastic#42282)
  Revert "Remove 7.0.2 (elastic#42282)"
  [DOCS] Copied note on slicing support to Slicing section. Closes 26114 (elastic#40426)
  Remove 7.0.2 (elastic#42282)
  Mute all ml_datafeed_crud rolling upgrade tests
  Move the FIPS configuration back to the build plugin (elastic#41989)
  Remove stray back tick that's messing up table format (elastic#41705)
  Add missing comma in code section (elastic#41678)
  add 7.1.1 and 6.8.1 versions (elastic#42253)
  Use spearate testkit dir for each run (elastic#42013)
  Add experimental and warnings to vector functions (elastic#42205)
  Fix version in tests since elastic#41906 was merged
  Bump version in BWC check after backport
  Prevent in-place downgrades and invalid upgrades (elastic#41731)
  Mute date_histo interval bwc test
  ...
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 22, 2019
* master: (82 commits)
  Fix off-by-one error in an index shard test
  Cleanup Redundant BlobStoreFormat Class (elastic#42195)
  remove backcompat handling of 6.2.x versions (elastic#42044)
  Mute testDelayedOperationsBeforeAndAfterRelocated
  Execute actions under permit in primary mode only (elastic#42241)
  Mute another transforms_stats yaml test
  Deprecate support for chained multi-fields. (elastic#41926)
  Mute transforms_stats yaml test
  Make unwrapCorrupt Check Suppressed Ex. (elastic#41889)
  Remove Dead Code from Azure Repo Plugin (elastic#42178)
  Reorganize Painless doc structure (elastic#42303)
  Avoid unnecessary persistence of retention leases (elastic#42299)
  [ML][TEST] Fix limits in AutodetectMemoryLimitIT (elastic#42279)
  [ML Data Frame] Persist and restore checkpoint and position (elastic#41942)
  mute failing filerealm hash caching tests (elastic#42304)
  Safer Wait for Snapshot Success in ClusterPrivilegeTests (elastic#40943)
  Remove 7.0.2 (elastic#42282)
  Revert "Remove 7.0.2 (elastic#42282)"
  [DOCS] Copied note on slicing support to Slicing section. Closes 26114 (elastic#40426)
  Remove 7.0.2 (elastic#42282)
  ...
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
@DaveCTurner
Contributor

A user in the discussion forum noticed increased write activity after reindexing some 6.7 indices on a 7.0 cluster, and reported that upgrading to 7.1.1 resolved it. I suspect this PR is what helped.

https://discuss.elastic.co/t/increased-disk-writes-on-elasticsearch/182949

Labels
>bug :Distributed/Distributed v6.8.1 v7.1.1 v7.2.0 v8.0.0-alpha1

4 participants