Also mmap cfs files for hybridfs #38940

Merged

4 participants
@danielmitterdorfer
Member

commented Feb 15, 2019

With this commit we add the .cfs file extension to the list of file
types that are memory-mapped by hybridfs. .cfs files combine all files
of a Lucene segment into a single file in order to save file handles. As
this strategy is only used for "small" segments (less than 10% of the
shard size), it is beneficial to memory-map them instead of accessing
them via NIO.

Relates #36668
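
To illustrate the idea, here is a minimal sketch of an extension-based switch between memory-mapping and NIO. It is not the actual Elasticsearch code: the class name `HybridFsSketch`, the method names, and every extension other than `cfs` are assumptions made for illustration only.

```java
import java.util.Set;

public class HybridFsSketch {
    // Extensions served via mmap. "cfs" is the one this PR adds;
    // "nvd", "dvd" and "tim" are assumed here purely for illustration.
    private static final Set<String> MMAP_EXTENSIONS = Set.of("nvd", "dvd", "tim", "cfs");

    // Returns the text after the last dot, or "" if the name has no dot.
    public static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    // True if the file should be memory-mapped, false if it should be read via NIO.
    public static boolean useMmap(String fileName) {
        return MMAP_EXTENSIONS.contains(extensionOf(fileName));
    }

    public static void main(String[] args) {
        String[] names = {"_0.cfs", "_0.cfe", "_1.tim", "_1.fdt", "segments_2"};
        for (String name : names) {
            System.out.println(name + " -> " + (useMmap(name) ? "mmap" : "nio"));
        }
    }
}
```

In this sketch a file whose extension is not in the set (or that has no extension at all) falls back to NIO, so adding `cfs` to the set is the only change needed to memory-map compound files.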

@danielmitterdorfer

Member Author

commented Feb 15, 2019

I have run several experiments to judge the impact of this change. In a baseline benchmark with mmapfs, for a workload with external ids on a "small" index of roughly 3GB (on a system with 32GB RAM), we see a median throughput of 68700 docs/s. Without this change, hybridfs results in 56900 docs/s (a reduction of 17%). With this change, hybridfs is again on par with mmapfs (as expected).

I also ran an update-heavy workload with 40% id conflicts on a larger index of 75GB on a system that has only 8GB of page cache available. The baseline (mmapfs) results in a median indexing throughput of 12000 docs/s, whereas hybridfs with this change results in 21800 docs/s.
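
For reference, the relative differences between the figures above can be recomputed with a small sketch like the following. The class and method names are hypothetical; only the throughput numbers come from the benchmarks described in this comment.

```java
public class ThroughputDelta {
    // Relative change from baseline to observed, in percent (negative means slower).
    public static double relativeChangePercent(double baseline, double observed) {
        return (observed - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Small index: mmapfs baseline vs. hybridfs without mmapped .cfs files.
        double small = relativeChangePercent(68700, 56900);
        // Large index: mmapfs baseline vs. hybridfs with mmapped .cfs files.
        double large = relativeChangePercent(12000, 21800);
        System.out.println("small index, hybridfs without this change: " + Math.round(small) + "%");
        System.out.println("large index, hybridfs with this change: " + Math.round(large) + "%");
    }
}
```

This gives roughly -17% for the small index without the change and roughly +82% for the large, page-cache-constrained index with it.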

@danielmitterdorfer danielmitterdorfer merged commit 2ab88e2 into elastic:master Feb 15, 2019

8 checks passed

CLA Commit author has signed the CLA
elasticsearch-ci/1 Build finished.
elasticsearch-ci/2 Build finished.
elasticsearch-ci/bwc Build finished.
elasticsearch-ci/default-distro Build finished.
elasticsearch-ci/docbldesx Build finished.
elasticsearch-ci/oss-distro-docs Build finished.
elasticsearch-ci/packaging-sample Build finished.

@danielmitterdorfer danielmitterdorfer deleted the danielmitterdorfer:hybrid-fs-with-cfs branch Feb 15, 2019

danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Feb 15, 2019

Also mmap cfs files for hybridfs (elastic#38940)

danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Feb 15, 2019

Also mmap cfs files for hybridfs (elastic#38940)

danielmitterdorfer added a commit that referenced this pull request Feb 15, 2019

Also mmap cfs files for hybridfs (#38940) (#38947)

danielmitterdorfer added a commit that referenced this pull request Feb 15, 2019

Also mmap cfs files for hybridfs (#38940) (#38948)
@danielmitterdorfer

Member Author

commented Feb 15, 2019

Backports:

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Feb 15, 2019

Merge remote-tracking branch 'elastic/master' into retention-lease-ccr
* elastic/master:
  Avoid double term construction in DfsPhase (elastic#38716)
  Fix typo in DateRange docs (yyy → yyyy) (elastic#38883)
  Introduced class reuses follow parameter code between ShardFollowTasks (elastic#38910)
  Ensure random timestamps are within search boundary (elastic#38753)
  [CI] Muting  method testFollowIndex in IndexFollowingIT
  Update Lucene snapshot repo for 7.0.0-beta1 (elastic#38946)
  SQL: Doc on syntax (identifiers in particular) (elastic#38662)
  Upgrade to Gradle 5.2.1 (elastic#38880)
  Tie break search shard iterator comparisons on cluster alias (elastic#38853)
  Also mmap cfs files for hybridfs (elastic#38940)
  Build: Fix issue with test status logging (elastic#38799)
  Adapt FullClusterRestartIT on master (elastic#38856)
  Fix testAutoFollowing test to use createLeaderIndex() helper method.
  Migrate muted auto follow rolling upgrade test and unmute this test (elastic#38900)
  ShardBulkAction ignore primary response on primary (elastic#38901)
  Recover peers from translog, ignoring soft deletes (elastic#38904)
  Fix NPE on Stale Index in IndicesService (elastic#38891)
  Smarter CCR concurrent file chunk fetching (elastic#38841)
  Fix intermittent failure in ApiKeyIntegTests (elastic#38627)
  re-enable SmokeTestWatcherWithSecurityIT (elastic#38814)

danielmitterdorfer added a commit that referenced this pull request Mar 19, 2019

Also mmap cfs files for hybridfs (#38940) (#40189)

@jakelandis jakelandis added v7.0.0-rc2 and removed v7.0.0 labels Apr 3, 2019
