FSync translog outside of the writer's global lock #18360

Merged
2 commits merged into elastic:master from sync_without_global_lock on May 19, 2016

Conversation

@s1monw (Contributor) commented May 15, 2016

Today we acquire a global write lock that blocks all modifications to the
translog file while we fsync / checkpoint the file. Yet, we don't necessarily
need to block concurrent operations here. This can lead to a lot of blocked
threads if the machine has high concurrency (lots of CPUs) but uses slow disks
(spinning disks), which is absolutely unnecessary. We only need to prevent
concurrent fsyncing / checkpointing, but we can fill buffers and write to the
underlying file concurrently.

This change introduces an additional lock that we hold while fsyncing, but moves
the checkpointing code outside of the writer's global lock.

I had a conversation with @mikemccand, who saw contention on these locks. @mikemccand, can you give this patch a go and see if you get better concurrency? I also think we need to run the power loss tester on this one again :)
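
Roughly, the pattern being described could look like the following minimal sketch (hypothetical names, not the actual TranslogWriter code): appends still synchronize on the writer, while the expensive fsync / checkpoint runs under a separate sync lock, so writers are only briefly blocked to capture a consistent offset.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not the real TranslogWriter: fsync/checkpoint are
// serialized by a dedicated syncLock instead of the writer's global lock.
class TranslogWriterPatternSketch {
    private final FileChannel channel;
    private final ReentrantLock syncLock = new ReentrantLock(); // only guards fsync/checkpoint
    private volatile long writtenOffset;     // advanced under synchronized(this)
    private volatile long lastSyncedOffset;  // advanced under syncLock

    TranslogWriterPatternSketch(FileChannel channel) {
        this.channel = channel;
    }

    // Appends stay serialized on the writer, but they are cheap buffer/file writes.
    synchronized long add(ByteBuffer data) throws IOException {
        final long offset = writtenOffset;
        writtenOffset += channel.write(data, offset);
        return offset;
    }

    // fsync + checkpoint run under syncLock, not under the writer's monitor,
    // so concurrent add() calls are not blocked while the disk is busy.
    void sync() throws IOException {
        if (lastSyncedOffset < writtenOffset) {   // cheap pre-check, no locks held
            syncLock.lock();
            try {
                final long offsetToSync;
                synchronized (this) {             // briefly capture a consistent offset
                    offsetToSync = writtenOffset;
                }
                if (lastSyncedOffset < offsetToSync) {
                    channel.force(false);         // the slow part runs outside synchronized(this)
                    // a real writer would also persist a checkpoint for offsetToSync here
                    lastSyncedOffset = offsetToSync;
                }
            } finally {
                syncLock.unlock();
            }
        }
    }
}
```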

@mikemccand (Contributor)

Thanks @s1monw I will test this!

@@ -139,30 +140,15 @@ private synchronized final void closeWithTragicEvent(Throwable throwable) throws
return new Translog.Location(generation, offset, data.length());
}

private final ReentrantLock syncLock = new ReentrantLock();

Maybe add a lock order comment here, e.g. "lock order syncLock -> synchronized(this)"?
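
For instance, the suggested annotation could look something like this (illustrative snippet, not the actual source):

```java
import java.util.concurrent.locks.ReentrantLock;

class TranslogWriterSketch { // hypothetical wrapper, just to show where the comment would live
    // lock order: syncLock -> synchronized(this);
    // never acquire syncLock while already holding the writer's monitor.
    private final ReentrantLock syncLock = new ReentrantLock();
}
```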

@mikemccand (Contributor)

LGTM

@mikemccand (Contributor)

I think this change is potentially a biggish performance gain on highly concurrent CPUs with slowish IO.

I tested on a 72-core box with a spinning disk. Without the change (2 runs):

  67.3 M docs in 948.4 seconds
  67.3 M docs in 946.3 seconds

and 2 runs with the change:

  67.3 M docs in 858.1 seconds
  67.3 M docs in 884.7 seconds

I also ran the "random power loss tester" and there were no corruptions after 18 power loss events (at which point I hit disk full!).

@s1monw (Contributor, Author) commented May 17, 2016

@mikemccand I pushed a new commit

@bleskes (Contributor) commented May 17, 2016

@mikemccand out of curiosity - how many concurrent threads did you use to index and how many shards are in the index?

@mikemccand (Contributor)

LGTM, thanks @s1monw

@mikemccand (Contributor)

> how many concurrent threads did you use to index and how many shards are in the index?

6 shards, 12 client-side threads, and I told ES to use 36 cores (though I did have a local mod to relax its current 32-core max). I disabled auto IO merge throttling and ran with translog durability set to async.
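
For context, translog durability is an index-level setting; a minimal sketch, assuming the 5.x-era Settings builder API, of the two modes discussed in this thread:

```java
import org.elasticsearch.common.settings.Settings;

// Sketch only (assumed 5.x Settings builder API): "request" (the default) fsyncs the
// translog before acknowledging each request; "async" fsyncs on a background interval.
class TranslogDurabilitySettings {
    static Settings async() {
        return Settings.builder()
                .put("index.translog.durability", "async")
                .build();
    }

    static Settings request() {
        return Settings.builder()
                .put("index.translog.durability", "request")
                .build();
    }
}
```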

@mikemccand (Contributor)

I re-ran the perf test with the default translog durability (request), i.e. an fsync on every request:

before.log.0: Indexer: 56952280 docs: 1139.35 sec [49986.8 dps, 16.0 MB/sec]
before.log.1: Indexer: 56952280 docs: 1133.59 sec [50240.5 dps, 16.1 MB/sec]
before.log.2: Indexer: 56952280 docs: 1156.01 sec [49266.4 dps, 15.8 MB/sec]

 after.log.0: Indexer: 56952280 docs: 1058.93 sec [53782.8 dps, 17.2 MB/sec]
 after.log.1: Indexer: 56952280 docs: 1064.17 sec [53518.0 dps, 17.2 MB/sec]
 after.log.2: Indexer: 56952280 docs: 1046.26 sec [54433.9 dps, 17.4 MB/sec]

This is a powerful CPU (36 real cores) against slowish IO (a single spinning 8 TB disk) ... I think there are real gains here.

@s1monw s1monw merged commit 2b972f1 into elastic:master May 19, 2016
@s1monw s1monw deleted the sync_without_global_lock branch May 19, 2016 07:41
bleskes added a commit to bleskes/elasticsearch that referenced this pull request May 20, 2016
 elastic#18360 introduced an extra lock in order to allow writes while syncing the translog. This caused a potential deadlock with snapshotting code where we first acquire the instance lock, followed by a sync (which acquires the syncLock). However, the sync logic acquires the syncLock first, followed by the instance lock.

I considered solving this by not syncing the translog on snapshot - I think we can get away with just flushing it. That, however, would create subtleties around snapshotting and whether the operations in a snapshot are persisted. I opted instead for slightly uglier code with nested synchronized blocks, where the scope of the change is contained to the TranslogWriter class alone.
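
To illustrate the inversion being described (hypothetical code, not the actual change in the commit), keeping a single acquisition order on every path, sync lock first with the writer monitor nested inside, removes the deadlock:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only; not the actual TranslogWriter change. Before the fix,
// two call paths took the same locks in opposite orders, which can deadlock:
//
//   snapshot path:  synchronized (writer)  ->  sync()  ->  syncLock
//   sync path:      syncLock  ->  synchronized (writer)
class LockOrderSketch {
    private final FileChannel channel;
    private final ReentrantLock syncLock = new ReentrantLock();

    LockOrderSketch(FileChannel channel) {
        this.channel = channel;
    }

    void sync() throws IOException {
        syncLock.lock();                  // 1st: sync lock
        try {
            synchronized (this) {         // 2nd: writer monitor, always nested inside syncLock
                // capture / update writer state consistently with concurrent appends
            }
            channel.force(false);         // the fsync itself still runs outside the monitor
        } finally {
            syncLock.unlock();
        }
    }

    void snapshot() throws IOException {
        syncLock.lock();                  // same order on the snapshot path too
        try {
            synchronized (this) {
                // capture a consistent view of the translog before syncing
            }
            channel.force(false);
        } finally {
            syncLock.unlock();
        }
    }
}
```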
bleskes added a commit that referenced this pull request May 20, 2016
@clintongormley added the :Distributed/Distributed and :Distributed/Engine labels and removed the :Translog and :Distributed/Distributed labels on Feb 13, 2018
Labels: :Distributed/Engine, >enhancement, v5.0.0-alpha3