New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache midpoints in Ptable footer for faster startup times #1491

Merged
merged 1 commit into from Dec 12, 2017

Conversation

3 participants
@shaan1337
Member

shaan1337 commented Nov 15, 2017

Large databases with very large index files (10+ GB) can take a lot of time to start up due to the time taken to cache the midpoints into memory. Although there are not many midpoints (up to 2^28, 24 bytes each - around 6GB max), it takes much time to load them from disk since they are not stored sequentially and the OS reads data in large blocks.

This change caches the midpoints at the end of the PTable file for faster loading times. It will not improve speeds for existing large databases, until the next large PTable merges or if the indexes are rebuilt

Add a new version of PTable, V4:

  1. keep a 128 byte footer before the md5 hash
  2. first 4 bytes of the footer stores an unsigned int: the number of midpoints cached in the PTable
  3. Midpoints will be cached after index entries
  4. Each midpoint index entry has the same size as an index entry

Midpoints will be cached during the next PTable merge (only if no index entries have been removed during the merge, due to missing database entries)
On startup, if index verification is disabled (--skip-index-verify) and cached midpoints are present in the file, they will be loaded directly into the midpoints memory cache.
Otherwise, they will be loaded normally.

@gregoryyoung

This comment has been minimized.

Show comment
Hide comment
@gregoryyoung

gregoryyoung Nov 15, 2017

Member

might it make sense to mmap the cached midpoints?

Member

gregoryyoung commented Nov 15, 2017

might it make sense to mmap the cached midpoints?

@shaan1337

This comment has been minimized.

Show comment
Hide comment
@shaan1337

shaan1337 Nov 15, 2017

Member

in the worst case, it looks like the midpoints cache would be around 6GB, taking around 12s for a read speed of 500MB/s but we need to do some tests to confirm this. it could help save some memory though.

Member

shaan1337 commented Nov 15, 2017

in the worst case, it looks like the midpoints cache would be around 6GB, taking around 12s for a read speed of 500MB/s but we need to do some tests to confirm this. it could help save some memory though.

@shaan1337

This comment has been minimized.

Show comment
Hide comment
@shaan1337

shaan1337 Nov 20, 2017

Member

What are your views on handling corrupted index entries (might happen if index verification has been disabled)?

@hayley-jean and I proposed the following:
We could add some assertions like the following (if not already present):

  • When reading a stream, all index entries for that stream should have same stream hash/events should be ordered
  • File reading positions should be a multiple of the entry sizes

If any of these assertions fail, add a flag that forces index & chunk file verification next time Eventstore is started so that indexes can be rebuilt if corrupted (md5 hashes don't match).

What do you think?

Member

shaan1337 commented Nov 20, 2017

What are your views on handling corrupted index entries (might happen if index verification has been disabled)?

@hayley-jean and I proposed the following:
We could add some assertions like the following (if not already present):

  • When reading a stream, all index entries for that stream should have same stream hash/events should be ordered
  • File reading positions should be a multiple of the entry sizes

If any of these assertions fail, add a flag that forces index & chunk file verification next time Eventstore is started so that indexes can be rebuilt if corrupted (md5 hashes don't match).

What do you think?

@gregoryyoung

This comment has been minimized.

Show comment
Hide comment
@gregoryyoung

gregoryyoung Nov 20, 2017

Member

When reading a stream, all index entries for that stream should have same stream hash/events should be ordered
File reading positions should be a multiple of the entry sizes

I believe both of these are already there.

Member

gregoryyoung commented Nov 20, 2017

When reading a stream, all index entries for that stream should have same stream hash/events should be ordered
File reading positions should be a multiple of the entry sizes

I believe both of these are already there.

@gregoryyoung

This comment has been minimized.

Show comment
Hide comment
Member

gregoryyoung commented Nov 20, 2017

@shaan1337

This comment has been minimized.

Show comment
Hide comment
@shaan1337

shaan1337 Nov 22, 2017

Member

Startup Performance tests

Tests with a 10GB PTable
With index verification: 78s
Without index verification, with cached midpoints: 0.8s
Without index verification and without loading cached midpoints: 62s (based on previous tests)

Generate a 10GB PTable

Header: 128 bytes
Index Entries: 450M * 24 bytes = 10800000000 bytes
Number of midpoints: (2*2**(21-1))*4096 < 10800000000 = Depth: 21, 2^21 = 2097152 midpoints * 24 bytes = 50331648
PTable footer: 128 bytes
MD5 hash: 16 bytes

#Write Header
echo -n -e \\x01\\x04 >> 10gb.dat
dd if=/dev/zero bs=126 count=1 >> 10gb.dat

#Write index entries: 10299*1024*1024 + 715776 remainder bytes
dd if=/dev/urandom bs=1M count=10299 >> 10gb.dat
dd if=/dev/urandom bs=715776 count=1 >> 10gb.dat

#Midpoint entries: 50331648 bytes
dd if=/dev/urandom bs=1M count=48 >> 10gb.dat

#Footer
echo -n -e \\x01\\x04 >> 10gb.dat
#Midpoint count, 2097152
echo -n -e \\x00\\x00\\x20\\x00 >> 10gb.dat
dd if=/dev/zero bs=122 count=1 >> 10gb.dat

#MD5 Hash
dd if=/dev/urandom bs=16 count=1 >> 10gb.dat

#Check file size:
du -b 10gb.dat
10850331920     10gb.dat

#Replace a ptable in index folder with 10gb.dat (rename 10gb.dat to the index file name, in this example: 9981ceda-5c11-41f6-b53c-512463b5c0d0)

Running EventStore

#Clear disk cache
echo 3 > /proc/sys/vm/drop_caches

#Run EventStore without index verification: 0.8s
mono ./bin/clusternode/EventStore.ClusterNode.exe --db ../db --log ../logs --skip-db-verify --skip-index-verify

[19664,15,06:43:57.964] Loading of PTable '9981ceda-5c11-41f6-b53c-512463b5c0d0' started...
[19664,15,06:43:57.968] Disabling Verification of PTable
[19664,15,06:43:57.968] Loading 2097152 cached midpoints from PTable
[19664,07,06:43:58.674] CACHED TFChunk #7-7 (chunk-000007.000000) in 00:00:01.0601896.
[19664,15,06:43:58.779] Loading PTable (Version: 4) '9981ceda-5c11-41f6-b53c-512463b5c0d0' (450000000 entries, cache depth 21) done in 00:00:00.8156375.

#Clear disk cache
echo 3 > /proc/sys/vm/drop_caches

#Run EventStore with index verification: 78s
mono ./bin/clusternode/EventStore.ClusterNode.exe --db ../db --log ../logs --skip-db-verify

[19769,16,06:44:55.132] Loading and Verification of PTable '9981ceda-5c11-41f6-b53c-512463b5c0d0' started...
[19769,12,06:44:55.460] CACHED TFChunk #7-7 (chunk-000007.000000) in 00:00:00.9763872.
[19769,16,06:46:13.113] ReadIndex is corrupted...
Member

shaan1337 commented Nov 22, 2017

Startup Performance tests

Tests with a 10GB PTable
With index verification: 78s
Without index verification, with cached midpoints: 0.8s
Without index verification and without loading cached midpoints: 62s (based on previous tests)

Generate a 10GB PTable

Header: 128 bytes
Index Entries: 450M * 24 bytes = 10800000000 bytes
Number of midpoints: (2*2**(21-1))*4096 < 10800000000 = Depth: 21, 2^21 = 2097152 midpoints * 24 bytes = 50331648
PTable footer: 128 bytes
MD5 hash: 16 bytes

#Write Header
echo -n -e \\x01\\x04 >> 10gb.dat
dd if=/dev/zero bs=126 count=1 >> 10gb.dat

#Write index entries: 10299*1024*1024 + 715776 remainder bytes
dd if=/dev/urandom bs=1M count=10299 >> 10gb.dat
dd if=/dev/urandom bs=715776 count=1 >> 10gb.dat

#Midpoint entries: 50331648 bytes
dd if=/dev/urandom bs=1M count=48 >> 10gb.dat

#Footer
echo -n -e \\x01\\x04 >> 10gb.dat
#Midpoint count, 2097152
echo -n -e \\x00\\x00\\x20\\x00 >> 10gb.dat
dd if=/dev/zero bs=122 count=1 >> 10gb.dat

#MD5 Hash
dd if=/dev/urandom bs=16 count=1 >> 10gb.dat

#Check file size:
du -b 10gb.dat
10850331920     10gb.dat

#Replace a ptable in index folder with 10gb.dat (rename 10gb.dat to the index file name, in this example: 9981ceda-5c11-41f6-b53c-512463b5c0d0)

Running EventStore

#Clear disk cache
echo 3 > /proc/sys/vm/drop_caches

#Run EventStore without index verification: 0.8s
mono ./bin/clusternode/EventStore.ClusterNode.exe --db ../db --log ../logs --skip-db-verify --skip-index-verify

[19664,15,06:43:57.964] Loading of PTable '9981ceda-5c11-41f6-b53c-512463b5c0d0' started...
[19664,15,06:43:57.968] Disabling Verification of PTable
[19664,15,06:43:57.968] Loading 2097152 cached midpoints from PTable
[19664,07,06:43:58.674] CACHED TFChunk #7-7 (chunk-000007.000000) in 00:00:01.0601896.
[19664,15,06:43:58.779] Loading PTable (Version: 4) '9981ceda-5c11-41f6-b53c-512463b5c0d0' (450000000 entries, cache depth 21) done in 00:00:00.8156375.

#Clear disk cache
echo 3 > /proc/sys/vm/drop_caches

#Run EventStore with index verification: 78s
mono ./bin/clusternode/EventStore.ClusterNode.exe --db ../db --log ../logs --skip-db-verify

[19769,16,06:44:55.132] Loading and Verification of PTable '9981ceda-5c11-41f6-b53c-512463b5c0d0' started...
[19769,12,06:44:55.460] CACHED TFChunk #7-7 (chunk-000007.000000) in 00:00:00.9763872.
[19769,16,06:46:13.113] ReadIndex is corrupted...
@shaan1337

This comment has been minimized.

Show comment
Hide comment
@shaan1337

shaan1337 Nov 22, 2017

Member

PTable Merge Performance tests

Results merging 2 10GB files

Note: since we're using same 2 10GB files, the index entries are the same and will be merged into a ~10GB file.

cache-midpoints-in-ptable branch,--skip-db-verify --skip-index-verify: 362s merge, 0.1s load without verification: Total 362s
Note: There were missing entries while going through the index files, so a second pass was automatically made to cache the midpoints. It'll be faster when there are no missing entries.

cache-midpoints-in-ptable branch, --skip-db-verify : 372s merge, 37s load with verification: Total 409s

oss-v4.0.3 branch, --skip-db-verify : 327s merge, 78s load with verification: Total 405s

Creating an indexmap file

Spin up a database, modify an existing index map file, to the following content (keep the checkpoints the same, don't miss first newline):

1
136897089/136897089
10,0,00000000-0000-0000-0000-000000000000
10,1,00000000-0000-0000-0000-000000000001

md5sum indexmap| awk '{print toupper($0)}'
DB98BC2B501CDB34575FC89F6D89A0F5  INDEXMAP

Add MD5Sum to the index map:
DB98BC2B501CDB34575FC89F6D89A0F5
1
136897089/136897089
10,0,00000000-0000-0000-0000-000000000000
10,1,00000000-0000-0000-0000-000000000001

Make two copies of the 10GB PTable to index/00000000-0000-0000-0000-000000000000 and index/00000000-0000-0000-0000-000000000001 respectively (see steps above how to generate the PTable)
cp 10gb.dat ./00000000-0000-0000-0000-000000000000
cp 10gb.dat ./00000000-0000-0000-0000-000000000001

Prevent exceptions during test

diff --git a/src/EventStore.Core/Index/PTable.cs b/src/EventStore.Core/Index/PTable.cs
index f9b4ffa..0a67692 100644
--- a/src/EventStore.Core/Index/PTable.cs
+++ b/src/EventStore.Core/Index/PTable.cs
@@ -235,7 +235,8 @@ namespace EventStore.Core.Index
                         md5.TransformFinalBlock(Empty.ByteArray, 0, 0);
                         var fileHash = new byte[MD5Size];
                         stream.Read(fileHash, 0, MD5Size);
-                        ValidateHash(md5.Hash, fileHash);
+                        //ValidateHash(md5.Hash, fileHash);

diff --git a/src/EventStore.Core/TransactionLog/Chunks/ChunkHeader.cs b/src/EventStore.Core/TransactionLog/Chunks/ChunkHeader.cs
index 6ed6ba3..ed104af 100644
--- a/src/EventStore.Core/TransactionLog/Chunks/ChunkHeader.cs
+++ b/src/EventStore.Core/TransactionLog/Chunks/ChunkHeader.cs
@@ -78,8 +78,9 @@ namespace EventStore.Core.TransactionLog.Chunks
         {
             if (globalLogicalPosition < ChunkStartPosition || globalLogicalPosition > ChunkEndPosition)
             {
-                throw new Exception(string.Format("globalLogicalPosition {0} is out of chunk logical positions [{1}, {2}].",
-                                                  globalLogicalPosition, ChunkStartPosition, ChunkEndPosition));
+                //throw new Exception(string.Format("globalLogicalPosition {0} is out of chunk logical positions [{1}, {2}].",
+                    //                              globalLogicalPosition, ChunkStartPosition, ChunkEndPosition));
+                return 0;
             }
             return globalLogicalPosition - ChunkStartPosition;
         }
diff --git a/src/EventStore.Core/TransactionLog/Chunks/TFChunkManager.cs b/src/EventStore.Core/TransactionLog/Chunks/TFChunkManager.cs
index ffcab8b..382a013 100644
--- a/src/EventStore.Core/TransactionLog/Chunks/TFChunkManager.cs
+++ b/src/EventStore.Core/TransactionLog/Chunks/TFChunkManager.cs
@@ -288,8 +288,10 @@ namespace EventStore.Core.TransactionLog.Chunks
         public TFChunk.TFChunk GetChunkFor(long logPosition)
         {
             var chunkNum = (int)(logPosition / _config.ChunkSize);
-            if (chunkNum < 0 || chunkNum >= _chunksCount)
-                throw new ArgumentOutOfRangeException("logPosition", string.Format("LogPosition {0} does not have corresponding chunk in DB.", logPosition));
+            if (chunkNum < 0 || chunkNum >= _chunksCount){
+                //throw new ArgumentOutOfRangeException("logPosition", string.Format("LogPosition {0} does not have corresponding chunk in DB.", logPosition));
+                chunkNum = 0;
+            }
 
             var chunk = _chunks[chunkNum];
             if (chunk == null)

Running EventStore

#Clear disk cache
echo 3 > /proc/sys/vm/drop_caches

#Run EventStore
mono ./bin/clusternode/EventStore.ClusterNode.exe --db ../db --log ../logs --max-mem-table-size=1000 --skip-db-verify --skip-index-verify

#Write ~1200 events to trigger a PTable merge
mono bin/testclient/EventStore.TestClient.exe
>>> WRLT 1 20 20 1 a

Results with different options

cache-midpoints-in-ptable branch, --skip-db-verify --skip-index-verify : 362s merge, 0.1s load

[05326,10,12:04:54.286] PTables merge started (specialized for <= 2 tables).
[05326,13,12:04:54.304] === Writing E9@139300747:{ecff7ffb-07c4-4dad-a957-747ed43865f6} (previous epoch at 138716442).
[05326,13,12:04:54.310] === Update Last Epoch E9@139300747:{ecff7ffb-07c4-4dad-a957-747ed43865f6} (previous epoch at 138716442).
[05326,19,12:04:54.316] PROJECTIONS: Starting Projections Core Coordinator. (Node State : Master)
[05326,15,12:04:54.317] ========== [127.0.0.1:2112] Sub System 'Projections' initialized.
[05326,08,12:04:54.328] Created stats stream '$stats-127.0.0.1:2113', code = WrongExpectedVersion
[05326,19,12:04:54.338] PROJECTIONS: Starting Projections Manager. (Node State : Master)
[05326,19,12:04:54.339] PROJECTIONS: SubComponent Started: EventReaderCoreService
[05326,19,12:04:54.350] PROJECTIONS: Resetting Worker Writer
[05326,10,12:10:56.278] Cached 2097152 index midpoints to PTable
[05326,10,12:10:56.396] PTables merge finished in 00:06:02.0994750 ([450000000, 450000000] entries merged into 450004760).
[05326,10,12:10:56.396] Loading of PTable '8048d233-59aa-4f59-a7ca-97e3296ce33c' started...
[05326,10,12:10:56.397] Disabling Verification of PTable
[05326,10,12:10:56.397] Loading 2097152 cached midpoints from PTable
[05326,10,12:10:56.506] Loading PTable (Version: 4) '8048d233-59aa-4f59-a7ca-97e3296ce33c' (450004760 entries, cache depth 21) done in 00:00:00.1096874.

cache-midpoints-in-ptable branch, --skip-db-verify : 372s merge, 37s load

[03579,11,11:37:05.241] PTables merge started (specialized for <= 2 tables).
[03579,13,11:37:05.265] === Writing E8@138716442:{7ef0b647-131a-4c4e-b1d8-d4fd06660c02} (previous epoch at 138497188).
[03579,13,11:37:05.271] === Update Last Epoch E8@138716442:{7ef0b647-131a-4c4e-b1d8-d4fd06660c02} (previous epoch at 138497188).
[03579,20,11:37:05.279] PROJECTIONS: Starting Projections Core Coordinator. (Node State : Master)
[03579,15,11:37:05.280] ========== [127.0.0.1:2112] Sub System 'Projections' initialized.
[03579,12,11:37:05.291] Created stats stream '$stats-127.0.0.1:2113', code = WrongExpectedVersion
[03579,20,11:37:05.301] PROJECTIONS: Starting Projections Manager. (Node State : Master)
[03579,20,11:37:05.301] PROJECTIONS: SubComponent Started: EventReaderCoreService
[03579,20,11:37:05.311] PROJECTIONS: Resetting Worker Writer
[03579,11,11:43:17.177] Cached 2097152 index midpoints to PTable
[03579,11,11:43:17.288] PTables merge finished in 00:06:12.0402902 ([450000000, 450000000] entries merged into 450004760).
[03579,11,11:43:17.288] Loading and Verification of PTable '3035b688-f67e-4586-bdd3-f255df9e2c81' started...
[03579,11,11:43:54.567] Loading PTable (Version: 4) '3035b688-f67e-4586-bdd3-f255df9e2c81' (450004760 entries, cache depth 21) done in 00:00:37.2788743.

oss-v4.0.3, --skip-db-verify : 327s merge, 78s load

[09242,11,12:44:13.034] PTables merge started (specialized for <= 2 tables).
[09242,14,12:44:13.036] ========== [127.0.0.1:2112] IS UNKNOWN...
[09242,20,12:44:13.048] Subscriptions received state change to Unknown stopping listening.
[09242,14,12:44:13.058] ELECTIONS: STARTING ELECTIONS.
[09242,14,12:44:13.058] ELECTIONS: (V=0) SHIFT TO LEADER ELECTION.
[09242,14,12:44:13.059] ELECTIONS: (V=0) VIEWCHANGE FROM [127.0.0.1:2112, {fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}].
[09242,14,12:44:13.059] ELECTIONS: (V=0) MAJORITY OF VIEWCHANGE.
[09242,14,12:44:13.060] ELECTIONS: (V=0) SHIFT TO PREPARE PHASE.
[09242,14,12:44:13.060] ELECTIONS: (V=0) PREPARE_OK FROM [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}).
[09242,14,12:44:13.061] ELECTIONS: (V=0) SHIFT TO REG_LEADER.
[09242,14,12:44:13.064] ELECTIONS: (V=0) SENDING PROPOSAL CANDIDATE: [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}), ME: [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}).
[09242,14,12:44:13.065] ELECTIONS: (V=0) ACCEPT FROM [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}] M=[127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}]).
[09242,14,12:44:13.065] ELECTIONS: (V=0) DONE. ELECTED MASTER = [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}). ME=[127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}).
[09242,14,12:44:13.066] ========== [127.0.0.1:2112] PRE-MASTER STATE, WAITING FOR CHASER TO CATCH UP...
[09242,06,12:44:13.066] Subscriptions received state change to PreMaster stopping listening.
[09242,14,12:44:13.067] ========== [127.0.0.1:2112] IS MASTER... SPARTA!
[09242,10,12:44:13.068] Subscriptions Became Master so now handling subscriptions
[09242,12,12:44:13.096] === Writing E11@139709810:{a25b9059-d068-4147-9636-7fa03cae319a} (previous epoch at 139505353).
[09242,12,12:44:13.102] === Update Last Epoch E11@139709810:{a25b9059-d068-4147-9636-7fa03cae319a} (previous epoch at 139505353).
[09242,22,12:44:13.107] PROJECTIONS: Starting Projections Core Coordinator. (Node State : Master)
[09242,14,12:44:13.108] ========== [127.0.0.1:2112] Sub System 'Projections' initialized.
[09242,22,12:44:13.119] PROJECTIONS: Starting Projections Manager. (Node State : Master)
[09242,18,12:44:13.131] Created stats stream '$stats-127.0.0.1:2113', code = WrongExpectedVersion
[09242,22,12:44:13.141] PROJECTIONS: Resetting Worker Writer
[09242,12,12:49:37.988] SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 1184ms. Q: 0/0.
[09242,11,12:49:40.493] PTables merge finished in 00:05:27.4554345 ([450000000, 450000000] entries merged into 450004760).
[09242,11,12:49:40.493] Loading and Verification of PTable '2d65e3b9-6194-4438-bf1a-be1625d5087a' started...
[09242,11,12:50:59.078] Loading PTable (Version: 4) '2d65e3b9-6194-4438-bf1a-be1625d5087a' (450004760 entries, cache depth 21) done in 00:01:18.5845333.
Member

shaan1337 commented Nov 22, 2017

PTable Merge Performance tests

Results merging 2 10GB files

Note: since we're using same 2 10GB files, the index entries are the same and will be merged into a ~10GB file.

cache-midpoints-in-ptable branch,--skip-db-verify --skip-index-verify: 362s merge, 0.1s load without verification: Total 362s
Note: There were missing entries while going through the index files, so a second pass was automatically made to cache the midpoints. It'll be faster when there are no missing entries.

cache-midpoints-in-ptable branch, --skip-db-verify : 372s merge, 37s load with verification: Total 409s

oss-v4.0.3 branch, --skip-db-verify : 327s merge, 78s load with verification: Total 405s

Creating an indexmap file

Spin up a database, modify an existing index map file, to the following content (keep the checkpoints the same, don't miss first newline):

1
136897089/136897089
10,0,00000000-0000-0000-0000-000000000000
10,1,00000000-0000-0000-0000-000000000001

md5sum indexmap| awk '{print toupper($0)}'
DB98BC2B501CDB34575FC89F6D89A0F5  INDEXMAP

Add MD5Sum to the index map:
DB98BC2B501CDB34575FC89F6D89A0F5
1
136897089/136897089
10,0,00000000-0000-0000-0000-000000000000
10,1,00000000-0000-0000-0000-000000000001

Make two copies of the 10GB PTable to index/00000000-0000-0000-0000-000000000000 and index/00000000-0000-0000-0000-000000000001 respectively (see steps above how to generate the PTable)
cp 10gb.dat ./00000000-0000-0000-0000-000000000000
cp 10gb.dat ./00000000-0000-0000-0000-000000000001

Prevent exceptions during test

diff --git a/src/EventStore.Core/Index/PTable.cs b/src/EventStore.Core/Index/PTable.cs
index f9b4ffa..0a67692 100644
--- a/src/EventStore.Core/Index/PTable.cs
+++ b/src/EventStore.Core/Index/PTable.cs
@@ -235,7 +235,8 @@ namespace EventStore.Core.Index
                         md5.TransformFinalBlock(Empty.ByteArray, 0, 0);
                         var fileHash = new byte[MD5Size];
                         stream.Read(fileHash, 0, MD5Size);
-                        ValidateHash(md5.Hash, fileHash);
+                        //ValidateHash(md5.Hash, fileHash);

diff --git a/src/EventStore.Core/TransactionLog/Chunks/ChunkHeader.cs b/src/EventStore.Core/TransactionLog/Chunks/ChunkHeader.cs
index 6ed6ba3..ed104af 100644
--- a/src/EventStore.Core/TransactionLog/Chunks/ChunkHeader.cs
+++ b/src/EventStore.Core/TransactionLog/Chunks/ChunkHeader.cs
@@ -78,8 +78,9 @@ namespace EventStore.Core.TransactionLog.Chunks
         {
             if (globalLogicalPosition < ChunkStartPosition || globalLogicalPosition > ChunkEndPosition)
             {
-                throw new Exception(string.Format("globalLogicalPosition {0} is out of chunk logical positions [{1}, {2}].",
-                                                  globalLogicalPosition, ChunkStartPosition, ChunkEndPosition));
+                //throw new Exception(string.Format("globalLogicalPosition {0} is out of chunk logical positions [{1}, {2}].",
+                    //                              globalLogicalPosition, ChunkStartPosition, ChunkEndPosition));
+                return 0;
             }
             return globalLogicalPosition - ChunkStartPosition;
         }
diff --git a/src/EventStore.Core/TransactionLog/Chunks/TFChunkManager.cs b/src/EventStore.Core/TransactionLog/Chunks/TFChunkManager.cs
index ffcab8b..382a013 100644
--- a/src/EventStore.Core/TransactionLog/Chunks/TFChunkManager.cs
+++ b/src/EventStore.Core/TransactionLog/Chunks/TFChunkManager.cs
@@ -288,8 +288,10 @@ namespace EventStore.Core.TransactionLog.Chunks
         public TFChunk.TFChunk GetChunkFor(long logPosition)
         {
             var chunkNum = (int)(logPosition / _config.ChunkSize);
-            if (chunkNum < 0 || chunkNum >= _chunksCount)
-                throw new ArgumentOutOfRangeException("logPosition", string.Format("LogPosition {0} does not have corresponding chunk in DB.", logPosition));
+            if (chunkNum < 0 || chunkNum >= _chunksCount){
+                //throw new ArgumentOutOfRangeException("logPosition", string.Format("LogPosition {0} does not have corresponding chunk in DB.", logPosition));
+                chunkNum = 0;
+            }
 
             var chunk = _chunks[chunkNum];
             if (chunk == null)

Running EventStore

#Clear disk cache
echo 3 > /proc/sys/vm/drop_caches

#Run EventStore
mono ./bin/clusternode/EventStore.ClusterNode.exe --db ../db --log ../logs --max-mem-table-size=1000 --skip-db-verify --skip-index-verify

#Write ~1200 events to trigger a PTable merge
mono bin/testclient/EventStore.TestClient.exe
>>> WRLT 1 20 20 1 a

Results with different options

cache-midpoints-in-ptable branch, --skip-db-verify --skip-index-verify : 362s merge, 0.1s load

[05326,10,12:04:54.286] PTables merge started (specialized for <= 2 tables).
[05326,13,12:04:54.304] === Writing E9@139300747:{ecff7ffb-07c4-4dad-a957-747ed43865f6} (previous epoch at 138716442).
[05326,13,12:04:54.310] === Update Last Epoch E9@139300747:{ecff7ffb-07c4-4dad-a957-747ed43865f6} (previous epoch at 138716442).
[05326,19,12:04:54.316] PROJECTIONS: Starting Projections Core Coordinator. (Node State : Master)
[05326,15,12:04:54.317] ========== [127.0.0.1:2112] Sub System 'Projections' initialized.
[05326,08,12:04:54.328] Created stats stream '$stats-127.0.0.1:2113', code = WrongExpectedVersion
[05326,19,12:04:54.338] PROJECTIONS: Starting Projections Manager. (Node State : Master)
[05326,19,12:04:54.339] PROJECTIONS: SubComponent Started: EventReaderCoreService
[05326,19,12:04:54.350] PROJECTIONS: Resetting Worker Writer
[05326,10,12:10:56.278] Cached 2097152 index midpoints to PTable
[05326,10,12:10:56.396] PTables merge finished in 00:06:02.0994750 ([450000000, 450000000] entries merged into 450004760).
[05326,10,12:10:56.396] Loading of PTable '8048d233-59aa-4f59-a7ca-97e3296ce33c' started...
[05326,10,12:10:56.397] Disabling Verification of PTable
[05326,10,12:10:56.397] Loading 2097152 cached midpoints from PTable
[05326,10,12:10:56.506] Loading PTable (Version: 4) '8048d233-59aa-4f59-a7ca-97e3296ce33c' (450004760 entries, cache depth 21) done in 00:00:00.1096874.

cache-midpoints-in-ptable branch, --skip-db-verify : 372s merge, 37s load

[03579,11,11:37:05.241] PTables merge started (specialized for <= 2 tables).
[03579,13,11:37:05.265] === Writing E8@138716442:{7ef0b647-131a-4c4e-b1d8-d4fd06660c02} (previous epoch at 138497188).
[03579,13,11:37:05.271] === Update Last Epoch E8@138716442:{7ef0b647-131a-4c4e-b1d8-d4fd06660c02} (previous epoch at 138497188).
[03579,20,11:37:05.279] PROJECTIONS: Starting Projections Core Coordinator. (Node State : Master)
[03579,15,11:37:05.280] ========== [127.0.0.1:2112] Sub System 'Projections' initialized.
[03579,12,11:37:05.291] Created stats stream '$stats-127.0.0.1:2113', code = WrongExpectedVersion
[03579,20,11:37:05.301] PROJECTIONS: Starting Projections Manager. (Node State : Master)
[03579,20,11:37:05.301] PROJECTIONS: SubComponent Started: EventReaderCoreService
[03579,20,11:37:05.311] PROJECTIONS: Resetting Worker Writer
[03579,11,11:43:17.177] Cached 2097152 index midpoints to PTable
[03579,11,11:43:17.288] PTables merge finished in 00:06:12.0402902 ([450000000, 450000000] entries merged into 450004760).
[03579,11,11:43:17.288] Loading and Verification of PTable '3035b688-f67e-4586-bdd3-f255df9e2c81' started...
[03579,11,11:43:54.567] Loading PTable (Version: 4) '3035b688-f67e-4586-bdd3-f255df9e2c81' (450004760 entries, cache depth 21) done in 00:00:37.2788743.

oss-v4.0.3, --skip-db-verify : 327s merge, 78s load

[09242,11,12:44:13.034] PTables merge started (specialized for <= 2 tables).
[09242,14,12:44:13.036] ========== [127.0.0.1:2112] IS UNKNOWN...
[09242,20,12:44:13.048] Subscriptions received state change to Unknown stopping listening.
[09242,14,12:44:13.058] ELECTIONS: STARTING ELECTIONS.
[09242,14,12:44:13.058] ELECTIONS: (V=0) SHIFT TO LEADER ELECTION.
[09242,14,12:44:13.059] ELECTIONS: (V=0) VIEWCHANGE FROM [127.0.0.1:2112, {fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}].
[09242,14,12:44:13.059] ELECTIONS: (V=0) MAJORITY OF VIEWCHANGE.
[09242,14,12:44:13.060] ELECTIONS: (V=0) SHIFT TO PREPARE PHASE.
[09242,14,12:44:13.060] ELECTIONS: (V=0) PREPARE_OK FROM [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}).
[09242,14,12:44:13.061] ELECTIONS: (V=0) SHIFT TO REG_LEADER.
[09242,14,12:44:13.064] ELECTIONS: (V=0) SENDING PROPOSAL CANDIDATE: [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}), ME: [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}).
[09242,14,12:44:13.065] ELECTIONS: (V=0) ACCEPT FROM [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}] M=[127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}]).
[09242,14,12:44:13.065] ELECTIONS: (V=0) DONE. ELECTED MASTER = [127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}). ME=[127.0.0.1:2112,{fcb5b77f-ac15-4ab1-aac0-e5d2cf8bda87}](L=139695236,W=139709810,C=139709810,E10@139505353:{ffeb9893-ca9f-430f-8e70-60f53d30f715}).
[09242,14,12:44:13.066] ========== [127.0.0.1:2112] PRE-MASTER STATE, WAITING FOR CHASER TO CATCH UP...
[09242,06,12:44:13.066] Subscriptions received state change to PreMaster stopping listening.
[09242,14,12:44:13.067] ========== [127.0.0.1:2112] IS MASTER... SPARTA!
[09242,10,12:44:13.068] Subscriptions Became Master so now handling subscriptions
[09242,12,12:44:13.096] === Writing E11@139709810:{a25b9059-d068-4147-9636-7fa03cae319a} (previous epoch at 139505353).
[09242,12,12:44:13.102] === Update Last Epoch E11@139709810:{a25b9059-d068-4147-9636-7fa03cae319a} (previous epoch at 139505353).
[09242,22,12:44:13.107] PROJECTIONS: Starting Projections Core Coordinator. (Node State : Master)
[09242,14,12:44:13.108] ========== [127.0.0.1:2112] Sub System 'Projections' initialized.
[09242,22,12:44:13.119] PROJECTIONS: Starting Projections Manager. (Node State : Master)
[09242,18,12:44:13.131] Created stats stream '$stats-127.0.0.1:2113', code = WrongExpectedVersion
[09242,22,12:44:13.141] PROJECTIONS: Resetting Worker Writer
[09242,12,12:49:37.988] SLOW QUEUE MSG [StorageWriterQueue]: WritePrepares - 1184ms. Q: 0/0.
[09242,11,12:49:40.493] PTables merge finished in 00:05:27.4554345 ([450000000, 450000000] entries merged into 450004760).
[09242,11,12:49:40.493] Loading and Verification of PTable '2d65e3b9-6194-4438-bf1a-be1625d5087a' started...
[09242,11,12:50:59.078] Loading PTable (Version: 4) '2d65e3b9-6194-4438-bf1a-be1625d5087a' (450004760 entries, cache depth 21) done in 00:01:18.5845333.
Cache midpoints in PTables to allow faster startup times.
- Add a new version of PTable, V4:
1) keep a 128 byte footer before the md5 hash
2) first 4 bytes of the footer stores an unsigned int: the number of midpoints cached in the PTable
3) Midpoints will be cached between the index entries and the footer
4) Each midpoint index entry has the same size as an index entry (24 bytes for V4)

Midpoints will be cached during the next PTable merge.
On startup, if index verification is disabled (--skip-index-verify) and cached midpoints (with the same depth requested) are present in the file, they will be loaded directly into the midpoints memory cache.
Otherwise, they will be loaded normally, by computing the midpoint locations and reading these entries from the PTable.

- Refactor Index tests to be able to parametrize them
skipIndexVerify added as parameter to multiple tests

- Add additional assertions to be safer when running without index verification.
If any assertion fails when reading an index entry, we force index verification (by creating a .forceverify file in index/) on next startup.

@hayley-jean hayley-jean merged commit 804fac0 into release-v4.0.4 Dec 12, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
wercker/build-mono4 Wercker pipeline passed
Details

@hayley-jean hayley-jean deleted the cache-midpoints-in-ptable branch Dec 12, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment