
Scavenge index #1638

Merged: 10 commits merged into EventStore:master from lscpike:feature/scavenge_index on Nov 27, 2018

Conversation

@lscpike (Contributor) commented May 24, 2018

This is based on PR #1637.

The index currently gets scavenged during a level merge. This can happen at any time and can impact the performance of the node. It is also very inefficient, as each level performs a new scavenge even though it may have just been scavenged for a lower level. If the database hasn't been scavenged since the last merge, this is a complete waste of time.

This PR adds an explicit scavenge operation to the TableIndex. This is called at the end of a DB scavenge, meaning there is a definite benefit to doing the work. It also means it can be stopped during high-load situations (thanks to PR #1632).

Notes

  • The index scavenge blocks the awaiting tables queue. This shouldn't be a problem as the impact is no different to a normal merge.
  • The scavenge code is based upon the merge code.
  • There's a decent test suite in the PR.
  • The merge still has a lot of the scavenge code in there for the v1 upgrade path. If v1 support is dropped in the future, this code gets a lot smaller.
  • The activity is logged to the $scavenges stream to fit the general pattern.
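
(For illustration, a rough sketch of the idea in C#. All type and method names below are stand-ins, not the PR's actual TableIndex/PTable code: it shows an index scavenge invoked once at the end of a DB scavenge, with a cancellation token so it can be stopped under load.)

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Illustrative stand-ins only; the real TableIndex/PTable types look different.
public sealed class IndexTableSketch
{
    public List<long> Entries { get; } = new List<long>();
}

public sealed class TableIndexSketch
{
    private readonly List<IndexTableSketch> _tables = new List<IndexTableSketch>();

    // Called once at the end of a DB scavenge, rather than on every level merge,
    // so the work is only done when there is a definite benefit.
    public void Scavenge(Func<long, bool> shouldKeep, CancellationToken ct)
    {
        foreach (var table in _tables)
        {
            // Honour the stop mechanism from PR #1632: abort cleanly under high load.
            ct.ThrowIfCancellationRequested();

            var kept = new List<long>();
            foreach (var entry in table.Entries)
                if (shouldKeep(entry))
                    kept.Add(entry);

            table.Entries.Clear();
            table.Entries.AddRange(kept);
        }
    }
}
```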

@lscpike (Contributor, Author) commented May 24, 2018

Just to note, this is my last PR around scavenging.

@gregoryyoung (Contributor) commented:

Can you squash the changes?

@lscpike (Contributor, Author) commented May 25, 2018 via email

@gregoryyoung (Contributor) commented:

I meant just for this one (in case we later need to revert). It's far easier to have the commit history with one commit for the feature.

@lscpike (Contributor, Author) commented May 25, 2018

Makes sense! Do you want the same for the others, faster scavenge and scavenge log clean-up? I need to rebase anyway.

@lscpike force-pushed the feature/scavenge_index branch 3 times, most recently from e30f853 to 724043a on May 29, 2018 10:07
@lscpike (Contributor, Author) commented Jun 1, 2018

We realized that the OptimizeIndexMerge flag results in a lot of memory sitting unused when the index scavenge is complete. As this optimisation now only applies when doing a scavenge, it makes sense to free the memory at the end of the scavenge process.
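
(Rough sketch of the shape of that change; the cache field and delegates are illustrative stand-ins for whatever the OptimizeIndexMerge optimisation actually allocates.)

```csharp
using System;

// Illustrative sketch only: the field name and delegate shapes are stand-ins for
// the data the OptimizeIndexMerge optimisation keeps in memory.
public sealed class IndexScavengerSketch
{
    private byte[] _existenceCache;   // memory the optimisation builds up front

    public void Scavenge(Func<byte[]> buildCache, Action<byte[]> scavengeWithCache)
    {
        _existenceCache = buildCache();
        try
        {
            scavengeWithCache(_existenceCache);
        }
        finally
        {
            // Release the optimisation's memory as soon as the scavenge finishes,
            // instead of leaving it sitting unused afterwards.
            _existenceCache = null;
        }
    }
}
```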

I can squash in if you want, but figured it's easier to see what we've changed as is for now.

@megakid (Contributor) commented Jun 20, 2018

Any movement on this, or if not, a timeline for progression?

@riccardone self-requested a review July 26, 2018 10:24
@riccardone self-requested a review July 26, 2018 14:48
@riccardone (Contributor) commented:

@lscpike @megakid we like the stop operation that is implemented in your PR. Thanks for that.
We are not sure about the changes you have made with multi-threading. The scavenge operation as it is today is very simple, and we would like to keep it simple in order to serve all the different scenarios and environments where our clients are running ES.
Would it be possible to keep the current implementation as is (or with minimal modification) and provide a separate implementation with parallel/multi-threaded execution?
Something driven by a command-line param (--parallel-scavenge) or HTTP?
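
(Sketch of the suggested split; the flag wiring and helper names are hypothetical, not EventStore's real option parsing.)

```csharp
using System;
using System.Threading.Tasks;

public static class ScavengeDispatchSketch
{
    // parallelScavenge would come from a hypothetical --parallel-scavenge option.
    public static void Run(bool parallelScavenge, string[] chunks, Action<string> scavengeChunk)
    {
        if (parallelScavenge)
            Parallel.ForEach(chunks, scavengeChunk);   // opt-in multi-threaded path
        else
            foreach (var chunk in chunks)              // default: the existing simple, sequential path
                scavengeChunk(chunk);
    }
}
```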

@lscpike (Contributor, Author) commented Jul 27, 2018 via email

@riccardone (Contributor) commented Jul 27, 2018

I get that there is a thread count param. I was asking about the possibility of keeping your new multi-threaded scavenge implementation separate from the existing one and using yours on demand. That's the reason for the --parallel-scavenge suggestion. As an alternative to the command-line param, the presence of the HTTP thread count option could be used to decide whether to use your new implementation or the original.

@lscpike (Contributor, Author) commented Jul 27, 2018

The parallel implementation is only about 3 lines of code. Are you more concerned about the performance optimizations we've made?

Would it be possible to start merging the prior PRs to make the faster scavenge and index scavenge PRs easier to review? I'll rebase the others as each PR is merged to give a clean diff. We can then focus on getting the scavenge optimizations as low risk as possible for you.

@gregoryyoung (Contributor) commented:

The comment was more that the parallel version can work out to be slower, due to disk head thrashing vs. sequential reading.

@lscpike (Contributor, Author) commented Jul 27, 2018

The default thread count of 1 will result in sequential reads of the chunks. It's then up to the user not to use the hidden option (not in the UI) if it might decrease performance on their disks. That can be covered in the documentation if required.
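
(To illustrate, not the PR's code: with a ParallelOptions-style degree of parallelism, a thread count of 1 degenerates to a sequential pass over the chunks, so higher counts are an explicit opt-in.)

```csharp
using System;
using System.Threading.Tasks;

public static class ChunkScavengeSketch
{
    // With threads = 1 (the default), MaxDegreeOfParallelism = 1 means chunks are
    // processed one at a time, i.e. sequential reads.
    public static void ScavengeChunks(string[] chunks, int threads, Action<string> scavengeChunk)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = threads };
        Parallel.ForEach(chunks, options, scavengeChunk);
    }
}
```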

@riccardone added this to the v4.2 (RC) milestone Aug 22, 2018
@lscpike (Contributor, Author) commented Sep 20, 2018

We're running this in production as a custom build now. Thought you might be interested to see the scavenge log.

[image: scavenge log screenshot]

@shaan1337 (Member) commented:

@lscpike @megakid Kindly also rebase these commits (including 6b18816)

@lscpike (Contributor, Author) commented Nov 5, 2018

Rebased as requested.

riccardone previously approved these changes Nov 9, 2018
@riccardone (Contributor) commented:

@lscpike I'm doing the final tests of this PR.
I wonder if it would be possible to log the 'total space saved' as it was before the changes?
https://github.com/lscpike/EventStore/blob/feature/scavenge_index/src/EventStore.Core/TransactionLog/Chunks/TFChunkScavenger.cs#L204
example:
SCAVENGING: total time taken: 00:08:47.3769352, total space saved: 802301658
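
(A minimal sketch of how such a line could be produced; the stopwatch and size totals are stand-ins for values TFChunkScavenger already tracks, and the real code uses EventStore's logger rather than Console.)

```csharp
using System;
using System.Diagnostics;

public static class ScavengeLogSketch
{
    // oldSize/newSize stand in for the before/after chunk totals the scavenger collects.
    public static void LogResult(Stopwatch sw, long oldSize, long newSize)
    {
        var spaceSaved = oldSize - newSize;
        Console.WriteLine($"SCAVENGING: total time taken: {sw.Elapsed}, total space saved: {spaceSaved}");
    }
}
```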

@shaan1337 added the area/documentation label Nov 12, 2018
@riccardone (Contributor) commented:

@lscpike, in PTableConstruction, can I ask why the new method is called Scavenged and not Scavenge?

@lscpike (Contributor, Author) commented Nov 12, 2018 via email

@lscpike (Contributor, Author) commented Nov 12, 2018

I just went to fix it and realised it's a factory method. It's creating a "Scavenged" PTable. I'll leave it as is unless you have a different opinion.
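
(To spell out the convention with an illustrative signature, not the actual PTable.Scavenged overload: the method is a factory that returns a new, already-scavenged table rather than scavenging one in place, so the past-participle name reads naturally at the call site.)

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: the real PTable.Scavenged factory works on index files, not in-memory lists.
public sealed class PTableSketch
{
    private readonly List<long> _entries = new List<long>();

    // A factory: it does not scavenge 'source' in place, it builds and returns
    // a new, already-scavenged table, hence "Scavenged" rather than "Scavenge".
    public static PTableSketch Scavenged(PTableSketch source, Func<long, bool> shouldKeep)
    {
        var result = new PTableSketch();
        foreach (var entry in source._entries)
            if (shouldKeep(entry))
                result._entries.Add(entry);
        return result;
    }
}
```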

riccardone previously approved these changes Nov 13, 2018

@riccardone (Contributor) left a review:

I tested these changes with small, medium and large databases and I confirm that the code works as expected. Well done, this is a very good improvement. Thanks.

riccardone previously approved these changes Nov 26, 2018

@riccardone (Contributor) commented Nov 26, 2018

@lscpike it's probably better if you rebase your PR (not me). Now fw is on 4.7.1 and there is only a simple conflict to fix on index_map. After that I will proceed with merging the Faster_scavenge PR and this one into master. Thanks.

James Connor and others added 10 commits November 26, 2018 16:01
- Ensure we don't have too many threads
- More efficient memory use to avoid BitArray.
- Smaller struct for in memory copy of chunk.
… scavenge twice to collect up commits spanning multiple chunks.
- Test the TableIndex scavenge process.
- Test the index is scavenged at the end of a DB scavenge.
- Remove the ExistsAt check when doing an index table merge. The existsAt is still used when upgrading from a v1 index.
@lscpike (Contributor, Author) commented Nov 26, 2018

I rebased and resolved this. This branch includes the faster_scavenge changes. Do you want me to rebase that PR too or are you happy to merge all from here?

@riccardone (Contributor) commented:

I'm happy to merge all from here, thanks.

@riccardone (Contributor) approved these changes.

@lscpike mentioned this pull request Nov 26, 2018
@riccardone merged commit ae4d359 into EventStore:master Nov 27, 2018
@ChrisChinchilla (Contributor) commented:

Docs noted @riccardone and @shaan1337

@lscpike deleted the feature/scavenge_index branch December 6, 2018 11:46
@megakid mentioned this pull request Dec 27, 2018
Labels: area/documentation
Participants: 6