Do not publish StorageMessage.EventCommitted messages when rebuilding index (1.8x index rebuild speedup) #1829

Merged
merged 1 commit into master from fix-index-init on Jan 25, 2019

shaan1337 commented Jan 16, 2019

Symptoms:

  • A database can take 1-2 minutes to start up even after the index rebuild of a partial MemTable is complete.
  • MainQueue usage stays at 100% during and after the index rebuild, for around 1.5 minutes.
  • The EventStore UI is unresponsive during these ~1.5 minutes.

Description

  • Publishing StorageMessage.EventCommitted messages during index rebuilds is unnecessary (since these are not live events) and causes slow startup times when there are many events to rebuild.

  • This is particularly visible after the switch to Mono 5.16, since the main queue subscribers initialize faster and the StorageMessage.EventCommitted messages are actually processed by all subscribers. It can be reproduced quite reliably (~80% of the time) when just under 1M events (an almost full MemTable) need to be rebuilt. There is a race condition between the subscribers initializing and subscribing to the MainQueue and the StorageMessage.EventCommitted messages being published. When the subscribers have already initialized and subscribed, startup can take up to ~1.5 minutes, during which EventStore is unresponsive. If the subscribers haven't initialized yet, startup completes within 5-10 seconds.

  • As a side effect, this fix also speeds up full index rebuilds by approximately 1.8x relative to the same build without the fix (about 11.96M records in the example below):
    4.1.1-hotfix1, Mono 4.6.2, before fix:
    ReadIndex rebuilding done: total processed 11963929 records, time elapsed: 00:01:41.7778130.
    5.0.0 RC2, Mono 5.16, before fix:
    ReadIndex rebuilding done: total processed 11963981 records, time elapsed: 00:01:29.4323720.
    5.0.0 RC2, Mono 5.16, after fix:
    ReadIndex rebuilding done: total processed 11964084 records, time elapsed: 00:00:49.3172040.
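
For reference, the ~1.8x figure can be checked from the two 5.0.0 RC2 log lines above. The snippet below is only an illustrative calculation; the helper name `parse_elapsed` is made up for this example and is not part of EventStore.

```python
from datetime import timedelta


def parse_elapsed(text: str) -> timedelta:
    """Parse an hh:mm:ss.fffffff elapsed time as printed in the log lines."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# Elapsed times copied from the 5.0.0 RC2 / Mono 5.16 log lines above.
before_fix = parse_elapsed("00:01:29.4323720")
after_fix = parse_elapsed("00:00:49.3172040")

# Dividing two timedeltas yields a plain float ratio.
speedup = before_fix / after_fix
print(f"speedup: {speedup:.2f}x")
```

The ratio comes out at roughly 1.8. Measured against the 4.1.1-hotfix1 run (00:01:41.7778130) the ratio would be slightly above 2.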

Do not publish event committed messages when rebuilding index
Co-Authored-By: Lokhesh Ujhoodha <iamyog@hotmail.co.uk>
Co-Authored-By: Avish Cheetaram <avishcheetaram.ai@gmail.com>
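
The idea behind the change can be sketched roughly as follows. This is a toy model in Python, not the actual EventStore code (which is C#): `Bus`, `IndexCommitter`, `commit` and the `is_rebuild` flag are all hypothetical names chosen for illustration, and `EventCommitted` only stands in for StorageMessage.EventCommitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EventCommitted:
    """Stand-in for StorageMessage.EventCommitted (hypothetical shape)."""
    commit_position: int


@dataclass
class Bus:
    """Toy stand-in for the MainQueue: hands each message to every subscriber."""
    subscribers: List[Callable[[EventCommitted], None]] = field(default_factory=list)
    published: int = 0

    def publish(self, message: EventCommitted) -> None:
        self.published += 1
        for handler in self.subscribers:
            handler(message)


class IndexCommitter:
    """Hypothetical committer that indexes records and notifies the bus."""

    def __init__(self, bus: Bus) -> None:
        self._bus = bus
        self.indexed: List[int] = []

    def commit(self, position: int, is_rebuild: bool) -> None:
        # The record is always added to the index.
        self.indexed.append(position)
        # Replayed records are not live events, so nothing is published
        # for them; only live commits reach the subscribers.
        if not is_rebuild:
            self._bus.publish(EventCommitted(position))


bus = Bus()
seen: List[int] = []
bus.subscribers.append(lambda msg: seen.append(msg.commit_position))

committer = IndexCommitter(bus)
for pos in range(1000):          # index rebuild at startup
    committer.commit(pos, is_rebuild=True)
committer.commit(1000, is_rebuild=False)  # first live commit afterwards

print(len(committer.indexed), bus.published, seen)
```

In this model the rebuild still indexes every record, but the subscribers only receive the single live commit, so the queue is not flooded during startup.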

@shaan1337 shaan1337 force-pushed the fix-index-init branch from a4a8653 to 9ab261e Jan 16, 2019

@Lougarou Lougarou merged commit 14976c9 into master Jan 25, 2019

@Lougarou Lougarou deleted the fix-index-init branch Jan 25, 2019
