Zero downtime upgrades, merge-based index construction #42

Merged
vlofgren merged 30 commits into master from no-downtime-upgrades on Aug 29, 2023

@vlofgren vlofgren commented Aug 24, 2023

This makes modifications to the loader so that the live production environment doesn't need to be taken offline to prepare a new index.

A big change is keeping the URL database in a separate SQLite database instead of MariaDB. This removes the need to take the system offline during loading.

It also moves the index construction bits out of the index-server and into a separate process, which makes it possible to construct the index with a different version of the logic than the one the index service is running. A very neat side-effect of this is that you get a sort of dehydrated index you can back up and restore to roll back a problematic release with minimal downtime.

The pull request also deprecates the lexicon service altogether, and almost completely rewrites index-construction to use an index merging based approach that does not require as much RAM.

It also reduces the RAM requirements of the index service by a lot, since it no longer needs a lexicon. This makes the index faster because it can run with a heap below 32 GB and use CompressedOops. The index service also no longer needs to load the lexicon on start-up, enabling it to restart instantaneously.

Have a single class responsible for encoding and decoding URL ids, as it's a bit finicky and used all over.
Deprecate the LoadUrl instruction entirely. We no longer need to be told upfront about which URLs to expect, as IDs are generated from the domain id and document ordinal.
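To illustrate why a single codec class helps here, the following is a minimal sketch of packing a domain id and a document ordinal into one 64-bit id. The class name `UrlIdCodecSketch` and the 26-bit ordinal width are assumptions made for illustration; the bit layout actually used in this PR is not stated above and may differ.

```java
/**
 * Hypothetical sketch, not the class from this PR: packs a domain id and a
 * document ordinal into a single 64-bit URL id and unpacks it again.
 */
final class UrlIdCodecSketch {
    // Assumed layout: the low 26 bits hold the document ordinal and the bits
    // above them hold the domain id. The real widths are not given in the PR.
    static final int ORDINAL_BITS = 26;
    static final long ORDINAL_MASK = (1L << ORDINAL_BITS) - 1;

    private UrlIdCodecSketch() {}

    /** Combines a non-negative domain id and ordinal into one id. */
    static long encode(int domainId, int ordinal) {
        if (domainId < 0) {
            throw new IllegalArgumentException("domainId must be non-negative");
        }
        if (ordinal < 0 || ordinal > ORDINAL_MASK) {
            throw new IllegalArgumentException("ordinal out of range");
        }
        return ((long) domainId << ORDINAL_BITS) | ordinal;
    }

    /** Extracts the domain id from an encoded id. */
    static int domainId(long id) {
        return (int) (id >>> ORDINAL_BITS);
    }

    /** Extracts the document ordinal from an encoded id. */
    static int ordinal(long id) {
        return (int) (id & ORDINAL_MASK);
    }
}
```

Because the id is a pure function of the domain id and the ordinal, nothing has to be announced upfront, and with this assumed layout all ids for one domain sort next to each other.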

For now, we no longer store new URLs in different domains.  We need to re-implement this somehow, probably in a different job or as a different output.
Also refactor along the way.  Really needs an additional pass, these tests are very hairy.
They seemed like a good idea at the time, but in practice they're wasting resources and not really providing the clarity I had hoped.
SWAP_LEXICON doesn't instruct the index service to do anything.  It just moves the file.
It's not necessary anymore with the new linkdb.
This provides a much cleaner separation of concerns, and makes it possible to get rid of a lot of the gunkier parts of the index service.  It will also permit lowering the Xmx on the index service a fair bit, so we can get CompressedOOps again :D
@vlofgren vlofgren changed the title WIP: No downtime upgrades WIP: Zero downtime upgrades Aug 25, 2023
This is a system-wide change.  The index used to have a lexicon, mapping words to wordIds using a large in-memory hash table.  This made index-construction easier, but it also added a fairly significant RAM penalty to both the index service and the loader.

The new design moves to 64 bit word identifiers calculated using the murmur hash of the keyword, and an index construction based on merging smaller indices.
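The merging idea can be sketched in a few lines. This is an in-memory illustration only, under the assumption that each small index maps a 64-bit word id to a sorted, duplicate-free array of document ids; the names `IndexMergeSketch`, `mergePostings` and `merge` are hypothetical, and the construction in this PR works on files rather than on `TreeMap`s.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of merging two small indices into a larger one. */
final class IndexMergeSketch {
    private IndexMergeSketch() {}

    /** Merges two sorted, duplicate-free arrays into one sorted, duplicate-free array. */
    static long[] mergePostings(long[] a, long[] b) {
        long[] out = new long[a.length + b.length];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                out[n++] = a[i++];
            } else if (a[i] > b[j]) {
                out[n++] = b[j++];
            } else {
                // Same document id in both inputs: keep a single copy.
                out[n++] = a[i++];
                j++;
            }
        }
        while (i < a.length) out[n++] = a[i++];
        while (j < b.length) out[n++] = b[j++];
        return Arrays.copyOf(out, n);
    }

    /** Merges two indices keyed by word id; the inputs are left unmodified. */
    static TreeMap<Long, long[]> merge(TreeMap<Long, long[]> left,
                                       TreeMap<Long, long[]> right) {
        TreeMap<Long, long[]> merged = new TreeMap<>(left);
        for (Map.Entry<Long, long[]> e : right.entrySet()) {
            // Word ids present in both inputs get their posting lists merged.
            merged.merge(e.getKey(), e.getValue(), IndexMergeSketch::mergePostings);
        }
        return merged;
    }
}
```

Since the word ids are computed from the keywords themselves, two indices built independently agree on the id of every word, so they can be merged without consulting a shared lexicon. Only the inputs currently being merged need to be in view at once, which is where the lower RAM requirement comes from.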

It also became necessary half-way through to upgrade Guice, as its error reporting wasn't *quite* compatible with JDK 20.
@vlofgren vlofgren changed the title WIP: Zero downtime upgrades WIP: Zero downtime upgrades, merge-based index construction Aug 28, 2023
* Reduce memory churn in LoaderIndexJournalWriter, fix bug with keyword mappings as well
* Remove remains of OldDomains
* Ensure LOADER_PROCESS_OPTS gets fed to the processes
* LinkdbStatusWriter no longer executes the batch after every added item once 100 items have been added
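
The last item reads like the classic batching bug where the pending count is never reset, so every add after the threshold triggers a flush. Below is a minimal sketch of the intended behaviour, assuming that reading is correct. `BatchingWriterSketch` and its `sink` callback are hypothetical stand-ins, not the code of `LinkdbStatusWriter`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: buffers items and hands them to a sink in fixed-size batches. */
final class BatchingWriterSketch<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchingWriterSketch(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and flushes only when a full batch has accumulated. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends the buffered items to the sink and empties the buffer. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending));
        // Clearing the buffer is the step that keeps the next add from
        // flushing again immediately.
        pending.clear();
    }
}
```

With a batch size of 100, adding 250 items produces two full batches, and a final `flush()` writes the remaining 50.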
@vlofgren vlofgren marked this pull request as ready for review August 29, 2023 15:05
@vlofgren vlofgren merged commit bdcbfb1 into master Aug 29, 2023
@vlofgren vlofgren changed the title WIP: Zero downtime upgrades, merge-based index construction Zero downtime upgrades, merge-based index construction Sep 14, 2023
@vlofgren vlofgren deleted the no-downtime-upgrades branch March 21, 2024 13:43