We developed YoctoDB to replace Apache Lucene as the document search engine in Yandex.Auto. As the document database grew, we became dissatisfied with overall latency and CPU consumption. See our internal benchmarks to get an impression of what we were able to achieve.
The Yandex.Auto backend continuously indexes a partitioned set of documents. Each partition is indexed independently and concurrently. Indexing a partition produces a search index for that partition, which is then transferred to the search machines. The search machines periodically rebuild (reopen) a composite database from the most recent versions of all the partition indexes to pick up the latest updates. The composite database is self-contained and read-only, and it is used to process user search requests.
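The reopen cycle described above can be illustrated with a small sketch. It is written in Python for brevity and does not use the YoctoDB API: `PartitionIndex`, `CompositeDatabase`, `SearchMachine`, `receive` and `reopen` are hypothetical names chosen for this illustration, the integer version number is an assumption, and concurrent indexing of partitions is left out.

```python
from bisect import bisect_left
from typing import Dict, Iterable, List, Optional


class PartitionIndex:
    """Hypothetical immutable index built for one partition."""

    def __init__(self, version: int, docs: Dict[str, str]) -> None:
        self.version = version
        # All data is known at build time, so sort once and never mutate.
        self._keys: List[str] = sorted(docs)
        self._values: List[str] = [docs[k] for k in self._keys]

    def get(self, key: str) -> Optional[str]:
        # Binary search over the sorted keys.
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None


class CompositeDatabase:
    """Hypothetical read-only view over one index per partition."""

    def __init__(self, partitions: Iterable[PartitionIndex]) -> None:
        # Snapshot the partition list; later updates do not affect this view.
        self._partitions = tuple(partitions)

    def get(self, key: str) -> Optional[str]:
        for partition in self._partitions:
            value = partition.get(key)
            if value is not None:
                return value
        return None


class SearchMachine:
    """Hypothetical search machine that receives indexes and reopens."""

    def __init__(self) -> None:
        self._latest: Dict[str, PartitionIndex] = {}
        self.db = CompositeDatabase([])

    def receive(self, partition_id: str, index: PartitionIndex) -> None:
        # Keep only the most recent version of each partition index.
        current = self._latest.get(partition_id)
        if current is None or index.version > current.version:
            self._latest[partition_id] = index

    def reopen(self) -> None:
        # Build a new composite database and swap it in as a whole.
        self.db = CompositeDatabase(self._latest.values())
```

In this sketch a newly received partition index becomes visible to searches only after the next `reopen()`, and a composite database obtained before the swap keeps answering from the data it was built on.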
The main idea behind YoctoDB is to provide a way to build an efficient read-only database that takes all the supplied data into account and spends as many indexing resources as necessary to improve search characteristics such as latency and CPU consumption.