Mark Papadakis edited this page May 24, 2017 · 3 revisions

Trinity relies on codec implementations for creating and accessing segments/index sources. See concepts for more information on segments and index sources. You should read codecs.h, where all classes/interfaces and methods are documented.

Two codecs are included in the Trinity distribution: one based on Google’s codec, described in Challenges in Building Large-Scale Information Retrieval Systems by Jeff Dean, and another based on the default codec of the open source Apache Lucene.

It’s easy to create new codecs: subclass the various Trinity::Codecs classes and implement their virtual methods in your derived classes. Look at the shipped codec implementations to see how that works.
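To give a feel for the subclass-and-override pattern, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the `sketch` namespace, `PostingsEncoder`, `PostingsDecoder`, the varint delta classes and `round_trip()` are hypothetical names, not Trinity's actual interfaces. The real base classes and their virtual methods are the ones documented in codecs.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for codec base classes. These are NOT the
// Trinity::Codecs interfaces; consult codecs.h for the real ones.
namespace sketch {

struct PostingsEncoder {
    virtual ~PostingsEncoder() = default;
    // Append one document ID to the postings list of a single term.
    virtual void add_document(std::uint32_t doc_id) = 0;
    virtual const std::vector<std::uint8_t> &bytes() const = 0;
};

struct PostingsDecoder {
    virtual ~PostingsDecoder() = default;
    // Store the next document ID in doc_id; return false once exhausted.
    virtual bool next(std::uint32_t &doc_id) = 0;
};

// A toy codec: stores the gap between consecutive IDs as a varint.
class VarintDeltaEncoder final : public PostingsEncoder {
public:
    void add_document(std::uint32_t doc_id) override {
        assert(first_ || doc_id > last_); // IDs must be strictly increasing
        std::uint32_t delta = doc_id - last_;
        while (delta >= 0x80) {
            // Low 7 bits, with the high bit set to mean "more bytes follow".
            out_.push_back(static_cast<std::uint8_t>((delta & 0x7F) | 0x80));
            delta >>= 7;
        }
        out_.push_back(static_cast<std::uint8_t>(delta));
        last_ = doc_id;
        first_ = false;
    }

    const std::vector<std::uint8_t> &bytes() const override { return out_; }

private:
    std::vector<std::uint8_t> out_;
    std::uint32_t last_ = 0;
    bool first_ = true;
};

class VarintDeltaDecoder final : public PostingsDecoder {
public:
    // The decoder only references the buffer, which must outlive it.
    explicit VarintDeltaDecoder(const std::vector<std::uint8_t> &data) : data_(data) {}

    bool next(std::uint32_t &doc_id) override {
        if (pos_ >= data_.size()) return false;
        std::uint32_t delta = 0;
        int shift = 0;
        while (pos_ < data_.size()) {
            const std::uint8_t b = data_[pos_++];
            delta |= static_cast<std::uint32_t>(b & 0x7F) << shift;
            if ((b & 0x80) == 0) break; // last byte of this varint
            shift += 7;
        }
        last_ += delta;
        doc_id = last_;
        return true;
    }

private:
    const std::vector<std::uint8_t> &data_;
    std::size_t pos_ = 0;
    std::uint32_t last_ = 0;
};

// Encode and decode through the base-class interfaces only, the way
// calling code that is unaware of the concrete codec would.
inline std::vector<std::uint32_t> round_trip(const std::vector<std::uint32_t> &ids) {
    VarintDeltaEncoder enc_impl;
    PostingsEncoder &enc = enc_impl;
    for (const std::uint32_t id : ids) enc.add_document(id);

    VarintDeltaDecoder dec_impl(enc.bytes());
    PostingsDecoder &dec = dec_impl;
    std::vector<std::uint32_t> decoded;
    for (std::uint32_t id = 0; dec.next(id);) decoded.push_back(id);
    return decoded;
}

} // namespace sketch
```

The calling code in `round_trip()` only touches the abstract interfaces, so swapping in a different encoding means writing another pair of derived classes. A real Trinity codec has to cover considerably more than document IDs, so treat this purely as an outline of the pattern.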

The included Lucene-based codec is generally the better fit for faster search and a smaller index, but that depends on the content you index, so you may want to try both and see which works better for you. You can of course also extend them or create your own, which only requires implementing a few interfaces. Writing your own is probably a good idea if you want a memory-resident, optimised index (though the included codecs can also be used for strictly in-memory access).
