Description
Is your feature request related to a problem? Please describe
docValues are a columnar storage format used by Lucene to store indexed data in a way that facilitates efficient aggregation, sorting, etc.
Stored fields, on the other hand, are used to store the actual values of fields as they were inserted into the index.
Currently, the _source field stores the original document as a stored field. We could skip storing the _source field when its values are already stored in docValues, and retrieve the field values from docValues instead. This would reduce storage cost significantly. Depending on the nature of the query, we could skip or fetch some or all of the fields from docValues to serve search queries.
Describe the solution you'd like
Add a new parameter in MappedFieldType to indicate whether deriving source is supported for the field, something similar to the below:
private final boolean derivedSourceSupported;
public boolean isDerivedSourceSupported() {
    return derivedSourceSupported;
}
Add a new method in FieldMapper with the following signature:
public void buildDerivedSource(XContentBuilder builder, LeafReader leafReader, int docId) throws IOException {
    // Implement this method in the respective mappers
}
We will implement this method in every subclass that supports deriving source.
Usage: we will call this method in the Search or Get path, passing the leafReader (to read the docValues from) and the docId (for which we need the docValues). We will also pass an XContentBuilder, to which the derived source is added. The method will be called for every FieldMapper we have, and the source will finally be set from the contents of the resulting builder.
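To make the intended flow concrete, here is a rough, self-contained sketch of how the pieces above might fit together. It deliberately does not use the real OpenSearch or Lucene classes: DocValuesReader, FieldMapperSketch, DocValuesBackedMapper and deriveSource are hypothetical stand-ins invented for illustration, a plain Map stands in for XContentBuilder, and the IOException from the proposed signature is dropped for brevity. The actual implementation may look quite different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: every type here is a simplified, hypothetical stand-in,
// not the real OpenSearch FieldMapper/MappedFieldType or Lucene LeafReader.
public class DerivedSourceSketch {

    /** Stand-in for the per-segment reader that exposes doc values. */
    public interface DocValuesReader {
        /** Returns the values held for this field and document, or an empty list. */
        List<Object> values(String field, int docId);
    }

    /** Stand-in combining FieldMapper with the proposed derivedSourceSupported flag. */
    public abstract static class FieldMapperSketch {
        private final String name;
        private final boolean derivedSourceSupported;

        protected FieldMapperSketch(String name, boolean derivedSourceSupported) {
            this.name = name;
            this.derivedSourceSupported = derivedSourceSupported;
        }

        public String name() {
            return name;
        }

        public boolean isDerivedSourceSupported() {
            return derivedSourceSupported;
        }

        /** Mirrors the proposed buildDerivedSource; the Map plays the role of XContentBuilder. */
        public abstract void buildDerivedSource(Map<String, Object> builder, DocValuesReader reader, int docId);
    }

    /** Example mapper whose values can be rebuilt from doc values. */
    public static class DocValuesBackedMapper extends FieldMapperSketch {
        public DocValuesBackedMapper(String name) {
            super(name, true);
        }

        @Override
        public void buildDerivedSource(Map<String, Object> builder, DocValuesReader reader, int docId) {
            List<Object> values = reader.values(name(), docId);
            if (values.isEmpty()) {
                return; // the document has no value for this field
            }
            // Emit a single value as a scalar and multiple values as an array.
            builder.put(name(), values.size() == 1 ? values.get(0) : new ArrayList<>(values));
        }
    }

    /** Search/Get path: ask each supporting mapper to add its field to the derived source. */
    public static Map<String, Object> deriveSource(List<FieldMapperSketch> mappers, DocValuesReader reader, int docId) {
        Map<String, Object> source = new LinkedHashMap<>();
        for (FieldMapperSketch mapper : mappers) {
            if (mapper.isDerivedSourceSupported()) {
                mapper.buildDerivedSource(source, reader, docId);
            }
        }
        return source;
    }
}
```

In this sketch, a caller would build the list of mappers once and call deriveSource(mappers, reader, docId) for each hit. Fields whose mappers do not support derivation are simply skipped here; how such fields should be handled in practice is still open.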
Related component
Indexing:Performance
Describe alternatives you've considered
N/A