Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When fetching nested inner hits only access stored fields when needed #25864

Conversation

@martijnvg
Copy link
Member

commented Jul 24, 2017

Stored fields were still being accessed for nested inner hits even if the _source was not requested.
This was done to figure out the id of the root document. However this is already known higher up the stack. So instead this change adds the id to the nested search context, so that it is no longer required to be fetched via the stored fields.

In case the _source is large and no source is requested then hot threads like these ones would still appear:

100.3% (501.3ms out of 500ms) cpu usage by thread 'elasticsearch[AfXKKfq][search][T#6]'
     2/10 snapshots sharing following 22 elements
       org.apache.lucene.store.DataInput.skipBytes(DataInput.java:352)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
       org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)

and:

8/10 snapshots sharing following 27 elements
       org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:135)
       org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:138)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.fillBuffer(CompressingStoredFieldsReader.java:531)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.readBytes(CompressingStoredFieldsReader.java:550)
       org.apache.lucene.store.DataInput.readBytes(DataInput.java:87)
       org.apache.lucene.store.DataInput.skipBytes(DataInput.java:350)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
       org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
@@ -211,68 +215,81 @@ private SearchHit createSearchHit(SearchContext context, FieldsVisitor fieldsVis
return searchHit;
}

private SearchHit createNestedSearchHit(SearchContext context, int nestedTopDocId, int nestedSubDocId, int rootSubDocId, Set<String> fieldNames, List<String> fieldNamePatterns, LeafReaderContext subReaderContext) throws IOException {
private SearchHit createNestedSearchHit(SearchContext context, int nestedTopDocId, int nestedSubDocId,

This comment has been minimized.

Copy link
@martijnvg

martijnvg Jul 24, 2017

Author Member

Note: other changes outside this method were to remove the checkstyle suppression.

@jpountz
Copy link
Contributor

left a comment

I left a minor comment. Otherwise LGTM.

core/src/main/java/org/elasticsearch/search/fetch/FetchPhase.java Outdated
final FieldsVisitor rootFieldsVisitor = new FieldsVisitor(context.sourceRequested() || context.highlight() != null);
loadStoredFields(context, subReaderContext, rootFieldsVisitor, rootSubDocId);
rootFieldsVisitor.postProcess(context.mapperService());
FieldsVisitor rootFieldsVisitor = null;

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 24, 2017

Contributor

Since what we need later is the source rather than the visitor, maybe we should declare BytesReference source = null here and assign it in the below if statement?

This comment has been minimized.

Copy link
@martijnvg

martijnvg Jul 24, 2017

Author Member

👍

@martijnvg martijnvg force-pushed the martijnvg:inner_hits_only_access_stored_fields_when_needed branch Jul 24, 2017

@clintongormley clintongormley added v6.1.0 and removed v5.6.0 labels Jul 25, 2017

inner hits: Only access stored fields when needed
Stored fields were still being accessed for nested inner hits even if the _source was not requested.
This was done to figure out the id of the root document. However this is already known higher up the stack.
So instead this change adds the id to the nested search context, so that it is no longer required to be fetched via the stored fields.

In case the _source is large and no source is requested then hot threads like these ones would still appear:

```
100.3% (501.3ms out of 500ms) cpu usage by thread 'elasticsearch[AfXKKfq][search][T#6]'
     2/10 snapshots sharing following 22 elements
       org.apache.lucene.store.DataInput.skipBytes(DataInput.java:352)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
       org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
```

and:

```
8/10 snapshots sharing following 27 elements
       org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:135)
       org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:138)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.fillBuffer(CompressingStoredFieldsReader.java:531)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.readBytes(CompressingStoredFieldsReader.java:550)
       org.apache.lucene.store.DataInput.readBytes(DataInput.java:87)
       org.apache.lucene.store.DataInput.skipBytes(DataInput.java:350)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
       org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
```

@martijnvg martijnvg force-pushed the martijnvg:inner_hits_only_access_stored_fields_when_needed branch to a9ae52e Jul 25, 2017

@martijnvg martijnvg merged commit a9ae52e into elastic:master Jul 25, 2017

1 of 2 checks passed

elasticsearch-ci Build started sha1 is merged.
Details
CLA Commit author is a member of Elasticsearch
Details

@colings86 colings86 added v6.0.0-beta1 and removed v6.0.0 labels Jul 31, 2017

@lcawl lcawl removed the v6.1.0 label Dec 12, 2017

@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.