New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose ShardId via LeafReader rather than Directory API #8812
Conversation
@mikemccand can you take another look? |
One small typo still, else LGTM. Thanks for simplifying the deletes logging! |
Today we try to fetch a shard Id for a given IndexReader / LeafReader by walking it's tree until the lucene internal SegmentReader and then casting the directory into a StoreDirecotory. This class is fully internal to Elasticsearch and should not be exposed outside of the Store. This commit makes StoreDirectory a private inner class and adds dedicated ElasticsearchDirectoryReader / ElasticserachLeafReader exposing a ShardId getter to obtain information about the shard the index / segment belogs to. These classes can be used to expose other segment specific information in the future more easily.
635ea79
to
8d7ce3c
Compare
* A {@link org.apache.lucene.index.FilterLeafReader} that exposes | ||
* Elasticsearch internal per shard / index information like the shard ID. | ||
*/ | ||
public final class ElasticsearchLeafReader extends FilterLeafReader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to override getCoreCacheKey() & co, or things are not really working per-segment :)
From javadocs:
NOTE: If this FilterLeafReader does not change the content the contained reader, you could consider overriding
getCoreCacheKey() so that CachingWrapperFilter shares the same entries for this atomic reader
and the wrapped one. getCombinedCoreAndDeletesKey() could be overridden as well if the live docs are not changed either.
Added comment: we should fix the method overrides here. |
+1 How come no tests caught this? |
I committed a fix. I suspect nothing tests these cache key methods. It may be that no tests failed... because they are all integration tests and only test core-listener methods, which I think is the mechanism that happens to be used today (which could easily change, and core cache key might be used elsewhere, etc) |
i reviewed more, getCoreCacheKey() is used e.g. as part of every filter key, so NRT was really broken with filter caching for example. I still dont think integration tests for that are the right way: better would be to have 'nrt is working' tests at a lower level, turning off merging and calling reopen and asserting the core cache key did not change for this filterreader, and then moving our way up... |
Today we try to fetch a shard Id for a given IndexReader / LeafReader
by walking it's tree until the lucene internal SegmentReader and then
casting the directory into a StoreDirecotory. This class is fully internal
to Elasticsearch and should not be exposed outside of the Store.
This commit makes StoreDirectory a private inner class and adds dedicated
ElasticsearchDirectoryReader / ElasticserachLeafReader exposing a ShardId
getter to obtain information about the shard the index / segment belogs to.
These classes can be used to expose other segment specific information in
the future more easily.