Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet #7037

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
4 participants
@martijnvg
Copy link
Member

martijnvg commented Jul 25, 2014

Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the RandomAccessFilter does this. By moving away from filter cache this filters will also never be evicted because of LRU reasons, which is what is desired for nested and parent/child.

Also if nested and parent/child is configured the type filters are eagerly loaded by default.

PR for #7031

@jpountz

View changes

src/main/java/org/elasticsearch/index/randomaccess/RandomAccessFilter.java Outdated

/**
* Indicates that this filter returns a {@link org.apache.lucene.search.DocIdSet} implementation that supports
* random access.

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 28, 2014

Contributor

Actually it guarantees more than that, it guarantees that a FixedBitSet is returned, so maybe this class should even be called FixedBitSetFilter?

This comment has been minimized.

Copy link
@martijnvg

martijnvg Jul 28, 2014

Author Member

Ok, make sense. I will then rename the service to FixedBitSetFilterService.

@jpountz

View changes

src/main/java/org/elasticsearch/index/randomaccess/RandomAccessFilterService.java Outdated
Cache<Filter, FixedBitSet> filterToFbs = loadedFilters.get(coreCacheReader, new Callable<Cache<Filter, FixedBitSet>>() {
@Override
public Cache<Filter, FixedBitSet> call() throws Exception {
SegmentReaderUtils.registerCoreListener(context.reader(), RandomAccessFilterService.this);

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 28, 2014

Contributor

I think it should try to register this as a listener only once per reader?

This comment has been minimized.

Copy link
@martijnvg

martijnvg Jul 28, 2014

Author Member

I think that is what is happening here, the loadedFilters contains per reader cache key a filter to fbs cache. Only when the there is no cache per reader cache key then a new empty cache will be created and a listener for the core cache key will be registered.

@jpountz

View changes

src/main/java/org/elasticsearch/index/query/QueryParseContext.java Outdated
@@ -174,6 +175,10 @@ public MapperQueryParser queryParser(QueryParserSettings settings) {
return queryParser;
}

public RandomAccessFilter randomAccess(Filter filter) {

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 28, 2014

Contributor

Since this class doesn't only deal with filters, maybe the method name should contain filter?

@jpountz

View changes

src/main/java/org/elasticsearch/search/facet/nested/NestedFacetExecutor.java Outdated
parentFilter = context.filterCache().cache(NonNestedDocsFilter.INSTANCE);
childFilter = context.filterCache().cache(objectMapper.nestedTypeFilter());
parentFilter = context.filterRandomAccess().getRandomAccess(NonNestedDocsFilter.INSTANCE);
childFilter = context.filterRandomAccess().getRandomAccess(objectMapper.nestedTypeFilter());

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 28, 2014

Contributor

Should objectMapper.nestedTypeFilter() always return a RandomAccessFilter so that we don't need to call getRandomAccess on it?

This comment has been minimized.

Copy link
@martijnvg

martijnvg Jul 28, 2014

Author Member

make sense, but then the nestedTypeFilter method needs the have the FixedBitSetService as an argument.

This comment has been minimized.

Copy link
@martijnvg

martijnvg Jul 29, 2014

Author Member

If we want to objectMapper.nestedTypeFilter() let return a FixedBitSetFilter then the FixedBitSetFilterCache should be part of the ObjectMapper class or it should be passed as an argument. Both I don't like, so I think the way it is now is best.

@jpountz

This comment has been minimized.

Copy link
Contributor

jpountz commented Jul 28, 2014

Left some comments, but I really like the idea of having a different caching infrastructure for these filters with eager loading, no evictions and the warranty to get fixed bit sets!

@jpountz jpountz removed the review label Jul 28, 2014

@martijnvg

This comment has been minimized.

Copy link
Member Author

martijnvg commented Jul 29, 2014

@jpountz I've updated the PR and addressed your comments. In addition I moved the FixedBitSetCache to the index.cache package and add fixed_bit_set_memory_in_bytes statistic to segments stats.


@Override
public void onClose(Object ownerCoreCacheKey) {
loadedFilters.invalidate(ownerCoreCacheKey);

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 1, 2014

Contributor

Does it work? (given that loadedFilters uses a Key object as key while you are invalidating with a reader core key here?)

This comment has been minimized.

Copy link
@martijnvg

martijnvg Aug 4, 2014

Author Member

oops... that wouldn't work

@jpountz

View changes

src/main/java/org/elasticsearch/index/cache/fixedbitset/FixedBitSetFilterCache.java Outdated
private FixedBitSet getAndLoadIfNotPresent(final Filter filter, final AtomicReaderContext context) throws IOException, ExecutionException {
final Object coreCacheReader = context.reader().getCoreCacheKey();
final ShardId shardId = ShardUtils.extractShardId(context.reader());
Key key = new Key(coreCacheReader, shardId);

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 1, 2014

Contributor

Why do we need the shard id in the key?

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 1, 2014

Contributor

Wouldn't the core cache key be enough?

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 1, 2014

Contributor

Oh I see, it is for the removal notification. But maybe it would make more sense to have it as part of the value rather than the key since it is not used for hashCode/equality?

This comment has been minimized.

Copy link
@martijnvg

martijnvg Aug 4, 2014

Author Member

good point, I've changed that.

@jpountz

View changes

src/main/java/org/elasticsearch/index/cache/fixedbitset/FixedBitSetFilterCache.java Outdated
}
}

public static final class Value {

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 1, 2014

Contributor

I don't think we need to cache the ram bytes used, it is super fast to compute?

This comment has been minimized.

Copy link
@martijnvg

martijnvg Aug 4, 2014

Author Member

I didn't know that this was cheap, I will compute it on the fly then.

@jpountz

View changes

src/main/java/org/elasticsearch/index/service/IndexService.java Outdated
@@ -50,6 +51,8 @@

IndexFieldDataService fieldData();

FixedBitSetFilterCache randomAccess();

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 1, 2014

Contributor

s/randomAccess/fixedBitSet/ ?

@jpountz

View changes

src/main/java/org/elasticsearch/percolator/PercolateContext.java Outdated
@@ -471,6 +472,11 @@ public FilterCache filterCache() {
}

@Override
public FixedBitSetFilterCache filterRandomAccess() {
return indexService.randomAccess();

This comment has been minimized.

Copy link
@jpountz

jpountz Aug 1, 2014

Contributor

s/randomAccess/fixedBitSet/ ?

@jpountz

This comment has been minimized.

Copy link
Contributor

jpountz commented Aug 1, 2014

@martijnvg Left a couple more comments. I think it would be nice to also check the size of this cache after integration tests finish to make sure it gets cleaned when indices are closed?

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Aug 4, 2014

Introduced the notion of a RandomAccessFilter that guarantees to prod…
…uce a FixedBitSet.

Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the RandomAccessFilter does this.
Also if nested and parent/child is configured the type filters are eagerly loaded by default.

Closes elastic#7037
Closes elastic#7031
@martijnvg

This comment has been minimized.

Copy link
Member Author

martijnvg commented Aug 4, 2014

@jpountz Thanks for looking into this. I update the PR to address your comments.

@jpountz

This comment has been minimized.

Copy link
Contributor

jpountz commented Aug 6, 2014

LGTM

Maybe @kimchy should look at the caching logic before getting this in?

@kimchy

This comment has been minimized.

Copy link
Member

kimchy commented Aug 22, 2014

LGTM, I would add some docs to the FixedBitSet cache that its intentionally not bounded (by size or time), and that it should be used very carefully only for components that require a FixedBitSet that is always there.

@jpountz

This comment has been minimized.

Copy link
Contributor

jpountz commented Aug 26, 2014

@martijnvg I think it would be nice to merge this one as it would help clean up a couple of other nested and parent/child related changes?

@martijnvg

This comment has been minimized.

Copy link
Member Author

martijnvg commented Aug 26, 2014

@jpountz I totally agree and I'll merge it in this week before I work on any other major PR.

Introduced FixedBitSetFilterCache that guarantees to produce a FixedB…
…itSet and does evict based on size or time.

Only when segments are merged away due to merging then entries in this cache are cleaned up.

Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the FixedBitSetFilterCache does this.
Also if nested and parent/child is configured the type filters are eagerly loaded by default via the FixedBitSetFilterCache.

Closes #7037
Closes #7031

@martijnvg martijnvg force-pushed the martijnvg:improvements/fbs_cache branch from 0d919d6 to 531d6ab Aug 27, 2014

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Aug 27, 2014

Introduced FixedBitSetFilterCache that guarantees to produce a FixedB…
…itSet and does evict based on size or time.

Only when segments are merged away due to merging then entries in this cache are cleaned up.

Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the FixedBitSetFilterCache does this.
Also if nested and parent/child is configured the type filters are eagerly loaded by default via the FixedBitSetFilterCache.

Closes elastic#7037
Closes elastic#7031

@martijnvg martijnvg closed this in 94eed4e Aug 27, 2014

martijnvg added a commit that referenced this pull request Aug 27, 2014

Introduced FixedBitSetFilterCache that guarantees to produce a FixedB…
…itSet and does evict based on size or time.

Only when segments are merged away due to merging then entries in this cache are cleaned up.

Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the FixedBitSetFilterCache does this.
Also if nested and parent/child is configured the type filters are eagerly loaded by default via the FixedBitSetFilterCache.

Closes #7037
Closes #7031

@martijnvg martijnvg removed the review label Aug 27, 2014

@martijnvg martijnvg changed the title Introduced the notion of a RandomAccessFilter that guarantees to produce a FixedBitSet. Core: Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet. Aug 27, 2014

jpountz added a commit to jpountz/elasticsearch that referenced this pull request Sep 3, 2014

Core: Change the default cache filter impl from FixedBitSet to WAH8Do…
…cIdSet.

A few months ago, Lucene switched from FixedBitSet to WAH8DocIdSet in order
to cache filters. WAH8DocIdSet is especially better when dealing with sparse
sets: iteration is faster, memory usage is lower and there is an index that
helps keep advance fast.

This doesn't break existing code since elastic#6280 already made sure that there was no
abusive cast from DocIdSet to Bits or FixedBitSet and elastic#7037 moved consumers of
the filter cache that absolutely need to get fixed bitsets to their own cache.

Since cached filters will be more memory-efficient, the filter cache size has
been decreased from 10 to 5%. Although smaller, this might still allow to cache
more filters.

jpountz added a commit to jpountz/elasticsearch that referenced this pull request Sep 4, 2014

martijnvg added a commit that referenced this pull request Sep 8, 2014

Introduced FixedBitSetFilterCache that guarantees to produce a FixedB…
…itSet and does evict based on size or time.

Only when segments are merged away due to merging then entries in this cache are cleaned up.

Nested and parent/child rely on the fact that type filters produce a FixedBitSet, the FixedBitSetFilterCache does this.
Also if nested and parent/child is configured the type filters are eagerly loaded by default via the FixedBitSetFilterCache.

Closes #7037
Closes #7031

@clintongormley clintongormley changed the title Core: Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet. Internal: Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet Sep 8, 2014

jpountz added a commit that referenced this pull request Sep 11, 2014

jpountz added a commit that referenced this pull request Sep 11, 2014

@martijnvg martijnvg deleted the martijnvg:improvements/fbs_cache branch May 18, 2015

@clintongormley clintongormley changed the title Internal: Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet Jun 7, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.