Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet #7037

Closed
wants to merge 1 commit into from

Conversation

martijnvg
Member

Nested and parent/child rely on type filters producing a FixedBitSet; the RandomAccessFilter guarantees this. By moving away from the filter cache, these filters will also never be evicted for LRU reasons, which is the desired behavior for nested and parent/child.

Also, if nested or parent/child is configured, the type filters are eagerly loaded by default.

PR for #7031
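
To make the idea concrete, here is a minimal sketch of a cache with the properties described above: it always hands back a bitset, it has no size- or time-based eviction, and it can be warmed eagerly. It is an illustration under assumptions, not the code in this PR: `java.util.BitSet` stands in for Lucene's `FixedBitSet`, and `DocFilter`, `NeverEvictingBitSetCache`, `warm`, and `onClose` are hypothetical names.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntPredicate;

// Hypothetical stand-in for a filter: decides per document id whether it matches.
interface DocFilter extends IntPredicate {}

// Sketch of a cache that always returns a bitset and never evicts by size or time.
class NeverEvictingBitSetCache {
    // segment key -> (filter -> materialized bitset); entries leave only via onClose.
    private final Map<Object, Map<DocFilter, BitSet>> perSegment = new ConcurrentHashMap<>();

    // Returns the cached bitset, materializing it on first use.
    BitSet get(Object segmentKey, int maxDoc, DocFilter filter) {
        return perSegment
            .computeIfAbsent(segmentKey, k -> new ConcurrentHashMap<>())
            .computeIfAbsent(filter, f -> materialize(maxDoc, f));
    }

    // Eager loading: build the bitsets up front instead of on the first query.
    void warm(Object segmentKey, int maxDoc, List<DocFilter> filters) {
        for (DocFilter f : filters) {
            get(segmentKey, maxDoc, f);
        }
    }

    // The only way entries are dropped: the owning segment goes away.
    void onClose(Object segmentKey) {
        perSegment.remove(segmentKey);
    }

    // Number of cached bitsets across all segments.
    int size() {
        int n = 0;
        for (Map<DocFilter, BitSet> m : perSegment.values()) {
            n += m.size();
        }
        return n;
    }

    // Evaluates the filter against every document and records matches densely.
    private static BitSet materialize(int maxDoc, DocFilter filter) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc = 0; doc < maxDoc; doc++) {
            if (filter.test(doc)) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

Because nothing bounds the map, a design like this only makes sense for a small, known set of filters such as the type filters mentioned above.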


/**
* Indicates that this filter returns a {@link org.apache.lucene.search.DocIdSet} implementation that supports
* random access.
Contributor

Actually it guarantees more than that, it guarantees that a FixedBitSet is returned, so maybe this class should even be called FixedBitSetFilter?

Member Author

OK, makes sense. I will then rename the service to FixedBitSetFilterService.

@jpountz
Contributor

jpountz commented Jul 28, 2014

Left some comments, but I really like the idea of having a different caching infrastructure for these filters, with eager loading, no evictions, and the guarantee of getting fixed bit sets!

@jpountz jpountz removed the review label Jul 28, 2014
@martijnvg
Member Author

@jpountz I've updated the PR and addressed your comments. In addition, I moved the FixedBitSetCache to the index.cache package and added a fixed_bit_set_memory_in_bytes statistic to the segments stats.
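
For a sense of what a `fixed_bit_set_memory_in_bytes` figure would measure: a fixed bit set spends one bit per document in the segment, no matter how many documents match. The sketch below is a back-of-the-envelope estimate with hypothetical names (`FixedBitSetMemory`, `estimateBytes`); it ignores object headers and is not how the statistic is computed in this PR.

```java
// Hypothetical helper: rough payload size of dense bitsets, ignoring object overhead.
class FixedBitSetMemory {
    // One bit per document, stored in 64-bit words.
    static long estimateBytes(int maxDoc) {
        long words = ((long) maxDoc + 63) / 64; // round up to whole words
        return words * Long.BYTES;
    }

    // Total for one cached filter across segments of the given sizes.
    static long estimateBytes(int[] maxDocs) {
        long total = 0;
        for (int maxDoc : maxDocs) {
            total += estimateBytes(maxDoc);
        }
        return total;
    }
}
```

Under this estimate, a segment with one million documents costs 125,000 bytes per cached filter, whether the filter matches one document or all of them.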


@Override
public void onClose(Object ownerCoreCacheKey) {
loadedFilters.invalidate(ownerCoreCacheKey);
Contributor

Does it work? (given that loadedFilters uses a Key object as key while you are invalidating with a reader core key here?)

Member Author

oops... that wouldn't work
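
The mismatch is easy to reproduce in isolation. In the sketch below, a plain `HashMap` stands in for the cache, and the `Key` class is a hypothetical composite of reader core key and filter; neither is taken from the PR. Removing by the bare core key silently matches nothing, so the close listener has to find every entry that belongs to the closed segment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical composite key: (reader core key, filter).
final class Key {
    final Object coreCacheKey;
    final Object filter;

    Key(Object coreCacheKey, Object filter) {
        this.coreCacheKey = coreCacheKey;
        this.filter = filter;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Key)) {
            return false;
        }
        Key other = (Key) o;
        // Core keys are compared by identity, filters by equality.
        return coreCacheKey == other.coreCacheKey && filter.equals(other.filter);
    }

    @Override
    public int hashCode() {
        return 31 * System.identityHashCode(coreCacheKey) + filter.hashCode();
    }
}

class CloseListenerSketch {
    // long[] is a placeholder for the cached bitset.
    final Map<Key, long[]> loadedFilters = new HashMap<>();

    // Buggy: the map is keyed by Key, so a bare core key never equals any entry.
    void onCloseBuggy(Object ownerCoreCacheKey) {
        loadedFilters.remove(ownerCoreCacheKey);
    }

    // One possible fix: drop every entry whose Key belongs to the closed segment.
    void onCloseFixed(Object ownerCoreCacheKey) {
        loadedFilters.keySet().removeIf(k -> k.coreCacheKey == ownerCoreCacheKey);
    }
}
```

An alternative design keys the outer map by the core key and keeps a per-segment map of filters inside it, which turns invalidation into a single lookup.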

@jpountz
Contributor

jpountz commented Aug 1, 2014

@martijnvg Left a couple more comments. I think it would be nice to also check the size of this cache after integration tests finish to make sure it gets cleaned when indices are closed?

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Aug 4, 2014
…uce a FixedBitSet.

Nested and parent/child rely on type filters producing a FixedBitSet; the RandomAccessFilter guarantees this.
Also, if nested or parent/child is configured, the type filters are eagerly loaded by default.

Closes elastic#7037
Closes elastic#7031
@martijnvg
Member Author

@jpountz Thanks for looking into this. I updated the PR to address your comments.

@jpountz
Contributor

jpountz commented Aug 6, 2014

LGTM

Maybe @kimchy should look at the caching logic before getting this in?

@kimchy
Member

kimchy commented Aug 22, 2014

LGTM. I would add some docs to the FixedBitSet cache saying that it's intentionally not bounded (by size or time), and that it should be used very carefully, only for components that require a FixedBitSet that is always there.

@jpountz
Contributor

jpountz commented Aug 26, 2014

@martijnvg I think it would be nice to merge this one as it would help clean up a couple of other nested and parent/child related changes?

@martijnvg
Member Author

@jpountz I totally agree and I'll merge it in this week before I work on any other major PR.

…itSet and does evict based on size or time.

Entries in this cache are cleaned up only when their segments are merged away.

Nested and parent/child rely on type filters producing a FixedBitSet; the FixedBitSetFilterCache guarantees this.
Also, if nested or parent/child is configured, the type filters are eagerly loaded by default via the FixedBitSetFilterCache.

Closes elastic#7037
Closes elastic#7031
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Aug 27, 2014
@martijnvg martijnvg closed this in 94eed4e Aug 27, 2014
martijnvg added a commit that referenced this pull request Aug 27, 2014
@martijnvg martijnvg removed the review label Aug 27, 2014
@martijnvg martijnvg changed the title Introduced the notion of a RandomAccessFilter that guarantees to produce a FixedBitSet. Core: Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet. Aug 27, 2014
jpountz added a commit to jpountz/elasticsearch that referenced this pull request Sep 3, 2014
…cIdSet.

A few months ago, Lucene switched from FixedBitSet to WAH8DocIdSet in order
to cache filters. WAH8DocIdSet is especially better when dealing with sparse
sets: iteration is faster, memory usage is lower and there is an index that
helps keep advance fast.

This doesn't break existing code since elastic#6280 already made sure that there was no
abusive cast from DocIdSet to Bits or FixedBitSet and elastic#7037 moved consumers of
the filter cache that absolutely need to get fixed bitsets to their own cache.

Since cached filters will be more memory-efficient, the filter cache size has
been decreased from 10% to 5%. Although smaller, this might still allow more
filters to be cached.
jpountz added a commit to jpountz/elasticsearch that referenced this pull request Sep 4, 2014
martijnvg added a commit that referenced this pull request Sep 8, 2014
@clintongormley clintongormley changed the title Core: Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet. Internal: Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet Sep 8, 2014
jpountz added a commit that referenced this pull request Sep 11, 2014
jpountz added a commit that referenced this pull request Sep 11, 2014
@martijnvg martijnvg deleted the improvements/fbs_cache branch May 18, 2015 23:30
@clintongormley clintongormley changed the title Internal: Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet Introduced the notion of a FixedBitSetFilter that guarantees to produce a FixedBitSet Jun 7, 2015

4 participants