Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lookup Terms Filter _cache parameter not being taken into account #3219

Closed
lmenezes opened this issue Jun 24, 2013 · 7 comments
Closed

Lookup Terms Filter _cache parameter not being taken into account #3219

lmenezes opened this issue Jun 24, 2013 · 7 comments

Comments

@lmenezes
Copy link
Contributor

The parameter is taken into account for normal terms filter(when you pass the list of terms), but not when you are using the look up mechanism.

I won't paste a test case since it has to be long enough as for the response time to be meaningful, but I guess in this line there should be a check on wether or not to cache:

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/index/query/TermsFilterParser.java#L172

same way there is in: https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/index/query/TermsFilterParser.java#L193-L195

@lmenezes
Copy link
Contributor Author

Thinking about it, its also not possible to NOT cache the result of the lookup...
Maybe that should also be possible? Or at least, when explicitly not caching the total filter, the lookup shouldn't be cached as well(it just makes no sense in this case).

@kimchy
Copy link
Member

kimchy commented Jun 24, 2013

I will fix the cache flag to be taken into account, I think caching the lookup makes sense almost all times, and its bounded by size by default, so you should be ok with defining a small size if you are concerned by it?

@lmenezes
Copy link
Contributor Author

Not really concerned about size as much as with consistency.
I do think it might be useful to not cache the lookup. It's not that expensive of an operation(at least in my use case), and this way I don't have to worry about keeping the cache up to date(afaik, the only way to guarantee that is assigning it a key and clearing when needed?)

@kimchy
Copy link
Member

kimchy commented Jun 24, 2013

in that case, you can simply set indices.cache.filter.terms.size to 0, or are you looking to do it per query call? If so, then we probably need another flag, cause the cache flag always means caching the filter itself (the bitsets per segment).

@lmenezes
Copy link
Contributor Author

I agree this should be possible to set per call, since setting that to 0 basically would interfere with other needs as well.
But I do think if you explicitly say _cache -> false, but then, the "content" of the filter is cached, makes it kind of pointless not caching in the first place, since the resulting filter will always be the same. Or am I missing something?
If its so, then I think its okay just using the global _cache flag as a hint for not caching the result of the lookup. I just dont see the use case where you dont want to cache the total filter, but want to cache the lookup...

@lmenezes
Copy link
Contributor Author

@kimchy
As a follow up on that...
I ran some tests on this feature, where the lookup would fetch around 35K-60K terms, and the average lookup time was below 40ms(the lookup index has one single shard replicated across all nodes). Applying the filter was actually much more expensive, ranging on 700ms-1s.

This way, maybe it even makes sense caching the total filter using as key the hashed content of the lookup? This way if you are not caching the lookup, you might still get some benefits. Does it make sense?

@lmenezes
Copy link
Contributor Author

I creatd a ticket for that on #3236

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants