Don't use AllTokenStream if no fields were boosted #6187

Closed
mikemccand opened this issue May 15, 2014 · 1 comment

@mikemccand
Contributor

I noticed we create a new AllTokenStream per indexed doc to index the _all field, and remap any per-field boosts to payloads, but if the AllEntries saw no boosts (it already has a boolean customBoost() method to check this) then I think we can skip wrapping with AllTokenStream?
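To make the proposal concrete, here is a minimal sketch of the check in plain Java. It is a simplified model, not the Elasticsearch code: `Entries`, `StreamKind`, and `chooseStream` are hypothetical stand-ins, and only the `customBoost()` method name comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the idea; none of these types are the real Elasticsearch classes. */
final class AllFieldWrapSketch {

    /** Hypothetical: which kind of token stream the indexer ends up using for _all. */
    enum StreamKind { PLAIN, BOOST_PAYLOADS }

    /** Hypothetical stand-in for AllEntries: collects field texts and their boosts. */
    static final class Entries {
        private final List<String> texts = new ArrayList<>();
        private final List<Float> boosts = new ArrayList<>();
        private boolean customBoost = false;

        void add(String text, float boost) {
            texts.add(text);
            boosts.add(boost);
            if (boost != 1.0f) {
                customBoost = true; // at least one field carries a non-default boost
            }
        }

        /** True only if some entry was added with a boost other than 1.0. */
        boolean customBoost() {
            return customBoost;
        }
    }

    /** Only pay for the payload-writing wrapper when there is a boost to record. */
    static StreamKind chooseStream(Entries entries) {
        return entries.customBoost() ? StreamKind.BOOST_PAYLOADS : StreamKind.PLAIN;
    }

    public static void main(String[] args) {
        Entries entries = new Entries();
        entries.add("some title", 1.0f);
        entries.add("some body", 1.0f);
        System.out.println(chooseStream(entries));
    }
}
```

In this model the decision is made once per document, so the unboosted case never touches the per-token wrapper at all.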

The cost of AllTokenStream.incrementToken is non-trivial because on each token it does a binary search to look up the boost for that entry. Separately, I think this binary search may not be necessary (can't it just use the "current" entry's boost?).
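A rough sketch of the two lookup strategies, again in plain Java rather than against the real classes. It assumes each entry has a strictly increasing start offset in the concatenated text, the first entry starts at offset 0, and token start offsets arrive in non-decreasing order; `BoostLookupSketch`, `boostByBinarySearch`, and `Cursor` are hypothetical names. Whether the real stream can rely on these assumptions would need checking.

```java
import java.util.Arrays;

/** Simplified model comparing a per-token binary search with a forward-only cursor. */
final class BoostLookupSketch {

    /** Finds the last entry whose start offset is <= tokenOffset: O(log n) per token. */
    static float boostByBinarySearch(int[] entryStarts, float[] boosts, int tokenOffset) {
        int idx = Arrays.binarySearch(entryStarts, tokenOffset);
        if (idx < 0) {
            // Not an exact hit: binarySearch returned -(insertionPoint) - 1,
            // and the owning entry is the one just before the insertion point.
            idx = -idx - 2;
        }
        return boosts[idx];
    }

    /** Remembers the "current" entry and only ever moves forward: amortized O(1) per token. */
    static final class Cursor {
        private final int[] entryStarts;
        private final float[] boosts;
        private int current = 0;

        Cursor(int[] entryStarts, float[] boosts) {
            this.entryStarts = entryStarts;
            this.boosts = boosts;
        }

        float boostFor(int tokenOffset) {
            // Advance while the next entry begins at or before this token.
            while (current + 1 < entryStarts.length && entryStarts[current + 1] <= tokenOffset) {
                current++;
            }
            return boosts[current];
        }
    }

    public static void main(String[] args) {
        int[] entryStarts = {0, 10, 25};
        float[] boosts = {1.0f, 2.0f, 0.5f};
        Cursor cursor = new Cursor(entryStarts, boosts);
        for (int offset : new int[] {0, 4, 10, 12, 25, 30}) {
            System.out.println(offset + ": " + boostByBinarySearch(entryStarts, boosts, offset)
                    + " / " + cursor.boostFor(offset));
        }
    }
}
```

Under those assumptions both strategies return the same boost for every token; the cursor simply avoids repeating the search from scratch on each call.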

But stepping back, can't ES just add multiple instances of the _all field, rather than making a custom Reader impl (AllEntries) and TokenFilter (AllTokenStream) that does the concatenating on the fly? When Lucene inverts the multi-valued field it logically appends them together.

@mikemccand
Contributor Author

Pull request here: #6219

I also noticed & fixed a possible bug in AllFieldMapper.queryTermToString that would fail to return AllTermQuery if the field was indexed with offsets (for the postings highlighter)...

mikemccand added a commit that referenced this issue May 20, 2014
AllTokenStream, used to index the _all field, adds some overhead, but
it's not necessary when no fields were boosted or when positions are
not indexed for the _all field.

Closes #6187 Closes #6219
@clintongormley clintongormley changed the title Don't use AllTokenStream if no fields were boosted Indexing: Don't use AllTokenStream if no fields were boosted Jul 16, 2014
@clintongormley clintongormley added the :Core/Infra/Core Core issues without another label label Jun 7, 2015
@clintongormley clintongormley changed the title Indexing: Don't use AllTokenStream if no fields were boosted Don't use AllTokenStream if no fields were boosted Jun 7, 2015