Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lucene search performance can be greatly improved #493

Closed
mirkosertic opened this issue Sep 7, 2013 · 8 comments
Closed

Lucene search performance can be greatly improved #493

mirkosertic opened this issue Sep 7, 2013 · 8 comments
Milestone

Comments

@mirkosertic
Copy link

@mirkosertic mirkosertic commented Sep 7, 2013

There seems to ba a problem in AbstractLuceneQuery with QueryDSL 3.2.2 while using Lucene 4.4.0. Providing no Sort instances for no sorting queries is quite slow. If i replace the following line

scoreDocs = searcher.search(createQuery(), filter, sumOfLimitAndOffset).scoreDocs;

with

scoreDocs = searcher.search(createQuery(), filter, sumOfLimitAndOffset, new Sort()).scoreDocs;

The query is executed >8 times faster!!! It seems to be Lucene related, i already filed an issue on the usergroup. Till it is clear what is the cause for this i use the optimized version, the query results are the same.

@timowest
Copy link
Member

@timowest timowest commented Sep 8, 2013

This looks like a bug on the Lucene side. Let me know when you find the reason.

@mirkosertic
Copy link
Author

@mirkosertic mirkosertic commented Sep 8, 2013

I created the following thread:

http://www.gossamer-threads.com/lists/lucene/java-user/206728?do=post_view_threaded

From my point of view QueryDSL does not need Lucene scoring, it just need sorting in some cases. So probably the first methods involves sorting by score, the later sorting by field. So just providing an empty Sort() instance disables sorting by score, which greatly improves search speed.

I will track this...

@mirkosertic
Copy link
Author

@mirkosertic mirkosertic commented Sep 8, 2013

Seems like the default lucene sort order is by relevance, which includes intensive computation...

@mirkosertic
Copy link
Author

@mirkosertic mirkosertic commented Sep 9, 2013

Ok, according to Mike McCandless the solution should be in AbstractLuceneQuery:

            if (sort != null) {
                scoreDocs = searcher.search(createQuery(), filter, sumOfLimitAndOffset, sort, false, false).scoreDocs;
            } else {
                scoreDocs = searcher.search(createQuery(), filter, sumOfLimitAndOffset, Sort.INDEXORDER, false, false).scoreDocs;
            }
@mirkosertic
Copy link
Author

@mirkosertic mirkosertic commented Sep 9, 2013

If no sorting is required, we could also use a custom Collector, this should be the fastest solution of all....

@timowest
Copy link
Member

@timowest timowest commented Sep 10, 2013

What about the other search usage in AbstractLuceneQuery?

@mirkosertic
Copy link
Author

@mirkosertic mirkosertic commented Sep 11, 2013

Ok, i only see the oneResult() method, and here i think the INDEXORDER sort order should also be used, so replacing

final ScoreDoc[] scoreDocs = searcher.search(createQuery(), filter, maxDoc).scoreDocs;

with

final ScoreDoc[] scoreDocs = searcher.search(createQuery(), filter, sumOfLimitAndOffset, Sort.INDEXORDER, false, false).scoreDocs;

We are not using onrResult() or singleResult() in our usecases, but i think the current implementation will lead to performance problems under heavy load as well.

timowest added a commit that referenced this issue Sep 11, 2013
@timowest
Copy link
Member

@timowest timowest commented Oct 20, 2013

Released in 3.2.4

@timowest timowest closed this Oct 20, 2013
@timowest timowest added this to the 3.2.4 milestone Apr 13, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants