Skip to content

Optimize IndexSearcher.count() to rewrite the query once#16001

Merged
dsmiley merged 2 commits into
apache:mainfrom
dsmiley:countRewriteQueryOnce
May 6, 2026
Merged

Optimize IndexSearcher.count() to rewrite the query once#16001
dsmiley merged 2 commits into
apache:mainfrom
dsmiley:countRewriteQueryOnce

Conversation

@dsmiley
Copy link
Copy Markdown
Contributor

@dsmiley dsmiley commented Apr 30, 2026

IndexSearcher.count was invoking Query.rewrite and then calling search() which would rewrite the query again. Rewriting the query can be expensive.

Credit to my colleague Niyant for noticing this when doing some JFR analysis.

Copy link
Copy Markdown
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks reasonable to me, left a small comment on the constant score query wrapping logic.

// Use the already-rewritten query directly, avoiding a redundant rewrite in search(query, collector)
var collectorManager = new TotalHitCountCollectorManager(getSlices());
var firstCollector = collectorManager.newCollector();
final Weight weight = createWeight(query, firstCollector.scoreMode(), 1);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be exactly the same as before, shouldn't the weight be created providing new ConstantScoreQuery(innerQuery) ? Not sure if there are edge cases around this, but it seems like the rewritten query may not be a constant score query, in which case we would still wrap it here with a constant score query? Am I reading this correctly?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query has already been rewritten with ConstantScoreQuery, and thus calling createWeight(query,...) takes into consideration CSQ (basically benefitting from CSQ's rewrite optimizations that have already occurred). I don't think the code down here I'm replacing should reference innerQuery as that var exists soley to detect the 2-clause disjunction optimization.

We could probably get away with referencing innerQuery down here thus simply unwrapping the CSQ but I'd think there's not really a benefit, and it muddies the clarity of the structure of this method IMO.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose your inquiry indirectly challenges my opinion that this method has structure, as you don't see it. I recommend introducing a pair of { } to scope the innerQuery & rewriteTwoClauseDisjunctionWithTermsForCount check so that the code down here can't reference it.

I'd also like to comment that the inserted CSQ has some optimizations that rewrite will simplify.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. It was not a matter of structure on my end, more that I assumed the existing logic was there for good reasons that I could not entirely explain, and changing it may cause issues. We do rewrite to the constant score query, and I wondered why we'd need to forcibly wrap again ahead of the weight creation / additional rewrite.

Looking at history, I see the change that introduced the duplicated rewrite was #13036 . That one did not touch the weight creation logic, but it probably should have in the first place, to remove the additional wrapping and prevent duplicated rewrite.

Add changelog for 10.5
@github-actions github-actions Bot added this to the 10.5.0 milestone May 1, 2026
@dsmiley dsmiley merged commit 038e167 into apache:main May 6, 2026
13 checks passed
@dsmiley dsmiley deleted the countRewriteQueryOnce branch May 6, 2026 03:14
dsmiley added a commit that referenced this pull request May 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants