LUCENE-9965: Add tooling to introspect query execution time #144
.../sandbox/src/java/org/apache/lucene/sandbox/queries/profile/AbstractInternalProfileTree.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/profile/AbstractProfiler.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/profile/ProfileResult.java
/**
 * A simple extension of {@link IndexSearcher} to add a {@link QueryProfiler} that can be set to
 * test query timings.
 */
Add an example of how it may be used in the javadocs?
lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/profile/AbstractProfileBreakdown.java
This looks like a good direction to me; I just added a few suggestions.
.../sandbox/src/java/org/apache/lucene/sandbox/queries/profile/AbstractInternalProfileTree.java
    super(reader);
  }

  public void setProfiler(QueryProfiler profiler) {
Maybe we could add QueryProfiler as a constructor parameter and ensure it's always non-null. That'd simplify this class a bit, and make it clear that it's always meant to be used for profiling.
We could even take this further and remove the QueryProfiler class, folding its logic into ProfileIndexSearcher. It doesn't seem like a helpful abstraction on its own?
I like this idea a lot, as fewer classes is ideal in this case. My question here is: do we want to have to create a new IndexSearcher for each query?
I don't have a strong intuition here -- maybe @jpountz has an opinion?
Yes, it's totally fine. IndexReaders are costly to create, but IndexSearchers are very cheap.
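For illustration only, here is a minimal sketch of the constructor-injection idea from this thread. It assumes nothing about the final Lucene classes: `SketchProfiler` and `SketchProfileSearcher` are hypothetical stand-ins with no Lucene dependency, and the "rewrite" step is a placeholder for real query work.

```java
import java.util.Locale;
import java.util.Objects;

/** Hypothetical stand-in for the profiler discussed above; not the Lucene class. */
final class SketchProfiler {
  private long rewriteNanos;

  void addRewriteTime(long nanos) {
    rewriteNanos += nanos;
  }

  long getRewriteTime() {
    return rewriteNanos;
  }
}

/** Hypothetical stand-in for a profiling searcher; real code would extend IndexSearcher. */
final class SketchProfileSearcher {
  // Final and checked in the constructor, so it can never be null afterwards.
  private final SketchProfiler profiler;

  SketchProfileSearcher(SketchProfiler profiler) {
    this.profiler = Objects.requireNonNull(profiler, "profiler must not be null");
  }

  /** Placeholder for a query rewrite; records the elapsed time on the profiler. */
  String rewrite(String query) {
    long start = System.nanoTime();
    try {
      return query.trim().toLowerCase(Locale.ROOT);
    } finally {
      // Clamp to 1ns so a very fast call still registers as non-zero time.
      profiler.addRewriteTime(Math.max(1L, System.nanoTime() - start));
    }
  }

  SketchProfiler getProfiler() {
    return profiler;
  }
}
```

With this shape, a caller would create one `SketchProfileSearcher` per profiled query and read the timings from `getProfiler()` afterwards, which matches the point above that searchers are cheap to create.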
lucene/sandbox/src/test/org/apache/lucene/sandbox/queries/profile/TestProfileQuery.java
    MatcherAssert.assertThat(rewriteTime, greaterThan(0L));
  }

  public void testCollector() throws IOException {
It'd be nice to have one test showing how the collector would actually be used in a search, through something like IndexSearcher#search(Query query, Collector results).
@jpountz @jtibshirani Thank you for the feedback! I will address these soon.
Just a note that these comments are still valid and not really outdated. I changed packages and renamed some of the files so it's no longer referencing the correct path.
I left some minor comments around visibility but otherwise this looks good to me!
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfiler.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerBreakdown.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerCollector.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerCollectorResult.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerCollectorWrapper.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerResult.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerScorer.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerTimer.java
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerWeight.java
@jdconrad asked that I finish off this work as he'll be away for several weeks. I pushed a few changes:
I'll plan to merge soon unless there are more comments.
This change adds new IndexSearcher and Collector implementations to profile search execution and break down the timings. The breakdown includes the total time spent in each of the following categories along with the number of times visited: create weight, build scorer, next doc, advance, score, match. Co-authored-by: Julie Tibshirani <julietibs@gmail.com>
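As a rough sketch of what such a breakdown could look like (not the classes added by this change), the following keeps a visit count and a cumulative time for each of the six categories named above. `SketchTimingType`, `SketchBreakdown`, and the injectable clock are assumptions made for illustration; the clock parameter exists only so the behaviour can be checked deterministically.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** The six categories listed in the commit message; the enum itself is hypothetical. */
enum SketchTimingType { CREATE_WEIGHT, BUILD_SCORER, NEXT_DOC, ADVANCE, SCORE, MATCH }

/** Hypothetical per-query breakdown: visit count and total nanoseconds per category. */
final class SketchBreakdown {
  private final long[] counts = new long[SketchTimingType.values().length];
  private final long[] totals = new long[SketchTimingType.values().length];
  private final LongSupplier clock;

  SketchBreakdown() {
    this(System::nanoTime);
  }

  /** The clock is injectable so tests can supply a fake time source. */
  SketchBreakdown(LongSupplier clock) {
    this.clock = clock;
  }

  /** Runs the work, then adds one visit and the elapsed time to the given category. */
  <T> T time(SketchTimingType type, Supplier<T> work) {
    long start = clock.getAsLong();
    try {
      return work.get();
    } finally {
      counts[type.ordinal()]++;
      totals[type.ordinal()] += clock.getAsLong() - start;
    }
  }

  long count(SketchTimingType type) {
    return counts[type.ordinal()];
  }

  long totalNanos(SketchTimingType type) {
    return totals[type.ordinal()];
  }
}
```

A wrapping scorer or weight would route each operation through `time(...)` with the matching category, so the totals can be reported per query node once the search finishes.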
The profile functionality relies on two custom collector implementations that wrap a given collector to monitor its execution time and expose the profile results tree. This functionality was contributed to Lucene in apache/lucene#144, so we can start relying on what Lucene offers for it. We keep our own collector, which extends ProfilerCollector from Lucene and exposes profile results that extend ProfilerCollectorResult and can be serialized over the wire as well as to xcontent.
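The wrapping pattern described here can be sketched roughly as follows. This is not the Lucene or Elasticsearch code: `SketchCollector`, `SketchProfilingCollector`, and `SketchCollectorResult` are hypothetical, dependency-free stand-ins that only show a delegate being timed and its result exposed.

```java
import java.util.function.LongSupplier;

/** Hypothetical, heavily simplified collector interface. */
interface SketchCollector {
  void collect(int doc);
}

/** Hypothetical immutable result: collector name, cumulative time, and call count. */
final class SketchCollectorResult {
  final String name;
  final long timeNanos;
  final long calls;

  SketchCollectorResult(String name, long timeNanos, long calls) {
    this.name = name;
    this.timeNanos = timeNanos;
    this.calls = calls;
  }
}

/** Wraps a delegate collector and measures the time spent inside collect(). */
final class SketchProfilingCollector implements SketchCollector {
  private final String name;
  private final SketchCollector delegate;
  private final LongSupplier clock;
  private long timeNanos;
  private long calls;

  SketchProfilingCollector(String name, SketchCollector delegate, LongSupplier clock) {
    this.name = name;
    this.delegate = delegate;
    this.clock = clock;
  }

  @Override
  public void collect(int doc) {
    long start = clock.getAsLong();
    try {
      delegate.collect(doc); // behaviour is unchanged; only timing is added
    } finally {
      timeNanos += clock.getAsLong() - start;
      calls++;
    }
  }

  /** Snapshot of the timings gathered so far. */
  SketchCollectorResult getResult() {
    return new SketchCollectorResult(name, timeNanos, calls);
  }
}
```

A subclass or adapter on the application side could then convert `SketchCollectorResult` into whatever serializable form it needs, which is the role the extended result class plays in the description above.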
Description
Based on the discussion from this email thread we could add a set of classes to compile timings for different pieces of a query or multiple queries. This could be used to better debug changes in performance moving forward.
Solution
This change adds a set of new classes to help profile query timings. All of them have been added to the Lucene sandbox, including ProfileIndexSearcher, a simple extension of IndexSearcher that includes a QueryProfiler. The QueryProfiler records the total time spent in each of the following categories, along with the number of times each was visited: create weight, build scorer, next doc, advance, score, match.
Tests
Tests have been added to ensure that a simple query generates values within a ProfileIndexSearcher.