LUCENE-9965: Add tooling to introspect query execution time #144

jdconrad · 2021-05-18T18:15:03Z

Description

Based on the discussion in this email thread, we could add a set of classes that record timings for the different parts of a query, or of multiple queries. This could help debug performance changes going forward.

Solution

This change adds a set of new classes for profiling query timings. All of them live in the Lucene sandbox, including ProfileIndexSearcher, a simple extension of IndexSearcher that carries a QueryProfiler. The QueryProfiler records the total time spent in each of the following categories, along with the number of times each was visited:

create weight
build scorer
next doc
advance
score
match
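
As a rough illustration of the "total time plus visit count per category" idea above, here is a minimal standalone sketch. It uses only the JDK; the names TimingBreakdownSketch, TimingType, and Timer are hypothetical and are not the classes added by this PR.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch: names are illustrative, not the classes added by this PR. */
final class TimingBreakdownSketch {

  /** The six categories listed in the PR description. */
  enum TimingType { CREATE_WEIGHT, BUILD_SCORER, NEXT_DOC, ADVANCE, SCORE, MATCH }

  /** Accumulates total elapsed nanoseconds and a visit count for one category. */
  static final class Timer {
    private long totalNanos;
    private long count;
    private long startNanos;

    void start() {
      startNanos = System.nanoTime();
    }

    void stop() {
      // Record at least 1ns so a visited category never reports zero time
      // (a choice made for this sketch).
      totalNanos += Math.max(1L, System.nanoTime() - startNanos);
      count++;
    }

    long totalNanos() {
      return totalNanos;
    }

    long count() {
      return count;
    }
  }

  private final Map<TimingType, Timer> timers = new EnumMap<>(TimingType.class);

  TimingBreakdownSketch() {
    for (TimingType type : TimingType.values()) {
      timers.put(type, new Timer());
    }
  }

  Timer timer(TimingType type) {
    return timers.get(type);
  }

  public static void main(String[] args) {
    TimingBreakdownSketch breakdown = new TimingBreakdownSketch();
    Timer nextDoc = breakdown.timer(TimingType.NEXT_DOC);
    for (int i = 0; i < 3; i++) {
      nextDoc.start();
      // ... the work being measured (e.g. advancing an iterator) would run here ...
      nextDoc.stop();
    }
    for (TimingType type : TimingType.values()) {
      Timer t = breakdown.timer(type);
      System.out.println(type + ": " + t.totalNanos() + " ns over " + t.count() + " visits");
    }
  }
}
```

In a real profiler, the start/stop pairs would presumably sit inside wrappers around the weight and scorer so that each call is attributed to the right category.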

Tests

Tests have been added to ensure that a simple query generates values within a ProfileIndexSearcher.

Checklist

Please review the following and check all that apply:

I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
I have created a Jira issue and added the issue ID to my pull request title.
I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
I have developed this patch against the main branch.
I have run ./gradlew check.
I have added tests for my changes.

.../sandbox/src/java/org/apache/lucene/sandbox/queries/profile/AbstractInternalProfileTree.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/profile/AbstractProfiler.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/profile/ProfileResult.java

jpountz · 2021-05-19T14:48:08Z

lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/profile/ProfileIndexSearcher.java

+/**
+ * A simple extension of {@link IndexSearcher} to add a {@link QueryProfiler} that can be set to
+ * test query timings.
+ */


Add an example of how it may be used in the javadocs?

lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/profile/AbstractProfileBreakdown.java

jtibshirani

This looks like a good direction to me, just added a few suggestions.

.../sandbox/src/java/org/apache/lucene/sandbox/queries/profile/AbstractInternalProfileTree.java

jtibshirani · 2021-05-19T22:35:25Z

lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/profile/ProfileIndexSearcher.java

+    super(reader);
+  }
+
+  public void setProfiler(QueryProfiler profiler) {


Maybe we could add QueryProfiler as a constructor parameter and ensure it's always non-null. That'd simplify this class a bit, and make it clear that it's always meant to be used for profiling.

We could even take this further and remove the QueryProfiler class, folding its logic into ProfileIndexSearcher. It doesn't seem like a helpful abstraction on its own?

I like this idea a lot, as fewer classes is ideal in this case. My question here is: do we want to have to create a new IndexSearcher for each query?

I don't have a strong intuition here -- maybe @jpountz has an opinion?

Yes, it's totally fine. IndexReaders are costly to create, but IndexSearchers are very cheap.
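
To make the trade-off in this thread concrete, here is a small JDK-only sketch of the constructor-injection variant: the profiler is required and non-null, and a fresh searcher is created per query over a shared reader. Reader, Profiler, and ProfilingSearcher are hypothetical stand-ins, not Lucene types, and the sketch only assumes what the comments above state (readers are costly, searchers are cheap).

```java
import java.util.Objects;

/** Hypothetical sketch: stand-in types, not Lucene's IndexReader or IndexSearcher. */
final class ConstructorInjectionSketch {

  /** Stand-in for the expensive resource that is created once and shared. */
  static final class Reader {
    final String name;

    Reader(String name) {
      this.name = name;
    }
  }

  /** Stand-in for per-query profile state. */
  static final class Profiler {
    private int queriesSeen;

    void recordQuery() {
      queriesSeen++;
    }

    int queriesSeen() {
      return queriesSeen;
    }
  }

  /** Cheap per-query object; the profiler is fixed at construction and never null. */
  static final class ProfilingSearcher {
    private final Reader reader;
    private final Profiler profiler;

    ProfilingSearcher(Reader reader, Profiler profiler) {
      this.reader = Objects.requireNonNull(reader, "reader");
      this.profiler = Objects.requireNonNull(profiler, "profiler");
    }

    void search(String query) {
      // No null check needed here: the constructor guarantees a profiler exists.
      profiler.recordQuery();
    }

    Profiler profiler() {
      return profiler;
    }
  }

  public static void main(String[] args) {
    Reader shared = new Reader("index");
    // One searcher (and one profiler) per query, all over the same reader.
    ProfilingSearcher first = new ProfilingSearcher(shared, new Profiler());
    ProfilingSearcher second = new ProfilingSearcher(shared, new Profiler());
    first.search("a");
    second.search("b");
    System.out.println(first.profiler().queriesSeen() + " " + second.profiler().queriesSeen());
  }
}
```

Because each searcher owns its profile state, results from different queries cannot mix, which is the property a setter-based design would have to enforce by convention.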

lucene/sandbox/src/test/org/apache/lucene/sandbox/queries/profile/TestProfileQuery.java

jtibshirani · 2021-05-19T22:47:06Z

lucene/sandbox/src/test/org/apache/lucene/sandbox/queries/profile/TestProfileQuery.java

+    MatcherAssert.assertThat(rewriteTime, greaterThan(0L));
+  }
+
+  public void testCollector() throws IOException {


It'd be nice to have one test showing how the collector would actually be used in a search, through something like IndexSearcher#search(Query query, Collector results).

jdconrad · 2021-05-20T14:02:59Z

@jpountz @jtibshirani Thank you for the feedback! I will address these soon.

jdconrad · 2021-05-20T20:25:31Z

Just a note that these comments are still valid and not really outdated. I changed packages and renamed some of the files so it's no longer referencing the correct path.

jpountz

I left some minor comments around visibility but otherwise this looks good to me!

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfiler.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerBreakdown.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerCollector.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerCollectorResult.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerCollectorWrapper.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerResult.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerScorer.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerTimer.java

lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerWeight.java

jtibshirani · 2021-06-09T17:56:30Z

@jdconrad asked that I finish off this work as he'll be away for several weeks. I pushed a few changes:

Adjust class visibility and other small clean-ups
Fold QueryProfiler into QueryProfilerIndexSearcher
Rename profiler collectors and create dedicated test

I'll plan to merge soon unless there are more comments.

This change adds new IndexSearcher and Collector implementations to profile search execution and break down the timings. The breakdown includes the total time spent in each of the following categories along with the number of times visited: create weight, build scorer, next doc, advance, score, match.

Co-authored-by: Julie Tibshirani <julietibs@gmail.com>

The profile functionality relies on two custom collector implementations that wrap a given collector to monitor its execution time and expose the profile results tree. This functionality has been contributed to Lucene in apache/lucene#144, so we can start migrating to what Lucene offers. We keep our own collector, which extends Lucene's ProfilerCollector and exposes profile results that extend ProfilerCollectorResult and can be serialized over the wire as well as to xcontent.
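
The wrapping approach described above can be sketched without Lucene as follows. This is only an illustration of the pattern (time the delegate, then expose a tree of results); DocCollector, ProfilingCollector, and Result are hypothetical names and do not reflect the actual ProfilerCollector API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a timing wrapper; not Lucene's ProfilerCollector. */
final class CollectorWrapperSketch {

  /** Stand-in for a collector that receives matching doc IDs. */
  interface DocCollector {
    void collect(int doc);
  }

  /** Immutable node of the profile results tree. */
  static final class Result {
    final String name;
    final long timeNanos;
    final List<Result> children;

    Result(String name, long timeNanos, List<Result> children) {
      this.name = name;
      this.timeNanos = timeNanos;
      this.children = children;
    }
  }

  /** Wraps a delegate, accumulating the time spent inside its collect calls. */
  static final class ProfilingCollector implements DocCollector {
    private final String name;
    private final DocCollector delegate;
    private final List<ProfilingCollector> children;
    private long timeNanos;

    ProfilingCollector(String name, DocCollector delegate, List<ProfilingCollector> children) {
      this.name = name;
      this.delegate = delegate;
      this.children = children;
    }

    @Override
    public void collect(int doc) {
      long start = System.nanoTime();
      try {
        delegate.collect(doc);
      } finally {
        // Count the time even if the delegate throws.
        timeNanos += Math.max(1L, System.nanoTime() - start);
      }
    }

    /** Builds the results tree from this wrapper and any wrapped children. */
    Result result() {
      List<Result> childResults = new ArrayList<>();
      for (ProfilingCollector child : children) {
        childResults.add(child.result());
      }
      return new Result(name, timeNanos, childResults);
    }
  }

  public static void main(String[] args) {
    List<Integer> seen = new ArrayList<>();
    ProfilingCollector profiled = new ProfilingCollector("list", seen::add, new ArrayList<>());
    for (int doc = 0; doc < 3; doc++) {
      profiled.collect(doc);
    }
    Result result = profiled.result();
    System.out.println(result.name + " took " + result.timeNanos + " ns for " + seen.size() + " docs");
  }
}
```

A downstream project could then subclass or adapt the result type to add its own serialization, which is the role the text assigns to the collector kept on the Elasticsearch side.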

jdconrad added 4 commits May 17, 2021 14:40

add a set of classes to profile query performance

86dae80

clean up JavaDoc

772b6d4

add package info

855fb36

Merge branch 'main' into profile

2a3602f

jpountz reviewed May 19, 2021


jtibshirani reviewed May 19, 2021


jdconrad added 2 commits May 20, 2021 09:21

Merge branch 'main' into profile

4472d1f

removed several unnecessary abstractions

3c0dda2

Adjust class visibility

aa36520

jpountz reviewed Jun 9, 2021


jtibshirani added 6 commits June 9, 2021 09:29

More clean-up

cd6df17

Fold QueryProfiler into QueryProfilerIndexSearcher

d9f6c57

Refactor collector profiler

9bf347e

Fix formatting

fccdc50

Merge remote-tracking branch 'upstream/main' into profile

7bc417a

Add changelog entry.

17ad7c0

jpountz approved these changes Jun 9, 2021


jtibshirani merged commit 40f66a4 into apache:main Jun 9, 2021

javanna mentioned this pull request Apr 25, 2023

Switch to Lucene ProfilerCollector elastic/elasticsearch#95526

Merged

jainankitk mentioned this pull request Mar 19, 2025

Handling concurrent search in QueryProfiler #14375

Open
