Use ProfileCollectorManager in DfsPhase #96689

cbuescher · 2023-06-08T07:50:12Z

This is a draft that demonstrates how we might introduce a profiler wrapper for other parallelizeable collector managers.

javanna · 2023-06-08T09:14:20Z

server/src/main/java/org/elasticsearch/search/profile/query/MultiProfileCollectorManager.java

+    public Collector newCollector() throws IOException {
+        Collector collector = collectorManager.newCollector();
+        wrappedCollectors.add(collector);
+        InternalProfileCollector internalProfileCollector = new InternalProfileCollector(collector, CollectorResult.REASON_SEARCH_TOP_HITS);


don't we need to provide the child profile collectors in some cases? That is what I am struggling with in #96622

Yes, thats one thing I struggled with. I'm currently only trying to plug this in the DFSPhase above where I can work with the return type from reduce(), but we already talked about it that we probably don't want to rely on it. As you see I already keep references to both the wrapped and the original collectors for the reduce() calls here. Maybe overkill, I will see where this goes, this draft needs some more tlc anyway.

So, I did some more thinking on this and I think we have two options that depend on how we want to shape the profile output for the concurrent execution:

simplest: expose overall timing for each collector manager, it mostly replaces collector with collector manager but that's it. We would just need to reduce the timing info from the different profile collector, sum it up, and off we go

detailed: expose collectors, including details about concurrent execution. The challenge here is to rebuild the hierarchy of the collectors,we'd have to be able to extract children collectors when newCollector is called.

I am thinking 2) would be great to have, but we should not block on it if it takes too much effort.

server/src/main/java/org/elasticsearch/search/profile/query/MultiProfileCollectorManager.java

cbuescher · 2023-06-08T22:21:44Z

@javanna I added some changes and tests, will see if this breaks anything else. The profiler output in this iteration places the output for each individual collector (one per newCollector() call to the wrapped collector manager) under one parent collector result entry. That currently hast "ProfileCollectorManager" as its name, indicating its a composite collector, and the "reason" field is called "search_top_hits_max" for now since it contains the longest time of the nested collector results. Let me know what you think.

javanna

We need to add the tracking of child profile collectors, but given it's not needed in the dfs phase, I think that this is a great step forward. I left a couple of questions but I am onboard with the overall direction.

server/src/main/java/org/elasticsearch/search/dfs/DfsPhase.java

server/src/main/java/org/elasticsearch/search/profile/query/ProfileCollectorManager.java

javanna · 2023-06-12T11:23:38Z

server/src/main/java/org/elasticsearch/search/profile/query/ProfileCollectorManager.java

+        }
+        this.collectorTree = new CollectorResult(
+            this.getClass().getSimpleName(),
+            CollectorResult.REASON_SEARCH_TOP_HITS + "_max",


what is the rationale of tracking max? What are our options? I was wondering if we want to output the full tree structure of the collector manager, for each slice collector.

Also the reason should be a constructor argument?

what is the rationale of tracking max? What are our options?

I thought "max" is most natural. The earlier child profilers (one per thread) should account for "search_top_hits" time spent in each thread. Since they run simultaneously, the maximum value should be an upper bound for the threads as a whole. Alternatives would be to use "sum" I guess, but that would hide the benefits of the parallel execution and appear longer than execution actually is?

Also the reason should be a constructor argument?

I don't fully understand the question in this context

Reason: it is hardcoded, which works well for dfs, but other phases will need to provide it, I thought.

As for what to output, I am thinking that we should not aggregate the output of the different collectors together, but rather expose them as they are, instead of having one collector with children, in a tree structure, we'll have a collector manager, which holds a list of collectors each with children, in a tree structure. How does that sound?

javanna · 2023-06-12T11:28:23Z

server/src/test/java/org/elasticsearch/search/profile/query/ProfileCollectorManagerTests.java

+import java.util.Collections;
+import java.util.List;
+
+public class ProfileCollectorManagerTests extends ESTestCase {


would it make sense to have a test that simulates profiling a search with and without concurrency? if you call newIndexSearcher it will randomize the slices and randomly set the executor

Good idea, what should be the goal of that test? Would we want to e.g. prep the searcher with data and then see if all results are there (in this case the top docs?) The profiling times wouldn't make very much sense to check except that they are there.

I would test the output of profiling, timing per se does not make sense to check indeed.

elasticsearchmachine · 2023-06-13T08:53:30Z

Hi @cbuescher, I've created a changelog YAML for you.

elasticsearchmachine · 2023-06-13T08:53:30Z

Pinging @elastic/es-search (Team:Search)

javanna · 2023-06-14T07:54:49Z

@cbuescher I have integrated your work, plus the child collectors in my draft PR. See https://github.com/elastic/elasticsearch/pull/96622/files#diff-d584ed5c563e96a2f3c821072e5039bb0543ab7f8b94d76a19ca894dde79abd0 . Can you double check it and pull the changes into your PR? I don't expect mine to go in, but rather to take the different pieces and open separate PRs for each one.

cbuescher · 2023-06-14T10:26:33Z

I forked of #96828 for introduction of the new collector manager and will update this PR to
only concentrate on including it in the DfsPhase once the former is in.

cbuescher · 2023-06-22T08:17:35Z

I think the profiler code path isn't quite multi-thread ready just yet. The test fails when trying to start a Timer instance for QueryTimingType.BUILD_SCORER twice without stopping it.

@javanna I was able to solve the problem with concurrent timer access with some modifications to AbstractProfileBreakdown in 98c25a9 yesterday. The multithreaded test scenario for the DfsPhase passes now.
What this essentially does is to allow to keep a list of Timer instances per timing type (somewhat similar to what collector managers do for collectors). The final timing that is returned for the profiler output is the sum of all individual timers in this case. The main discussion whether to use sum, max or even be more verbose with the output is the same here as with the CollectorResults, but for e.g. the query profiler breakdown I didn't want to introduce a new output format because in contrast to the ColletorResult output there is no notion of "children" here in the current output format yet. Thats why I opted to sum the different timer measurements here.
Since the change in 98c25a9 is a bit more involved, it probably also makes sense to split it of into a separate PR and discuss details there, wdyt?

cbuescher · 2023-06-22T13:09:26Z

I split the profile breakdown changes to #97013 for easier review.

This changes AbstractProfileBreakdown to enable its use in multi-threaded contexts. It allows to keep a list of Timer instances per timing type. This is somewhat similar to what collector managers do for collectors. The final timing that is returned for the profiler output is the sum of all individual timers. Relates to #96689

cbuescher · 2023-06-29T10:25:18Z

@javanna I changed the ProfileCollectorManager output slightly again. This is how the output looks like now for the case where the manager only ever outputs one collector (the single threaded case), This stays as is:

{
	"name": "SimpleTopScoreDocCollector",
	"reason": "search_top_hits",
	"time_in_nanos": 527542
}

And this is what it would look like for more than one collector:

{
	"name": "SimpleTopScoreDocCollector_Manager",
	"reason": "search_top_hits",
	"time_in_nanos": 901935,
	"children": [{
		"name": "SimpleTopScoreDocCollector",
		"reason": "search_top_hits",
		"time_in_nanos": 482237
	}, {
		"name": "SimpleTopScoreDocCollector",
		"reason": "search_top_hits",
		"time_in_nanos": 419698
	}]
}

There are some changes I'm still considering worth discussing:

dropping all the children and simply leave that level of verbosity. Completely hide the fact that we have used multiple collectors in a potentially concurrent manner. That would mean simply use the name of the first wrapped collector and report the sum of all timings
change the name of the parent collector so something different. Maybe just also use the name of the first collector.

cbuescher · 2023-06-30T11:09:00Z

dropping all the children and simply leave that level of verbosity.

After some more discussion with @javanna thats what we want to go with. Profile output is quite verbose already, and we don't really know how if different collectors are actually executed in parallel. If we need this level of granularity in the future we will look into it then.
This should make the ProfileCollectorManager output transparent to the non-parallel case, so no special treatment in tests is needed as well.
I pushed the necessary changes.

javanna

I am good with the approach. There's a couple of improvements that I suggested but they can be worked on as follow-up tasks

javanna · 2023-06-30T13:50:14Z

server/src/main/java/org/elasticsearch/search/dfs/DfsPhase.java

+        );
+
+        if (profilers != null) {
+            ProfileCollectorManager ipcm = new ProfileCollectorManager(cm, CollectorResult.REASON_SEARCH_TOP_HITS);


Would you like to play, as a follow-up, to make the profile collector manager typed, so that it can return the same result as the collector manager it wraps? That way its reduce could reduce the profile outputs, but also return the top docs, and you would not need that adapter? I think that would help me as well later. have you tried this already and maybe hit blockers?

javanna · 2023-06-30T13:51:41Z

server/src/test/java/org/elasticsearch/search/dfs/DfsPhaseTests.java

+        terminate(threadPool);
+    }
+
+    public void testSingleKnnSearch() throws IOException {


Thanks for adding this test! Do we need to test this specific method or could we instead test executeKnnVectorQuery directly? Maybe as a follow-up we can unit test also the terms statistics collection? we should have totally written these tests 10 years ago but it seems like we haven't :)

javanna · 2023-06-30T13:52:06Z

server/src/test/java/org/elasticsearch/search/dfs/DfsPhaseTests.java

+
+            // run with profiling enabled
+            Profilers profilers = new Profilers(searcher);
+            dfsKnnResults = DfsPhase.singleKnnSearch(query, k, profilers, searcher);


should we have two tests, one with profiling and one without?

javanna · 2023-06-30T20:54:52Z

server/src/main/java/org/elasticsearch/search/profile/query/ProfileCollectorManager.java

+        long totalTime = resultsPerProfiler.stream().map(CollectorResult::getTime).reduce(0L, Long::sum);
+        String collectorName;
+        if (resultsPerProfiler.size() == 0) {
+            // in case no new collector was ever requested, create a new one just to get the name.


I think we could assert that this never happens. The contract is that if reduce is called, newCollector was called at least once?

javanna · 2023-06-30T20:56:31Z

server/src/main/java/org/elasticsearch/search/dfs/DfsPhase.java

+            // Set the current searcher profiler to gather query profiling information for gathering top K docs
+            searcher.setProfiler(knnProfiler);
+        }
+        TopDocs topDocs = searcher.search(knnQuery, cm);


This looks better than before. I think that you may need to update existing usages of the profile collector manager to address warnings around their unchecked usage now that it's typed?

WIP introduce MultiProfileCollectorManager

9a410bc

elasticsearchmachine added the v8.9.0 label Jun 8, 2023

javanna reviewed Jun 8, 2023

View reviewed changes

server/src/main/java/org/elasticsearch/search/profile/query/MultiProfileCollectorManager.java Outdated Show resolved Hide resolved

Iter, adding tests

4b94099

cbuescher changed the title ~~WIP introduce MultiProfileCollectorManager~~ WIP introduce ProfileCollectorManager Jun 8, 2023

javanna reviewed Jun 12, 2023

View reviewed changes

iter

14797fd

cbuescher force-pushed the dfs-phase-multithreadedcollectors-wip branch from b0c851f to 14797fd Compare June 12, 2023 15:59

cbuescher added 2 commits June 12, 2023 22:09

fix docs test

c34a877

Merge branch 'main' into dfs-phase-multithreadedcollectors-wip

82f8242

cbuescher changed the title ~~WIP introduce ProfileCollectorManager~~ Introduce ProfileCollectorManager Jun 13, 2023

cbuescher marked this pull request as ready for review June 13, 2023 08:49

elasticsearchmachine added the needs:triage Requires assignment of a team area label label Jun 13, 2023

cbuescher added :Search/Search Search-related issues that do not fall into other categories >enhancement and removed needs:triage Requires assignment of a team area label labels Jun 13, 2023

elasticsearchmachine added the Team:Search Meta label for search team label Jun 13, 2023

Update docs/changelog/96689.yaml

7c66bdc

cbuescher mentioned this pull request Jun 14, 2023

Introduce ProfileCollectorManager #96828

Merged

cbuescher changed the title ~~Introduce ProfileCollectorManager~~ Introduce ProfileCollectorManager in DfsPhase Jun 14, 2023

cbuescher changed the title ~~Introduce ProfileCollectorManager in DfsPhase~~ Use ProfileCollectorManager in DfsPhase Jun 14, 2023

Merge branch 'main' into dfs-phase-multithreadedcollectors-wip

55a172c

cbuescher force-pushed the dfs-phase-multithreadedcollectors-wip branch from c1e97ac to 3220699 Compare June 15, 2023 21:31

wip dfsPhase test

c77664c

cbuescher force-pushed the dfs-phase-multithreadedcollectors-wip branch from 77afecb to c5e5be4 Compare June 21, 2023 22:27

pugnascotia added v8.10.0 and removed v8.9.0 labels Jun 22, 2023

cbuescher mentioned this pull request Jun 22, 2023

Manage multiple timers per type in AbstractProfileBreakdown #97013

Merged

cbuescher added 3 commits June 26, 2023 17:34

Merge branch 'main' into dfs-phase-multithreadedcollectors-wip

dd1b242

multiple timers PR 97013 squashed

b366795

update skip version in yaml test

20537e5

cbuescher added 6 commits June 27, 2023 15:11

Merge branch 'main' into dfs-phase-multithreadedcollectors-wip

22aae35

cleanup

971fb58

small test changes

200e6e2

Changes to CollectorResult output

8fa91bf

spotless

39ea2ee

fixing tests

748dbb7

cbuescher force-pushed the dfs-phase-multithreadedcollectors-wip branch from de19a00 to 748dbb7 Compare June 29, 2023 10:18

cbuescher added 2 commits June 30, 2023 13:04

Don't report detailed collectors per manager

33187c2

Merge branch 'main' into dfs-phase-multithreadedcollectors-wip

5ee6598

javanna approved these changes Jun 30, 2023

View reviewed changes

make the profile collector manager typed

e141a97

javanna reviewed Jun 30, 2023

View reviewed changes

cbuescher added 2 commits June 30, 2023 23:17

assert newCollector has been called

0cdd332

iter test

0f6958c

cbuescher merged commit 5eeaecd into elastic:main Jul 3, 2023
12 checks passed

cbuescher deleted the dfs-phase-multithreadedcollectors-wip branch July 3, 2023 11:46

saikatsarkar056 mentioned this pull request Nov 6, 2023

Enhance stats for KNN Search in profile API #99352

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use ProfileCollectorManager in DfsPhase #96689

Use ProfileCollectorManager in DfsPhase #96689

cbuescher commented Jun 8, 2023

javanna Jun 8, 2023

cbuescher Jun 8, 2023

javanna Jun 8, 2023

cbuescher commented Jun 8, 2023

javanna left a comment

javanna Jun 12, 2023

javanna Jun 12, 2023

cbuescher Jun 12, 2023 •

edited

Loading

javanna Jun 13, 2023

javanna Jun 12, 2023

cbuescher Jun 12, 2023

javanna Jun 13, 2023

elasticsearchmachine commented Jun 13, 2023

elasticsearchmachine commented Jun 13, 2023

javanna commented Jun 14, 2023

cbuescher commented Jun 14, 2023

cbuescher commented Jun 22, 2023

cbuescher commented Jun 22, 2023

cbuescher commented Jun 29, 2023

cbuescher commented Jun 30, 2023

javanna left a comment

javanna Jun 30, 2023

javanna Jun 30, 2023

javanna Jun 30, 2023

javanna Jun 30, 2023

javanna Jun 30, 2023

Use ProfileCollectorManager in DfsPhase #96689

Use ProfileCollectorManager in DfsPhase #96689

Conversation

cbuescher commented Jun 8, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cbuescher commented Jun 8, 2023

javanna left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cbuescher Jun 12, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

elasticsearchmachine commented Jun 13, 2023

elasticsearchmachine commented Jun 13, 2023

javanna commented Jun 14, 2023

cbuescher commented Jun 14, 2023

cbuescher commented Jun 22, 2023

cbuescher commented Jun 22, 2023

cbuescher commented Jun 29, 2023

cbuescher commented Jun 30, 2023

javanna left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cbuescher Jun 12, 2023 •

edited

Loading