Add an option to SearchQuery to choose a search query execution strategy #3792

jihoonson · 2016-12-20T19:04:57Z

This PR is related to #3775

Supported strategies are:

1) Index-only query execution
2) Cursor-based scan
3) Auto: choose an efficient strategy for a given query

jihoonson · 2016-12-20T19:18:01Z

Here are some JMH benchmark results (SearchBenchmark.java) that I used to choose appropriate thresholds for selecting a good strategy. The benchmarks were run on my local machine.

[Figure: operation time with varying filter selectivity]

[Figure: operation time with varying search dimension cardinality]

fjy · 2016-12-20T23:13:15Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

@@ -71,6 +73,12 @@
 public class SearchQueryRunner implements QueryRunner<Result<SearchResultValue>>
 {
  private static final EmittingLogger log = new EmittingLogger(SearchQueryRunner.class);
+
+  private static final double HIGH_FILTER_SELECTIVITY_THRESHOLD_FOR_CONCISE = 0.99;


can the strategies themselves return these constants? This way if we add additional bitmap algorithms we can just add new strategies

Actually reading more over this code, I think this might belong in the bitmap factories

@gianm thoughts?

I don't think these should be in the bitmap factories, since the thresholds are experimentally determined and specific to the Search Query algorithm. I doubt anything else would find them useful, so IMO, they belong in the Search Query.

I added SearchQueryDecisionHelper which provides such information.

fjy · 2016-12-20T23:15:23Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

-              }
-              if (retVal.size() >= limit) {
-                return makeReturnResult(limit, retVal);
+      switch (strategy) {


instead of having a switch statement here, can we instead have the strategy return the execution strategy?

return strategy.getExecutionPlan();
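
A minimal sketch of what this suggestion could look like, assuming each strategy builds its own executor so the runner needs no switch statement. Every type and method name below (`StrategyDispatchSketch`, `Executor`, `Strategy`, `IndexOnly`, `CursorBased`, `run`) is hypothetical and is not taken from the patch.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these types are the PR's actual classes.
final class StrategyDispatchSketch {
  /** Whatever actually runs the search against a segment. */
  interface Executor {
    List<String> execute();
  }

  /** Each strategy knows how to build its own executor. */
  interface Strategy {
    Executor getExecutionPlan();
  }

  static final class IndexOnly implements Strategy {
    @Override
    public Executor getExecutionPlan() {
      return () -> Collections.singletonList("index-only");
    }
  }

  static final class CursorBased implements Strategy {
    @Override
    public Executor getExecutionPlan() {
      return () -> Collections.singletonList("cursor-based");
    }
  }

  /** The runner stays unchanged no matter how many strategies exist. */
  static List<String> run(Strategy strategy) {
    return strategy.getExecutionPlan().execute();
  }
}
```

With this shape, adding a new strategy means adding a new class rather than a new switch branch in the runner.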

fjy · 2016-12-20T23:16:42Z

processing/src/main/java/io/druid/query/search/search/SearchQuery.java

@@ -49,6 +49,14 @@
  private final List<DimensionSpec> dimensions;
  private final SearchQuerySpec querySpec;
  private final int limit;
+  private final Strategy strategy;
+
+  public enum Strategy


i wonder if SearchStrategy should be an interface that we extend

fjy · 2016-12-20T23:19:45Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

-      throw new ISE(
-          "Null storage adapter found. Probably trying to issue a query against a segment being memory unmapped."
-      );
+  private static boolean isLowCardinality(final BitmapFactory bitmapFactory, final long totalCard) {


i think we should have a method on bitmapFactory that returns whether it's a low or high cardinality bitmap

Moved to SearchQueryDecisionHelper

I disagree, I think this method works better in the SearchQuery since the meaning of "low" and "high" is very subjective and, in this case, determined experimentally for a specific query. It's possible that for a different query, different thresholds would be needed.

fjy · 2016-12-20T23:20:13Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

-    final Iterable<DimensionSpec> dimsToSearch;
-    if (dimensions == null || dimensions.isEmpty()) {
-      dimsToSearch = Iterables.transform(adapter.getAvailableDimensions(), Druids.DIMENSION_IDENTITY);
+  private static boolean isHighSelectivityFilter(final QueryableIndex index,


same here, the bitmap should tell us this information

Moved to SearchQueryDecisionHelper

fjy · 2016-12-20T23:22:12Z

processing/src/main/java/io/druid/query/search/search/SearchQuery.java

@@ -254,6 +275,10 @@ public boolean equals(Object o)
      return false;
    }

+    if (!strategy.equals(that.strategy)) {


what about hashcode?

Hash code is already done here.
https://github.com/druid-io/druid/pull/3792/files/f8d5c5ee8cb9d536017f6aaa538dbdc21f2ff9d3#diff-0cd94720b9f3bbe8caa4e047dc3bb5e5R295
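
For context, the concern behind the question is that a field compared in equals should also feed hashCode, otherwise equal objects can land in different hash buckets. A minimal sketch under that assumption follows; `QueryKeySketch` and its fields are hypothetical stand-ins, not the real SearchQuery.

```java
import java.util.Objects;

// Hypothetical stand-in for a query object with a strategy field.
final class QueryKeySketch {
  private final int limit;
  private final String strategy;

  QueryKeySketch(int limit, String strategy) {
    this.limit = limit;
    this.strategy = strategy;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof QueryKeySketch)) {
      return false;
    }
    QueryKeySketch that = (QueryKeySketch) o;
    // Every field compared here also appears in hashCode below.
    return limit == that.limit && Objects.equals(strategy, that.strategy);
  }

  @Override
  public int hashCode() {
    return Objects.hash(limit, strategy);
  }
}
```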

jihoonson · 2016-12-21T17:25:02Z

@fjy thank you for your review. I updated my patch.

gianm

This is a partial review since I started before @jihoonson's recent refactoring. Will review the new version again.

gianm · 2016-12-21T15:30:14Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

@@ -71,6 +73,12 @@
 public class SearchQueryRunner implements QueryRunner<Result<SearchResultValue>>
 {
  private static final EmittingLogger log = new EmittingLogger(SearchQueryRunner.class);
+
+  private static final double HIGH_FILTER_SELECTIVITY_THRESHOLD_FOR_CONCISE = 0.99;


I don't think these should be in the bitmap factories, since the thresholds are experimentally determined and specific to the Search Query algorithm. I doubt anything else would find them useful, so IMO, they belong in the Search Query.

gianm · 2016-12-21T15:33:48Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

@@ -94,6 +102,7 @@ public SearchQueryRunner(Segment segment)
    final SearchQuerySpec searchQuerySpec = query.getQuery();
    final int limit = query.getLimit();
    final boolean descending = query.isDescending();
+    final Strategy strategy = query.getStrategy();


Usually we do tunings like this as "context" flags rather than explicitly making objects for them. So I'd do something similar here. The code would be like:

final Strategy strategy = Strategy.valueOf(query.getContextValue(CTX_KEY_STRATEGY, DEFAULT_STRATEGY).toUpperCase());

Where CTX_KEY_STRATEGY and DEFAULT_STRATEGY are string constants that have values like "searchQueryStrategy" and "auto". See the groupBy "Query context" docs for a similar example. http://druid.io/docs/latest/querying/groupbyquery.html#query-context

gianm · 2016-12-21T16:10:21Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

+                                                               final int limit,
+                                                               final BitmapFactory bitmapFactory)
+  {
+    log.info("Index only query execution strategy is selected.");


Don't log anything at info level during a query, it will be way too many logs.

gianm · 2016-12-21T17:26:16Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

-      throw new ISE(
-          "Null storage adapter found. Probably trying to issue a query against a segment being memory unmapped."
-      );
+  private static boolean isLowCardinality(final BitmapFactory bitmapFactory, final long totalCard) {


I disagree, I think this method works better in the SearchQuery since the meaning of "low" and "high" is very subjective and, in this case, determined experimentally for a specific query. It's possible that for a different query, different thresholds would be needed.

gianm

Looking good so far, left a few comments. thanks @jihoonson.

gianm · 2016-12-21T17:39:01Z

processing/src/main/java/io/druid/query/search/SearchQueryRunner.java

-          for (int i = startIndex; i <= endIndex; i++) {
-            timeBitmap.add(i);
-          }
+    final SearchStrategy strategy = query.getStrategy();


Usually we do tunings like this as "context" flags rather than explicitly making objects for them. So I'd do something similar here. The code would be like:

final SearchStrategy strategy = SearchStrategy.fromString(query.getContextValue(CTX_KEY_STRATEGY, DEFAULT_STRATEGY));

Where CTX_KEY_STRATEGY and DEFAULT_STRATEGY are string constants that have values like "searchQueryStrategy" and "auto", and SearchStrategy.fromString is something like QueryGranularity.fromString.

To me the main reason to prefer context parameters is that it allows users to add tunings from newer versions of Druid without their client needing to be aware of a new query field. Almost all Druid clients allow users to add arbitrary context parameters but not all of them allow arbitrary top-level query parameters.

See also GroupByStrategySelector and the groupBy "Query context" http://druid.io/docs/latest/querying/groupbyquery.html#query-context for how strategy selection & tuning is handled in GroupBy.
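
A rough sketch of the context-parameter approach described above, using a plain `Map` in place of the real query context. The key `"searchQueryStrategy"` and default `"auto"` are the example values from the comment; the enum, its labels, and `strategyFor` are assumptions made for illustration only.

```java
import java.util.Map;

// Hypothetical sketch; the real query object and strategy classes differ.
final class ContextStrategySketch {
  static final String CTX_KEY_STRATEGY = "searchQueryStrategy";
  static final String DEFAULT_STRATEGY = "auto";

  enum SearchStrategy {
    INDEX_ONLY("indexOnly"),
    CURSOR_BASED("cursorBased"),
    AUTO("auto");

    private final String label;

    SearchStrategy(String label) {
      this.label = label;
    }

    /** Case-insensitive lookup by the user-facing label. */
    static SearchStrategy fromString(String value) {
      for (SearchStrategy s : values()) {
        if (s.label.equalsIgnoreCase(value)) {
          return s;
        }
      }
      throw new IllegalArgumentException("Unknown search strategy: " + value);
    }
  }

  /** Reads the strategy from the context, falling back to the default when absent. */
  static SearchStrategy strategyFor(Map<String, Object> context) {
    Object raw = context == null ? null : context.get(CTX_KEY_STRATEGY);
    return SearchStrategy.fromString(raw == null ? DEFAULT_STRATEGY : raw.toString());
  }
}
```

Because the value travels in the context map, a client that only knows how to pass arbitrary context entries can still select a strategy.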

Thanks. I'll change.

gianm · 2016-12-21T17:59:23Z

processing/src/main/java/io/druid/query/search/search/AutoStrategy.java

+      if (filter == null ||
+          index.getDecisionHelper().hasLowCardinality(index, dimsToSearch) ||
+          index.getDecisionHelper().hasHighSelectivity(index, timeFilteredBitmap)) {
+        log.info("Index-only execution strategy is selected");


Don't log anything at info level during query execution, there will be too many logs. debug or trace is preferred.

gianm · 2016-12-21T18:11:17Z

processing/src/main/java/io/druid/segment/QueryableIndex.java

+  Metadata getMetadata();
+  Map<String, DimensionHandler> getDimensionHandlers();
+
+  SearchQueryDecisionHelper getDecisionHelper();


IMO this decision-making should be self-contained within the Search query rather than bleeding into the QueryableIndex interface.

My rationale is that the thresholds embodied in this decision helper are experimentally determined and specific to the Search query algorithm, and ideally nothing else should know about them. Moving this to the Search query does mean that the Search query will need to know about different kinds of bitmaps, but I think that's better than the QueryableIndex knowing about the Search query.

Agreed. How about having SearchStrategy create the DecisionHelper based on the index type? That way DecisionHelper is used only by SearchStrategy and is separated from the index.

That sounds good to me.

gianm · 2016-12-21T18:14:50Z

processing/src/main/java/io/druid/query/search/search/AutoStrategy.java

+      // Index-only strategy is selected when
+      // 1) there is no filter,
+      // 2) the total cardinality is very low, or
+      // 3) the filter has a very high selectivity.


If the filter is highly selective, doesn't that mean we want to use the cursor-based strategy?

Ah, I think we're using different definitions of "high selectivity". I think of it the same way as this guy: https://blogs.msdn.microsoft.com/bartd/2011/01/25/query-tuning-fundamentals-density-predicates-selectivity-and-cardinality/

Selectivity for a filter predicate against a base table can be calculated as “[# rows that pass the predicate]/[# rows in the table]”. If the predicate passes all rows in the table, its selectivity is 1.0. If it disqualifies all rows, its selectivity is 0. (This can be confusing. Note that 0.000001 reflects a high selectivity even though the number is small, while 1.0 is low selectivity even though the number is higher.)

So > .99 selectivity would be "very low" if the two of us are thinking about it the right way.
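
To make the definition concrete, here is a small sketch that computes selectivity as the fraction of rows passing the filter and treats values at or above a threshold as low selectivity. The class and method names are hypothetical, and the comparison direction is an assumption based on the definition quoted above.

```java
// Hypothetical helper illustrating the quoted definition of selectivity.
final class SelectivitySketch {
  /** Fraction of rows that pass the filter: 1.0 keeps every row, 0.0 keeps none. */
  static double selectivity(long rowsPassingFilter, long totalRows) {
    if (totalRows <= 0) {
      throw new IllegalArgumentException("totalRows must be positive");
    }
    return (double) rowsPassingFilter / totalRows;
  }

  /** A ratio at or above the threshold means the filter removes few rows (low selectivity). */
  static boolean hasLowSelectivity(long rowsPassingFilter, long totalRows, double threshold) {
    return selectivity(rowsPassingFilter, totalRows) >= threshold;
  }
}
```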

oh, I was confused. Thanks.

gianm · 2016-12-21T18:46:48Z

processing/src/main/java/io/druid/query/search/search/AutoStrategy.java

+        throw new IAE("Should only have one interval, got[%s]", intervals);
+      }
+      final Interval interval = intervals.get(0);
+      final ImmutableBitmap timeFilteredBitmap = IndexOnlyExecutor.makeTimeFilteredBitmap(index,


There are a couple of performance issues here:

- makeTimeFilteredBitmap isn't cheap, and if we use the index-based execution we will end up calling it twice.
- filter.getBitmapIndex isn't cheap either, and if we end up doing the cursor-based execution, it'll get called twice too (once by makeTimeFilteredBitmap and once by makeCursors).

Some possible strategies to help:

- Don't generate the timeFilteredBitmap until we actually need to call hasHighSelectivity. i.e. if the search dims are low cardinality, just go straight to the index-only strategy. And if they're very high cardinality, similar to the number of rows in the segment, then possibly go straight to the cursor-only strategy (please do some benchmarks to verify this guess though). We should only need to check the filter selectivity if search dimension cardinality is medium.
- Save the timeFilteredBitmap after generating it, and pass it to the index-based executor if we choose that one.

I think one thing we should not do is save the bitmap and pass it to makeCursors. That's probably too leaky and I would rather accept the performance hit there.
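
The two suggestions above (defer the expensive computation, then reuse its result) can be sketched roughly as follows. This assumes the expensive step is hidden behind a `Supplier`; the names `memoize` and `useIndexes`, the parameter list, and the "very high cardinality goes straight to cursor" branch are illustrative assumptions, and that last branch is the unverified guess from the review rather than a measured rule.

```java
import java.util.function.Supplier;

// Hypothetical sketch of lazy, compute-once evaluation for an expensive value.
final class LazyBitmapSketch {
  /** Wraps a supplier so the expensive value is computed at most once (not thread-safe). */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private T value;
      private boolean computed;

      @Override
      public T get() {
        if (!computed) {
          value = delegate.get();
          computed = true;
        }
        return value;
      }
    };
  }

  /** Returns true when the index-based plan should be used. */
  static boolean useIndexes(
      long searchDimCardinality,
      long lowCardinalityThreshold,
      long numRows,
      Supplier<Double> filterSelectivity,
      double lowSelectivityThreshold
  ) {
    if (searchDimCardinality <= lowCardinalityThreshold) {
      return true; // cheap check first: the expensive supplier is never touched
    }
    if (searchDimCardinality >= numRows) {
      return false; // the reviewer's guess; would need benchmarks to confirm
    }
    // Only medium cardinality pays for the selectivity computation.
    return filterSelectivity.get() >= lowSelectivityThreshold;
  }
}
```

Passing the same memoized supplier on to the index-based executor would let it reuse the already-computed value instead of building it a second time.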

gianm · 2016-12-21T18:49:21Z

processing/src/main/java/io/druid/query/search/search/SearchQueryExecutor.java

+  static Iterable<DimensionSpec> getDimsToSearch(Indexed<String> availableDimensions, List<DimensionSpec> dimensions)
+  {
+    if (dimensions == null || dimensions.isEmpty()) {
+      return Iterables.transform(availableDimensions, Druids.DIMENSION_IDENTITY);


This is lazily computed, to save on allocations you can instead wrap it in ImmutableList.copyOf, like

return ImmutableList.copyOf(Iterables.transform(availableDimensions, Druids.DIMENSION_IDENTITY));
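
To illustrate why copying helps, the sketch below imitates a lazy transforming view with plain JDK types (it is not Guava's implementation, and `lazyTransform` and `materialize` are hypothetical names). The lazy view reruns the function on every iteration, while the copied list runs it once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// JDK-only imitation of a lazy transformed view versus a one-time copy.
final class LazyViewSketch {
  /** A lazy view: the function runs again for every element on every iteration. */
  static <A, B> Iterable<B> lazyTransform(List<A> source, Function<A, B> fn) {
    return () -> {
      final Iterator<A> it = source.iterator();
      return new Iterator<B>() {
        @Override
        public boolean hasNext() {
          return it.hasNext();
        }

        @Override
        public B next() {
          return fn.apply(it.next());
        }
      };
    };
  }

  /** Copies the view once so later iterations reuse the already-built elements. */
  static <B> List<B> materialize(Iterable<B> view) {
    List<B> out = new ArrayList<>();
    for (B b : view) {
      out.add(b);
    }
    return Collections.unmodifiableList(out);
  }
}
```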

gianm · 2016-12-21T18:52:07Z

benchmarks/src/main/java/io/druid/benchmark/query/SearchBenchmark.java

@@ -171,6 +159,39 @@ private void setupQueries()
      basicQueries.put("A", queryBuilderA);
    }

+    { // basic.B


Could you please attach results of this benchmark before and after the patch?

gianm · 2016-12-21T18:53:05Z

processing/src/main/java/io/druid/query/search/search/SearchStrategy.java

+    @JsonSubTypes.Type(name = "indexOnly", value = IndexOnlyStrategy.class),
+    @JsonSubTypes.Type(name = "cursorBased", value = CursorBasedStrategy.class)
+})
+public abstract class SearchStrategy


The Jackson annotations here are unnecessary if this is changed to be a context parameter.

gianm · 2016-12-21T18:54:13Z

@fjy could you chime in about whether you agree with having the strategy be a context parameter (#3792 (comment)) and whether you agree that the search query should self-contain its own decision making with respect to the bitmap types (#3792 (comment))?

fjy · 2016-12-21T18:57:14Z

@gianm agree

jihoonson · 2016-12-27T10:28:37Z

Addressed comments. I'll fix the conflicts and add benchmark results soon.

jihoonson · 2016-12-30T04:42:11Z

Here are the results of a simple benchmark.

2 filters for dimUniform and dimHyperUnique dimensions.
- Each filter has 0.1 selectivity.
- The cardinality of both dimensions is 100,000.

With concise bitmap

cursorBased

Benchmark                                  (limit)  (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt       Score      Error  Units
SearchBenchmark.queryMultiQueryableIndex    750000              1            750000           basic.B  avgt   25  267250.791 ± 9636.399  us/op
SearchBenchmark.querySingleQueryableIndex   750000              1            750000           basic.B  avgt   25  242133.054 ± 7486.306  us/op


indexOnly (before patch)

Benchmark                                  (limit)  (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt         Score        Error  Units
SearchBenchmark.queryMultiQueryableIndex    750000              1            750000           basic.B  avgt   25  14607220.224 ± 133086.094  us/op
SearchBenchmark.querySingleQueryableIndex   750000              1            750000           basic.B  avgt   25  14819905.575 ± 180125.594  us/op


auto (after patch)

Benchmark                                  (limit)  (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt       Score       Error  Units
SearchBenchmark.queryMultiQueryableIndex    750000              1            750000           basic.B  avgt   25  491054.876 ± 25474.403  us/op
SearchBenchmark.querySingleQueryableIndex   750000              1            750000           basic.B  avgt   25  468489.379 ± 11342.444  us/op

With roaring bitmap

cursorBased

Benchmark                                  (limit)  (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt       Score      Error  Units
SearchBenchmark.queryMultiQueryableIndex    750000              1            750000           basic.B  avgt   25  438806.602 ± 7831.121  us/op
SearchBenchmark.querySingleQueryableIndex   750000              1            750000           basic.B  avgt   25  435590.852 ± 8456.003  us/op


indexOnly (before patch)

Benchmark                                  (limit)  (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt       Score       Error  Units
SearchBenchmark.queryMultiQueryableIndex    750000              1            750000           basic.B  avgt   25  1438733.826 ± 27503.264  us/op
SearchBenchmark.querySingleQueryableIndex   750000              1            750000           basic.B  avgt   25  1494863.759 ± 59221.451  us/op


auto (after patch)

Benchmark                                  (limit)  (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt       Score       Error  Units
SearchBenchmark.queryMultiQueryableIndex    750000              1            750000           basic.B  avgt   25  830984.652 ± 22974.701  us/op
SearchBenchmark.querySingleQueryableIndex   750000              1            750000           basic.B  avgt   25  824529.731 ± 18906.598  us/op

You can also see the comparison graph here.

The auto strategy is a little bit slower than the cursor-based strategy. I suppose this is because the auto strategy calls IndexOnlyStrategy.makeTimeFilteredBitmap() internally.

gianm

Thanks @jihoonson. Other than the line comments I've got these two general ones:

- please add docs for the new context flags to searchquery.md (see groupbyquery.md for an example)
- please apply Druid code style to all files as mentioned in https://github.com/druid-io/druid/blob/master/CONTRIBUTING.md

gianm · 2017-01-04T01:38:29Z

processing/src/main/java/io/druid/query/search/SearchStrategySelector.java

+
+    switch (strategyString) {
+      case AutoStrategy.NAME:
+        log.debug("Auto strategy is selected");


Would be nice to include the query ID in these log messages.

gianm · 2017-01-04T02:13:20Z

processing/src/main/java/io/druid/query/search/search/AutoStrategy.java

+      if (filter == null ||
+          helper.hasLowCardinality(index, dimsToSearch) ||
+          helper.hasLowSelectivity(index, timeFilteredBitmap)) {
+        log.debug("Index-only execution strategy is selected");


Would be nice to include the query id in this log message too, from query.getId()
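
A small sketch of the logging advice in this thread: keep per-query messages below info level and include the query ID. It uses `java.util.logging` as a stand-in (the patch uses Druid's own logger, which is not reproduced here), and the message format and method names are assumptions.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Stand-in sketch using java.util.logging; FINE plays the role of "debug".
final class StrategyLoggingSketch {
  private static final Logger LOG = Logger.getLogger(StrategyLoggingSketch.class.getName());

  /** Builds the message so it can be checked without capturing log output. */
  static String message(String strategyName, String queryId) {
    return String.format("%s strategy is selected, query id [%s]", strategyName, queryId);
  }

  static void logSelection(String strategyName, String queryId) {
    // Guarding on the level avoids building the string when debug output is off.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(message(strategyName, queryId));
    }
  }
}
```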

gianm · 2017-01-04T02:28:41Z

processing/src/main/java/io/druid/query/search/search/IndexOnlyStrategy.java

+
+public class IndexOnlyStrategy extends SearchStrategy
+{
+  public static final String NAME = "indexOnly";


"indexOnly" isn't really the right name for this, since we might not always use the index. How about naming the three strategies "cursorOnly", "useIndexes", and "auto"?

Sounds good. I changed names.

gianm · 2017-01-04T02:39:54Z

processing/src/main/java/io/druid/query/search/search/SearchQueryConfig.java

  @JsonProperty
  @Min(1)
  private int maxSearchLimit = 1000;

+  @JsonProperty
+  private String searchStrategy = AutoStrategy.NAME;


This made me wonder if auto is a good default. TLDR: I think default to auto is fine.

Longer answer: It looks to me like there will be no necessary performance degradation from the old code by defaulting to "auto". There could be some degradation for some queries if we choose cursor-based when index-based would actually be better, but I'm willing to take that risk in order to have #3775 fixed with the default config.

Sounds good. Let's make index-only strategy default. We can change later if auto strategy becomes sufficiently smart and efficient.

Okay, I think that's fine, in that case I'll tag this with "release notes" so we can let people know in the notes that there's a new option they can turn on.

After improving auto strategy by avoiding computing timeFilteredBitmap, I think we can use it as the default search strategy. It shows nearly optimal performance. Please refer to #3792 (comment).

gianm · 2017-01-04T02:41:39Z

processing/src/main/java/io/druid/query/search/search/ConciseBitmapDecisionHelper.java

@@ -0,0 +1,12 @@
+package io.druid.query.search.search;
+
+public class ConciseBitmapDecisionHelper extends SearchQueryDecisionHelper


let's make this a singleton.
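
One common way to do this in Java, sketched with hypothetical names (`Helper`, `ConciseHelper`, `instance`, `bitmapType`): a private constructor plus a single eagerly created static instance. This is an illustration of the pattern, not the code that ended up in the patch.

```java
// Hypothetical sketch of a stateless helper exposed as a singleton.
final class SingletonHelperSketch {
  abstract static class Helper {
    abstract String bitmapType();
  }

  static final class ConciseHelper extends Helper {
    // Created once when the class is initialized; safe to share because it holds no state.
    private static final ConciseHelper INSTANCE = new ConciseHelper();

    private ConciseHelper() {
      // Private so callers cannot create additional instances.
    }

    static ConciseHelper instance() {
      return INSTANCE;
    }

    @Override
    String bitmapType() {
      return "concise";
    }
  }
}
```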

gianm · 2017-01-04T02:43:39Z

processing/src/main/java/io/druid/segment/data/ConciseBitmapSerdeFactory.java

 import io.druid.collections.bitmap.BitmapFactory;
 import io.druid.collections.bitmap.ConciseBitmapFactory;
 import io.druid.collections.bitmap.ImmutableBitmap;
 import io.druid.collections.bitmap.WrappedImmutableConciseBitmap;
 import it.uniroma3.mat.extendedset.intset.ImmutableConciseSet;

+import java.nio.ByteBuffer;


Or this file.

gianm · 2017-01-04T02:44:12Z

processing/src/test/java/io/druid/query/TestQueryRunners.java

-    QueryRunnerFactory factory = new SearchQueryRunnerFactory(new SearchQueryQueryToolChest(
-          new SearchQueryConfig(),
+    final SearchQueryConfig config = new SearchQueryConfig();
+    QueryRunnerFactory factory = new SearchQueryRunnerFactory(


This supplier can be Suppliers.ofInstance(config).

gianm · 2017-01-04T03:18:03Z

processing/src/main/java/io/druid/query/search/search/AutoStrategy.java

+      // Index-only strategy is selected when
+      // 1) there is no filter,
+      // 2) the total cardinality is very low, or
+      // 3) the filter has a very low selectivity.


hmm, I wonder if we can do a better heuristic here. Instead of checking for "low cardinality" and "low selectivity" does it make sense to use cost functions like:

cursor-based is numRowsInSegment * filterSelectivity * costOfCheckingSingleValue

index-based with a filter is dimensionCardinality * (costOfCheckingSingleValue + searchQuerySelectivity * costOfFilterIntersection)

index-based without a filter is dimensionCardinality * costOfCheckingSingleValue

And choose the strategy with the lowest cost. Unfortunately we don't know what searchQuerySelectivity is in advance, so I think we'd have to just use some constant.

One behavior difference is that if dimensionCardinality is equal to numRowsInSegment, like a unique key, then cost functions like that would choose the cursor-based strategy regardless of filter selectivity (since dimensionCardinality >= numRowsInSegment * filterSelectivity for all values of filterSelectivity). The current code would still use the index-based approach if filter selectivity is lower than the threshold. I'd guess that using cursor-based is actually better, though.

We can also skip computing timeFilteredBitmap if cursor-based already beats index-based (or comes close) before applying the * filterSelectivity term. Based on your benchmarks, it looks like this can save substantial time.

@jihoonson what do you think?

I'm open to doing either the cost based approach or your current approach, so feel free to try to convince me either way.
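
The three cost functions proposed above can be written down directly. This is only a sketch of the formulas as stated in the comment: the class, enum, and method names are hypothetical, and the constants a caller passes in (including searchQuerySelectivity, which the comment says is unknown in advance) are placeholders.

```java
// Sketch of the cost formulas from the comment; all constants are caller-supplied placeholders.
final class SearchCostSketch {
  enum Plan { CURSOR_BASED, INDEX_BASED }

  static double cursorCost(long numRowsInSegment, double filterSelectivity, double costOfCheckingSingleValue) {
    return numRowsInSegment * filterSelectivity * costOfCheckingSingleValue;
  }

  static double indexCostWithFilter(
      long dimensionCardinality,
      double costOfCheckingSingleValue,
      double searchQuerySelectivity,
      double costOfFilterIntersection
  ) {
    return dimensionCardinality
           * (costOfCheckingSingleValue + searchQuerySelectivity * costOfFilterIntersection);
  }

  static double indexCostWithoutFilter(long dimensionCardinality, double costOfCheckingSingleValue) {
    return dimensionCardinality * costOfCheckingSingleValue;
  }

  /** Picks the cheaper plan for a filtered query; ties go to the index-based plan. */
  static Plan choose(
      long numRowsInSegment,
      long dimensionCardinality,
      double filterSelectivity,
      double costOfCheckingSingleValue,
      double searchQuerySelectivity,
      double costOfFilterIntersection
  ) {
    double cursor = cursorCost(numRowsInSegment, filterSelectivity, costOfCheckingSingleValue);
    double index = indexCostWithFilter(
        dimensionCardinality, costOfCheckingSingleValue, searchQuerySelectivity, costOfFilterIntersection
    );
    return cursor < index ? Plan.CURSOR_BASED : Plan.INDEX_BASED;
  }
}
```

With dimensionCardinality equal to numRowsInSegment, as in the unique-key case mentioned above, the cursor cost can never exceed the index cost, so this sketch picks the cursor-based plan whenever the filter removes any rows.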

Thank you for the good suggestion. I agree that the cost-based approach is cool. Here are some issues I'm considering.

- The disk access patterns of the cursor-based and index-only strategies look different. The disk access cost can be ignored, as you said above, only when they are almost the same, but I'm not sure that they are.
- When using the cursor-based strategy, some filters can be pushed down into the cursor (preFilters), thereby actually skipping reading non-result data. However, it is difficult to figure out which parts of the filter are preFilters outside QueryableIndexStorageAdapter.makeCursors(). To do so, some refactoring is needed. This also makes implementing the cost-based approach difficult.

I think we can simply ignore these details at first, and improve later. What do you think?

Sure, ignoring this for now sounds good.

Although your comment makes me realize that SearchQueryRunner has a bug right now… it calls filter.getBitmapIndex without first calling filter.supportsBitmapIndex to see if that call is valid. The Filter interface doesn't say this (it probably should) but it is bad to call getBitmapIndex if supportsBitmapIndex is false. Callers should use makeMatcher instead.

We should fix that; do you prefer to do it as part of this patch or a follow-up?

Btw, since we are ignoring this for now, could you please write up a github issue about how we could make this smarter in the future? That way, if someone ever complains about the smartness, we can point them to that issue and ask them to help :)

It is equivalent now, but only by luck. We are planning to add some filters in the future that don't ever support bitmaps (like filters that filter based on expressions). Those would return false from "supportsBitmapIndex" even if the underlying dimensions did have indexes, since they'd always want to operate on a row by row basis.

Also, filter.getBitmapIndex() is called for the query-level filter, which might be on a different column than the dimension being searched, so even with the current set of filters, things will not work properly. For example try filtering on a long column and searching on a different, string column with a single search query.
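
The calling convention described above (check supportsBitmapIndex first, otherwise fall back to a row-by-row matcher) can be sketched like this. The `Filter` interface here is a trimmed-down, hypothetical stand-in with no selector arguments, and `java.util.BitSet` stands in for the bitmap type; the real interface has different signatures.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

// Hypothetical, simplified stand-in for the Filter contract discussed above.
final class FilterFallbackSketch {
  interface Filter {
    boolean supportsBitmapIndex();

    /** Only valid to call when supportsBitmapIndex() returns true. */
    BitSet getBitmapIndex();

    /** Row-by-row matcher that works for every filter. */
    IntPredicate makeMatcher();
  }

  /** Returns the set of matching row numbers without ever misusing getBitmapIndex. */
  static BitSet matchingRows(Filter filter, int numRows) {
    if (filter.supportsBitmapIndex()) {
      return filter.getBitmapIndex();
    }
    // Fallback: test each row individually.
    BitSet rows = new BitSet(numRows);
    IntPredicate matcher = filter.makeMatcher();
    for (int row = 0; row < numRows; row++) {
      if (matcher.test(row)) {
        rows.set(row);
      }
    }
    return rows;
  }
}
```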

Right. I missed this is possible. I'll fix it.

@gianm, I created #3832 for improving cost-based planner.

As for timeFilteredBitmap, building it is expensive because it requires many bitmap unions for the filters.
I plan to avoid the unions by estimating filter selectivity instead. I already have some benchmark results that show nearly optimal performance. However, this work needs some new code and refactoring, so I think it would be better to handle it in a separate issue. I'll make a PR soon.

gianm · 2017-01-04T03:19:31Z

processing/src/main/java/io/druid/query/search/search/ConciseBitmapDecisionHelper.java

+public class ConciseBitmapDecisionHelper extends SearchQueryDecisionHelper
+{
+  private static final double LOW_FILTER_SELECTIVITY_THRESHOLD = 0.99;
+  private static final int LOW_CARDINALITY_THRESHOLD = 5000;


Could you please briefly describe, in a comment, where these numbers came from? (if you decide to keep using them rather than the cost based approach)

gianm · 2017-01-04T03:19:53Z

processing/src/main/java/io/druid/query/search/search/RoaringBitmapDecisionHelper.java

+public class RoaringBitmapDecisionHelper extends SearchQueryDecisionHelper
+{
+  private static final double LOW_FILTER_SELECTIVITY_THRESHOLD = 0.65;
+  private static final int LOW_CARDINALITY_THRESHOLD = 1000;


Could you please briefly describe, in a comment, where these numbers came from? (if you decide to keep using them rather than the cost based approach)

jihoonson · 2017-01-09T16:49:41Z

@gianm, thank you for the review. I updated the patch.

fjy · 2016-12-21T18:29:20Z

processing/src/main/java/io/druid/query/search/search/AutoStrategy.java

@@ -0,0 +1,60 @@
+package io.druid.query.search.search;


missing license header here and in other new files

Seems you saw an old patch. I added license headers.

fjy · 2017-01-10T18:19:35Z

👍 from me

gianm

@jihoonson - looks good other than some minor comments about docs & tests.

gianm · 2017-01-09T23:41:08Z

docs/content/querying/searchquery.md

+
+#### Strategies
+
+Search queries can be executed using two different strategies. The default strategy is determeined by the


"determined" (spelling)

Fixed. Thanks.

gianm · 2017-01-10T19:55:31Z

docs/content/querying/searchquery.md

+only the rows which satisfy those filters, thereby saving I/O cost. However, it might be slow with filters of low selectivity.
+
+- "auto" strategy uses a cost-based planner for choosing an optimal search strategy. It estimates the cost of index-only
+and cursor-based execution plans, and chooses the optimal one. Currently, its performance is suboptimal due to the large


How about "it is not enabled by default due to the overhead of cost estimation."

gianm · 2017-01-10T20:36:06Z

docs/content/querying/searchquery.md

+
+#### Query context
+
+The following runtime properties apply:


These are "query context parameters" not "runtime properties".

gianm · 2017-01-10T21:53:56Z

processing/src/test/java/io/druid/query/search/SearchQueryRunnerWithCaseTest.java

+    configs[0] = new SearchQueryConfig();
+    configs[0].setSearchStrategy(UseIndexesStrategy.NAME);
+    configs[1] = new SearchQueryConfig();
+    configs[1].setSearchStrategy(CursorOnlyStrategy.NAME);


Please add a test for AutoStrategy too.

gianm · 2017-01-10T21:59:52Z

processing/src/main/java/io/druid/query/search/search/UseIndexesStrategy.java

+    return new UseIndexesStrategy(query, false, timeFilteredBitmap);
+  }
+
+  private UseIndexesStrategy(SearchQuery query,


Did this file have codestyle applied? Are you using Eclipse or IntelliJ? It's not the same as what my IntelliJ does when I apply the codestyle.xml.

Sorry about this problem. I'll apply the codestyle to all changed files.

gianm · 2017-01-10T22:57:23Z

processing/src/main/java/io/druid/query/search/search/UseIndexesStrategy.java

+            index
+        );
+
+        // Index-only plan is used only when any filter is not specified or every filter supports bitmap indexes.


Should be "the filter" instead of "every filter" (there's just one filter). Maybe this is "every" since you were thinking of decomposing ANDs, but that's not happening right now, so it should be "the"

Thanks. Fixed.

jihoonson · 2017-01-11T00:48:06Z

@gianm, thank you for your review. I updated the patch.

gianm

Cool, 👍

…egy (apache#3792) * Add an option to SearchQuery to choose a search query execution strategy. Supported strategies are 1) Index-only query execution 2) Cursor-based scan 3) Auto: choose an efficient strategy for a given query * Add SearchStrategy and SearchQueryExecutor * Address comments * Rename strategies and set UseIndexesStrategy as the default strategy * Add a cost-based planner for auto strategy * Add document * Fix code style * apply code style * apply comments

Add an option to SearchQuery to choose a search query execution strat…

f8d5c5e

…egy. Supported strategies are 1) Index-only query execution 2) Cursor-based scan 3) Auto: choose an efficient strategy for a given query

fjy added the Improvement label Dec 20, 2016

fjy added this to the 0.10.0 milestone Dec 20, 2016

fjy assigned fjy and gianm Dec 20, 2016

fjy reviewed Dec 20, 2016

fjy requested changes Dec 20, 2016

Add SearchStrategy and SearchQueryExecutor

3e5029b

gianm reviewed Dec 21, 2016

gianm requested changes Dec 21, 2016

Address comments

6a63f2b

Merge branch 'master' of https://github.com/druid-io/druid into searc…

e00e04e

…h-query-strategy

gianm reviewed Jan 4, 2017

gianm added the Release Notes label Jan 5, 2017

jihoonson added 4 commits January 5, 2017 17:34

Rename strategies and set UseIndexesStrategy as the default strategy

91239e6

Add a cost-based planner for auto strategy

bb180a7

Add document

ef6c88d

Fix code style

d616106

jihoonson mentioned this pull request Jan 9, 2017

Improve cost-based planner for search query #3832

Closed

fjy approved these changes Jan 10, 2017

gianm reviewed Jan 10, 2017

jihoonson added 2 commits January 11, 2017 08:30

apply code style

e137bc2

apply comments

5eaff00

gianm approved these changes Jan 11, 2017

gianm merged commit c099977 into apache:master Jan 11, 2017

gianm mentioned this pull request Jan 24, 2017

Filters on high cardinality dimensions should sometimes use dim index bitset + full scan instead of unioning bitsets of dim values #3878

Closed

gianm added Performance and removed Release Notes labels Feb 28, 2017

gianm mentioned this pull request Feb 28, 2017

Druid 0.10.0 release notes #3944

Closed

clambertus unassigned fjy and gianm Jul 6, 2018
