Skip to content

Follow up on post-merge review of performance work on sharded searcher#15

Merged
mrnugget merged 5 commits intomasterfrom
sharded_perf_follow_up
Jul 23, 2019
Merged

Follow up on post-merge review of performance work on sharded searcher#15
mrnugget merged 5 commits intomasterfrom
sharded_perf_follow_up

Conversation

@mrnugget
Copy link
Copy Markdown
Contributor

This adresses the comments by @keegancsmith in #14 after the merge.

I'm quoting the commit message of the first commit here:

This commit does a couple of things to make the added behavior correct:

  1. It reuses the simplified query produced by selectRepoSet which
    in turn reduces the work the shards have to do when evaluating the
    query
  2. It only evaluates the query.RepoSet if it's part of a top-level
    query.And. The previous code was too naive when considering that a
    query.RepoSet can appear anywhere in a query.Q. See this comment
    for more details:
    Improve performance of sharded search when using RepoSets #14 (comment)
  3. It changes the const value that's returned in the simplified version
    so that typeRepoSearcher, which evaluates query.TypeRepo queries to
    a query.RepoSet works (the tests in shards/eval_test.go break
    without this.

I think the behavior is now correct.

I don't think it's worth it, but if the usecase of multiple RepoSet queries ever arises we can easily remove the break and use a set to not append duplicates to the filtered slice.

mrnugget added 3 commits July 19, 2019 16:27
This commit does a couple of things to make the added behavior correct:

1. It reuses the simplified query produced by `selectRepoSet` which
   in turn reduces the work the shards have to do when evaluating the
   query
2. It only evaluates the `query.RepoSet` if it's part of a top-level
   `query.And`. The previous code was too naive when considering that a
   `query.RepoSet` can appear anywhere in a `query.Q`. See this comment
   for more details:
   #14 (comment)
3. It changes the const value that's returned in the simplified version
   so that `typeRepoSearcher`, which evaluates `query.TypeRepo` queries to
   a `query.RepoSet` works (the tests in `shards/eval_test.go` break
   without this.
@mrnugget mrnugget requested a review from keegancsmith July 19, 2019 15:07
Comment thread shards/shards.go Outdated
// Stop after first RepoSet, otherwise we might append duplicate
// shards to `filtered`
if len(filtered) != 0 {
return filtered, and
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just always return here, don't have to check len of filtered.

Also return query.Simplify(and), to make Simplify do nothing per shard.

@mrnugget
Copy link
Copy Markdown
Contributor Author

mrnugget commented Jul 20, 2019 via email

Change-Id: I113b30a29a1b8f61e4969c52b155d7fa09e0cc75
@keegancsmith
Copy link
Copy Markdown
Member

Tried out add Simplify, and the tests pass. Also the benchmarks show an improvement:

name                                                 old time/op    new time/op    delta
ShardedSearch/substring_all_results-8                   1.54s ±12%     1.48s ± 8%     ~     (p=0.421 n=5+5)
ShardedSearch/substring_no_results-8                   5.43ms ± 2%    5.31ms ± 5%     ~     (p=0.548 n=5+5)
ShardedSearch/substring_some_results-8                 11.3ms ± 2%    10.6ms ± 3%   -5.99%  (p=0.008 n=5+5)
ShardedSearch/substring_all_results_and_repo_set-8      665ms ± 3%     659ms ± 3%     ~     (p=0.421 n=5+5)
ShardedSearch/substring_some_results_and_repo_set-8    6.21ms ± 4%    6.10ms ± 6%     ~     (p=0.690 n=5+5)
ShardedSearch/substring_no_results_and_repo_set-8      3.49ms ± 4%    3.25ms ± 3%   -6.96%  (p=0.008 n=5+5)

name                                                 old alloc/op   new alloc/op   delta
ShardedSearch/substring_all_results-8                  2.98GB ± 0%    2.98GB ± 0%     ~     (p=1.000 n=5+5)
ShardedSearch/substring_no_results-8                   8.35MB ± 0%    8.35MB ± 0%     ~     (p=0.286 n=4+5)
ShardedSearch/substring_some_results-8                 18.4MB ± 0%    18.4MB ± 0%     ~     (p=1.000 n=5+5)
ShardedSearch/substring_all_results_and_repo_set-8     1.50GB ± 0%    1.50GB ± 0%   -0.01%  (p=0.008 n=5+5)
ShardedSearch/substring_some_results_and_repo_set-8    9.32MB ± 0%    9.13MB ± 0%   -2.06%  (p=0.016 n=5+4)
ShardedSearch/substring_no_results_and_repo_set-8      4.59MB ± 0%    4.40MB ± 0%   -4.18%  (p=0.008 n=5+5)

name                                                 old allocs/op  new allocs/op  delta
ShardedSearch/substring_all_results-8                   27.2M ± 0%     27.2M ± 0%     ~     (p=0.686 n=4+4)
ShardedSearch/substring_no_results-8                     105k ± 0%      105k ± 0%     ~     (all equal)
ShardedSearch/substring_some_results-8                   240k ± 0%      240k ± 0%     ~     (p=1.000 n=4+5)
ShardedSearch/substring_all_results_and_repo_set-8      13.6M ± 0%     13.6M ± 0%   -0.04%  (p=0.008 n=5+5)
ShardedSearch/substring_some_results_and_repo_set-8      126k ± 0%      120k ± 0%   -4.75%  (p=0.000 n=5+4)
ShardedSearch/substring_no_results_and_repo_set-8       58.6k ± 0%     52.6k ± 0%  -10.24%  (p=0.008 n=5+5)

Change-Id: I44c3f3face83f5d737a5f9413cf3db3d00c02850
@mrnugget
Copy link
Copy Markdown
Contributor Author

Ah, sweet! Thanks for that! I guess I had a combination of the previous version and this that didn't work.

@mrnugget mrnugget merged commit cc476d2 into master Jul 23, 2019
@keegancsmith keegancsmith deleted the sharded_perf_follow_up branch July 23, 2019 14:20
msukkari added a commit to sourcebot-dev/zoekt that referenced this pull request Apr 21, 2026
- google.golang.org/grpc 1.75.0 -> 1.80.0 (addresses GHSA critical #11:
  authorization bypass via missing leading slash in :path).
- go.opentelemetry.io/otel* 1.42.0/1.33.0 -> 1.43.0 (addresses sourcegraph#15 high:
  BSD kenv PATH hijack, and sourcegraph#14 medium: unbounded OTLP HTTP response body).

Fixes Dependabot alerts 11, 14, 15 on sourcebot-dev/zoekt.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants