Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Profiling funny results when sampling rates align #75895

Open
nik9000 opened this issue Jul 30, 2021 · 2 comments
Open

Profiling funny results when sampling rates align #75895

nik9000 opened this issue Jul 30, 2021 · 2 comments
Labels
>bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team

Comments

@nik9000
Copy link
Member

nik9000 commented Jul 30, 2021

I was looking at some profiles results that involved LongDistanceFeatureQuery and saw super high timing on setMinCompetitiveScore. I had a look at the method and realized that it goes into a sampling mode:

      // Start sampling if we get called too much
      setMinCompetitiveScoreCounter++;
      if (setMinCompetitiveScoreCounter > 256 && (setMinCompetitiveScoreCounter & 0x1f) != 0x1f) {
        return;
      }

Our times uses a sampling mode already:

        // We measure the timing of each method call for the first 256
        // calls, then 1/2 call up to 512 then 1/3 up to 768, etc. with
        // a maximum interval of 1024, which is reached for 1024*2^8 ~= 262000
        // This allows to not slow down things too much because of calls
        // to System.nanoTime() when methods are called millions of time
        // in tight loops, while still providing useful timings for methods
        // that are only called a couple times per search execution.
        doTiming = (count - lastCount) >= Math.min(lastCount >>> 8, 1024);
        if (doTiming) {
            start = nanoTime();
        }
        count++;

These sampling modes are both powers of two - setMinCompetitiveScore is 32 and we're, well, what the comment says. I think we're overestimate the cost of this method because we'll line up with its counter more often than random.

@nik9000 nik9000 added >bug :Search/Search Search-related issues that do not fall into other categories team-discuss labels Jul 30, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jul 30, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@nik9000
Copy link
Member Author

nik9000 commented Jul 30, 2021

A bunch of us got together to talk about this and decided:

  • At least for now we're not going to fix this.
  • We'll update the documentation to talk about how these might not be accurate because we're sampling
  • We'll add some top level times for each shard. These can be quite useful to pick the slow shard and they won't be effected by sampling.
  • We'll add the last sampling rate to the profiler output. That way when you look at the profiler output you can tell if the results are sampled and know to be suspicious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team
Projects
None yet
Development

No branches or pull requests

2 participants