
Conversation

@adriansr
Contributor

@adriansr adriansr commented Jul 21, 2022

This PR adds initial benchmarking support to pipeline tests.

Configuration

3 new flags are added to the test subcommand:

  • --bench: Enables benchmarking.
  • --bench-count: Number of documents to use for benchmarking (default 1000).
  • --bench-duration: Adjusts the number of docs so that the benchmark runs for approximately this duration (when set, the count above is ignored). A sketch of how the document set could be expanded to a target count is shown below.
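
As an illustration of what the doc count means in practice (an assumption about the mechanics, not necessarily the exact implementation), the runner presumably repeats the handful of source documents until it reaches the target count, which is how 5 source docs can become over a thousand benchmarked docs:

```go
// Illustrative sketch only: expand a small set of source events into the
// requested benchmark size by cycling through them. Names and types are
// assumptions, not the actual elastic-package code.
func expandDocs(source []map[string]interface{}, count int) []map[string]interface{} {
	if len(source) == 0 || count <= 0 {
		return nil
	}
	docs := make([]map[string]interface{}, 0, count)
	for i := 0; i < count; i++ {
		docs = append(docs, source[i%len(source)])
	}
	return docs
}
```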

Output (text format)

╭─────────────────────────────────╮
│ parameters                      │
├──────────────────┬──────────────┤
│ package          │        azure │
│ data_stream      │ activitylogs │
│ source doc count │            5 │
│ doc count        │         1367 │
╰──────────────────┴──────────────╯
╭──────────────────────╮
│ ingest performance   │
├─────────────┬────────┤
│ ingest time │  4.87s │
│ eps         │ 280.64 │
╰─────────────┴────────╯
╭─────────────────────────────────────────────╮
│ processors by total time                    │
├─────────────────────────────────────┬───────┤
│ script @ default.yml:192            │ 4.09% │
│ script @ default.yml:220            │ 4.04% │
│ script @ default.yml:199            │ 3.61% │
│ script @ default.yml:206            │ 3.59% │
│ json @ default.yml:20               │ 3.20% │
│ grok @ azure-shared-pipeline.yml:72 │ 2.16% │
│ script @ default.yml:231            │ 1.99% │
│ script @ default.yml:92             │ 1.79% │
│ script @ default.yml:68             │ 1.75% │
│ grok @ azure-shared-pipeline.yml:52 │ 1.66% │
╰─────────────────────────────────────┴───────╯
╭─────────────────────────────────────────────────╮
│ processors by average time per doc              │
├─────────────────────────────────────┬───────────┤
│ script @ default.yml:192            │ 145.574µs │
│ script @ default.yml:220            │ 144.111µs │
│ script @ default.yml:199            │ 128.749µs │
│ script @ default.yml:206            │ 128.017µs │
│ json @ default.yml:20               │ 114.118µs │
│ grok @ azure-shared-pipeline.yml:72 │  95.978µs │
│ grok @ azure-shared-pipeline.yml:52 │   74.04µs │
│ grok @ azure-shared-pipeline.yml:62 │  71.297µs │
│ script @ default.yml:231            │  70.958µs │
│ grok @ azure-shared-pipeline.yml:31 │  68.555µs │
╰─────────────────────────────────────┴───────────╯

Output (xUnit)

Uses the xUnit format as defined in the Jenkins Benchmark plugin.

<?xml version="1.0" encoding="UTF-8"?>
<group name="pipeline benchmark for azure/activitylogs">
  <parameter name="package">
    <value>azure</value>
  </parameter>
  <parameter name="data_stream">
    <value>activitylogs</value>
  </parameter>
  <parameter name="source doc count">
    <value>5</value>
  </parameter>
  <parameter name="doc count">
    <value>1000</value>
  </parameter>
  <test name="ingest performance">
    <result name="ingest time">
      <description>time elapsed in ingest processors</description>
      <unit>s</unit>
      <value>0.267</value>
    </result>
    <result name="eps">
      <description>ingested events per second</description>
      <value>3745.318352059925</value>
    </result>
  </test>
</group>

For the plugin to work, it's necessary to generate an output file for each data stream, so the report generation had to be modified a bit.

The detailed processor reports are not included in the xUnit output, as they add too much data for the Benchmark plugin to handle.
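
A rough sketch of what one-file-per-data-stream report generation could look like (the file naming and helper are assumptions for illustration, not the exact code in this PR):

```go
// Hypothetical sketch: write one xUnit report per data stream so the
// Jenkins Benchmark plugin can pick each file up independently.
package report

import (
	"fmt"
	"os"
	"path/filepath"
)

func writeBenchmarkReports(dir string, reports map[string][]byte) error {
	for dataStream, xml := range reports {
		name := fmt.Sprintf("benchmark-%s.xml", dataStream) // assumed naming scheme
		if err := os.WriteFile(filepath.Join(dir, name), xml, 0o644); err != nil {
			return fmt.Errorf("writing report for %s: %w", dataStream, err)
		}
	}
	return nil
}
```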

Related issues

@adriansr adriansr added Team:Security-External Integrations Team:Ecosystem Label for the Packages Ecosystem team labels Jul 21, 2022
@elasticmachine
Collaborator

elasticmachine commented Jul 22, 2022

❕ Build Aborted

The PR is not allowed to run in the CI yet


Build stats

  • Start Time: 2022-09-07T08:05:27.231+0000

  • Duration: 2 min 7 sec

Steps errors 3


Load a resource file from a library
  • Took 0 min 0 sec.
  • Description: approval-list/elastic/elastic-package.yml
Google Storage Download
  • Took 0 min 0 sec.
Error signal
  • Took 0 min 0 sec.
  • Description: githubApiCall: The REST API call https://api.github.com/orgs/elastic/members/adriansr return the message : java.lang.Exception: httpRequest: Failure connecting to the service https://api.github.com/orgs/elastic/members/adriansr : httpRequest: Failure connecting to the service https://api.github.com/orgs/elastic/members/adriansr : Code: 404Error: {"message":"User does not exist or is not a member of the organization","documentation_url":"https://docs.github.com/rest/reference/orgs#check-organization-membership-for-a-user"}

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Collaborator

elasticmachine commented Jul 22, 2022

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (33/33) 💚
Files 66.667% (82/123) 👎 -0.282
Classes 61.143% (107/175) 👎 -1.281
Methods 48.345% (336/695) 👎 -1.28
Lines 31.917% (3034/9506) 👎 -1.379
Conditionals 100.0% (0/0) 💚

@adriansr adriansr marked this pull request as ready for review July 22, 2022 09:32
@adriansr adriansr added the enhancement New feature or request label Jul 22, 2022
@adriansr adriansr requested review from a team July 22, 2022 09:32
@leehinman

leehinman commented Jul 22, 2022

I was wondering how this would do against a known performance issue, so I ran it against version 1.3.2 of the crowdstrike package. The --bench option shows a script processor that was taking up an outsized portion of the processing time.

1.3.2 output

╭────────────────────────────────╮
│ parameters                     │
├──────────────────┬─────────────┤
│ package          │ crowdstrike │
│ data_stream      │         fdr │
│ source doc count │         125 │
│ doc count        │       10000 │
╰──────────────────┴─────────────╯
╭───────────────────────╮
│ ingest performance    │
├─────────────┬─────────┤
│ ingest time │   6.32s │
│ eps         │ 1582.03 │
╰─────────────┴─────────╯
╭─────────────────────────────────────────╮
│ processors by total time                │
├────────────────────────────────┬────────┤
│ script @ default.yml:54        │ 32.15% │
│ script @ default.yml:1273      │  3.72% │
│ json @ default.yml:9           │  3.61% │
│ append @ default.yml:396       │  3.27% │
│ geoip @ default.yml:947        │  2.82% │
│ append @ default.yml:1139      │  2.58% │
│ community_id @ default.yml:876 │  2.56% │
│ script @ default.yml:522       │  2.18% │
│ append @ default.yml:401       │  1.41% │
│ date @ default.yml:36          │  1.27% │
╰────────────────────────────────┴────────╯
╭─────────────────────────────────────────────────╮
│ processors by average time per doc              │
├─────────────────────────────────────┬───────────┤
│ script @ default.yml:613            │  231.25µs │
│ script @ default.yml:54             │   203.2µs │
│ script @ default.yml:522            │  143.75µs │
│ registered_domain @ default.yml:999 │  106.25µs │
│ uri_parts @ default.yml:936         │ 104.166µs │
│ date @ default.yml:1194             │      75µs │
│ date @ default.yml:1188             │      50µs │
│ script @ default.yml:1009           │   43.75µs │
│ set @ default.yml:749               │  35.416µs │
│ append @ default.yml:879            │  26.851µs │
╰─────────────────────────────────────┴───────────╯

I also ran it against 1.4.1 to see how it looked after the fix, and we can see that the performance issue with the script processor is fixed. :-)

1.4.1

╭────────────────────────────────╮
│ parameters                     │
├──────────────────┬─────────────┤
│ package          │ crowdstrike │
│ data_stream      │         fdr │
│ source doc count │         126 │
│ doc count        │       10000 │
╰──────────────────┴─────────────╯
╭───────────────────────╮
│ ingest performance    │
├─────────────┬─────────┤
│ ingest time │   4.15s │
│ eps         │ 2411.96 │
╰─────────────┴─────────╯
╭─────────────────────────────────────────╮
│ processors by total time                │
├─────────────────────────────────┬───────┤
│ script @ default.yml:2258       │ 5.91% │
│ json @ default.yml:9            │ 5.28% │
│ append @ default.yml:1337       │ 3.88% │
│ append @ default.yml:2123       │ 3.84% │
│ geoip @ default.yml:1891        │ 3.35% │
│ script @ default.yml:2051       │ 2.92% │
│ date @ default.yml:14           │ 2.89% │
│ script @ default.yml:1463       │ 2.46% │
│ append @ default.yml:1342       │ 1.69% │
│ community_id @ default.yml:1820 │ 1.57% │
╰─────────────────────────────────┴───────╯
╭───────────────────────────────────────╮
│ processors by average time per doc    │
├───────────────────────────┬───────────┤
│ date @ default.yml:2172   │ 202.531µs │
│ date @ default.yml:2178   │ 151.898µs │
│ set @ default.yml:1876    │ 130.801µs │
│ script @ default.yml:1555 │ 125.786µs │
│ script @ default.yml:1463 │ 106.918µs │
│ date @ default.yml:44     │  75.949µs │
│ append @ default.yml:1359 │  50.632µs │
│ date @ default.yml:34     │  50.632µs │
│ script @ default.yml:2051 │  43.635µs │
│ script @ default.yml:1953 │  37.735µs │
╰───────────────────────────┴───────────╯

}

testCase := testCase{
events: docs,


This might be better as a feature request.

I'm wondering if we could introduce batch_size here, divide the documents into batches of that size, and send each batch with simulatePipelineProcessing. That would get around the current limitation that all documents must fit in one request, and it would make the processing more like what users see in production. It might even be possible to call simulatePipelineProcessing concurrently, to get even closer to what users experience in production.
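
A minimal sketch of that batching idea (hypothetical, assuming a pluggable simulate callback; not part of this PR):

```go
// Split the benchmark documents into fixed-size batches and simulate each
// batch separately, so a single request never has to carry every document.
package pipeline

import "encoding/json"

func runInBatches(docs []json.RawMessage, batchSize int, simulate func([]json.RawMessage) error) error {
	if batchSize <= 0 {
		batchSize = len(docs)
	}
	for start := 0; start < len(docs); start += batchSize {
		end := start + batchSize
		if end > len(docs) {
			end = len(docs)
		}
		if err := simulate(docs[start:end]); err != nil {
			return err
		}
	}
	return nil
}
```

Concurrency could be layered on top by dispatching each batch to a worker pool, as suggested.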

Contributor

@mtojek mtojek left a comment


Let's start with CLI design!

Based on your implementation, @adriansr, I have a feeling that there are many conditions to determine whether this is a pipeline test or a pipeline benchmark. The same goes for formatting reports: you need to pass/return another structure to process the correct test type.

What do you think about creating a new command action, elastic-package benchmark, which executes elastic-package benchmark pipeline? We would leave benchmark open to other package components to measure and NOT couple it only with pipeline tests.

@ecosystem Happy to hear your comments on this.

BTW we should start with the spec PR first, as it introduces new _dev artifacts to the package.
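
A minimal sketch of what that command layout could look like with the usual cobra wiring (names and the runner hookup are assumptions; the actual implementation may differ):

```go
// Proposed `benchmark` command with a `pipeline` subcommand; other benchmark
// types could be added later without touching the `test` command.
package cmd

import "github.com/spf13/cobra"

func setupBenchmarkCommand() *cobra.Command {
	benchmarkCmd := &cobra.Command{
		Use:   "benchmark",
		Short: "Run benchmarks for the package",
	}
	pipelineCmd := &cobra.Command{
		Use:   "pipeline",
		Short: "Run pipeline benchmarks for the package",
		RunE: func(cmd *cobra.Command, args []string) error {
			// Call the pipeline benchmark runner here.
			return nil
		},
	}
	benchmarkCmd.AddCommand(pipelineCmd)
	return benchmarkCmd
}
```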


func BenchmarkPipeline(options testrunner.TestOptions) (*testrunner.BenchmarkResult, error) {
// Load all test documents
docs, err := loadAllTestDocs(options.TestFolder.Path)
Contributor


I guess that you need to prepare a spec PR first.

Contributor


will wait to open that one until we agree on the options and structure here

@marc-gr
Contributor

marc-gr commented Sep 5, 2022

Changes made:

  • Add a dedicated benchmark command
  • Use a dedicated _dev/benchmark folder to define config and samples instead of flags for better potential use in CI
  • Removed the duration option: it was not working that well, and IMO it makes the benchmarks less reproducible between runs, since it uses heuristics to change the payload size. Happy to discuss it, though.
  • Moved pipeline management code to elasticsearch/ingest

@marc-gr marc-gr requested a review from mtojek September 5, 2022 11:34
@mtojek mtojek requested a review from a team September 5, 2022 13:39
Contributor

@mtojek mtojek left a comment


Ok, so we need a few things:

  1. PR to package-spec to document the _dev/benchmark path.
  2. CI testing of the benchmark command.
  3. An option to fail the test run if benchmarks show slower results?

I suppose the rest of the comments are just minor.

Comment on lines +77 to +80
failOnMissing, err := cmd.Flags().GetBool(cobraext.FailOnMissingFlagName)
if err != nil {
return cobraext.FlagParsingError(err, cobraext.FailOnMissingFlagName)
}
Contributor


Is it relevant for benchmark tests?

Contributor


I guess it depends. I was wondering if it would be practical to add a config option to use the pipeline test events as samples, to avoid repetition and basically to be able to leverage benchmarks for any package with pipeline tests without changes. If added, it would fall back to looking up the pipeline test samples, and I guess that in a scenario like this, failing could be useful? WDYT?

Contributor


If added, it would fall back to looking up the pipeline test samples, and I guess that in a scenario like this, failing could be useful

Yes, in this case, it makes sense.

Based on your experience, how many cases would be covered by borrowing pipeline test events? Is it the majority or just a few? If that feature doesn't look to be popular, I would rather drop it.

Contributor


I think it would be nice to aim to have this in some capacity as part of the CI, so all packages would end up using it in one way or another. cc @leehinman
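
A sketch of the fallback being discussed (paths and the JSON layout are simplified assumptions, not the final spec):

```go
// Prefer samples from _dev/benchmark; if that folder does not exist, reuse
// the events from the data stream's pipeline tests.
package pipeline

import (
	"encoding/json"
	"os"
	"path/filepath"
	"strings"
)

func loadBenchmarkDocs(dataStreamPath string) ([]json.RawMessage, error) {
	benchDir := filepath.Join(dataStreamPath, "_dev", "benchmark")
	if _, err := os.Stat(benchDir); err == nil {
		return loadDocsFromDir(benchDir)
	}
	// Fall back to the pipeline test samples.
	return loadDocsFromDir(filepath.Join(dataStreamPath, "_dev", "test", "pipeline"))
}

func loadDocsFromDir(dir string) ([]json.RawMessage, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var docs []json.RawMessage
	for _, e := range entries {
		if e.IsDir() || !strings.HasSuffix(e.Name(), ".json") {
			continue
		}
		data, err := os.ReadFile(filepath.Join(dir, e.Name()))
		if err != nil {
			return nil, err
		}
		docs = append(docs, json.RawMessage(data))
	}
	return docs, nil
}
```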


var results []*benchrunner.Result
for _, folder := range benchFolders {
r, err := benchrunner.Run(benchType, benchrunner.BenchOptions{
Contributor


It would be great if you could run the benchmarks as part of our CI pipeline. We could test them in a continuous way.

@marc-gr
Contributor

marc-gr commented Sep 6, 2022

/test

@mtojek
Contributor

mtojek commented Sep 6, 2022

BTW, please ignore the failing elastic-package-package-storage-publish/pr-merge. There is a separate issue to solve it.

@marc-gr
Contributor

marc-gr commented Sep 6, 2022

Spec PR in #906

@mtojek
Contributor

mtojek commented Sep 6, 2022

/test

Contributor

@mtojek mtojek left a comment


I'm not a direct user of this feature, but what I'm missing here is an option to save results in the repo and use them as a baseline in the future. If you don't intend to use it this way, feel free to ignore this idea.
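
A sketch of what a stored-baseline check could look like (not implemented in this PR; the file format, type, and tolerance handling are assumptions):

```go
// Load a previously committed benchmark result and fail when the new
// events-per-second figure regresses beyond the given tolerance.
package benchrunner

import (
	"encoding/json"
	"fmt"
	"os"
)

type storedBaseline struct {
	EPS float64 `json:"eps"`
}

func checkAgainstBaseline(path string, currentEPS, tolerance float64) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	var base storedBaseline
	if err := json.Unmarshal(data, &base); err != nil {
		return err
	}
	if currentEPS < base.EPS*(1-tolerance) {
		return fmt.Errorf("eps regressed: %.2f is below baseline %.2f", currentEPS, base.EPS)
	}
	return nil
}
```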


const (
// How many top processors to return.
numTopProcs = 10
Contributor


nit: maybe expose it via CLI options?

Contributor


If we expose it through the CLI, it will use the same value for all the data streams in the package; in some cases, depending on the size of the events used to test, this might cause some simulations to fail with a 413. This means some packages might require different num_doc values for each data stream, which is why I added it to the benchmark config instead. Plus, if the intention is to use it in CI, a single CLI value is much less flexible for the same reason. WDYT?
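
For context, a minimal sketch of how the top-processor tables could be derived from the simulate stats, keeping only numTopProcs entries as in the constant quoted above (the stat type is an assumption for illustration):

```go
// Sort processor stats by cumulative time and keep the top n entries.
package pipeline

import (
	"sort"
	"time"
)

type procStat struct {
	Name  string        // e.g. "script @ default.yml:192"
	Total time.Duration // cumulative time spent in this processor
}

func topProcessors(stats []procStat, n int) []procStat {
	sort.Slice(stats, func(i, j int) bool { return stats[i].Total > stats[j].Total })
	if len(stats) > n {
		return stats[:n]
	}
	return stats
}
```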

}

func installIngestPipelines(api *elasticsearch.API, dataStreamPath string) (string, []ingest.Pipeline, error) {
func InstallDataStreamPipelines(api *elasticsearch.API, dataStreamPath string) (string, []Pipeline, error) {
Contributor


nit: is this renaming intentional? AFAIR there are packages with ingest pipelines detached from data streams, but most likely we don't support them here.

Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both signatures received the data stream path as a parameter; since I had to expose the function, it seemed clearer to me to rename it.

@marc-gr marc-gr requested a review from mtojek September 7, 2022 08:04
@jsoriano
Member

jsoriano commented Sep 7, 2022

/test

@jsoriano
Member

jsoriano commented Sep 7, 2022

@marc-gr you may need to rebase the PR or recreate it; CI seems to be looking for Adrian as a member of Elastic.

@marc-gr
Contributor

marc-gr commented Sep 7, 2022

Reopened as #958

@marc-gr marc-gr closed this Sep 7, 2022
@elasticmachine
Collaborator

💚 Build Succeeded


Build stats

  • Start Time: 2022-09-07T08:32:30.072+0000

  • Duration: 28 min 34 sec

Test stats 🧪

Test Results
Failed 0
Passed 798
Skipped 0
Total 798

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@mergify
Contributor

mergify bot commented Sep 7, 2022

⚠️ The sha of the head commit of this PR conflicts with #958. Mergify cannot evaluate rules on this PR. ⚠️

@elasticmachine
Collaborator

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (33/33) 💚
Files 66.667% (82/123) 👎 -0.282
Classes 61.143% (107/175) 👎 -1.281
Methods 48.345% (336/695) 👎 -1.28
Lines 31.917% (3034/9506) 👎 -1.379
Conditionals 100.0% (0/0) 💚
