Add benchmarks to pipeline tests #906
Conversation
❕ Build Aborted
Build stats
Steps errors
🌐 Coverage report
xUnit format doesn't allow multiple results in a single file.
I was wondering how this would do against a known performance issue, so I ran this against version 1.3.2 of the crowdstrike package (the 1.3.2 output). I also ran against 1.4.1 to see how it looked after the fix, and we see that the performance issue with the script processor is fixed. :-) (1.4.1)
```go
}

testCase := testCase{
	events: docs,
```
This might be better as a feature request.
I'm wondering if we could introduce batch_size here, divide the docs into those batches, and send each batch with simulatePipelineProcessing. That would get around the limitation we have right now where the number of docs must fit in one request, and would make the processing more like what users see in production. It might even be possible to concurrently call simulatePipelineProcessing to be even more like what users experience in production.
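A rough sketch of what that batching (with optional concurrency) could look like, assuming a hypothetical simulate callback standing in for simulatePipelineProcessing; the names and batch size are illustrative, not the actual elastic-package API:

```go
package pipelinebench

import (
	"context"
	"encoding/json"

	"golang.org/x/sync/errgroup"
)

// batchSize is illustrative; the real value would come from a benchmark option.
const batchSize = 500

// simulateInBatches splits docs into fixed-size batches and sends each batch
// with its own simulate call, running the calls concurrently. simulate stands
// in for something like simulatePipelineProcessing; its signature is assumed.
func simulateInBatches(ctx context.Context, docs []json.RawMessage,
	simulate func(context.Context, []json.RawMessage) error) error {

	g, ctx := errgroup.WithContext(ctx)
	for start := 0; start < len(docs); start += batchSize {
		end := start + batchSize
		if end > len(docs) {
			end = len(docs)
		}
		batch := docs[start:end] // each batch fits in a single request
		g.Go(func() error {
			return simulate(ctx, batch)
		})
	}
	return g.Wait()
}
```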
Let's start with CLI design!
Based on your implementation, @adriansr, I have a feeling that there are many conditions to determine whether this is a pipeline test or a pipeline benchmark. Same with formatting reports - you need to pass/return another structure to process the correct test type.
What do you think about creating a new command action, elastic-package benchmark, executed as elastic-package benchmark pipeline? We would leave benchmark open for other package components to measure and NOT couple it only with pipeline tests.
@ecosystem Happy to hear your comments on this.
BTW we should start with the spec PR first, as it introduces new _dev artifacts to the package.
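To make that proposal concrete, here is a minimal sketch of how such a command action could be wired with cobra; the function name, subcommand layout, and descriptions are illustrative assumptions, not the final design:

```go
package cmd

import (
	"errors"

	"github.com/spf13/cobra"
)

// setupBenchmarkCommand sketches a top-level `benchmark` command with
// `pipeline` as its first subcommand, leaving room for benchmarking other
// package components later without coupling it to pipeline tests.
func setupBenchmarkCommand() *cobra.Command {
	pipelineCmd := &cobra.Command{
		Use:   "pipeline",
		Short: "Benchmark ingest pipelines of the package",
		RunE: func(cmd *cobra.Command, args []string) error {
			// Placeholder: the real implementation would call into a
			// pipeline benchmark runner and format its report.
			return errors.New("not implemented")
		},
	}

	benchCmd := &cobra.Command{
		Use:   "benchmark",
		Short: "Run benchmarks for the package",
	}
	benchCmd.AddCommand(pipelineCmd)
	return benchCmd
}
```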
```go
func BenchmarkPipeline(options testrunner.TestOptions) (*testrunner.BenchmarkResult, error) {
	// Load all test documents
	docs, err := loadAllTestDocs(options.TestFolder.Path)
```
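For orientation, the core of such a benchmark boils down to timing a simulate call over the loaded docs and deriving a documents-per-second figure; a minimal sketch with assumed names, not the PR's actual implementation:

```go
package pipelinebench

import (
	"encoding/json"
	"time"
)

// measureThroughput is an assumed helper: it runs the loaded docs through a
// simulate callback once and derives a documents-per-second figure from the
// elapsed wall-clock time.
func measureThroughput(docs []json.RawMessage, simulate func([]json.RawMessage) error) (float64, error) {
	start := time.Now()
	if err := simulate(docs); err != nil {
		return 0, err
	}
	elapsed := time.Since(start)
	if elapsed <= 0 {
		return 0, nil // avoid dividing by zero on an immeasurably fast run
	}
	return float64(len(docs)) / elapsed.Seconds(), nil
}
```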
I guess that you need to prepare a spec PR first.
Will wait to open that one until we agree on the options and structure here.
Changes made:
Ok, so we need a few things:
- PR to package-spec to document the _dev/benchmark path.
- CI testing of the benchmark command.
- An option to fail the test run if benchmarks show slower results?
I suppose that the rest of the comments are just minor.
```go
failOnMissing, err := cmd.Flags().GetBool(cobraext.FailOnMissingFlagName)
if err != nil {
	return cobraext.FlagParsingError(err, cobraext.FailOnMissingFlagName)
}
```
Is it relevant for benchmark tests?
I guess it depends. I was wondering if it would be practical to add a config option to use the pipeline test events as samples, to avoid repetition and basically be able to leverage benchmarks for any package with pipeline tests without changes. If added, it would fall back to looking up pipeline test samples, and I guess that in a scenario like this, failing could be useful? WDYT?
If added, it would fall back to looking up pipeline test samples, and I guess that in a scenario like this, failing could be useful
Yes, in this case, it makes sense.
Based on your experience, how many cases would be covered by borrowing pipeline test events? Is it the majority or just a few samples? If that feature doesn't look to be popular, I would rather drop it.
I think it would be nice to aim to have this in some capacity as part of the CI, so I think all packages would end up using it in one way or another. cc @leehinman
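If that fallback were added, the selection logic could look roughly like the sketch below; the loader helper and the _dev paths shown are assumptions for illustration, not the PR's actual code:

```go
package pipelinebench

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// loadDocsFromDir reads every *.json file in dir as one raw document. Real
// sample files may bundle several events per file; this is simplified.
func loadDocsFromDir(dir string) ([]json.RawMessage, error) {
	paths, err := filepath.Glob(filepath.Join(dir, "*.json"))
	if err != nil {
		return nil, err
	}
	var docs []json.RawMessage
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			return nil, err
		}
		docs = append(docs, json.RawMessage(data))
	}
	return docs, nil
}

// loadBenchmarkDocs prefers dedicated benchmark samples and falls back to the
// pipeline test events when none are present. Both _dev paths are assumptions.
func loadBenchmarkDocs(dataStreamPath string) ([]json.RawMessage, error) {
	benchDir := filepath.Join(dataStreamPath, "_dev", "benchmark")
	if _, err := os.Stat(benchDir); err == nil {
		return loadDocsFromDir(benchDir)
	}

	testDir := filepath.Join(dataStreamPath, "_dev", "test", "pipeline")
	docs, err := loadDocsFromDir(testDir)
	if err != nil {
		return nil, fmt.Errorf("loading pipeline test events as fallback: %w", err)
	}
	if len(docs) == 0 {
		return nil, fmt.Errorf("no benchmark samples and no pipeline test events found in %s", dataStreamPath)
	}
	return docs, nil
}
```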
```go
var results []*benchrunner.Result
for _, folder := range benchFolders {
	r, err := benchrunner.Run(benchType, benchrunner.BenchOptions{
```
It would be great if you could run the benchmark test as part of our CI pipeline. We could test it in a continuous way.
/test
BTW Please ignore the failing
Spec PR in #906
/test |
I'm not a direct user of this feature, but what I'm missing here is an option to save results in the repo and use them as a baseline in the future. If you don't intend to use it this way, feel free to ignore this idea.
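One way such a baseline could work, sketched with hypothetical types that do not reflect the PR's actual report format:

```go
package pipelinebench

import "fmt"

// baselineEntry is a hypothetical stored result for one data stream.
type baselineEntry struct {
	DataStream string  `json:"data_stream"`
	DocsPerSec float64 `json:"docs_per_sec"`
}

// compareToBaseline reports data streams whose throughput dropped more than
// tolerance (e.g. 0.10 for 10%) below the stored baseline.
func compareToBaseline(baseline map[string]float64, current []baselineEntry, tolerance float64) []string {
	var regressions []string
	for _, entry := range current {
		prev, ok := baseline[entry.DataStream]
		if !ok {
			continue // no baseline recorded for this data stream yet
		}
		if entry.DocsPerSec < prev*(1-tolerance) {
			regressions = append(regressions, fmt.Sprintf(
				"%s: %.0f docs/s vs baseline %.0f docs/s",
				entry.DataStream, entry.DocsPerSec, prev))
		}
	}
	return regressions
}
```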
```go
const (
	// How many top processors to return.
	numTopProcs = 10
```
nit: maybe expose it via CLI options?
If we expose it through the CLI, it will use the same value for all the data streams in the package. In some cases, depending on the size of the events used to test, this might cause some simulations to fail with a 413. This means some packages might require different num_doc values for each data stream, which is why I added it to the benchmark config instead. Plus, if the intention is to use it in CI, it is much less flexible for the same reason. WDYT?
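For reference, numTopProcs feeds what is essentially a sort-and-truncate over the per-processor timings; a minimal sketch with an assumed stats shape, not the PR's actual types:

```go
package pipelinebench

import (
	"sort"
	"time"
)

// procStat is an assumed shape for per-processor timing data collected from
// the simulate responses.
type procStat struct {
	Name    string
	Elapsed time.Duration
}

// topProcessors returns the n slowest processors; numTopProcs in the PR plays
// the role of n here.
func topProcessors(stats []procStat, n int) []procStat {
	sorted := make([]procStat, len(stats))
	copy(sorted, stats)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Elapsed > sorted[j].Elapsed
	})
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n]
}
```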
```diff
 }
 
-func installIngestPipelines(api *elasticsearch.API, dataStreamPath string) (string, []ingest.Pipeline, error) {
+func InstallDataStreamPipelines(api *elasticsearch.API, dataStreamPath string) (string, []Pipeline, error) {
```
nit: is this renaming intentional? AFAIR there are packages with ingest pipelines detached from data streams, but most likely we don't support them here.
Both signatures received the data stream path as a parameter; since I had to expose it, it seemed clearer to me to rename it.
/test |
@marc-gr you may need to rebase the PR or recreate it, CI seems to be looking for Adrian as a member of Elastic.
Reopened #958
🌐 Coverage report
This PR adds initial benchmarking support to pipeline tests.
Configuration
3 new flags are added to the `test` subcommand:
- `--bench`: Enables benchmarking.
- `--bench-count`: Number of documents to use for benchmarking (default 1000).
- `--bench-duration`: Adjust the number of docs so that the benchmark runs for this duration (the count above is ignored).
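For example, an invocation might look like this, assuming the flags attach to the existing test pipeline subcommand; the duration value is just an illustration:

```sh
# Run pipeline tests with benchmarking enabled, using 1000 docs per data stream.
elastic-package test pipeline --bench --bench-count 1000

# Or size the run by time instead of document count.
elastic-package test pipeline --bench --bench-duration 10s
```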
Output (text format)

Output (xUnit)
Uses the xUnit format as defined in the Jenkins Benchmark plugin.
For the plugin to work it's necessary to generate an output file for each datastream, so the report generation had to be modified a bit.
The detailed processor reports are not included in the xUnit format, as they add too much data for the Benchmark plugin to handle.
Related issues