Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cli): added decompression benchmark #3773

Merged
merged 4 commits into from Apr 5, 2024

Conversation

jkowalski
Copy link
Contributor

No description provided.

Copy link

codecov bot commented Apr 5, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 22 lines in your changes are missing coverage. Please review.

Project coverage is 76.93%. Comparing base (cb455c6) to head (a3281ad).
Report is 88 commits behind head on master.

Files Patch % Lines
cli/command_benchmark_compression.go 75.60% 15 Missing and 5 partials ⚠️
cli/command_benchmark_ecc.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3773      +/-   ##
==========================================
+ Coverage   75.86%   76.93%   +1.07%     
==========================================
  Files         470      476       +6     
  Lines       37301    29017    -8284     
==========================================
- Hits        28299    22325    -5974     
+ Misses       7071     4749    -2322     
- Partials     1931     1943      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@julio-lopez julio-lopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1: LG 👍

A few comments and questions.
Feel free to merge, no need for an additional round of review here.

Thanks.

}()
}

// run one on the main goroutine and N-1 in parallel.
v := run()
v := run(args[len(args)-1])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for running one in the "caller" go routine? It seems the code could be simplified slightly if this case was removed.

It is surprising that only the result of one of the executions is returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

premature optimization :)

cli/command_benchmark_compression.go Outdated Show resolved Hide resolved
@@ -155,11 +212,86 @@ func (c *commandBenchmarkCompression) run(ctx context.Context) error {
return compressedSize
}

outputBuffers := makeOutputBuffers(c.parallel, defaultCompressedDataByMethod)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What prevents GC from running while benchmarking and affecting the results? Is GC turned off? would the last method to be benchmarked be likely to experience more memory pressure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we actually call runtime.GC() before each algorithm.

func (c *commandBenchmarkCompression) runCompression(ctx context.Context, data []byte, repeatCount int, algorithms map[compression.Name]compression.Compressor) error {
var results []compressionBechmarkResult

log(ctx).Infof("Compressing input file %q (%v) using all compression methods.", c.dataFile, units.BytesString(int64(len(data))))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it still the case that all compression methods are used here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced "all" with a number.

}

for _, a := range strings.Split(c.algorithms, ",") {
if strings.HasPrefix(string(name), a) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for my understanding: why HasPrefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you can test all "zstd" in one shot

}

if benchmarkDecompression {
if err := c.runDecompression(ctx, data, repeatCount, algorithms); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work when multiple algorithms are specified and only decompression is benchmarked? essentially, what's the input for multiple decompressors when there is no prior compression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we compress once per compressor, then decompress N times.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I missed that. Thanks.

@jkowalski jkowalski marked this pull request as ready for review April 5, 2024 01:35
@jkowalski jkowalski enabled auto-merge (squash) April 5, 2024 01:36
Copy link
Collaborator

@julio-lopez julio-lopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1: LG 👍

@jkowalski jkowalski merged commit fe7a418 into kopia:master Apr 5, 2024
26 of 27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants