
Refactor serialization of benchmarks #1131

Merged · 13 commits into main on Jan 12, 2024

Conversation

@syl20bnr (Member) commented Jan 11, 2024

See the Changes section below for the full list.

I will add an "upload" function to the persistence module in a follow-up PR.

The current naming of the saved files needs a bit of work: currently I just use benchmarks_<timestamp>.json, but we could hit a race condition and have some files overwritten. --> We updated it to bench_<name>_<timestamp>.json.

The computed values are saved in seconds; maybe we should save them in milliseconds? --> We changed it to microseconds.
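
A minimal sketch of the naming and unit conventions described above (the helper names are illustrative, not the actual persistence code):

    use std::time::{Duration, SystemTime, UNIX_EPOCH};

    /// Build the output file name following the bench_<name>_<timestamp>.json convention,
    /// using a millisecond timestamp so concurrent runs are unlikely to collide.
    fn benchmark_file_name(name: &str) -> String {
        let timestamp = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system time should be after the UNIX epoch")
            .as_millis();
        format!("bench_{name}_{timestamp}.json")
    }

    /// Store computed values as integer microseconds instead of fractional seconds.
    fn duration_to_micros(duration: Duration) -> u64 {
        duration.as_micros() as u64
    }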

You can see an example of a produced file here: https://gist.github.com/syl20bnr/b0fdbbf0c3877b9e4ddf262a9cc388ac

Checklist

  • Confirmed that run-checks all script has been executed.
  • Made sure the book is up to date with changes in this PR.

Related Issues/PRs

Changes

  • flatten the benchmark data to make it easier to save documents to a database and query them (see the sketch after this list)
  • split some information into its own fields, such as backend and device
  • add new serialized info:
    • computed values (mean, median, variance, min, max)
    • number of samples
    • operation name
    • tensor shapes, if any
  • serialize to separate files, one file per benchmark run
  • simplify the persistence module to a single save method
  • remove the remaining num_repeats in the benches' execute functions
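
To illustrate the flattened layout, here is a rough sketch of what one saved document could look like; the field names and types are assumptions based on the list above and the linked gist (and the snippet assumes serde with the derive feature), not the real schema:

    use serde::Serialize;

    /// One flattened document per benchmark run, easy to insert into a
    /// database and to query without unnesting.
    #[derive(Serialize)]
    struct BenchmarkRecord {
        backend: String,
        device: String,
        name: String,
        operation: String,
        /// Tensor shapes involved in the benchmark, if any.
        shapes: Vec<Vec<usize>>,
        num_samples: usize,
        /// Computed values, stored in microseconds.
        mean: u64,
        median: u64,
        variance: u64,
        min: u64,
        max: u64,
    }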

Testing

Tested serialization to disk with all the available benchmarks.

@syl20bnr (Member, Author)

For the .expect() messages, I am not sure what to put in them. I see a lot of people putting an explanation of the error there, and sometimes people put what is expected to work. The word "expect" suggests the latter, but that's not what I see online most of the time 🤷🏻‍♂️. What's our code guideline on this?

codecov bot commented Jan 11, 2024

Codecov Report

Attention: 80 lines in your changes are missing coverage. Please review.

Comparison is base (f43b686) 85.65% compared to head (94a214b) 85.67%.
Report is 3 commits behind head on main.

File                                         Patch %   Missing lines
backend-comparison/src/persistence/base.rs    0.00%    51 ⚠️
burn-common/src/benchmark.rs                 73.33%    28 ⚠️
burn-compute/src/tune/tune_benchmark.rs       0.00%     1 ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1131      +/-   ##
==========================================
+ Coverage   85.65%   85.67%   +0.02%     
==========================================
  Files         513      513              
  Lines       56816    56987     +171     
==========================================
+ Hits        48665    48823     +158     
- Misses       8151     8164      +13     


@syl20bnr (Member, Author) commented Jan 11, 2024

I updated the file name to include the bench name and the first 8 characters of a random UUID, for example:

  • benchmarks_gelu_1b2bb391.json
  • benchmarks_matmul_f79409f7.json

The name is passed as an argument to the save function because each bench run within a benchmark can have a different name and operation. For instance, with gelu we have 3 batches of benches using different gelu implementations.
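
Roughly, the save entry point then looks something like the following; the signature and the BenchmarkRecord type (from the earlier sketch) are assumptions rather than the final API, and the snippet assumes the uuid and serde_json crates:

    use uuid::Uuid;

    /// Save one benchmark run to its own JSON file. `name` comes from the caller
    /// because each bench executed by a binary can have a different name and operation.
    fn save(name: &str, record: &BenchmarkRecord) -> std::io::Result<()> {
        let id = Uuid::new_v4().to_string();
        let path = format!("benchmarks_{name}_{}.json", &id[..8]);
        let json = serde_json::to_string_pretty(record)
            .expect("benchmark record should serialize to JSON");
        std::fs::write(path, json)
    }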

@louisfd (Member) commented Jan 11, 2024

For the .expect() messages, I am not sure what to put in them. I see a lot of people putting an explanation of the error there, and sometimes people put what is expected to work. The word "expect" suggests the latter, but that's not what I see online most of the time 🤷🏻‍♂️. What's our code guideline on this?

From here: If you're having trouble remembering how to phrase expect error messages, remember to focus on the word "should", as in "env variable should be set by blah" or "the given binary should be available and executable by the current user".

So I just stick to the word "should" and it works out.
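
For example, following that convention (the environment variable and path here are purely illustrative):

    use std::path::PathBuf;

    fn output_dir() -> PathBuf {
        // Phrase the message as what should hold, not as a description of the failure.
        let home = std::env::var("HOME").expect("HOME should be set for the current user");
        PathBuf::from(home).join("benchmark_results")
    }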

@louisfd (Member) left a comment


It's very good in general. The only issue I see is how often we must specify the name of the benchmark (see my comments on unary, for example). Could it not be unified in one place?

Review threads (outdated, resolved): backend-comparison/benches/unary.rs ×3
@nathanielsimard (Member) left a comment


Instead of adding the new field "operation", we could add a field "ID". "Name" is more generic, and we may have some benchmarks that aren't about single operations. Or maybe we need another structure:

  • name (Could have a default implementation core::any::type_name::<Self>().to_string())
  • shapes
  • options (num_repeat, kind, etc.)

The ID could be generated from the concatenation of all of the above.
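
A sketch of that suggestion, with illustrative names and a naive concatenation for the ID:

    /// Metadata describing a single benchmark, from which an ID can be derived.
    struct BenchmarkInfo {
        /// Could default to core::any::type_name::<Self>() on the benchmark type.
        name: String,
        /// Tensor shapes involved, if any.
        shapes: Vec<Vec<usize>>,
        /// Extra options such as kind, formatted by the benchmark itself.
        options: Option<String>,
    }

    impl BenchmarkInfo {
        /// The ID is generated from the concatenation of all the fields above.
        fn id(&self) -> String {
            format!("{}-{:?}-{:?}", self.name, self.shapes, self.options)
        }
    }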

Review threads (resolved): backend-comparison/benches/binary.rs ×2, backend-comparison/src/persistence/base.rs ×4
@syl20bnr (Member, Author)

It's very good in general. The only issue I see is how often we must specify the name of the benchmark (see my comments on unary, for example). Could it not be unified in one place?

You are right. I thought about it, but I cannot fetch the binary name at runtime; the only references are in Cargo, on the command line, and in the source file name. I'll try to find something.

@syl20bnr (Member, Author)

Ready for another round of review.
Updated gist: https://gist.github.com/syl20bnr/b0fdbbf0c3877b9e4ddf262a9cc388ac

Files on disk naming examples:

  bench_binary_1705007825641.json
  bench_from_data_1705007836909.json
  bench_gelu_reference_1705007872731.json
  bench_gelu_withcustomerf_1705007872987.json
  bench_gelu_withreferenceerf_1705007872929.json
  bench_matmul_1705007855627.json
  bench_to_data_1705007832378.json
  bench_unary_1705007818349.json

I also added num_repeats to the documents; I believe it can be useful.

@nathanielsimard (Member) left a comment


I would not save the num_repeat in the database for now. I plan on eventually removing it since it is often confused with num_samples. It was useful when we didn't have a reliable way to sync a backend without actually reading the data. Repeating an execution meant that the data reading was less impactful.

Review thread (outdated, resolved): backend-comparison/benches/custom_gelu.rs
@syl20bnr (Member, Author)

@nathanielsimard I already save the number of samples. See the gist.

I believe the repeat can still be useful; check the docstring I added for it. I think it is a good concept to have:

    /// Number of executions per sample

Say we have a very quick operation to bench; repeating it to get a meaningful sample can be useful.
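
A sketch of the distinction between samples and repeats (this is not the actual loop in burn-common, just an illustration of the concept):

    use std::time::{Duration, Instant};

    /// Collect `num_samples` durations; each sample times `num_repeats`
    /// back-to-back executions so that very fast operations still produce
    /// a measurable duration per sample.
    fn run_benchmark(mut op: impl FnMut(), num_samples: usize, num_repeats: usize) -> Vec<Duration> {
        (0..num_samples)
            .map(|_| {
                let start = Instant::now();
                for _ in 0..num_repeats {
                    op();
                }
                start.elapsed()
            })
            .collect()
    }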

@syl20bnr (Member, Author) commented Jan 12, 2024

Ideally, the notion of repeat could be integrated into the main loop, but I tried it and it was not trivial to do.

@syl20bnr force-pushed the benchmark_persistence_refactor branch from 902441b to 000af88 on January 12, 2024 at 14:54
@syl20bnr (Member, Author) commented Jan 12, 2024

We decided to remove the num_repeats completely from the benchmarks.

@nathanielsimard (Member) left a comment


🎉

@syl20bnr force-pushed the benchmark_persistence_refactor branch from 4727dce to 278984c on January 12, 2024 at 15:18
@louisfd (Member) left a comment


That's very nice; I think we have reached the beautiful code stage.
I have only two minor requests, and then it's good to go.

Review threads (outdated, resolved): burn-common/src/benchmark.rs ×2
@syl20bnr merged commit 9bd2d7b into main on Jan 12, 2024
14 of 15 checks passed
@syl20bnr deleted the benchmark_persistence_refactor branch on February 13, 2024.