Create benchmarking tools for saving run/measurement data (with Falcon7b example) and model-demo utilities for verifying tokens/perf #9659
Conversation
T3k demo tests - https://github.com/tenstorrent/tt-metal/actions/runs/9668590898
Two things before I approve:
Thanks for sharing that, Bill, I hadn't considered it. It seems like a good tool for benchmarking functions/tests, but I don't think it will be easy to integrate with our requirements for the CSVs: they require a very specific timestamp format (which is why I added the BenchmarkProfiler), and we will need to benchmark blocks of code, not only functions.
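(For context, the kind of block-level timing described here could look roughly like the sketch below. The start/end-by-name interface and the ISO timestamps are assumptions for illustration, not the actual BenchmarkProfiler API added in this PR.)

```python
# Hypothetical sketch of timing arbitrary blocks of code with named steps
# (assumed interface, not the BenchmarkProfiler from this PR).
import time
from datetime import datetime, timezone


class BlockTimer:
    """Times named blocks of code and records ISO-format timestamps."""

    def __init__(self):
        self.steps = {}    # step name -> (start_ts, end_ts, duration_s)
        self._starts = {}

    def start(self, name: str):
        self._starts[name] = (datetime.now(timezone.utc), time.perf_counter())

    def end(self, name: str):
        start_ts, start = self._starts.pop(name)
        duration = time.perf_counter() - start
        end_ts = datetime.now(timezone.utc)
        self.steps[name] = (start_ts.isoformat(), end_ts.isoformat(), duration)


profiler = BlockTimer()
profiler.start("prefill")
# ... run the prefill block of the demo here ...
profiler.end("prefill")
```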
I don't see any tests with the following decorator/annotation tags. We are trying to make sure that all demos collect metrics.
Look for existing tests with these markings.
@pytest.mark.models_device_performance_bare_metal
@pytest.mark.models_performance_bare_metal
Note that we document this here: https://tenstorrent.github.io/tt-metal/latest/ttnn/ttnn/demos.html
See for example:
tests/ttnn/integration_tests/resnet/test_performance.py
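(For readers unfamiliar with these marks, a test tagged for the perf pipelines typically looks something like the sketch below; the test name, body, and threshold are placeholders rather than code taken from the resnet example above.)

```python
# Illustrative only: how a test is tagged for the bare-metal perf pipelines.
# The test body and numbers are placeholders.
import pytest


@pytest.mark.models_performance_bare_metal
def test_demo_e2e_perf():
    measured_tokens_per_s = 10.0  # placeholder: would come from running the demo
    expected_tokens_per_s = 8.0   # placeholder target
    assert measured_tokens_per_s >= expected_tokens_per_s
```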
Hi @eyonland, those decorators are used to specify which tests should be included in the perf pipelines. The demos belong to a separate pipeline and should not use them. Also, the decorators on existing demo tests are orthogonal to this PR and not affected by it. The purpose of this PR is to create new benchmarking tools for measuring metrics that will be adopted initially by the demos and subsequently by other tests.
you could move those blocks of code into functions :)
We may want to log many steps (such as in the demo), so making everything a function might be overkill (plus the point above about the timestamps).
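(As a hypothetical illustration of logging many steps without wrapping each one in a function, a decode loop could reuse the BlockTimer sketched in the earlier comment; the iteration count and loop body below are placeholders.)

```python
# Hypothetical: one timed step per decode iteration instead of one function per step.
profiler = BlockTimer()
profiler.start("run")
for i in range(32):                  # placeholder number of decode iterations
    profiler.start(f"decode_iter_{i}")
    _ = sum(range(1000))             # stand-in for the demo's decode step
    profiler.end(f"decode_iter_{i}")
profiler.end("run")
```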
Could we have a meeting on this? It sounds like we want to deviate from the expectations of demos as described here: https://tenstorrent.github.io/tt-metal/latest/ttnn/ttnn/demos.html
@skhorasganiTT / @uaydonat, based on our conversation, feel free to add an update to this PR describing how this new pipeline fits into the docs here: https://tenstorrent.github.io/tt-metal/latest/ttnn/ttnn/demos.html
Thanks for the clarity on this.
Ticket
Problem description
What's changed
Added models/perf/benchmarking_utils.py containing tools for profiling data and saving CSVs in the appropriate formats (the requirements for these CSVs were specified by the data science team). These tools can be used for any type of test, not only demos. Two CSVs are saved for any run: "run_<start_ts>.csv" and "measurement_<start_ts>.csv".
Added models/demos/utils/llm_demo_utils.py containing a function which adds demo measurements using the benchmarking tools mentioned above. This is only applicable to model demos and defines certain requirements for the data produced by the demos.
In the llm_demo_utils.py file mentioned above, created functions for doing output token verification and output perf verification.
Checklist
cc @uaydonat
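To make the "What's changed" items above concrete, here is a minimal sketch of how a demo might collect measurements and trigger the CSV writing; the helper name, fields, and numbers are assumptions based on this description, not the exact API in models/perf/benchmarking_utils.py or models/demos/utils/llm_demo_utils.py.

```python
# Hypothetical end-to-end usage sketch; the helper below is a placeholder for the
# real utilities, which write run_<start_ts>.csv and measurement_<start_ts>.csv.
from datetime import datetime, timezone


def save_benchmark_csvs(run_info: dict, measurements: dict) -> None:
    """Placeholder for the real CSV-writing utility: one row describing the run,
    plus one measurement row per metric."""
    start_ts = run_info["start_ts"]
    print(f"would write run_{start_ts}.csv and measurement_{start_ts}.csv")
    for name, value in measurements.items():
        print(f"  measurement: {name} = {value}")


run_info = {
    "model": "falcon7b",
    "start_ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H%M%S"),
}
measurements = {"prefill_t/s": 4100.0, "decode_t/s/u": 14.2}  # placeholder numbers
save_benchmark_csvs(run_info, measurements)
```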