[test] Extend GPU burn test to report GPU node with smallest flops #1653
Codecov Report
@@ Coverage Diff @@
## master #1653 +/- ##
==========================================
- Coverage 87.72% 87.65% -0.07%
==========================================
Files 45 45
Lines 7477 7485 +8
==========================================
+ Hits 6559 6561 +2
- Misses 918 924 +6
Continue to review full report at Codecov.
jjotero
left a comment
How about adding an optional step after the performance check, with something like self.perf_error_patterns? This step would process the output and collect all the information on how and where the test failed. That way, it would only run if the performance check fails, and you would avoid redundant output when the test passes.
@jjotero Is this a general feature request?
@jjotero
I guess it could be. Should I open an issue and move the discussion over there?
Yes, it would make sense.
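A rough sketch of the idea discussed above, in plain Python: the extra output processing runs only when the measured value falls outside the reference. The names `within_reference` and `perf_report` are hypothetical, and `perf_error_patterns` is only the name suggested in the comment; none of this is existing ReFrame API. The reference tuple is assumed to be (target, lower threshold, upper threshold, unit), with thresholds given as fractions of the target.

```python
import re


def within_reference(value, ref):
    """Check a measured value against (target, lower, upper, unit).

    Thresholds are fractions of the target; None means "no limit".
    """
    target, lower, upper, _unit = ref
    if lower is not None and value < target * (1 + lower):
        return False
    if upper is not None and value > target * (1 + upper):
        return False
    return True


def perf_report(output, value, ref, perf_error_patterns):
    """Return extra diagnostics only when the performance check fails.

    When the value is within the reference, nothing is collected, so a
    passing test produces no additional output.
    """
    if within_reference(value, ref):
        return []
    compiled = [re.compile(p) for p in perf_error_patterns]
    return [
        line
        for line in output.splitlines()
        if any(rx.search(line) for rx in compiled)
    ]


# Hypothetical output lines, for illustration only.
sample_output = (
    "nid00001 GPU 0: 4120.0 Gflop/s\n"
    "nid00002 GPU 0: 3650.0 Gflop/s\n"
)
reference = (4115, -0.10, None, 'Gflop/s')
```

With these assumptions, `perf_report(sample_output, 3650.0, reference, [r'nid00002'])` would return the matching line, while a passing value returns an empty list.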
vkarak
left a comment
@jgphpc I fixed the PR.
GpuBurnTest
Goal is to report the nidname when the test fails to meet the perf. reference
'perf': (4115, -0.10, None, 'Gflop/s')
but I am not sure how to do that... Feel free to push.
For now, replacing perf: with the last line gives some info.
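One possible way to get at the node name is to parse the per-node figures from the output and pick the minimum. The sketch below is plain Python and assumes a hypothetical output format of `<node> GPU <id>: <value> Gflop/s` per line; the actual gpu_burn output may differ, and `slowest_node` and `lower_bound` are illustrative names, not ReFrame API. The lower bound is derived from the reference above, reading -0.10 as a 10% tolerance below 4115 Gflop/s.

```python
import re

# Assumed line format (hypothetical): "<node> GPU <id>: <value> Gflop/s"
LINE_RE = re.compile(
    r'^(?P<node>\S+)\s+GPU\s+\d+:\s+(?P<gflops>\d+(?:\.\d+)?)\s+Gflop/s',
    re.MULTILINE,
)


def slowest_node(output):
    """Return (node, gflops) for the GPU with the smallest Gflop/s."""
    results = [
        (float(m['gflops']), m['node']) for m in LINE_RE.finditer(output)
    ]
    if not results:
        raise ValueError('no Gflop/s lines found in output')
    gflops, node = min(results)
    return node, gflops


def lower_bound(target, lower_thres):
    """Smallest acceptable value for a reference target and threshold."""
    return target * (1 + lower_thres)


# Made-up sample data, for illustration only.
sample = (
    "nid00001 GPU 0: 4120.0 Gflop/s\n"
    "nid00002 GPU 0: 3650.0 Gflop/s\n"
    "nid00003 GPU 0: 4098.5 Gflop/s\n"
)

node, gflops = slowest_node(sample)
if gflops < lower_bound(4115, -0.10):
    print(f'{node} is below the reference: {gflops} Gflop/s')
```

Reporting only the minimum keeps the output short: if the slowest GPU meets the reference, all others do too.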