-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding CPU Measurements and Unit Tests #14470
base: master
Are you sure you want to change the base?
Conversation
Hello, due to I couldn't reproduce the issue locally. Who knows where to get the the log as below: |
This commit is about reporting cpu user time and cpu system time, that is already available, but not printed in any log. Part one The solution adds CPU metrics after a build as seen below: INFO: 2 processes: 1 internal, 1 linux-sandbox. INFO: Total action wall time 70.05s INFO: Critical path 70.08s (setup 0.00s, action wall time 70.05s) INFO: Elapsed time 71.58s (preparation 1.45s, execution 70.13s) INFO: CPU time 5.31s (user 0.00s, system 0.01s, bazel 5.29s) INFO: Build completed successfully, 2 total actions If any of the values is not collected, "???" is printed instead like: INFO: 415 processes: 1 internal, 414 processwrapper-sandbox. INFO: Total action wall time 127.84s INFO: Critical path 2.58s (setup 0.00s, action wall time 2.51s) INFO: Elapsed time 38.37s (preparation 3.98s, execution 34.39s) INFO: CPU time ???s (user 79.45s, system 22.39s, bazel ???s) INFO: Build completed successfully, 415 total actions Note: The values is only presented if the following flag is set: --experimental_stats_summary The following flags must be set to get the measurements to be ollected: --experimental_collect_local_sandbox_action_metrics --experimental_profile_cpu_usage (set but default) Part two Unit Test is added in runtime-tests. Test command: ./misc/bazel test --experimental_stats_summary //src/test/java/com/google/devtools/build/lib:runtime-tests Note: Due to Reporter class has keywords 'final', it'll block mock in UT. We didn't find a better way to solve this issue. So in this commit, this keyword 'final' in Reporter was removed.
Here is the answer for my question above.:
|
New commit fixed the issue in Mac OS. It's due to Mac OS uses "," to divide integer and decimal fraction instead of '.' in other OS. E.g.: In Mac OS, it's '9,00'. In other OS, it's '9.00'. |
@ecngtng It's entirely possible that that difference in punctuation is not due to being on Mac per se, but due to Mac doing different i18n. |
@larsrc-google So in this case, is it reasonable to distinguish the punctuation with related key words in OS name? Or? |
Distinguishing on OS might cause problems for users in other countries with different default Locale. One alternative could be to get the JVMs decimal separator char for current Another alternative could be to use |
Change distinguishing from OS to getDecimalSeparator().
@ulrfa @larsrc-google Thanks for your good comments. Codes have been updated with your comment. |
Is this the right place to expose it? Most users likely don't care about CPU time. The idea behind If a user cares, they can always look it up in the BEP and the trace profile. |
Thank you @meisterT for reviewing! AFAIK, BEP provides CPU time only for:
and the
I assume the metrics above are relevant mostly for the developers of bazel codebase itself. But this PR provides CPU time for:
I think the CPU time, for the JVM and subprocesses combined, is one of the most relevant metrics for a user that would like to grasp the overall resource consumption of a build. In the future, I imagine storing subprocess's CPU time per action in the profile json file, and then calculate summary grouped by mnemonic, to see how much CPU each tool in the build chain consumes. But I think this PR is a good start and makes information conveniently available. Perhaps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started a review, then realized a general problem: Action CPU time is readily available for standalone/sandboxed actions. For singleplex worker actions, it could be found by taking the processes CPU time before and after making a request (I don't think we do that by default, but we could). But multiplex worker actions simply don't have a well-defined CPU time concept (actually the singleplex action would paper over some things like GC between requests). Since even a single action without CPU time makes the whole thing unknown, it will probably end up being unknown more often than not.
The way around this would be to measure non-worker actions the way you currently do, but measure worker actions by taking the CPU time at start and end (of build or of worker).
@@ -60,6 +63,10 @@ | |||
private long executionEndMillis; | |||
private SpawnStats spawnStats; | |||
private Path profilePath; | |||
private static final long UNKNOWN = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rename this to "UNKNOWN_CPU_TIME" to be a bit clearer?
Optional<Duration> cpuUserTimeForActionsDuration = event.getActionResult().cumulativeCommandExecutionUserTime(); | ||
|
||
if(cpuUserTimeForActionsDuration.isPresent() && (cpuUserTimeForActions != UNKNOWN)) { | ||
cpuUserTimeForActions = cpuUserTimeForActions + cpuUserTimeForActionsDuration.get().toMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use += here and below.
src/main/java/com/google/devtools/build/lib/runtime/BuildSummaryStatsModule.java
Outdated
Show resolved
Hide resolved
} | ||
Optional<Duration> cpuUserTimeForActionsDuration = event.getActionResult().cumulativeCommandExecutionUserTime(); | ||
|
||
if(cpuUserTimeForActionsDuration.isPresent() && (cpuUserTimeForActions != UNKNOWN)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Unnecessary extra parentheses.
} | ||
} | ||
|
||
private static long sumCpuTimes(long a, long b, long c) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're hardcoding this to three parameters, you might as well name them with what kinds of CPU times they are.
when(result.cumulativeCommandExecutionUserTime()).thenReturn(Optional.empty()); | ||
} | ||
else{ | ||
when(result.cumulativeCommandExecutionUserTime()).thenReturn(Optional.of(userTime)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can simplify these with Optional.ofNullable()
.
field1.setLong(buildSummaryStatsModule, 11000); | ||
buildSummaryStatsModule.buildComplete(createBuildEvent()); | ||
if(Comma) { | ||
verify(reporterMock).handle(Event.info("CPU time 88,00s (user 55,00s, system 22,00s, bazel 11,00s)")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comma problem would be easier to understand if you use String.format
here instead of having the Comma
variable. A note on why it's done that way would be good, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this comment, I think you want it to be changed like:
verify(reporterMock).handle(Event.info(String.format("CPU time %.2fs (user %.2fs, system %.2fs, bazel %.2fs)",88.00, 55.00, 22.00, 11.00)));
Is it right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that should do it. Plus a comment on why it's not just a fixed string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope, in this way, it could be determined by the system to use '.' or ',' to expresss the floating number.
Hello @ecngtng, could you please confirm that all the review comment are taken care? Thanks. |
@ckolli5 yes, all the comments are taken care. |
@larsrc-google Codes have been changed according to your comments. Could you please help to check? Thanks! |
src/main/java/com/google/devtools/build/lib/events/Reporter.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/runtime/BuildSummaryStatsModule.java
Outdated
Show resolved
Hide resolved
@@ -206,6 +229,34 @@ public void buildComplete(BuildCompleteEvent event) { | |||
criticalPathComputer = null; | |||
} | |||
profilePath = null; | |||
cpuUserTimeForActions = Duration.ofMillis(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zeroing them on startup would also make it clearer that they are properly initialized.
Thank you. You missed one comment from before, and I added a couple small things. |
@larsrc-google Thanks for your good comments. Except the comment that I'm not sure if I have got your point, all the others have been updated as you suggested. About "Zeroing them on startup would also make it clearer that they are properly initialized.", the zeroing actions have been done in both executorInit() and buildComplete(). Is it not ok? Or? |
I don't see the zeroing happening in |
@larsrc-google You're right. After checking, I found they were really not in last commits. I added them back now. (Every interesting, I ever checked codes and confirmed that they merged several days ago.) |
@larsrc-google Codes have been changed and all checks passed. Could you please help to check? Thanks! |
Looks nice. I'll do some extra checks internally, and then I think we can merge it. |
Getting bogged down in all the technical bits, I forgot about @meisterT's comments above. Sorry. We want to reduce the terminal output, not increase it. Putting it in the log (as indicated in the original comment) would be fine, though. |
Hi @wilwell, Since I can see that this PR has been approved, please let me know whether I should proceed with importing it. |
No, not yet. As per my last comment adding this info to the output from every build is not desirable. |
This commit is about reporting CPU user time and CPU system time,
that is already available, but not printed in any log.
Part one
The solution adds CPU metrics after a build as seen below:
INFO: 2 processes: 1 internal, 1 linux-sandbox.
INFO: Total action wall time 70.05s
INFO: Critical path 70.08s (setup 0.00s, action wall time 70.05s)
INFO: Elapsed time 71.58s (preparation 1.45s, execution 70.13s)
INFO: CPU time 5.31s (user 0.00s, system 0.01s, bazel 5.29s)
INFO: Build completed successfully, 2 total actions
If any of the values is not collected, "???" is printed instead like:
INFO: 415 processes: 1 internal, 414 processwrapper-sandbox.
INFO: Total action wall time 127.84s
INFO: Critical path 2.58s (setup 0.00s, action wall time 2.51s)
INFO: Elapsed time 38.37s (preparation 3.98s, execution 34.39s)
INFO: CPU time ???s (user 79.45s, system 22.39s, bazel ???s)
INFO: Build completed successfully, 415 total actions
Note:
The values are only presented if the following flag is set:
--experimental_stats_summary
The following flags must be set to get the measurements collected:
--experimental_collect_local_sandbox_action_metrics
--experimental_profile_cpu_usage (set but default)
Part two
Unit Test is added in runtime-tests.
Test command:
bazel test --experimental_stats_summary //src/test/java/com/google/devtools/build/lib:runtime-tests