
Consider the remaining job capacity in main loop #1949

Merged: 1 commit merged into ninja-build:master from load-capacity on Feb 29, 2024

Conversation

@Flowdalic (Contributor) commented on Apr 8, 2021

This changes CanRunMore() to return an int instead of a bool. The
return value is the remaining capacity for new jobs that can be
spawned without saturating a potentially enabled load limit (if
ninja's -l option is used). We assume that every started edge
increases the load by one, hence the available "load capacity" is the
maximum allowed load minus the current load.

So, instead of spawning new jobs limited only by the remaining job
capacity ("-j") as long as the load-average limit has not yet been
reached, we spawn only as many new jobs as the load limit still
allows.
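
As a minimal sketch of the idea (a hypothetical free-standing function; the names and the stall-avoidance detail are illustrative, not the exact patch):

// Sketch: compute the remaining job capacity instead of a yes/no answer.
// parallelism: the -j limit; max_load_average: the -l limit (<= 0 if unset);
// running_jobs: number of currently spawned subprocesses;
// load_average: current 1-min load average as reported by the OS.
int RemainingCapacity(int parallelism, double max_load_average,
                      int running_jobs, double load_average) {
  int capacity = parallelism - running_jobs;

  if (max_load_average > 0.0) {
    // Each running job is assumed to contribute 1 to the load, so the
    // load limit caps the capacity as well.
    int load_capacity = static_cast<int>(max_load_average - load_average);
    if (load_capacity < capacity)
      capacity = load_capacity;
  }

  if (capacity < 0)
    capacity = 0;

  // Always allow at least one job so the build cannot stall.
  if (capacity == 0 && running_jobs == 0)
    capacity = 1;

  return capacity;
}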

This helps to speed up multiple parallel builds on the same host, as
the following benchmark shows. We compare the total build times of 8
parallel builds of LLVM on a 256-core system using "ninja -l 258".

ninja-master: 1351 seconds
ninja-load-capacity: 920 seconds

That is, with this commit, the whole process becomes 1.46× faster.

The benchmark script creates and prepares 8 build directories,
records the start time, spawns 8 subshells invoking "ninja -l 258",
awaits the termination of those subshells, and records the end
time. Besides the total running time, it also outputs /proc/loadavg,
which provides an indication of where the performance is gained:

ninja-master:         3.90  93.94 146.38 1/1936 209125
ninja-load-capacity: 92.46 210.50 199.90 1/1936 36917

So with this change, ninja makes better use of the available hardware
cores in the presence of competing ninja processes, while avoiding
overloading the system.

The benchmark script, for reference:

#!/usr/bin/env bash
set -euo pipefail

VANILLA_NINJA=~/code/ninja-master/build/ninja
LOAD_CAPACITY_AWARE_NINJA=~/code/ninja-load-capacity/build/ninja

CMAKE_NINJA_PROJECT_SOURCE=~/code/llvm-project/llvm

declare -ir PARALLEL_BUILDS=8
readonly TMP_DIR=$(mktemp --directory --tmpdir=/var/tmp)

cleanup() {
	rm -rf "${TMP_DIR}"
}
trap cleanup EXIT

BUILD_DIRS=()
for i in $(seq 1 ${PARALLEL_BUILDS}); do
	BUILD_DIR="${TMP_DIR}/${i}"
	mkdir "${BUILD_DIR}"
	(
		cd "${BUILD_DIR}"
		cmake -G Ninja "${CMAKE_NINJA_PROJECT_SOURCE}" &> "${BUILD_DIR}/build.log"
	)&
	BUILD_DIRS+=("${BUILD_DIR}")
done
wait

NPROC=$(nproc)
MAX_LOAD=$(( NPROC + 2 ))
SLEEP_SECONDS=300

for NINJA_BIN in "${VANILLA_NINJA}" "${LOAD_CAPACITY_AWARE_NINJA}"; do
	for BUILD_DIR in "${BUILD_DIRS[@]}"; do
		(
			"${NINJA_BIN}" -C "${BUILD_DIR}" clean &> "${BUILD_DIR}/build.log"
		)&
	done
	wait

	echo "Starting build with ${NINJA_BIN} using -j ${MAX_LOAD}"
	START=$(date +%s)
	for BUILD_DIR in "${BUILD_DIRS[@]}"; do
		(
			"${NINJA_BIN}" -C "${BUILD_DIR}" -l "${MAX_LOAD}" &> "${BUILD_DIR}/build.log"
		)&
	done
	wait
	STOP=$(date +%s)

	DELTA_SECONDS=$((STOP - START))
	echo "Using ${NINJA_BIN} to perform ${PARALLEL_BUILDS} of ${CMAKE_NINJA_PROJECT_SOURCE}"
	echo "took ${DELTA_SECONDS} seconds on this ${NPROC} core system using -j ${MAX_LOAD}"
	echo "/proc/loadavg:"
	cat /proc/loadavg
	echo "ninja --version:"
	"${NINJA_BIN}" --version

	echo "Sleeping "${SLEEP_SECONDS}" seconds to bring system into quiescent state"
	sleep ${SLEEP_SECONDS}
done

@Flowdalic (Contributor, Author) commented:

I am not sure if the Windows build failure is caused by my changes or something else. Since I am not a Windows person, any help is appreciated. Thanks. :)

@jhasse (Collaborator) commented on Apr 8, 2021

Why is this faster?

@Flowdalic (Contributor, Author) commented on Apr 13, 2021

> Why is this faster?

Due to the reduction in the total number of context switches involved.

I tried to explain the situation better in the new commit message, after realizing that it is maybe not so obvious and that I failed to convey it in the initial commit message. Please have a look at the new commit message and let me know if you have any further questions. :)

@jhasse (Collaborator) commented on Apr 13, 2021

Thanks! The new commit message is great.

I'm not a user of the load limit parameter; maybe someone else can comment / review.

The Windows build failure is a general problem (see master branch) which I don't understand either :(

@Flowdalic (Contributor, Author) commented:

Friendly reminder :)

@Flowdalic (Contributor, Author) commented:

@jhasse friendly reminder. Anything I can do to get this merged?

@Flowdalic (Contributor, Author) commented:

Looks like someone needs to approve the CI workflows. Thanks.

@jonesmz: This comment was marked as abuse.

@Flowdalic
Copy link
Contributor Author

> @Flowdalic it's not that the CI workflow needs approval.

I am not sure about that. I see here:

3 workflows awaiting approval
First-time contributors need a maintainer to approve running workflows. Learn more.
1 successful check

/usr/include/c++/4.8.2/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
 #error This file requires compiler and library support for the \

Fixed by replacing cstdint with stdint.h.

@Flowdalic (Contributor, Author) commented:

Squashed everything into a single commit. Someone may want to approve the workflows. Anything else to get this merged?

@Flowdalic (Contributor, Author) commented:

Friendly Ping. Anything I can do to get this merged?

@Flowdalic (Contributor, Author) commented:

Another friendly Ping. Anything I can do to get this merged?

@jhasse added this to the 1.12.0 milestone on Feb 7, 2022
@@ -18,6 +18,8 @@
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <climits>
#include <stdint.h>

This comment was marked as abuse.

@Flowdalic force-pushed the load-capacity branch 2 times, most recently from 7bc59c1 to 3a1fb50 on June 11, 2022
@Flowdalic (Contributor, Author) commented on Jun 11, 2022

Note that I had to revert the suggested changes because ninja does not support C++11 yet.

@Flowdalic (Contributor, Author) commented:

Friendly ping.

@Flowdalic force-pushed the load-capacity branch 2 times, most recently from a0f5251 to e60933b on November 23, 2023
@Flowdalic (Contributor, Author) commented on Nov 23, 2023

@jhasse thanks for your review. I have addressed your comments.

@Flowdalic force-pushed the load-capacity branch 2 times, most recently from 3f9126d to f011238 on November 23, 2023
This changes CanRunMore() to return an int instead of a bool. The
return value is the "remaining load capacity": the number of new jobs
that can be spawned without saturating a potentially enabled load
limit (if ninja's -l option is used). We assume that every started
edge increases the load by one. Hence the available "load capacity"
is the maximum allowed load minus the current load.

Previously, ninja would oversaturate the system with jobs when
multiple ninja builds were running, even though a load and job limit
was provided. This is because changes in the load average lag behind:
newly started jobs do not immediately change the load average, yet
ninja assumed that new jobs are immediately reflected in it. Ninja
would retrieve the current 1-min load average, check whether it is
below the limit, if so start a new job, and then repeat. Since it
takes a while for a new job to be reflected in the load average,
ninja would often spawn jobs until the job limit ("-j") is
reached. If this is done by multiple parallel ninja builds, the
system becomes oversaturated, causing excessive context switches,
which eventually slow down each and every build process.

We can easily prevent this by considering the remaining load capacity
in ninja's main loop.
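
As a rough sketch of how the main loop can consume that capacity (stand-in types and functions, not ninja's real API):

// Stand-ins for this sketch only:
struct Edge;                       // a ready-to-run build edge
Edge* FindWork();                  // next ready edge, or nullptr
void StartCommand(Edge* edge);     // spawn the edge's command
int CanRunMore();                  // remaining capacity, as above

// Start at most CanRunMore() edges, then go back to waiting for
// running commands to finish before re-evaluating the load.
int StartAvailableWork() {
  int started = 0;
  int capacity = CanRunMore();     // an int now, not a bool
  while (capacity > 0) {
    Edge* edge = FindWork();
    if (!edge)
      break;                       // nothing ready right now
    StartCommand(edge);
    ++started;
    --capacity;                    // each started edge adds 1 to the load
  }
  return started;
}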

The following benchmark demonstrates how the change of this commit
helps to speed up multiple parallel builds on the same host. We
compare the total build times of 8 parallel builds of LLVM on a
256-core system using "ninja -l 258".

ninja-master:        1351 seconds
ninja-load-capacity:  920 seconds

That is, with this commit, the whole process becomes 1.46× faster.

The benchmark script creates and prepares 8 build directories,
records the start time, spawns 8 subshells invoking "ninja -l 258",
awaits the termination of those subshells, and records the end
time. Besides the total running time, it also outputs /proc/loadavg,
which provides an indication of where the performance is gained:

ninja-master:         3.90  93.94 146.38 1/1936 209125
ninja-load-capacity: 92.46 210.50 199.90 1/1936 36917

So with this change, ninja makes better use of the available hardware
cores in the presence of competing ninja processes, while not
overloading the system.

Finally, let us look at the two "dstat -cdgyl 60" traces of 8
parallel LLVM builds on a 256-core machine using "ninja -l 258":

ninja-master
--total-cpu-usage-- -dsk/total- ---paging-- ---system-- ---load-avg---
usr sys idl wai stl| read  writ|  in   out | int   csw | 1m   5m  15m
  1   0  99   0   0|  12k 4759k|   5B   55B|1135   455 |17.9 70.3 38.1
 38   6  56   0   0|2458B 7988k| 205B    0 |  34k   23k| 466  170 73.2
 26   3  71   0   0| 102k   94M|   0     0 |  22k 6265 | 239  156 74.3
 50   5  45   0   0|3149B   97M|   0     0 |  37k   12k| 257  191 92.2
 58   6  36   0   0|  90k   71M|   0     0 |  43k   12k| 320  224  110
 50   4  46   0   0|  52k   78M|   0     0 |  38k 6690 | 247  223  117
 50   5  45   0   0| 202k   90M|   0     0 |  37k 9876 | 239  238  130
 60   5  34   0   0| 109k   93M|   0     0 |  44k 8950 | 247  248  140
 69   5  26   0   0|5939B   93M|   0     0 |  50k   11k| 309  268  154
 49   4  47   0   0| 172k  111M|   0     0 |  36k 7835 | 283  267  161
 58   7  35   0   0|  29k  142M|   0     0 |  45k 7666 | 261  267  168
 72   4  24   0   0|  46k  281M|   0     0 |  50k   13k| 384  296  183
 49   6  46   0   0|  68B  198M|   0     0 |  37k 6847 | 281  281  185
 82   6  12   0   0|   0    97M|   0     0 |  59k   15k| 462  323  205
 31   5  63   0   0|   0   301M|   0     0 |  26k 5350 | 251  291  202
 66   7  28   0   0|  68B  254M|   0     0 |  49k 9091 | 270  292  208
 68   8  25   0   0|   0   230M|   0     0 |  51k 8186 | 287  292  213
 52   5  42   1   0|   0   407M|   0     0 |  42k 5619 | 207  271  211
 29   7  64   0   0|   0   418M|   0     0 |  27k 2801 | 131  241  205
  1   1  98   0   0| 137B  267M|   0     0 |1944   813 |55.8  199  193
  0   0 100   0   0|2253B   43M|   0     0 | 582   365 |26.8  165  181
  0   0  99   0   0|   0    68M|   0     0 | 706   414 |11.5  136  170
  4   0  96   0   0|   0    13M|   0     0 |2892   378 |10.0  113  160

ninja-load-capacity
--total-cpu-usage-- -dsk/total- ---paging-- ---system-- ---load-avg---
usr sys idl wai stl| read  writ|  in   out | int   csw | 1m   5m  15m
  1   0  98   0   0|  12k 5079k|   5B   55B|1201   470 |1.35 40.2  115
 43   6  51   0   0|3345B   78M|   0     0 |  34k   20k| 247  127  142
 71   6  23   0   0|   0    59M|   0     0 |  53k 8485 | 286  159  152
 60   5  35   0   0|  68B  118M|   0     0 |  45k 7125 | 277  178  158
 62   4  35   0   0|   0   115M|   0     0 |  45k 6036 | 248  188  163
 61   5  34   0   0|   0    96M|   0     0 |  44k 9448 | 284  212  173
 66   5  28   0   0|   9B   94M|   0     0 |  49k 5733 | 266  219  178
 64   7  29   0   0|   0   159M|   0     0 |  49k 6350 | 241  223  182
 66   6  28   0   0|   0   240M|   0     0 |  50k 9325 | 285  241  191
 68   4  27   0   0|   0   204M|   0     0 |  49k 5550 | 262  241  194
 68   8  24   0   0|   0   161M|   0     0 |  53k 6368 | 255  244  198
 79   7  14   0   0|   0   325M|   0     0 |  59k 5910 | 264  249  202
 72   6  22   0   0|   0   367M|   0     0 |  54k 6684 | 253  249  205
 71   6  22   1   0|   0   377M|   0     0 |  52k 8175 | 284  257  211
 48   8  44   0   0|   0   417M|   0     0 |  40k 5878 | 223  247  210
 23   4  73   0   0|   0   238M|   0     0 |  22k 1644 | 114  214  201
  0   0 100   0   0|   0   264M|   0     0 |1016   813 |43.3  175  189
  0   0 100   0   0|   0    95M|   0     0 | 670   480 |17.1  144  177

As one can see in the above dstat traces, ninja-master shows a high
1-min load average of up to 462. This is because ninja does not
consider the remaining load capacity when spawning new jobs, but
instead spawns new jobs until it runs into the -j limit. This, in
turn, causes an increase in context switches: the rows with a high
1-min load average also show >10k context switches (csw). A
load-capacity-aware ninja, in contrast, avoids oversaturating the
system with excessive additional jobs.

Note that since the load average is an exponentially damped moving
sum, build systems that take the load average into account to limit
it to the number of available processors will always (slightly)
overprovision the system with tasks. Effectively, this change reduces
the aggressiveness with which ninja schedules new jobs when the '-l'
knob is used, and thereby lowers the level of overprovisioning to a
reasonable degree compared to the status quo. It should be mentioned
that an individual build using '-l' may therefore become slightly
slower. However, this can easily be addressed by increasing the value
passed to the '-l' argument.
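
For reference, the Linux kernel maintains each load average as an
exponentially damped moving sum, roughly of the form below (sampling
interval d = 5 s, time constant tau = 60 s for the 1-min average,
n(t) the number of runnable tasks; details vary by kernel version):

L(t) = L(t - d) * exp(-d/tau) + n(t) * (1 - exp(-d/tau))

Because a job started now only reaches its full weight in L(t) after
several sampling periods, any scheduler that gates on L(t) alone will
briefly see headroom that is already spoken for, hence the slight
overprovisioning described above.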

The benchmarks were performed using the following script:

#!/usr/bin/env bash
set -euo pipefail

VANILLA_NINJA=~/code/ninja-master/build/ninja
LOAD_CAPACITY_AWARE_NINJA=~/code/ninja-load-capacity/build/ninja
CMAKE_NINJA_PROJECT_SOURCE=~/code/llvm-project/llvm

declare -ir PARALLEL_BUILDS=8
readonly TMP_DIR=$(mktemp --directory --tmpdir=/var/tmp)

cleanup() {
    rm -rf "${TMP_DIR}"
}
trap cleanup EXIT

BUILD_DIRS=()
echo "Preparing build directories"
for i in $(seq 1 ${PARALLEL_BUILDS}); do
	BUILD_DIR="${TMP_DIR}/${i}"
	mkdir "${BUILD_DIR}"
	(
		cd "${BUILD_DIR}"
		cmake -G Ninja "${CMAKE_NINJA_PROJECT_SOURCE}" \
			&> "${BUILD_DIR}/build.log"
	)&
	BUILD_DIRS+=("${BUILD_DIR}")
done
wait

NPROC=$(nproc)
MAX_LOAD=$(( NPROC + 2 ))
SLEEP_SECONDS=300

NINJA_BINS=(
    "${VANILLA_NINJA}"
    "${LOAD_CAPACITY_AWARE_NINJA}"
)
LAST_NINJA_BIN="${LOAD_CAPACITY_AWARE_NINJA}"

for NINJA_BIN in "${NINJA_BINS[@]}"; do
	echo "Cleaning build dirs"
	for BUILD_DIR in "${BUILD_DIRS[@]}"; do
		(
			"${NINJA_BIN}" -C "${BUILD_DIR}" clean &> "${BUILD_DIR}/build.log"
		)&
	done
	wait

	echo "Starting ${PARALLEL_BUILDS} parallel builds with ${NINJA_BIN} using -j ${MAX_LOAD}"
	START=$(date +%s)
	for BUILD_DIR in "${BUILD_DIRS[@]}"; do
		(
			"${NINJA_BIN}" -C "${BUILD_DIR}" -l "${MAX_LOAD}" &> "${BUILD_DIR}/build.log"
		)&
	done
	wait
	STOP=$(date +%s)

	DELTA_SECONDS=$((STOP - START))
	echo "Using ${NINJA_BIN} to perform ${PARALLEL_BUILDS} of ${CMAKE_NINJA_PROJECT_SOURCE}"
	echo "took ${DELTA_SECONDS} seconds on this ${NPROC} core system using -j ${MAX_LOAD}"
	echo "/proc/loadavg:"
	cat /proc/loadavg
	echo "ninja --version:"
	"${NINJA_BIN}" --version

	if [[ "${NINJA_BIN}" != "${LAST_NINJA_BIN}" ]]; then
	    echo "Sleeping ${SLEEP_SECONDS} seconds to bring system into quiescent state"
	    sleep ${SLEEP_SECONDS}
	fi
done
@jhasse merged commit bf6409a into ninja-build:master on Feb 29, 2024
10 checks passed
@Flowdalic deleted the load-capacity branch on March 8, 2024