Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GNU make jobserver client support #1140

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

stefanb2
Copy link
Contributor

  • add new TokenPool interface
  • GNU make implementation for TokenPool parses and verifies the magic
    information from the MAKEFLAGS environment variable
  • RealCommandRunner tries to acquire TokenPool
    • if no token pool is available then there is no change in behaviour
  • When a token pool is available then RealCommandRunner behaviour
    changes as follows
    • CanRunMore() only returns true if TokenPool::Acquire() returns true
    • StartCommand() calls TokenPool::Reserve()
    • WaitForCommand() calls TokenPool::Release()

Documentation for GNU make jobserver

http://make.mad-scientist.net/papers/jobserver-implementation/

Fixes #1139

@stefanb2
Copy link
Contributor Author

For easier review I provided a PR based on the squashed commit with the final code.

If you are interested in the development history with all the exploration and mistakes then please look at branch https://github.com/stefanb2/ninja/tree/topic-issue-1139-full

@stefanb2
Copy link
Contributor Author

stefanb2 commented Apr 28, 2016

I see what you mean. That probably explains why ninja instances are less aggressive than GNU make instances in acquiring tokens. In real life that doesn't really matter, because except for the tail end of the build the tokens are always used by somebody else and most build steps are short, i.e. if a token should be available the ninja instance will become aware of it soon.

Unfortunately there is no "token interrupt" available. I guess we would need to add the TokenPool to the SubProcessSet::DoWork() API and then add an internal SubProcessSet <-> TokenPool interface where TokenPool can provide poll/select information (if available) to SubProcessSet's waiting mechanism.

Could this be handled as future improvement request or do you see this as a blocker for getting this change merged?

#endif
if (ret > 0) {
char buf;
int ret = read(rfd_, &buf, 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The select above provides no guarantee that the read here will not block. If another job takes the last token between the select and the read, ninja will be stuck here until a make job finishes. That means that ninja can not process of its children exiting and start any new processes until make finishes a job, which could be an arbitrarily long time.

http://make.mad-scientist.net/papers/jobserver-implementation/ describes how difficult this is to get right, and involves dup'ing the fd so that it can be closed in a SIGCHLD signal handler to abort the read.

A non-blocking fd would have made this all trivial, but the make authors decided that select wasn't portable enough. Since all jobserver clients share the same pipe, you can't make the fd non-blocking without affecting all of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remember: this code is only called from build.cc when ninja has at least one active child [subprocs_.running_.empty() == false].

In the case we do enter the read() and it blocks then the following 4 things can happen:

  1. it returns successfully. That's what we wanted anyway and return true.
  2. a child process terminates, interrupting the read which then returns -1. We didn't get a token and return false. The code returns to the main loop in build.cc for normal processing, e.g. reaping the child.
  3. a child terminates between the poll/select() call and entering read(). The read returns immediately with -1 -> same as (2).
  4. any other signal -> same as (2) or (3)

Please correct me if I'm wrong but in every case ninja returns to the main loop and never stalls.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ninja doesn't enable SIGCHLD, and it is ignored by default, so it won't interrupt the read.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, statement #3 is wrong. Even if SIGCHLD is enabled, the read will not return -1 if the SIGCHLD happens after the poll/select but before the read. That race condition is the reason make dups the fd before reading it - the SIGCHLD handler closes the dup'd copy of the fd, so the read returns -EBADF.

Copy link
Contributor Author

@stefanb2 stefanb2 May 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed concern by wrapping read() in alarm(1)/alarm(0) + installing corresponding signal handler.

IMHO this simple solution is good enough, because the race condition is rare. From the two different, large test builds I have seen the perror() only once in one build log.

BTW: latest patch set also includes support for GNU make 4.2 jobserver changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I made a test change that added a dummy SIGCHLD handler that does nothing. I.e. it just unblocks the token read() when an active child exited. But that negatively impacted ninja performance. In my test build the build time with ~50K build steps it increased ~30%. Therefore I will not push that change.

The current patch set should be good to go then.

@stefanb2 stefanb2 force-pushed the topic-issue-1139-squashed branch 4 times, most recently from 03baadb to 7aa3333 Compare May 26, 2016 10:10
@colincross
Copy link
Contributor

I think the race conditions in both directions (ninja not reaping and starting new jobs because it is stuck on the read in Acquire(), and ninja not noticing an available token because it is stuck in DoWork waiting for jobs to finish) are both pretty fundamental problems. If they are not fixable I don't think this patch should be merged.

Ninja doesn't use threads right now, but one option would be to have a separate thread that managed acquiring tokens and signalling the main thread when they are available. The main thread would signal the jobserver thread when it wanted a token, the jobserver thread would call a blocking read on the jobserver pipe and write to a signalling pipe to interrupt DoWork when it returned. That's a lot of complexity to add to ninja though, for a feature where the ideal solution is still to convert everything to ninja.

@maximuska
Copy link
Contributor

I think using timer signal makes read essentially non blocking, solving the
issue, except it still leaves a theoretical window for alarm signal to be
delivered before read is called. A read call which atomically unblocks
signals would be needed to make this rock solid, somewhat like pselect().

The comment about sigchld performance impact made me curious. Have you
replaced alarm signal or used both signals in your experiment?

On May 26, 2016 11:12 PM, "colincross" notifications@github.com wrote:

I think the race conditions in both directions (ninja not reaping and
starting new jobs because it is stuck on the read in Acquire(), and ninja
not noticing an available token because it is stuck in DoWork waiting for
jobs to finish) are both pretty fundamental problems. If they are not
fixable I don't think this patch should be merged.

Ninja doesn't use threads right now, but one option would be to have a
separate thread that managed acquiring tokens and signalling the main
thread when they are available. The main thread would signal the jobserver
thread when it wanted a token, the jobserver thread would call a blocking
read on the jobserver pipe and write to a signalling pipe to interrupt
DoWork when it returned. That's a lot of complexity to add to ninja though,
for a feature where the ideal solution is still to convert everything to
ninja.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#1140 (comment)

char buf;

interrupted_ = false;
alarm(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still a race here, although an extremely unlikely one. If the process were preempted after alarm(1) but before read the signal could be delivered before the read started, and the read would block indefinitely.

See https://android.googlesource.com/platform/build/+/master/tools/makeparallel/makeparallel.cpp#166 for what I believe is a race-free implementation of this read, based on the ideas from the make jobserver document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and ninja not noticing an available token because it is stuck in DoWork waiting for jobs to finish) are both pretty fundamental problems

Not really:

  • ninja will not stall, i.e. it will always have at least one job running
  • tokens will not be wasted, other clients will be using them

That's a lot of complexity to add to ninja though, for a feature where the ideal solution is still to convert everything to ninja.

See #1139 (comment) why that is not possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented something similar which also installs the same signal handler temporarily also for SIGCHLD.

Performance is about the same in my testbuild as with the last patchset.

@stefanb2 stefanb2 force-pushed the topic-issue-1139-squashed branch 2 times, most recently from 1b05452 to 30bc846 Compare May 27, 2016 11:48
@stefanb2
Copy link
Contributor Author

I've added a proposal for a token monitor in stefanb2@9832d65. SubprocessSet::DoWork() will now exit when token read file descriptor becomes readable.

@stefanb2 stefanb2 force-pushed the topic-issue-1139-squashed branch 2 times, most recently from 0e5b8ca to 22faa20 Compare May 30, 2016 10:42
@stefanb2
Copy link
Contributor Author

I hope the latest patch set addresses all reviewer comments.

@dcolascione
Copy link

What are the remaining reasons not to merge this code?

@fabio-porcedda
Copy link

I would like too have this feature merged, i simply cannot convert all projects to ninja-build because i'm not allowed to do that.

@stefanb2 Thanks a lot for your work

@hwti
Copy link

hwti commented Oct 18, 2017

Unlike make, this will use the jobserver even if -j is specified on the command line.
Perhaps the jobserver should be ignored in these cases, so a make -j2 could launch a ninja -j4 for example.

@stefanb2
Copy link
Contributor Author

@hwti: this is unnecessary, because when you use ninja in a makefile rule you must explicitly enable the jobserver magic. As documented in the GNU make manual here and here the job server magic will only be activated for a job if

  • the command line contains the string $(MAKE) before it is expanded, or
  • the start of the command line is tagged with the + token after it is expanded.

I.e. if your makefile has

target: ....
    ninja -j4 ...

ninja will never use the jobserver but always run 4 jobs in parallel, no matter if -jN is given on the GNU make command line or not.

In the case that the ninja invocation is hidden inside a script that is executed by GNU make with active jobserver magic you can simply use

export MAKEFLAGS=
...
ninja -j4 ...

or

MAKEFLAGS= ninja -j4 ...

@hwti
Copy link

hwti commented Oct 19, 2017

@stefanb2: What I meant is this is a case where make and ninja behavior differs, so it can be confusing.

For example, if the top-level makefile has

target:
     +Tools/Scripts/build-webkit --gtk ...

The build-webkit script uses cmake with either ninja (if available) or make, and adds -jNCPUs by default.
Ninja would use the jobserver, even if make doesn't, effectively hiding the issue that the script should check MAKEFLAGS before adding the argument.

Moreover, if a script wanted to disable the jobserver, it would have to carefully filter MAKEFLAGS, or to do it only in ninja case, as the variable also contains other flags and variable overrides.

@stefanb2
Copy link
Contributor Author

@hwti I guess you meant

...uses cmake with either ninja ..., and adds -jNCPUs to ninja command line by default.

That assumption is no longer correct when ninja supports job server, hence cmake behaviour should be updated or at least be made configurable.

But even without that update I don't see why this is a problem. If you update your ninja to a version that supports jobserver, then the ninja instance in the build will simply honour the jobserver instead of running with hard-coded -jNCPUs. If the build system would fail due to "incorrect" value of NCPUs then this would be a bug in the build system, not ninja.

BTW: if my guess was wrong and you meant that cmake always adds -jNCPUs no matter if make or ninja, then the + in your makefile rule is incorrect, because build-webkit is supposed to ignore jobserver.

@hwti
Copy link

hwti commented Oct 19, 2017

@stefanb2

That assumption is no longer correct when ninja supports job server, hence cmake behaviour should be updated or at least be made configurable.

It's build-webkit which calls cmake --build ... -- -jNCPUs, which gives the argument to either ninja or make.
If ninja honors the jobserver in this case, the build could work, instead of triggering an OOM due to too many g++ processes with make and the current official ninja.

So the real bug is in the script, which should check MAKEFLAGS, and only consider adding the argument if the top-level makefile was called without parallelism (or if there is no +, then it would be a bad buildsystem integration).

In this particular case, the behavior is "known", since it isn't new.
But you can imagine someone starting a new project, making the same kind of mistake, and not seeing it. Then only other people which only have make, or an older/unpatched ninja version would have issues.

@xim
Copy link

xim commented Nov 7, 2017

+1 for the feature, would like to see this merged.

I've been experimenting with using these patches for a project which has a bunch of subprojects. There's a top-level python script that generates the ninja files, then calls them (each with a set of different env params / command line switches). Currently the ninja files are run in serial, which means plenty of CPU idle during linking. I want to use a jobserver approach to improve build times on systems with many CPUs.

I made a proof-of-concept wrapping Makefile that runs the python script inside a jobserver, and it works.

Had to pass close_fds=False to the subprocess.call(...) in my python script to get this to work. If I didn't, things appeared to work but ninja ate all the CPU it could. I suspect ninja was in a tight loop trying to access the jobserver fds. Minor bug?

@xim
Copy link

xim commented Nov 8, 2017

I see my earlier assumption on why ninja was eating all my CPU may have been wrong. I was running a massive build process on a build server with many parallel ninjas on a build server, and the ninjas appear to run into some kind of trouble after a while if I used -l to scale jobs.

I am running on a 20-core build server (+ hyper-threading), 42 ninja-based projects to be built. If I use many parallel ninjas, and pass -l50 for load based task scaling, intermittently all the ninjas will use 100% CPU. As the number of ninjas increases, the risk of them getting stuck eating all the CPU increases.

Without -l there is no problem.

@stefanb2
Copy link
Contributor Author

stefanb2 commented Nov 8, 2017

@xim: -l is only necessary when you run multiple independent(!) build jobs, each with its own top-level make (== own jobserver), on the same machine.

If you have only one build job per machine, with one top-level make (== 1 jobserver), then the total amount of parallel executed build steps will never pass the -jN limit if all children adhere to the jobserver protocol. If you have children that do not adhere to the protocol, i.e. that execute more than one build step in parallel without requesting a token, e.g. hard-coded make -jN commands, then your build will run into overload situations.

Please remember that -lN is based on load average, which is not a fast measure. I.e. it only has a real effect when the build has been running for a while.

@stefanb2
Copy link
Contributor Author

stefanb2 commented Nov 12, 2017

@hwti sorry for taking so long to follow up on your question.

Do I understand you correctly that you want nina -jN to emulate the exact GNU make behaviour? I.e.

  1. print a warning that jobserver is ignored, and
  2. execute N jobs in parallel

I wrote a small .ninja file and tested the following situations with make -jN -f Makefile.top that calls

case command result
1 +ninja will execute max. N jobs EDIT: # of CPUs + 2 jobs
2 +ninja -j1 will execute max. 1 job
3 +ninja -jM with M <= N: will execute max. M jobs
4 +ninja -jM with M > N: will execute max. N jobs
  • UPDATE: case 1 may not use utilize tokens to the full when N > CPUs + 2
  • case 2 already works correctly.
  • case 3 is mainly correct, but ninja may execute less than the desired M jobs.
  • case 4 will execute less than the desired M jobs.

UPDATE: attaching my test files: parallel-build.tar.gz

Add tests that verify the token functionality of the builder main loop.
We replace the default fake command runner with a special version where
the tests can control each call to AcquireToken(), CanRunMore() and
WaitForCommand().
GNU make uses a semaphore as jobserver protocol on Win32. See also

   https://www.gnu.org/software/make/manual/html_node/Windows-Jobserver.html

Usage is pretty simple and straightforward, i.e. WaitForSingleObject()
to obtain a token and ReleaseSemaphore() to return it.

Unfortunately subprocess-win32.cc uses an I/O completion port (IOCP).
IOCPs aren't waitable objects, i.e. we can't use WaitForMultipleObjects()
to wait on the IOCP and the token semaphore at the same time.

Therefore GNUmakeTokenPoolWin32 creates a child thread that waits on the
token semaphore and posts a dummy I/O completion status on the IOCP when
it was able to obtain a token. That unblocks SubprocessSet::DoWork() and
it can then check if a token became available or not.

- split existing GNUmakeTokenPool into common and platform bits
- add GNUmakeTokenPool interface
- move the Posix bits to GNUmakeTokenPoolPosix
- add the Win32 bits as GNUmakeTokenPoolWin32
- move Setup() method up to TokenPool interface
- update Subprocess & TokenPool tests accordingly
- remove unnecessary "struct" from TokenPool
- add PAPCFUNC cast to QueryUserAPC()
- remove hard-coded MAKEFLAGS string from win32
- remove useless build test CompleteNoWork
- rename TokenPoolTest to TestTokenPool
- add tokenpool modules to CMake build
- remove unused no-op TokenPool implementation
- fix errors flagged by codespell & clang-tidy
- POSIX GNUmakeTokenPool should return same token
- address review comments from

ninja-build#1140 (comment)
ninja-build#1140 (review)
ninja-build#1140 (review)
ninja-build#1140 (comment)
ninja-build#1140 (comment)
intelfx pushed a commit to intelfx/ninja that referenced this pull request Mar 2, 2023
intelfx pushed a commit to intelfx/ninja that referenced this pull request Mar 2, 2023
We are not allowed to call Builder::Build() with an empty plan.
Unfortunately this issue is only visible when asserts are enabled. As we
can't call Builder::Build() this test is useless. Simply remove it.

ninja-build#1140 (review)
intelfx pushed a commit to intelfx/ninja that referenced this pull request Mar 2, 2023
Follow the convention established by other test support classes.

ninja-build#1140 (comment)
intelfx pushed a commit to intelfx/ninja that referenced this pull request Mar 2, 2023
intelfx pushed a commit to intelfx/ninja that referenced this pull request Mar 2, 2023
According to the GNU make manual the client needs to write back the same
tokens it reads. The order is not important.

Add a stack to the instance

- onto which we push a successfully read token,
- from which we peek the token to return, and
- from which we pop when the token was successfully returned.

Update the tests accordingly.

ninja-build#1140 (comment)
Plagman pushed a commit to ValveSoftware/Proton that referenced this pull request Aug 10, 2023
ninja-build/ninja#1140

So that -jN actually uses N job slots even when using ninja.
Sadly this does not cover cargo, which is less of an issue.
Plagman pushed a commit to ValveSoftware/Proton that referenced this pull request Sep 7, 2023
ninja-build/ninja#1140

So that -jN actually uses N job slots even when using ninja.
Sadly this does not cover cargo, which is less of an issue.
Plagman pushed a commit to ValveSoftware/Proton that referenced this pull request Sep 12, 2023
ninja-build/ninja#1140

So that -jN actually uses N job slots even when using ninja.
Sadly this does not cover cargo, which is less of an issue.
@ivyl
Copy link

ivyl commented Nov 17, 2023

Thanks for working on this. I hope it will eventually get merged :-)

In https://github.com/ValveSoftware/Proton we build multiple sub-projects that use autotools, meson and cmake among others. Having ninja being able to talk to a job server and effectively parallelize saves us a significant amount of time each build. We no longer have to chose between stalling due to overzealous build ordering or DOSing the CPU and RAM with all the processes spawned by all parallel ninja invocations in addition to make.

@neheb
Copy link
Contributor

neheb commented Nov 17, 2023

@ivyl if you're impatient, install ninja from pip. It includes these patches.

virtual bool Acquire() = 0;
virtual void Reserve() = 0;
virtual void Release() = 0;
virtual void Clear() = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document this interface to explain when these methods are supposed to be called (and under which preconditions).

src/tokenpool.h Outdated
virtual void Clear() = 0;

// returns NULL if token pool is not available
static struct TokenPool *Get(void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: There is no need for struct here, please remote.

#include <stdlib.h>

// No-op TokenPool implementation
struct TokenPool *TokenPool::Get(void) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please remote the unnecessary struct here (see comment below).

#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please remove un-needed header includes. Introduce them with the code that actually requires them instead.

struct sigaction act;
memset(&act, 0, sizeof(act));
act.sa_handler = CloseDupRfd;
if (sigaction(SIGALRM, &act, &old_act_) < 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move all signal-handling code to subprocess-posix.cc instead, since it is considerably easier to understand what's going on when all signal-related code is in the same source file, and exposed through a sane API for the rest of the code base to use.

struct sigaction old_act_;
bool restore_;

static int dup_rfd_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please clarify that this is a static variable, for example by using a prefix like s_dup_rfd_.


bool GNUmakeTokenPool::CheckFd(int fd) {
if (fd < 0)
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the fd < 0 check is redundant with the fcntl() one and can be removed.

int ret = fcntl(fd, F_GETFD);
if (ret < 0)
return false;
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: simplify with return fcntl(fd, F_GETFD) != -1;

configure.py Outdated
@@ -543,6 +543,7 @@ def has_re2c():
objs += cxx(name, variables=cxxvariables)
if platform.is_windows():
for name in ['subprocess-win32',
'tokenpool-none',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep configure.py consistent with CMakeLists.txt in all patches.


bool GNUmakeTokenPool::SetAlarmHandler() {
struct sigaction act;
memset(&act, 0, sizeof(act));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: simplify with struct sigaction act = {};

ret = read(dup_rfd_, &buf, 1);
alarm(0);

sigaction(SIGCHLD, &old_act, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted in a previous comment, this logic interferes with the one in subprocess-posix.cc in really brittle ways. It should be moved there instead to keep the code manageable and testable.

@carterols
Copy link

Super hopeful this can get merged. This has been a long time coming!

@jhasse
Copy link
Collaborator

jhasse commented May 30, 2024

see #2450

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add GNU make jobserver client support