ERL-1186: make -j fails #4157

OTP-Maintainer · 2020-03-07T20:46:26Z

Original reporter: JIRAUSER14002
Affected version: Not Specified
Component: Not Specified
Migrated from: https://bugs.erlang.org/browse/ERL-1186

`make -j` fails with several errors and compilation is aborted; see the log.

My "native" -j32 works great, and even -j120 works, but it failed with -j128.

`-j` should run things as parallel as the Makefile defines them, but I guess the 200+ files in wx push it over some limit. I don't know if this is a limit on the number of concurrent dirty schedulers, a Makefile issue, or something else, so I'm attaching the full output.

OTP-Maintainer · 2020-03-09T07:01:13Z

john said:

It looks like you've reached the system thread limit, `RLIMIT_NPROC`, the pid limit, or similar. It doesn't look like a bug in Erlang or the makefile.
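
On Linux, every thread counts against the per-user `RLIMIT_NPROC` limit, so one quick way to check this hypothesis is to compare that limit with the number of threads a build is expected to start. A minimal sketch in Python, assuming a Unix-like system; the function names and the example thread count are illustrative and not part of OTP or the report:

```python
import resource


def nproc_soft_limit():
    """Return the soft RLIMIT_NPROC for this user, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return None if soft == resource.RLIM_INFINITY else soft


def would_exceed(expected_threads, limit):
    """True if the expected thread count is above the limit (None = no limit)."""
    return limit is not None and expected_threads > limit


if __name__ == "__main__":
    limit = nproc_soft_limit()
    # 10000 is a made-up figure for illustration, not a measurement.
    print("soft RLIMIT_NPROC:", limit)
    print("10000 threads would exceed it:", would_exceed(10000, limit))
```

Other limits (such as the kernel-wide thread or pid maximum) can produce the same symptom, so a result under the limit here does not rule the hypothesis out.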

OTP-Maintainer · 2020-03-09T08:37:17Z

john said:

On second thought, we've decided to tweak `erlc` so it starts the emulator with a single scheduler instead of the default, since it's pretty sequential anyway. That should greatly reduce the risk of running into this issue.

OTP-Maintainer · 2020-03-09T09:02:44Z

kostis said:

What about the HiPE native code compiler (`erlc +native`)? It's concurrent all right, as one would have expected in a language that advertises itself as a concurrent language...

OTP-Maintainer · 2020-03-09T16:02:10Z

john said:

HiPE will have to make do. `erlc` is generally used in makefiles and other tools that want to handle parallelism themselves, and I think it's fair to let them. Note that this change will not affect tools that use the `compile` module directly, like `rebar3` or `mix`.

OTP-Maintainer · 2020-03-09T16:45:08Z

kostis said:

> erlc is generally used in makefiles and other tools that want to handle parallelism themselves.

And you base that statement on what information exactly?  (For example, I very regularly use erlc from the Unix shell.)

`erlc` is just another Unix command, not much different from `erl`. If there are many Makefiles out there using `erl` (for example, to execute some Erlang program), would you also consider starting the Erlang emulator with just one scheduler by default?

In any case, (IMO, of course) the above change is the wrong thing to do.

If you want to properly solve this particular problem (and similar ones in the future, I guess), the proper thing to do is to extend `erlc` to also accept the `+S` option(s) that `erl` accepts and pass that to `erl`.  (This part of the change is useful to have anyway.)

Then you can change _just_ the Makefile of `wx`, which starts the compilation of hundreds of files at the same time and causes the problem, to call `erlc +S1` instead of just the default `erlc`.

OTP-Maintainer · 2020-03-09T17:13:08Z

essen said:

Is it still sequential if `erlc` receives multiple files at the same time? It would be great if it (optionally) wasn't. Then Erlang.mk would compile projects faster. Sure, Make can handle concurrent builds, but that's not always desirable.

OTP-Maintainer · 2020-03-09T17:45:04Z

JIRAUSER14002 said:

In a cursory test, running `erlc *.erl` in a directory with 31 files, I see only one CPU core used.

So I don't really see `erlc` defaulting to one scheduler thread breaking e.g. @kostis's use case.

As @essen says, it would be neat if `erlc <several files>` compiled them in parallel, but IMO that is where `-j` and/or `+S` should come in.

OTP-Maintainer · 2020-03-09T17:56:35Z

kostis said:

> So i don't see erlc defaulting to one scheduler thread breaking e.g. @kostis use case really.

Please read the complete thread.

`erlc` is a Unix program that takes options.  One of them is the `+native` which _does_ use many CPU cores.  Perhaps there are or there will be other such options/cases (e.g. compiling multiple files).

Making `erlc` single-threaded by default

1. _does_ break my existing use case (compiling with the native code compiler)
2. is, even at the conceptual level, the wrong thing to do in a language that claims to be (primarily) _concurrent_, independently of the current technology of its byte-code compiler.

OTP-Maintainer · 2020-03-10T10:13:38Z

john said:

Having slept on it, I'll make an exception for HiPE and let `+native` imply `+S0`. I have no desire to argue this further.

> In any case, (IMO, of course) the above change is the wrong thing to do.

I disagree, your suggestion requires users to figure out what's going on. I'm sure that the reporter would have figured out that `ERL_FLAGS=+S1` solves the problem sooner or later but I don't like wasting people's time.

> Is it still sequential if erlc receives multiple files at the same time? Would be great if it (optionally) wasn't, instead. Then Erlang.mk would compile projects faster. Sure Make can handle concurrent builds but that's not always desirable.

Yes, and we could make them parallel if we ignore some corner cases relating to parse transformations.

Have you tried using the compile server?

> is, even at the conceptual level, the wrong thing to do in a language that claims to be (primarily) concurrent, independently of the current technology of its byte-code compiler.

I agree, and we'll happily accept PRs that make the compilers more concurrent, but `erlc` is single-threaded until then so anything above `+S1` is a waste of resources.
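
The `ERL_FLAGS=+S1` workaround mentioned above can be applied to a whole build without touching any Makefile. A minimal sketch in Python, assuming a Unix-like system with `make` on the `PATH`; the helper names `single_scheduler_env` and `run_build` are illustrative and not part of OTP:

```python
import os
import subprocess


def single_scheduler_env(base_env=None):
    """Return a copy of the environment with +S1 appended to ERL_FLAGS."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("ERL_FLAGS", "").strip()
    env["ERL_FLAGS"] = (existing + " +S1").strip()
    return env


def run_build(jobs):
    """Run `make -j<jobs>` with ERL_FLAGS asking for one scheduler per emulator."""
    return subprocess.run(
        ["make", f"-j{jobs}"],
        env=single_scheduler_env(),
        check=False,
    )
```

Calling `run_build(128)` from the top of a source tree would then correspond to running `ERL_FLAGS=+S1 make -j128` from the shell. Any flags already present in `ERL_FLAGS` are kept and `+S1` is added after them.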

OTP-Maintainer · 2020-03-10T11:02:28Z

essen said:

> Have you tried using the compile server?

I have not because of lack of time, and because of the fact it's not available in older Erlang versions meaning it cannot be the default method of building projects and too few projects would benefit from it at this point. I plan to add support for it as an alternative to the current way of doing things, but it's low priority for now.

OTP-Maintainer · 2020-03-11T12:36:53Z

john said:

I've merged a fix to `master`, thanks for your report!

> I have not because of lack of time, and because of the fact it's not available in older Erlang versions meaning it cannot be the default method of building projects and too few projects would benefit from it at this point. I plan to add support for it as an alternative to the current way of doing things, but it's low priority for now.

Fair enough.

OTP-Maintainer · 2020-03-11T20:04:49Z

JIRAUSER14002 said:

Great, thanks!

I tried building master from clean now, and it works perfectly.

OTP compiles in 100 seconds flat :)

OTP-Maintainer · 2020-03-11T22:43:15Z

kostis said:

> OTP compiles in 100 seconds flat

Without a point of comparison, this is (a bit of) a meaningless number. 

For example, on my machine, even without this change, `make -j 42` on a clean OTP consistently finishes slightly (3-4 secs) faster than using `make -j`. What can we learn/conclude from this?

Since this whole thread originated from `make -j 32` working great but `make -j 128` failing, how much faster is `make -j` on your machine compared to e.g. `make -j 42`?

Btw, this is an honest question.  In my experience, the bottleneck(s) in building OTP seem to be
 * linking the .o files of the emulator (which exercises the file system – perhaps I have a slow disk)
 * compiling .c files in wx – in particular, the command CC ../priv/x86_64-unknown-linux-gnu/erl_gl.so .

Neither of these two items has anything to do with `erlc` being single-threaded or not.

OTP-Maintainer · 2020-03-12T00:55:02Z

rickard said:

The purpose of this change was to avoid unnecessarily exhausting system resources, not to improve performance, even though we have actually seen performance improvements when building OTP.

Building OTP is also not the only potential build using parallel make and `erlc`. It is not unimaginable that you want to increase the number of parallel `erlc` invocations if you have a big server, lots of cores, and lots of files to compile. The problem of wasted system resources just gets worse the more cores and files you have.

If/when the compiler can perform work in parallel, it is an easy change to bump up the number of schedulers. But not limiting the number of schedulers to the maximum amount of work that can actually be done in parallel is just a waste of resources for no good reason whatsoever.
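
The scaling argument can be made concrete with a little arithmetic. The sketch below is a rough model, not a measurement: it assumes that each emulator starts one normal scheduler thread and one dirty CPU scheduler thread per scheduler, plus a fixed pool of 10 dirty I/O scheduler threads, and it ignores any other helper threads. The 32-core machine is also an assumption, loosely based on the reporter's "native" `-j32`.

```python
def threads_per_vm(schedulers, dirty_io=10):
    """Rough thread count for one emulator under the assumptions above."""
    dirty_cpu = schedulers  # assumed to follow the scheduler count
    return schedulers + dirty_cpu + dirty_io


def build_threads(jobs, schedulers):
    """Threads started by `jobs` concurrent erlc invocations."""
    return jobs * threads_per_vm(schedulers)


if __name__ == "__main__":
    cores = 32  # hypothetical machine
    for jobs in (32, 128):
        default = build_threads(jobs, schedulers=cores)
        single = build_threads(jobs, schedulers=1)
        print(f"-j{jobs}: {default} threads by default, {single} with one scheduler")
```

Under these assumptions, 128 jobs on 32 cores come to 9472 threads with the default scheduler count versus 1536 with a single scheduler each, and the gap widens as either the core count or the job count grows.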

OTP-Maintainer added enhancement team:VM Assigned to OTP team VM priority:low labels Feb 10, 2021

OTP-Maintainer closed this as completed Feb 10, 2021
