Link time slowdown in an optimized build #21521
Comments
Thanks for the report. The best way to proceed with this kind of regression is to bisect to a smaller range and see if we can narrow it down to a specific change. Can you take a look at the bisecting instructions here: https://emscripten.org/docs/contributing/developers_guide.html#bisecting |
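For readers unfamiliar with the idea, bisecting is a binary search over an ordered list of versions or commits. Below is a minimal sketch of that search, not the procedure from the linked guide. The function name `first_bad`, the version list, and the stand-in predicate are all hypothetical; in practice the predicate would install a given version, time the link, and compare against a threshold.

```python
from typing import Callable, Sequence


def first_bad(candidates: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest candidate for which is_bad() is true.

    Assumes candidates are ordered oldest to newest, the first one is known
    good, the last one is known bad, and the regression does not disappear
    once it has been introduced.
    """
    lo, hi = 0, len(candidates) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return candidates[hi]


if __name__ == "__main__":
    # Hypothetical range between the two versions mentioned in the report.
    versions = [f"3.1.{n}" for n in range(37, 51)]

    # Stand-in predicate for illustration only: pretends the slowdown first
    # appears in 3.1.50. A real one would build and time the link step.
    def slow_link(version: str) -> bool:
        return int(version.rsplit(".", 1)[1]) >= 50

    print(first_bad(versions, slow_link))
```

With 14 candidates this needs about four link-time measurements instead of one per version, which matters when each link takes minutes.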
Another profiling tool you can use for Binaryen specifically is pass timings. Putting
That is, it prints the time each pass takes. (It also validates, which makes the build slower, but doesn't interfere with the timing measurement.) |
@sbc100 I think I found a commit using the instructions from the doc. I hope I did everything right. It seems to be part of version 3.1.50.
|
Below are the results for one of the targets (test UI app with EMSDK 3.1.54) where we see the slowdown.
4 obj.o files Binary sizes:
|
Below is the linker output with
Comparing this to the output for the same target but with EMSDK 3.1.37, I can see that "dae-optimizing" takes almost 10x longer. "simplify-locals" takes longer as well.
|
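A comparison like the one above can be automated by summing the time per pass in each log and ranking passes by how much they slowed down. The sketch below is only an illustration: it assumes the timing lines look like `[PassRunner]   running pass: NAME... N seconds.` (adjust the regex if the actual output differs), and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed shape of a pass-timing line; change this if the real log differs.
PASS_LINE = re.compile(
    r"running pass:\s*(?P<name>[\w-]+)\.\.\.\s*"
    r"(?P<secs>\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)\s*seconds"
)


def total_pass_times(log_text: str) -> dict:
    """Sum the reported seconds per pass name (passes can run many times)."""
    totals = defaultdict(float)
    for match in PASS_LINE.finditer(log_text):
        totals[match.group("name")] += float(match.group("secs"))
    return dict(totals)


def rank_slowdowns(old_log: str, new_log: str) -> list:
    """Return (pass, old_secs, new_secs) tuples, largest absolute slowdown first."""
    old, new = total_pass_times(old_log), total_pass_times(new_log)
    rows = [(name, old.get(name, 0.0), new.get(name, 0.0))
            for name in set(old) | set(new)]
    return sorted(rows, key=lambda row: row[2] - row[1], reverse=True)


if __name__ == "__main__":
    import sys

    # Usage: python compare_passes.py old_link.log new_link.log
    with open(sys.argv[1]) as f_old, open(sys.argv[2]) as f_new:
        for name, before, after in rank_slowdowns(f_old.read(), f_new.read())[:10]:
            print(f"{name:30s} {before:10.3f}s -> {after:10.3f}s")
```

Summing per pass name matters because the same pass usually runs several times during one optimization pipeline, so a single slow line can be misleading.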
@mere-human can you provide the wasm file to reproduce this? We did measurements before landing that, and I did some more measurements now on wasm files I have locally, and can't reproduce the problem. It might be specific to that wasm somehow. |
If you can't provide it, what I would do to investigate this on your machine is to run |
@kripken Thanks, I'll try that. But did you check it on Windows? I see the problem only on Windows but not on Mac (where the build time has actually improved). Also, I'm not sure I can use "perf" on Windows. |
I've just compared .wasm output for each phase with |
Oh, sorry, I missed that this was only on Windows. I don't have a Windows machine myself - I tested on Linux. In that case, maybe indirect calls are slower there for some reason? That is the only thing that PR should change, I think. I'm not sure how the OS can matter for indirect call speed, though I recall Spectre had some mitigations related to that... Perhaps there is some Windows profiling tool that you can run to get some insights on what is happening here? |
I work on a large project that is built with Emscripten.
When upgrading EMSDK from version 3.1.37 to 3.1.50, I noticed an increase in link time for a release build.
The slowdown occurs only when building on Windows, but not on Mac.
The link time remains roughly the same in version 3.1.54.
Link time for the main large UI target:
For another smaller test UI target, the link time between .37 and .50 went from 2m43s to 3m22s on Windows (about 24% slower).
As can be seen, the slowdown grows with the size of the target.
However, for small CLI targets such as unit tests, the link time has actually improved.
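To put the reported numbers on a common scale, the relative slowdown can be computed from the two `m:ss` timings. This is just a small arithmetic sketch; the helper names are hypothetical and the only inputs are the two times quoted above for the smaller test UI target.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'm:ss' string such as '2:43' to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown_percent(before: str, after: str) -> float:
    """Relative increase from `before` to `after`, in percent."""
    b, a = to_seconds(before), to_seconds(after)
    return (a - b) / b * 100.0


# Smaller test UI target on Windows: 163 s with 3.1.37, 202 s with 3.1.50.
print(f"{slowdown_percent('2:43', '3:22'):.1f}%")  # prints 23.9%
```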
The project contains a lot of unused code that is eliminated during the wasm-opt phase.
The amount of code as well as linker flags has remained the same between EMSDK versions.
What could be the reason for this, and are there any ways to solve it?
I suspect it could be due to the LLVM 17 update that happened in 3.1.50.
I can provide more details if needed.
Version of emscripten/emsdk:
Failing command line in full:
Link command for the smaller program below.
Full link command and output with -v appended:
Linker args were simplified to exclude unnecessary file names. Linker args are specified through an input file.
Args:
em++.bat -L Libs -sEXPORTED_FUNCTIONS=_main -sUSE_WEBGL2=1 --bind -sUSE_PTHREADS=1 -sCASE_INSENSITIVE_FS=1 -O2 -sPTHREAD_POOL_SIZE=6 -Wl,--error-limit=0 -sEXPORTED_RUNTIME_METHODS=ccall,cwrap -sPTHREAD_POOL_SIZE_STRICT=2 -sDISABLE_EXCEPTION_CATCHING=0 -sLLD_REPORT_UNDEFINED -sTOTAL_MEMORY=2147418112 --profiling-funcs --bind -sUSE_WEBGL2=1 -sFETCH=1 -sSTACK_SIZE=5MB -sEXPORTED_FUNCTIONS=_main,_htons,_ntohs -sEXPORTED_RUNTIME_METHODS=ccall,cwrap -sPROXY_TO_PTHREAD=1 --pre-js=script1.js --pre-js=script2.js ... -o app.html source1.o lib1.a ... --js-library lib1.js ...
Some of the static libs use "-Wl,--whole-archive" or "-Wl,--no-whole-archive".
Linker output:
Output for the smaller program.
I modified some file names. There were 4 .o files and 86 .a files.
When building with EMCC_DEBUG=1, I can see that most of the link time is spent in wasm-opt and Binaryen, although without much detail.
Example log entries (for a smaller program):