
asm.js output size increase after update to latest incoming #5049

Closed
aidanhs opened this issue Mar 18, 2017 · 8 comments

aidanhs (Contributor)

aidanhs commented Mar 18, 2017

DEBUG:root:applying js optimization passes: asmPreciseF32 asm receiveJSON localCSE safeLabelSetting emitJSON
chunkification: num funcs: 3 actual num chunks: 1 chunk size range: 425615445 - 425615445
buffer.js:378
    throw new Error('toString failed');
    ^

Error: toString failed
    at Buffer.toString (buffer.js:378:11)
    at read (/work/emsdk/emscripten/incoming/tools/js-optimizer.js:68:45)
    at Object.<anonymous> (/work/emsdk/emscripten/incoming/tools/js-optimizer.js:7950:11)
    at Module._compile (module.js:434:26)
    at Object.Module._extensions..js (module.js:452:10)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Function.Module.runMain (module.js:475:10)
    at startup (node.js:117:18)
    at node.js:951:3
Traceback (most recent call last):
  File "/work/emsdk/emscripten/incoming/em++", line 16, in <module>
    emcc.run()
  File "/work/emsdk/emscripten/incoming/emcc.py", line 1975, in run
    JSOptimizer.flush()
  File "/work/emsdk/emscripten/incoming/emcc.py", line 1873, in flush
    run_passes(chunks[i], 'js_opts_' + str(i), just_split='receiveJSON' in chunks[i], just_concat='emitJSON' in chunks[i])
  File "/work/emsdk/emscripten/incoming/emcc.py", line 1843, in run_passes
    final = shared.Building.js_optimizer(final, passes, debug_level >= 4, JSOptimizer.extra_info, just_split=just_split, just_concat=just_concat)
  File "/work/emsdk/emscripten/incoming/tools/shared.py", line 2046, in js_optimizer
    ret = js_optimizer.run(filename, passes, NODE_JS, debug, extra_info, just_split, just_concat)
  File "/work/emsdk/emscripten/incoming/tools/js_optimizer.py", line 565, in run
    return temp_files.run_and_clean(lambda: run_on_js(filename, passes, js_engine, source_map, extra_info, just_split, just_concat))
  File "/work/emsdk/emscripten/incoming/tools/tempfiles.py", line 78, in run_and_clean
    return func()
  File "/work/emsdk/emscripten/incoming/tools/js_optimizer.py", line 565, in <lambda>
    return temp_files.run_and_clean(lambda: run_on_js(filename, passes, js_engine, source_map, extra_info, just_split, just_concat))
  File "/work/emsdk/emscripten/incoming/tools/js_optimizer.py", line 465, in run_on_js
    filenames = [run_on_chunk(command) for command in commands]
  File "/work/emsdk/emscripten/incoming/tools/js_optimizer.py", line 287, in run_on_chunk
    assert proc.returncode == 0, 'Error in optimizer (return code ' + str(proc.returncode) + '): ' + output
AssertionError: Error in optimizer (return code 1): 
Command exited with non-zero status 1

i.e. it's failing in https://github.com/kripken/emscripten/blob/142adaf/tools/js-optimizer.js#L66-L69

See nodejs/node#3175

Something must be generating larger code, so the js-optimizer is failing to read it all in. (This is the second phase of optimizer passes, before emterpretifying; see the bullet points in #5046. So asmPreciseF32 asm eliminate simplifyExpressions emitJSON have already been run.) But the input code to this second phase is 371MB, which is clearly much bigger than it was when this worked before, since the size then must have been 250MB or below!
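To make the failure mode concrete: the optimizer reads its input with Buffer.toString(), which throws once a single chunk exceeds V8's maximum string length. A rough pre-flight check might look like the sketch below. This is not emscripten code, and the exact constant is an assumption based on nodejs/node#3175 (it varies by Node/V8 version):

```python
import os

# Assumed V8 string cap in Node of that era: kStringMaxLength = 2**28 - 16
# bytes, roughly 256MB. Above this, Buffer.toString() throws
# "toString failed", which matches the ~250MB threshold observed here.
NODE_MAX_STRING = 2**28 - 16

def fits_in_one_chunk(path):
    """True if the js-optimizer could plausibly read this file as one JS string."""
    return os.path.getsize(path) < NODE_MAX_STRING
```

With chunkification enabled, emcc splits the input into pieces below this limit, which is why the failure only shows up with chunking disabled.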

I'll investigate later; I just wanted to report the issue for now.

aidanhs (Contributor, Author)

aidanhs commented Mar 18, 2017

Worth pointing out that I have chunkification explicitly disabled so I could investigate #5031. It was working fine until I pulled in about 4 days of incoming commits, including the llvm 4 upgrade.

So nobody else is likely to see this failure, but I still want to figure out why it's happening since it could affect my final output size.

kripken (Member)

kripken commented Mar 18, 2017

When you re-enable chunkification, does it work?

Either way, it would still be interesting to know why code size changed here, but it could just be that libc is a little bigger and happened to cross a node.js limit.

aidanhs (Contributor, Author)

aidanhs commented Mar 18, 2017

Yeah, with chunkification it works (hence "nobody else is likely to see this failure").

aidanhs (Contributor, Author)

aidanhs commented Mar 19, 2017

My command:

rm -rf ~/.emscripten_cache && EMCC_DEBUG=1 /usr/bin/time em++ -o x.js --js-opts 0 -O2 -g0 --memory-init-file 1 -s TOTAL_MEMORY=1073741824 -L $(pwd)/working_ground -L $(pwd)/../boost/dist/lib -s ALIASING_FUNCTION_POINTERS=0 -s DEMANGLE_SUPPORT=1 -s DISABLE_EXCEPTION_CATCHING=0 -s USE_SDL=2 -s USE_SDL_IMAGE=2 -s USE_SDL_NET=2 -s USE_SDL_TTF=2 -I/work/emwesnoth/SDL2_mixer/dist/include/SDL2 -L/work/emwesnoth/SDL2_mixer/dist/lib -s SDL2_IMAGE_FORMATS='["png"]' -s USE_LIBPNG=1 -s LEGACY_GL_EMULATION=1 -s USE_ZLIB=1 -s USE_VORBIS=1 build/release/wesnoth.o -lwesnoth_extras build/release/lua/liblua.a build/release/libwesnoth_core.a build/release/libwesnoth.a build/release/libwesnoth_sdl.a -lwesnoth_extras build/release/lua/liblua.a -L/work/emwesnoth/pango/dist/lib -L/work/emwesnoth/glib/dist/lib -L/work/emwesnoth/cairo/dist/lib -L/work/emwesnoth/harfbuzz/dist/lib -L/work/emwesnoth/fontconfig/dist/lib -L/work/emwesnoth/pixman/dist/lib -L/work/emwesnoth/libxml2/dist/lib -lm -lSDL2_net -lboost_iostreams -lboost_random -lboost_system -lboost_filesystem -lboost_locale -lSDL2_ttf -lSDL2_mixer -lSDL2_image -lvorbisfile -lpthread -lfontconfig -lpangoft2-1.0 -lpangocairo-1.0 -lcairo -lharfbuzz -lpixman-1 -lpango-1.0 -lglib-2.0 -lgobject-2.0 -lgmodule-2.0 -lxml2 -lboost_program_options -lboost_regex -lX11 -lpng

This is basically a ton of link flags plus --js-opts 0 -O2 -g0 --memory-init-file 1 -s TOTAL_MEMORY=1073741824 -s ALIASING_FUNCTION_POINTERS=0 -s DEMANGLE_SUPPORT=1 -s DISABLE_EXCEPTION_CATCHING=0. JS opts are set to 0 to try to isolate LLVM's contribution.

I tested two versions of emscripten. When I refer to 'incoming-x' below, I mean incoming as of Friday (df8796f), which contains llvm 4. Both of the tests were run on identical bitcode files (generated by incoming-x), and since there weren't any errors I assume that's fine.

  • with emscripten 1.37.6 (just pre llvm 4), x.js is 140MB (and x.js.mem is 1.2MB)
  • with emscripten incoming-x, x.js is 217MB (and x.js.mem is 1.8MB)

That's a pretty gigantic difference :(

@kripken do you have an interest in getting to the bottom of this? If so, what would be helpful for stripping it down for you? e.g. if I can come up with a reduced test case that's 5MB with the old llvm and has an extra ~40% overhead with the new llvm, is that useful? In the worst case I can just share everything with you (code/bitcode/whatever), but I'm reasonably set up for being able to reduce at the moment so could save some time.

kripken (Member)

kripken commented Mar 19, 2017

Yeah, that seems larger than a small libc change... very odd.

First thing I would do is make a build with whitespace and function names (e.g. with --profiling) and see if the size difference is there too. Assuming it is, then run tools/find_bigfuncs.py on each and compare the outputs, which are sorted by size.
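tools/find_bigfuncs.py itself isn't shown in the thread; as a rough idea of what such a scan does, here is a simplified sketch (not the real script). It assumes the --profiling build emits each function as a `function name(` header with its closing `}` at column 0, which is how asm.js output is typically laid out:

```python
import re

FUNC_RE = re.compile(r'^function (\w+)\(')

def big_funcs(lines):
    """Map each top-level function to its line count, sorted smallest-first."""
    sizes = {}
    name, start = None, 0
    for i, line in enumerate(lines):
        m = FUNC_RE.match(line)
        if m:
            name, start = m.group(1), i
        elif line.startswith('}') and name:
            sizes[name] = i - start + 1  # inclusive of header and closing brace
            name = None
    return sorted(sizes.items(), key=lambda kv: kv[1])
```

Comparing the tail of this list between the two builds would surface any individual functions that blew up.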

If nothing shows up there, then the next candidates are the function tables, especially with ALIASING_FUNCTION_POINTERS=0. Since in asm.js they must be a power of two in size, if they were already massive and just moved over a boundary where a new power of two is needed, that could explain part of this. To check, in those profiling builds look at the lines near the end with var FUNCTION_TABLE, and maybe count the total number of ,s.
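The comma-counting check above can be automated with a small sketch like this (hedged: it assumes each table is emitted on a single `var FUNCTION_TABLE_* = [...]` line, as profiling builds usually do):

```python
def table_sizes(js_text):
    """Count entries in each asm.js function table by counting commas."""
    sizes = {}
    for line in js_text.splitlines():
        if line.startswith('var FUNCTION_TABLE'):
            name = line.split('=')[0].split()[1]
            sizes[name] = line.count(',') + 1  # N commas separate N+1 entries
    return sizes

def is_pow2(n):
    """asm.js tables are padded up to a power of two."""
    return n > 0 and n & (n - 1) == 0
```

If a table in the new build is twice the size of the old one but `is_pow2` holds for both, the growth is padding from crossing a power-of-two boundary rather than real code.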

aidanhs (Contributor, Author)

aidanhs commented Mar 19, 2017

I did the easy step first (removing flags) out of laziness. Results:

For 1.37.6 (140MB):

  • removing -s DEMANGLE_SUPPORT=1 makes 0% difference.
  • removing -s ALIASING_FUNCTION_POINTERS=0 makes 4MB difference (after: 136MB)
  • removing -s DISABLE_EXCEPTION_CATCHING=0 makes 56MB difference (after: 84MB)
  • removing all three (to check any odd interactions) makes 58MB difference (after: 82MB)

For incoming-x (217MB):

  • removing -s DEMANGLE_SUPPORT=1 makes 0% difference.
  • removing -s ALIASING_FUNCTION_POINTERS=0 makes 11MB difference (after: 206MB)
  • removing -s DISABLE_EXCEPTION_CATCHING=0 makes 95MB difference (after: 122MB)
  • removing all three (to check any odd interactions) makes 101MB difference (after: 116MB)

Surprised that exceptions are such a big deal, but no other hints here. I'll investigate function sizes next.

@aidanhs aidanhs changed the title Buffer size too large for js optimizer after update to latest incoming asm.js output size increase after update to latest incoming Mar 20, 2017
aidanhs (Contributor, Author)

aidanhs commented Mar 22, 2017

I compared the two files after removing all three of the flags mentioned in the previous comment:

$ python ../compare_bigfuncs.py x.old.js x.new.js
file 2 has 15853 functions more than file 1 overall (unique: 16508 vs 655)
file 2 has 1011846 lines more than file 1 overall in unique functions
file 2 has 32MB more than file 1 overall in unique functions
file 2 has 2037 lines less than file 1 in 19659 common functions
file 2 has 135KB more than file 1 in 19659 common functions

Looks like the new llvm is leaving far more functions around. One of the functions I checked at random never has its address taken, so it seems like it should be eliminated. Another one does have its address taken, so I don't know how the old llvm was eliminating it.
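The compare_bigfuncs.py script isn't included in the thread; the name-level part of such a comparison (which functions exist only in one build) might look like this minimal sketch:

```python
import re

FUNC_RE = re.compile(r'^function (\w+)\(')

def func_names(lines):
    """Collect top-level asm.js function names from a profiling build."""
    return {m.group(1) for m in map(FUNC_RE.match, lines) if m}

def diff_funcs(old_lines, new_lines):
    """Return (functions only in the new build, functions only in the old)."""
    old, new = func_names(old_lines), func_names(new_lines)
    return sorted(new - old), sorted(old - new)
```

The "unique: 16508 vs 655" numbers above correspond to the lengths of these two lists.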

I'm going to do a full rebuild of all deps with the old llvm so I'm not running it on bitcode generated by the new llvm as I want to double check the above.

aidanhs (Contributor, Author)

aidanhs commented Mar 22, 2017

I'm going to do a full rebuild of all deps with the old llvm so I'm not running it on bitcode generated by the new llvm as I want to double check the above.

Did this, and the sizes are now within 1MB of each other. I assume I hadn't disabled chunks on the old llvm, and I was then misled by llvm 4.0 bitcode being accepted, but badly mishandled, by llvm 3.9.

Closing as a PEBCAK. I'll contribute back my compare_bigfuncs.py script.

@aidanhs aidanhs closed this as completed Mar 22, 2017