Linking LibreOffice with WASM EH and SjLj fails since 3.1.6 #16572
Comments
Interesting, not many projects using wasm EH + wasm sjlj AFAIK, this might be one of the first... The first issue is hopefully not complex, but the function call mismatch mentioned on the last line is more worrying to me. Does it happen only in certain optimization levels perhaps? (can vary at both compile and link time) cc @aheejin |
Can you see which object file contains the reference to the undefined (sometimes |
I also suspect one of the object files linked wasn't built with I am also curious about the function call mismatch error... Do you have a reproducer? |
Maybe the parameter is missing for some external library. I just know 3.1.5 links and 3.1.6 doesn't. And WASM EH fails with a strange error early in the LO job scheduler and Emscripten EH works. WASM Qt5 is supposed to be built without EH, but I didn't rebuild that; maybe that also needs this flag nevertheless. The following builds are a month old, used for testing FF nightly with WASM EH back then. The "About LO" dialog should show the Emscripten version used. The broken call is a class function, Task::UpdateMinPeriod, called from https://github.com/LibreOffice/core/blob/c4cb1d1dd581a5f120d9cf8b1d4274ec38f3eabe/vcl/source/app/scheduler.cxx#L395, and it's not the first call to it. I just know it doesn't happen with the Emscripten EH build. opt-build: https://drive.google.com/file/d/1JAWj0S7gB6kWej3i3xaZUjYPN2bxg6su/view?usp=sharing |
I found that libpng wasn't built with So now my LO build doesn't have any more emscripten_longjmp ... but Qt WASM also has an internal libpng / libjpeg, which now need |
I think you would only need to supply Did you try |
FWIW, libpng in fact does use libjpeg(-turbo) also uses This means that, for example, all libraries that link against libjpeg (e.g. libtiff) must also be compiled with (I had a similar issue with wasm-vips when I updated Emscripten to 3.1.6, see commit kleisauke/wasm-vips@396c85b). |
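To illustrate why the SjLj setting has to match across libraries, here is a minimal sketch of the longjmp-based error handling pattern that libpng and libjpeg rely on. The `Decoder` type and both functions are hypothetical stand-ins, not real libpng or libjpeg API. Under Emscripten, the caller that sets the jump target and the library code that jumps would both need to be compiled with the same SjLj mode (the failing link in this issue used `-s SUPPORT_LONGJMP=wasm`).

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical stand-in for a C library handle that carries a jump target,
// similar in spirit to how libpng/libjpeg report fatal errors.
struct Decoder {
    std::jmp_buf on_error;  // installed by the caller before decoding
};

// Simulates a library routine that bails out via longjmp on bad input.
void decode_chunk(Decoder& d, int chunk) {
    if (chunk < 0) {
        std::longjmp(d.on_error, 1);  // does not return
    }
}

// Returns true if every chunk decodes, false if the "library" jumped out.
bool decode_all(const int* chunks, int count) {
    assert(chunks != nullptr || count == 0);
    Decoder d;
    if (setjmp(d.on_error) != 0) {
        return false;  // control arrives here after a longjmp
    }
    for (int i = 0; i < count; ++i) {
        decode_chunk(d, chunks[i]);
    }
    return true;
}
```

Calling `decode_all` with an array containing a negative value returns `false` through the longjmp path. If the two functions lived in separately built libraries with different SjLj modes, the jump would be lowered differently on each side, which is consistent with the undefined `emscripten_longjmp` symbol reported here.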
Back when doing the initial LO WASM port, I had the idea to replace the LO's SjLj usage in the PNG and JPEG filters with exception handling. Then I decided to explicitly build those two files without EH. Sure you AFAIK could just build the source with SjLj calls with Now |
After rebuilding Qt with FWIW if you run LO via the I really have no idea how to provide a simpler example. I could provide a DWARF debug build, if that would help. Since the original problem is now fixed / invalid, you / we could eventually close this issue. I would be happy to get some further ideas, how to debug my runtime problem. Still can be a bug with FF nightly, but Chrome fails at the same point; no difference between the old and my current build from my POV. |
I'm planning to merge https://gerrit.libreoffice.org/c/core/+/132139 Would be nice, if someone can verify the commit message, so I can adapt it if I'm still misunderstanding something. |
Those can be tricky to debug. I'd make sure you have a deterministic testcase, then make sure it happens in all browsers (to rule out a browser bug, which it sounds like you have). If this is a regression (I'm not sure if this is, or if the linking issue is), you can bisect on emscripten. Otherwise perhaps you can bisect on something in LO. If you can't bisect, you can try the sanitizers. If all of the above fails, manual debugging might be needed (find out exactly where it fails, and add some debug logging etc.). |
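The bisection step can be sketched as a binary search over an ordered list of releases. This is only an illustration: `first_bad` and the `is_bad` callback are hypothetical names, and in practice `is_bad` would stand for "install that version, do a full rebuild, run the testcase". It assumes the first entry is known good, the last is known bad, and a version that is bad stays bad in all later versions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns the index of the first version for which is_bad() is true.
// Precondition: versions.front() is good and versions.back() is bad.
std::size_t first_bad(const std::vector<std::string>& versions,
                      const std::function<bool(const std::string&)>& is_bad) {
    assert(versions.size() >= 2);
    std::size_t good = 0;                   // index of a known-good version
    std::size_t bad = versions.size() - 1;  // index of a known-bad version
    while (bad - good > 1) {
        const std::size_t mid = good + (bad - good) / 2;
        if (is_bad(versions[mid])) {
            bad = mid;   // failure already present at mid
        } else {
            good = mid;  // failure introduced after mid
        }
    }
    return bad;
}
```

For the linking regression in this issue the range is already minimal (3.1.5 links, 3.1.6 does not), so the same idea would then be applied to the commits between those two releases.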
@jmglogow If what the commit does is to add For the other bug, can you provide a reproducer? It doesn't have to be small; if we need to download a repo, that's also fine. As long as you have a deterministic reproducer and steps to reproduce that it will be helpful. |
@aheejin I've linked the reproducers above, but the following are new builds with Emscripten 3.1.8 and current LO source + my
Both contain a If you modify Both builds are done with |
Can you provide a reproducer and not the resulting object files? The former usually helps a lot more. What I mean by the reproducer is your directory before the build and the step to reproduce the resulting (erroneous) object files. |
The reproducer would be you building LO WASM with WASM EH. Since the linked patch didn't yet pass the CI (I updated a minor detail), it's just on the Then you can follow the More info about the LO build on Linux is https://wiki.documentfoundation.org/Development/BuildingOnLinux. The WASM build is a cross build, because LO needs tooling to generate source and config files. My current
I'm |
I just spent some time trying to debug this problem a bit further. While I found no smaller reproducer, I found that the dynamic_cast failure already happens much earlier. The following is the structure of the failing code I tried to reproduce:
The output from LO is:
Task::Start is a lot more complex, but the dynamic_cast instantly fails at the start of the function, like in the example code above. I used the same flags to compile it and even split it into multiple files for different compilation units, like the original code, but could not reproduce the failure. And also in the real code there is no other code between the function calls: FWIW I found a workaround for the assumed compiler / linker bug: instead of embedding the Any other suggestions? |
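One way to narrow down a dynamic_cast that fails unexpectedly is to log the cast result next to a typeid comparison at the failing call site. The following is a hypothetical sketch, not the LibreOffice code and not the example posted in this comment; `BaseTask`, `TimerTask`, `OtherTask` and `describe_cast` are made-up names.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// Hypothetical polymorphic hierarchy used only for illustration.
struct BaseTask { virtual ~BaseTask() = default; };
struct TimerTask : BaseTask {};
struct OtherTask : BaseTask {};

// Builds a short log string describing whether `task` downcasts to TimerTask
// and whether its dynamic type is exactly TimerTask.
std::string describe_cast(BaseTask* task) {
    assert(task != nullptr);
    const bool cast_ok = dynamic_cast<TimerTask*>(task) != nullptr;
    const bool same_type = typeid(*task) == typeid(TimerTask);
    return std::string(cast_ok ? "cast ok" : "cast failed") +
           (same_type ? ", typeid matches" : ", typeid differs");
}
```

If such a helper reported "cast failed" for an object that was constructed as the expected type, that would point toward the object or its vtable pointer no longer being valid at that point (for example, already destroyed) rather than toward the cast itself.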
I just tried to "fix" my hack by adding a |
Unrelated with this problem, I updated emsdk from
The build itself was done using |
Did you do a complete rebuild of all object files when upgrading from 3.1.10 to 3.1.12? It's likely that this is related to the libc++ upgrade that happened in 3.1.11 (#1700). |
(Yes, I think a full rebuild should fix it) |
It was a full rebuild. But now I realized I didn't rebuild my Qt WASM libraries, the only external dependency not built by LO itself. I'll rebuild Qt and I guess it'll link then. Maybe I'm lucky and the newer clang will even fix this bug (which was the main reason to try the Emscripten upgrade; I still believe it's a compiler / linker bug). |
So the Qt rebuild fixed the linking problem (as expected) and this bug still exists in the updated clang (which I expected too). Still the WASM EH build feels much "snappier" than the Emscripten EH build. I still have no minimal reproducer. I somehow expect the bug to be related to the size of the code, but this is just blind guessing. |
Sorry for the late reply. This thread got long and I'm not sure what the remaining bugs are. The link errors you reported have been resolved, right? Then the remaining bugs are
Is this correct? Are these two different bugs? Or is 2 a reducer for 1? For 1, I tried to set up the environment following #16572 (comment), but there were many documents to follow and something didn't work in the middle. I don't really remember what that was, because I tried that more than 2 months ago when you posted it. I'd appreciate a smaller reproducer or, even if it's large, an already set-up build directory with the actual emcc command line. For 2 (#16572 (comment)), I'm not sure what the bug is. I compiled the C++ code you attached (1) without EH, (2) with Emscripten EH, and (3) with Wasm EH, and all of them seemed to run fine. This is my shell printout:
Is this printout not correct? Or am I testing this in the wrong way? |
This is correct. We're back at the original bug (I should have opened a new one for the additional problem; sorry for that). I think this is one bug. IMHO Further debugging showed, that a
Hmm - let's see; I can provide my NEH Qt5 build. That is IMHO the hardest to set up, because
Then just run It'll miss some basic tooling, like
Yeah, as I wrote, I failed to reproduce it and just posted the code to give an idea about the code structure, so sorry for not expressing this well enough. |
I'm not sure if I did this right, but I downloaded your QT build and extracted it in Also I cloned https://github.com/LibreOffice/core/ in Then I ran this within
It crashes with this:
I have |
I think that is because IIRC the way to find the path to the LLVM tools based on an Emscripten install is to run |
I see, thanks. I don't use emsdk and use my local build of LLVM/binaryen/emscripten directly.. But
I guess this autogen script or QT or something doesn't read this |
It's the same. You can just create the file
When I posted my comment yesterday night, my build hadn't finished. I got additional linking errors, which I have now fixed with https://gerrit.libreoffice.org/c/core/+/135519. You can either Thanks for looking into this. |
Yikes - I always read |
I just merged this; it took a little bit longer than the hour I originally assumed (LO CI had some Windows troubles, which seem to be resolved now).
Great. Sorry that you had to weed-out some stuff I wasn't yet aware of. OTOH and FWIW, the LO WASM build should be easier now for others ;-)
First start would be to cherry-pick Generally you can have a look at The originally reported bug happens at There is also the Emscripten-generated HTH to get you started. P.S. there is https://wiki.documentfoundation.org/Development/How_to_debug, but that won't help you with WASM. |
Sorry, I have zero knowledge about LibreOffice internals, so it's not easy to follow what you say.. 😢 I cherry-picked the commit e5572ca83a15be900aaecefd415d3ad31d34200c as you suggested, and now it works. But this doesn't help me with anything, because it just works, and it doesn't reproduce the error. Without the cherry-picked commit, it crashes, but as I said, I don't have any stack traces or anything, so it's hard to know where to start..
What I would like is some more info than the text |
No worries. I'm happy someone actually invests time in this, who might be able to fix the bug (or produce a smaller reproducer, or give any other insight, even helping me to debug this further). I would be a bit embarrassed, if it were some LO specific problem… but then I also have no real way to debug / detect this. I know the general concepts of WASM (stacking VM, those function tables, which verify function signatures, etc.), but I generally found porting LO to WASM more like trial-and-error, compared to the Windows Arm64 port I also did (including learning Arm64 assembler to implement LO's own FFI implementation).
A copy and paste error: currently it's line
The original callsite is the one described above. So AFAIK a backtrace won't help you much, because it will be the system event loop, triggering the Scheduler timeout, the Scheduler searching for the next task to process. Here is the backtrace I see in the JavaScript console, if I run
Hope this helps anyway. |
I dug into this for some time. I still haven't found out why this fails, but here are some observations. I have confirmed the behavior you described in #16572 (comment), #16572 (comment), and #16572 (comment). I think the reason the last The task that was deleted was The reason I'm not sure how the lifetime of this main thread that runs The reason I tracked down the leaf level function that actually throws: I couldn't track down further because Anyway, to sum up, because |
I'll try to verify this later with latest Emscripten and a fresh build. Thing is
Yeah, I didn't include the debug WASM into the Qt upload. Honestly, I'm now as puzzled as before. |
The timeline is:
So that the destructor removes itself from the scheduler doesn't matter here. I tried to find who calls That So basically we need to figure out why |
By the way, have you tried the address sanitizer? It might help you with diagnosing memory issues. |
So I'm back in this old thread. OTOH it was probably good to get away from the problem to come back with new "energy". And sorry for the late reply. Today I started adding
It was a little bit more frustrating, because I added a In Qt 6.4, WASM got a new main loop that still uses No idea if this could prevent the documented JS exception problem. I'll try to build Qt6 WASM again, which I failed to do at the beginning of the year. 6.4 should be much less buggy w.r.t. WASM. "Magic" exception handling without unwinding the stack is definitely something unexpected for me. I still might be wrong, as I found no way to catch any exception in the C++ code, but my guess is this issue can now be closed, if there is one already tracking the missing implementation. |
** Version of emscripten/emsdk: **
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.7 (48a1620)
clang version 15.0.0 (https://github.com/llvm/llvm-project fbce4a78035c32792b0a13cf1f169048b822c06b)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /home/jmg/Development/libreoffice/git_emsdk/upstream/bin
** Linking EH command flags: **
Failed build: -fwasm-exceptions -s SUPPORT_LONGJMP=wasm. Building with -s DISABLE_EXCEPTION_CATCHING=0 works fine.
** Failing command line in full: **
S=/home/jmg/Development/libreoffice/wasm && B=$S/build-dbg-neh && I=$B/instdir && W=$B/workdir && /usr/bin/ccache /home/jmg/Development/libreoffice/git_emsdk/upstream/emscripten/em++ -fno-stack-protector -pthread -s USE_PTHREADS=1 -s TOTAL_MEMORY=1GB -s PTHREAD_POOL_SIZE=4 --bind -s FORCE_FILESYSTEM=1 -s WASM_BIGINT=1 -s ERROR_ON_UNDEFINED_SYMBOLS=1 -s FETCH=1 -s ASSERTIONS=1 -s EXIT_RUNTIME=0 -s EXPORTED_RUNTIME_METHODS=["UTF16ToString","stringToUTF16","printErr"] -pthread -s USE_PTHREADS=1 -fwasm-exceptions -s SUPPORT_LONGJMP=wasm -L$W/LinkTarget/StaticLibrary -L$I/sdk/lib -L$I/program -L$I/program -O1 -fstrict-aliasing -fstrict-overflow -g -gseparate-dwarf --pre-js $S/static/emscripten/environment.js --pre-js $W/CustomTarget/static/emscripten_fs_image/soffice.data.js.link --pre-js $S/static/emscripten/soffice_args.js $W/CObject/desktop/source/app/main.o -Wl,--start-group -luno_sal -lsofficeapp -luno_sal -lsofficeapp -lcomphelper -luno_cppu -luno_cppuhelpergcc3 -ldeploymentmisclo -leditenglo -lfwklo -li18nlangtag -luno_salhelpergcc3 -lsblo -lsfxlo -lsvllo -lsvxlo -lsvxcorelo -lsvtlo -ltklo -ltllo -lucbhelper -lutllo -lvcllo -lreglo -lunoidllo -lxmlreaderlo -lstorelo -lxmlscriptlo -lbasegfxlo -ldrawinglayercorelo -li18nutil -lsotlo -lepoxy -lxolo -llnglo -lsaxlo -ldrawinglayerlo -lavmedialo -lcomponentslo -lsvgfilterlo -lgraphicfilterlo -lhyphenlo -llnthlo -lspelllo -lbiblo -lchartcorelo -lchartcontrollerlo -lcmdmaillo -lconfigmgrlo -lctllo -ldbtoolslo -ldesktopbe1lo -levtattlo -lexpwraplo -lfilterconfiglo -lfps_officelo -lforlo -lfsstoragelo -li18npoollo -li18nsearchlo -llocalebe1lo -lloglo -lmigrationoo2lo -lmigrationoo3lo -lmsfilterlo -lnumbertextlo -lodfflatxmllo -loffacclo -looxlo -lpasswordcontainerlo -lpdffilterlo -lstoragefdlo -lsvgiolo -lemfiolo -lswlo -lsysshlo -ltextconversiondlgslo -ltextfdlo -lucpexpand1lo -lucpextlo -lucpimagelo -lucptdoc1lo -lunordflo -lunoxmllo -luuilo -lxmlfalo -lxmlfdlo -lxoflo -lxsltdlglo -lxsltfilterlo -lcuilo -lhwplo 
-lmswordlo -lswdlo -lt602filterlo -lwpftwriterlo -lwriterfilterlo -lcached1 -ldeployment -ldeploymentgui -lembobj -lemboleobj -lpackage2 -lsrtrs1 -lucb1 -lucpfile1 -lucphier1 -lucppkg1 -lxmlsecurity -lxsec_xmlsec -lxstor -lbinaryurplo -lbootstraplo -lintrospectionlo -linvocadaptlo -linvocationlo -liolo -lnamingservicelo -lproxyfaclo -lreflectionlo -lstocserviceslo -luuresolverlo -lwriterperfectlo -lgcc3_uno -lvclplug_qt5lo -lcollator_data -ldict_ja -ldict_zh -lindex_data -llocaledata_en -llocaledata_es -llocaledata_euro -llocaledata_others -ltextconv_dict -lswuilo -lepoxy $W/LinkTarget/StaticLibrary/libdtoa.a $W/LinkTarget/StaticLibrary/libzlib.a $W/LinkTarget/StaticLibrary/libboost_locale.a $W/LinkTarget/StaticLibrary/libgraphite.a $W/LinkTarget/StaticLibrary/liblibjpeg-turbo.a $W/LinkTarget/StaticLibrary/liblibpng.a $W/LinkTarget/StaticLibrary/libzlib.a $W/LinkTarget/StaticLibrary/libexpat.a $W/LinkTarget/StaticLibrary/libdtoa.a $W/LinkTarget/StaticLibrary/libzlib.a $W/LinkTarget/StaticLibrary/libfindsofficepath.a $W/LinkTarget/StaticLibrary/libboost_locale.a $W/LinkTarget/StaticLibrary/libgraphite.a $W/LinkTarget/StaticLibrary/liblibjpeg-turbo.a $W/LinkTarget/StaticLibrary/liblibpng.a $W/LinkTarget/StaticLibrary/libulingu.a $W/LinkTarget/StaticLibrary/libexpat.a $W/LinkTarget/StaticLibrary/libshell_xmlparser.a $W/LinkTarget/StaticLibrary/libboost_filesystem.a -L$W/UnpackedTarball/icu/source/lib -licui18n -licuuc $W/UnpackedTarball/openssl/libssl.a $W/UnpackedTarball/openssl/libcrypto.a -L$W/UnpackedTarball/liblangtag/liblangtag/.libs -llangtag -L$W/UnpackedTarball/libxml2/.libs -lxml2 -lm -L$W/UnpackedTarball/harfbuzz/src/.libs -lharfbuzz -L$W/UnpackedTarball/lcms2/src/.libs -llcms2 -L$W/UnpackedTarball/libwebp/src/.libs -lwebp -L$W/UnpackedTarball/cairo/src/.libs -lcairo -L$W/UnpackedTarball/pixman/pixman/.libs -lpixman-1 -L$W/UnpackedTarball/fontconfig/src/.libs -lfontconfig -L$W/UnpackedTarball/freetype/instdir/lib -lfreetype 
-L$W/UnpackedTarball/liborcus/src/liborcus/.libs -lorcus-0.17 -L$W/UnpackedTarball/liborcus/src/parser/.libs -lorcus-parser-0.17 -L$W/UnpackedTarball/hunspell/src/hunspell/.libs -lhunspell-1.7 -L$W/UnpackedTarball/hyphen/.libs -lhyphen -L$W/UnpackedTarball/mythes/.libs -lmythes-1.2 $W/UnpackedTarball/libnumbertext/src/.libs/libnumbertext-1.0.a -L$W/UnpackedTarball/redland/src/.libs -lrdf -L$W/UnpackedTarball/raptor/src/.libs -lraptor2 -L$W/UnpackedTarball/rasqal/src/.libs -lrasqal -L$W/UnpackedTarball/libxslt/libxslt/.libs -lxslt -L$W/UnpackedTarball/libxslt/libexslt/.libs -lexslt $W/UnpackedTarball/libabw/src/lib/.libs/libabw-0.1.a $W/UnpackedTarball/libebook/src/lib/.libs/libe-book-0.1.a -L$W/UnpackedTarball/libmwaw/src/lib/.libs -lmwaw-0.3 -L$W/UnpackedTarball/libodfgen/src/.libs -lodfgen-0.1 -L$W/UnpackedTarball/librevenge/src/lib/.libs -lrevenge-0.0 -L$W/UnpackedTarball/libstaroffice/src/lib/.libs -lstaroffice-0.0 -L$W/UnpackedTarball/libwpd/src/lib/.libs -lwpd-0.10 -L$W/UnpackedTarball/libwpg/src/lib/.libs -lwpg-0.3 -L$W/UnpackedTarball/libwps/src/lib/.libs -lwps-0.4 $W/UnpackedTarball/xmlsec/src/.libs/libxmlsec1.a -ldl -L/home/jmg/Development/libreoffice/git_qt5/install-5.15.2/lib -lQt5Core -lQt5Gui -lQt5Widgets -lQt5Network -lqtpcre2 -lQt5EventDispatcherSupport -lQt5FontDatabaseSupport -L/home/jmg/Development/libreoffice/git_qt5/install-5.15.2/plugins/platforms -lqwasm -L$W/UnpackedTarball/icu/source/lib -licui18n -L$W/UnpackedTarball/icu/source/lib -licuuc $W/UnpackedTarball/openssl/libssl.a $W/UnpackedTarball/openssl/libcrypto.a -L$W/UnpackedTarball/liblangtag/liblangtag/.libs -llangtag -L$W/UnpackedTarball/libxml2/.libs -lxml2 -lm -L$W/UnpackedTarball/harfbuzz/src/.libs -lharfbuzz -L$W/UnpackedTarball/icu/source/lib -licuuc -L$W/UnpackedTarball/lcms2/src/.libs -llcms2 -L$W/UnpackedTarball/libwebp/src/.libs -lwebp -L$W/UnpackedTarball/cairo/src/.libs -lcairo -L$W/UnpackedTarball/pixman/pixman/.libs -lpixman-1 -L$W/UnpackedTarball/fontconfig/src/.libs 
-lfontconfig -L$W/UnpackedTarball/freetype/instdir/lib -lfreetype -L$W/UnpackedTarball/liborcus/src/liborcus/.libs -lorcus-0.17 -L$W/UnpackedTarball/liborcus/src/parser/.libs -lorcus-parser-0.17 -L$W/UnpackedTarball/hunspell/src/hunspell/.libs -lhunspell-1.7 -L$W/UnpackedTarball/hyphen/.libs -lhyphen -L$W/UnpackedTarball/mythes/.libs -lmythes-1.2 $W/UnpackedTarball/libnumbertext/src/.libs/libnumbertext-1.0.a -L$W/UnpackedTarball/redland/src/.libs -lrdf -L$W/UnpackedTarball/raptor/src/.libs -lraptor2 -L$W/UnpackedTarball/rasqal/src/.libs -lrasqal -L$W/UnpackedTarball/libxslt/libxslt/.libs -lxslt -L$W/UnpackedTarball/libxslt/libexslt/.libs -lexslt $W/UnpackedTarball/libabw/src/lib/.libs/libabw-0.1.a $W/UnpackedTarball/libebook/src/lib/.libs/libe-book-0.1.a -L$W/UnpackedTarball/libmwaw/src/lib/.libs -lmwaw-0.3 -L$W/UnpackedTarball/libodfgen/src/.libs -lodfgen-0.1 -L$W/UnpackedTarball/librevenge/src/lib/.libs -lrevenge-0.0 -L$W/UnpackedTarball/libstaroffice/src/lib/.libs -lstaroffice-0.0 -L$W/UnpackedTarball/libwpd/src/lib/.libs -lwpd-0.10 -L$W/UnpackedTarball/libwpg/src/lib/.libs -lwpg-0.3 -L$W/UnpackedTarball/libwps/src/lib/.libs -lwps-0.4 -L/home/jmg/Development/libreoffice/git_qt5/install-5.15.2/lib -lQt5Core -lQt5Gui -lQt5Widgets -lQt5Network -lqtpcre2 -lQt5EventDispatcherSupport -lQt5FontDatabaseSupport -L/home/jmg/Development/libreoffice/git_qt5/install-5.15.2/plugins/platforms -lqwasm -L$W/UnpackedTarball/icu/source/lib -licudata -Wl,--end-group -o$I/program/soffice.html ; RC=$ ? ; rm -f $W/LinkTarget/link.lock; if test $RC -ne 0; then exit $RC; fi
error: undefined symbol: emscripten_longjmp (referenced by top-level compiled C/C++ code)
warning: Link with `-s LLD_REPORT_UNDEFINED` to get more information on undefined symbols
warning: To disable errors for undefined symbols use `-s ERROR_ON_UNDEFINED_SYMBOLS=0`
warning: _emscripten_longjmp may need to be added to EXPORTED_FUNCTIONS if it arrives from a system library
Error: Aborting compilation due to previous errors
em++: error: '/home/jmg/Development/libreoffice/git_emsdk/node/14.15.5_64bit/bin/node /home/jmg/Development/libreoffice/git_emsdk/upstream/emscripten/src/compiler.js /tmp/tmpu_78bn8k.json' failed (returned 1)
make[1]: *** [/home/jmg/Development/libreoffice/wasm/desktop/Executable_soffice_bin.mk:10: /home/jmg/Development/libreoffice/wasm/build-dbg-neh/instdir/program/soffice.html] Error 1
** Full link command and output with -v appended: **
"/home/jmg/Development/libreoffice/git_emsdk/upstream/bin/wasm-ld" @/tmp/emscripten_9txe174p.rsp.utf-8
"/home/jmg/Development/libreoffice/git_emsdk/upstream/bin/wasm-emscripten-finalize" -g --bigint --no-dyncalls --no-legalize-javascript-ffi --dwarf /home/jmg/Development/libreoffice/wasm/build-dbg-neh/instdir/program/soffice.wasm --detect-features
"/home/jmg/Development/libreoffice/git_emsdk/node/14.15.5_64bit/bin/node" /home/jmg/Development/libreoffice/git_emsdk/upstream/emscripten/src/compiler.js /tmp/tmpijnjxqwv.json
Linking with EMCC_DEBUG=1, the diff -u link-3.1.5.log link-3.1.6.log output has an interesting diff in the 'declares': array:
The rest is just temporary files and time differences.
It looks like the origin of the problem is #15792, which added #ifndef __USING_WASM_SJLJ__ for the missing symbol.
FWIW, the WASM EH LibreOffice has built for some time but doesn't run yet, because of a function call mismatch, which interestingly doesn't happen with the Emscripten EH build of the same code...