Global destructors not optimized away even with LTO #19993

kripken · 2023-08-07T22:39:14Z

Consider this:

#include <emscripten.h>

struct Class {
  ~Class() {
    emscripten_random();
  }
};

Class c;

int main() {
}

Building with emcc a.cpp -O3 --profiling-funcs, I would expect the destructor to be optimized away, since EXIT_RUNTIME defaults to false. However, it isn't:

(module
 (import "a" "a" (func $emscripten_random (result f32)))
..
 (table $0 2 2 funcref)
 (elem $0 (i32.const 1) $__cxx_global_array_dtor)
..
 (export "e" (table $0))
..
 (func $__cxx_global_array_dtor (param $0 i32)
  (drop
   (call $emscripten_random)
  )
 )
)

The destructor is there, and it calls that import which keeps more code alive.

The destructor is in the table because an atexit() call took its address in order to register it. When EXIT_RUNTIME is not enabled we link with libnoexit, which makes atexit a no-op. However, by then the function is already in the table, so later Binaryen optimizations can't remove it.
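To make the mechanism above concrete, here is a minimal sketch of roughly what static initialization of a global like `Class c;` amounts to. The names `stub_cxa_atexit`, `global_dtor`, `register_global`, `storage` and `dtor_runs` are made up for illustration; the real stub is `__cxa_atexit` in libnoexit, and the code the compiler actually generates may differ.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for the libnoexit stub. The real symbol is
// __cxa_atexit; a different name is used here to avoid clashing with libc.
int stub_cxa_atexit(void (*func)(void*), void* arg, void* dso) {
  (void)func; (void)arg; (void)dso;
  return 0;  // nothing is recorded, so the callback can never run
}

static int dtor_runs = 0;  // counts destructor calls, for demonstration only

struct Class {
  ~Class() { ++dtor_runs; }
};

// Raw storage standing in for the global object `c`.
alignas(Class) static unsigned char storage[sizeof(Class)];

// Wrapper with the signature the registration function expects. It plays
// the role that $__cxx_global_array_dtor plays in the wasm output.
static void global_dtor(void* p) { static_cast<Class*>(p)->~Class(); }

// Roughly what initializing the global does: construct the object, then
// pass the address of its destructor wrapper to the registration function.
int register_global() {
  Class* obj = new (storage) Class();
  return stub_cxa_atexit(global_dtor, obj, nullptr);
}
```

Calling `register_global()` returns 0 and the destructor never runs, yet `global_dtor` still has its address taken. Unless the optimizer can see the stub's body and drop the call, that address-taken function stays reachable.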

LTO should fix this, but does not: adding -flto changes nothing in the output.

To investigate, I added --mllvm=--print-after-all to the lld command. Searching for atexit, I see this:

declare i32 @__cxa_atexit(ptr, ptr, ptr) local_unnamed_addr #2

But I never at any point see it defined. I'd expect to see it defined as a function returning 0, because that is what we have in libnoexit:

emscripten/system/lib/libc/atexit_dummy.c

Line 16 in bc15162

int __cxa_atexit(void (*func)(void *), void *arg, void *dso) { return 0; }

It does get linked in properly, as I see the lld command starts with -lGL -lal -lhtml5 -lstubs -lnoexit -lc (so it's before libc). And I see it properly in the wasm output when I build with say -O1 and without LTO:

 (func $__cxa_atexit (param $0 i32) (param $1 i32) (param $2 i32) (result i32)
  (i32.const 0)
 )

LTO is definitely running, and it does try to inline (I see e.g. *** IR Dump After InlinerPass on (main) ***), but it doesn't actually inline the trivial __cxa_atexit that is linked in, sadly...

Also, I see those Dump After InlinerPass on (NAME) lines for all functions, even libc ones, but not for __cxa_atexit. In fact the only pass LTO runs that mentions it is

*** IR Dump After Lower @llvm.global_dtors via `__cxa_atexit` (lower-global-dtors) ***

Given all that, my best guess is that LLVM LTO isn't looking at cxa_atexit for normal optimizations. It seems to treat it in a special way - is that possible?

This seems like a pretty significant missed optimization for us. I noticed this on #19903 (comment) but IIUC it affects all default optimized builds that have any global destructors, keeping unneeded code.

cc @sbc100 @aheejin @tlively @dschuff as this is beyond my knowledge of LLVM.

sbc100 · 2023-08-08T00:15:33Z

The reason __cxa_atexit is not part of LTO is that we explicitly exclude it:

class libnoexit(Library):
  name = 'libnoexit'
  # __cxa_atexit calls can be generated during LTO the implementation cannot
  # itself be LTO.  See `get_libcall_files` below for more details.
  force_object_files = True

One option might be to allow it to be LTO, but somehow force it to always be included at LTO time (so new references to it don't cause LTO to fail).

kripken · 2023-08-08T00:22:14Z

Ah 😄 the answer was right below the code I was reading, thanks!

So it would be great to get this into LTO. But how can we force it to be included in order to handle new references? That would take changes inside LLVM I guess?

sbc100 · 2023-08-08T00:46:41Z

it might be enough to just add -Wl-u,__cxa_atexit to the link command.

kripken · 2023-08-08T19:34:22Z

I tried adding -Wl,-u=__cxa_atexit (I had to change a few letters there) but it makes no difference.

What does -u do, btw? I don't see it in wasm-ld --help.

sbc100 · 2023-08-08T20:06:08Z

-u makes the linker treat the given symbol as undefined. From the man page:

       -u symbol
       --undefined=symbol
           Force symbol to be entered in the output file as an undefined symbol.  Doing this may, for
           example, trigger linking of additional modules from standard libraries.  -u may be
           repeated with different option arguments to enter additional undefined symbols.  This
           option is equivalent to the "EXTERN" linker script command.

           If this option is being used to force additional modules to be pulled into the link, and
           if it is an error for the symbol to remain undefined, then the option --require-defined
           should be used instead.

sbc100 · 2023-08-08T20:14:20Z

I guess it's not enough to trigger the inclusion of the symbol at LTO time, which is what we need here. I'll see if I can figure out another way.

sbc100 · 2023-10-10T23:49:07Z

I think I've found a (pretty nasty) way to make this work.

…sed (#68758) In emscripten we have a build mode (the default actually) where the runtime never exits and therefore `__cxa_atexit` is a dummy/stub function that does nothing. In this case we would like to be able completely DCE any otherwise-unused global dtor functions. Fixes: emscripten-core/emscripten#19993

Depends on llvm/llvm-project#68758 Fixes: emscripten-core#19993
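The idea described in the commit message above can be illustrated with a toy model. This is only a sketch of the decision, not LLVM's code: `AtExitInfo` and `lower_global_dtors` are hypothetical names, and the destructor list is represented as plain strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical summary of what the optimizer knows about the
// registration function (__cxa_atexit in the real toolchain).
struct AtExitInfo {
  bool has_definition;  // is the body visible in the module, e.g. under LTO?
  bool uses_callback;   // does the body ever use its first argument?
};

// Returns the destructors that still need a registration call emitted.
// If the registration function is a visible stub that ignores the
// callback, no call is needed and every destructor becomes unreferenced.
std::vector<std::string> lower_global_dtors(
    const std::vector<std::string>& dtors, AtExitInfo info) {
  if (info.has_definition && !info.uses_callback) {
    return {};  // skip registration entirely
  }
  return dtors;  // otherwise keep one registration per destructor
}
```

In this model a stub that ignores its first argument yields an empty list, so nothing takes the destructors' addresses and ordinary dead code elimination can remove them. When the definition is not visible, or the callback is used, the registrations are kept.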

kripken mentioned this issue Aug 8, 2023

WasmFS JS API: Implement syncfs #19903

Open

sbc100 mentioned this issue Oct 11, 2023

[LowerGlobalDtors] Skip __cxa_atexit call completely when arg0 is unused llvm/llvm-project#68758

Merged

sbc100 closed this as completed in llvm/llvm-project#68758 Oct 23, 2023

sbc100 added a commit to sbc100/emscripten that referenced this issue Oct 23, 2023

Allow unused destructors to be completely elided

d757f4c

Depends on llvm/llvm-project#68758 Fixes: emscripten-core#19993

sbc100 mentioned this issue Oct 23, 2023

Allow unused destructors to be completely elided #20519

Merged

sbc100 added a commit that referenced this issue Oct 24, 2023

Allow unused destructors to be completely elided (#20519)

3630336

Depends on llvm/llvm-project#68758 Fixes: #19993
