Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Global destructors not optimized away even with LTO #19993

Closed
kripken opened this issue Aug 7, 2023 · 7 comments · Fixed by llvm/llvm-project#68758 or #20519
Closed

Global destructors not optimized away even with LTO #19993

kripken opened this issue Aug 7, 2023 · 7 comments · Fixed by llvm/llvm-project#68758 or #20519

Comments

@kripken
Copy link
Member

kripken commented Aug 7, 2023

Consider this:

#include <emscripten.h>

struct Class {
  ~Class() {
    emscripten_random();
  }
};

Class c;

int main() {
}

Building with emcc a.cpp -O3 --profiling-funcs I would expect the destructor to be optimized away, since EXIT_RUNTIME defaults to false. However, it isn't:

(module
 (import "a" "a" (func $emscripten_random (result f32)))
..
 (table $0 2 2 funcref)
 (elem $0 (i32.const 1) $__cxx_global_array_dtor)
..
 (export "e" (table $0))
..
 (func $__cxx_global_array_dtor (param $0 i32)
  (drop
   (call $emscripten_random)
  )
 )
)

The destructor is there, and it calls that import which keeps more code alive.

The destructor is in the table, because an atexit() existed to it. When EXIT_RUNTIME is not enabled we link with libnoexit , which makes atexit a no-op. However, the function is already in the table, so later Binaryen optimizations can't remove it.

LTO should fix this, but does not: adding -flto changes nothing in the output.

Trying to investigate that, I added --mllvm=--print-after-all in the lld command. Looking for atexit, I see this:

declare i32 @__cxa_atexit(ptr, ptr, ptr) local_unnamed_addr #2

But I never at any point see it defined. I'd expect to see it defined as a function returning 0, because that is what we have in libnoexit:

int __cxa_atexit(void (*func)(void *), void *arg, void *dso) { return 0; }

It does get linked in properly, as I see the lld command starts with -lGL -lal -lhtml5 -lstubs -lnoexit -lc (so it's before libc). And I see it properly in the wasm output when I build with say -O1 and without LTO:

 (func $__cxa_atexit (param $0 i32) (param $1 i32) (param $2 i32) (result i32)
  (i32.const 0)
 )

LTO is definitely running, and e.g. it tries to inline - I see e.g. *** IR Dump After InlinerPass on (main) *** - but it doesn't actually inline the trivial cxa_atexit that is linked in, sadly...

Also, I see those Dump after InlinerPass on (NAME) on all functions - even libc ones - but not on cxa_atexit. In fact the only pass LTO runs that mentions it is

*** IR Dump After Lower @llvm.global_dtors via `__cxa_atexit` (lower-global-dtors) ***

Given all that, my best guess is that LLVM LTO isn't looking at cxa_atexit for normal optimizations. It seems to treat it in a special way - is that possible?

This seems like a pretty significant missed optimization for us. I noticed this on #19903 (comment) but IIUC it affects all default optimized builds that have any global destructors, keeping unneeded code.

cc @sbc100 @aheejin @tlively @dschuff as this is beyond my knowledge of LLVM.

@sbc100
Copy link
Collaborator

sbc100 commented Aug 8, 2023

I reason __cxa_atexit is not part of LTO is because explicitly exclude it:

class libnoexit(Library):                                                                               
  name = 'libnoexit'                                                                                 
  # __cxa_atexit calls can be generated during LTO the implemenation cannot                          
  # itself be LTO.  See `get_libcall_files` below for more details.                                  
  force_object_files = True   

One option might be to allow it to be LTO, but somehow force it to always be included at LTO time (so new references to it don't cause LTO to fail).

@kripken
Copy link
Member Author

kripken commented Aug 8, 2023

Ah 😄 the answer was right below the code I was reading, thanks!

So it would be great to get this into LTO. But how can we force it to be included in order to handle new references? That would take changes inside LLVM I guess?

@sbc100
Copy link
Collaborator

sbc100 commented Aug 8, 2023

it might be enough to just add -Wl-u,__cxa_atexit to the link command.

@kripken
Copy link
Member Author

kripken commented Aug 8, 2023

I tried adding -Wl,-u=__cxa_atexit (I had to change a few letters there) but it makes no difference.

What does -u do, btw? I don't see it in wasm-ld --help.

@sbc100
Copy link
Collaborator

sbc100 commented Aug 8, 2023

-u makes the linker assume a given symbol is undefined. from the man page:

       -u symbol
       --undefined=symbol
           Force symbol to be entered in the output file as an undefined symbol.  Doing this may, for
           example, trigger linking of additional modules from standard libraries.  -u may be
           repeated with different option arguments to enter additional undefined symbols.  This
           option is equivalent to the "EXTERN" linker script command.

           If this option is being used to force additional modules to be pulled into the link, and
           if it is an error for the symbol to remain undefined, then the option --require-defined
           should be used instead.

@sbc100
Copy link
Collaborator

sbc100 commented Aug 8, 2023

I guess it not enough to trigger the inclusion of the symbol at LTO time which is what we need here.. i'll see if I can figure out another way.

@sbc100
Copy link
Collaborator

sbc100 commented Oct 10, 2023

I think I've found a (pretty nasty) way to make this work.

sbc100 added a commit to llvm/llvm-project that referenced this issue Oct 23, 2023
…sed (#68758)

In emscripten we have a build mode (the default actually) where the
runtime never exits and therefore `__cxa_atexit` is a dummy/stub
function that does nothing. In this case we would like to be able
completely DCE any otherwise-unused global dtor functions.

Fixes: emscripten-core/emscripten#19993
sbc100 added a commit to sbc100/emscripten that referenced this issue Oct 23, 2023
sbc100 added a commit to sbc100/emscripten that referenced this issue Oct 23, 2023
sbc100 added a commit to sbc100/emscripten that referenced this issue Oct 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants