Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIDE_MODULE imports increases dramatically after 2.0.32 #15487

Open
waterlike86 opened this issue Nov 11, 2021 · 25 comments
Open

SIDE_MODULE imports increases dramatically after 2.0.32 #15487

waterlike86 opened this issue Nov 11, 2021 · 25 comments

Comments

@waterlike86
Copy link
Contributor

waterlike86 commented Nov 11, 2021

UPDATE: last good version is 2.0.27

We are upgrading from 2.0.26 to 2.0.32 and the side modules are failing to compile in browser because they exceed the import limit of 100000, in fact it grew from 4662 to 141035.
symptom as follows

CompileError: WebAssembly.Module(): imports count of 141035 exceeds internal limit of 100000 @+11710

it seems that something inside wasm-ld changed and is causing this.
I am trying to get the in between versions of emsdk to narrow down the version, in the mean time would like to ask if anyone has any ideas

both 2.0.26 and 2.0.32 have flags as follows

emsdk/upstream/bin/wasm-ld'
 '-o'
 'out.wasm'
 'dynamic.cpp.o'
 '--whole-archive'
 'archive.a'
 '--no-whole-archive'
 '-L/emcache/sysroot/lib/wasm32-emscripten/pic'
 '-mllvm'
 '-combiner-global-alias-analysis=false'
 '-mllvm'
 '-enable-emscripten-cxx-exceptions'
 '-mllvm'
 '-enable-emscripten-sjlj'
 '-mllvm'
 '-disable-lsr'
 '--import-undefined'
 '--import-memory'
 '--export-if-defined=__heap_base',
 '--export-if-defined=setThrew', '--export-if-defined=emscripten_stack_get_end'
 '--export-if-defined=emscripten_stack_get_free','--export-if-defined=emscripten_stack_set_limits'
 '--export-if-defined=__cxa_demangle', '--export-if-defined=stackSave', '--export-if-defined=stackRestore'
 '--export-if-defined=stackAlloc', '--export-if-defined=__wasm_call_ctors', '--export-if-defined=fflush'
 '--export-if-defined=__errno_location', '--export-if-defined=malloc', '--export-if-defined=free'
 '--export-if-defined=__cxa_is_pointer_type', '--export-if-defined=__cxa_can_catch', '--export-if-defined=_get_tzname'
 '--export-if-defined=_get_daylight', '--export-if-defined=_get_timezone'
 '--export-if-defined=emscripten_main_thread_process_queued_calls'
 '--experimental-pic'
 '-shared'
 '--no-export-dynamic
@kripken
Copy link
Member

kripken commented Nov 11, 2021

Bisecting might help here, that can be pretty fast to do: https://emscripten.org/docs/contributing/developers_guide.html#bisecting

@sbc100
Copy link
Collaborator

sbc100 commented Nov 11, 2021

Do you have a delta of the exports? What kinds of symbols are in the delta? Are they symbols from archive.a?

Are you using SIDE_MODULE=1 or SIDE_MODULE=2?

Can you share the full emcc link command?

BTW, there is the change in 2.0.34 that should limit the exports of SIDE_MODULE=1/MAIN_MODULE=1: #15413

@waterlike86
Copy link
Contributor Author

waterlike86 commented Nov 12, 2021

Do you have a delta of the exports? What kinds of symbols are in the delta? Are they symbols from archive.a?

yes they are symbols from archive.a which its trying to import.

Are you using SIDE_MODULE=1 or SIDE_MODULE=2?

we are using SIDE_MODULE=2

the last working version was 2.0.27, anything after that its bad.

@waterlike86
Copy link
Contributor Author

This is the emcc command being used.

-s SIDE_MODULE=2 -s DISABLE_EXCEPTION_CATCHING=0 --emit-symbol-map -s ASSERTIONS=1 -s DEMANGLE_SUPPORT=1 --profiling -s WASM=1 -s ALLOW_MEMORY_GROWTH=1 -s EVAL_CTORS=0 -r  -O2 -s INLINING_LIMIT=1 --profiling -fPIC -s WASM_BIGINT=1
 '--whole-archive' 'archive.a' '--no-whole-archive'

BTW, there is the change in 2.0.34 that should limit the exports of SIDE_MODULE=1/MAIN_MODULE=1: #15413

hmm does that mean that there are plans to deprecate SIDE_MODULE=2 ?

@sbc100
Copy link
Collaborator

sbc100 commented Nov 12, 2021

If you are using SIDE_MODULE=2 when only functions listed as -sEXPORTED_FUNCTIONS or functions marked in the source code as EMSCRIPTEN_KEEPALIVE should be exported.

Since I don't see -sEXPORTED_FUNCTIONS on your command line I assume you are exporting functions via EMSCRIPTEN_KEEPALIVE in source code? Is that right?

@waterlike86
Copy link
Contributor Author

right we use -sEXPORTED_FUNCTIONS, i just removed it from the list, but theres just 1 function in there to register against the main module. the rest are function pointers.

@sbc100
Copy link
Collaborator

sbc100 commented Nov 12, 2021

If you specify you export in -sEXPORTED_FUNCTIONS then that should be the only export from the side module. Is it true that there is only a single export from the side module both with 2.0.26 and 2.0.32? Or are the number of export and the number imports both increasing when you upgrade from 2.0.26 and 2.0.32?

Can you see if maybe the new symbols are both imported and exported? (imported via GOT.mem or GOT.func modules and exported directly). If that is the case, that could explain the increase.. these are dynamic symbols that are defined in the side module but can be overridden from by main module. If that is the case, I think we can find a solution where these symbols are instead internalized within the side module (as they previously were).

@waterlike86
Copy link
Contributor Author

@sbc100 @kripken using bisect this is the release where it first occured, its an llvm change.
https://chromium.googlesource.com/emscripten-releases/+/5a9a0c3d01c75410e237670633fd28280e8b2a47

and this looks suspicious.
2021-08-19 sbc@chromium.org [lld][WebAssembly] Handle weakly defined symbols in shared libraries.

when looking at the archive which i have i see that most if not all of the functions which are being imported are of type
W | Weak reference (multiple defs link together into zero or one defs)

@sbc100
Copy link
Collaborator

sbc100 commented Nov 12, 2021

I think capital "W" means that symbols defined weakly in the archive.

When a symbol is weakly defined like that it means it can be overridden by a strongly defined symbol, either at static link time or at dynamic link time. For side modules that means that module must both import and export the symbol. In most cases the thing that is exported will then be re-imports, assuming that not other module defines that symbol.

Do you know why you are archive contains so many weakly defined symbols?

@waterlike86
Copy link
Contributor Author

in 1 of the side modules we pack opencv in, exporting only a small set of functions to expose functionality we need and to keep the size of the side module small.
i see that most of the weakly defined symbols are templated code, virtual functions, constructors, destructors, operators, anything which can be overridden by inheritance.

some parts of the list of imports are below.

- func[11034] sig=0 <cv::AutoBuffer<char, 1024ul>::AutoBuffer()> <- env._ZN2cv10AutoBufferIcLm1024EEC2Ev
- func[11035] sig=0 <cv::AutoBuffer<char, 1024ul>::size() const> <- env._ZNK2cv10AutoBufferIcLm1024EE4sizeEv
- func[11036] sig=0 <cv::AutoBuffer<char, 1024ul>::data()> <- env._ZN2cv10AutoBufferIcLm1024EE4dataEv
- func[11037] sig=2 <cv::AutoBuffer<char, 1024ul>::resize(unsigned long)> <- env._ZN2cv10AutoBufferIcLm1024EE6resizeEm
- func[11038] sig=1 <cv::AutoBuffer<char, 1024ul>::operator[](unsigned long)> <- env._ZN2cv10AutoBufferIcLm1024EEixEm
- func[11039] sig=0 <cv::AutoBuffer<char, 1024ul>::~AutoBuffer()> <- env._ZN2cv10AutoBufferIcLm1024EED2Ev

 func[11001] sig=1 <cv::softfloat::operator=(cv::softfloat const&)> <- env._ZN2cv9softfloataSERKS0_
- func[11002] sig=2 <cv::softfloat::fromRaw(unsigned int)> <- env._ZN2cv9softfloat7fromRawEj
- func[11003] sig=1 <cv::softfloat::softfloat(cv::softfloat const&)> <- env._ZN2cv9softfloatC2ERKS0_
- func[11004] sig=59 <cv::softdouble::fromRaw(unsigned long long)> <- env._ZN2cv10softdouble7fromRawEy
- func[11005] sig=1 <cv::softdouble::operator=(cv::softdouble const&)> <- env._ZN2cv10softdoubleaSERKS0_
- func[11006] sig=1 <cv::softdouble::softdouble(cv::softdouble const&)> <- env._ZN2cv10softdoubleC2ERKS0_
- func[11007] sig=0 <cv::softfloat::isNaN() const> <- env._ZNK2cv9softfloat5isNaNEv
- func[11008] sig=3 <cv::softfloat::nan()> <- env._ZN2cv9softfloat3nanEv
- func[11009] sig=0 <cv::softfloat::isInf() const> <- env._ZNK2cv9softfloat5isInfEv
- func[11010] sig=3 <cv::softdouble::one()> <- env._ZN2cv10softdouble3oneEv
- func[11011] sig=0 <cv::softdouble::softdouble()> <- env._ZN2cv10softdoubleC2Ev
- func[11012] sig=2 <cv::softdouble::operator-() const> <- env._ZNK2cv10softdoublengEv
- func[11013] sig=0 <cv::softdouble::isNaN() const> <- env._ZNK2cv10softdouble5isNaNEv
- func[11014] sig=3 <cv::softdouble::nan()> <- env._ZN2cv10softdouble3nanEv
- func[11015] sig=0 <cv::softdouble::isInf() const> <- env._ZNK2cv10softdouble5isInfEv
- func[11016] sig=3 <cv::softdouble::inf()> <- env._ZN2cv10softdouble3infEv
- func[11017] sig=3 <cv::softdouble::zero()> <- env._ZN2cv10softdouble4zeroEv
- func[11018] sig=2 <cv::softfloat::operator-() const> <- env._ZNK2cv9softfloatngEv

@sbc100
Copy link
Collaborator

sbc100 commented Nov 15, 2021

Ah, I see. I guess C++ using a lot of weak symbols and in a dynamically linked setup the linker doesn't know which shared library is going to define a given symbol, so it has to both import and export them.

However if you know for sure that all the symbol in a shared library are not going to be overridden by any other we should have link flag to allow you to specify this. I think perhaps -Bsymbolic is the flag that we want for this case, although this does have that effect on ELF targets today, so we might need a new flag.

@waterlike86
Copy link
Contributor Author

I an confirm that this commit below is the one causing the issue.
llvm/llvm-project@c468dc1
rebuilt 2.0.32 emsdk lld binary with that commit reverted and the imports went back to 2.0.26 levels

@waterlike86
Copy link
Contributor Author

I think perhaps -Bsymbolic is the flag that we want for this case

i couldnt find any reference to this, not sure if this is something i can try out.
Could we still keep the current SIDE_MODULE=2 behaviour? only export function using -sEXPORTED_FUNCTIONS or EMSCRIPTEN_KEEPALIVE

@sbc100
Copy link
Collaborator

sbc100 commented Nov 16, 2021

Yes, I would like to have the old behaviour somehow, at least as an option. I just need to find a set of wasm-ld flags we can use to enable this behaviour.

@waterlike86
Copy link
Contributor Author

Hi Sam, would you still be working on this? I dont think we can upgrade to later versions if this issue is still there

@sbc100
Copy link
Collaborator

sbc100 commented Dec 1, 2021

Yes this is certainly something that needs fixing. I haven't had a chance to look into it yet.

@Mintyboi
Copy link

Mintyboi commented Jan 4, 2022

@sbc100 Happy New Year to you!
Just a follow up on this issue, will you have time to look into this?

sbc100 added a commit to llvm/llvm-project that referenced this issue May 26, 2022
… libraries

We have some special handling for weakly defined symbols where we both
import and export them, but this is not needed for hidden symbols which
should never be imported or exported.

See emscripten-core/emscripten#16972

This should also help with:
emscripten-core/emscripten#15487

Differential Revision: https://reviews.llvm.org/D126491
@sbc100
Copy link
Collaborator

sbc100 commented May 26, 2022

I think this llvm change might help a lot with this: https://reviews.llvm.org/D126491

@maxbrunsfeld
Copy link
Contributor

It looks like that LLVM commit landed. I'm just curious if there are plans to fix this issue now, or if there are other blockers. Thank you!

mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
… libraries

We have some special handling for weakly defined symbols where we both
import and export them, but this is not needed for hidden symbols which
should never be imported or exported.

See emscripten-core/emscripten#16972

This should also help with:
emscripten-core/emscripten#15487

Differential Revision: https://reviews.llvm.org/D126491
@Mintyboi
Copy link

Mintyboi commented Jun 7, 2023

@sbc100 Continuing from the issue from the op, using these, it brought our imports to within the browser's limit.:

However we are still importing way more functions as compared to before. The reason is because the weak symbols in our side module are mostly of default visibility, so according to the logic in https://github.com/llvm/llvm-project/blob/main/lld/wasm/Writer.cpp#L725, it will get imported.

Our side module comes from a third party vendor, besides getting them to compile their library with -fvisibility=hidden, do you see another way for us to not import weak symbols like in 2.0.26?

@Mintyboi
Copy link

Mintyboi commented Jun 7, 2023

@sbc100 Continuing from the issue from the op, using these, it brought our imports to within the browser's limit.:

However we are still importing way more functions as compared to before. The reason is because the weak symbols in our side module are mostly of default visibility, so according to the logic in https://github.com/llvm/llvm-project/blob/main/lld/wasm/Writer.cpp#L725, it will get imported.

Our side module comes from a third party vendor, besides getting them to compile their library with -fvisibility=hidden, do you see another way for us to not import weak symbols like in 2.0.26?

Apologies, I was still testing on 3.1.16.
I tried out my sample code in the latest emsdk (3.1.41) and I don't seem to see this happening anymore, let me dig a little more. Do you know if there have been any changes around this area?

@sbc100
Copy link
Collaborator

sbc100 commented Jun 7, 2023

Not that I know of. We still compiler with -fvisibility=default (not hidden) by default (for better or worse):

emscripten/emcc.py

Lines 955 to 960 in 1b26bf5

# We use default visiibilty=default in emscripten even though the upstream
# backend defaults visibility=hidden. This matched the expectations of C/C++
# code in the wild which expects undecorated symbols to be exported to other
# DSO's by default.
if not any(a.startswith('-fvisibility') for a in user_args):
flags.append('-fvisibility=default')

@Mintyboi
Copy link

Mintyboi commented Jul 12, 2023

I tried out my sample code in the latest emsdk (3.1.41) and I don't seem to see this happening anymore, let me dig a little more.

Update: Seems like the issue has not been resolved. Although part of it is resolved in my sample app, I'm still seeing the explosion of imports in my build.
However, I managed to get my third party vendor to compile their library with the visibility visibility=hidden as default and that resolves my issue.

I believe we can close this chapter in emscripten. Thank you.

Edit: Opps I cant close it because I'm not the OP

@sbc100
Copy link
Collaborator

sbc100 commented Jul 12, 2023

Did you mean to say that your vendor compiled their library with visibility=hidden?

@Mintyboi
Copy link

Yes, exactly. Thanks for pointing out. I've edit the previous comment to reflect it correctly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants