.comm in .a not found #256

Closed
fefe17 opened this issue Jan 6, 2022 · 4 comments

Comments

@fefe17

fefe17 commented Jan 6, 2022

I'm probably doing something wrong here, so this may not be a bug in mold after all.

Here's a minimal repro:

$ cat a.S
.data
.comm foo,4,4
$ cat b.c
extern int foo;
int main() { return foo; }
$ clang -o c a.S b.c
(works)
$ clang -o c a.S b.c -fuse-ld=mold
(still works)
$ clang -c a.S b.c
$ ar cru c.a a.o b.o
$ ranlib c.a
$ clang -o c c.a
(works)
$ clang -o c c.a -fuse-ld=mold
mold: error: undefined symbol: c.a(b.o): foo
clang-14: error: linker command failed with exit code 1 (use -v to see invocation)

Huh? Why can't mold find foo if it comes from an archive?

@rui314
Owner

rui314 commented Jan 6, 2022

Common symbols are resolved in a strange way when they are in archive files. Here is how GNU ld works.

  • If a common symbol in an archive can be used to resolve a regular undefined symbol, the linker pulls the object file out of the archive
  • If a non-common defined symbol in an archive can be used to resolve a common symbol, the linker pulls the object file out of the archive
  • If a common symbol in an archive can be used to resolve another common symbol, the linker does not pull the object file out of the archive

So, the linker does not pull out an object if it results in overwriting a common symbol with a common symbol. This behavior is strange, to say the least.
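
As a rough illustration, the three rules above can be written as a small decision function. This is only a sketch of the behavior as described in this comment, not code from GNU ld or mold; the names `sym_kind` and `gnu_ld_pulls_member` are made up for the example. The case of a regular definition resolving an undefined symbol is not listed above and is assumed here to follow ordinary archive semantics.

```c
#include <stdbool.h>

/* How the linker currently knows a symbol, or how an archive member
   provides it. Illustrative names only, not taken from any linker. */
enum sym_kind {
    SYM_UNDEFINED,  /* referenced but not defined yet */
    SYM_COMMON,     /* common symbol, e.g. emitted by .comm */
    SYM_DEFINED     /* regular (non-common) definition */
};

/* Returns true if the archive member providing `in_archive` would be
   pulled out for a symbol whose current state is `current`. */
bool gnu_ld_pulls_member(enum sym_kind current, enum sym_kind in_archive)
{
    switch (current) {
    case SYM_UNDEFINED:
        /* Rule 1: a common symbol resolves an undefined symbol.
           Assumption: a regular definition does too (usual archive rule). */
        return in_archive == SYM_COMMON || in_archive == SYM_DEFINED;
    case SYM_COMMON:
        /* Rule 2: a regular definition replaces a common symbol.
           Rule 3: common over common does not pull the member. */
        return in_archive == SYM_DEFINED;
    case SYM_DEFINED:
    default:
        /* Already defined: nothing in the archive is needed. */
        return false;
    }
}
```

In the repro from the first comment, foo is undefined in b.o and common in a.o, so `gnu_ld_pulls_member(SYM_UNDEFINED, SYM_COMMON)` is the case that matters, and it is true under these rules.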

So, mold takes a simpler approach. It does not pull out an archive member for a common symbol. These semantics seem to work fine in most cases, but as you pointed out, they do differ from GNU ld.

How did you notice this issue? I wonder if there's a program in the wild that depends on this subtle difference.

@fefe17
Author

fefe17 commented Jan 6, 2022

I noticed this because my libc makes use of this for the __guard symbol.
https://www.fefe.de/dietlibc/

I could switch that back to a regular symbol if that is what it takes to make it work with mold.

I went with the GNU as documentation here: https://ftp.gnu.org/old-gnu/Manuals/gas-2.9.1/html_chapter/as_7.html#SEC76
It does kinda make sense to me, and the GNU ld behavior also makes sense in light of this documentation.
Basically two files can have an int foo declared and it will land in the binary only once. I think gcc recently changed default behavior away from using this, but previous versions did that by default. You could have a variable declaration in a header file that you included in five places and it would magically work and not throw duplicate symbol errors.

It does not mention any special behavior for archives.

To be honest I'm not sure why we switched from a regular symbol to a .comm in the first place.

What would be the downside of pulling in an object for a .comm symbol in mold instead of failing the build?

@kevinko

kevinko commented Jan 6, 2022

@rui314, I think I ran into a related issue just yesterday. When statically linking libpthread.a, mold cannot find certain common symbols.

Toy pthread program:

#include <pthread.h>

void * run_thread(void *arg) {
        return NULL;
}

int main() {
        pthread_t thread_id;
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_create(&thread_id, &attr, run_thread, NULL);
        pthread_join(thread_id, NULL);
        return 0;
}

Under Ubuntu 20.04 with gcc-9:

  • stock gcc-9 is OK
# gcc test.c -o test -lpthread -static
  • dynamically linking with mold is OK
# gcc test.c -o test -lpthread -B/usr/libexec/mold
  • statically linking with mold results in undefined symbols
# gcc test.c -o test -lpthread -B/usr/libexec/mold -static
mold: error: undefined symbol: /usr/lib/x86_64-linux-gnu/libpthread.a(pthread_cancel.o): __libc_multiple_threads_ptr
mold: error: undefined symbol: /usr/lib/x86_64-linux-gnu/libpthread.a(pthread_create.o): __xidcmd
mold: error: undefined symbol: /usr/lib/x86_64-linux-gnu/libpthread.a(pthread_create.o): __static_tls_align_m1
mold: error: undefined symbol: /usr/lib/x86_64-linux-gnu/libpthread.a(pthread_create.o): __static_tls_size
mold: error: undefined symbol: /usr/lib/x86_64-linux-gnu/libpthread.a(pthread_create.o): __static_tls_size
mold: error: undefined symbol: /usr/lib/x86_64-linux-gnu/libpthread.a(pthread_create.o): __static_tls_size
mold: error: undefined symbol: /usr/lib/x86_64-linux-gnu/libpthread.a(pthread_create.o): __static_tls_align_m1
mold: error: undefined symbol: /usr/lib/x86_64-linux-gnu/libpthread.a(pthread_create.o): __libc_multiple_threads_ptr
collect2: error: ld returned 1 exit status

The undefined symbols here are all common symbols in libpthread.a:

# nm /usr/lib/x86_64-linux-gnu/libpthread.a |grep " C " 
0000000000000008 C __libc_multiple_threads_ptr
0000000000000008 C __static_tls_align_m1
0000000000000008 C __static_tls_size
0000000000000008 C __xidcmd
0000000000000038 C __default_pthread_attr
0000000000000004 C __is_smp
0000000000000004 C __pthread_multiple_threads
0000000000000004 C __pthread_debug
0000000000000004 C __concurrency_level
0000000000000008 C __fork_generation
0000000000000008 C __sem_mappings

@rui314
Owner

rui314 commented Jan 8, 2022

I made a change so that unresolved symbols are resolved to common symbols in an archive if one exists. It ended up being a small patch, so I don't think it is likely to have an undesirable side effect, but since name resolution is an intricate step, it needs extensive testing. I'll do that by compiling all Gentoo packages before 1.0.2. In the meantime, if you guys notice any issue, please file a bug.
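
To make the before/after difference concrete, here is a toy model of the decision for an unresolved symbol. It is based only on the descriptions in this thread and is not mold's actual code; `pulls_member_for_undefined` and the `commons_resolve_undefined` flag are hypothetical names for illustration.

```c
#include <stdbool.h>

/* How an archive member provides the symbol. Illustrative names only. */
enum sym_kind { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINED };

/* Toy model: is an archive member pulled out for an unresolved symbol?
   commons_resolve_undefined == false models the earlier behavior
   (no member is pulled for a common symbol);
   commons_resolve_undefined == true models the change described above. */
bool pulls_member_for_undefined(enum sym_kind in_archive,
                                bool commons_resolve_undefined)
{
    if (in_archive == SYM_DEFINED)
        return true;                       /* regular definition: always pulled */
    if (in_archive == SYM_COMMON)
        return commons_resolve_undefined;  /* common symbol: depends on the change */
    return false;                          /* member does not provide the symbol */
}
```

With the flag off, the foo reference from b.o in the original repro stays undefined, which matches the reported error. With the flag on, a.o would be pulled out of c.a.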
