create a self-hosted incremental linker #1535

Open
andrewrk opened this Issue Sep 17, 2018 · 16 comments

andrewrk commented Sep 17, 2018

LLD has no plans to support incremental linking, even though incremental linking is a perfect fit for zig's debug builds. Not good enough. The Zig project will have its own linker.

This is a huge project - support has to be explicitly added for every target OS and every target architecture.

Once this project is far enough along, zig will drop its dependency on LLD. No reason to have both.

andrewrk added this to the 1.1.0 milestone Sep 17, 2018

ghost commented Sep 17, 2018

Isn’t linking about as fast as cat?
If so, how is this big effort worth it vs. just relinking everything?

andrewrk commented Sep 17, 2018

It's about 5x slower than cat.

Also I just tried catting a debug build of clang to /dev/null and it took 3 seconds on my SSD. That's not fast enough. If you change a single function and recompile a large project, zig should only have to compile 1 function and update only the bytes of that function in the output file.
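
A minimal sketch of that "update only the bytes of that function" idea, assuming the linker keeps a record of where each function lives in the output file and reserves some slack after it. The `fn_slot` struct, `patch_function`, and the 0xCC filler (the x86 `int3` opcode) are hypothetical names and choices for illustration, not anything zig defines:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of where one function lives in the output image. */
struct fn_slot {
    size_t offset;   /* byte offset of the function in the image */
    size_t capacity; /* bytes reserved for it, including slack */
    size_t size;     /* bytes currently occupied by machine code */
};

/* Assumption: pad unused slack with 0xCC (int3 on x86). */
enum { PAD_BYTE = 0xCC };

/*
 * Overwrite a single function in place.
 * Returns 0 on success, or -1 if the slot lies outside the image or the
 * new code does not fit, in which case the image is left untouched and
 * the caller would have to place the function somewhere else.
 */
int patch_function(uint8_t *image, size_t image_len, struct fn_slot *slot,
                   const uint8_t *code, size_t code_len)
{
    if (slot->offset > image_len || slot->capacity > image_len - slot->offset)
        return -1;
    if (code_len > slot->capacity)
        return -1;

    memcpy(image + slot->offset, code, code_len);
    memset(image + slot->offset + code_len, PAD_BYTE,
           slot->capacity - code_len);
    slot->size = code_len;
    return 0;
}
```

Only `capacity` bytes of the image are written, no matter how large the whole file is. A real incremental linker would also have to handle the case where the function outgrows its slot and fix up any references that depend on its address or size; this sketch leaves both out.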

ghost commented Sep 17, 2018

OK, as much as I'd argue against a rewrite, this does seem to make sense in the long run.
Maybe someone else will have started the same effort by the time 1.1 becomes a thing. Maybe rust, at least they have function level recompile as well.

ghost commented Sep 17, 2018

There is Google's “gold” linker project, whose goal is (or was) both fast and incremental linking:

https://www.airs.com/blog/archives/38

https://github.com/pathscale/binutils/tree/master/gold

The commit dates seem a bit old, but the goals are a perfect match, so it could at least serve as a guide. Maybe it even still works; that would be quite something...

After a bit more digging, it seems gold did add incremental linking, but at least for full links it's half as fast as lld, for whatever reason.
Maybe lld is not so slow after all.

Sahnvour commented Sep 17, 2018

LLD is crazy fast already, but incremental linking could help on huge projects.

Do you mean LLD will explicitly not have incremental linking?

ghost commented Sep 17, 2018

> Maybe rust, at least they have function level recompile as well.

Regarding Rust, they are thinking about it as well: rust-lang/rust#39915 (comment). Incremental linking is already possible on Windows, so I don't think it's absurd to suggest that lld might add this at some point.
(They also talk about gold. It doesn't seem to be completely dead, just not working for them, but maybe it would work for zig's incremental linking?)

Given that lld is already very fast, supports many (all?) important platforms, and zig and Rust would have the same demand...

andrewrk commented Sep 17, 2018

> Do you mean LLD will explicitly not have incremental linking?

Yes, I've asked them about it and others have asked them about it, and they just want a fast but deterministic linker that redoes the whole job every time.

andrewrk referenced this issue Sep 27, 2018

Open

add docs for use #1589

0 of 4 tasks complete

andrewrk commented Sep 30, 2018

Another motivation for this issue is that the Mach-O code in LLD is of poor quality, and nobody in the LLVM community wants to improve it.

See for example https://gist.github.com/srgpqt/61163a279baa4f8d41b01a653c2635bc

A self-hosted linker would fit into the bootstrapping plan (#853), no problem. If we had a self-hosted linker and we dropped the dependency on LLD:

  • stage1 builds with the system C++ compiler and linker. (status quo)
  • stage2 builds with stage1 and the system linker.
  • stage3 builds with stage2

bnoordhuis commented Oct 1, 2018

Not that I want to move the goalposts too much but a custom linker means no LTO unless someone writes that as well (super hard!)

andrewrk commented Oct 1, 2018

Our current plan for LTO is emitting everything into a single .o file and running -O3 on that. That's what stage 1 does. It's really slow, but for a release build that's the trade-off.

Meanwhile in debug builds the plan is to split into as many .o files as would speed up the compilation.

There may eventually be a release-mode setting that controls how much optimization to give up in exchange for faster build times and lower memory use during compilation.

ghost commented Oct 1, 2018

The other thing that makes me suspicious is that Rust manages to cross compile, and they use lld?

There is also ThinLTO, which is actually MUCH faster, and that would go away as an option as well.

I mean, it’s your project, so this is just meant as food for thought to avoid unnecessary work; it’s your decision.

Reading http://lists.llvm.org/pipermail/llvm-dev/2018-June/123782.html does sound quite discouraging, and surprising tbh.

andrewrk commented Oct 1, 2018

I would love to know how Rust accomplishes cross compiling for the macOS target. It's very unlikely they're using LLD.

ghost commented Oct 1, 2018

> It's very unlikely they're using LLD.

I forgot to adjust my comment; adding a reference to the Rust thread I already linked to previously.

Searching through their issues revealed another problem with writing your own linker:
rust-lang/rust#54637

> For a project at Google, we need retpoline support.

> We should still support retpolines when plt is used. That being said, it seems to me that this is almost entirely a linker’s job.

In rust-lang/rust#39915 (comment) they try to use LLD:

> Once that's done we can advertise it to the community, asking for feedback. Here we can gain both timing information as well as bug reports to send to LLD. If everything goes smoothly (which is sort of doubtful with a whole brand new linker, but hey you never know!) we can turn it on by default, otherwise we can work to stabilize the selection of LLD and then add an option to Cargo.toml so projects can at least opt-in to it.

As for the current status, here is a search for lld-related commits:
https://github.com/rust-lang/rust/search?q=lld&type=Commits

As far as I can tell they have their own lld fork, like zig, which they call rust-lld,
at least for ARM: https://www.reddit.com/r/rust/comments/9a7te2/nightly_rust_is_switching_to_use_lld_llvms_new/#bottom-comments

rg -F "linker: Some("
src/librustc_target/spec/riscv32imac_unknown_none_elf.rs
28:            linker: Some("rust-lld".to_string()),

src/librustc_target/spec/windows_base.rs
77:        linker: Some("gcc".to_string()),

src/librustc_target/spec/thumb_base.rs
46:        linker: Some("rust-lld".to_string()),

src/librustc_target/spec/armebv7r_none_eabihf.rs
31:            linker: Some("rust-lld".to_owned()),

src/librustc_target/spec/wasm32_unknown_unknown.rs
54:        linker: Some("rust-lld".to_owned()),

src/librustc_target/spec/msp430_none_elf.rs
34:            linker: Some("msp430-elf-gcc".to_string()),

src/librustc_target/spec/riscv32imc_unknown_none_elf.rs
28:            linker: Some("rust-lld".to_string()),

src/librustc_target/spec/l4re_base.rs
35:        linker: Some("ld".to_string()),

src/librustc_target/spec/aarch64_unknown_none.rs
23:        linker: Some("rust-lld".to_owned()),

src/librustc_target/spec/armv7r_none_eabihf.rs
31:            linker: Some("rust-lld".to_owned()),

src/librustc_target/spec/armv7r_none_eabi.rs
31:            linker: Some("rust-lld".to_owned()),

src/librustc_target/spec/armebv7r_none_eabi.rs
31:            linker: Some("rust-lld".to_owned()),

Windows seems to use something other than lld:

src/librustc_target/spec/windows_base.rs
77:        linker: Some("gcc".to_string()),

For Mac there is:
https://github.com/rust-lang/rust/blob/master/src/librustc_target/spec/i686_apple_darwin.rs#L17


sorry this turned into a bit of a mess

andrewrk added a commit that referenced this issue Oct 6, 2018

disable C ABI tests on macos due to LLD deficiency
See #1535

we'll have a better macos linker someday

andrewrk added a commit that referenced this issue Nov 2, 2018

support building static self hosted compiler on macos
 * add a --system-linker-hack command line parameter to work around
   poor LLD macho code. See #1535
 * build.zig correctly handles static as well as dynamic dependencies
   when building the self hosted compiler.
   - no more unnecessary libxml2 dependency
   - a static build on macos produces a completely static self-hosted
     compiler for macos (except for libSystem as intended).

bheads commented Nov 16, 2018

andrewrk commented Nov 16, 2018

Linkers & Loaders is a brilliant resource. This book helped me go from being clueless to understanding what linkers do well enough that I can read linker source code and follow the concepts. Note that the author has recommended that people check out a copy at a library or maybe even buy the book because the online content is outdated and therefore contains errors.

andrewrk commented Feb 5, 2019

Another motivation for having our own linker has to do with Thread Local Storage (#924). Thread local variables go in the .tdata and .tbss sections, and the linker merges them into a single TLS segment. On Linux, libc uses the auxiliary vector to locate the program headers and reads the PT_TLS Program Header Entry to find out the size of the TLS at runtime. musl-libc, for example, preallocates 16 * pointer_size bytes for TLS but then has to call mmap if it isn't enough:

static struct builtin_tls {
    char c;
    struct pthread pt;
    void *space[16];
} builtin_tls[1];

...

    if (libc.tls_size > sizeof builtin_tls) {
#ifndef SYS_mmap2
#define SYS_mmap2 SYS_mmap
#endif
        mem = (void *)__syscall(
            SYS_mmap2,
            0, libc.tls_size, PROT_READ|PROT_WRITE,
            MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
        /* -4095...-1 cast to void * will crash on dereference anyway,
         * so don't bloat the init code checking for error codes and
         * explicitly calling a_crash(). */
    } else {
        mem = builtin_tls;
    }

This all happens before main. On the other hand, if we had our own linker, we could have a special placeholder for the statically allocated TLS array, and thus always avoid this mmap before main.
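
To make the size check concrete, here is a simplified sketch of the decision the startup code makes. The `phdr` struct is a cut-down stand-in for an ELF program header, and `tls_size` and `tls_fits_static` are hypothetical helper names. It ignores alignment and the thread descriptor, which a real libc also has to account for:

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for an ELF program header: only the fields used here. */
struct phdr {
    uint32_t type;  /* segment type */
    uint64_t memsz; /* size of the segment in memory */
};

/* Value of PT_TLS in the ELF specification. */
enum { SEG_TLS = 7 };

/* Return the in-memory size of the TLS segment, or 0 if there is none. */
uint64_t tls_size(const struct phdr *ph, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ph[i].type == SEG_TLS)
            return ph[i].memsz;
    }
    return 0;
}

/*
 * Return 1 if the TLS segment fits in `reserved` bytes of statically
 * allocated space, 0 if startup code would have to fall back to mmap.
 */
int tls_fits_static(const struct phdr *ph, size_t n, size_t reserved)
{
    return tls_size(ph, n) <= reserved;
}
```

With a fixed `reserved` value, as in musl's `builtin_tls`, the answer depends on the program being linked. A linker that already knows the final TLS size could, in principle, set the reserved area to exactly that size, so the check would always succeed.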

It works differently on Windows. I haven't looked up how it works there yet.
