windows-gnu target compatibility with toolchains providing LLVM tools only #72241

Closed
1 of 2 tasks
mati865 opened this issue May 15, 2020 · 25 comments · Fixed by #94872
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. O-windows-gnu Toolchain: GNU, Operating system: Windows T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@mati865
Contributor

mati865 commented May 15, 2020

Rust currently assumes all GNU libraries and tools are present. This is not the case when using LLVM-only toolchains like this one: https://github.com/mstorsjo/llvm-mingw

Remaining issues:

Opening the issue so interested people can track it easily.

@mati865 mati865 added the C-bug Category: This is a bug. label May 15, 2020

@rustbot rustbot added the O-windows-gnu Toolchain: GNU, Operating system: Windows label May 15, 2020
@jonas-schievink jonas-schievink added A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 15, 2020
@mati865
Contributor Author

mati865 commented May 21, 2020

cc @petrochenkov for the second issue: we just need the possibility of providing custom late_link_args_dynamic and late_link_args_static.

@petrochenkov
Contributor

@mati865
Do you mean the ability to pass -C link-args-dynamic / -C link-args-static on the command line in addition to the target spec, or the ability to drop the target spec args and start from scratch with regular -C link-args?

@mati865
Contributor Author

mati865 commented May 22, 2020

@petrochenkov in this specific case we need to replace libgcc with compiler-rt, and libgcc_eh or libgcc_s with libunwind. late_link_args could hopefully be left intact, but I have to verify that.
Adding a new target would do the trick, but it might be overkill if things can work with the existing target.

@petrochenkov
Contributor

So, I think this is a job for #[link(cfg(...))] (if we want to avoid a new target).

One cfg feature is provided by the compiler and makes it possible to distinguish between link_args_dynamic and link_args_static at link time.

Another cfg feature is a regular cargo feature of the src/libunwind crate which switches between "LLVM toolchain" (compiler-rt + libunwind) and "GCC toolchain" (libgcc + libgcc_eh).

Then 8e3467c is reverted and all the linking is done through #[link] attributes in src/libunwind.

#[link(name = "gcc_s", cfg(all(rustc_detect_dynamic, gcc_toolchain)))]
#[link(name = "gcc_eh", cfg(all(not(rustc_detect_dynamic), gcc_toolchain)))]
#[link(name = "unwind", cfg(not(gcc_toolchain)))]
extern {}

or something.
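
The selection that the three `#[link]` lines above are meant to express can be written out as a plain function. This is only a sketch of the truth table: `rustc_detect_dynamic` and `gcc_toolchain` are the hypothetical cfg names from the proposal, modelled here as booleans, and `unwind_library` is an illustrative name, not a rustc API.

```rust
/// Returns the unwinding library the final link would pull in.
///
/// `dynamic` stands in for the proposed (hypothetical) `rustc_detect_dynamic`
/// cfg, `gcc_toolchain` for the proposed cargo feature of the unwind crate.
fn unwind_library(dynamic: bool, gcc_toolchain: bool) -> &'static str {
    match (gcc_toolchain, dynamic) {
        // GCC toolchain, dynamic linking: shared libgcc_s.
        (true, true) => "gcc_s",
        // GCC toolchain, static linking: static libgcc_eh.
        (true, false) => "gcc_eh",
        // LLVM toolchain: libunwind in both cases.
        (false, _) => "unwind",
    }
}
```

Under this mapping a statically linked program built with a GCC toolchain gets `gcc_eh`, while an LLVM toolchain build gets `unwind` regardless of the dynamic/static choice.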

@petrochenkov
Contributor

This way we'll eliminate late_link_args_(dynamic,static) stuff from target specs which felt like a hack to me since its introduction.

Before 8e3467c #![no_std] programs (or rather programs not depending on src/libunwind) didn't link to the exception support libraries, which seems like the right thing to me.

With these options moved to the target spec they apply to all programs, even no-std ones.
(In this sense any libraries linked from target specs seem at least suspicious.)

@mati865
Contributor Author

mati865 commented May 23, 2020

I've tried a similar approach in #71001.
That makes i686 executables depend on having libgcc_s-dw2-1.dll (or libunwind.dll) either in PATH or next to them.

I wanted to avoid a new target if possible because we will definitely need a new target to allow linking with UCRT instead of MSVCRT, since one cannot mix libraries built with UCRT and MSVCRT.
C/C++ toolchains using UCRT already exist in the wild and I'm already being asked when it'll be possible to use Rust-built libraries with them.

EDIT: For x86_64 we could even build libunwind as part of std and link it statically but this somehow results in errors because proc_macro crates are no longer built (just like previously on musl with +crt-static).

@petrochenkov
Contributor

#71001 didn't use cfg(rustc_detect_dynamic) proposed in #72241 (comment) though.
With it executables won't depend on gcc_s.

@mati865
Contributor Author

mati865 commented May 23, 2020

Wait, so we can use features here not only when building std but also when building user code? 😮

@petrochenkov
Contributor

@mati865
Yes, cfgs in #[link] are serialized into crate metadata "as is" and evaluated only when the final binary is built.
That's how cfg(target_feature = "crt-static") works, for example.
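
A rough model of that deferred evaluation, to make the mechanism concrete. This is a sketch and not how rustc stores metadata: `Cfg`, `NativeLib` and `libs_to_link` are made-up names, and cfg flags are reduced to plain strings.

```rust
use std::collections::HashSet;

/// A cfg predicate kept as data instead of being evaluated when the
/// upstream crate is compiled (hypothetical, simplified representation).
#[derive(Debug, Clone)]
enum Cfg {
    Flag(&'static str),
    Not(Box<Cfg>),
}

impl Cfg {
    /// Evaluates the predicate against the flags of the *final* link session.
    fn eval(&self, session: &HashSet<&str>) -> bool {
        match self {
            Cfg::Flag(name) => session.contains(*name),
            Cfg::Not(inner) => !inner.eval(session),
        }
    }
}

/// A native library recorded by an upstream crate together with its
/// still-unevaluated condition.
struct NativeLib {
    name: &'static str,
    cfg: Option<Cfg>,
}

/// Picks the libraries whose stored predicate holds for the final session.
/// Libraries without a predicate are always linked.
fn libs_to_link(recorded: &[NativeLib], session: &HashSet<&str>) -> Vec<&'static str> {
    recorded
        .iter()
        .filter(|lib| lib.cfg.as_ref().map_or(true, |c| c.eval(session)))
        .map(|lib| lib.name)
        .collect()
}
```

The point of the model is that the same recorded list yields different libraries depending on the session that performs the final link, which is why user code can influence the choice without rebuilding std.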

@petrochenkov
Contributor

cc #72510

@petrochenkov
Contributor

petrochenkov commented Jun 8, 2020

@nu8
I've seen the __ms_vsnprintf error a few days ago when building rustc using gcc toolchain (https://packages.msys2.org/package/mingw-w64-x86_64-gcc) (so I had to switch to msvc).
So it seems like it isn't specific to LLVM toolchain, some recent changes just broke mingw in general.

@mati865
Contributor Author

mati865 commented Jun 8, 2020

No, it's not specific to LLVM at all.
This error happens when you mix UCRT and MSVCRT, and the only solution is to never mix different CRTs.
Rust doesn't provide a UCRT-based toolchain (I mentioned it in #72241 (comment)). For that to happen we need a more recent mingw-w64, maybe even one built to target UCRT by default.

> I've seen the __ms_vsnprintf error a few days ago when building rustc using gcc toolchain (https://packages.msys2.org/package/mingw-w64-x86_64-gcc) (so I had to switch to msvc).

@petrochenkov a few days ago MSYS2 upgraded mingw-w64 and it broke the winpthreads library. A fix was committed the next day, I think.

The issue described by @nu8 is caused by using the toolchain from the llvm-mingw project.
That toolchain uses LLVM-only tools and targets UCRT. Either of those can cause the build to fail because neither is currently supported by Rust.

@mati865
Contributor Author

mati865 commented Jun 8, 2020

@petrochenkov I reproduced your issue by upgrading mingw-w64 among the other packages...
That's because MinGW doesn't seem to have a stable order for its linkage.

Currently Rust does it this way:

late_link_args.insert(
    LinkerFlavor::Gcc,
    vec![
        "-lmingwex".to_string(),
        "-lmingw32".to_string(),
        "-lmsvcrt".to_string(),
        // mingw's msvcrt is a weird hybrid import library and static library.
        // And it seems that the linker fails to use import symbols from msvcrt
        // that are required from functions in msvcrt in certain cases. For example
        // `_fmode` that is used by an implementation of `__p__fmode` in x86_64.
        // Listing the library twice seems to fix that, and seems to also be done
        // by mingw's gcc (Though not sure if it's done on purpose, or by mistake).
        //
        // See https://github.com/rust-lang/rust/pull/47483
        "-lmsvcrt".to_string(),
        "-luser32".to_string(),
        "-lkernel32".to_string(),
    ],
);

Now compare it to GCC, which adjusts its list as needed; the last adjustment happened 2 weeks ago in gcc-mirror/gcc@850533a.
Interesting parts of gcc hello.c -###:
-lmingw32 -lgcc -lgcc_eh -lmoldname -lmingwex -lmsvcrt -lkernel32 -lpthread -ladvapi32 -lshell32 -luser32 -lkernel32 -lmingw32 -lgcc -lgcc_eh -lmoldname -lmingwex -lmsvcrt -lkernel32

Clang solves it by using --start-group and --end-group around mingw-w64 libs.
Interesting parts of clang hello.c -###:
-lmingw32 -lgcc -lgcc_eh -lmoldname -lmingwex -lmsvcrt -lpthread -ladvapi32 -lshell32 -luser32 -lkernel32 -lmingw32 -lgcc -lgcc_eh -lmoldname -lmingwex -lmsvcrt

At this point I don't know whether Rust should:

  • follow GCC by adjusting that list every time something breaks,
  • follow Clang and use --start-group and --end-group (there is a performance hit here),
  • say "I'm out of this mess" by setting no_default_libraries to false and letting the linker handle it.

I'm currently looking for a workaround to make stage0 std build...
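
For the second option, the wrapping itself is simple. A minimal sketch, assuming the linker is invoked through a compiler driver (hence the `-Wl,` prefix) and using a made-up helper name `grouped_link_args`; it is not taken from rustc:

```rust
/// Wraps `-l` arguments in a linker group so that GNU ld keeps rescanning
/// the listed libraries until no new undefined symbols get resolved.
/// `grouped_link_args` is an illustrative helper, not an existing API.
fn grouped_link_args(libs: &[&str]) -> Vec<String> {
    let mut args = Vec::with_capacity(libs.len() + 2);
    args.push("-Wl,--start-group".to_string());
    args.extend(libs.iter().map(|lib| format!("-l{lib}")));
    args.push("-Wl,--end-group".to_string());
    args
}
```

Inside a group the order of the libraries stops mattering, which is what makes this option robust against mingw-w64 reshuffling its dependencies, at the cost of the repeated scans.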

@petrochenkov
Contributor

> follow GCC by adjusting that list everytime something breaks

Hmm, shouldn't it be possible to make a "union" list that would work both pre- and post-gcc-mirror/gcc@850533a by passing some libraries multiple times?
Which would be similar to what linker does with --(start,end)-group, except that we would be doing it manually.

--(start,end)-group would make it harder to move these libraries to #[link] attributes in libc or libstd crates from the target spec, which would be nice to do because it's more in line with what other targets do.
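
One way to read the "union" idea: if two library orders are each known to work with one mingw-w64 version, a list that contains both as subsequences should satisfy a left-to-right linker in either case. The sketch below builds such a list by concatenation; `union_order` and `is_subsequence` are illustrative names, and the claim that a supersequence is always safe is an assumption (extra archive members may get pulled in).

```rust
/// Concatenates two known-good library orders. The first entry of `new` is
/// skipped only when it would immediately repeat the last entry of `old`.
fn union_order<'a>(old: &[&'a str], new: &[&'a str]) -> Vec<&'a str> {
    let mut out = old.to_vec();
    let skip = usize::from(!old.is_empty() && old.last() == new.first());
    out.extend_from_slice(&new[skip..]);
    out
}

/// Checks that `needle` appears in `haystack` in order (not necessarily
/// contiguously), i.e. that the original order is still honoured.
fn is_subsequence(needle: &[&str], haystack: &[&str]) -> bool {
    let mut rest = haystack.iter();
    needle.iter().all(|n| rest.any(|h| h == n))
}
```

The result is longer than either input, which is the manual equivalent of what the linker does automatically inside --start-group/--end-group.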

@mati865
Contributor Author

mati865 commented Jun 8, 2020

That seems to do the trick for now:

index f556bf03f02..744f26239ca 100644
--- a/src/librustc_target/spec/windows_gnu_base.rs
+++ b/src/librustc_target/spec/windows_gnu_base.rs
@@ -20,9 +20,9 @@ pub fn opts() -> TargetOptions {
     late_link_args.insert(
         LinkerFlavor::Gcc,
         vec![
+            "-lmsvcrt".to_string(),
             "-lmingwex".to_string(),
             "-lmingw32".to_string(),
-            "-lmsvcrt".to_string(),
             // mingw's msvcrt is a weird hybrid import library and static library.
             // And it seems that the linker fails to use import symbols from msvcrt
             // that are required from functions in msvcrt in certain cases. For example

For stage0 to work you can build Rust with RUSTFLAGS_BOOTSTRAP='-C link-arg=-lmsvcrt' ./x.py build.

Generally GCC passes unnecessary libs like the totally empty -lmoldname.
It would be nice if there was a tool to create a dependency graph for import/static libraries so we could confidently choose the order instead of the current "let's see if that order works".
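
Given such a dependency graph, choosing the order would be a topological sort. A minimal sketch, assuming the graph is already available as a map from each library to the libraries it needs symbols from; extracting that graph from real import/static libraries is the missing tool and is not attempted here. `link_order` is an illustrative name.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Orders libraries so that each one comes before the libraries it depends
/// on, which is what a left-to-right linker needs. Returns `None` when the
/// graph has a cycle, i.e. when no single order can work without repeating
/// a library or using a linker group.
fn link_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    // Every library mentioned either as a key or as a dependency.
    let mut nodes: BTreeSet<&'a str> = deps.keys().copied().collect();
    for targets in deps.values() {
        nodes.extend(targets.iter().copied());
    }
    // In-degree: how many libraries depend on this one.
    let mut indegree: BTreeMap<&'a str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    for targets in deps.values() {
        for t in targets {
            *indegree.get_mut(t).unwrap() += 1;
        }
    }
    // Start with libraries nobody depends on (Kahn's algorithm).
    let mut ready: Vec<&'a str> = indegree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(n, _)| *n)
        .collect();
    let mut order = Vec::new();
    while let Some(lib) = ready.pop() {
        order.push(lib.to_string());
        for t in deps.get(lib).into_iter().flatten() {
            let d = indegree.get_mut(t).unwrap();
            *d -= 1;
            if *d == 0 {
                ready.push(*t);
            }
        }
    }
    (order.len() == nodes.len()).then_some(order)
}
```

A `None` result corresponds to the situation described in the source comment about msvcrt, where libraries need each other and listing one twice (or grouping) is unavoidable.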

@mati865
Contributor Author

mati865 commented Jun 8, 2020

@nu8

> However with Clang, at least with this version:
>
> mstorsjo/llvm-mingw
>
> I havent had to do that at all.

Maybe it uses --start-group and --end-group implicitly for user libraries?

> I think Clang uses this:
>
> http://theory.stanford.edu/~nikolaj/programmingz3.html
>
> to resolve the Libraries.

MSYS2 ships LLVM with z3, llvm-mingw builds it without that.

> Maybe that brings some overhead, but Clang seems about
> 30% faster to me as well.

It depends on the use case but Clang is often just faster than GCC.

@mati865
Contributor Author

mati865 commented Jun 9, 2020

Opened #73184 to fix the __ms_vsnprintf issue when building Rust.

@mati865
Contributor Author

mati865 commented Jul 10, 2020

> I wanted to avoid new target if possible because we will definitely need new target to allow linking with UCRT instead of MSVCRT since one cannot mix libraries build with UCRT and MSVCRT.

Okay, it seems like this will need a separate target (GCC's emutls vs Clang's native TLS is a good example).
Instead, we could avoid creating separate MSVCRT vs UCRT targets for each compiler until somebody needs them. Windows-gnu would keep MSVCRT as the default and the new MinGW Clang toolchain would use UCRT.

Should I close this issue and open another one, or an MCP for the new target?

kleisauke referenced this issue in libvips/build-win64-mxe Sep 27, 2020
- MinGW-w64 CRT needs to be linked with LLD.
- Build and bundle llvm-mingw tests.
- Compile gettext with --disable-libasprintf.
- Update LLVM to 10.0.0.
- Update MinGW-w64 to the latest master version.
- Build Rust within MXE instead of using the Rust Docker image.
- Fix librsvg build with latest MinGW-w64.
@mstorsjo

> However with Clang, at least with this version:
> mstorsjo/llvm-mingw
> I havent had to do that at all.
>
> Maybe it uses --start-group and --end-group implicitly for user libraries?

As LLD's COFF backend is modelled after MS link.exe, it handles static libs in the same way as that tool, which means that it looks in all referenced static libs for undefined references - pretty much as if --start-group/--end-group was wrapped around the full list of arguments. (And Clang passes those arguments around the libs to the linker, to solve that issue when using ld.bfd as well.)
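
The difference between the two behaviours can be shown with a toy model. This is a deliberately simplified sketch: real linkers work per archive member, not per library, and `Lib`, `single_pass`, `rescan` and the symbol name `helper` are all made up for illustration (`_fmode` is borrowed from the msvcrt comment quoted earlier in the thread).

```rust
use std::collections::BTreeSet;

/// A toy static library: the symbols it defines and the symbols it needs.
#[derive(Clone, Copy)]
struct Lib {
    defines: &'static [&'static str],
    needs: &'static [&'static str],
}

/// Resolves `roots` against `libs` and returns the symbols left undefined.
/// With `rescan == false` the list is walked once, left to right; with
/// `rescan == true` it is walked repeatedly until nothing changes, which
/// mimics --start-group/--end-group or a link.exe-style search.
fn resolve(roots: &[&'static str], libs: &[Lib], rescan: bool) -> BTreeSet<&'static str> {
    let mut undefined: BTreeSet<&'static str> = roots.iter().copied().collect();
    let mut defined: BTreeSet<&'static str> = BTreeSet::new();
    loop {
        let mut progress = false;
        for lib in libs {
            // A library is pulled in only if it defines something that is
            // undefined at the moment the linker looks at it.
            if lib.defines.iter().any(|s| undefined.contains(s)) {
                for s in lib.defines {
                    undefined.remove(s);
                    defined.insert(*s);
                }
                for s in lib.needs {
                    if !defined.contains(s) {
                        undefined.insert(*s);
                    }
                }
                progress = true;
            }
        }
        if !rescan || !progress {
            break;
        }
    }
    undefined
}

fn single_pass(roots: &[&'static str], libs: &[Lib]) -> BTreeSet<&'static str> {
    resolve(roots, libs, false)
}

fn rescan(roots: &[&'static str], libs: &[Lib]) -> BTreeSet<&'static str> {
    resolve(roots, libs, true)
}
```

In this model, a library that is listed before the library that needs it is skipped by the single pass and its symbols stay undefined, while rescanning, or listing the library a second time at the end, resolves them.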

> I think Clang uses this:
> http://theory.stanford.edu/~nikolaj/programmingz3.html
> to resolve the Libraries.
>
> MSYS2 ships LLVM with z3, llvm-mingw builds it without that.

Z3 is entirely unrelated to linking. A quick grep through the source tree would point to it being used by the clang static analyzer, possibly...

> Maybe that brings some overhead, but Clang seems about
> 30% faster to me as well.
>
> It depends on the use case but Clang is often just faster than GCC.

In my experience, Clang isn't that fast in itself any longer; for compilation it's often similar to GCC or slower. But for linking, LLD is often an order of magnitude faster than ld.bfd. This is especially noticeable for large C++ projects - see e.g. https://bugs.llvm.org/show_bug.cgi?id=42217#c17, where a user said "I can finally link my project with lld. This reduces link time from ~26s to ~1s. Amazing!"

@kleisauke
Contributor

kleisauke commented May 17, 2022

@mati865 Thanks for adding these *-pc-windows-gnullvm targets! I was wondering if there is also interest in the i686-pc-windows-gnullvm and armv7-pc-windows-gnullvm targets? If so, I could try to upstream these patches:
https://github.com/libvips/build-win64-mxe/blob/master/build/plugins/llvm-mingw/patches/rust-1-fixes.patch
https://github.com/libvips/build-win64-mxe/blob/9ffefdfb91117cb679cd1ed08a0f0f20f6366ed3/build/patches/librsvg-llvm-mingw.patch#L52-L191

(I'm able to test the (unusual) armv7-pc-windows-gnullvm target on my Raspberry Pi 3B with Windows 10 IoT, if necessary)
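
For code that has to tell the `*-pc-windows-gnullvm` targets apart from `*-pc-windows-gnu` at compile time, a cfg check along these lines may work. It is a sketch under two assumptions that should be verified against the toolchain in use: that `cfg(target_abi)` is available (stable Rust 1.78 or newer), and that the gnullvm targets report `target_env = "gnu"` together with `target_abi = "llvm"`. The function name is illustrative.

```rust
/// Reports which Windows toolchain flavor this crate was compiled for.
/// Assumes gnullvm targets set target_env = "gnu" and target_abi = "llvm".
fn windows_toolchain_flavor() -> &'static str {
    if cfg!(all(windows, target_env = "gnu", target_abi = "llvm")) {
        "gnullvm"
    } else if cfg!(all(windows, target_env = "gnu")) {
        "gnu"
    } else if cfg!(all(windows, target_env = "msvc")) {
        "msvc"
    } else {
        "not-windows"
    }
}
```

The gnullvm check has to come first because those targets would also match the plain `target_env = "gnu"` test.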

@mati865
Contributor Author

mati865 commented May 24, 2022

@kleisauke sorry, I thought I had already answered here.
32-bit LLVM targets are blocked right now on having working unwinding. The problem doesn't manifest right away when cross-compiling, but it prevents bootstrapping Rust natively.

@tschuett

LLVM (https://github.com/llvm/llvm-project/blob/main/llvm-libgcc/docs/LLVMLibgcc.rst) hosts these strange libgcc, libgcc_eh, libgcc_s files.

@mati865
Contributor Author

mati865 commented May 27, 2022

I doubt it helps in any way. The implementation details of Windows DWARF 2 unwinding differ a lot between libgcc and libunwind, and that library doesn't seem to take this into account.

@mstorsjo

> LLVM (https://github.com/llvm/llvm-project/blob/main/llvm-libgcc/docs/LLVMLibgcc.rst) hosts theses strange:
> libgcc, libgcc_eh, libgcc_s
> files.

They're not really of any value for this setup. Those files are used for letting libunwind (and compiler-rt) fill the gap where preexisting binaries on Linux have a dependency on e.g. libgcc_s.so, to make libunwind and compiler-rt, which provide the same symbols, provide a library interface for ABI compatibility.

workingjubilee pushed a commit to tcdi/postgrestd that referenced this issue Sep 15, 2022
Add MVP LLVM based mingw-w64 targets

Fixes rust-lang/rust#72241

Those are `rustc` side changes to create working x86_64 and AArch64 Rustc hosts and targets.
Apart from this PR changes to various crates are required which I'll do once this is accepted.

I'm expecting more changes on `rustc` side later on as I cannot even run full testsuite at this moment because passing JSON spec breaks paths in various tests.

Tier 3 policy:

> A tier 3 target must have a designated developer or developers (the "target maintainers") on record to be CCed when issues arise regarding the target. (The mechanism to track and CC such developers may evolve over time.)

I pledge to do my best maintaining it, MSYS2 is one of interested consumers so it should have enough testing (after the releases).

> Targets must use naming consistent with any existing targets; for instance, a target for the same CPU or OS as an existing Rust target should use the same name for that CPU or OS. Targets should normally use the same names and naming conventions as used elsewhere in the broader ecosystem beyond Rust (such as in other toolchains), unless they have a very good reason to diverge. Changing the name of a target can be highly disruptive, especially once the target reaches a higher tier, so getting the name right is important even for a tier 3 target.

This triple name was discussed at [`t-compiler/LLVM+mingw-w64 Windows targets`](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/LLVM.2Bmingw-w64.20Windows.20targets)

> Target names should not introduce undue confusion or ambiguity unless absolutely necessary to maintain ecosystem compatibility. For example, if the name of the target makes people extremely likely to form incorrect beliefs about what it targets, the name should be changed or augmented to disambiguate it.

I think the explanation in platform support doc is enough to make this aspect clear.

> Tier 3 targets may have unusual requirements to build or use, but must not create legal issues or impose onerous legal terms for the Rust project or for Rust developers or users.

It's using open source tools only.

> The target must not introduce license incompatibilities.

It's even more liberal than already existing `*-pc-windows-gnu`.

> Anything added to the Rust repository must be under the standard Rust license (MIT OR Apache-2.0).

Understood.

> The target must not cause the Rust tools or libraries built for any other host (even when supporting cross-compilation to the target) to depend on any new dependency less permissive than the Rust licensing policy. This applies whether the dependency is a Rust crate that would require adding new license exceptions (as specified by the tidy tool in the rust-lang/rust repository), or whether the dependency is a native library or binary. In other words, the introduction of the target must not cause a user installing or running a version of Rust or the Rust tools to be subject to any new license requirements.

There are no new dependencies/features required.

> Compiling, linking, and emitting functional binaries, libraries, or other code for the target (whether hosted on the target itself or cross-compiling from another target) must not depend on proprietary (non-FOSS) libraries. Host tools built for the target itself may depend on the ordinary runtime libraries supplied by the platform and commonly used by other applications built for the target, but those libraries must not be required for code generation for the target; cross-compilation to the target must not require such libraries at all. For instance, rustc built for the target may depend on a common proprietary C runtime library or console output library, but must not depend on a proprietary code generation library or code optimization library. Rust's license permits such combinations, but the Rust project has no interest in maintaining such combinations within the scope of Rust itself, even at tier 3.

As previously said it's using open source tools only.

> "onerous" here is an intentionally subjective term. At a minimum, "onerous" legal/licensing terms include but are not limited to: non-disclosure requirements, non-compete requirements, contributor license agreements (CLAs) or equivalent, "non-commercial"/"research-only"/etc terms, requirements conditional on the employer or employment of any particular Rust developers, revocable terms, any requirements that create liability for the Rust project or its developers or users, or any requirements that adversely affect the livelihood or prospects of the Rust project or its developers or users.

There are no such terms present.

> Neither this policy nor any decisions made regarding targets shall create any binding agreement or estoppel by any party. If any member of an approving Rust team serves as one of the maintainers of a target, or has any legal or employment requirement (explicit or implicit) that might affect their decisions regarding a target, they must recuse themselves from any approval decisions regarding the target's tier status, though they may otherwise participate in discussions.

I'm not the reviewer here.

> This requirement does not prevent part or all of this policy from being cited in an explicit contract or work agreement (e.g. to implement or maintain support for a target). This requirement exists to ensure that a developer or team responsible for reviewing and approving a target does not face any legal threats or obligations that would prevent them from freely exercising their judgment in such approval, even if such judgment involves subjective matters or goes beyond the letter of these requirements.

Again I'm not the reviewer here.

> Tier 3 targets should attempt to implement as much of the standard libraries as possible and appropriate (core for most targets, alloc for targets that can support dynamic memory allocation, std for targets with an operating system or equivalent layer of system-provided functionality), but may leave some code unimplemented (either unavailable or stubbed out as appropriate), whether because the target makes it impossible to implement or challenging to implement. The authors of pull requests are not obligated to avoid calling any portions of the standard library on the basis of a tier 3 target not implementing those portions.

> The target must provide documentation for the Rust community explaining how to build for the target, using cross-compilation if possible. If the target supports running binaries, or running tests (even if they do not pass), the documentation must explain how to run such binaries or tests for the target, using emulation if possible or dedicated hardware if necessary.

Building is described in platform support doc, running tests doesn't work right now (without hacks) because Rust's build system doesn't seem to support testing targets built from `.json`.
Docs will be updated once this lands in beta allowing master branch to build and run tests without `.json` files.

> Tier 3 targets must not impose burden on the authors of pull requests, or other developers in the community, to maintain the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on a tier 3 target. Do not send automated messages or notifications (via any medium, including via `@`) to a PR author or others involved with a PR regarding a tier 3 target, unless they have opted into such messages.

Understood.

> Backlinks such as those generated by the issue/PR tracker when linking to an issue or PR are not considered a violation of this policy, within reason. However, such messages (even on a separate repository) must not generate notifications to anyone involved with a PR who has not requested such notifications.

Understood.

> Patches adding or updating tier 3 targets must not break any existing tier 2 or tier 1 target, and must not knowingly break another tier 3 target without approval of either the compiler team or the maintainers of the other tier 3 target.

I believe I didn't break any other target.

> In particular, this may come up when working on closely related targets, such as variations of the same architecture with different features. Avoid introducing unconditional uses of features that another variation of the target may not have; use conditional compilation or runtime detection, as appropriate, to let each target run code supported by that target.

I think there are no such problems in this PR.