New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial MSVC support for the compiler #25350

Merged
merged 31 commits into from May 20, 2015

Conversation

Projects
None yet
@alexcrichton
Copy link
Member

alexcrichton commented May 12, 2015

Special thanks to @retep998 for the excellent writeup of tasks to be done and @ricky26 for initially blazing the trail here!

MSVC Support

This goal of this series of commits is to add MSVC support to the Rust compiler
and build system, allowing it more easily interoperate with Visual Studio
installations and native libraries compiled outside of MinGW.

The tl;dr; of this change is that there is a new target of the compiler,
x86_64-pc-windows-msvc, which will not interact with the MinGW toolchain at
all and will instead use link.exe to assemble output artifacts.

Why try to use MSVC?

With today's Rust distribution, when you install a compiler on Windows you also
install gcc.exe and a number of supporting libraries by default (this can be
opted out of). This allows installations to remain independent of MinGW
installations, but it still generally requires native code to be linked with
MinGW instead of MSVC. Some more background can also be found in #1768 about the
incompatibilities between MinGW and MSVC.

Overall the current installation strategy is quite nice so long as you don't
interact with native code, but once you do the usage of a MinGW-based gcc.exe
starts to get quite painful.

Relying on a nonstandard Windows toolchain has also been a long-standing "code
smell" of Rust and has been slated for remedy for quite some time now. Using a
standard toolchain is a great motivational factor for improving the
interoperability of Rust code with the native system.

What does it mean to use MSVC?

"Using MSVC" can be a bit of a nebulous concept, but this PR defines it as:

  • The build system for Rust will build as much code as possible with the MSVC
    compiler, cl.exe.
  • The build system will use native MSVC tools for managing archives.
  • The compiler will link all output with link.exe instead of gcc.exe.

None of these are currently implemented today, but all are required for the
compiler to fluently interoperate with MSVC.

How does this all work?

At the highest level, this PR adds a new target triple to the Rust compiler:

x86_64-pc-windows-msvc

All logic for using MSVC or not is scoped within this triple and code can
conditionally build for MSVC or MinGW via:

#[cfg(target_env = "msvc")]

It is expected that auto builders will be set up for MSVC-based compiles in
addition to the existing MinGW-based compiles, and we will likely soon start
shipping MSVC nightlies where x86_64-pc-windows-msvc is the host target triple
of the compiler.

Summary of changes

Here I'll explain at a high level what many of the changes made were targeted
at, but many more details can be found in the commits themselves. Many thanks to
@retep998 for the excellent writeup in rust-lang/rfcs#1061 and @RicK26 for a lot
of the initial proof-of-concept work!

Build system changes

As is probably expected, a large chunk of this PR is changes to Rust's build
system to build with MSVC. At a high level it is an explicit non goal to
enable building outside of a MinGW shell, instead all Makefile infrastructure we
have today is retrofitted with support to use MSVC instead of the standard MSVC
toolchain. Some of the high-level changes are:

  • The configure script now detects when MSVC is being targeted and adds a number
    of additional requirements about the build environment:
    • The --msvc-root option must be specified or cl.exe must be in PATH to
      discover where MSVC is installed. The compiler in use is also required to
      target x86_64.
    • Once the MSVC root is known, the INCLUDE/LIB environment variables are
      scraped so they can be reexported by the build system.
    • CMake is required to build LLVM with MSVC (and LLVM is also configured with
      CMake instead of the normal configure script).
    • jemalloc is currently unconditionally disabled for MSVC targets as jemalloc
      isn't a hard requirement and I don't know how to build it with MSVC.
  • Invocations of a C and/or C++ compiler are now abstracted behind macros to
    appropriately call the underlying compiler with the correct format of
    arguments, for example there is now a macro for "assemble an archive from
    objects" instead of hard-coded invocations of $(AR) crus liboutput.a ...
  • The output filenames for standard libraries such as morestack/compiler-rt are
    now "more correct" on windows as they are shipped as foo.lib instead of
    libfoo.a.
  • Rust targets can now depend on native tools provided by LLVM, and as you'll
    see in the commits the entire MSVC target depends on llvm-ar.exe.
  • Support for custom arbitrary makefile dependencies of Rust targets has been
    added. The MSVC target for rustc_llvm currently requires a custom .DEF
    file to be passed to the linker to get further linkages to complete.

Compiler changes

The modifications made to the compiler have so far largely been minor tweaks
here and there, mostly just adding a layer of abstraction over whether MSVC or a
GNU-like linker is being used. At a high-level these changes are:

  • The section name for metadata storage in dynamic libraries is called .rustc
    for MSVC-based platorms as section names cannot contain more than 8
    characters.
  • The implementation of rustc_back::Archive was refactored, but the
    functionality has remained the same.
  • Targets can now specify the default ar utility to use, and for MSVC this
    defaults to llvm-ar.exe
  • The building of the linker command in rustc_trans::back::link has been
    abstracted behind a trait for the same code path to be used between GNU and
    MSVC linkers.

Standard library changes

Only a few small changes were required to the stadnard library itself, and only
for minor differences between the C runtime of msvcrt.dll and MinGW's libc.a

  • Some function names for floating point functions have leading underscores, and
    some are not present at all.
  • Linkage to the advapi32 library for crypto-related functions is now
    explicit.
  • Some small bits of C code here and there were fixed for compatibility with
    MSVC's cl.exe compiler.

Future Work

This commit is not yet a 100% complete port to using MSVC as there are still
some key components missing as well as some unimplemented optimizations. This PR
is already getting large enough that I wanted to draw the line here, but here's
a list of what is not implemented in this PR, on purpose:

Unwinding

The revision of our LLVM submodule does not seem to implement does not
support lowering SEH exception handling on the Windows MSVC targets, so
unwinding support is not currently implemented for the standard library (it's
lowered to an abort).

It looks like, however, that upstream LLVM has quite a bit more support for SEH
unwinding and landing pads than the current revision we have, so adding support
will likely just involve updating LLVM and then adding some shims of our own
here and there.

dllimport and dllexport

An interesting part of Windows which MSVC forces our hand on (and apparently
MinGW didn't) is the usage of dllimport and dllexport attributes in LLVM IR
as well as native dependencies (in C these correspond to
__declspec(dllimport)).

Whenever a dynamic library is built by MSVC it must have its public interface
specified by functions tagged with dllexport or otherwise they're not
available to be linked against. This poses a few problems for the compiler, some
of which are somewhat fundamental, but this commit alters the compiler to attach
the dllexport attribute to all LLVM functions that are reachable (e.g. they're
already tagged with external linkage). This is suboptimal for a few reasons:

  • If an object file will never be included in a dynamic library, there's no need
    to attach the dllexport attribute. Most object files in Rust are not destined
    to become part of a dll as binaries are statically linked by default.
  • If the compiler is emitting both an rlib and a dylib, the same source object
    file is currently used but with MSVC this may be less feasible. The compiler
    may be able to get around this, but it may involve some invasive changes to
    deal with this.

The flipside of this situation is that whenever you link to a dll and you import
a function from it, the import should be tagged with dllimport. At this time,
however, the compiler does not emit dllimport for any declarations other than
constants (where it is required), which is again suboptimal for even more
reasons!

  • Calling a function imported from another dll without using dllimport causes
    the linker/compiler to have extra overhead (one jmp instruction on x86) when
    calling the function.
  • The same object file may be used in different circumstances, so a function may
    be imported from a dll if the object is linked into a dll, but it may be
    just linked against if linked into an rlib.
  • The compiler has no knowledge about whether native functions should be tagged
    dllimport or not.

For now the compiler takes the perf hit (I do not have any numbers to this
effect) by marking very little as dllimport and praying the linker will take
care of everything. Fixing this problem will likely require adding a few
attributes to Rust itself (feature gated at the start) and then strongly
recommending static linkage on Windows! This may also involve shipping a
statically linked compiler on Windows instead of a dynamically linked compiler,
but these sorts of changes are pretty invasive and aren't part of this PR.

CI integration

Thankfully we don't need to set up a new snapshot bot for the changes made here as our snapshots are freestanding already, we should be able to use the same snapshot to bootstrap both MinGW and MSVC compilers (once a new snapshot is made from these changes).

I plan on setting up a new suite of auto bots which are testing MSVC configurations for now as well, for now they'll just be bootstrapping and not running tests, but once unwinding is implemented they'll start running all tests as well and we'll eventually start gating on them as well.


I'd love as many eyes on this as we've got as this was one of my first interactions with MSVC and Visual Studio, so there may be glaring holes that I'm missing here and there!

cc @retep998, @ricky26, @vadimcn, @klutzy

r? @brson

ricky26 and others added some commits Mar 4, 2015

Very hacky MSVC hacks.
Conflicts:
	mk/platform.mk
	src/librustc/session/config.rs
	src/librustc_back/target/aarch64_apple_ios.rs
	src/librustc_back/target/aarch64_linux_android.rs
	src/librustc_back/target/arm_linux_androideabi.rs
	src/librustc_back/target/arm_unknown_linux_gnueabi.rs
	src/librustc_back/target/arm_unknown_linux_gnueabihf.rs
	src/librustc_back/target/armv7_apple_ios.rs
	src/librustc_back/target/armv7s_apple_ios.rs
	src/librustc_back/target/i386_apple_ios.rs
	src/librustc_back/target/i686_apple_darwin.rs
	src/librustc_back/target/i686_pc_windows_gnu.rs
	src/librustc_back/target/i686_unknown_dragonfly.rs
	src/librustc_back/target/i686_unknown_linux_gnu.rs
	src/librustc_back/target/mips_unknown_linux_gnu.rs
	src/librustc_back/target/mipsel_unknown_linux_gnu.rs
	src/librustc_back/target/mod.rs
	src/librustc_back/target/powerpc_unknown_linux_gnu.rs
	src/librustc_back/target/x86_64_apple_darwin.rs
	src/librustc_back/target/x86_64_apple_ios.rs
	src/librustc_back/target/x86_64_pc_windows_gnu.rs
	src/librustc_back/target/x86_64_unknown_dragonfly.rs
	src/librustc_back/target/x86_64_unknown_freebsd.rs
	src/librustc_back/target/x86_64_unknown_linux_gnu.rs
	src/librustc_back/target/x86_64_unknown_openbsd.rs
	src/librustc_llvm/lib.rs
	src/librustc_trans/back/link.rs
	src/librustc_trans/trans/base.rs
	src/libstd/os.rs
	src/rustllvm/RustWrapper.cpp
rustc_trans: Abstract linker support behind a trait
This trait will be used to correctly build a command line for link.exe with MSVC
and may perhaps one day be used to generate a command line for `lld`, but this
commit currently just refactors the bindings used to call `ld`.
rustc_llvm: Expose setting more DLL storage classes
Currently only `dllexport` is used, but more integration will require using
`dllimport` as well.
@brson

This comment has been minimized.

Copy link
Contributor

brson commented May 12, 2015

Wow.

@alexcrichton alexcrichton force-pushed the alexcrichton:msvc branch from 6ce8c71 to e7d92d2 May 12, 2015

@brson

This comment has been minimized.

Copy link

brson commented on src/libstd/num/f32.rs in 315750a May 12, 2015

The type signatures changed here to return c_double, making these functions differently-typed based on platform. Does this change leak into any public APIs?

This comment has been minimized.

Copy link

brson replied May 12, 2015

Oh, actually that's not what's happening here. These double functions are pure additions. Why are these added?

This comment has been minimized.

Copy link

brson replied May 12, 2015

These are private to this file so can't be used. Seems like this shouldn't have been added.

This comment has been minimized.

Copy link
Author

ricky26 replied May 13, 2015

They're used by the replacements to the ldexpf (and co) functions below - since ldexpf (and co) doesn't actually exist in the MSVC runtime (it's defined as an inline wrapper function)!

@brson

This comment has been minimized.

Copy link

brson commented on src/libstd/num/f32.rs in 4cc025d May 12, 2015

Well that answers that.

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented May 13, 2015

😱

@klutzy

This comment has been minimized.

Copy link
Contributor

klutzy commented May 13, 2015

Wow. Great! Excited to see this.

At a high level it is an explicit non goal to enable building outside of a MinGW shell

minor nit: MinGW shell -> MSYS shell :)

this commit alters the compiler to attach the dllexport attribute to all LLVM functions that are reachable

See also this comment of #7196. The change is, therefore, necessary to MinGW part as well.

The flipside of this situation is that whenever you link to a dll and you import a function from it, the import should be tagged with dllimport.
At this time, however, the compiler does not emit dllimport for any declarations other than constants (where it is required),

This is also true for MinGW/gcc side, isn't it? I've heard that dllimport is an optimization for functions, but a requirement for data. But then I'm curious why we haven't have such issue with gcc.

fn link_rlib(&mut self, lib: &Path) { self.cmd.arg(lib); }
fn add_object(&mut self, path: &Path) { self.cmd.arg(path); }
fn args(&mut self, args: &[String]) { self.cmd.args(args); }
fn build_dylib(&mut self, _out_filename: &Path) { self.cmd.arg("/DLL"); }

This comment has been minimized.

@cmr

cmr May 13, 2015

Member

Shouldn't this say /DLL /OUT:$out_filename?

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

Ah I guess this argument is a little confusing here, but there's a separate function called output_filename which controls the /OUT parameter (this is just responsible for /DLL). On OSX we pass an argument which requires knowledge of the filename which is why this argument exists.

// as there's been trouble in the past of linking the C++ standard
// library required by LLVM. This likely needs to happen one day, but
// in general Windows is also a more controlled environment than
// Windows, so it's not necessarily as critical that this be

This comment has been minimized.

@cmr

cmr May 13, 2015

Member

Typo: s/Windows/Unix/ ?

This comment has been minimized.

@cmr

cmr May 13, 2015

Member

Also, this is more critical than you might think because linking to (native) static libraries requires CRT agreement, and sometimes you want to opt out of linking to the CRT yourself so as to take the one the static library provides.

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

Very interesting! I'll add some words here to that effect.

@vadimcn

This comment has been minimized.

Copy link
Contributor

vadimcn commented May 13, 2015

Nice work!

@alexcrichton: configure seems to have a problem with msvc root path containing spaces. What is the exact command line one is supposed to use?

@klutzy: gnu ld links to dll data using a bit of magic called auto-import (edit: wrong url).


fn link_whole_staticlib(&mut self, lib: &str, _search_path: &[PathBuf]) {
// not supported?
self.link_staticlib(lib);

This comment has been minimized.

@cmr

This comment has been minimized.

@retep998

retep998 May 13, 2015

Member

It would be nice if we didn't have to do that though. Forcing the linker to not strip anything will really hurt binary sizes.

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

Hm I'm not quite sure if MSVC takes the same approach as GNU ld where the arguments are processed in a stateful fashion, but I'll try to investigate this and see what's going on. A special directive may not necessarily be required here (we'll see once I can start running tests), but I agree with @retep998 that we really need to pass /OPT:REF as much as possible.

This comment has been minimized.

@retep998

retep998 May 13, 2015

Member

As far as I can tell link.exe does not care about the order of arguments, so I don't think there's a way to turn off /OPT:REF for just one input.

@cmr

This comment has been minimized.

Copy link
Member

cmr commented May 13, 2015

This is an excellent opportunity to add manifest support and take a stab at #11207 and rust-lang/rfcs#721, at least for MSVC.

configure Outdated
err "msvc root not found, pass --msvc-root or ensure \
a 64-bit cl.exe is in your PATH"
fi
CFG_CL="$CFG_MSVC_ROOT/VC/bin/amd64/cl.exe"

This comment has been minimized.

@retep998

retep998 May 13, 2015

Member

For future reference, there is also a 32-bit cl.exe that can target 64-bit. You know, in case some weirdo really wants to cross compile from a 32-bit OS.

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

I thought I saw inklings of that in my install, do you know if it has a special "help page" that can be used to detect it? Also, is there a canonical default location for the compiler?

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

Although note that an MSVC build still currently requires MSYS2 as it's used to build compiler-rt.lib, and so that requires a 64-bit MSYS2 shell, consequently requiring a 64-bit computer likely (unfortunately)

This comment has been minimized.

@retep998

retep998 May 13, 2015

Member

That toolchain is in x86_amd64 instead of amd64.
You can easily find out where the compiler is using the registry.

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

Is there an easy way to read the registry from this configure script? I'm not sure if there's a nice command or two that MSYS provides us to do so.

This comment has been minimized.

@retep998

retep998 May 14, 2015

Member

Yes, you can use the REG command, which is a standard Windows command line utility. No need for MSYS to be involved.

This comment has been minimized.

@alexcrichton

alexcrichton May 14, 2015

Author Member

Perfect! I've updated to at least discover VS 12.0 and we can add more versions over time.

@cmr

This comment has been minimized.

Copy link
Member

cmr commented May 13, 2015

(Alternatively, use MANIFESTUAC and just close #16455)

@aturon

This comment has been minimized.

Copy link
Member

aturon commented May 13, 2015

@alexcrichton: you are a force of nature.

@richo

This comment has been minimized.

Copy link
Contributor

richo commented May 13, 2015

Between this and musl support, amazing!

base.cpu = "x86-64".to_string();

Target {
// FIXME: Test this. Copied from linux (#2398)

This comment has been minimized.

@retep998

retep998 May 13, 2015

Member

How worried should I be about this comment?

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

It appears that this is just being copied around, I copied this file from x86_64-pc-windows-gnu which has the exact same comment, so I'm not really sure what the implications are unfortunately :(.

I'll add some words here to this effect though.

fn add_object(&mut self, path: &Path) { self.cmd.arg(path); }
fn args(&mut self, args: &[String]) { self.cmd.args(args); }
fn build_dylib(&mut self, _out_filename: &Path) { self.cmd.arg("/DLL"); }
fn gc_sections(&mut self, _is_dylib: bool) { self.cmd.arg("/OPT:REF"); }

This comment has been minimized.

@retep998

retep998 May 13, 2015

Member

Would be nice to specify /OPT:ICF as well to strip read-only data.

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

Sounds like a good idea to me!

self.link_staticlib(lib);
}
fn optimize(&mut self) {
// Needs more investigation of `/OPT` arguments

This comment has been minimized.

@retep998

retep998 May 13, 2015

Member

This would be more along the line of /LTCG. I'm just not sure whether llvm supports the metadata for this kind of LTO.

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

I have a feeling that /LTCG has to do with the IL files that cl.exe produces (e.g. performing LTO by loading the whole world and then optimizing it), so I don't think that we'd benefit from /LTCG, but I'm not 100% certain.

I do suspect, however, that linking native libraries may benefit from /LTCG

This comment has been minimized.

@luser

luser May 13, 2015

Contributor

/LTCG causes link.exe to do whole-program optimization, but if you haven't compiled the input files with /GL to produce IL instead of normal object files it's not going to do anything useful. Additionally, the linker can detect if you're trying to link IL files and forgot to pass /LTCG and will re-start the link for you (albeit at the cost of making your link a little slower).

For a pure Rust compilation/link this is sort of irrelevant.

// GCC reuses the same personality routine as for the other architectures by wrapping it
// with an "API translator" layer (_GCC_specific_handler).

#[cfg(all(windows, target_arch = "x86_64", not(test)))]

This comment has been minimized.

@vadimcn

vadimcn May 13, 2015

Contributor

So what happened to win64 unwinding? I can't find it in neither gcc.rs nor seh.rs.

This comment has been minimized.

@alexcrichton

alexcrichton May 13, 2015

Author Member

Oh gah! There was a transition period where I tried to port everything to using real SEH stuff before I realized our LLVM didn't have support for it, and I forgot to re-add it. I will add these bits back into unwind/rt/gcc.rs.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented May 13, 2015

@klutzy, @vadimcn

See also this comment of #7196. The change is, therefore, necessary to MinGW part as well.

...

This is also true for MinGW/gcc side, isn't it? I've heard that dllimport is an optimization for functions, but a requirement for data. But then I'm curious why we haven't have such issue with gcc.

...

@klutzy: gnu ld links to dll data using a bit of magic called auto-import (edit: wrong url).

Fascinating! The article from @vadimcn was quite informative and I suspect that's why everything "just works" today. I think we may want to even investigate turning off the auto-import flag by default, but it may be a little late in the game to do that.


@vadimcn

@alexcrichton: configure seems to have a problem with msvc root path containing spaces. What is the exact command line one is supposed to use?

Ah yeah I forgot to mention that, but --arg="val with spaces" is currently broken with our configure script, so I've been ensuring that cl.exe is in my PATH and then I invoke ./configure --host=x86_64-pc-windows-msvc,x86_64-pc-windows-gnu in a MSYS2 shell.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented May 13, 2015

@cmr

This is an excellent opportunity to add manifest support and take a stab at #11207 and rust-lang/rfcs#721, at least for MSVC.

...

(Alternatively, use MANIFESTUAC and just close #16455)

I think for now I might hold off on trying to tackle these problems. I suspect that if link.exe has official support for manifest (and if GNU ld does not) that it should be easy enough for builds to use -C link-args, and otherwise adding some sort of interface for this would require adding API surface area to the compiler itself. Definitely possible to solve in the near future, but may best wait for a follow-up PR.

@retep998 retep998 referenced this pull request May 13, 2015

Open

Rust, Windows, and MSVC #1061

18 of 47 tasks complete
@ricky26

This comment has been minimized.

Copy link
Contributor

ricky26 commented May 13, 2015

Crikey - when I named that commit "Very hacky MSVC hacks." it was to try and make sure it didn't make it into a real rust branch without being heavily rebased! Nice work! :D

alexcrichton added some commits May 12, 2015

rustc_llvm: Don't export constants across dlls
For imports of constants across DLLs to work on Windows it *requires* that the
import be marked with `dllimport` (unlike functions where the marker is
optional, but strongly recommended). This currently isn't working for importing
FFI constants across boundaries, however, so the one constant exported from
`rustc_llvm.dll` is now a function to be called instead.
std: Implement aborting stubs for MSVC unwinding
At this time unwinding support is not implemented for MSVC as
`libgcc_s_seh-1.dll` is not available by default (and this is used on MinGW),
but this should be investigated soon. For now this change is just aimed at
getting the compiler far enough to bootstrap everything instead of successfully
running tests.

This commit refactors the `std::rt::unwind` module a bit to prepare for SEH
support eventually by moving all GCC-specific functionality to its own submodule
and defining the interface needed.
std: Mark rust_get_num_cpus as dllexport
This function is imported across the DLL boundary by the libtest dynamic
library, so it has to be marked as dllexport somehow, and for now this is done
with an attribute on the function specifically.
rustc_trans: Apply dllexport attributes for MSVC
This commit modifies the compiler to emit `dllexport` for all reachable
functions and data on MSVC targets, regardless of whether a dynamic library is
being created or not. More details can be found in the commit itself.
std: Don't require rust_try as an exported symbol
This commit adds a small non-generic non-inlineable shim function to
`rt::unwind::try` which the compiler can take care of for managing the exported
symbol instead of having to edit `src/rt/rust_try.ll`
mk: Update `make dist` for MSVC targets
This commit updates the `dist` target for MSVC to not build the mingw components
and to also ensure that the `llvm-ar.exe` binary is ferried along into the right
location for installs.

@alexcrichton alexcrichton force-pushed the alexcrichton:msvc branch from 617d7ef to cb3071b May 19, 2015

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented May 19, 2015

@brson OK I have cleaned up the commits and rebased everything, I'm going to do a full build locally again to make sure it all works, and I will also run this through dev on buildbot to make sure it bootstraps there as well. Once those pass I will r=you (pending other comments)!

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented May 19, 2015

@bors: r=brson cb3071b

bors added a commit that referenced this pull request May 20, 2015

Auto merge of #25350 - alexcrichton:msvc, r=brson
Special thanks to @retep998 for the [excellent writeup](rust-lang/rfcs#1061) of tasks to be done and @ricky26 for initially blazing the trail here!

# MSVC Support

This goal of this series of commits is to add MSVC support to the Rust compiler
and build system, allowing it more easily interoperate with Visual Studio
installations and native libraries compiled outside of MinGW.

The tl;dr; of this change is that there is a new target of the compiler,
`x86_64-pc-windows-msvc`, which will not interact with the MinGW toolchain at
all and will instead use `link.exe` to assemble output artifacts.

## Why try to use MSVC?

With today's Rust distribution, when you install a compiler on Windows you also
install `gcc.exe` and a number of supporting libraries by default (this can be
opted out of). This allows installations to remain independent of MinGW
installations, but it still generally requires native code to be linked with
MinGW instead of MSVC. Some more background can also be found in #1768 about the
incompatibilities between MinGW and MSVC.

Overall the current installation strategy is quite nice so long as you don't
interact with native code, but once you do the usage of a MinGW-based `gcc.exe`
starts to get quite painful.

Relying on a nonstandard Windows toolchain has also been a long-standing "code
smell" of Rust and has been slated for remedy for quite some time now. Using a
standard toolchain is a great motivational factor for improving the
interoperability of Rust code with the native system.

## What does it mean to use MSVC?

"Using MSVC" can be a bit of a nebulous concept, but this PR defines it as:

* The build system for Rust will build as much code as possible with the MSVC
  compiler, `cl.exe`.
* The build system will use native MSVC tools for managing archives.
* The compiler will link all output with `link.exe` instead of `gcc.exe`.

None of these are currently implemented today, but all are required for the
compiler to fluently interoperate with MSVC.

## How does this all work?

At the highest level, this PR adds a new target triple to the Rust compiler:

    x86_64-pc-windows-msvc

All logic for using MSVC or not is scoped within this triple and code can
conditionally build for MSVC or MinGW via:

    #[cfg(target_env = "msvc")]

It is expected that auto builders will be set up for MSVC-based compiles in
addition to the existing MinGW-based compiles, and we will likely soon start
shipping MSVC nightlies where `x86_64-pc-windows-msvc` is the host target triple
of the compiler.

# Summary of changes

Here I'll explain at a high level what many of the changes made were targeted
at, but many more details can be found in the commits themselves. Many thanks to
@retep998 for the excellent writeup in rust-lang/rfcs#1061 and @RicK26 for a lot
of the initial proof-of-concept work!

## Build system changes

As is probably expected, a large chunk of this PR is changes to Rust's build
system to build with MSVC. At a high level **it is an explicit non goal** to
enable building outside of a MinGW shell, instead all Makefile infrastructure we
have today is retrofitted with support to use MSVC instead of the standard MSVC
toolchain. Some of the high-level changes are:

* The configure script now detects when MSVC is being targeted and adds a number
  of additional requirements about the build environment:
  * The `--msvc-root` option must be specified or `cl.exe` must be in PATH to
    discover where MSVC is installed. The compiler in use is also required to
    target x86_64.
  * Once the MSVC root is known, the INCLUDE/LIB environment variables are
    scraped so they can be reexported by the build system.
  * CMake is required to build LLVM with MSVC (and LLVM is also configured with
    CMake instead of the normal configure script).
  * jemalloc is currently unconditionally disabled for MSVC targets as jemalloc
    isn't a hard requirement and I don't know how to build it with MSVC.
* Invocations of a C and/or C++ compiler are now abstracted behind macros to
  appropriately call the underlying compiler with the correct format of
  arguments, for example there is now a macro for "assemble an archive from
  objects" instead of hard-coded invocations of `$(AR) crus liboutput.a ...`
* The output filenames for standard libraries such as morestack/compiler-rt are
  now "more correct" on windows as they are shipped as `foo.lib` instead of
  `libfoo.a`.
* Rust targets can now depend on native tools provided by LLVM, and as you'll
  see in the commits the entire MSVC target depends on `llvm-ar.exe`.
* Support for custom arbitrary makefile dependencies of Rust targets has been
  added. The MSVC target for `rustc_llvm` currently requires a custom `.DEF`
  file to be passed to the linker to get further linkages to complete.

## Compiler changes

The modifications made to the compiler have so far largely been minor tweaks
here and there, mostly just adding a layer of abstraction over whether MSVC or a
GNU-like linker is being used. At a high-level these changes are:

* The section name for metadata storage in dynamic libraries is called `.rustc`
  for MSVC-based platorms as section names cannot contain more than 8
  characters.
* The implementation of `rustc_back::Archive` was refactored, but the
  functionality has remained the same.
* Targets can now specify the default `ar` utility to use, and for MSVC this
  defaults to `llvm-ar.exe`
* The building of the linker command in `rustc_trans:🔙:link` has been
  abstracted behind a trait for the same code path to be used between GNU and
  MSVC linkers.

## Standard library changes

Only a few small changes were required to the stadnard library itself, and only
for minor differences between the C runtime of msvcrt.dll and MinGW's libc.a

* Some function names for floating point functions have leading underscores, and
  some are not present at all.
* Linkage to the `advapi32` library for crypto-related functions is now
  explicit.
* Some small bits of C code here and there were fixed for compatibility with
  MSVC's cl.exe compiler.

# Future Work

This commit is not yet a 100% complete port to using MSVC as there are still
some key components missing as well as some unimplemented optimizations. This PR
is already getting large enough that I wanted to draw the line here, but here's
a list of what is not implemented in this PR, on purpose:

## Unwinding

The revision of our LLVM submodule [does not seem to implement][llvm] does not
support lowering SEH exception handling on the Windows MSVC targets, so
unwinding support is not currently implemented for the standard library (it's
lowered to an abort).

[llvm]: https://github.com/rust-lang/llvm/blob/rust-llvm-2015-02-19/lib/CodeGen/Passes.cpp#L454-L461

It looks like, however, that upstream LLVM has quite a bit more support for SEH
unwinding and landing pads than the current revision we have, so adding support
will likely just involve updating LLVM and then adding some shims of our own
here and there.

## dllimport and dllexport

An interesting part of Windows which MSVC forces our hand on (and apparently
MinGW didn't) is the usage of `dllimport` and `dllexport` attributes in LLVM IR
as well as native dependencies (in C these correspond to
`__declspec(dllimport)`).

Whenever a dynamic library is built by MSVC it must have its public interface
specified by functions tagged with `dllexport` or otherwise they're not
available to be linked against. This poses a few problems for the compiler, some
of which are somewhat fundamental, but this commit alters the compiler to attach
the `dllexport` attribute to all LLVM functions that are reachable (e.g. they're
already tagged with external linkage). This is suboptimal for a few reasons:

* If an object file will never be included in a dynamic library, there's no need
  to attach the dllexport attribute. Most object files in Rust are not destined
  to become part of a dll as binaries are statically linked by default.
* If the compiler is emitting both an rlib and a dylib, the same source object
  file is currently used but with MSVC this may be less feasible. The compiler
  may be able to get around this, but it may involve some invasive changes to
  deal with this.

The flipside of this situation is that whenever you link to a dll and you import
a function from it, the import should be tagged with `dllimport`. At this time,
however, the compiler does not emit `dllimport` for any declarations other than
constants (where it is required), which is again suboptimal for even more
reasons!

* Calling a function imported from another dll without using `dllimport` causes
  the linker/compiler to have extra overhead (one `jmp` instruction on x86) when
  calling the function.
* The same object file may be used in different circumstances, so a function may
  be imported from a dll if the object is linked into a dll, but it may be
  just linked against if linked into an rlib.
* The compiler has no knowledge about whether native functions should be tagged
  dllimport or not.

For now the compiler takes the perf hit (I do not have any numbers to this
effect) by marking very little as `dllimport` and praying the linker will take
care of everything. Fixing this problem will likely require adding a few
attributes to Rust itself (feature gated at the start) and then strongly
recommending static linkage on Windows! This may also involve shipping a
statically linked compiler on Windows instead of a dynamically linked compiler,
but these sorts of changes are pretty invasive and aren't part of this PR.

## CI integration

Thankfully we don't need to set up a new snapshot bot for the changes made here as our snapshots are freestanding already, we should be able to use the same snapshot to bootstrap both MinGW and MSVC compilers (once a new snapshot is made from these changes).

I plan on setting up a new suite of auto bots which are testing MSVC configurations for now as well, for now they'll just be bootstrapping and not running tests, but once unwinding is implemented they'll start running all tests as well and we'll eventually start gating on them as well.

---

I'd love as many eyes on this as we've got as this was one of my first interactions with MSVC and Visual Studio, so there may be glaring holes that I'm missing here and there!

cc @retep998, @ricky26, @vadimcn, @klutzy 

r? @brson
@bors

This comment has been minimized.

Copy link
Contributor

bors commented May 20, 2015

⌛️ Testing commit cb3071b with merge 43cf733...

@bors bors merged commit cb3071b into rust-lang:master May 20, 2015

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@aturon

This comment has been minimized.

Copy link
Member

aturon commented May 20, 2015

\o/ Congrats @alexcrichton! Now enjoy some well-deserved time off :)

@alexcrichton alexcrichton deleted the alexcrichton:msvc branch May 26, 2015

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented May 26, 2015

❤️

@vadimcn

This comment has been minimized.

Copy link
Contributor

vadimcn commented on configure in 7cf0b17 Jun 1, 2015

@alexcrichton: My configure is failing this check, but it doesn't look like python is changing the path, but rather that pwd prints it in the unix format to begin with.

This comment has been minimized.

Copy link
Member Author

alexcrichton replied Jun 1, 2015

Oh bah, this is relying being run in a MSYS shell so the Unix path being printed out by pwd is translated to a Windows path by the shell (passed as an argument to Python), and then ensures that it's not re-translated by python again. I wonder if this could perhaps just be a hardcoded C:\foo argument?

This comment has been minimized.

Copy link
Contributor

vadimcn replied Jun 1, 2015

So it's relying on this? Hmm, that might explain it. So far I've bee using msys2's version of python while building Rust, and it worked without problems. I take it, you are using non-msys python?
What about cmake? msys or non-msys?

I would encourage you to document the environment required to build msvc target. :-) Clearly, it isn't what it says here.

This comment has been minimized.

Copy link
Member Author

alexcrichton replied Jun 2, 2015

I take it, you are using non-msys python?

There are two MSYS pythons, and there's also a non-MSYS python (e.g. download directly). This is attempting to detect the one broken MSYS python, but all others should work.

What about cmake? msys or non-msys?

Both should work just fine

I would encourage you to document the environment required to build msvc target. :-)

I'd prefer to actually just get it to work everywhere, as if it's not working it's definitely a bug! After some more testing it appears the problem here is a little more subtle than I originally thought:

  • Of the three pythons, /usr/bin/python, /mingw64/bin/python, C:/PythonXX/python, all work.
  • If sh runs /usr/bin/python, it does not work

The problem is that when sh runs /usr/bin/python with an argument of a unix path, it does not auto-translate the argument to Windows path, unlike all other configurations. As to why, I have no idea!

Can you try modifying this check to see if this works for you?

if /bin/sh -c "$CFG_PYTHON -c 'import sys; print sys.argv[1]' /foo"  | grep '^/'

This comment has been minimized.

Copy link
Contributor

vadimcn replied Jun 2, 2015

The problem is that when sh runs /usr/bin/python with an argument of a unix path, it does not auto-translate the argument to Windows path, unlike all other configurations. As to why, I have no idea!

I think the above link actually explains why: /usr/bin/python depends on msys-2.0.dll, and is therefore considered a native posix program, which gets the /c/foo/bar version.

Do you know, at what stage do the posix paths break cmake? Can we use cygpath -w <path> to convert path explicitly?

This comment has been minimized.

Copy link
Member Author

alexcrichton replied Jun 2, 2015

Ah yep, that'd do it! Unfortunately if MSYS does not translate a Unix path to a Windows path for the python-in-use, it will end up leaking into LLVM's build system as a Unix path, and that path is then emitted in turn to a file which is later read by CMake. CMake then tries to read a file at a Unix path, which definitely doesn't exist, so it chokes.

The specific path taken is:

I'm not 100% sure, but it seems that the current directory of CMake is Unix-style due to being run from MSYS (maybe?) and this then leaks into python via __file__, and eventually leaks into CMake as well.

If there's a way to force windows style paths all the way, then that'd be great! Unfortunately last time I tried that I couldn't figure out a way to do so :\

This comment has been minimized.

Copy link
Contributor

vadimcn replied Jun 2, 2015

CMake probably isn't at fault here. It seems that the act of importing a module in msys python is what causes the conversion of path into unix format:
c:\rust\foo.py: print(__file__); import bar
c:\rust\bar.py: print(__file__)

C:\> python2.7 c:\rust\foo.py
c:\rust\foo.py
/c/rust/bar.py

I don't think there is a way to force windows paths in msys, but here's something that sort of works:

C:\> mount c:\rust /rust
mount: warning - /rust does not exist.
C:\> python2.7 c:\rust\foo.py
c:\rust\foo.py
/rust/bar.py

Apparently the /rust mount point wins over /c/rust when converting c:\rust to unix style.
/rust/... can be fed to most native windows programs, including cmake, without problems; they just interpret it as a current drive-relative path.

Thoughts?

This comment has been minimized.

Copy link
Member Author

alexcrichton replied Jun 2, 2015

I'd probably err on the side of avoiding the mount, but being able to detect a MSYS python through the imports reliably sounds like it's more beneficial than this check today. We could include some src/etc scripts which just print out __file__ and detect if a unix path comes out?

We could also probably improve this error message (e.g. suggest an alternate python package)

@ghost

This comment has been minimized.

Copy link

ghost commented Jun 15, 2015

@alexcrichton,

I have two suggestions:

  1. Declare VS2015 (currently RC, will be RTMd before Windows 10 launch: 29th July?) as minimum requirement to build Rust source code.
  2. Remove all the guarded inclusions and C99 fallback implementations for VS < v2015 and avoid all (well most of) the non-std code for MSVC. ;)

Since there is VS2015's full featured 'community' and 'express-for-desktop' versions available for free of cost, the Windows users will have enough choices: MinGW, VS2015 pro/ultimate/express-for-desktop/community to build the Rust solution.

Besides, you guys have made the win32 installer available for mainstream consumer. Building from scratch is only for enthusiasts and contributors, so I don't see it as a real deal-breaker.

Note that two main Windows supporting CI systems: AppVeyor and Jenkins already support VS2015 RC. :)

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jun 15, 2015

@jasonwilliams200OK thanks for the suggestions! So far we have so little C code we haven't run too much into VS compatibility just yet, but seems like sound advice regardless! We do actually currently run in AppVeyor today as well (I set up CI for a bunch of our crates over the weekend), so we're looking pretty good on that front!

In terms of the CRT stuff has some discussion, but we may try to actually be totally independent of the CRT in the standard library, which would be quite nice!

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Jun 15, 2015

Declare VS2015 (currently RC, will be RTMd before Windows 10 launch: 29th July?) as minimum requirement to build Rust source code.

Yeah, I have the same feeling, mostly because of the CRT rewrite, but I don't know what would be the practical consequences of not supporting VS2013 (and <VS2013 can't be used anyway because it can't build LLVM).

@ghost

This comment has been minimized.

Copy link

ghost commented Jun 15, 2015

we may try to actually be totally independent of the CRT in the standard library, which would be quite nice!

@alexcrichton, sounds like a great plan. 👍
There are some more cool bits around CRT and such:

@petrochenkov, while installing VS 2015 requires at least Windows 7, it can build for targets as far back as Windows XP (info for the RC, but that shouldn't change for RTM).

@luser

This comment has been minimized.

Copy link
Contributor

luser commented Jun 16, 2015

If VS2015 is only required to build the toolchain, and not a requirement for linking Rust code with the compiler then it's probably fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment