MSVC rustc is unnaturally slower than Linux rustc #66192

Open
alexcrichton opened this issue Nov 7, 2019 · 11 comments
Labels
C-bug Category: This is a bug. I-compiletime Issue: Problems and improvements with respect to compile times. O-windows-msvc Toolchain: MSVC, Operating system: Windows T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Member

While it's generally assumed that build systems on Windows are slower than build systems on Linux, I'm seeing per-crate compile times that are up to nearly 2x slower on a Windows machine than on a Linux machine. These are personal machines I work on and they're not exactly equivalent, but I'm pretty surprised by the 2x difference and wanted to open an issue to see if we can get to the bottom of what's going on.

The specifications of the machines I have are:

  • Linux - Intel(R) Core(TM) i9-7940X CPU @ 3.10GHz, 14-core/28-thread, 64GB ram
  • Windows - Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz, 4-core/8-thread, 32GB ram

I don't really know a ton about Intel CPUs, so I'm not actually sure whether the i9 is expected to be 2x faster than the i7. I wanted to write down some details though to see if others have thoughts. All Cargo commands were executed with -j4 to ensure that neither machine had an unfair parallelism advantage, and also to ideally isolate the effect of hyperthreads.
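For anyone wanting to repeat the measurement, here is a minimal sketch of timing a build with a pinned job count. It is modeled on the `cargo +nightly build --release -p wast -j4` command quoted later in this report; the helper names `build_command` and `time_command` and the `--run` switch are made up for illustration.

```rust
use std::io;
use std::process::Command;
use std::time::{Duration, Instant};

/// Build the Cargo invocation with a fixed job count so that both
/// machines are limited to the same parallelism.
fn build_command(package: Option<&str>, jobs: u32) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("+nightly").arg("build").arg("--release");
    cmd.arg(format!("-j{}", jobs));
    if let Some(name) = package {
        cmd.arg("-p").arg(name);
    }
    cmd
}

/// Run the command to completion and return the elapsed wall-clock time.
fn time_command(cmd: &mut Command) -> io::Result<Duration> {
    let start = Instant::now();
    let status = cmd.status()?;
    let elapsed = start.elapsed();
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("build failed: {}", status),
        ));
    }
    Ok(elapsed)
}

fn main() {
    let mut cmd = build_command(Some("wast"), 4);
    println!("command: {:?}", cmd);
    // Only start the (slow) build when explicitly requested with `--run`.
    if std::env::args().any(|a| a == "--run") {
        match time_command(&mut cmd) {
            Ok(d) => println!("wall time: {:.2}s", d.as_secs_f64()),
            Err(e) => eprintln!("error: {}", e),
        }
    }
}
```

This only captures wall-clock time, so a clean build directory on both machines is needed for the numbers to be comparable.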

I started out by building https://github.com/cranestation/wasmtime/tree/ab3cd945bc2f4626a2fae8eabf6c7108973ce1a5, and the full -Ztimings graph I got was:

For the same project and the same compiler commit the Windows build is nearly 70% slower! I don't think that my CPUs have a 70% performance difference between them, and I don't have a perfect test environment for this, but 70% feels like a huge performance discrepancy between Linux and Windows.

Glancing at the slow-building crates (use the "min unit time" slider to see them more easily) I'm seeing that almost all crates are 2x slower on Windows than on Linux. This doesn't look like a "chalk it up to Windows being slow" issue, and this is where I started thinking that a bug somewhere in rustc and/or LLVM was more likely.

Next up I wanted to try out -Z self-profile on a particular crate. One I wrote recently was the wast crate, which took 13.76s on Linux and 23.05s on Windows. I dug in a bit more building just that crate at https://github.com/alexcrichton/wat/tree/2288911124001d30de0a68e284db9ab010495536/crates/wast.

Here sure enough, the command cargo +nightly build --release -p wast -j4 has a huge discrepancy:

  • Linux - 5.18s
  • Windows - 8.58s
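To put the two pairs of timings on the same scale, here is a small sketch that turns them into a relative slowdown. The function name `slowdown_percent` is just for illustration; the inputs are the numbers quoted above.

```rust
/// Percentage by which `slow` exceeds `fast` (both in seconds).
fn slowdown_percent(fast: f64, slow: f64) -> f64 {
    (slow / fast - 1.0) * 100.0
}

fn main() {
    // (label, Linux seconds, Windows seconds) as reported above.
    let samples = [
        ("wast within the wasmtime build", 13.76, 23.05),
        ("wast built on its own", 5.18, 8.58),
    ];
    for (label, linux, windows) in samples.iter() {
        println!(
            "{}: Windows is {:.1}% slower",
            label,
            slowdown_percent(*linux, *windows)
        );
    }
}
```

Both cases come out at roughly 66-68% slower, which lines up with the "nearly 70%" figure for the whole wasmtime build.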

Next up I tried -Z self-profile, and using measureme I ran summarize diff and got this output, notably:

+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| Item                                        | Self Time     | Item count | Cache hits | Blocked time | Incremental load time |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_thin_lto_optimize                      | +3.86042516s  | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_module_optimize_module_passes          | +3.152410865s | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj                | +1.783877999s | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| codegen_crate                               | +1.021669947s | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_thin_lto_import                        | +245.950489ms | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| codegen_module                              | +220.253166ms | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_module_optimize_function_passes        | +134.256719ms | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_make_bitcode            | +111.530996ms | +0         | +0         | +0ns         | +0ns                  |
+---------------------------------------------+---------------+------------+------------+--------------+-----------------------+

For whatever reason, it appears that LLVM is massively slower on Windows than it is on Linux.
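As a rough check of how much of the difference lands in LLVM, the sketch below sums the self-time deltas from the table, split by whether the item name starts with `LLVM_`. The parsing helper only handles the `s`, `ms`, and `ns` suffixes seen in this table, and the helper names are made up for illustration.

```rust
/// Rows copied from the `summarize diff` table above: (item, self-time delta).
const ROWS: [(&str, &str); 8] = [
    ("LLVM_thin_lto_optimize", "+3.86042516s"),
    ("LLVM_module_optimize_module_passes", "+3.152410865s"),
    ("LLVM_module_codegen_emit_obj", "+1.783877999s"),
    ("codegen_crate", "+1.021669947s"),
    ("LLVM_thin_lto_import", "+245.950489ms"),
    ("codegen_module", "+220.253166ms"),
    ("LLVM_module_optimize_function_passes", "+134.256719ms"),
    ("LLVM_module_codegen_make_bitcode", "+111.530996ms"),
];

/// Parse a delta such as "+3.86042516s" or "+245.950489ms" into seconds.
fn parse_delta_seconds(text: &str) -> Option<f64> {
    let t = text.trim().trim_start_matches('+');
    // Check the two-letter suffixes first, since they also end in "s".
    let (number, scale) = if let Some(n) = t.strip_suffix("ms") {
        (n, 1e-3)
    } else if let Some(n) = t.strip_suffix("ns") {
        (n, 1e-9)
    } else if let Some(n) = t.strip_suffix('s') {
        (n, 1.0)
    } else {
        return None;
    };
    number.parse::<f64>().ok().map(|v| v * scale)
}

/// Return (sum of LLVM_* deltas, sum of all other deltas) in seconds.
fn split_totals(rows: &[(&str, &str)]) -> (f64, f64) {
    let mut llvm = 0.0;
    let mut other = 0.0;
    for (item, delta) in rows {
        let secs = parse_delta_seconds(delta).unwrap_or(0.0);
        if item.starts_with("LLVM_") {
            llvm += secs;
        } else {
            other += secs;
        }
    }
    (llvm, other)
}

fn main() {
    let (llvm, other) = split_totals(&ROWS);
    println!("LLVM items:  +{:.3}s", llvm);
    println!("other items: +{:.3}s", other);
}
```

The `LLVM_*` rows add up to about 9.29s of extra self time against about 1.24s for the two non-LLVM rows. These sums are larger than the 3.4s wall-clock gap, which would be consistent with self time being accumulated across parallel codegen threads, though that reading is an assumption.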

It was at this point that I decided to write up the issue here and get this all down in a report. I suspect that this is either a build system problem with Windows or it's a compiler problem. We're using Clang on Linux but we're not using Clang on Windows yet, so it may be time to make the transition!

@jonas-schievink jonas-schievink added I-compiletime Issue: Problems and improvements with respect to compile times. O-windows-msvc Toolchain: MSVC, Operating system: Windows T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 7, 2019
@alexcrichton
Member Author

Ok I've confirmed that our intention is to compile LLVM with clang-cl.exe, but due to bugs in CI configuration that actually isn't happening. I'll look to fix that!

@jonas-schievink jonas-schievink added T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Nov 7, 2019
@mati865
Contributor

mati865 commented Nov 7, 2019

FWIW Clang can also be used for MinGW, but it will require a few tweaks.

bors added a commit that referenced this issue Nov 7, 2019
Update Clang & build MSVC LLVM with it

This is a general update of our builders to Clang 9, and then it also attempts to tackle a bit of #66192 by building LLVM for rustc with Clang, not with the system `cl.exe` on MSVC.
bors added a commit that referenced this issue Nov 8, 2019
Update Clang & build MSVC LLVM with it

This is a general update of our builders to Clang 9, and then it also attempts to tackle a bit of #66192 by building LLVM for rustc with Clang, not with the system `cl.exe` on MSVC.
@alexcrichton
Member Author

Ok I've done a slightly more scientific test. I spun up two instances on AWS, one Ubuntu and one Windows. They're both using AMD EPYC 7571 CPUs, 4 cores (virtualized). Naturally AWS is very noisy, but the hope is to get a baseline measurement between Windows/Ubuntu which at least gets the differences in CPU out of the way.

Again compiling the wast crate I got ~13s on Ubuntu and ~18s on Windows, again a pretty large discrepancy. That was using 50f8aad.

Using a Windows compiler produced from #66194 I get ~17s, so while compiling with Clang instead of cl.exe is a modest improvement, it doesn't explain the remaining 4-ish seconds of compile time difference. The next thing to check is probably ThinLTO, because we enable that on Linux but we don't enable it anywhere else.

@nagisa
Member

nagisa commented Nov 8, 2019

@alexcrichton how feasible is it to measure user (spent doing work) and system (spent waiting on syscalls) time in seconds for Linux and Windows?

@retep998
Member

retep998 commented Nov 9, 2019

GetProcessTimes provides both the kernel and user times so it is very feasible to do it on Windows at least.
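A minimal sketch of what to do with those values once they are fetched: GetProcessTimes reports kernel and user time as FILETIME structures, which count 100-nanosecond intervals in two 32-bit halves. The `FileTime` struct and helper names below are stand-ins for illustration, the actual Win32 call is left out so the sketch runs on any platform, and the sample numbers are hypothetical rather than measurements from this issue.

```rust
use std::time::Duration;

/// Stand-in for the Win32 FILETIME layout: a 64-bit count of
/// 100-nanosecond intervals split into low and high 32-bit halves.
#[derive(Clone, Copy, Debug)]
struct FileTime {
    low: u32,
    high: u32,
}

impl FileTime {
    /// Combine both halves and convert the tick count to a Duration.
    fn to_duration(self) -> Duration {
        let ticks = ((self.high as u64) << 32) | self.low as u64;
        // One tick is 100 ns; saturate rather than overflow on huge values.
        Duration::from_nanos(ticks.saturating_mul(100))
    }
}

/// Share of CPU time spent in user mode, as a percentage.
fn user_share_percent(kernel: Duration, user: Duration) -> f64 {
    let total = kernel.as_secs_f64() + user.as_secs_f64();
    if total == 0.0 {
        0.0
    } else {
        user.as_secs_f64() / total * 100.0
    }
}

fn main() {
    // Hypothetical sample values, NOT measurements from this issue.
    let kernel = FileTime { low: 2_000_000, high: 0 }; // 0.2s
    let user = FileTime { low: 178_000_000, high: 0 }; // 17.8s
    let (k, u) = (kernel.to_duration(), user.to_duration());
    println!("kernel: {:.3}s, user: {:.3}s", k.as_secs_f64(), u.as_secs_f64());
    println!("user share: {:.3}%", user_share_percent(k, u));
}
```

On Linux the comparable split is what `time` prints as `user` and `sys`.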

@mati865
Contributor

mati865 commented Nov 9, 2019

Maybe shell32 (or a similar lib) is the reason here. Even though Rust avoids them at all costs, LLVM still links it because it won't build without it.

@ojeda
Contributor

ojeda commented Nov 9, 2019

Another potential culprit is Windows Defender scanning the new files as they are produced. Try adding an exclusion for the entire build folder.

@retep998
Member

The shell32 issue just caused process load times to increase, which is noticeable for short-running processes. It would not affect the speed at which LLVM generates code.

@alexcrichton
Member Author

@nagisa 98.962% of the time is spent in user mode, so I don't think this is a kernel difference thing. @mati865 as mentioned, while possible, that's historically only related to startup time. @ojeda given that rustc creates very few files, I suspect that is not the issue.

@ollie27
Member

ollie27 commented Nov 13, 2019

I wonder if the use of jemalloc on Linux but not MSVC could explain some of the difference?

@elibroftw

Can someone take a look at this issue? Rust compilation on GitHub Actions takes significantly longer on Windows than on Linux and macOS.


8 participants