Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows, C++: investigate '/experimental:deterministic' compiler flag #9466

Open
laszlocsomor opened this issue Oct 2, 2019 · 24 comments
Open
Labels
area-Windows Windows-specific issues and feature requests P3 We're not considering working on this, but happy to review a PR. (No assignee) team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: feature request

Comments

@laszlocsomor
Copy link
Contributor

Feature requests: what underlying problem are you trying to solve with this feature?

Investigate MSVC's /experimental:deterministic flag's effect, maybe use it in MSVC crosstool. (Maybe put it behind a Bazel experimental flag.)

Motivation: to make C++ builds deterministic. For example, MSVC doesn't support overriding non-deterministic macros like __DATE__, maybe this flag solves that problem.

What operating system are you running Bazel on?

Windows 10

What's the output of bazel info release?

release 0.29.1

@laszlocsomor laszlocsomor added P3 We're not considering working on this, but happy to review a PR. (No assignee) area-Windows Windows-specific issues and feature requests type: feature request labels Oct 2, 2019
@laszlocsomor
Copy link
Contributor Author

/cc @meteorcloudy @hlopko

@meteorcloudy
Copy link
Member

We have no experience so far.
From https://developercommunity.visualstudio.com/idea/426033/support-for-cls-experimentaldeterministic-becoming.html, looks like this is not yet documented anywhere and is still under development?

@nico
Copy link

nico commented Nov 9, 2019

I found https://developercommunity.visualstudio.com/content/problem/502528/changing-pdb-source-file-paths-with-experimentalde.html which says:

"""PDBs are not deterministic and aren't the intended target of this feature, so there may still be a great many full paths still embedded in the PDBs after doing this. These switches are intended for object file and executable determinism, not PDBs."""

Since executables embed a ref to the PDB in /debug build, it looks like deterministic builds with debug information remain impossible with MSVC.

(Plug: They are possible with clang-cl and lld-link, see http://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html for how.)

@philwo philwo added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Jun 15, 2020
@shuffle2
Copy link

shuffle2 commented Aug 28, 2020

Some other maybe interesting links:

I've been playing around with some trial and error and found that:

  • d1nodatetime: undefs __DATE__, __TIME__, __TIMESTAMP__. user may redefine them.
  • Brepro (should be supplied to all of cl, lib, and link): implies d1nodatetime and sets initial value of those #defines equal to 1. Additionally modifies some data stored in object files to make them binary-identical between builds.
  • experimental:deterministic implies Brepro and issues a warning if problematic code is found (usage of __TIME__, etc.)

Using Brepro, INCREMENTAL:NO, and Z7, I'm able to get outputs to be binary-identical between builds on the same computer, with some exceptions: If a .manifest and/or resource file (.rc) is processed during compilation/linking, it creates a temporary file during processing, and for whatever reason, this temporary path is included in the resulting pdb (obviously the temporary has been deleted by the time the build is done, so...pointless). This can be "solved" by just omitting the .rc or using a side-by-side .manifest instead of embedding it. However, pdbs with these different temporary paths inside are completely interchangeable - the difference does not prevent a debugger loading it for example.

d1trimfile:<path[=ENV_VAR]> (seen in above links) sounds like it was originally intended to trim the __FILE__ definition, which it does seem to do in my testing (handy if you use FC, I guess). As discussed in above links, they seem to have made it apply to paths stored in .obj files referencing a pdb as well, which occurs on non-Z7 builds, i.e. Zi builds. Unfortunately my quick test still shows full pdb path being in the eventual exe file.

However...why is this a problem? The differences in the pdb can be seen to be inconsequential and the exe will be identical apart from the pdb path, which is accounted for by existing tools (like debuggers and symsrv, which dont use a path to index the pdb).

Maybe make a request to msvc team to have d1trimfile also apply to the pdb path stored in exe? Since they recently made a related change perhaps it just needs to be asked for.

@nico
Copy link

nico commented Jan 11, 2021

(You can make the pdb path in the exe relative with /pdbaltpath:%_PDB%.)

@konste
Copy link
Contributor

konste commented Jan 11, 2021

@nico what exactly is `%_PDB%? Where it comes from? I wonder how to I change Bazel MSVC toolchain to use it?

@nico
Copy link

nico commented Feb 27, 2021

@konste: You put that there literally. See https://docs.microsoft.com/en-us/cpp/build/reference/pdbaltpath-use-alternate-pdb-path?view=msvc-160

@dgrunwald
Copy link
Contributor

I got /pdbaltpath:%_PDB% working in a toolchain file:

flag_set(
    actions = all_link_actions,
    flag_groups = [
        flag_group(
            flags = [
                "/Brepro",
                # Tell the linker to embed only the .pdb filename instead of
                # the full path.
                # Note that % has a special meaning for bazel so we need to
                # double it to pass a literal % to the linker.
                "/PDBALTPATH:%%_PDB%%"
            ],
        ),
    ],
),

I still have some trouble fitting /d1trimfile into the toolchain file.

@peakschris
Copy link

@dgrunwald did you get anywhere with /d1trimfile in the toolchain?

@dgrunwald
Copy link
Contributor

No.
As far as I can tell, the only possible workaround is to wrap the compiler with another binary that turns bazel-friendly command line arguments into actual MSVC commandline arguments.

@peakschris
Copy link

@dgrunwald thanks. I've got this working:

clwrap.bat
@set CLEXE=C:/apps/MVS174/VC/Tools/MSVC/14.34.31933/bin/HostX64/x64/cl.exe
@%CLEXE% /d1trimfile:%cd% %*

The next problem we are facing is that Bazel's use of /Fo to control output location appears to cause msvc to write an absolute path into the object file. Utterly bizarre, but removing the /Fo and allowing msvc to write to current directory causes this behaviour to stop. Has anyone seen this?

@malkia
Copy link

malkia commented Apr 11, 2024

@konste
Copy link
Contributor

konste commented Apr 12, 2024

I was experimenting with this recently using MSVC 2022 (compiler version 19.36.32535) and here is what I found.

/experimental:deterministic plus /Brepro given to both compiler and linker achieves the goal of building the primary binary (such as DLL) deterministically, i,e. binaries built on the different machines are identical.

Also /d1trimfile works as expected regardless of /Fo presense which is good. The only catch is the prefix to remove must be specified verbatim, i.e. with backslashes, not forward slashes. The problem: we cannot specify the prefix on the command line as it would immediately make the compilation action non-cacheable unless output_base is set identically everywhere. I am thinking about doing it through the .params file but I am not sure if its content is taking into account for the cache key calculation or not. Also when we set the prefix in the toolchain and let it create .params file the prefix gets escaped (backslashes get doubled) and it is enough for the prefix being not verbatim and /d1trimfile stops working.

Next observation: .obj files for some reason include absolute path to themselves. I don't know why and what what to do with it.

Also PDB files appear to be very different when produced on the different machines.

So there is a progress, but not the final solution unfortunately.

@peakschris
Copy link

Great, information, thanks for sharing. This sounds very similar behaviour to VS2019. I am working around the d1trimfile issue in two ways:

Approach 1) as mentioned above, write a clwrap.bat which figures out the trim directory and adds it to the command line, call clwrap.bat instead of cl.exe in the toolchain (we take a copy of Bazel's autogenerated toolchain and check it in to our codebase).
Approach 2) Write cl.cxx, compile it to cl.exe, and pass it into the toolchain as above.

Approach 1 worked fine and was simple, but confuses coverity's bazel integration. Coverity introspects the build and expects to see something called exactly "cl.exe" invoked. So we are now following Approach 2.

@konste did you try a compile without /Fo? If so, does the absolute path disappear from obj? That is what I see.

@malkia
Copy link

malkia commented Apr 12, 2024

BuildXL uses subst to map out the current folder (build/src) into an stable one.

subst B:\ d:\p\code

Then builds from B:\ - and paths are seeing as if coming from there.

One can go even further (maybe) and split - say source code comes from a:, and buid is in b:, etc.

@malkia
Copy link

malkia commented Apr 12, 2024

On the other hand C# has /pathmap to remap one or more roots to another stable name, I guess C++ (MSVC) is just missing this....

@konste
Copy link
Contributor

konste commented Apr 12, 2024

Approach 1) as mentioned above, write a clwrap.bat which figures out the trim directory and adds it to the command line, call clwrap.bat instead of cl.exe in the toolchain (we take a copy of Bazel's autogenerated toolchain and check it in to our codebase). Approach 2) Write cl.cxx, compile it to cl.exe, and pass it into the toolchain as above.

I understand the idea of the compiler wrapper (cmd or exe), but it does not help because of the full path to object files inside object files. By the way how you handle obj files at execution root when you compile without /Fo?
Do you use /Z7 or /Zi? We use /Z7 to avoid Bazel killing mspdbsrv.exe

@konste did you try a compile without /Fo? If so, does the absolute path disappear from obj? That is what I see.

Unfortunately both with and without /Fo I see absolute path in obj :-( It is very weird that we see different results here. Wonder if the difference is in the compiler version or some other options because /Fo does not seem to make a difference for me.

@konste
Copy link
Contributor

konste commented Apr 12, 2024

On the other hand C# has /pathmap to remap one or more roots to another stable name, I guess C++ (MSVC) is just missing this....

Interestingly I found pathmap string inside MSVC compiler binaries and when I tried to add it to the command line I got

cl : Command line error D8053 : argument to /pathmap must be of the form STR1=STR2 where STR1 is not empty

which makes me believe that it could be supported to some degree, just badly undocumented anywhere.
Do you have a plan how can we use it to make MSVC build cacheable?

@peakschris
Copy link

peakschris commented Apr 12, 2024

I've tried a simple reproducer, and I was using pathmap, and that does seem to be the key:
Preconditions: download sysinternals strings and put at d:\bin

set DIR=d:\workdir\deterministic
mkdir %DIR%\out
cd %DIR%

echo #include ^<stdio.h^> >hello.cxx
echo. >> hello.cxx
echo int hello() { >> hello.cxx
echo     printf("hello"); >> hello.cxx
echo     return 1; >> hello.cxx
echo } >> hello.cxx

cl /experimental:deterministic /pathmap:%DIR%\out=out /c hello.cxx /Fo:%DIR%\out\hello.obj
d:\bin\strings out\hello.obj | grep hello

outputs:

out\hello.obj
hello
?hello@@YAHXZ
$unwind$?hello@@YAHXZ
$pdata$?hello@@YAHXZ

Tested on:

  • Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33521 for x86
  • Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30153 for x64

Note: I am also using d1trimfile and Brepro, I recall they help in other scenarios.

@konste
Copy link
Contributor

konste commented Apr 12, 2024

I was using pathmap, and that does seem to be the key

But we still need compiler wrapper script to set pathmap right?

@peakschris
Copy link

But we still need compiler wrapper script to set pathmap right?

Yes

@malkia
Copy link

malkia commented Apr 12, 2024

Holy... didn't know this worked for C++ too! Thanks for exploring!!!

@konste
Copy link
Contributor

konste commented Apr 13, 2024

BTW link.exe also have undocumented /pathmap parameter and it helped with .exp file (not like we need it) but PDB files are still a problem - they have absolute path to execroot embedded anyway. Ideas?

@peakschris
Copy link

PDB is non-deterministic for us too. My hope is that as it is a leaf artifact this will not matter. Do you see it as being a practical issue in your deployment?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Windows Windows-specific issues and feature requests P3 We're not considering working on this, but happy to review a PR. (No assignee) team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: feature request
Projects
None yet
Development

No branches or pull requests

9 participants