Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Produce MS Coff by default when targetting windows #13110

Merged
merged 8 commits into from
Jan 21, 2022

Conversation

thewilsonator
Copy link
Contributor

This rebases #12825 but doesn't attempt to rename phobos libraries and doesn't change the build infra where possible(and doesn't have a large diff in link.d). See if this fixes the CI issues.

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @thewilsonator! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#13110"

@thewilsonator thewilsonator added the Review:Needs Changelog A changelog entry needs to be added to /changelog label Sep 30, 2021
@thewilsonator thewilsonator force-pushed the rmomf-2 branch 3 times, most recently from 5ce61cb to 920d8d5 Compare September 30, 2021 06:31
@thewilsonator thewilsonator removed the Review:Needs Changelog A changelog entry needs to be added to /changelog label Sep 30, 2021
changelog/fix18964.dd Outdated Show resolved Hide resolved
@Geod24
Copy link
Member

Geod24 commented Sep 30, 2021

As much as I agree we should go in this direction, @WalterBright 's opposition is with the very essence of this PR.

@thewilsonator
Copy link
Contributor Author

#12825 had a misleading title, and I can understand, to a certain extent, any opposition based on misunderstandings from that PR.

That being said, none of Walter's points are backed up by current reality:

the Microsoft toolchain does not come with Windows.
we ship lld (LLVM based linker) and minGW import libraries on windows. Lack of toolchain is not a problem.

  1. Versions. Every customer has a different version of that linker. Some would work, some wouldn't. I couldn't just ship someone with a non-working MS-Link a linker that did work.
  2. Bugs. Microsoft placed no priority whatsoever on fixing bugs. I was faced with being dead in the water for indeterminate times. Obviously, this was untenable.
  3. The compiler didn't work out of the box. I had to tell customers they had to buy some MS dev system to get a linker.
  4. When MS-Link didn't work, or crashed, etc., I always got blamed.

#12825 (comment)

@WalterBright
Copy link
Member

Leaving it the way it is allows us to control the out-of-box experience. Relying on the Microsoft toolchain by default does not. Minimizing the friction for people wanting to try out D is critical.

@maxhaton
Copy link
Member

Leaving it the way it is allows us to control the out-of-box experience. Relying on the Microsoft toolchain by default does not. Minimizing the friction for people wanting to try out D is critical.

Isn't Nicholas's point that we don't depend on the Microsoft toolchain because we ship LLVM's linker and minGW?

@WalterBright
Copy link
Member

If I recall correctly, much of the impetus for this came from vibe.d causing optlink to fail. This turned out to not be a problem with optlink, but was caused by replacing the system lib files meant for optlink (in OMF format) with lib files from Microsoft, which were in MS-COFF format. The solution was to simply run coffimplib.exe on the new system .lib files, which is the responsibility of our team, not the customer. coffimplib converts the files to OMF format.

@WalterBright
Copy link
Member

Microsoft runtime libraries and the Microsoft linker are required to compile with the -m32mscoff switch. They are not part of the Windows operating system distribution and are not on the user's machine by default. Ideally,

dmd hello

should be the user's initial experience after unzipping the download. Not "linker not found" and "library not found" errors. One could argue this isn't a big deal, but in my experience, you'll lose half the customers right off the bat with that.

@WalterBright
Copy link
Member

Recall the success Python had with "batteries included" (and consumer products in general - I remember when things didn't come with batteries.)

@adamdruppe
Copy link
Contributor

dmd -m32mscoff has worked out of the box for some time now. The dmd distribution has included the batteries for.... a year or two.

@WalterBright
Copy link
Member

Ok. I have dmd 2.092 installed on my machine. I wrote:

import std.stdio;
void main {
    writeln("hello world\n");
}

compiling:

dmd hello -m32mscoff
lld-link: error: undefined symbol: __imp__InterlockedIncrement@4
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\tidtable.c:402
>>>               libcmt.lib(tidtable.obj):(__initptd)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\mbctype.c:524
>>>               libcmt.lib(mbctype.obj):($LN14)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\mbctype.c:618
>>>               libcmt.lib(mbctype.obj):(__setmbcp)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\localref.c:39
>>>               libcmt.lib(localref.obj):(___addlocaleref)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\setlocal.c:463
>>>               libcmt.lib(setlocal.obj):($LN10)

lld-link: error: undefined symbol: __imp__InterlockedDecrement@4
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\tidtable.c:596
>>>               libcmt.lib(tidtable.obj):(__freefls@4)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\mbctype.c:511
>>>               libcmt.lib(mbctype.obj):($LN14)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\mbctype.c:612
>>>               libcmt.lib(mbctype.obj):(__setmbcp)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\mbctype.c:638
>>>               libcmt.lib(mbctype.obj):(__setmbcp)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\localref.c:79
>>>               libcmt.lib(localref.obj):(___removelocaleref)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\initctyp.c:195
>>>               libcmt.lib(initctyp.obj):(___init_ctype)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\initctyp.c:227
>>>               libcmt.lib(initctyp.obj):($error_cleanup$28334)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\setlocal.c:260
>>>               libcmt.lib(setlocal.obj):(__free_locale)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\setlocal.c:905
>>>               libcmt.lib(setlocal.obj):(__setlocale_get_all)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\setlocal.c:921
>>>               libcmt.lib(setlocal.obj):($LN26)
>>> referenced 5 more times

lld-link: error: undefined symbol: __imp__InterlockedExchange@8
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\setmode.c:149
>>>               libcmt.lib(setmode.obj):(__set_fmode)
>>> referenced by f:\dd\vctools\crt_bld\self_x86\crt\src\rand_s.c:94
>>>               libcmt.lib(rand_s.obj):(_rand_s)
Error: linker exited with status 1

There is no f:\dd directory on my system.

dmd hello
E:\dmd2.092\windows\bin>dmd hello

E:\dmd2.092\windows\bin>hello
hello world


E:\dmd2.092\windows\bin>

@WalterBright
Copy link
Member

P.S. I was in the e:\dmd2.092\windows\bin directory.

@thewilsonator
Copy link
Contributor Author

what is the linker command issued by dmd? because the libraries are right there in the install

@WalterBright
Copy link
Member

what is the linker command issued by dmd?

E:\dmd2.092\windows\bin\lld-link.exe /NOLOGO "hello.obj"   /DEFAULTLIB:phobos32mscoff /OPT:NOICF  /LIBPATH:"C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\lib" /LIBPATH:".\..\lib32mscoff\mingw" /SAFESEH:NO

The weird f:\dd path is another problem. No, it is not in any of my environment variables.

@adamdruppe
Copy link
Contributor

it is kinda weird that it includes a MSVC path in your link. that doesn't happen on my out-of-the-box setup that just works.

@thewilsonator
Copy link
Contributor Author

Do you have a bjorked VC installation?

On a freshly installed clean windows box with no VS installation it should be picking up the _InterlockedIncrement function in .\..\lib32mscoff\mingw\kernel32.lib.

@WalterBright
Copy link
Member

So I need to delete my VC installation in order to use dmd? It's hardly bjorked, it's the totally default install. Also, when I use it, I use a separate command prompt with the VC environment variables. The command prompt I was running did not set that environment.

I still don't know where it is getting things from, nor where that f:\dd comes from.

If you say "you shouldn't be using VC anymore, it's obsolete, it won't work with dmd" ok, but that sort of thing was exactly why I got optlink in the first place. Telling customers they had to change which Microsoft dev products they had to use was a total non-starter with them. It invariably produced a "no".

@adamdruppe
Copy link
Contributor

adamdruppe commented Oct 1, 2021 via email

@thewilsonator
Copy link
Contributor Author

nor where that f:\dd comes from.

FWIW, that's probably a Microsoft build farm internal path, I've seen similar things with with Apple and error messages from developer tool crashing and from kernel panics from all systems. Unless you built them, the debug info for e.g. the symbols corresponding to the C runtime startup routines weren't build on your machine and won't make sense on your filesystem.

@WalterBright
Copy link
Member

Did you change your sc.ini?

No. I just unzipped it.

E:\dmd2.092\windows\bin>type sc.ini
[Version]
version=7.51 Build 020


; environment for both 32/64 bit
[Environment]
DFLAGS="-I%@P%\..\..\src\phobos" "-I%@P%\..\..\src\druntime\import"

; optlink only reads from the Environment section so we need this redundancy
; from the Environment32 section (bugzilla 11302)
LIB="%@P%\..\lib"


[Environment32]
LIB="%@P%\..\lib"
LINKCMD=%@P%\optlink.exe


[Environment64]
LIB=%@P%\..\lib64
; needed to avoid COMDAT folding (bugzilla 10664)
DFLAGS=%DFLAGS% -L/OPT:NOICF

; -----------------------------------------------------------------------------
[Environment32mscoff]
LIB=%@P%\..\lib32mscoff
; needed to avoid COMDAT folding (bugzilla 10664)
DFLAGS=%DFLAGS% -L/OPT:NOICF

@thewilsonator
Copy link
Contributor Author

The command prompt I was running did not set that environment.

did you try one that does set them? You're supposed to use that one for using VS tools and things that rely on it. Perhaps detection of non-VS envvar command lines don't work when you have VS installed?

@WalterBright
Copy link
Member

06/13/2020  12:38 AM        33,119,228 dmd.2.092.1.windows.7z

@thewilsonator
Copy link
Contributor Author

did you try one that does set them? You're supposed to use that one for using VS tools and things that rely on it. Perhaps detection of non-VS envvar command lines don't work when you have VS installed?

To put that another way:
It is not unreasonable to expect users who have VS installed to run commands related to programming things from a VS command line. Similarly we expect people who have not installed VS to run such commands from a console that does not have the VS env vars set. For users that have VS installed the default libraries used should be the ones built by MS, the minGW ones are a fallback for a reason.

However I suspect that the edge case of having a VS installation AND not using a VS console is not handled properly. i.e. it detects a VS installation (which is correct, there is one) but does not account for the fact the env vars (or whatever the underlying issue is) are not properly set.

@thewilsonator
Copy link
Contributor Author

This is now green.

@RazvanN7
Copy link
Contributor

RazvanN7 commented Jan 18, 2022

Since @WalterBright has expressed concern about this PR, I think that ultimately he should be the one to accept it.

@thewilsonator
Copy link
Contributor Author

All of his concerns have been addressed. If he has new ones then he has best speak up soon.

@RazvanN7 RazvanN7 added the Merge:72h no objection -> merge The PR will be merged if there are no objections raised. label Jan 18, 2022
@thewilsonator
Copy link
Contributor Author

@RazvanN7 times up. Probably best to squash this.

@RazvanN7 RazvanN7 removed the Merge:72h no objection -> merge The PR will be merged if there are no objections raised. label Jan 21, 2022
@RazvanN7 RazvanN7 merged commit 7d1f444 into dlang:master Jan 21, 2022
@thewilsonator thewilsonator deleted the rmomf-2 branch January 21, 2022 09:25
@WalterBright
Copy link
Member

Didn't try: "can you try a larger executable?"

@adamdruppe
Copy link
Contributor

It is hard to find a large program that actually even works with optlink. The biggest programs I have - two day job applications - just crash the linker.

So gotta grab something medium sized from my other computer and I keep having other things to do. I'll find something today though.

But frankly if the linker crashes for the commercial applications and loses the speed war for simple applications, it isn't a great default regardless of how fast it is for medium sized things.

@adamdruppe
Copy link
Contributor

Pain to test with my new game website application since I'd have to convert the database library file format for it.

I know, I can do it, but I mention it because this is a pain new users to D experience pretty frequently too. "just use -m32mscoff lol" is the easier answer than "buy the extended utilities package and run the program"

@adamdruppe
Copy link
Contributor

I abandoned optlink entirely on my cross-compilation setup because it would just lock up about 30% of builds forcing me to cancel and retry when it seemed to be taking forever. The -m32mscoff build, which uses lld-link by default in the dmd distribution for quite some time now, just works consistently. Never seen it lock up like that.

@adamdruppe
Copy link
Contributor

hah i assumed my web application would be bigger, but it came in at 1.2 MB. I just write overly efficient code apparently.

Nevertheless, its numbers:

MS Link full build: 2.4 s
Optlink full build: 2.4 s
lld-link full build: 2.4s

All within the margin of measurement error.

Well, the nice thing about the web framework is it generates code automatically. Maybe I can just static foreach and add a few thousand functions, then let its templates expand that to more. But the problem is... I write efficient code and avoid unnecessary codegen since I can't stand overly slow compiles. Still worth a try.

Adding 2,000 functions brings it up to 3.6 MB. Still not as big as the day job apps by far, but again, they don't even work at all with optlink alas.

Here we go. Note that linking is going to be a relatively small minority of this time since most of it is spent generating the 2000 functions then generating the 2000 wrappers to present them to the web. Anyway:

optlink full build: 32.4s (so much for my efficient code)
mslink full build: 24.1s ....... ok im surprised, maybe more time spent in the linker than i thought.
lld-link full build: 24.3s

..... optlink got blown out. I expected the difference would be within a second or two again. Running it a few more times does show some variation: 31.6 s, 32.8s. MS and lld showed a peak of 27s. We have a margin of error here of 1-3s, but they still consistently win. The slowest MS or LLD link time is still several seconds faster than the fastest optlink time.

But both MS and lld link are well ahead. Doing a dmd -c thing just to compare that ... 24-27s. The link time is negligible with them, 99% of the time spent in dmd instead of in link (which is exactly what I thought would happen, 2000 functions still isn't a terribly large symbol table), but with optlink we're spending about 15% of the full build time in link alone.

No wonder it has no hope of linking the day job application, which takes 5-10 seconds in the lld-link phase alone with a godawful number of symbols.

I wouldn't be surprised if there's just a linear scan in optlink causing n^2 behavior that could be optimized to a hash table or something to make it competitive again.

But....... who cares? Speed is the least of its problems.

I like optlink. I used it for some seventeen years, the latter ones being when people were pushing lld-link but it didn't just work, so dmd -m32 with optlink was a default I defended since it actually did work out of the box for many things. I still go back to 16 bit DOS code sometimes too where it is fun.

But there's just no reason to keep it the default for dmd anymore. lld-link is packaged in the box and just works both in simple cases and more advanced cases with no need to convert library formats and it consistently does as well or a better job in speed.

@WalterBright
Copy link
Member

@adamdruppe thanks, this is good information. The two remaining reasons I wanted to keep optlink were having it "in the box", and it being faster. These are no longer the case, so I'm good with leaving it behind.

I could fix it, but the way it is written would consume hundreds of hours of my time (converting it to D). It's not worth it.

It's nice that finally other linkers caught up with it. It only took about 30 years :-) That's quite an accomplishment.

@WalterBright
Copy link
Member

P.S. Nearly all of its bugs appear to be capacity problems. Programs have increased enormously in size.

@adamdruppe
Copy link
Contributor

Yeah, the other linkers were brutally slow until quite recently, I think it was only about ten years ago that the gold linker became usable and kicked the others back into gear making an effort.

But D programs too tend to generate a large number of symbols. I try to code my things to keep that under control, using techniques like ctfe-only functions or type erasure forwarding. But it can still add up pretty fast.

@maxhaton
Copy link
Member

maxhaton commented Jan 23, 2022 via email

@RazvanN7
Copy link
Contributor

@thewilsonator Did this PR cause this [1]:

lib\druntime.lib: Error: Object module `errno_c_32.obj` is 32 bit OMF, but it should be 64 bit MS-Coff
lib\druntime.lib: Error: Object module `minit.obj` is 32 bit OMF, but it should be 64 bit MS-Coff

It seems that the phobos azure testing pipeline fails with this.

[1] https://dev.azure.com/dlanguage/Phobos/_build/results?buildId=27539&view=logs&j=8d924a33-246f-5b0d-e08e-f17990ff8db8&t=8958c708-7ea3-5ebf-c999-469c7dc97bb7

@thewilsonator
Copy link
Contributor Author

dlang/druntime#3703

@schveiguy
Copy link
Member

FYI, there is a potentially significant problem with targeting mscoff by default with -betterC: MS stdio (at least the way we use it) requires runtime hooks to set up stdout/stdin/stderr. See some discussion here: https://forum.dlang.org/post/zqfesvprqujdcvhdedgd@forum.dlang.org

Yes, this is happening before this PR is in the released compiler, but dub has been building using mscoff by default for a while, which could be causing this problem.

I'm not 100% sure, because I don't know what his environment looks like. He says it works on 2.091.0, but the dub switch to mscoff by default was done before that release. Which is why I said "potential". We should at least investigate if it's possible to use libc without druntime for -betterC.

@strawberry9
Copy link

strawberry9 commented Feb 9, 2022

dmd -m32mscoff -betterC (no, it does not play well together - i.e the stdout/stdin/stderr thing)

and you can forget about -betterC with -m64 ... (again, the stdout/stdin/stderr thing)..

IMO. it's a complete mess on Windows. I don't like it. It's a real turnoff.

I would prefer to have 100% reliance on the free MS toolchain (i.e. get rid of optlink, lld, and mingw completely - as all they really do, is combine in different way to create a confusing development environment on windows, IMO.)

The free MS toolchain is really not that hard to install, very easy to script, and not hard to get the correct ENV settings either.

D on Windows, should be 64bit by default, and rely 100% on the MS toolchain.

btw. IMO, this comment from Walter last year, is an irrelvant objection (for why dmd needs to continue shipping with a mulitude of toolchains, rather than relying on the buildchain that comes with the platform) :

#12825 (comment)

Additionally, this argument by Walter, for why D needs to continue shipping with this confusing, multitude of toolchains, just to avoid relying on MS toolchain, is also irrelvant IMO (i.e. the friction for developers on Windows, comes precisely from D not relying on the platforms toolchains, but instead shipping its own multitude of toolchains.) :

#13110 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.