ABI breakages on ucrt64 #9363

Closed
LABPBS opened this issue Aug 12, 2021 · 44 comments

@LABPBS
Contributor

LABPBS commented Aug 12, 2021

It seems binaries built after August have broken UCRT imports.
Example (libtiff, but it can also be seen in python and libdeflate):
4.3.0-3: [screenshot 1]
4.3.0-4: [screenshot 2]

I'm guessing binutils 2.37 broke something, but it's just a guess.

those broken imports are not present in either mingw64 or clang64 so it seems to be gcc on ucrt specific

@Biswa96
Member

Biswa96 commented Aug 12, 2021

What do you mean by broken imports? Does the program fail to execute normal operations or not?

@LABPBS
Contributor Author

LABPBS commented Aug 12, 2021

The screenshots are from peview, a handy GUI that shows the symbols imported and exported by Windows binaries.

Equivalent CLI:
python-3.9.6-2

objdump -x libpython3.9.dll | sort | uniq -d

python-3.9.6-3

objdump -x libpython3.9.dll | sort | uniq -d
        2ff630     10  __setusermatherr
        2ff644     15  _heapmin
        2ff650     24  _set_new_mode
        2ff660     25  calloc
        2ff66a     26  free
        2ff672     27  malloc
        2ff67c     28  realloc
        2ff686     35  _fdopen
        2ff690     52  _hypot
        2ff69a    216  frexp

[screenshot: Untitled]

@LABPBS
Contributor Author

LABPBS commented Aug 12, 2021

This basic example gives a similar error

int main(void)
{
    return 0;
}

I managed to track the issue down to mingw-w64: #9259 works while #9284 doesn't.

It seems to affect only ucrt64; mingw64 and clang64 work just fine, and binutils doesn't seem to cause the issue.

@jeremyd2019
Member

That's pretty odd. There are only two commits of difference in mingw-w64 between those, mingw-w64/mingw-w64@8548ab7 and mingw-w64/mingw-w64@1db0f47.

@mati865
Collaborator

mati865 commented Aug 12, 2021

If clang64 is fine, then it's unlikely to be a mingw-w64 issue (clang64 uses UCRT after all), given the changes between the mingw-w64 versions.

@jeremyd2019
Member

Maybe the binutils used to build mingw-w64-crt? (Just brainstorming, no actual idea.)

@Biswa96
Member

Biswa96 commented Aug 13, 2021

"Works in my machine". The api-ms-win-crt-math-l1-1-0.dll is resolved to ucrtbase.dll by Windows OS automatically. It is not an actual DLLs file, generally called API Set DLLs. If you open the ucrt import library in Windows SDK and mingw-w64, both has api-ms-win-crt-math-l1-1-0.dll exports. I did not understand the language in the screenshot, but the _heapmin is exported from api-ms-win-crt-heap-l1-1-0.dll.

@lazka
Member

lazka commented Aug 13, 2021

those broken imports are not present in either mingw64 or clang64 so it seems to be gcc on ucrt specific

Why do you think they are broken? And what tool are you using to show them?

@LABPBS
Contributor Author

LABPBS commented Aug 13, 2021

"Works in my machine".

Maybe it's a bit more fiddly for me because I'm running Windows 7?

what tool are you using to show them?

peview.exe from https://github.com/processhacker/processhacker/releases/download/v2.39/processhacker-2.39-bin.zip

Why do you think they are broken?

api-ms-win-crt-math-l1-1-0.dll doesn't export malloc, calloc or free, which is why I get the "the procedure entry point could not be located in the dll" error.

I just tried compiling that example with mingw64, passing -lucrt, and there's no problem. Maybe it is binutils?

@lazka
Member

lazka commented Aug 13, 2021

Maybe it's a bit more fiddly for me because I'm running Windows 7?

That could be; no one tests on Windows 7 anymore.

api-ms-win-crt-math-l1-1-0.dll doesn't export malloc, calloc or free, which is why I get the "the procedure entry point could not be located in the dll" error.

Ah, yes I see that too now.

@lazka
Member

lazka commented Aug 14, 2021

@lhmouse do you know what could cause this? (#9363 (comment))

@lhmouse
Contributor

lhmouse commented Aug 14, 2021

I have no idea. Maybe @mstorsjo knows more.

I don't have any Windows 7 machine at hand. I may have a look next week.

@lhmouse
Contributor

lhmouse commented Aug 14, 2021

Dependency Walker shows malloc being imported from the math DLL, which has no such export:
[screenshot 6265]

x64dbg shows two imports with the name malloc, which is possibly an indirect dependency:
[screenshot 6266]

@lazka
Member

lazka commented Aug 14, 2021

Simply rebuilding mingw-w64-crt-git and then mingw-w64-libtiff fixes the issue as can be seen here: #9384

@Biswa96
Member

Biswa96 commented Aug 14, 2021

Can anyone reproduce this issue in Windows 10?

@LABPBS
Contributor Author

LABPBS commented Aug 14, 2021

I can confirm that #9386 fixes int main() {return 0;}

@jeremyd2019
Member

But why? Maybe comparing the import libraries would reveal something?

@jeremyd2019
Member

I was putting together a tool to try to detect this (specifically, finding an image trying to import malloc from the api-ms-win-crt-math-l1-1-0.dll), and found this is apparently not just between #9259 and #9284. I checked /ucrt64/bin/gettext.exe which has a modify date of 2021-03-23 and it attempts to import malloc from that library, despite being much older than the supposedly problematic CRT update (or the binutils update, for that matter).

@jeremyd2019
Member

jeremyd2019 commented Aug 15, 2021

https://gist.github.com/jeremyd2019/d3cf9ae792958b9f470ff9a57d3c5f30 - this uses the img* code from https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-tools/genpeimg/src/ for dealing with the PE headers. No error checking or anything... just a quick hack

@jeremyd2019
Member

I was putting together a tool to try to detect this (specifically, finding an image trying to import malloc from the api-ms-win-crt-math-l1-1-0.dll), and found this is apparently not just between #9259 and #9284. I checked /ucrt64/bin/gettext.exe which has a modify date of 2021-03-23 and it attempts to import malloc from that library, despite being much older than the supposedly problematic CRT update (or the binutils update, for that matter).

Oops, just noticed that my quick tool was looking at argv[0], not argv[1]... (I was running under ucrt64 and hadn't updated the crt after the latest rebuild.) Never mind...

lazka added a commit that referenced this issue Aug 15, 2021
@Biswa96 Biswa96 added the ucrt label Aug 15, 2021
@jeremyd2019
Member

It would be nice to know what went wrong, to hopefully avoid it happening again, and maybe to add a test for it to mingw-w64-crt-git.

@lazka
Member

lazka commented Aug 15, 2021

I'm not well versed in how this all works, so I hope someone else can help. I compared the output of nm for libucrt.a of both versions and they look the same, apart from some random identifiers. So one possibility is that it triggered a bug in binutils.

Here are both versions:

@jeremyd2019
Member

Yeah I tried looking at objdump output, but I didn't see anything.

@lazka
Member

lazka commented Aug 15, 2021

I'll try to add a check for CI so we notice when it comes back.

@lazka
Member

lazka commented Aug 15, 2021

so, one possibility is that it triggered a bug in binutils.

which would also explain why we don't see it with clang

@jeremyd2019
Member

Without knowing what went wrong, I don't know if just looking for malloc being imported from the "math" dll would catch everything.

Clang/lld uses a different import library format than binutils too.

@mstorsjo
Contributor

I understand what's wrong.

GNU ld.bfd sorts object files within archives alphabetically, in some cases. It's essential for how DLL import tables are constructed. Let's have a look at libucrt.a in the -2 version of the archive, the good one that works.

If you have an undefined reference to malloc, it'll pull in dodhs00026.o from libucrt.a. That object file has an undefined reference to _head_lib64_libapi_ms_win_crt_heap_l1_1_0_a, which is provided by dodhh.o, which then has an undefined reference to __lib64_libapi_ms_win_crt_heap_l1_1_0_a_iname which is provided by dodht.o. In this case, the pattern is dodh<suffix>.o. You have a header object h, individual object files for each symbol with suffix s%05d, and a trailer with suffix t. These can end up included in various orders, but the lexical sorting of them will order them as h, s00001, s00026, s000..., t so the individual chunks that construct the PE import table turn out correct.

The prefix, dodh, is random when import libraries are generated by dlltool.

Normally, I think this lexical sorting is only applied within the object files of each library, and in normal cases (when one import library corresponds to one DLL), each import library would contain only one random prefix. But in the case of libucrt.a, there are a dozen different import libraries merged together, and as the linker sees them as one single library, the lexical sorting applies to all of them.

In the good version of the import library, libapi_ms_win_crt_heap_l1_1_0_a used a prefix of dodh and libapi_ms_win_crt_math_l1_1_0_a used diab.

Now if you look at the -1 version of the libraries, the broken one, both the heap and math libraries had gotten the same random prefix dste. So when the linker does lexical sorting of the individual object files from libucrt.a, object files corresponding to the heap and math can end up mixed together.
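
The interleaving described above can be illustrated with a toy model. This is only a sketch and not ld.bfd's actual algorithm: it assumes the linker simply sorts the member names it pulled in, and the symbol numbers (other than 26 for malloc) are invented for illustration. Only the prefixes dodh, diab and dste come from this thread; the function names are hypothetical.

```python
def member_names(prefix, symbol_ids):
    """Names of dlltool-style import members: header, per-symbol stubs, trailer."""
    return ([f"{prefix}h.o"]
            + [f"{prefix}s{i:05d}.o" for i in symbol_ids]
            + [f"{prefix}t.o"])


def linked_order(libs):
    """Toy model: sort all pulled-in members by name, tagging each with its DLL.

    `libs` maps a DLL label to (prefix, symbol_ids).
    """
    tagged = []
    for dll, (prefix, ids) in libs.items():
        tagged += [(name, dll) for name in member_names(prefix, ids)]
    return sorted(tagged)


def is_contiguous(order):
    """True if every DLL's members form one unbroken run in the sorted order."""
    seen, last = set(), None
    for _, dll in order:
        if dll != last and dll in seen:
            return False
        seen.add(dll)
        last = dll
    return True


# Symbol numbers here are made up; only the prefixes are taken from the thread.
good = linked_order({"heap": ("dodh", [25, 26]), "math": ("diab", [10, 216])})
bad = linked_order({"heap": ("dste", [25, 26]), "math": ("dste", [10, 216])})

print(is_contiguous(good))  # distinct prefixes: each DLL's chunks stay together
print(is_contiguous(bad))   # shared prefix: heap and math chunks get interleaved
```

With distinct prefixes, the header, stubs and trailer of each DLL sort into one block; with a shared prefix, the heap and math stubs end up between each other's header and trailer, which is the entanglement seen in the broken binaries.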

(As a related side note: you might have noticed the bug where MS link.exe can seemingly manage to link against an import library created by GNU ld.bfd, but if you pass it two such libraries in the same invocation, they end up intermixed. If you use an import library created by GNU dlltool instead of one straight from the linker, MS link.exe doesn't run into this issue. The reason is that ld.bfd doesn't use a random prefix, iirc, but dlltool does.)

Therefore: It's not enough to just check if heap+math are mixed, it can happen to any pair of libraries in libucrt.a. A better check would be to make sure that all the libraries in libucrt.a use unique prefixes, or simpler, just check that all members in libucrt.a use different member file names. (Import libraries created by MS tools, and by lld and llvm-dlltool, use the short import library format, and there the member names are quite different and you'll have lots of members with the same names.)
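
A minimal sketch of the "all member file names must be different" check, assuming the input is the text printed by `ar t` for one archive. The function name is hypothetical; restricting the check to names ending in `.o` is meant to skip short-format import members, which legitimately share names.

```python
from collections import Counter


def duplicate_members(ar_listing):
    """Return object member names that occur more than once in `ar t` output."""
    # strip() also drops stray trailing whitespace or carriage returns
    names = [line.strip() for line in ar_listing.splitlines()]
    counts = Counter(n for n in names if n.endswith(".o"))
    return sorted(n for n, c in counts.items() if c > 1)


# Synthetic listing: two merged import libraries that got the same prefix.
listing = "dsteh.o\ndstes00026.o\ndstet.o\ndsteh.o\ndstes00010.o\ndstet.o\n"
print(duplicate_members(listing))
```

In practice the listing could come from something like `subprocess.run(["ar", "t", path], capture_output=True, text=True).stdout`. Note that this only catches exact name clashes such as a repeated header or trailer; it would not notice two libraries sharing a prefix if no individual member name happened to collide, so comparing the prefixes themselves would be a stricter check.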

One could think that a random string of 4 chars should be quite enough random space to avoid collisions if you only pick around a dozen random strings. But maybe the random string selection is seeded with something as coarse as the current realtime clock in whole seconds. And then it's suddenly very very plausible that all of them end up with the same prefix. So improving the random seed here could reduce the risk of the issue occurring.
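
As a rough back-of-the-envelope check of that intuition, here is the birthday-problem estimate for a dozen truly independent picks. The alphabet size (26 lowercase letters for each of the 4 characters) is an assumption, not something stated in this thread.

```python
def collision_probability(picks, space):
    """Probability that at least two of `picks` uniform draws from `space` values coincide."""
    p_unique = 1.0
    for i in range(picks):
        p_unique *= (space - i) / space
    return 1.0 - p_unique


space = 26 ** 4  # assumption: 4 characters, each drawn uniformly from a-z
print(f"{collision_probability(12, space):.4%}")
```

Under these assumptions the chance of any clash is on the order of 0.01%, so seeing one in practice suggests the prefixes are not independent uniform draws.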

Looking at some other import libraries, in the mingw-w64-x86-64-dev package in Ubuntu 20.04, the import libraries seem to be using a prefix based on the actual import library name instead of a random one. I don't know offhand if that's a separate dlltool feature that they've enabled, or if it's a custom patch or something. They're at least shipping binutils 2.34 so it's not too far off (and binutils have been using the random prefix for quite some time I think).

@jeremyd2019
Member

So something like
ar t libmsvcrt.a | grep -E '\.o\s*$' | sort | uniq -d
(I needed the \s* apparently due to #2603.)

@lazka
Member

lazka commented Aug 15, 2021

could --enable-deterministic-archives in binutils be related? all distros seem to use it

@mstorsjo
Contributor

could --enable-deterministic-archives in binutils be related? all distros seem to use it

Oh, that's probably it indeed. That would probably avoid the whole issue with potential random clashing of the member names.

If one would make GNU ld use the output name as member object prefix for its generated import libraries, some projects could also avoid needing to use dlltool separately for making MSVC compatible import libraries :-)

@lazka
Member

lazka commented Aug 15, 2021

Without knowing what went wrong, I don't know if just looking for malloc being imported from the "math" dll would catch everything.

Checking if all imports happen only once should work, no?
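
A sketch of such an "every import appears only once" check, working on the text of an import table dump. The layout is an assumption modelled on the `objdump` output shown earlier in this thread (a `DLL Name:` line followed by rows of address, hint and symbol name); real output may differ between binutils versions, and the sample below is synthetic.

```python
import re
from collections import defaultdict

# Assumed row layout inside an import table: "<hex vma>  <hint/ordinal>  <name>"
ENTRY = re.compile(r"^\s*[0-9a-fA-F]+\s+\d+\s+(\S+)\s*$")
DLL = re.compile(r"DLL Name:\s*(\S+)")


def imports_by_symbol(dump):
    """Map each imported symbol to the list of DLLs it is imported from."""
    owners = defaultdict(list)
    current = None
    for line in dump.splitlines():
        m = DLL.search(line)
        if m:
            current = m.group(1)
            continue
        m = ENTRY.match(line)
        if m and current is not None:
            owners[m.group(1)].append(current)
    return owners


def repeated_imports(dump):
    """Symbols that appear in more than one import entry."""
    return {s: d for s, d in imports_by_symbol(dump).items() if len(d) > 1}


# Synthetic sample, not real tool output.
SAMPLE = """
 DLL Name: api-ms-win-crt-heap-l1-1-0.dll
 vma:  Hint/Ord Member-Name Bound-To
 2ff672    27  malloc

 DLL Name: api-ms-win-crt-math-l1-1-0.dll
 vma:  Hint/Ord Member-Name Bound-To
 2ff672    27  malloc
 2ff69a   216  frexp
"""
print(repeated_imports(SAMPLE))
```

This only works on a linked image (an .exe or .dll), so it tests the result of linking rather than the import library itself.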

@lazka
Member

lazka commented Aug 15, 2021

could --enable-deterministic-archives in binutils be related? all distros seem to use it

that doesn't seem to help, the prefix is still different between two consecutive builds.

@lazka
Member

lazka commented Aug 15, 2021

@mstorsjo
Contributor

From what I understand it uses getpid() always: https://github.com/bminor/binutils-gdb/blob/a32a7fdc94efe68926019d870575d0968d8a0a28/binutils/dlltool.c#L3936

Hmm, is there any code that would correspond to what I'm seeing in the ubuntu import library then:

$ ar t /usr/x86_64-w64-mingw32/lib/libkernel32.a  | head
libkernel32t.o
libkernel32h.o
libkernel32s01622.o
libkernel32s01621.o
libkernel32s01620.o
[...]
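
To see why a getpid()-based prefix can repeat, here is a hypothetical model of such an encoding. It is not copied from dlltool.c (see the link above for the real code); the scheme of a fixed "d" followed by the PID written in base 26, least significant digit first, is an assumption chosen only because it is consistent with the d-plus-three-letters prefixes quoted in this thread.

```python
import string


def prefix_from_pid(pid, start="d"):
    """Hypothetical: append the base-26 digits of `pid` (lowest first) to `start`."""
    letters = string.ascii_lowercase
    out = start
    while True:
        out += letters[pid % 26]
        pid //= 26
        if pid == 0:
            return out


# Any two dlltool runs that get the same PID would produce the same prefix.
print(prefix_from_pid(4824))
print(prefix_from_pid(4824) == prefix_from_pid(4824))
```

Whatever the exact encoding, a prefix that is a pure function of the PID is only as unique as the PIDs are, and operating systems reuse PIDs once a process has exited. A name-derived prefix, as in the Ubuntu listing above, avoids that dependency.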

@mstorsjo
Contributor

Oh, this explains it: https://salsa.debian.org/mingw-w64-team/mingw-w64/-/blob/master/debian/patches/dlltool-temp-prefix.patch

Unfortunately, llvm-dlltool doesn't know that option yet. (It's not of much use there so it'd need to just ignore it.)

@jeremyd2019
Member

configure could check if the option is supported and only use it if it is... (yay autotools)

@jeremyd2019
Member

jeremyd2019 commented Aug 15, 2021

Without knowing what went wrong, I don't know if just looking for malloc being imported from the "math" dll would catch everything.

Checking if all imports happen only once should work, no?

Only if you are building an 'image'. I think mingw-w64-crt-git only builds archives, though; I was trying to find something that could be checked against mingw-w64-crt-git directly.

I suppose another important test of mingw-w64-crt-git is that it can be used to successfully link executables 😉

@jeremyd2019
Member

Even in the -2 version, there are a lot of duplicated object files in libmincore.a and libwindowsapp.a, and lib64_libmingw32_a-crt0_w.o duplicated in libmingw32.a and lib64_libmingwex_a-strtof.o duplicated in libmingwex.a.

# Print the duplicated object member names of each archive, followed by the archive path.
for f in ucrt64/x86_64-w64-mingw32/lib/*.a; do
    ar t "${f}" | sort | uniq -d | grep -E '\.o\s*$' && echo "${f}"
done

@lazka
Member

lazka commented Aug 16, 2021

I've created #9398 and #9399

@mstorsjo
Contributor

mstorsjo commented Aug 16, 2021

I posted a patch on mingw-w64-public where I implement detection for support for the flag.

Even in the -2 version, there are a lot of duplicated object files in libmincore.a and libwindowsapp.a, and lib64_libmingw32_a-crt0_w.o duplicated in libmingw32.a and lib64_libmingwex_a-strtof.o duplicated in libmingwex.a.

The former seems to be an actual duplicate, the latter is a case of the library containing both gdtoa/strtof.c and stdio/strtof.c.

@jeremyd2019
Member

The libmincore.a and libwindowsapp.a duplicates are gone from -3, leaving just the strtof and crt0_w ones

@Biswa96
Member

Biswa96 commented Aug 16, 2021

The libmincore.a and libwindowsapp.a duplicates are gone

But some api-ms-win*.def files contain duplicate names. We could follow the list from here: https://docs.microsoft.com/en-us/uwp/win32-and-com/win32-apis. But in many cases that list does not match the actual library files in the WinSDK. There is also the fear of breaking existing things.

@mstorsjo
Contributor

The libmincore.a and libwindowsapp.a duplicates are gone

But some api-ms-win*.def files contain duplicate names. We could follow the list from here: https://docs.microsoft.com/en-us/uwp/win32-and-com/win32-apis. But in many cases that list does not match the actual library files in the WinSDK. There is also the fear of breaking existing things.

That’s a totally different thing from what is discussed here; please don’t derail the discussion.

Whether a library in WinSDK contains a function or not is the only truth. Out of the api-ms-* DLLs, many of them contain the same function. But when you link against an import library in WinSDK, that import library will only contain that function once, for one such DLL that provides it.

jon-y pushed a commit to mingw-w64/mingw-w64 that referenced this issue Aug 17, 2021
When GNU dlltool generates import libraries, it picks a semi-random
prefix string for its file names based on the pid of the process.
Normally, the prefix doesn't matter much, but when we merge multiple
import libraries into one, like for libucrt.a, the prefixes need
to be unique (otherwise their import tables get entangled).

In practice it has been noticed that these aren't always unique
(see msys2/MINGW-packages#9363).
Instead, pass an option to make it use a unique prefix for each
library (based on the target dll name).

LLVM dlltool uses a different format of import library, where
there's no corresponding semi random prefix. LLVM dlltool doesn't
support the --temp-prefix option either (yet). Therefore, try to
detect whether the option is supported.

Based on a Debian patch by Stephen Kitt.

Signed-off-by: Martin Storsjö <martin@martin.st>
@mstorsjo
Contributor

FYI I pushed upstream commits now that add the same --temp-prefix flag to dlltool if supported, and got rid of the duplicate crt0_w.o object file.

@LABPBS LABPBS closed this as completed Oct 17, 2021
kou pushed a commit to kou/MINGW-packages that referenced this issue Oct 18, 2021
kou pushed a commit to kou/MINGW-packages that referenced this issue Oct 18, 2021