#[link(kind="raw-dylib")] #2627

Open
wants to merge 14 commits into
base: master
from

@retep998
Member

retep998 commented Jan 22, 2019

Striving towards a future unburdened by the limitations of the traditional linker model.

Rendered

retep998 added some commits Jan 22, 2019

@Screwtapello

Screwtapello commented Jan 23, 2019

...we'd no longer need to pretend the pc-windows-gnu toolchain is standalone, and we'd be able to stop bundling MinGW bits entirely in favor of the user's own MinGW installation, thereby resolving a bunch of issues.

Is this talking about rust-lang/rust#53454 ?

@retep998

Member Author

retep998 commented Jan 23, 2019

@Screwtapello

Screwtapello commented Jan 23, 2019

Huzzah! That bit me in the ass the other day, trying to cross-compile Windows binaries from Linux, so this sounds wonderful.

@GuentherVIII

GuentherVIII commented Jan 23, 2019

Prior art: Delphi doesn't use import libraries and instead one specifies the dll file name and optionally the function name or index:

procedure foo; external 'bar.dll' name 'fooA';
procedure foo; external 'bar.dll' index 1;
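
For comparison, here is a minimal sketch of what the Rust counterpart could look like, using the `kind = "raw-dylib"` form that was later stabilized. The wrapper name `current_pid`, the module name `imports`, and the Rust-side name `current_process_id_raw` are assumptions made up for illustration; only `kernel32` and `GetCurrentProcessId` are real. The `link_name` attribute plays the role of Delphi's `name` clause (Rust also has a `link_ordinal` attribute corresponding to `index`). On non-Windows targets the sketch falls back to the standard library so it still compiles.

```rust
/// Imports declared without any import library (Windows only).
#[cfg(windows)]
mod imports {
    // `kind = "raw-dylib"` asks rustc to generate the import entry itself;
    // `name` identifies the DLL (rustc appends `.dll` here).
    // Written in the pre-2024-edition style used elsewhere in this thread.
    #[link(name = "kernel32", kind = "raw-dylib")]
    extern "system" {
        // Rust-side name differs from the exported symbol, like Delphi's `name '...'`.
        #[link_name = "GetCurrentProcessId"]
        pub fn current_process_id_raw() -> u32;
    }
}

/// Hypothetical wrapper around the raw import.
#[cfg(windows)]
pub fn current_pid() -> u32 {
    // SAFETY: GetCurrentProcessId takes no arguments and cannot fail.
    unsafe { imports::current_process_id_raw() }
}

/// Portable stand-in so the sketch builds on other targets.
#[cfg(not(windows))]
pub fn current_pid() -> u32 {
    std::process::id()
}
```

On Windows the resulting binary should list `GetCurrentProcessId` in its import table, so the PE loader resolves it at startup rather than the program calling GetProcAddress.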
@retep998

Member Author

retep998 commented Jan 23, 2019

@GuentherVIII Does Delphi compile to native binaries where those imports are resolved by the Windows PE Loader, and not Delphi doing its own LoadLibrary/GetProcAddress stuff? An easy way to check is to produce such a binary and use depends.exe or dumpbin.exe to check if that symbol is in the imports.

@zunzster

zunzster commented Jan 23, 2019

@retep998 Delphi supports both ways. However, originally it was just the native binaries model i.e. normal Windows PE loading. Support for delay-loaded bindings came second with the addition of the delayed keyword e.g.

function GetDesktopWindow: HWND; stdcall; external user32 name 'GetDesktopWindow' delayed;

The delayed model is useful when declaring APIs that may or may not exist on the current platform version. You can do a version check before you attempt to call them and everything works transparently and you don't get PE loader errors on startup about missing APIs.

Of course, you can do your own delayed loading by hand via LoadLibrary/GetProcAddress, as I used to in earlier versions of Delphi, but it's tedious boilerplate. Having it built into the compiler is quite nice since the compiler/linker can collect up all the APIs and generate nice thunks that do the minimum necessary work, reuse previous work (e.g. only one LoadLibrary per DLL, not per API called) and then patch themselves out so that those API calls are no less efficient than directly linked ones.
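
To make the thunk idea concrete, here is a minimal sketch in Rust of a hand-written lazy thunk that resolves its target once and reuses the cached pointer afterwards. Everything here is a simulation: `resolve_get_value` is a hypothetical stand-in for the LoadLibrary/GetProcAddress pair, and `RESOLVE_CALLS` exists only to show that the lookup happens once. A real compiler-generated thunk could go further and patch itself out, which this sketch does not attempt.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Signature of the imported function in this sketch.
type GetValueFn = fn() -> u32;

/// Counts how often the (simulated) loader is consulted.
static RESOLVE_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for LoadLibrary + GetProcAddress.
/// Returns None if the symbol is missing on this system.
fn resolve_get_value() -> Option<GetValueFn> {
    fn real_get_value() -> u32 {
        42
    }
    RESOLVE_CALLS.fetch_add(1, Ordering::SeqCst);
    Some(real_get_value as GetValueFn)
}

/// The thunk: resolves on first use, then reuses the cached pointer.
fn get_value() -> Option<u32> {
    static CACHE: OnceLock<Option<GetValueFn>> = OnceLock::new();
    let f = (*CACHE.get_or_init(resolve_get_value))?;
    Some(f())
}
```

Calling `get_value()` repeatedly only runs the resolver on the first call; later calls pay for one initialized check plus an indirect call.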

Could you achieve something similar in Rust with a macro? It would probably need to be a procedural macro to have enough visibility and be 'smart'.

But does Rust support delayed or lazy API bindings for other dylibs?

If not, maybe a general delayed attribute would be useful for that since it's not a platform specific concept. Cleanly and easily handling versioned or missing APIs is a common systems programming annoyance/papercut.

Declaration of self-interest: I'm from a long term Delphi shop and I'd really like to shift to Rust being the preferred language for all our future development. Anything that makes Rust more capable/enjoyable for Windows development is a boon for me :-)

@retep998

Member Author

retep998 commented Jan 24, 2019

@zunzster

Having runtime loaded variants of all external functions in winapi is something I hope to achieve in the future, but shortcomings of declarative macros make it hard to do so with minimal effort. Having a generalized approach in Rust to do lazy loading of any external function could be a future extension of kind="dll".

Having delayed loading of Rust crates does not make any sense at the moment, as Rust does not have any sort of stable ABI.

@zunzster

zunzster commented Jan 24, 2019

@retep998 Sure, I can see that an optional #[link(delayed=true)] addition might be nice in the future and looks like a pretty clean approach syntax-wise. The compiler could then collect up the delayed APIs and generate smart run-time thunks for each API.

And you're right that delay loading is not that useful for calling Rust ABIs, it's more about convenient access to versioned platform APIs.

I'm not sure I see that having delay-loaded variants of all the winapi functions would be that useful, unless I misunderstand (which is entirely possible :-))

Are you thinking these would have different names from the non-delayed win apis? e.g.

#[link(name = "kernel32.dll", kind = "dll", delayed = true)]
#[allow(non_snake_case)]
extern "system" {
    fn DelayGetStdHandle(nStdHandle: u32) -> *mut u8;
}

In my Delphi code, most Windows platform APIs I need are present in all versions. It's just those annoying few that were added later that I need to support that require declaring and calling those APIs via a delay mechanism.

If in my local crate, I have my own block like the above which has the same name as the non-delay case in some winapi crate, but I've added a delayed=true directive, would that be an error in Rust or would it shadow the declaration in the winapi crate?

I can see that copying the declaration from the winapi crate into my crate and adding the delayed=true and Delay in front of the name (for example) would be very easy.

Ah, now I think I see your idea for delayed versions of all the APIs. A whole crate with delayed=true already done for you for all the APIs would be quite nice provided it could easily be generated mechanically so it had everything and was kept in sync without being a maintenance burden.

Assuming your RFC goes forward, adding delayed support might make a nice first project for learning how to extend the Rust compiler by a suitably interested party. I'm thinking of a future me :-)

@aturon

Member

aturon commented Jan 31, 2019

@joshtriplett

Member

joshtriplett commented Jan 31, 2019

This looks great to me! Let's make import libraries a thing of the past.

I don't think this needs any bikeshedding on names; this seems quite clear.

@Centril
Contributor

Centril left a comment

Some proof-reading and some questions.


If that were to happen, we'd no longer need to pretend the pc-windows-gnu toolchain is standalone, and we'd be able to stop bundling MinGW bits entirely in favor of the user's own MinGW installation, thereby resolving a bunch of issues such as [rust-lang/rust#53454](https://github.com/rust-lang/rust/issues/53454).

A future extension of this feature would be the ability to optionally lazily load such external functions, since Rust would naturally have all the information required to do so.

@Centril

Centril Jan 31, 2019

Contributor

If you were to "speculate", what might that look like?

@zunzster

zunzster Feb 1, 2019

I think that might be a reference to my delayed=true optional extension discussed above e.g.

#[link(name = "kernel32.dll", kind = "dll", delayed = true)]

@retep998

retep998 Feb 1, 2019

Author Member

I'd personally prefer the option where the caller of the function chooses whether to do a lazy loaded call of it, and not having to choose at declaration time whether it is lazy loaded. I don't know what the syntax would look like for that.

@zunzster

zunzster Feb 1, 2019

@retep998 Can you say a bit more about that? I'm curious as to why that's useful. Going that way would seem to preclude (or at least complicate) the compiler from being able to optimize all the generated thunks so they are not doing any repeated LoadLibrary/GetProcAddress calls. It's likely you have a use case in mind that I've not come across before which warrants giving up that benefit.

@retep998

retep998 Feb 1, 2019

Author Member

The compiler could still ensure there is only one LoadLibrary for a given dll and one GetProcAddress for each symbol. Allowing the caller to choose whether they want to lazy load it does not prevent the compiler from ensuring GetProcAddress is only called once. If no dylib crates are involved, the generation of GetProcAddress thunks can be lazily deferred all the way until binary creation time. If dylib crates are involved, then either the dylib that contains the crate with the declarations would have to generate all the thunks, or there could potentially be some duplication across dylibs. Since nobody uses dylib except rustc and people who accidentally used it when they meant to use cdylib, it doesn't seem like too big of an issue.
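
As a rough illustration of the "one LoadLibrary per dll, one GetProcAddress per symbol" bookkeeping, the sketch below models it with two maps. It is a simulation only: `ImportCache`, its fields, and the fake handle/address arithmetic are assumptions invented for this example, and no real Windows API is called.

```rust
use std::collections::HashMap;

/// Simulated import cache: handles and addresses are plain numbers here.
#[derive(Default)]
struct ImportCache {
    libraries: HashMap<String, usize>,         // dll name -> simulated handle
    symbols: HashMap<(String, String), usize>, // (dll, symbol) -> simulated address
    load_library_calls: usize,                 // how often "LoadLibrary" ran
    get_proc_address_calls: usize,             // how often "GetProcAddress" ran
}

impl ImportCache {
    /// Returns the handle for `dll`, "loading" it only on first request.
    fn library(&mut self, dll: &str) -> usize {
        if let Some(&handle) = self.libraries.get(dll) {
            return handle;
        }
        self.load_library_calls += 1;
        let handle = self.libraries.len() + 1;
        self.libraries.insert(dll.to_string(), handle);
        handle
    }

    /// Returns the address of `name` in `dll`, resolving it only once.
    fn symbol(&mut self, dll: &str, name: &str) -> usize {
        let key = (dll.to_string(), name.to_string());
        if let Some(&addr) = self.symbols.get(&key) {
            return addr;
        }
        let handle = self.library(dll);
        self.get_proc_address_calls += 1;
        let addr = handle * 1000 + self.symbols.len();
        self.symbols.insert(key, addr);
        addr
    }
}
```

Because the symbol key includes the DLL name, the same symbol name exported by two different DLLs would get two separate entries in this model.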

@retep998

retep998 Feb 1, 2019

Author Member

As for why, it depends on whether the caller is able to deal with the function not existing. If the caller has a fallback, then it can use the lazy loaded version and fall back if it fails to load. If the caller doesn't have any fallback, then it can use the more efficient static version.
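
A small sketch of the caller-side choice, under the assumption that a lazy lookup surfaces as an `Option` of a function pointer. The names `lazy_lookup_ticks`, `ticks_with_fallback`, and `Served`, as well as the `available` flag and the returned numbers, are all hypothetical and only simulate whether the OS exports the symbol.

```rust
/// Which code path served the request.
#[derive(Debug, PartialEq)]
enum Served {
    Native,
    Fallback,
}

type TickFn = fn() -> u64;

/// Hypothetical lazy lookup; `available` simulates whether the symbol exists.
fn lazy_lookup_ticks(available: bool) -> Option<TickFn> {
    fn native_ticks() -> u64 {
        64
    }
    if available {
        Some(native_ticks as TickFn)
    } else {
        None
    }
}

/// A caller that has a fallback opts into the lazy lookup and degrades gracefully.
fn ticks_with_fallback(available: bool) -> (u64, Served) {
    match lazy_lookup_ticks(available) {
        Some(f) => (f(), Served::Native),
        // Stand-in for an older API that is always present.
        None => (32, Served::Fallback),
    }
}
```

A caller without any fallback would skip the `Option` entirely and call a statically imported function, accepting that the program fails to load if the symbol is absent.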

@Arnavion

Arnavion Feb 1, 2019

@zunzster The user can also write such thunks themselves. It would be preferable for the user to write them since they can choose how to handle the LL/GPA failure if the DLL / function doesn't exist (crash / no-op / return a custom Err()) rather than have the compiler enforce a specific choice.

@zunzster

zunzster Feb 1, 2019

@Arnavion Oh, sure. I know users can write the LL/GPA thunks themselves. I've done that many a time myself. It's just tedious and hence sometimes error prone since writing robust error handling around each API call is annoying. Hence, it's nice if the compiler provides a 'pit of success' offering convenient lazy loading with robust error handling as the default.

@retep998 If you're making the lazy loading transparent to the user, you can't play with the return values since they're already a concrete type defined by some random API. So, yes, having a standard 'missing_API_handler' hook which the user can potentially override is nice. You can make the hook a no-op or your own custom handler (similar to panic handlers I suppose) but the default one should probably panic with a descriptive error and back trace.

Actually, I can't see the no-op change being especially easy though with regard to specifying what the return value (if any) should be in case of a missing API.

Some Windows APIs return BOOL (a typedef'd integer) with 0 meaning failure.
Some APIs return HANDLE with -1 (aka INVALID_HANDLE_VALUE) for failure.

So, offering the no-op case would seem to require inelegant/complicated declarative support. I think if users want that kind of no-op behavior, they probably should have to do their own LL/GPA handling since the alternative isn't well-specified enough to be safe.
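
The sentinel problem can be sketched as follows, assuming the lazy binding reports a missing symbol as an `Err` and leaves the policy to the caller. `MissingApi`, `thing.dll`, `CreateThing`, `EnableThing`, and the wrapper functions are hypothetical names for illustration; only the two failure conventions (0 for BOOL, -1 for HANDLE) come from the discussion above.

```rust
/// Error describing an import that could not be resolved.
#[derive(Debug, PartialEq)]
struct MissingApi {
    dll: &'static str,
    symbol: &'static str,
}

type Bool = i32; // BOOL: 0 means failure
type Handle = isize; // HANDLE: -1 means failure
const INVALID_HANDLE_VALUE: Handle = -1;

/// Simulated lazily bound API that is absent on this system.
fn try_create_thing() -> Result<Handle, MissingApi> {
    Err(MissingApi { dll: "thing.dll", symbol: "CreateThing" })
}

/// Second simulated API with a different failure convention.
fn try_enable_thing() -> Result<Bool, MissingApi> {
    Err(MissingApi { dll: "thing.dll", symbol: "EnableThing" })
}

/// "No-op" policy for a HANDLE-returning API: the sentinel is -1.
fn create_thing_or_sentinel() -> Handle {
    try_create_thing().unwrap_or(INVALID_HANDLE_VALUE)
}

/// "No-op" policy for a BOOL-returning API: the sentinel is 0.
fn enable_thing_or_sentinel() -> Bool {
    try_enable_thing().unwrap_or(0)
}
```

The two wrappers need different sentinels, which is the per-API knowledge a generic compiler-provided no-op default would not have. A caller preferring the panic policy could use `expect` on the `Result` instead.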

Of course, this is all just my opinion. Maybe there is a better solution I'm just not seeing since I'm used to the Delphi approach. When all you have is a hammer, every problem can start to look like a nail. :-)

Centril and others added some commits Feb 1, 2019

Apply suggestions from code review
Co-Authored-By: retep998 <retep998@gmail.com>
@joshtriplett

Member

joshtriplett commented Feb 1, 2019

@zunzster

zunzster commented Feb 1, 2019

@joshtriplett Might kind = "dylib" make sense as a platform-neutral term? It neatly parallels the crate-types specifiers that Rust uses i.e. dylib (and cdylib) produce a .dll, .so or .dylib respectively and this linkage spec allows such entities to be used.

@@ -0,0 +1,107 @@
- Feature Name: dll_kind

@joshtriplett

joshtriplett Feb 6, 2019

Member

s/dll_kind/kind_raw_dylib/ (both here and in the filename) please.

@retep998

retep998 Feb 8, 2019

Author Member

I did the thing.

@Akira13641

Akira13641 commented Feb 6, 2019

Prior art: Delphi doesn't use import libraries and instead one specifies the dll file name and optionally the function name or index:

procedure foo; external 'bar.dll' name 'fooA';
procedure foo; external 'bar.dll' index 1;

I think it's probably also worth noting here that Free Pascal has identical functionality to this, except in a fully cross-platform sense (i.e. you could write that same code with "libbar.so.2" or what have you instead of "bar.dll" and have it "just work" on any Unix platform as well.)

To address the question someone else had (for @zunzster) about how it's done, I can confirm that in that scenario on Windows, FPC specifically (I don't know what exactly Delphi does) generates its own import archive stub named like: "libimpnameoflibrary.a" at compile time and links against it (meaning, there's no implicit loading by address via functions such as GetProcAddress, and also no need for a pre-existing import archive.) This does work for MSVC-built DLLs as well, not just MinGW-built ones, to be clear.

On Linux and similar OSes, it takes the typical "ld-linux.so.2"-style approach with basically exactly the sort of use of NEEDED described elsewhere in this thread. For Mac it depends on whether it's a .dylib or .so being used.

For anyone interested in further technical details, as a few examples the source files where a large chunk of the linkage-related functionality for Windows, Linux and Mac (as well as traditional BSDs) is implemented for FPC can be found here, here, and here.

@joshtriplett

Member

joshtriplett commented Feb 8, 2019

Looks good to me! I look forward to future versions of winapi, and I'm excited at the prospect of a standalone Rust target for windows that does not depend on MSVC or MinGW.

@rfcbot merge

@rfcbot

rfcbot commented Feb 8, 2019

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@retep998

Member Author

retep998 commented Feb 8, 2019

Oh, I just realized another benefit of this. If two dlls both provide the symbol foo, with this feature you'd be able to link to both of them at the same time!

@joshtriplett

Member

joshtriplett commented Feb 8, 2019

(I didn't realize until poking rfcbot here that this was tagged with multiple teams. I don't think it needs to be a multi-team RFC.)

@rfcbot cancel

@rfcbot

rfcbot commented Feb 8, 2019

@joshtriplett proposal cancelled.

@joshtriplett

Member

joshtriplett commented Feb 8, 2019

@rfcbot merge

@rfcbot

rfcbot commented Feb 8, 2019

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@estebank

This comment was marked as resolved.

Contributor

estebank commented Feb 8, 2019

@Zoxc

Zoxc commented Feb 8, 2019

I don't like how this uses the normal form of the link attribute, which effectively applies to multiple crates, while this only applies to the extern section. Also raw-dylibs are exactly the same as dylibs, they are just linked differently. You'd also want to be able to use it for static and framework kinds (so you can link to things with the same symbol from different libraries). I think we'd want a new attribute or a modifier on the link attribute instead of introducing new kinds.

@retep998

Member Author

retep998 commented Feb 9, 2019

Frankly, I find it kind of weird that most of the time #[link] technically has nothing to do with the extern block it is attached to (the only exception being whether dllimport is applied on Windows).

Doing the equivalent of raw-dylib but for static libraries is a completely different challenge because it's just not possible with current linkers. Unlike with dlls, you need the static library around to link to it and nothing can ever change that. As for linking to the same symbol from multiple static libraries, that would require Rust to use LLD with some fairly major changes to allow Rust to dictate that sort of thing to LLD.

I am not opposed to having a new property for #[link] that would modify the dylib kind to function as described in the RFC. I'll leave it up to the lang team to decide what they think is better.

@joshtriplett

Member

joshtriplett commented Feb 9, 2019

@Zoxc

Also raw-dylibs are exactly the same as dylibs, they are just linked differently.

Yes, that's the idea, and where the name raw-dylib came from.

You'd also want to be able to use it for static and framework kinds (so you can link to things with the same symbol from different libraries).

That would be a completely different proposal, and one that, as @retep998 mentions, would be quite different in implementation to the extent it's possible. That shouldn't be part of this proposal, which is focused on a specific problem that's causing practical issues on Windows platforms today.

@liigo

Contributor

liigo commented Feb 11, 2019

The new (and fixed) rendered link: https://github.com/retep998/rfcs/blob/kindly-idata-my-dlls/text/0000-raw-dylib-kind.md

By the way, what do you think about #[link(kind="cdylib")]? Rust already uses crate-type = cdylib to build a dynamic library. @retep998 @joshtriplett

@retep998

Member Author

retep998 commented Feb 11, 2019

@liigo That would imply dylib is for linking to Rust dylibs while cdylib is for linking to normal dynamic libraries, which would be excessively confusing. It's already confusing enough as is to get people to use the cdylib crate type, let's not overload that terminology here.

@aturon

Member

aturon commented Feb 14, 2019

@rfcbot reviewed

Marking as "reviewed" to let this go forward, but I'm abstaining on this one.

@rfcbot

rfcbot commented Feb 14, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔
