Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly handle dllimport on Windows #27438

Open
alexcrichton opened this issue Jul 31, 2015 · 117 comments
Open

Correctly handle dllimport on Windows #27438

alexcrichton opened this issue Jul 31, 2015 · 117 comments

Comments

@alexcrichton
Copy link
Member

@alexcrichton alexcrichton commented Jul 31, 2015

Currently the compiler makes basically no attempt to correctly use dllimport. As a bit of a refresher, the Windows linker requires that if you're importing symbols from a DLL that they're tagged with dllimport. This helps wire things up correctly at runtime and link-time. To help us out, though, the linker will patch up a few cases where dllimport is missing where it would otherwise be required. If a function in another DLL is linked to without dllimport then the linker will inject a local shim which adds a bit of indirection and runtime overhead but allows the crate to link correctly. For importing constants from other DLLs, however, MSVC linker requires that dllimport is annotated correctly. MinGW linkers can sometimes workaround it (see this commit description.

If we're targeting windows, then the compiler currently puts dllimport on all imported constants from external crates, regardless of whether it's actually being imported from another crate. We rely on the linker fixing up all imports of functions. This ends up meaning that some crates don't link correctly, however (see this comment: #26591 (comment)).

We should fix the compiler's handling of dllimport in a few ways:

  • Functions should be tagged with dllimport where appropriate
  • FFI functions should also be tagged with dllimport where appropriate
  • Constants should not always be tagged with dllimport if they're not actually being imported from a DLL.

I currently have a few thoughts running around in my head for fixing this, but nothing seems plausible enough to push on.

EDIT: Updated as @mati865 requested here.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 11, 2015
This commit leverages the runtime support for DWARF exception info added
in rust-lang#27210 to enable unwinding by default on 64-bit MSVC. This also additionally
adds a few minor fixes here and there in the test harness and such to get
`make check` entirely passing on 64-bit MSVC:

* The invocation of `maketest.py` now works with spaces/quotes in CC
* debuginfo tests are disabled on MSVC
* A link error for librustc was hacked around (see rust-lang#27438)
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 11, 2015
This commit leverages the runtime support for DWARF exception info added
in rust-lang#27210 to enable unwinding by default on 64-bit MSVC. This also additionally
adds a few minor fixes here and there in the test harness and such to get
`make check` entirely passing on 64-bit MSVC:

* The invocation of `maketest.py` now works with spaces/quotes in CC
* debuginfo tests are disabled on MSVC
* A link error for librustc was hacked around (see rust-lang#27438)
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 11, 2015
This commit leverages the runtime support for DWARF exception info added
in rust-lang#27210 to enable unwinding by default on 64-bit MSVC. This also additionally
adds a few minor fixes here and there in the test harness and such to get
`make check` entirely passing on 64-bit MSVC:

* The invocation of `maketest.py` now works with spaces/quotes in CC
* debuginfo tests are disabled on MSVC
* A link error for librustc was hacked around (see rust-lang#27438)
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 12, 2015
This commit leverages the runtime support for DWARF exception info added
in rust-lang#27210 to enable unwinding by default on 64-bit MSVC. This also additionally
adds a few minor fixes here and there in the test harness and such to get
`make check` entirely passing on 64-bit MSVC:

* The invocation of `maketest.py` now works with spaces/quotes in CC
* debuginfo tests are disabled on MSVC
* A link error for librustc was hacked around (see rust-lang#27438)
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 12, 2015
This commit leverages the runtime support for DWARF exception info added
in rust-lang#27210 to enable unwinding by default on 64-bit MSVC. This also additionally
adds a few minor fixes here and there in the test harness and such to get
`make check` entirely passing on 64-bit MSVC:

* The invocation of `maketest.py` now works with spaces/quotes in CC
* debuginfo tests are disabled on MSVC
* A link error for librustc was hacked around (see rust-lang#27438)
@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 12, 2015

So basically, with MSVC toolchain you are supposed to know whether you are going to be linking with a static lib or a dll version of the upstream library when compiling your crate.

Here's some ideas that come to my mind:

  • Add attributes for marking up linking modes of extern crates (e.g. #[link_dll] extern crate foo).
    Cons: since this wart is unique to MSVC toolchain, it is unlikely that crate authors, who are not on Windows, will pay any attention to it. Besides, sometimes you simply can't tell in advance whether you'll need to link with a dll.
  • Don't mark anything as dllimport
    • Code refs: let the linker fix them (extra jmp overhead in the case of dll library),
    • Data refs: avoid imported data altogether. Many Windows libraries take this approach using getter/setter functions where having exported data cannot be avoided. Cons: same as above.
  • Mark all imported symbols as dllimport:
    • Code refs: let the linker fix them (indirect call overhead for static libraries).
    • Data refs: emit extra _imp__foo = &foo symbols into static libraries (linking succeeds, but adds the overhead of indirect access).
  • Since rustc knows which flavor of the library it it going to link, it could take bitcode saved in Rust crates for LTO, add dllimport attribute to the symbols that need it, then re-emit the object file.
    Cons: increases compilation time, won't work for staticlibs (since those are not necessarily linked via rustc)

@alexcrichton
Copy link
Member Author

@alexcrichton alexcrichton commented Aug 12, 2015

I agree that strategies like #[link_dll] probably won't work out so hot, whatever we do probably needs to be baked into the compiler as much as possible instead of requiring annotations.

I'm personally a bit up in the air about how to tackle this. I only know of one absolute failure mode today (#26591) and otherwise the drawbacks today are lots of linker warnings and some extra indirection (not so bad). In that sense it doesn't seem incredibly urgent to tackle this, but it's certainly a wart!

@Diggsey
Copy link
Contributor

@Diggsey Diggsey commented Aug 13, 2015

Here's some links with useful information:
Using dllexport from static libs: https://support.microsoft.com/en-us/kb/141459
dllimport/dllexport details: https://msdn.microsoft.com/en-us/library/aa271769%28v=vs.60%29.aspx
#26591 is not specific to rust: http://blogs.msdn.com/b/oldnewthing/archive/2014/03/21/10509670.aspx

Looks like you can use a module definition file to get the current strategy to at least work in all cases, although it wouldn't remove the unnecessary indirection.

In general it seems impossible to get best performance in all situations, unless you either have two sets of binaries for dynamic vs static linking (as microsoft does with the standard library), or always build from source, and use compiler flags to fine-tune its behaviour. Only cargo really exists at a high enough level to be able to do that automatically.

Is there any danger of actual runtime unsafety here, eg. the msdn article implies that getting this wrong can result in code which uses the address of the constant getting the address of the import table instead?

@alexcrichton
Copy link
Member Author

@alexcrichton alexcrichton commented Aug 13, 2015

@Diggsey that MSDN blog post is actually different than #26591 I believe, handling dllexport to the best of my knowledge is done 100% correctly now that #27416 has landed. The bug you referenced, #26591, has to do with applying dllimport to all imported statics, and if they're not actually imported via a DLL then the linker won't include the object file. It's kinda the same problem, but not precisely.

Is there any danger of actual runtime unsafety here, eg. the msdn article implies that getting this wrong can result in code which uses the address of the constant getting the address of the import table instead?

I'm unaware of any impending danger, but which MSDN article were you referencing here? I'm under the impression that the linker fixes up references to functions automatically, and it apparently also fixes up dllimported statics to not use dllimport if not necessary.

@retep998
Copy link
Member

@retep998 retep998 commented Aug 13, 2015

@alexcrichton When attempting to link to some statics in the CRT directly from Rust, the code would link fine but the generated code was wrong. It would would follow the pointer to the static, but instead of accessing the static it then treated the static as another pointer and would segfault since the static wasn't supposed to be a pointer. I'm not sure whether this was the fault of dllimport or shoddy LLVM code generation or even rustc itself. We'll probably need some make tests covering all the possibilities.

@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 16, 2015

The MS link's behavior to require an object file to be chosen for linking before starting to auto-insert __imp__ stubs for dllimport'ed symbols is pretty strange. I could not find any mention of this on MSDN (which, hopefully, would explain the rationale), but I can confirm that it indeed works this way.
Manually inserting __imp__foo = &foo symbol into the static library, as I proposed in option 3 above, fixed the case of data-only static lib, and seems to have no ill effects for dlls, so maybe that's the way to go?
Here's my test case:


Static library:

__declspec(dllexport) int foo = 1;
int* _imp__foo = &foo;

__declspec(dllexport) int bar() { return 2; }
void* _imp__bar = &bar;

Main executable:

__declspec(dllimport) int foo;
__declspec(dllimport) int bar();

int main()
{
    int f = foo;
    int b = bar();
    printf("%d %d\n", f, b);
    return 0;
}

If you comment out both _imp__foo and _imp__bar, you'll end up with an "unresolved external symbol" link error. Commenting out only one of them makes linking succeed with a warning.

@retep998
Copy link
Member

@retep998 retep998 commented Aug 16, 2015

@vadimcn That doesn't quite solve the issue if I need to link to statics from a library that I don't control, like a system library.

@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 16, 2015

@retep998, Yeah, this only solves the problem for Rust crate writers.
For FFI we still might need some sort of #[dllimport]/#[no_dllimport attribute (depending on which is the default).

@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 16, 2015

Actually, I wonder if we could we use the #[link(kind="...")] attribute as a cue. At first sight, it tells the compiler exactly what it needs to know - whether the library is static or a dll.

@alexcrichton
Copy link
Member Author

@alexcrichton alexcrichton commented Aug 17, 2015

@vadimcn, @retep998 to solve that issue (needed to bootstrap with LLVM) the compiler now has a #[linked_from] attribute for extern blocks like so:

#[linked_from = "foo"]
extern {
    // ...
}

This instructs the compiler that all the items in the block specified come from the native library foo, so dllexport and dllimport can be applied accordingly. The way that foo is linked in is determined by either -l flags or #[link] annotations elsewhere.

Note that #[linked_from] is unstable right now as we'll probably want to think about the design a little more, but it should in theory provide the ability for the compiler to apply dllimport to all native library imports appropriately

@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 17, 2015

@alexcrichton: I am confused about the purpose of #[linked_from=...]. How is it different from #[link(name=...)]?

@alexcrichton
Copy link
Member Author

@alexcrichton alexcrichton commented Aug 17, 2015

There's no connection between #[link], -l, and a set of symbols. As a result we don't actually know where symbols in an extern block came from (e.g. what native library they were linked from). The #[linked_from] attribute serves to provide that connection

@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 17, 2015

There's no connection between #[link], -l, and a set of symbols...

This seems a bit bizarre. #[link]'s can be placed only at extern {} blocks, so it would seem that they are associated with each other. Why did we need a new attribute?

@alexcrichton
Copy link
Member Author

@alexcrichton alexcrichton commented Aug 17, 2015

True, but they're frequently not attached to the functions in question. It's pretty common to have a #[link] on an empty extern block which is actually affecting something somewhere else.

Many of these could probably be fixed with the advent of #[cfg_attr], but it doesn't solve the Cargo problem where most native libraries come from a -l flag to the compiler, where we definitely don't have a connection from an arbitrary -l flag to an extern block and a set of symbols.

@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 21, 2015

Here's my attempt at fixing the problem with data dllimports: vadimcn/rust@d5d7ac5
TODO: do this for windows targets only
Should I also check that the symbol is in reachable?

@alexcrichton
Copy link
Member Author

@alexcrichton alexcrichton commented Aug 21, 2015

@vadimcn isn't that basically just applying dllexport to all items? How would ensuring that __imp_foo exists help with dllimport?

@retep998
Copy link
Member

@retep998 retep998 commented Aug 21, 2015

@alexcrichton I think the idea is that if you dllimport a static but the static is statically linked, and not coming from a DLL, the linker will look for __imp_foo and find it so that it works. Since making __imp_foo exist would only help when dynamically linking rust libraries, shouldn't we already have all the metadata when we link a rust library to determine whether dllimport is needed thus making that __imp_foo thing unnecessary? Really the big issue is telling Rust whether a symbol from a native library needs dllimport or not. When you link a native library on Windows it could have static statics or it could have dynamic statics, and there's no way for Rust to know which it is except through user added annotations.

@alexcrichton
Copy link
Member Author

@alexcrichton alexcrichton commented Aug 21, 2015

Hm yeah I can see how that would solve our linkage problems (#26591), but I wouldn't consider it as closing this issue. There'd still be an extra level of indirection in many cases which we otherwise shouldn't have.

We also unfortunately don't have the information to determine whether to apply dllimport currently. That attribute can only be applied at code-generation time, and when we're generating an object file we don't actually know how it's going to end up getting linked. For example an rlib may be later used to link against an upstream dylib or an upstream rlib. This basically means that at code generation time we don't know the right set of attributes to emit.

For native libraries it'll certainly require user annotations, but that's what I was hoping to possibly stabilize the #[linked_from] attribute with at some point.

@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 21, 2015

@alexcrichton: I think dllexport only works when creating an actual dll. The __imp__ stubs go into the import library, which does not exist in the case of a Rust static library (.rlib).

As you mention above, the determination of whether to apply dllimport must be made at code generation time. So if we want Rust crate linking to Just Work, we are going to have to accept some overhead (unless we use LTO bitcode to do some just-before-linking code generation, as I proposed in option 4. But you didn't seem to like it too much).

For data, there isn't much choice, since marking dllimport is the only case that works for both static and dynamic linking. Fortunately, public data is not common in Rust crates.

For code, we can choose between:

  • not marking with dlimport, and having an extra jmp when linking to a dll, or,
  • always marking with dlimport, and suffering from indirect calls when linking statically (actually, the linker is supposed to be smart enough to re-write these as direct calls + some nop padding, but I haven't seen MS linker actually do that).

@vadimcn
Copy link
Contributor

@vadimcn vadimcn commented Aug 21, 2015

For native libs, I think we should be able to use information from #[link(kind="...")]?

@alexcrichton
Copy link
Member Author

@alexcrichton alexcrichton commented Aug 21, 2015

Ah interesting! So the foo.dll doesn't actually have __imp_foo symbols, just the code in foo.lib? That... would make sense!

I've toyed around with a few ideas to handle our dllimport problem, and we could in theory just start imposing more restrictions on consumers of rust libraries to solve this. Whenever an object file is generated the compiler would need to make a decision about whether it's linking statically or dynamically to upstream dependencies. The compiler knows what formats are available, and the only ambiguous case is when both are available. Once a decision is made, the decision is encoded into the metadata to ensure that future linkage against the upstream library always remains the same.

Thinking this through though in the past I've convinced myself that we'll run into snags. I can't quite recall them at this time, however. In theory though this would enable us to actually properly apply dllimport in all cases.

Also yeah, for native libraries we always precisely know how they're being linked (statically, dynamically, framework, etc), so this isn't a problem in their case. We just need to know what symbols come from what library and that's what #[linked_from] is serving as.

@retep998
Copy link
Member

@retep998 retep998 commented Oct 22, 2015

@alexcrichton

Oh, there is a plugin field in the [lib] section of Cargo.toml. Never noticed that before. Well, I guess that means Cargo does know everything it needs to know. In which case all we'd have to do is forbid mixing of rlib/dylib, make sure the plugin and all its dependencies are built and linked as dylibs, and make sure the dependencies of the binaries are built and linked as rlibs (unless someone decides to do everything as dylib instead because they want to have their own application plugins). Then rustc would know whether upstream dependencies will be linked dylibs or rlibs when doing code gen and dllimport can be applied precisely and correctly.

@retep998
Copy link
Member

@retep998 retep998 commented Oct 27, 2015

Cargo does not currently build dependencies of plugins as dylibs. So Cargo would first have to be modified to build all dependencies of plugins as dylibs. This would result in a shared dependency being built twice, once as an rlib for the binary, and again as a dylib for the plugin. Ideally rustc would be able to do it as a single compilation, only internally duplicating the code generation steps when there's a split between whether to use dllimport or not, but that could potentially be complicated so two separate rustc invocations is the easier way for now. Also -Cprefer-dynamic would be passed when compiling each dylib.

Once Cargo has been modified then rustc can be taught to always use dllimport when referring to an upstream dylib and never use dllimport when referring to an upstream rlib. It would also be nice to store metadata on this so that rustc doesn't later try to link against the wrong thing as a sanity check.

@KindDragon
Copy link

@KindDragon KindDragon commented Jul 12, 2016

JFI GCC suggest always use special C++ visibility attributes https://gcc.gnu.org/wiki/Visibility

#if defined _WIN32 || defined __CYGWIN__
  #ifdef BUILDING_DLL
    #ifdef __GNUC__
      #define DLL_PUBLIC __attribute__ ((dllexport))
    #else
      #define DLL_PUBLIC __declspec(dllexport) // Note: actually gcc seems to also supports this syntax.
    #endif
  #else
    #ifdef __GNUC__
      #define DLL_PUBLIC __attribute__ ((dllimport))
    #else
      #define DLL_PUBLIC __declspec(dllimport) // Note: actually gcc seems to also supports this syntax.
    #endif
  #endif
  #define DLL_LOCAL
#else
  #if __GNUC__ >= 4
    #define DLL_PUBLIC __attribute__ ((visibility ("default")))
    #define DLL_LOCAL  __attribute__ ((visibility ("hidden")))
  #else
    #define DLL_PUBLIC
    #define DLL_LOCAL
  #endif
#endif

@upsuper
Copy link
Contributor

@upsuper upsuper commented Aug 19, 2016

@nikomatsakis

static variables are unusual

FWIW, this is probably not true. Stylo extensively uses static variables to reference string atoms from Gecko side (see this script), so it could potentially be affected. Also, I see tons of warnings like

geckoservo.lib(style.0.o) : warning LNK4217: locally defined symbol ?rubyTextContainer@nsCSSAnonBoxes@@2PEAVnsICSSAnonBoxPseudo@@EA (public: static class nsICSSAnonBoxPseudo * nsCSSAnonBoxes::rubyTextContainer) imported in function _ZN5style19gecko_selector_impl13PseudoElement9from_atom17h219208fbc1c0455dE

when linking Gecko with Servo on Windows.

@retep998
Copy link
Member

@retep998 retep998 commented Aug 19, 2016

@upsuper Those warnings mean that static is being referenced with dllimport despite the static being statically linked in and not actually imported from a DLL. Ideally Rust would know whether to apply dllimport and you wouldn't see those warnings.

@Ericson2314
Copy link
Contributor

@Ericson2314 Ericson2314 commented Aug 23, 2016

rust-lang/rfcs#1717 should fix this? (I didn't see it fixed)

@Aatch
Copy link
Contributor

@Aatch Aatch commented Aug 23, 2016

@Ericson2314 did you mean #31717? Since the issue you mentioned is about changing the precedence of as...

@Ericson2314
Copy link
Contributor

@Ericson2314 Ericson2314 commented Aug 23, 2016

Sorry, meant to link an RFC. Fixed now.

@retep998
Copy link
Member

@retep998 retep998 commented Aug 24, 2016

@Ericson2314 It will fix the situation for FFI. However it doesn't apply to inter-crate linking as Rust still doesn't know whether a dependency will be linked as an rlib or a dylib until link time, when it needs to know it at code generation time.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 18, 2017
On MSVC targets rustc will add symbols prefixed with `_imp_` to LLVM modules to
"emulate" dllexported statics as that workaround is still in place after rust-lang#27438
hasn't been solved otherwise. These statics, however, were getting gc'd by
ThinLTO accidentally which later would cause linking failures.

This commit updates the location we add such symbols to happen just before
codegen to ensure that (a) they're not eliminated by the optimizer and (b) the
optimizer doesn't even worry about them.

Closes rust-lang#45347
@retep998
Copy link
Member

@retep998 retep998 commented Oct 18, 2017

As an extreme solution to the problem of dllimport/dllexport between crates, we can ditch the dylib crate type completely. Thus rust crates will always be statically linked together so we can simply never apply dllimport or dllexport.

FFI currently has this working mostly thanks to kind=dylib applying dllimport and kind=static-nobundle not applying dllimport. We just need kind=static-nobundle to be stabilized (or to replace kind=static wholesale).

alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 18, 2017
On MSVC targets rustc will add symbols prefixed with `_imp_` to LLVM modules to
"emulate" dllexported statics as that workaround is still in place after rust-lang#27438
hasn't been solved otherwise. These statics, however, were getting gc'd by
ThinLTO accidentally which later would cause linking failures.

This commit updates the location we add such symbols to happen just before
codegen to ensure that (a) they're not eliminated by the optimizer and (b) the
optimizer doesn't even worry about them.

Closes rust-lang#45347
bors added a commit that referenced this issue Oct 20, 2017
rustc: Add `_imp_` symbols later in compilation

On MSVC targets rustc will add symbols prefixed with `_imp_` to LLVM modules to
"emulate" dllexported statics as that workaround is still in place after #27438
hasn't been solved otherwise. These statics, however, were getting gc'd by
ThinLTO accidentally which later would cause linking failures.

This commit updates the location we add such symbols to happen just before
codegen to ensure that (a) they're not eliminated by the optimizer and (b) the
optimizer doesn't even worry about them.

Closes #45347
bors added a commit that referenced this issue Oct 20, 2017
rustc: Add `_imp_` symbols later in compilation

On MSVC targets rustc will add symbols prefixed with `_imp_` to LLVM modules to
"emulate" dllexported statics as that workaround is still in place after #27438
hasn't been solved otherwise. These statics, however, were getting gc'd by
ThinLTO accidentally which later would cause linking failures.

This commit updates the location we add such symbols to happen just before
codegen to ensure that (a) they're not eliminated by the optimizer and (b) the
optimizer doesn't even worry about them.

Closes #45347
bors added a commit to rust-lang-ci/rust that referenced this issue Jul 29, 2020
MinGW: enable dllexport/dllimport

Fixes (only when using LLD) rust-lang#50176
Fixes rust-lang#72319

This makes `windows-gnu` on pair with `windows-msvc` when it comes to symbol exporting.
For MinGW it means both good things like correctly working dllimport/dllexport, ability to link with LLD and bad things like rust-lang#27438.

Not sure but maybe this should land behind unstable compiler option (`-Z`) or environment variable?
@mati865

This comment has been hidden.

@nikomatsakis

This comment has been hidden.

@mati865

This comment has been hidden.

@nikomatsakis nikomatsakis changed the title Correctly handle dllimport on Windows/MSVC Correctly handle dllimport on Windows Jul 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet