Define a Rust ABI #600

Closed
steveklabnik opened this issue Jan 20, 2015 · 86 comments
Comments

@steveklabnik
Member

@steveklabnik steveklabnik commented Jan 20, 2015

Right now, Rust has no defined ABI. That may or may not be something we want eventually.

@ranma42

This comment was marked as off-topic.

@nrc
Member

@nrc nrc commented Aug 17, 2016

See #1675 for some motivation for this feature (implementing plugins, that is plugins for Rust programs, not for the compiler).

@genodeftest

@genodeftest genodeftest commented Dec 16, 2016

Another motivation is the ability to ship shared libraries which could be reused by multiple applications on disk (reducing bandwidth usage on update, reducing disk usage) and in RAM (through shared .text pages, reducing RAM usage).

@sdroege

@sdroege sdroege commented Mar 27, 2017

It would also make Linux distributions happier, as it would allow usage of shared libraries. Apart from reducing memory usage in different ways, shared libraries also make handling of security issues (or otherwise important bugs) simpler: you only have to update the code in a single place and rebuild that, instead of having to fix lots of copies of the code in different versions and rebuild everything.

@Arzte

@Arzte Arzte commented Mar 29, 2017

Shared libraries can be cool provided that there is a way to work around similar libraries accomplishing something in different ways, such as openssl & libressl. There could also be value in a way to switch between static & dynamic libraries that can be set by the person compiling the crate (possibly a flag?).

@Conan-Kudo

@Conan-Kudo Conan-Kudo commented Mar 29, 2017

In my opinion, it's very hard to take Rust seriously as a replacement systems programming language if it's not possible in any reasonably sane manner to build applications linking to libraries with a safe, stable ABI.

There are a lot of good reasons for supporting shared libraries, not the least of which is building fully-featured systems for more resource constrained devices (like your average SBC or mobile device). Not having shared libraries really blows up the disk utilization in places where it's not cheap.

@jpakkane describes this really well in his blog post where he conducts an experiment to prove this problem.

@vks

@vks vks commented Mar 29, 2017

In my opinion, it's very hard to take Rust seriously as a replacement systems programming language if it's not possible in any reasonably sane manner to build applications linking to libraries with a safe, stable ABI.

Note that C++ still doesn't have a stable ABI, and it took C decades to get one.

@steveklabnik
Member Author

@steveklabnik steveklabnik commented Mar 29, 2017

It would also make Linux distributions more happy as it would allow usage of shared libraries.

There are a lot of good reasons for supporting shared libraries

(two quotes from two different people) Note that Rust does support shared libraries. What it doesn't support is mixing them from different compiler toolchains. Some Linux distros already do this with Rust; since they have one global rustc version, it works just fine.

@nagisa
Member

@nagisa nagisa commented Mar 29, 2017

Rust also supports exporting functions with stable ABI just fine, as, for example, this shows.
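
As a minimal sketch of what exporting with a stable ABI can look like (the `Point` type and `point_add` function are made up for illustration and are not taken from the linked example): a function declared `extern "C"` uses the platform's C calling convention, and a `#[repr(C)]` struct has a C-compatible field layout, so neither depends on rustc's unspecified default representation.

```rust
/// A plain data type with C-compatible layout: fields are laid out in
/// declaration order using the platform's C rules.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Uses the C calling convention instead of the unspecified Rust one.
/// To export this under the plain symbol name `point_add` from a shared
/// library, one would additionally put a `no_mangle` attribute on it
/// (omitted here so the sketch compiles unchanged in every edition).
pub extern "C" fn point_add(a: Point, b: Point) -> Point {
    Point {
        x: a.x + b.x,
        y: a.y + b.y,
    }
}
```

The limitation raised later in the thread still applies: only types that can be expressed with a C-compatible layout can cross such a boundary directly.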

@sdroege

@sdroege sdroege commented Mar 29, 2017

Note that C++ still doesn't have a stable ABI, and it took C decades to get one.

While that is true, implementations (GCC, clang, MSVC at least) have a (somewhat) defined ABI and it only changes every now and then. With Rust there is no defined ABI at all and things might break in incompatible ways with any change in the compiler, a library you're using or your code, and you can't know when this is the case as the ABI is in no way defined (sure, you could look at the compiler code, but that could also change any moment).

What it doesn't support is mixing them from different compiler toolchains. Some linux distros already do this with Rust; since they have one global rustc version, it works just fine.

The problem is not only about compiler versions but, as written above, about knowing what you can change in your code without breaking your ABI, and about crates generally tracking their ABI in one way or another. Otherwise you can't use shared libraries created from crates in any reasonable way other than recompiling everything (and assuming the ABI changed) whenever something changes.

@cuviper
Member

@cuviper cuviper commented Mar 29, 2017

At least on Linux, everyone has pretty much settled on the Itanium C++ ABI. But even with the compiler locked down, it still requires very careful maintenance by a library author who hopes to export a stable ABI. Check out these KDE policies, for instance.

Rust crates would have many of the same challenges in presenting a stable ABI. Plus I think this is actually compounded by not having the separation between headers and source, so it's harder to actually tell what is reachable code. It's much more than just pub -- any generic code that gets monomorphized in your consumers may have many layers of both public and private calls to make to the original crate's library. And all of that monomorphized code has to remain supported as-is when you're shipping updates to your crate.
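
A small sketch of the reachability problem described above, using made-up names (`checksum`, `checksum_step`): the helper is private, but the generic public function that calls it is instantiated inside each consumer crate, so the consumer's compiled code ends up calling the helper directly.

```rust
// Private helper: invisible in the public API, yet reachable from
// monomorphized copies of `checksum` compiled into downstream crates.
fn checksum_step(acc: u64, byte: u8) -> u64 {
    acc.wrapping_mul(31).wrapping_add(byte as u64)
}

/// Generic public entry point. Each consumer gets its own instantiation
/// for its concrete `T`, and that instantiation calls `checksum_step`.
pub fn checksum<T: AsRef<[u8]>>(data: T) -> u64 {
    data.as_ref()
        .iter()
        .fold(7, |acc, &b| checksum_step(acc, b))
}
```

If this crate were shipped as a shared library, renaming `checksum_step` or changing its signature would break already-compiled consumers even though nothing marked `pub` changed.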

@plietar

@plietar plietar commented Mar 29, 2017

Had a long term crazy idea to solve this.

Define a stable format for MIR and distribute all Rust binaries/shared libraries in that format. Package managers have a post-install step to translate the MIR into executables or .so files. Only the version of the MIR->binary backend "linker" has to be the same, the version of the source->MIR frontend compiler can differ.

Because of monomorphization and unboxed types, you still need to relink all the MIR files when a dependency is updated. Similarly, updating the backend compiler requires all the MIR files to be recompiled.

However, assuming we can push more and more optimisation passes into MIR rather than llvm, the time spent in the backend should be reduced to something acceptable.

If you want to push it even further, keep everything in MIR form and use miri (or a JIT version of it) to run them. Frequently used files can be linked and persisted to disk. And we've just reinvented the JVM/CLR/WebAssembly.

@jpakkane

@jpakkane jpakkane commented Mar 29, 2017

Rust also supports exporting functions with stable ABI just fine, as, for example, this shows.

For the purposes of this discussion that would require that every single Rust crate only exports a C ABI. Which is not going to happen.

@vks

@vks vks commented Mar 29, 2017

While that is true, implementations (GCC, clang, MSVC at least) have a (somewhat) defined ABI and it only changes every now and then.

Try linking to any interface that uses std::string. You cannot mix different gcc versions and clang, because they use incompatible implementations of std::string.

@sdroege

@sdroege sdroege commented Mar 29, 2017

While that is true, implementations (GCC, clang, MSVC at least) have a (somewhat) defined ABI and it only changes every now and then.

Try linking to any interface that uses std::string. You cannot mix different gcc versions and clang, because they use incompatible implementations of std::string.

Somewhat off-topic (we're not talking about C++ here), but as long as you stay in a compatible release series of those there is no problem. And with the correct compiler switches you can also, e.g., build your C++ code with gcc 7 against a library that was built with gcc 4.8 and uses/exposes std::string in its API.

The second part seems unnecessary but nice and useful (and a lot of work), but if the first part were true for Rust, that would be a big improvement already: a defined ABI, which might change for a new release whenever necessary.

@FranklinYu

@FranklinYu FranklinYu commented May 29, 2017

Quote from GCC about ABI compatibility (as a note to myself):

…Versioning gives subsequent releases of library binaries the ability to add new symbols and add functionality, all the while retaining compatibility with the previous releases in the series. Thus, program binaries linked with the initial release of a library binary will still run correctly if the library binary is replaced by carefully-managed subsequent library binaries. This is called forward compatibility. …

@le-jzr

@le-jzr le-jzr commented May 29, 2017

Relying on libraries to correctly maintain binary compatibility is just an easily avoidable safety hazard. What is wrong with just letting the package repository rebuild dependent code? (To be clear, in this scenario using shared libraries is a given. Shared libraries and ABI stability are independent issues.)

And before someone argues about update download size, I must note that differential updates are not that difficult (no pun intended). In fact, if reproducible builds are done well, the resulting binary should be identical unless the library ABI has actually changed.

@Conan-Kudo

@Conan-Kudo Conan-Kudo commented May 29, 2017

Because that assumes rebuilding is cheap. While it might be true for small projects, it's a fairly expensive and crappy process when you have long chains of things in big projects.

And it also makes it impossible to rely on Rust for actual systems programming because pure Rust libraries cannot be relied on for any given period of time.

@le-jzr

@le-jzr le-jzr commented May 29, 2017

And it also makes it impossible to rely on Rust for actual systems programming because pure Rust libraries cannot be relied on for any given period of time.

What exactly can't be relied on? Systems programming is my area of interest and I don't see what you mean.

@jpakkane

@jpakkane jpakkane commented May 29, 2017

To get an understanding of how much rebuilding and interdependencies there actually are in a full blown Linux distro, please read this blog post. It talks about static linking so it is not directly related to this discussion but useful to get a sense of scale.

What is wrong with just letting the package repository rebuild dependent code?

Well, as an example, on Debian there are 2906 packages that depend on GLib 2.0. Many more depend on it indirectly.

@le-jzr

@le-jzr le-jzr commented May 29, 2017

Fair enough. But you don't actually need stable ABI for any of that. The ABI can change between rustc versions, but different builds on the same version are still compatible. So your distro only needs to rebuild stuff whenever they bump rustc version, which I assume is not gonna be often for a typical distro.

@Conan-Kudo

@Conan-Kudo Conan-Kudo commented May 29, 2017

At least in Fedora and Mageia, you'd be wrong about that. We bump Rust almost right after the new version arrives.

@burdges

@burdges burdges commented Jun 6, 2019

We noticed a flaw in the current undefined ABI that results in way too much stack usage, copying, and poor cache performance in rust-random/rand#817

It appears fn(..) -> Large returns Large inside its own stack frame, when the correct behavior would be for the caller to supply a correctly sized buffer.

Issues like this could cause significant performance penalties, especially with const generics à la fn foo<const n: usize>(..) -> [T;n], so they should be fixed before even considering a defined ABI.

@mzabaluev
Contributor

@mzabaluev mzabaluev commented Jun 6, 2019

The new, well-defined, symbol mangling scheme is a big step towards stabilizing the ABI.

Other big items of work necessarily include:

  • The calling conventions (mostly punted to llvm targets).
  • Specification of the data structure layouts (at call boundary at least) for all supported reprs.
  • A stable metadata representation format for generics and inline functions.
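
To illustrate the layout item in the list above, here is a small sketch (the struct names are invented): `#[repr(C)]` pins field order and padding to the platform's C rules, while the default representation leaves both up to the compiler, which is one of the things a stable ABI would have to specify.

```rust
use std::mem::{align_of, size_of};

/// Declaration order is preserved, so padding is inserted after `a`
/// and after `c` on targets where `u32` is 4-byte aligned.
#[allow(dead_code)]
#[repr(C)]
pub struct CLayout {
    a: u8,
    b: u32,
    c: u8,
}

/// Same fields with the default representation: the compiler is free to
/// reorder them, so the size is deliberately not asserted anywhere.
#[allow(dead_code)]
pub struct RustLayout {
    a: u8,
    b: u32,
    c: u8,
}

/// Returns (size, alignment) of the C-layout struct.
pub fn c_layout_report() -> (usize, usize) {
    (size_of::<CLayout>(), align_of::<CLayout>())
}

/// Returns the size the current compiler happens to pick for `RustLayout`.
pub fn rust_layout_size() -> usize {
    size_of::<RustLayout>()
}
```

On common targets such as x86_64, `c_layout_report()` yields a size of 12 and an alignment of 4; the value of `rust_layout_size()` is whatever the compiler in use chooses.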

@Centril
Contributor

@Centril Centril commented Jun 6, 2019

The new, well-defined, symbol mangling scheme is a big step towards stabilizing the ABI.

It should not be interpreted this way. The language team has not been involved in the mangling scheme design and has no current plans to work towards a stable ABI.

@eddyb
Member

@eddyb eddyb commented Jun 6, 2019

@burdges I don't understand what you mean, we lower that as passing *mut Large.
You might be seeing two copies of the return value, but that's just for soundness and requires a MIR optimization that has been planned and worked on (and off) for the past couple of years (i.e. "NRVO").
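
A rough source-level picture of the lowering described here, under the assumption that it helps to spell the out-pointer explicitly (the names `Large`, `make_large` and `make_large_into` are illustrative, and this is not what rustc literally emits): returning a large value by value behaves as if the caller reserved the storage and passed a pointer to it.

```rust
use std::mem::MaybeUninit;

pub struct Large {
    pub data: [u64; 64],
}

/// What the programmer writes: return by value.
pub fn make_large(seed: u64) -> Large {
    Large { data: [seed; 64] }
}

/// Approximately how the call is lowered: the caller owns the slot and
/// the callee writes the result into it.
pub fn make_large_into(out: &mut MaybeUninit<Large>, seed: u64) {
    out.write(Large { data: [seed; 64] });
}

/// Caller side of the explicit form.
pub fn caller() -> u64 {
    let mut slot = MaybeUninit::<Large>::uninit();
    make_large_into(&mut slot, 3);
    // Safety: `make_large_into` fully initialized the slot above.
    let large = unsafe { slot.assume_init() };
    large.data[63]
}
```

The extra copy mentioned above would correspond to building the value in a temporary inside the callee and then copying it into the slot; removing that temporary is the job of the optimization, not of the calling convention.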

@awilfox

@awilfox awilfox commented Aug 29, 2019

How should I, as someone who values tools that use Rust and does not enjoy recompiling over 400(!) libraries every time rustc is bumped, inform the language team that a stable ABI would be important to me?

How should we, a Linux distribution that would like to ship Rust packages, inform the language team that a stable ABI would be important to us? That six week release cycles for Rust mean we have multiple rustc bumps between releases of our distro, which means we're probably going to be sitting on 1.36 for the next six months until we have the time to bump up and rebuild everything multiple times?

I want to like Rust, I want to start writing Rust, but I can't without some form of stability.

@Centril
Contributor

@Centril Centril commented Aug 29, 2019

There are no current plans to introduce anything resembling a wholesale stable ABI either for some new repr(v1) or repr(Rust) and there is active opposition to this within the language team. As such, I'm going to close this super broad wishlist issue.

@Centril Centril closed this Aug 29, 2019
@marmistrz

@marmistrz marmistrz commented Aug 29, 2019

Since a part of the community seems to want stable ABI very badly and the language team is opposed to the idea, maybe there's some way to have the cake and eat it?

If ABI compatibility can be efficiently checked via software, Rust could define ABI revisions. If an ABI change is needed, the revision would simply be bumped - I suppose this wouldn't happen too often because Rust is already bent on stable API compatibility.

While this indeed requires some work, it would mean that the language team can change the ABI whenever they want and the distro packagers can reuse the builds whenever they can, saving disk space, network traffic and electricity.
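
A purely hypothetical sketch of how a packaging tool might use such revisions (nothing like `AbiRevision` exists in rustc today; every name here is invented): each build artifact records the revision it was compiled against, and the tool rebuilds only when the revisions differ.

```rust
/// Hypothetical ABI revision number stamped into a build artifact.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AbiRevision(pub u32);

/// What a packaging tool would do with an existing binary.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    ReuseBinary,
    Rebuild,
}

/// Reuse the artifact only if it was built against exactly the revision
/// the currently installed toolchain uses.
pub fn decide(toolchain: AbiRevision, artifact: AbiRevision) -> Action {
    if toolchain == artifact {
        Action::ReuseBinary
    } else {
        Action::Rebuild
    }
}
```

Under this scheme a compiler release that leaves the revision unchanged would not trigger any rebuilds, which is the saving the proposal is after.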

@burdges

@burdges burdges commented Aug 29, 2019

Is there any reason to expect any two rustc versions to have compatible ABIs?

I'd kinda assume niches get altered in every single recent rustc version. Also, the current ABI has serious problems like lacking NRVO. We should not impose any friction on these improvements.
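
For readers unfamiliar with niches, a short sketch (the `Nested` enum is invented for illustration): a niche is an unused bit pattern that the compiler uses to store an enum discriminant for free. A few cases are documented guarantees, while the rest is an implementation detail that may differ between compiler versions.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// These size equalities are documented guarantees for `Option`:
/// the `None` case is stored in the null or zero bit pattern.
pub fn guaranteed_niches_hold() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
        && size_of::<Option<Box<u8>>>() == size_of::<Box<u8>>()
        && size_of::<Option<NonZeroU32>>() == size_of::<u32>()
}

/// A type whose niche usage is not specified anywhere.
#[allow(dead_code)]
pub enum Nested {
    A(bool),
    B,
    C,
}

/// The result depends on how the compiler in use packs discriminants,
/// so callers should not rely on any particular value.
pub fn unspecified_size() -> usize {
    size_of::<Option<Nested>>()
}
```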

In the longer run, there are afaik no great proposals for optimized dynamic linking in modern languages like Rust, OCaml, Haskell, or even C++, so expect 5+ years of radical ABI churn whenever people really take an interest in that problem.

@whitequark
Member

@whitequark whitequark commented Aug 29, 2019

In the longer run, there are afaik no great proposals for optimized dynamic linking in modern languages like Rust, OCaml, Haskell, or even C++, so expect 5+ years of radical ABI churn whenever people really take an interest in that problem.

OCaml allows dynamically loading and linking essentially arbitrary code, with the only restriction being that every module the plugin depends on has to be either (a) inside the plugin, or (b) inside the host application, and the hash of the module interface has to match. This preserves type safety.

However, OCaml's dynamic loading mechanism is in a position that's significantly easier to handle than Rust's because OCaml doesn't have monomorphization and it has a uniform value representation, and as a result dynamic loading in OCaml is not mutually exclusive with full type erasure; the types are only (indirectly) present in the interface hash.

@sdroege

@sdroege sdroege commented Aug 29, 2019

(Not speaking for the Rust team) Instead of adding more comments to this issue, it seems more likely to lead to results if the different, orthogonal issues discussed here (all of which relate to a Rust ABI in one way or another) were split into separate issues.

Many of them can be solved without first defining a stable Rust ABI, and can be tackled independently of that. Hopefully that would, at the same time, bring us closer to actually being able to work on the general "Rust ABI" problem in the future.

@mathstuf

@mathstuf mathstuf commented Aug 30, 2019

For clarity, is this:

and there is active opposition to this within the language team

a "Rust should never have a stable ABI" or a "we're still nowhere near ready to make a stable ABI" kind of opposition?

@Centril
Contributor

@Centril Centril commented Aug 30, 2019

For clarity, is this:

and there is active opposition to this within the language team

a "Rust should never have a stable ABI" or a "we're still nowhere near ready to make a stable ABI" kind of opposition?

It is my opinion (as a language team member but not speaking for the team as a whole) that Rust should never have a stable ABI for the default representation repr(Rust). It may or may not be a good idea to add a repr(v1) that is strictly opt-in but that has serious technical challenges (e.g. generics) and is in any case not a realistic goal or priority for the next years. I would also not like to see repr(v1) used in the standard library if we would ever add it.

@eddyb
Member

@eddyb eddyb commented Aug 31, 2019

@burdges The current calling convention is RVO-oriented (except for things like returning the variants of Result separately, I guess).
When we say NRVO, we mean more like "removing copies within a function by writing to the destination directly", which is an optimization not requiring ABI changes (modulo the Result thing which is a rather advanced transformation).


Also, if anyone wants to see my informal take on this: https://twitter.com/eddyb_r/status/1166953126928277505

It's close to "we're still nowhere near ready to make a stable ABI" but also IMO, a lot of the stuff around the idea of a "stable ABI" is not thought through that well, or even outright misguided.

If you start from regarding C as an utter failure on pretty much all fronts other than its popularity, you might find better ways to do things.

But yeah it's far off, likely involving programming languages and tooling very different from what we're used to, especially in the systems programming / low-level areas.

@Diggsey
Contributor

@Diggsey Diggsey commented Aug 31, 2019

@eddyb

So you would still be restricted to monomorphic declarations? What's the advantage over some sort of interop library using the C ABI that's provided as source?

With an interop library you always need a "serialization" and "deserialization" step to convert your types to some #[repr(C)] type, a fair bit of unsafe code, and you probably need some kind of procedural macro system to define your interface in a way that is not too painful.

If we had a #[repr(MoreThanC)] or something, that extends #[repr(C)] with support for more types (like Vecs, Strings) then you effectively solve the "plugin system" use-case: plugin interfaces typically involve only simple types anyway because you have to keep them stable, and this would make it faster, safer and simpler than if you had to use an interop library.
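
A minimal sketch of the conversion step an interop library needs today, assuming both sides share the same global allocator (the `RawBytes` type and both functions are invented for illustration): a `Vec<u8>` is taken apart into a `#[repr(C)]` triple on one side and reassembled on the other.

```rust
use std::mem::ManuallyDrop;

/// C-compatible description of a byte vector's heap allocation.
#[repr(C)]
pub struct RawBytes {
    pub ptr: *mut u8,
    pub len: usize,
    pub cap: usize,
}

/// Give up ownership of the vector without freeing its allocation.
pub fn into_raw(v: Vec<u8>) -> RawBytes {
    let mut v = ManuallyDrop::new(v);
    RawBytes {
        ptr: v.as_mut_ptr(),
        len: v.len(),
        cap: v.capacity(),
    }
}

/// Rebuild the vector.
///
/// # Safety
/// `raw` must come from `into_raw`, must be used at most once, and the
/// caller must be using the same allocator as the code that created it.
pub unsafe fn from_raw(raw: RawBytes) -> Vec<u8> {
    unsafe { Vec::from_raw_parts(raw.ptr, raw.len, raw.cap) }
}
```

The unsafe block and the allocator assumption are exactly the kind of burden the comment above would like a richer representation to remove.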

@eddyb
Member

@eddyb eddyb commented Aug 31, 2019

@Diggsey You could probably prototype parts of the Swift approach (i.e. adding a bit of indirection to hide potential differences, while trying to minimize overhead) outside of the language.

You don't need full "serialization", just enough to provide access without relying on any assumptions.

For example, proc_macro::bridge::buffer contains minimal (e.g. T: Copy-only) versions of &[T] and Vec<T> that can be safely passed between two Rust "worlds" (potentially compiled by incompatible-ABI compilers and using incompatible global allocators), without significant overhead (e.g. when extending the Vec you only need to do a dynamic call once you run out of capacity).
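
The idea can be sketched as follows. This is a simplified, byte-only illustration written for this thread, not the actual `proc_macro::bridge::buffer` code; the names `reserve_buf` and `drop_buf` are made up. The buffer carries function pointers to the allocating side's grow and free routines, so the other side never touches an allocator directly and only makes an indirect call when capacity runs out.

```rust
use std::mem::{self, ManuallyDrop};
use std::slice;

#[repr(C)]
pub struct Buffer {
    data: *mut u8,
    len: usize,
    capacity: usize,
    // Callbacks from the "world" that allocated the buffer.
    reserve: extern "C" fn(Buffer, usize) -> Buffer,
    drop: extern "C" fn(Buffer),
}

impl Buffer {
    pub fn new() -> Self {
        Self::from_vec(Vec::new())
    }

    fn from_vec(v: Vec<u8>) -> Self {
        let mut v = ManuallyDrop::new(v);
        Buffer {
            data: v.as_mut_ptr(),
            len: v.len(),
            capacity: v.capacity(),
            reserve: reserve_buf,
            drop: drop_buf,
        }
    }

    /// Only sound in the world that created the buffer.
    unsafe fn into_vec(self) -> Vec<u8> {
        let b = ManuallyDrop::new(self);
        unsafe { Vec::from_raw_parts(b.data, b.len, b.capacity) }
    }

    pub fn as_slice(&self) -> &[u8] {
        unsafe { slice::from_raw_parts(self.data, self.len) }
    }

    pub fn push(&mut self, byte: u8) {
        if self.len == self.capacity {
            // The only dynamic call: ask the owning side for more room.
            let b = mem::replace(self, Buffer::new());
            *self = (b.reserve)(b, 1);
        }
        unsafe { *self.data.add(self.len) = byte };
        self.len += 1;
    }
}

impl Drop for Buffer {
    fn drop(&mut self) {
        // Hand the allocation back to the side that knows how to free it.
        let b = mem::replace(self, Buffer::new());
        (b.drop)(b);
    }
}

extern "C" fn reserve_buf(b: Buffer, additional: usize) -> Buffer {
    let mut v = unsafe { b.into_vec() };
    v.reserve(additional);
    Buffer::from_vec(v)
}

extern "C" fn drop_buf(b: Buffer) {
    drop(unsafe { b.into_vec() });
}
```

Pushing bytes in a loop touches the callbacks only on the occasions when the vector has to grow; every other push is a plain write through the pointer.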

We don't want to bake anything like stable ABIs into the standard library because that puts hard limits on what we can do with the implementation, whereas any experiments in the ecosystem could thrive and go through many iterations, with just proc macros and traits.


Since you mention serialization, here's an analogy: a stable ABI being used by the standard library is like a stable serialization framework that's baked into every single type supporting it and you can't version it.

There's only one way data is represented in memory, and Rust is already having trouble taking advantage of that representation, due to the compilation model.
So I'd rather move in the direction of delaying that choice of representation while also giving people who really need it more fine-grained control over custom representations (e.g. bit-packing).

@cuviper
Member

@cuviper cuviper commented Aug 31, 2019

a stable ABI being used by the standard library is like a stable serialization framework that's baked into every single type supporting it and you can't version it.

GCC had to deal with this for C++11, and they ended up forking stuff with ABI tags.
https://developers.redhat.com/blog/2015/02/05/gcc5-and-the-c11-abi/

I mention this as a cautionary tale.

@burdges

@burdges burdges commented Sep 1, 2019

Interesting, it sounds like "Define a Rust ABI" actually means roughly two-ish things:

If you want broader dynamic linking, then you pass some dyn(v1) Trait across the interface. It'd require stable trait-based interfaces for the data structures, but this brings other benefits too. I'd think this might facilitate doing some large GUI toolkit in Rust, for example.

If you want a "Rust OS", then you want some #[repr(redox-v1)] that constrains type parameters, probably disallows type generics, but lifetimes definitely work, and const generics might work, maybe at the cost of making them untestable in where clauses.

There is an extended version of this second form where you specify #[repr(wasm-v1)] to write directly into some form that ensures compatibility even with some virtual machine.

I'd think #[repr(redox-v1)] would reduce the depth of type erasure required to exploit dyn(v1) Trait, so maybe that's logically first.

@Serentty

@Serentty Serentty commented Oct 20, 2019

Given that Rust already lets you switch between a bunch of different ABIs with repr and extern as it is, I don't see the harm in simply adding one that freezes the current unstable ABI as a stable one that you would have to explicitly ask for, while the unstable one would remain the default. In the future, if there's a better design, just add that one as a selectable ABI as well. I don't think it's the right course of action to wait until we've figured out the “ABI to end all ABIs” before we offer even a single Rust-specific ABI, especially considering that it looks like work on coming up with that perfect ABI isn't happening right now anyway.

Also consider that since most functions are only crate-internal, the issue of making sure the ABI is a good one isn't as pressing as C++, where everything is public by default.

@eddyb
Member

@eddyb eddyb commented Oct 20, 2019

@Serentty Besides other reasons, the current rules can't be frozen because they don't exist.
You can't version implementation-defined behavior without copying the entirety of the implementation, it's not even implementation-specified, and ideally it should be specified at least by an RFC.

If an RFC for a specific ABI is presented, with details for all supported targets, that might work out.
But why would you bother with that when repr(C) and extern "C" exist?

Even if you have such an ABI, it won't be used by libstd types, just like the C one isn't.
(that's one of other reasons I alluded to above, probably the main one)

@Ixrec
Contributor

@Ixrec Ixrec commented Oct 20, 2019

(looks like eddyb and I were writing these replies at the same time)

I don't see the harm in simply adding one that freezes the current unstable ABI as a stable one that you would have to explicitly ask for, while the unstable one would remain the default

This expresses a common misunderstanding that "the current unstable Rust ABI" is something that is already fully implemented in a well-defined, well-understood way that we know how to support to everyone's satisfaction.

A large chunk of the work here is achieving consensus on what a "Rust ABI" is even supposed to be, and that whatever it's supposed to be would even be a desirable thing to stabilize. There are already loads of comments in this thread expressing disagreement over what it is or reasons why it's undesirable to ever stabilize any version of it.

@Serentty

@Serentty Serentty commented Oct 20, 2019

Okay, so if there's no rigid documentation for how the current ABI works, then that is indeed a problem that prevents it from being frozen.

If an RFC for a specific ABI is presented, with details for all supported targets, that might work out.
But why would you bother with that when repr(C) and extern "C" exist?

Because the C ABI is still quite limited. Not supporting trait objects is a fairly big hurdle, for example.
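
To make the hurdle concrete, here is a sketch of the manual workaround, with invented names (`Plugin`, `FfiPlugin`, `FortyTwo`): since `dyn Trait` has no specified layout, the usual approach is to hand-write the equivalent of a vtable as `extern "C"` function pointers next to an opaque data pointer.

```rust
use std::ffi::c_void;

pub trait Plugin {
    fn answer(&self) -> u32;
}

/// C-compatible stand-in for `Box<dyn Plugin>`: an opaque pointer plus
/// the functions that know the concrete type behind it.
#[repr(C)]
pub struct FfiPlugin {
    data: *mut c_void,
    answer: unsafe extern "C" fn(*const c_void) -> u32,
    destroy: unsafe extern "C" fn(*mut c_void),
}

unsafe extern "C" fn answer_impl<T: Plugin>(data: *const c_void) -> u32 {
    unsafe { (*(data as *const T)).answer() }
}

unsafe extern "C" fn destroy_impl<T: Plugin>(data: *mut c_void) {
    unsafe { drop(Box::from_raw(data as *mut T)) }
}

impl FfiPlugin {
    pub fn new<T: Plugin>(value: T) -> Self {
        FfiPlugin {
            data: Box::into_raw(Box::new(value)) as *mut c_void,
            answer: answer_impl::<T>,
            destroy: destroy_impl::<T>,
        }
    }

    pub fn answer(&self) -> u32 {
        // Safety: `data` was created from a `Box<T>` matching `answer`.
        unsafe { (self.answer)(self.data) }
    }
}

impl Drop for FfiPlugin {
    fn drop(&mut self) {
        // Safety: `data` is still owned by this value and freed once.
        unsafe { (self.destroy)(self.data) }
    }
}

/// Example implementation used to exercise the wrapper.
pub struct FortyTwo;

impl Plugin for FortyTwo {
    fn answer(&self) -> u32 {
        42
    }
}
```

Every method of the trait needs its own shim and function pointer, which is the boilerplate that makes the C ABI feel limited for this use case.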

@Serentty

@Serentty Serentty commented Oct 20, 2019

I just heard from a friend that trait pointers are currently an unstable feature in the C ABI. That actually brings it close to what I would want in a Rust ABI anyway. The only thing really left bothering me is that the standard library can't easily be shipped separately from the executable, which would be a real bandwidth-saver over CDNs, but because of generics that's probably not entirely possible anyway.

@mzabaluev
Contributor

@mzabaluev mzabaluev commented Nov 9, 2019

For reference, a blog post by @Gankra: How Swift Achieved Dynamic Linking Where Rust Couldn't
