How to deal with breaking changes on platforms? [BSDs related] #570

Open · semarie opened this issue Apr 7, 2017 · 41 comments · 9 participants

@semarie (Contributor) commented Apr 7, 2017

I am opening an issue on libc because this is where the problems will start to show up. Depending on the solution, changes may be needed in the rustc repository too.

On OpenBSD, we don't mind breaking API/ABI between releases. Once a release is done, its API/ABI is stable, but there is no guarantee it will be compatible with the next release.

Currently, in the upcoming 6.2 version of OpenBSD (6.1-current), there is a breaking change that affects libc: si_addr should be of type void *, not char * (caddr_t). Here is the current definition in libc.

Under OpenBSD, we deal with the ABI in LLVM by using a versioned triple like amd64-unknown-openbsd6.1. Rust instead uses an unversioned platform, so all OpenBSD versions are assumed to share the same ABI (which isn't quite right).

Do you think it would be possible to switch from *-unknown-openbsd to *-unknown-openbsd6.0, *-unknown-openbsd6.1, ... without having to duplicate all the code in libc for each target, and without having to add a new target in rustc for each OpenBSD release?

Any other ideas on how to deal with this?

@alexcrichton (Member) commented Apr 8, 2017

Unfortunately I don't really know how we'd handle this; I just assumed platforms wouldn't do this.

If this happens a lot we'll just need to document what's wrong and stop adding new bindings; it'll be up to crates to implement version compatibility.


@semarie (Contributor) commented Apr 9, 2017

I think it isn't just a "version compatibility" issue. My purpose isn't to have a compatibility layer for missing/removed functions or types.

The problem is that the OpenBSD triple is versioned, meaning that the API/ABI of one version can differ from that of another.

I checked some other systems (running llvm-config --host-target to see whether the triple is versioned or not), and it seems this is a common situation in the non-Linux world:

  • x86_64-apple-darwin16.0.0
  • x86_64-unknown-freebsd12.0
  • x86_64-unknown-freebsd11.0
  • i386-unknown-openbsd5.8
  • x86_64-unknown-netbsd7.99

I also checked in LLVM source tree: the OS version is a part of the triple definition.
see getOSVersion() in include/llvm/ADT/Triple.h.

Maybe a concept is missing in Rust? If target_os_version were available, it would solve the issue: parts that are only defined in some OS versions could be isolated from others.

I don't think this is a problem only on OpenBSD; any OS with a versioned triple could be hit. OpenBSD exposes it first because we make heavy use of the ability to break API/ABI compatibility (it is a way to remove old interfaces for the sake of security).


@alexcrichton (Member) commented Apr 10, 2017

Yeah, there's no concept of a versioned target in rustc right now, and unfortunately we're not really in a position to add one.

Our only recourse is basically to take the subset which currently works across all revisions, put that in libc, and then otherwise let downstream crates bind versions that change over time.


@semarie (Contributor) commented Apr 11, 2017

I hope you are kidding: you are asking to remove the siginfo_t type for OpenBSD from libc, and thereby break stack overflow detection on OpenBSD (libstd relies on it). And even if we could drop stack overflow detection, it wouldn't resolve the underlying problem.

So I am looking to extend Target to include os-version information in the target specification.


@alexcrichton (Member) commented Apr 11, 2017

Well, I'm not really kidding. If we feel we must fix this then we currently have no choice but to not add the bindings. If we don't want to do that then the fix must go elsewhere. I don't know the best way to fix this, just spitballing.


@asomers (Contributor) commented Apr 20, 2017

This isn't just a problem for OpenBSD. FreeBSD 12, when it comes out, will change a number of important types, like ino_t and struct stat. If libc's policy is to only bind the greatest common denominator between versions, then over time it will shrink into irrelevance. Such a policy really just kicks the version-compatibility can down the road.

Would it be possible to generate bindings dynamically at build time? When writing Ruby bindings, I've always preferred that approach to FFI. If not, then I think libc needs a way to distinguish between OS versions, just as it currently distinguishes between OSes.


@alexcrichton (Member) commented Apr 20, 2017

I would personally be afraid of generating bindings at compile time. It just pushes the problem to consumers without giving them tools to deal with it.

I do think this sounds like libc needs a way to distinguish between OS versions, but Rust currently has no real tool for doing so.


@asomers (Contributor) commented May 1, 2017

The problem just got worse. Linux 4.11, released today, added a new system call: statx. Until libc learns to understand versions, it can't add support for statx. I really think that cargo needs some sort of configure step analogous to autoconf's configure.
https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.11-Statx-System-Call


@alexcrichton (Member) commented May 1, 2017

@asomers that's not quite true though; we can add bindings at any time. Rust supports Linux 2.6.18+ and there are a huge number of syscalls bound in libc that aren't present in 2.6.18. It's up to crate authors to pick and choose APIs appropriately for platform compatibility.


@asomers (Contributor) commented May 2, 2017

@alexcrichton I guess I was wrong about how libc's CI tests worked. Are you saying that libc's tests do not flag symbols defined in FFI but not present in the system's headers? If that's true, then a Rust program trying to use statx on Linux <= 4.10 would build but get ENOSYS at runtime, right? That's better than what happens if a Rust program tries to use aio_waitcomplete on FreeBSD 10, where the FFI binding would actually be wrong. But Linux is not immune from changing syscalls, either. The first example I could find was utimensat. Its signature changed in 2010, well after 2.6.18 was released. Any Rust program using libc will try to use the new version of utimensat, even when built for older systems.

Would you consider dynamically generating the bindings for select functions, even if most functions have static bindings? Right now, I don't see any way at all for consumers to deal with the versioning problem.


@alexcrichton (Member) commented May 2, 2017

No, to be clear:

  • The libc crate is basically just a header file.
  • The libc crate is automatically verified on many platforms, but we have exceptions in libc-test/build.rs. It's not guaranteed that the verification passes on every instantiation of a platform.
  • If you use a function from libc you're referencing a symbol.
  • If that symbol doesn't actually exist on your system, you'll get a linker error.

Programs using statx will likely get a linker error and will then have to deal with that appropriately.

I would like to avoid dynamically generating the bindings, as that's typically not the actual solution to this problem. It makes cross-compilation (even just across OS versions) much more difficult.


@asomers (Contributor) commented May 2, 2017

Cross-compilation could be solved by overriding the build script's platform detection. For example, on FreeBSD there's basically only one symbol that a build script would need to detect: __FreeBSD_version. When cross-compiling, cargo could set that in an environment variable and the build script wouldn't try to detect it from the system headers.

But it sounds like you might have something else in mind when you say it's "not the actual solution to this problem". Do you? What is the "actual solution", @alexcrichton?


@alexcrichton (Member) commented May 2, 2017

To me the "actual solution" is precisely what we're doing right now. We list a bunch of symbols and authors need to be vigilant about which ones they use. This does not solve the use case of OpenBSD, however, if there are ABI breaking changes. We may need more than one solution but to me there are too many downsides to dynamically generating an API on Linux at least.


@asomers (Contributor) commented May 2, 2017

Not only does it not solve OpenBSD's use case; it doesn't solve the use case where operating systems make changes that don't break the ABI. Both FreeBSD and Linux occasionally change syscalls and provide backwards compatible syscalls with the old signature and syscall number but a new name. For example, FreeBSD 8's "compat7.shmctl" syscall is identical to FreeBSD 7's "shmctl". Similarly, operating systems make changes to system libraries and provide backwards compatibility by bumping the SHLIB version and providing the old libraries as optional packages.

Currently libc handles neither of these cases. Either the libc binding tracks the new function's signature, which breaks Rust programs at runtime on older versions, or the libc binding stays with the old signature, which breaks Rust programs at runtime on newer versions. It's simply not possible for the current libc to compile correctly on multiple versions of an operating system. Your previously suggested solution is to simply remove a binding whenever the OS changes it. But that would break any crates that use the binding, violate semver, and still result in runtime failures for crates that use the old libc but were built on a new OS.

You suggest that libc's consumers should be responsible for versioning issues, but I don't think that's possible. Let's take stat(2), for example, which will likely change in FreeBSD 12. Suppose that when FreeBSD 12 is released, somebody tries to compile the nix crate on it. The linker will be satisfied that stat is present in libc, but the signature will be totally wrong, so nix will fail at runtime. Cargo won't produce any kind of warning. If I understand you correctly, you suggest that stat should at this point be removed from libc. But that won't fix nix until somebody updates that crate's dependencies, and even then it will only change a runtime failure into a compile time failure. Must the nix developer then write a build script that checks __FreeBSD_version and reimplement all of stat's FFI bindings for FreeBSD 12? That would finally fix the problem. But according to crates.io, libc has 1268 dependent crates, and all of them would have to independently write the same build script and add the same FFI bindings for stat on FreeBSD 12.

Alternatively, libc could assume that all operating systems provide backwards but not forwards compatibility (sorry, OpenBSD). Then it could pick a minimum supported version and always link against that version's shared libraries. Currently Cargo doesn't provide a mechanism to specify an exact shared library version to link against, but that could be added. This would fix all of the runtime failures, but at substantial cost: dependent crates would lack access to new OS features, and both developers and users would have to install the compat library packages. Not only would new features that change APIs be unavailable, but the shared library lock would mean that entirely new functions would be unavailable as well, unlike the current situation where using newly added functions will generate link failures when building on an old OS.

In either case, developers will likely fork libc to update their favorite bindings, resulting in a Balkanization of libc and dependent crates that don't support older OS versions.

I understand that cross-compilation is a really cool feature, but I fear that you're underestimating the severity of this problem. Have you looked into how embedded cross development toolchains work? AFAIK the host system requires full headers for the target. Maybe Rust needs to do the same.


@alexcrichton (Member) commented May 3, 2017

@asomers if you've got a proposal of what to do, I'd recommend writing up an RFC; with so many dependencies, changes such as what I think you're proposing can't be taken lightly.


@semarie (Contributor) commented May 6, 2017

@alexcrichton I pushed a WIP branch to my GitHub repository. I hope the code will be more explicit than my explanation of what I mean by having support for OS versions.

The tree is at https://github.com/semarie/rust/tree/target-os-version . Please note my code isn't working yet.

Basically, it is:

  • extending Target to embed a (possibly empty) os-version string
  • exposing the string as a target_os_version symbol (in the same way as target_os)

It would then be possible to have conditional code gated on an OS version (OpenBSD 6.1 vs OpenBSD 6.0), in the same way we have conditional code gated on an OS name (OpenBSD vs FreeBSD).


@asomers (Contributor) commented May 6, 2017

Good work @semarie. BTW, I've been studying ELF symbol versioning and I think it would be possible to fix libc without modifying Rust itself. Basically, libc would need to grow a bunch of feature flags like "freebsd11+", "freebsd10+", etc, meaning "build code that will work on FreeBSD 11 or greater" and "build code that will work on FreeBSD 10 or greater". Of course, those flags could be conditionalized so they won't appear on other platforms. Then, for every symbol that differs between FreeBSD versions, libc will bind a different version depending on which feature flags are set. The link_name attribute will encode the specific ELF symbol version number used on the oldest OS version chosen. I don't have code yet, but I think this approach will work for FreeBSD and Linux. Does OpenBSD use ELF binaries or is it still using a.out?

Also, I've found several functions in glibc with multiple versions. Linux is not immune from this problem.


@semarie (Contributor) commented May 6, 2017

@asomers OpenBSD uses ELF on all platforms. But ELF symbol versioning doesn't help with breaking changes if the OS itself doesn't use symbol versioning.


@alexcrichton (Member) commented May 8, 2017

@semarie I'd personally recommend writing an RFC before sending that as a PR; I'm sure many others would have comments as well!


@raphaelcohn (Contributor) commented May 17, 2017

This has hit me too - in particular, with changes in Mac OS X major versions. However, a good solution probably isn't to use a version number in the target triple, as there's a distinction to be drawn between libc version and OS version; they do not necessarily go in lockstep. A classic example might be changes to Linux's uapi headers, which don't yet line up with changes in musl, say.

This problem is a general one: incompatible changes in third-party (usually C) library APIs. It needs a good solution within Rust. It's a problem that's heavily compounded by setups that use dynamic libraries (something I've come to see as more trouble than they're worth for secure or robust systems outside of the desktop; in practice, it's far too difficult for most sysadmins to assess whether a security fix to a dynamic library affects more than one running program, so they just go for the nuclear option of a reboot). Using autoconf-like tests or dynamic bindings at runtime is probably the wrong way to solve this in general. Such approaches require too much of the system they run on (execute permissions, presence of compilation tools, headers, etc.). They are also deeply unfriendly to security audits and locked-down systems (e.g. those built entirely from source). autoconf in particular makes the classic mistake of assuming the build system resembles the deployment target; it's always been an absolute beast to get things to cross-compile with it repeatedly, robustly, and consistently. Too many things (e.g. time of day, location of the shell interpreter, absolute sysroot paths) creep into the deployed artifact.

(Semantic versioning does absolutely nothing to solve this; in fact, semantic versioning is a deeply broken concept that's become popular recently. In practice, either a version is compatible or it isn't; semantic versioning is just the upstream author's assessment. One man's inconsequential security fix or minor change is another's nightmare. With large system setups and deployments, I always encourage dev teams to think of only two kinds of version: likely-compatible security fix, and incompatible. Everything incompatible needs to go through the full test cycle before deployment; security fixes can bypass that if urgent, risk vs reward and all that.)

Contributor

raphaelcohn commented May 17, 2017

This has hit me too - in particular, with changes in Mac OS X major versions. However, a good solution probably isn't to use a version number in the target triple, as there's a distinction to be drawn between libc version and OS version; they do not necessarily go in lockstep. A classic example might be changes to Linux's uapi headers, which don't yet line up with changes in musl, say.

This problem is a general one: changes in third party (usually C) library APIs that are incompatible. It needs a good solution within Rust. It's a problem that's heavily compounded by set ups that use dynamic libraries (something I've come to see as more trouble than they're worth for secure or robust systems outside of the desktop. In practice, it's a far too difficult for most sysadmins to assess whether a security fix to a dynamic library affects more than on running program, and so they just go for the nuclear option of a reboot). Using autoconf like tests or dynamic bindings at runtime is probably the wrong way to solve this generally. Such approaches require too much of the system they are on (execute permissions, existence of compilation-associated tools, headers, etc). They are also deeply unfriendly to security audits and locked down systems (eg those built entirely from source). autoconf in particular makes the classic mistake that the build system is similar to deployment; it's always been an absolute beast to get things to cross-compile with it repeatedly, robustly and consistently. Too many things (eg time-of-date, location of shell interpreter, absolute sysroot paths, etc) creep into the deployed solution.

(Semantic versioning does absolutely nothing to solve this; in fact, semantic versioning is a deeply broken concept that's become popular recently. In practice, either a version is compatible or it isn't; semantic versioning is just the upstream author's assessment. One man's inconsequential security fix or minor change is another's nightmare. In practice, with large system setups and deployments, I always encourage dev teams to think of only two kinds of version: likely-to-be-compatible security fix, and incompatible. Everything incompatible needs to go through the full test cycle before deployment. Security fixes can bypass that if urgent; risk vs reward and all that).

comex commented May 25, 2017

Er, is the siginfo.h change in question actually ABI-breaking? char * and void * should have the same memory representation, so I'd expect that change to break the API (for newly compiled C code) but not the ABI.

Though it seems there's a more general problem to be solved here.

Contributor

semarie commented May 25, 2017

@comex yes, the change from char * to void * is just an API break regarding OpenBSD. But it is an uncommittable change in the libc crate without a major version bump.

I started a discussion on internals, and I am working to submit an RFC.

Rufflewind commented Jun 16, 2017

Related: rust-lang/rust#42681

fs::metadata() crashes on FreeBSD 12 due to layout change in stat.h

mattmacy commented Sep 21, 2017

#775

I don't quite understand how Rust initially missed out on OS and ABI versioning quite so badly - deciding to assume that structures and types are immutable over time, or removing key structures from libc altogether. Nonetheless, Rust can conditionally compile based on configuration values; what is stopping this?

Contributor

semarie commented Sep 21, 2017

@mattmacy my understanding of the problem is:

  • build-time configuration adds more complexity for cross-compiling (you need to target a particular OS ABI)
  • cross-compiling is used a lot in Rust infrastructure (for testing, for rustup...), so versioned OS ABIs would mean a potentially high number of new targets to check and binaries to produce, resulting in an infrastructure that is more complex to maintain (it needs to scale)
  • it is only a problem for the BSDs, and they are not a high priority

@semarie semarie changed the title from How to deal with breaking changes on platform ? [OpenBSD] to How to deal with breaking changes on platform ? [BSDs related] Sep 21, 2017

Contributor

asomers commented Sep 21, 2017

@semarie It's not just a BSD problem. Even glibc occasionally bumps the ELF symbol version of a function. Also, the version proliferation problem isn't quite as bad as you're thinking. Using feature flags and symbol versioning, we could build libc for targets like "freebsd10+", "freebsd11+", and "freebsd12+". A project could get pretty good test coverage just by using "freebsd10+".

mattmacy commented Sep 21, 2017

@asomers @semarie This is an artifact of myopic developers who have nothing more than a superficial understanding of the Linux ecosystem post-2010 - Linux is the post-M$ monoculture. This is going to be a pervasive issue over the longer term and thus runs counter to Rust's mission statement as a general systems language. This isn't rocket science and I hope we don't end up needing to go down the Go "replace all the host libs" route.

kev009 commented Sep 21, 2017

@semarie that's not really a nice way to put it: it's been demonstrated to be an issue on "priority" OSes like Linux and OS X. The fact that many things currently work rests on undefined behavior, and this should not fester.

mattmacy commented Sep 22, 2017

@kev009 - Exactly. A big selling point of Rust is its efforts to eliminate UBs. In light of that, no one can really claim it to be "cross platform" until this issue is formally addressed.

Contributor

semarie commented Sep 22, 2017

@asomers I don't understand ELF symbol versioning of functions in depth, particularly how glibc uses it, so I can't comment on that. But the Rust language has syntax to bind a particular extern function (in Rust) to a specific symbol (in the library).

What I know is that some OSes (particularly OpenBSD) have a different policy regarding compatibility between OS versions. In particular, on OpenBSD the ABI can change between OS versions, so a compiler is expected to target a specific OS version. OpenBSD does it only if required (we don't do it just for pleasure), but it happens.

The current case with FreeBSD is interesting. First, it is FreeBSD and not OpenBSD. The user base is larger, and for Rust the platform is tier-2 (OpenBSD is tier-3). The kind of change (widening the underlying ino_t from 32 bits to 64 bits) is problematic because it affects structure ABI and not only function ABI.

FreeBSD provides a thin compatibility layer to help a binary compiled for FreeBSD 11 run on FreeBSD 12, but the intent is to help migration, not to ensure full compatibility (a 64-bit inode isn't representable in a 32-bit structure; the default behaviour on FreeBSD 12 is truncation). I assume OpenBSD doesn't provide such a mechanism specifically to prevent third parties from abusing it (like only using the compat layer for simplicity), and to keep a clean and tested base (a compat layer isn't standard code, so it is less used and less tested).

comex commented Sep 22, 2017

The entire point of ELF symbol versioning is to prevent breaking ABI backwards compatibility, by allowing programs linked against an old version of the library to continue to use the old definition. glibc, as a well-behaved library, would not intentionally break ABI backwards compatibility without bumping the version in the soname, i.e. libc.so.6 to libc.so.7. Given that the transition from 5 to 6 happened in 1997, I don't expect that to happen anytime soon.

macOS also tries to preserve ABI backwards compatibility, and I'd like to hear more about the issues @raphaelcohn was experiencing. (It too migrated from 32-bit to 64-bit inodes once, with 10.5 Leopard in 2007, but the compatibility layer remains present to this day.)

What I suspect is that both people are actually talking about forwards compatibility breakage, not backwards - that is, binaries compiled against new headers won't run on old OSes, which can easily happen even if the code itself doesn't use any functionality specific to the new OS. That's an important issue, but it's different from backwards compatibility breakage, as it's more amenable to solutions that don't require maintaining totally separate builds for the two versions, e.g. using runtime detection.

edit: clarified "once"

Contributor

semarie commented Sep 22, 2017

Here is a summary for OpenBSD: the rule is that compatibility is guaranteed neither backwards nor forwards between OS releases (all releases are considered major release versions).

But in general, it is possible to run a binary from version N-1 on version N; it is only a facility, provided if possible (without too much effort for OpenBSD developers). Older things are just removed (it helps keep the system in a well-tested and audited state, without keeping around old, crufty code paths).

Once released, a specific OS version has an ABI/API stability guarantee. Only security and reliability changes are committed on the branch: these are published as errata.

OpenBSD doesn't use ELF symbol versioning. But libraries (system and packaged third-party) have a strict major/minor bump policy (we currently have a libc.so.90.0 in 6.2-beta, and OpenBSD 6.1 has libc.so.89.3).

Contributor

asomers commented Sep 22, 2017

@comex you are confusing ELF symbol version with soname versioning. Libraries bump their sonames due to major changes, like ncurses did two years ago and glibc did in 1997. But ELF symbol versioning is used to maintain backwards compatibility while making minor changes to the library. Glibc is very conservative in its use of ELF symbol versioning, but it does occasionally bump a symbol version. Look through its sources and you'll find a few examples.

Rust's difficulty comes from its extensive use of FFI and cross-compiling, which cause it to ignore all header files. The problem isn't backwards or forwards compatibility per se. Rather, it's that libc simply cannot express bindings for more than one version of a library. Right now, it simply isn't possible to build a Rust program that uses libc and will run on FreeBSD 12. It doesn't matter what host you build it on. And if libc ever updates its bindings to the FreeBSD 12 ones, then it will no longer be possible to build a Rust program that will work on FreeBSD 11.

mattmacy commented Sep 22, 2017

@asomers As an old timer I can tell you that Linux distros never used symbol versioning with great success, and thus shared libraries were viewed as broken by design. At my workplaces they would turn off automatic RHEL security updates because it frequently crippled all their systems with incompatible symbol resolution. Moving to a frozen ABI is a (healthy) response to their inability to make versioning work. The Rust developers' assumption that everything looks like Linux has led them to overlook the fact that the source contract is not the same as the ABI contract. Building for a given version of the OS assumes that you're using the system headers from that version. FreeBSD will happily run dynamically linked binaries at least as far back as 4.x if not much further. However, if you're building on version X you're expected to use headers for version X, not whatever version you originally chose to copy the definitions from.

comex commented Sep 22, 2017

@comex you are confusing ELF symbol version with soname versioning.

I understand the difference, and I think I stated it correctly in the first paragraph of my last comment, but perhaps my wording was unclear.

My point was that even if glibc occasionally bumps a symbol version, that is not a backwards compatibility break (for old binaries), unlike the situation with - at least - OpenBSD. According to @mattmacy's latest comment, FreeBSD actually doesn't break old binaries either; in that case #42681 is 'just' a mismatch between Rust bindings and the version of the system present in the build sysroot, not the version of the system present at runtime. I think FreeBSD implements this using ELF symbol versioning. If so, Rust could hypothetically choose to compile 11-compatible binaries regardless of whether the sysroot is version 11 or 12, by explicitly specifying a symbol version in the object file rather than leaving it to the linker; it could then add runtime detection to use the FreeBSD 12 symbols if available. I could be wrong about how it works, and in any case I'm not sure whether such a compatibility measure would be desirable. The alternative would be, as @asomers suggested, creating targets like freebsd11+ and freebsd12+, corresponding to the sysroot version.

But in general, I perceive a few differences between backwards-compatibility ABI breaks and forwards-compatibility breaks:

  • With forwards-compatibility breaks, the changes are probably not so far-reaching as to require excessive duplication in bindings that support both the old and the new version. After all, the library's own codebase must contain similar duplication. Admittedly, this is a heuristic, and "excessive" is in the eye of the beholder. As one example, on my Mac, I count 38 libc functions (quite a lot) that have duplicate 32-bit-inode and 64-bit-inode versions, and at least two structs (struct stat, struct statfs). On the other hand, backwards compatibility breaks have no limits on how much they can change, and no reason to care about the difficulty of supporting both the old and new ABI in one binary.

  • Ditto in time rather than space: on systems designed for ABI forwards compatibility, subtle changes like adding new symbol versions probably won't occur too frequently, because it's a pain to deal with them on the implementation side, and the old versions have to be kept around 'forever' (well, at least for some time). On a system that breaks ABI backwards compatibility every version (but preserves API compatibility), someone might decide to fiddle with a struct's layout for 5 versions in a row (because why not?), and an omni-compatible binary would have to contain 5 alternate implementations for everything that uses the struct. On systems that preserve backwards compatibility, in most cases, you're unlikely to see more than two versions of any given symbol.

  • With backwards-compatibility breaks, users do not expect the same binaries to run on both versions, and will generally replace all their binaries when upgrading. Although in most cases it would be possible for Rust to produce binaries that run on both via runtime detection, it may not be that useful, since it wouldn't fit into common user workflows. By contrast, for forwards-compatibility breaks, there's a preexisting concept of "X-or-later compatible" binaries, and a preexisting pattern whereby binary distributions of C programs can choose to maximize their compatibility by building against the oldest possible version - a pattern which Rust can directly emulate, and improve on by adding runtime detection. But this depends on cultural factors too. On, say, macOS, targeting old OS versions is a given, and you can do so while compiling against the latest SDK just by passing an appropriate value to -mmacosx-version-min; the compiler and system headers work together to ensure the output binary is compatible with the specified version. On FreeBSD, from what people have written in this thread, it seems like binary compatibility is seen more as a stopgap measure. There's an open-source vs. proprietary angle here, of course...

  • With forwards compatibility breaks, you can generally query at runtime whether the old and/or new symbols are available, just looking at the symbols themselves. With backwards compatibility breaks, it may be necessary to resort to indirect measures (like the OS version), which are less reliable. As a semi-related example, rustc has a C++ helper for LLVM features that aren't exposed via the stable C API, only the unstable C++ API, and it contains a ton of #if blocks for 'is LLVM version greater than X'. But at one point I tried building against Xcode's LLVM, which was a branch of an older trunk version with newer changes cherry-picked on top. API-wise, it mostly corresponded to an LLVM version different from the one it declared, which confused the heck out of Rust's helper - but changing the declared version wasn't enough, because it wasn't fully consistent with any one version. I imagine you could have similar issues with OS ABIs :)

  • Also, 'always use the old symbol' is always available as a starting point. It's usually not a good ending point; chances are the ABI break happened for a reason. If Rust decided to always target 32-bit inodes on FreeBSD, the resulting binaries would 'work' on all OS versions, but they'd break at runtime when encountering a file whose inode actually can't fit into 32 bits. But there may be cases where that isn't really necessary.

  • As a special case of the above: a system designed for ABI forward compatibility may have API changes that from the ABI's perspective are no change at all, like the void * vs. char * thing earlier in the thread. In this case, Rust should seriously consider always using one or the other.

All that said, even when increased compatibility compared to C is possible, its benefits have to be weighed against fundamental drawbacks. One is the binary bloat of runtime detection. Another is simply the fact of Rust acting differently from C. One of Rust's biggest selling points is that it feels like a native citizen of the OS, a companion to C; it doesn't live in its own world with its own VM or toolchain. Thus, users may justifiably expect the libc crate to work the same way, both API- and ABI-wise, as native C on the same platform.

Then again, Rust does already have its own build system and its own conception of 'target'. In addition, whereas C compilers are generally shipped as operating system components, with each OS vendor responsible for getting everything working on their OS, Rust has rust-lang.org binaries as a major distribution mechanism for the compiler, so it's more desirable to limit the number of separate targets that have to be built and maintained.

mattmacy Sep 23, 2017

@comex Can you distill the implications of your last post?

Yes, Rust has its own notion of a target, and outside of Linux and Windows that notion is wrong. I just want to be able to use Rust across different 3-tuples (OS/ABI/arch) as opposed to the current 2-tuple (OS/arch) with the ABI fixed to whichever release they happened to copy the header values from.


mattmacy Sep 23, 2017

Although I managed to get the jsonrpc_http_server bits to work yesterday by patching some of the dependent ports and then using [patch.crates-io] overrides, today I tried building netmap_sys with user libs, which failed in the gcc crate, presumably because of something in libstd. I tried installing the FreeBSD port, but the cargo in that version wouldn't let me override libc (if that's something only available in nightlies, you really need to make it standard), so although it compiled, the binary didn't actually work. I've lost too much time on this to consider Rust a viable option for any near-term deadlines, and I have concluded that Rust really isn't cross-platform outside of tier 1 targets right now.


comex Sep 23, 2017

@mattmacy From http://doc.crates.io/manifest.html:

Note that the [patch] feature will first become available in Rust 1.21, set to be released on 2017-10-12.

In the meantime you could use [replace].
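For anyone landing here later, a [replace] override in the top-level Cargo.toml looks roughly like this; the git URL, branch, and version below are illustrative placeholders, not a real fork:

```toml
# Pre-1.21 toolchains: [replace] must name the exact version being overridden.
[replace]
"libc:0.2.31" = { git = "https://github.com/example/libc", branch = "netbsd-abi-fixes" }

# From Rust 1.21 onward, [patch] is the preferred mechanism (use one or the other):
# [patch.crates-io]
# libc = { git = "https://github.com/example/libc", branch = "netbsd-abi-fixes" }
```

Unlike [patch], [replace] swaps the dependency unconditionally, so the replacement must be semver-compatible with the version it replaces.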


mattmacy Sep 23, 2017

This is why this oversight really perplexes me. Having the version as an explicit part of the OS target is no secret: it's visible to anyone who has looked at a configure script for any cross-platform OSS program:

 | |                     \-+- 69334 mmacy make DIRPRFX=lib/clang/ all
 | |                       \-+= 69335 mmacy sh -e
 | |                         \-+- 69336 mmacy make all DIRPRFX=lib/clang/libllvm/
 | |                           |-+= 71356 mmacy sh -ev
 | |                           | \-+- 71357 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71358 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71499 mmacy sh -ev
 | |                           | \-+- 71500 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71501 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71769 mmacy sh -ev
 | |                           | \-+- 71770 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71771 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71772 mmacy sh -ev
 | |                           | \-+- 71773 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71774 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71778 mmacy sh -ev
 | |                           | \-+- 71779 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71780 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71781 mmacy sh -ev
 | |                           | \-+- 71782 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71783 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71784 mmacy sh -ev
 | |                           | \-+- 71785 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71786 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           \-+= 71787 mmacy sh -ev
 | |                             \-+- 71788 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                               \--- 71789 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit

kev009 Sep 24, 2017

@comex I think you're starting down the right track now. The current rust libc is trying to do something that is not really tractable, an impedance mismatch with C language calling and linkage conventions. In effect: lazy binding with a static set of bindings. That feels like something you might do in a dynamic language, where you might not care much about correctness or performance and can do runtime introspection or generation to smooth out the issues. So I do feel strongly that the current Rust behavior here is undefined behavior on all platforms, and I think it is unfair to characterize it as a BSD issue. It happens to work widely for now, but it is a major design flaw that just happens to surface on BSD first and will fester as Rust grows.

Let's step back to build-engineering history for a bit. If I were a major ISV, I'd want to support some forward range of OS versions. We'll choose Red Hat EL. So I build my app on EL5, and it will work for probably a very long string of future versions, perhaps with compat packages that maintain an old sover, since the sover ought to be bumped for ABI changes. You can substitute Red Hat EL for Windows or HP-UX or anything else; OS makers usually strive hard to keep that kind of forward compat. There is one exception, OpenBSD, where forward compat is not guaranteed at all and binaries are expected to match the running major.minor. But OpenBSD's choices aren't really an issue for C language authors if you understand the root issue; they just choose no amount of forward compat. You build on the oldest platform you wish to support, for all OSes. As an implementation detail this could be done in chroots, jails, containers, or just plain path, compiler, and linker flag gymnastics for cross building. That's how things have been done for a long time.

Now, there is a second and much less reliable case for building with backwards compat. In a big IDE this might be a hidden build process which looks like the above path/flag munging to target the compiler to the right runtime and linkage. In something like MSVCxx that might be by bundling a runtime environment with the installer. I'm not sure rust should bother with this at the moment, but it's not something I've put a great deal of thought into.

Another thing I haven't put a great deal of thought into is direct usage of syscalls by the language or libraries. Things discussed like soname and sover go out the window with that.


jakllsch Aug 14, 2018

Contributor

I've run into this issue while trying to ascertain the correctness of the libc crate on NetBSD 7.x and 8.x. The current situation has the libc crate using some combination of 7.x and 8.x ABIs/APIs, while the official distribution artifacts for x86_64-unknown-netbsd currently target 7.x. Both NetBSD 7.x and 8.x are currently-supported release series.

I really want a way to express these and future ABIs from a single version of the libc crate, one that can target a specific OS kernel/libc version at compile time. If that is impractical, targeting the oldest currently-supported NetBSD release series might be a reasonable way forward, as there is libc-level and kernel-level backward compatibility for old binaries across many major versions.

