How to avoid complicated coordinated upgrades
Latest commit 2eb7391 Jul 10, 2017 @dtolnay committed on GitHub Merge pull request #1 from gyscos/patch-1
Do not apply breaking change in libc 0.2.1


The semver trick

The semver trick refers to publishing a breaking change to a Rust library without requiring a coordinated upgrade across its downstream dependency graph. The trick is built around having one version of your library declare a dependency on a newer version of the same library.

Illustrative example

The Rust library ecosystem has a history of traumatic library upgrades. The upgrade of libc from 0.1 to 0.2 is known as the "libcpocalypse". Another frequent culprit was pre-1.0 Serde, with the upgrades from 0.7 to 0.8 to 0.9 to 1.0 requiring ecosystem-wide effort.

The cause of the difficulty was the large number of crates using types from these libraries in their public API.

By way of example, consider a simplified version of the libc crate that exposes only two things: the c_void type and the EVFILT_AIO constant from NetBSD.

// libc 0.2.0

pub type c_void = /* it's complicated */;

pub const EVFILT_AIO: int32_t = 2;

The c_void type becomes widely used as hundreds of libraries want to expose functions that are ABI-compatible with C's void * type. Meanwhile the EVFILT_AIO constant is less commonly used and never in the public API of downstream crates.

extern {
    // Usable from C as:
    //
    //    void qsort(void *base,
    //               size_t nitems,
    //               size_t size,
    //               int (*compar)(const void *, const void*));
    //
    // The `c_void` type is now part of the public API of this crate.
    pub fn qsort(base: *mut c_void,
                 nitems: usize,
                 size: usize,
                 compar: Option<unsafe extern fn(*const c_void, *const c_void) -> c_int>);
}

After some time, it is discovered that EVFILT_AIO should have been defined as uint32_t rather than int32_t to match how it is used elsewhere in NetBSD header files (rust-lang/libc#506).

This fix would be a breaking change to the libc crate. Existing code that passes libc::EVFILT_AIO to a function accepting an argument of type int32_t would be broken, and under semver this needs to be reflected in the version number of the libc crate.
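To make the breakage concrete, here is a minimal sketch. The modules `before_fix` and `after_fix` and the function `register_filter` are hypothetical stand-ins invented for illustration, and Rust's `i32` and `u32` stand in for `int32_t` and `uint32_t`.

```rust
// Hypothetical stand-in for libc before the fix.
mod before_fix {
    pub const EVFILT_AIO: i32 = 2;
}

// Hypothetical stand-in for libc after the fix.
mod after_fix {
    pub const EVFILT_AIO: u32 = 2;
}

// Hypothetical downstream function that takes the filter as a signed integer.
fn register_filter(filter: i32) -> i32 {
    filter
}

fn with_old_definition() -> i32 {
    // Compiles as written: the constant is already an i32.
    register_filter(before_fix::EVFILT_AIO)
}

fn with_new_definition() -> i32 {
    // Passing `after_fix::EVFILT_AIO` directly would be rejected with a
    // mismatched-types error, so the caller is forced to edit its code.
    register_filter(after_fix::EVFILT_AIO as i32)
}
```

The need for the explicit cast in `with_new_definition` is what makes the fix a breaking change for existing callers.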

Here is where things go wrong.

Coordinated upgrades

Suppose we make the fix and publish it as a breaking change.

// libc 0.3.0

pub type c_void = /* it's complicated */;

pub const EVFILT_AIO: uint32_t = 2;

Despite the fact that the definition of c_void has not changed, technically the c_void from libc 0.2 and the c_void from libc 0.3 are different types. In Rust (as in C, for that matter), two structs are not interchangeable just because they look the same.

That means if crate A depends on crate B which depends on libc, and B uses c_void in the public API of some function called by A, then A cannot upgrade to libc 0.3 until B has upgraded to libc 0.3. If A upgrades before B, then A will try to pass libc 0.3's c_void to a function in B that still expects libc 0.2's c_void, and A will not compile.
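The situation can be simulated inside a single file. In this sketch the modules `libc_0_2`, `libc_0_3` and `b` are hypothetical stand-ins for separately published crates, and the struct body is a placeholder rather than the real definition of c_void.

```rust
use std::any::TypeId;

// Hypothetical stand-ins for two semver-incompatible releases of libc.
// The definitions are textually identical but are separate items.
mod libc_0_2 {
    #[allow(non_camel_case_types)]
    pub struct c_void {
        _private: u8,
    }
}

mod libc_0_3 {
    #[allow(non_camel_case_types)]
    pub struct c_void {
        _private: u8,
    }
}

// Stand-in for crate B, which has not upgraded yet.
mod b {
    use super::libc_0_2::c_void;

    pub fn takes_pointer(ptr: *mut c_void) -> bool {
        ptr.is_null()
    }
}

// Returns true only if the compiler treats T and U as one and the same type.
fn same_type<T: 'static, U: 'static>() -> bool {
    TypeId::of::<T>() == TypeId::of::<U>()
}

// Stand-in for crate A while it still uses libc 0.2: this compiles.
fn a_on_libc_0_2() -> bool {
    b::takes_pointer(std::ptr::null_mut::<libc_0_2::c_void>())
}

// Once A switches to libc 0.3, the equivalent call is a type error:
//
//     b::takes_pointer(std::ptr::null_mut::<libc_0_3::c_void>());
//     // error: mismatched types
```

Here `same_type::<libc_0_2::c_void, libc_0_3::c_void>()` evaluates to false, which is the type mismatch that blocks A from upgrading first.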

What needs to happen is first B upgrades to libc 0.3, releases this as a major version bump of B (because its public API has changed in a breaking way), and then A may upgrade to the new version of B.

For longer dependency chains this is a huge ordeal and requires coordinated effort across dozens of developers. During the most recent libcpocalypse, Servo found themselves coordinating an upgrade of 52 libraries over a period of three months (servo/servo#8608).

The trick

At the heart of the problem is having a widely used API caught up in the breakage of a much less widely used API. Rust and Cargo are capable of handling this predicament in a better way.

All we need is one modification to the c_void / EVFILT_AIO example from above.

After making the breaking change and publishing it as libc 0.3.0, we release one final minor version of the 0.2 series and re-export the unchanged API(s) from 0.3.

In Cargo.toml:

[package]
name = "libc"
version = "0.2.1"

[dependencies]
libc = "0.3"  # future version of itself

And in lib.rs:

// libc 0.2.1

extern crate libc; // this pulls in 0.3 as per Cargo.toml

pub use libc::c_void;

pub const EVFILT_AIO: int32_t = 2;

This way we avoid the problem of having two c_void types that look the same but are not interchangeable. Here the c_void from libc 0.2.1 and the c_void from libc 0.3.0 are precisely the same type.
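The effect of the re-export can be checked in a single-file sketch. The modules `libc_0_3_0`, `libc_0_2_1` and `b` are hypothetical stand-ins for separately published crates, and the struct body is a placeholder rather than the real definition of c_void.

```rust
use std::any::TypeId;

// Hypothetical stand-in for libc 0.3.0, which owns the real definition.
mod libc_0_3_0 {
    #[allow(non_camel_case_types)]
    pub struct c_void {
        _private: u8,
    }

    pub const EVFILT_AIO: u32 = 2;
}

// Hypothetical stand-in for libc 0.2.1, which depends on 0.3.
mod libc_0_2_1 {
    // Re-export instead of redefining: this is the same type, not a copy.
    pub use super::libc_0_3_0::c_void;

    // The constant keeps its old type so users of 0.2 are not broken.
    pub const EVFILT_AIO: i32 = 2;
}

// Stand-in for crate B, still on the 0.2 series.
mod b {
    use super::libc_0_2_1::c_void;

    pub fn takes_pointer(ptr: *mut c_void) -> bool {
        ptr.is_null()
    }
}

// Returns true only if the compiler treats T and U as one and the same type.
fn same_type<T: 'static, U: 'static>() -> bool {
    TypeId::of::<T>() == TypeId::of::<U>()
}

// Stand-in for crate A, already upgraded to 0.3: the call type-checks
// even though B has not upgraded.
fn a_on_libc_0_3() -> bool {
    b::takes_pointer(std::ptr::null_mut::<libc_0_3_0::c_void>())
}
```

In this sketch `same_type::<libc_0_2_1::c_void, libc_0_3_0::c_void>()` evaluates to true, while the two EVFILT_AIO constants keep their different integer types.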

The libcpocalypse scenario is averted because users of libc can upgrade from 0.2 to 0.3 at their leisure, in any order, without needing to bump their own semver major version.

Advanced trickery

With some care and creativity, the technique above can be generalized to lots of different breaking change situations. The semver-trick example crate included in this repo demonstrates some types of changes that can be accommodated.

Limitations

This is not the silver bullet that solves all occurrences of dependency hell.

Fundamentally the semver trick is beneficial when a crate needs to break a rarely used API while leaving widely used APIs unchanged, or when a crate wants to shuffle types around in its module hierarchy.
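As a rough sketch of the module-shuffling case, suppose a type moves from the crate root into a submodule in the new major version. The crate names `mylib_1` and `mylib_2`, the `http` module and the `Request` type are all hypothetical, and modules stand in for separately published crates.

```rust
// Hypothetical new major version: `Request` now lives in an `http` module.
mod mylib_2 {
    pub mod http {
        pub struct Request {
            pub path: String,
        }
    }
}

// Hypothetical final release of the old series, depending on the new one.
// The type stays reachable at its old location, the crate root.
mod mylib_1 {
    pub use super::mylib_2::http::Request;
}

// Downstream code written against the old path.
fn old_style(path: &str) -> mylib_1::Request {
    mylib_1::Request {
        path: path.to_owned(),
    }
}

// Downstream code written against the new path accepts the same value.
fn new_style(req: mylib_2::http::Request) -> String {
    req.path
}
```

A value built through the old path can be handed to code that uses the new path, as in `new_style(old_style("/index"))`, because both paths name one type.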

Most other types of breakage are not helped by this trick. A concrete example would be adding a new method to a widely used trait in your library.
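One way to see why, sketched under the assumption that the new method has no default implementation. The trait `Shape`, the modules `shapes_1` and `shapes_2`, and the type `Square` are hypothetical. If the old release re-exported the new trait, existing implementations would stop compiling because they lack the new method; if the old release keeps its own trait, the two traits are unrelated.

```rust
// Hypothetical old release of a library with a widely implemented trait.
mod shapes_1 {
    pub trait Shape {
        fn area(&self) -> f64;
    }
}

// Hypothetical new release: the same trait plus one new required method.
mod shapes_2 {
    pub trait Shape {
        fn area(&self) -> f64;
        // The breaking change.
        fn perimeter(&self) -> f64;
    }
}

// Downstream type written against the old trait; it has no `perimeter`.
struct Square(f64);

impl shapes_1::Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

fn old_consumer<T: shapes_1::Shape>(shape: &T) -> f64 {
    shape.area()
}

#[allow(dead_code)]
fn new_consumer<T: shapes_2::Shape>(shape: &T) -> f64 {
    shape.area() + shape.perimeter()
}

// `new_consumer(&Square(2.0))` would not compile: `Square` implements
// `shapes_1::Shape` only, and that does not satisfy a `shapes_2::Shape` bound.
```

Either way, every downstream implementation has to change before it works with the new trait, so the upgrade still has to be coordinated.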

Other tricks

Where the semver trick is not applicable, it can be possible to mitigate the impact of breaking changes in other ways. The Serde legacy shims are one example of this.

License

To the extent that it constitutes copyrightable work, the idea of depending on a future version of the same library is licensed under the CC0 1.0 Universal license (LICENSE-CC0) and may be used without attribution.

This document and the accompanying semver-trick example crate are licensed under either of Apache License, Version 2.0 (LICENSE-APACHE) or MIT license (LICENSE-MIT) at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this codebase by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.