Guide on binary compatibility for library authors #881

Merged on Jan 9, 2018 (13 commits)

jatcwang (Contributor) commented Sep 11, 2017

Added guide on binary compatibility, with library authors as the target audience. Addresses #621

After building you can see the guide on http://localhost:4000/tutorials/binary-compatibility-for-library-authors.html

TODOs:

  • Able to navigate to the actual guide from the homepage
  • "Designing for Evolution" section - I'm setting up an example repo to present this information with more detail (and actual code examples that show the breakages). Will link it in the main guide after I'm happy with it DONE

Let me know if there are things that I've missed. I'm also up for reorganization if you think it will help with the presentation/delivery.

jatcwang changed the title from "Guide on binary compatibility for library authors" to "[WIP] Guide on binary compatibility for library authors" on Sep 11, 2017

* Any release with the same major version are **Binary Backwards Compatible** with each other
* A minor version bump signals new features and **may contain minor source incompatibilities** that can be easily fixed by the end user
* Patch version for bugfixes and minor behavioural changes
* For **expreimental library versions** (where the major version is `0`, such as `v0.1.0`), a minor version bump **may contain both source and binary breakages**

typo, experimental

if we say that any library release of `v1.x.x` will be forwards compatible, we can use `v1.0.0` anywhere where `v1.1.0` was originally used.
For the rest of the guide, when the 'compatible' is used we mean backwards compatible, as it is the more common case of compatibility guarantee.

An important note to make is that while breaking source compatibility normally results in breaking binary compatibility, they are actually orthorgonal

Typo: orthogonal


![Initial dependency graph]({{ site.baseurl }}/resources/images/library-author-guide/before_update.png){: style="width: 280px; margin: auto; display: block;"}

Some time later, we see `B v1.1.0` is now available and we upgraded the version in our `build.sbt`. Our code compiles and seems to work so we push it to production and goes home for dinner.

Tense and typo:
Some time later, we see B v1.1.0 is available and upgrade its version in our build.sbt. Our code compiles and seems to work so we push it to production and go home for dinner.


Now imagine if our App is more complex with lots of dependencies themselves depending on `C` (either directly or transitively) - it becomes extremely difficult to upgrade any dependencies because it now
pulls in a version of `C` that is incompatible with the rest of the versions of `C` in our dependency tree! In the example below, we cannot upgrade `D` because it will transitively pull in `C v2.0.0`, causing breakages
due to binary incompatibility. This inability to upgrade any packages without breaking anything is common known as **Dependency Hell**.

Typo: commonly

and all other versions of the same library are **evicted**. When packaging applications, the same versions of libraries that were used for compiling the
application are packaged and used at runtime.

Two library versions are said to be **Source Compatible** if switching one for the other does not incur any compile errors. For example, If we can switch from `v1.0.0` of a dependency to `v2.0.0` and

I would prefer if the examples would use the proposed versioning scheme. I think it's easier for the reader to follow and internalize its rules.

Contributor Author

Yeap good catch. That's my intention too.

Member

👍

jvican (Member) left a comment

Hey @jatcwang, really cool work. I propose some rewordings and slight modifications to improve the correctness and readability of the document.

I'm really excited that you took up this task, and I'd like to thank you personally for taking the time to write this down.

## Introduction

A diverse and comprehensive set of libraries is important to any productive software ecosystem. While it is easy to develop and distribute Scala libraries, good library authorship goes far
beyond just writing code and publishing them.
Member

I'd write "beyond just writing code and publishing it".

In this guide, we will cover the important topic of **Binary Compatibility**:

* How binary incompatibility can cause production failures in your applications
* How library authors can avoid breaking binary compatibility, and/or convey breakages clearly to library users when they happen
Member

I would split this last item in two:

  • How library authors can avoid breakages in binary compatibility
  • How to reason about the impact of binary incompatible changes

Contributor Author

Are there different levels of "impact" for binary incompatibility?

Member

Not really. In hindsight, that last bullet point doesn't explain what I wanted to. Let me reword it: "How to reason about the impact of changes in their code". And I would put it in the second position instead of the last one.

Before we start, first we need to understand how code is compiled and executed on the Java Virtual Machine (JVM).
Member

first we need to understand -> let's understand


## The JVM execution model

Code compiled to run on the JVM is compiled to a platform-independent format called **JVM bytecode** and stored in **Class File** format (with `.class` extension) and these class files are stored
Member

Code compiled to run on the JVM -> Scala
stored in Class File format -> stored in .class files.

Code compiled to run on the JVM is compiled to a platform-independent format called **JVM bytecode** and stored in **Class File** format (with `.class` extension) and these class files are stored
in JAR files. The bytecode is what we refer to as the **Binary** format.

When application or library code is compiled, their bytecode invokes named references of classes/methods from their dependencies instead of including the dependencies' actual bytecode
Member

I prefer "When an application depends on a library, its compiled bytecode references the library's bytecode"

Contributor Author

That's better. One important point I want to deliver is that application code is calling the "bytecode interface" of the library, which means modifying the internals of methods is OK and won't break bin-compat. What's the right technical term for this?

Member

I would call it "binary interface" though I don't know the academic term. If you want to make this more obvious, I would create a new paragraph (or even better: enclose it in a -- there are some other blog posts that do it).
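
To make the "binary interface" point from this thread concrete, here is a minimal Scala sketch. It is only an illustration: `GreeterV1`, `GreeterV2`, `signatures` and `linksOldCall` are hypothetical names that do not come from the guide or from any library, and the two classes stand in for two releases of one library class. It uses reflection to list the JVM-level method signatures (name plus erased parameter types) that compiled callers link against.

```scala
object BinaryInterfaceSketch {
  // Hypothetical library class as published in an earlier release.
  class GreeterV1 {
    def greet(name: String): String = "Hello, " + name
  }

  // The same class in a later release: a parameter with a default value was added.
  // Source code written as `greet("Ann")` still compiles against this version.
  class GreeterV2 {
    def greet(name: String, punctuation: String = "!"): String =
      "Hello, " + name + punctuation
  }

  // Render the JVM-level signatures (name plus erased parameter types) of a method.
  def signatures(cls: Class[_], method: String): Set[String] =
    cls.getDeclaredMethods
      .filter(_.getName == method)
      .map(m => m.getName + m.getParameterTypes.map(_.getSimpleName).mkString("(", ",", ")"))
      .toSet

  // True if bytecode compiled against a `greet(String)` call site could still link.
  def linksOldCall(cls: Class[_]): Boolean =
    signatures(cls, "greet").contains("greet(String)")
}
```

Changing only the body of `greet` would leave `signatures` unchanged, which is why edits to method internals do not affect binary compatibility. Adding the defaulted parameter keeps old source compiling but removes the `greet(String)` signature, so callers compiled against the earlier class would fail to link at runtime.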

This situation can only be resolved by ensuring that the chosen version of `C` is binary compatible with all other evicted versions of `C`. In this case, we need a new version of `A` that depends
on `C v2.0.0` (or any other future `C` version that is binary compatible with `C v2.0.0`).

Now imagine if our App is more complex with lots of dependencies themselves depending on `C` (either directly or transitively) - it becomes extremely difficult to upgrade any dependencies because it now
Member

I would make 'App' lowercase 😄

Contributor Author

`App` is the name of our application, just like `A` and `B` are the names of the libraries. I can wrap it in inline code so it's more obvious.
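
The eviction scenario discussed in this hunk can be sketched in a few lines of Scala. This is a simplified model under stated assumptions, not how sbt or Ivy actually resolve dependencies: it assumes a "latest version wins" policy, plain `MAJOR.MINOR.PATCH` version strings, and a non-empty set of requests. `Version`, `Resolution` and `resolve` are hypothetical names.

```scala
object EvictionSketch {
  final case class Version(major: Int, minor: Int, patch: Int) {
    override def toString: String = s"$major.$minor.$patch"
  }

  object Version {
    // Parses "1.2.3"; anything else is rejected to keep the sketch small.
    def parse(s: String): Version = s.split('.').map(_.toInt) match {
      case Array(ma, mi, pa) => Version(ma, mi, pa)
      case _ => throw new IllegalArgumentException(s"not a MAJOR.MINOR.PATCH version: $s")
    }

    implicit val ordering: Ordering[Version] =
      Ordering.by((v: Version) => (v.major, v.minor, v.patch))
  }

  final case class Resolution(selected: Version, evicted: List[Version]) {
    // Evicted versions whose major number differs from the selected one.
    // Under the versioning scheme discussed in this PR, these are the ones
    // that may be binary incompatible with the selected version.
    def suspicious: List[Version] = evicted.filter(_.major != selected.major)
  }

  // `requested` maps each dependant (for example "A" or "B") to the version of C it asks for.
  // Assumes at least one request.
  def resolve(requested: Map[String, String]): Resolution = {
    val versions = requested.values.map(Version.parse).toList.distinct.sorted
    Resolution(versions.last, versions.init)
  }
}
```

For instance, `EvictionSketch.resolve(Map("A" -> "1.0.0", "B" -> "2.0.0"))` selects `2.0.0`, evicts `1.0.0`, and reports `1.0.0` as suspicious because `A` was compiled against a different major version of `C`.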


## Versioning Scheme - Communicate binary compatibility breakages

We recommend using the following schemes to communicate binary and source compatibility to your users:
jvican (Member), Sep 12, 2017

We strongly recommend.


TODO

## Versioning Scheme - Communicate binary compatibility breakages
Member

What if you try to explain what a versioning scheme is and maybe quote the semantic versioning spec? I feel like a small introduction here would make the reading much much nicer for beginners.

Contributor Author

I didn't want to mention semantic versioning at all because it brings in the wrong correlation and might confuse the reader.

> MAJOR version when you make incompatible API changes,

In the versioning scheme we're recommending, major version bump has nothing to do with API changes. (From my point of view, API changes = source incompatibility)

Contributor Author

but yeah, we can briefly explain what versioning schemes are

Member

> I didn't want to mention semantic versioning at all because it brings in the wrong correlation and might confuse the reader.

I'm not sure about this. The versioning scheme that we recommend is a more constrained subset of semantic versioning. I feel mentioning semantic versioning here is a must, it's a widely used standard. We should make it obvious that our versioning scheme enforces more information in every Scala version, which is extremely advantageous.

> (From my point of view, API changes = source incompatibility)

Not really, API changes happen because of both binary and source incompatible changes.


We recommend using the following schemes to communicate binary and source compatibility to your users:

* Any release with the same major version are **Binary Backwards Compatible** with each other
jvican (Member), Sep 12, 2017

Okay, this is how I propose you write this:

  • Any library versioned with a starting 0 can break binary and source compatibility at any time, as per the [semantic versioning](link here) spec.
  • A change in the major version number signals at least one binary incompatibility, forwards and/or backwards. For example, code compiled against v1.0.0 is not compatible with v2.0.0.
  • A change in the minor version number signals at least one source incompatibility, forwards and/or backwards. For example, code compiled against the sources of a dependency v1.0.0 is not guaranteed to compile with v1.1.0 of the same library.
  • A change in the patch version (the x in 1.0.x) signals no binary or source incompatibility. Patch releases often include bug fixes and minor behavioural changes.

There may be cases in which libraries break source compatibility and decide to bump up the major version number. These cases, while not common, are defined per library.

Member

I think the new format is much more readable and more "structured". The dumber the bullet lists are, the better ^^

Member

One thing that is not clear to me is the status we give to forward binary compatibility. We should probably say that it's often the case that libraries do not promise forward binary compatibility, and that therefore they don't influence major version numbers. I'll think about this in the next hours since now I'm saturated, but perhaps @sjrd can shed some light on this. He's the original author of this proposal.

Member

Forward compat is generally not an issue and does not count per se in the versioning number I am advocating.

However, checking forward binary compatibility with MiMa is the best approximation we have of guaranteed (100%) backward source compat. Indeed, the most common source of backward source incompatibility is the addition of public fields/methods (which influences implicit resolution), and that is forbidden if you enforce forward binary compat. Therefore, I would in general encourage checking forward bin compat across patch releases (so between 1.2.x and 1.2.y), which is the best thing we have to enforce backward source compat in patch releases.

Contributor Author

  • I agree with @sjrd regarding forward compatibility not being as relevant for most userland libraries
  • I am still on the fence about minor versions. I know a lot of libraries like to use minor version to signal new features. Perhaps we make minor version a mandatory bump if forward compatibility is broken? (which normally coincides with adding new functionality like new methods).
  • Does the SBT MiMa plugin currently support easy checking for forward compatibility?
  • Just an idea: I think we should extend MiMa SBT plugin to automatically suggest a new version number based on the result of compatibility check. (Although I'm not sure if there's a way to check source compatibility..)
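
The idea in the last bullet could be sketched as a small pure function. This is not MiMa's API and not an existing plugin feature; `Version` and `nextVersion` are hypothetical names, and the two boolean inputs are assumed to come from some external check (MiMa for binary compatibility, a recompilation of downstream code for source compatibility). The rules follow the scheme proposed earlier in this thread: a binary break bumps the major version, a source break bumps the minor version, anything else is a patch, and `0.x` releases signal any break with a minor bump.

```scala
object VersionBumpSketch {
  final case class Version(major: Int, minor: Int, patch: Int)

  // Suggest the next version number from the outcome of compatibility checks.
  def nextVersion(current: Version, binaryCompatible: Boolean, sourceCompatible: Boolean): Version =
    if (!binaryCompatible && current.major > 0)
      // Stable library with a binary break: new major version.
      current.copy(major = current.major + 1, minor = 0, patch = 0)
    else if (!binaryCompatible || !sourceCompatible)
      // Source break, or any break in an experimental 0.x library: new minor version.
      current.copy(minor = current.minor + 1, patch = 0)
    else
      // Fully compatible change: patch release.
      current.copy(patch = current.patch + 1)
}
```

Keeping the decision in one function makes the policy easy to review and to adjust, for example if minor bumps should also be required when forward binary compatibility is broken, as discussed above.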


### Explanation

Why do we use the major version number to signal binary compatibility releases?
Member

to signal binary incompatible releases

jvican (Member) commented Sep 12, 2017

@jatcwang I'm really not a fan of the GitHub way of reviewing this. If you prefer it, I can add all my suggestions as a commit to this PR, and then you review my commit and change the things you don't like.

I feel that would be a much faster way of reviewing it, and would save you time.

sjrd (Member) commented Sep 12, 2017

I will not have time this week to review this, because of a paper deadline. I will get to it next week.

jatcwang (Contributor Author)

@jvican That'll be awesome thank you. I've left comments on all of your feedback but feel free to make changes as you see fit.

jatcwang (Contributor Author)

I have just uploaded a repo to demonstrate common causes of binary incompatibility, plus a more detailed explanation that I can't cover in the main guide. The intention is to provide a link to this repo from the main guide.

https://github.com/jatcwang/binary-compatibility-guide

Please raise issues in that repo for anything you think I've missed.

@jvican I am happy to update my PR to address your feedback so let me know if you're struggling to find time.

jvican (Member) commented Sep 26, 2017

I'm struggling to find time atm, but I'll review further updates on this PR.

jatcwang (Contributor Author) commented Oct 1, 2017

Pushed some commits that hopefully address all your feedback @jvican. (Much appreciated - it reads better now)

heathermiller (Member)

🎉 🎉
@jvican can you have a look at this when you're back in the office?

And again, sincerest thanks for taking the initiative on this @jatcwang ❤️

heathermiller (Member)

@sjrd too, if you're interested in/have time to provide a review, I know it'd be appreciated.

jvican (Member) left a comment

Excellent work @jatcwang. I could only review until "An example of dependency hell". I'm planning to continue the review tomorrow.

Also, this commit (jvican@96b49ad) moves the guide from the tutorials section to the guides section. I think that this one is more appropriate for such a document.

for all of `App`, `A` and `C` (something like `java -cp App.jar:A.jar:C.jar:. MainClass`). If we did not provide `C.jar` or if we provided a `C.jar` that does not contain some classes/methods
which `A` calls, we will get classloading exceptions when our code attempts to invoke the missing classes/methods.

These are what we call **Binary Incompatibility Errors**. An error caused by binary incompatibility happens when the compiled bytecode references a name that cannot be resolved during runtime
Member

You have here and in some sentences before some missing dots 😄.

Contributor Author

I have no idea what you're talking about, so I just rewrote it and added the missing full stop at the end 😛

When a class is needed during execution, the JVM classloader loads the first matching class file from the classpath (any other matching class files are ignored).
Because of this, having multiple versions of the same library in the classpath is generally undesirable:

* Unnecessary application size increase
Member

Why does the "application size" increase? This sentence is a little bit misleading. I think that what you mean here is that "build tools need to resolve and download unnecessary dependencies that take time and space".

Contributor Author

If the build tool bundles all the different versions of a library with the application (however it is distributed), then you have unnecessary bloat in your bundle. Happy to reword or remove it if you find it confusing.

Member

Yes, I think that it requires some clarification 😄. Also, consider adding my remark, I think it's valid.

Contributor Author

updated :)


## What are Evictions, Source Compatibility and Binary Compatibility?

### Evictions
Member

I like how you explain evictions, well done.
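
For readers who want to see evictions in their own build, a minimal `build.sbt` sketch follows. The organisation and module names (`com.example`, `a`, `b`, `c`) are hypothetical and only mirror the `A`/`B`/`C` naming used in the guide; `dependencyOverrides` and the `evicted` task are standard sbt features.

```scala
// build.sbt (sketch; organisation and module names are hypothetical)
libraryDependencies ++= Seq(
  "com.example" %% "a" % "1.0.0", // assumed to depend on c 1.0.0
  "com.example" %% "b" % "1.1.0"  // assumed to depend on c 2.0.0
)

// Optional: pin the transitive dependency instead of relying on "latest wins".
// Only safe if the pinned version is binary compatible with every version
// that the dependants above were compiled against.
dependencyOverrides += "com.example" %% "c" % "2.0.0"
```

Running `sbt evicted` on such a build lists which versions were evicted and which one was selected, which is a quick way to spot a major-version mismatch before it shows up at runtime.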


These are what we call **Binary Incompatibility Errors**. An error caused by binary incompatibility happens when the compiled bytecode references a name that cannot be resolved during runtime

## What are Evictions, Source Compatibility and Binary Compatibility?
Member

I think there's some disconnect between this section and the previous one. It would be good to add some paragraph connecting the two so that the user clearly sees how this one links to the previous one. This eases readability.


### Source Compatibility
Two library versions are **Source Compatible** if switching one for the other does not incur any compile errors.
For example, If we can upgrade `v1.0.0` of a dependency to `v1.1.0` and recompile our code without any compilation errors, `v1.1.0` is source compatible with `v1.0.0`.
Member

without any compilation errors -> "without any compilation errors and semantic changes at runtime". Note that if you add to the base class a new method that was previously provided by an implicit enrichment, and that method has a different behaviour, the code will compile but the application's behaviour will change.
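
The situation described in this comment can be reproduced with a short Scala sketch. All names here (`BoxV1`, `BoxV2`, `describe`, the `Ops` classes) are hypothetical; `BoxV1` and `BoxV2` stand in for two releases of one library class, and the implicit classes stand in for an enrichment defined in application code.

```scala
object SourceCompatSketch {
  // Library class in the earlier release: no `describe` method.
  class BoxV1(val value: Int)

  // Library class in the later release: `describe` was added as a member.
  class BoxV2(val value: Int) {
    def describe: String = "library: " + value
  }

  // Application-side enrichment, written when the library had no `describe`.
  implicit class BoxV1Ops(val b: BoxV1) {
    def describe: String = "app: " + b.value
  }

  // The same enrichment, unchanged, applied to the later release.
  implicit class BoxV2Ops(val b: BoxV2) {
    def describe: String = "app: " + b.value
  }

  // Resolved through the implicit enrichment.
  val before: String = new BoxV1(1).describe

  // Compiles without any change, but now resolves to the new member method.
  val after: String = new BoxV2(1).describe
}
```

The application code `box.describe` compiles against both releases, yet it silently switches from the application's implementation to the library's, because a member method takes precedence over an implicit conversion.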


## Why binary compatibility matters

Binary Compatibility matters because failing to maintain it makes life hard for everyone.
Member

What about "Breaking binary compatibility has bad consequences on the ecosystem around the software:"?


* End users has to update all library versions in their whole transitive dependency tree such that they are binary compatible, otherwise binary compatibility errors will happen at runtime
Member

has → have

Member

What about "have to update versions transitively in all their dependency tree such that they are binary compatible. This process is time-consuming and error prone, and it can change the semantics of end program."?

* Library authors are forced to update the dependencies of their library so users can continue using them, greatly increases the effort required to maintain libraries
Member

I think this part is not clear enough. What do you mean exactly?

Contributor Author

reworded if that is clearer? mainly talking about users asking library authors to bump the dependency version to align the versions to be bin-compat

Constant binary compatibility breakages in libraries, especially ones that are used by other libraries, is detrimental to our ecosystem as they require a lot of effort
Member

"as they require time and effort from end users and maintainers of dependent libraries"


Some time later, we see `B v1.1.0` is available and upgrade its version in our build. Our code compiles and seems to work so we push it to production and go home for dinner.

Unfortunately at 2am, we got frantic calls from customers saying that our application is broken! Looking at the logs, you find lots of `NoSuchMethodError` is being thrown by some code in `A`!
Member

got → get

Member

is → are being thrown

jatcwang (Contributor Author)

Updated according to @jvican's second round of review (thanks!) + grammar check.

heathermiller (Member)

@jvican... Can you please have a look at the rest?

heathermiller (Member)

(PS, thanks very much @jatcwang for your patience on this! And thank you for the fixes!)

jvican (Member) left a comment

@jatcwang Awesome. This is my last pass through the document. I've left some nice comments that I think can help improve the document even more. Though being honest, I think you've done a real good job at this difficult topic.

From now on, I defer my reviewing duties to someone with more expertise than I have. Once my comments are addressed, I'm happy for this to be merged! But I'd like someone else to have a look and confirm I haven't missed anything.


### Source Compatibility
Two library versions are **Source Compatible** if switching one for the other does not incur any compile errors or unintended behavioral changes at runtime.
For example, If we can upgrade `v1.0.0` of a dependency to `v1.1.0` and recompile our code without any compilation errors, `v1.1.0` is source compatible with `v1.0.0`.
Member

You forgot to mention here "semantic errors" too 😄


### Binary Compatibility
Two library versions are **Binary Compatible** if the compiled bytecode of these versions can be interchanged without causing binary compatibility errors.
For example, if we can replace the class files of a library's `v1.0.0` with the class files of `v1.1.0` without any binary compatibility errors during runtime,
Member

This is an example of backward bincompat. Since you haven't introduced here yet the two kinds of bincompat that there are, I think it's better to remove this example from here and keep the definition alone (you already have a really good diagram that exemplifies both later on, so I find it somewhat unnecessary). Then, I would remove the note at the end, and create a new section ### Relationship between binary compatibility and source compatibility where you explain in a more principled way how they interact.


## Why binary compatibility matters

Binary Compatibility matters because breaking binary compatibility have bad consequences on the ecosystem around the software.
Member

typo: has 😄

## MiMa - Checking binary compatibility against previous library versions

The [Migration Manager for Scala](https://github.com/typesafehub/migration-manager) (MiMa) is a tool for diagnosing binary incompatibilities between different library versions.
It works by comparing the class files of two provided JARs and reporting any binary incompatibilities found.
Member

I would try to make it more explicit that it supports forwards and backwards bincompat detection with something like this: "It can tell a library author whether a change causes any kind of binary incompatibility in both directions: forwards and backwards."

you have accidentally introduced binary incompatible changes. Detailed instruction on how to use the SBT plugin can be found in the link.

We strongly encourage every library author to incorporate MiMa into their library release workflow.

Member

Someone at some point mentioned that MiMa can be used as the best approximation of source (in)compatibility detection. I think it was a comment in this very same PR.

It may be worth mentioning that here and also explaining that detecting source incompatibility automatically in Scala is extremely difficult because of complex scoping and certain language features like implicits, named parameters, etc.

Member

I probably said that.

Basically, checking both forward and backward bincompat with MiMa is the closest tool we've got to checking backward source compatibility, because it prevents, for example, the addition of public methods, which is one of those things that can break source compat.
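
As a rough illustration of the setup discussed in this thread, here is a minimal sbt configuration sketch for checking both directions with the MiMa sbt plugin. The artifact coordinates `com.example` and `my-library` are hypothetical, the plugin version is deliberately left as a placeholder, and the setting names (`mimaPreviousArtifacts`, `mimaCheckDirection`) should be verified against the documentation of the plugin version in use.

```scala
// project/plugins.sbt
// Replace <version> with a released version of the plugin.
addSbtPlugin("com.typesafe" % "sbt-mima-plugin" % "<version>")

// build.sbt
// The previously released artifact to compare the current class files against
// (hypothetical coordinates).
mimaPreviousArtifacts := Set("com.example" %% "my-library" % "1.0.0")

// "backward" is the default; "both" also reports forward incompatibilities,
// such as newly added public methods.
mimaCheckDirection := "both"
```

With this in place, running `sbt mimaReportBinaryIssues` before a release reports the detected incompatibilities. Checking in both directions is stricter than most libraries need for minor releases, so it may fit best for patch releases, as suggested earlier in the conversation.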


* Ensure major versions of all versions of a library in the dependency tree are the same
* Pick the latest version and evict the rest (This is the default behavior of SBT).

jvican (Member), Oct 29, 2017

In my opinion, this is one of the most important parts of the document and we need to make a good case that synthesizes all the advantages of using this scheme over others. To me, the biggest advantage to users is determinism: the consequences of upgrading/downgrading a library in their projects are clearer. They can reason better about the changes produced in every release. They express the rich kind of incompatibilities in Scala.

Do you think that adding these instead of the ones you mention is better?

Member

I see you do a similar reasoning in "Explanation". I think it may be worth considering merging this with the "Conclusion" section instead of putting it here. Do you agree?


From our [example](#why-binary-compatibility-matters) above, we have learned two important lessons:

* Binary incompatible releases often lead to dependency hell, rendering your users unable to update any of their libraries without breaking their application
Member

Not only binary. I would say: "API incompatibilities often lead to dependency hell".

Member

No, this is not accurate. That's the whole point of having bin compat in MAJOR and source compat in MINOR. Source incompatibility does NOT result in dependency hell. Only binary incompat does.

I think I explained that at length, and pretty well, in the Contributors forum at some point.

Member

Correct, I don't know why I said otherwise in the beginning.

From our [example](#why-binary-compatibility-matters) above, we have learned two important lessons:

* Binary incompatible releases often lead to dependency hell, rendering your users unable to update any of their libraries without breaking their application
* If a new library version is binary compatible but source incompatible, the user can simply fix the compile errors in their application and everything will work
Member

Drop "simply".

Member

s/everything will work/their application should work/g

* Binary incompatible releases often lead to dependency hell, rendering your users unable to update any of their libraries without breaking their application
* If a new library version is binary compatible but source incompatible, the user can simply fix the compile errors in their application and everything will work

Therefore, **binary incompatible releases should be avoided if possible** and be more noticeable when they happen, warranting the use of the major version number. While source compatibility
Member

s/be more noticeable/be "properly documented"/

* If a new library version is binary compatible but source incompatible, the user can simply fix the compile errors in their application and everything will work

Therefore, **binary incompatible releases should be avoided if possible** and be more noticeable when they happen, warranting the use of the major version number. While source compatibility
is also important, if they are minor breakages that do not require effort to fix, then it is best to let the major number signal just binary compatibility.
Member

@jvican jvican Oct 29, 2017

if they are minor breakages that do not require effort to fix

The previous statement is true no matter whether that condition is true or not. The scheme signals the most severe API change, in this case breaking binary compatibility, and users should be prepared for source incompatible changes too.

Contributor Author

I'm just gonna take out this sentence. I feel like I'm overexplaining at various places when the reader can simply figure it out later on their own

for all of `App`, `A` and `C` (something like `java -cp App.jar:A.jar:C.jar:. MainClass`). If we did not provide `C.jar` or if we provided a `C.jar` that does not contain some classes/methods
which `A` calls, we will get classloading exceptions when our code attempts to invoke the missing classes/methods.

These are what we call **Binary Incompatibility Errors** -- errors that happen when the compiled bytecode references a name that cannot be resolved during runtime.
Member

The technical term for the JVM is Linkage error (as evidenced by the fact that they are all subclasses of LinkageError). Linkage errors are caused by binary incompatible changes in APIs, or binary incompatibilities for short. There is no such thing as a "binary incompatibility error".
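
The class hierarchy mentioned here can be checked directly. The following is a small sketch; the object name `LinkageErrorHierarchy` and the selection of error classes are chosen purely for illustration.

```scala
object LinkageErrorHierarchy {
  // Errors the JVM commonly raises when compiled bytecode references
  // a class or member that is missing or has changed shape.
  val typicalErrors: Seq[Class[_ <: Throwable]] = Seq(
    classOf[NoClassDefFoundError],
    classOf[NoSuchMethodError],
    classOf[NoSuchFieldError],
    classOf[AbstractMethodError],
    classOf[IncompatibleClassChangeError]
  )

  // True when every class listed above is a subclass of java.lang.LinkageError.
  def allAreLinkageErrors: Boolean =
    typicalErrors.forall(c => classOf[LinkageError].isAssignableFrom(c))
}
```

Calling `LinkageErrorHierarchy.allAreLinkageErrors` confirms that these errors share `LinkageError` as a common ancestor, which is why "linkage error" is the natural umbrella term.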

These are what we call **Binary Incompatibility Errors** -- errors that happen when the compiled bytecode references a name that cannot be resolved during runtime.

Before we look at how to avoid binary incompatibility errors, let us first
establish some key terminologies we will be using for the rest of the guide.
Member

There should be a short subsection about the model of other platforms (Scala.js and Scala Native). Here is a draft:

Linking model of Scala.js and Scala Native

Similarly to the JVM, Scala.js and Scala Native have their respective equivalents of .class files, namely .sjsir files and .nir files. Similarly to .class files, they are distributed in .jars, and linked together at the end.

However, contrary to the JVM, Scala.js and Scala Native link their respective IR files at link time, so eagerly, instead of lazily at run-time. Failure to correctly link the entire program results in linking errors reported while trying to invoke fastOptJS/fullOptJS or nativeLink.

Besides that difference in the timing of linkage errors, the models are extremely similar. Unless otherwise noted, the contents of this guide apply equally to the JVM, Scala.js and Scala Native.


### Evictions
When a class is needed during execution, the JVM classloader loads the first matching class file from the classpath (any other matching class files are ignored).
Because of this, having multiple versions of the same library in the classpath is generally undesirable:
Member

This mixes up different things. Consider replacing that paragraph with something more obviously accurate, such as

Unless using special tools (such as OSGi), only one version of any given library can be loaded in the same JVM. If several are provided on the classpath, the contents of the first one will shadow the later ones.

Contributor Author

@jatcwang jatcwang Nov 12, 2017

What's inaccurate about the paragraph? I want to minimize the number of keywords mentioned, especially when they don't help the key message of the guide (e.g. OSGi).

I don't think "only one version of any given library" is accurate. If I have A_v1.jar and A_v2.jar in my classpath and A_v2.jar contains a class that doesn't exist in A_v1.jar, then the JVM will find and use the class in A_v2.jar, no? (Your second sentence kinda hints at that, but I want to be clear about it.)

Member

@jatcwang I think Sébastien's suggestion here makes sense, just make sure you add a paragraph for "(such as OSGi)" so that it has less importance. But the wording is superior in precision 😄.

Therefore, when resolving JARs to use for compilation and packaging, most build tools will pick only one version of each library and **evict** the rest.

### Source Compatibility
Two library versions are **Source Compatible** if switching one for the other does not incur any compile errors or unintended behavioral changes at runtime.
Member

source compatible with each other

For example, if we can upgrade `v1.0.0` of a dependency to `v1.1.0` and recompile our code without any compilation errors, `v1.1.0` is source compatible with `v1.0.0`.

### Binary Compatibility
Two library versions are **Binary Compatible** if the compiled bytecode of these versions can be interchanged without causing binary compatibility errors.
Member

binary compatible with each other


### Recommended Versioning Scheme

* If backward **binary compatibility** is broken, **major version number** must be increased
Member

In English typography, there are no hard rules on the proper use of bulleted lists.

The only well-accepted rule I have seen is that, if the full list can read as a full sentence, you can write the list as if bullets were basically not semantically relevant. Therefore, the rest of the punctuation and capitalization should be applied as if they were all part of the same sentence. This means that a bullet list should look as follows:

  • items should not start with a capital letter,
  • they should all end with a comma, and
  • the next-to-last item ends with ", and" while the last one ends with a period.

Indeed, you can remove the bullets from the above list, and it still reads as a correct sentence.


* If backward **binary compatibility** is broken, **major version number** must be increased
* If backward **source compatibility** is broken, **minor version number** must be increased
* A change in **patch version number** signals **no binary nor source incompatibility**. According to SemVer, patch versions should contain only bug fixes that fix incorrect behavior so major behavioral
Member

Depending on how serious you are about source compatibility, a backward source compatible release must have exactly the same public/protected API as the previous version. It cannot even add methods. Given that SemVer itself recommends that adding functionality should bump MINOR, it's pretty clear to me that adding a public method should bump MINOR anyway.

* A change in **patch version number** signals **no binary nor source incompatibility**. According to SemVer, patch versions should contain only bug fixes that fix incorrect behavior so major behavioral
change in method/classes should result in a minor version bump.
* When major version is `0`, a minor version bump **may contain both source and binary breakages**
* Some libraries may take a harder stance on maintaining source compatibility, bumping the major version number for ANY source incompatibility even if they are binary compatible
Member

Bumping the major version for ANY source incompatibility is probably nonsense. If you really mean ANY (and I guess you mean it, because it's uppercase), then adding a public method means you need to bump MAJOR. I don't think any library will ever be that strict.

Actually, I would argue that a library doing this is doing a disservice to its community, as it has no means to effectively convey binary incompatible changes anymore in a way that is independent from source incompatible changes. This means that every single new public method breaks the ecosystem, since I cannot know from the version number whether I am moving into dependency hell or not.

* A change in **patch version number** signals **no binary nor source incompatibility**. According to SemVer, patch versions should contain only bug fixes that fix incorrect behavior so major behavioral
change in method/classes should result in a minor version bump.
* When major version is `0`, a minor version bump **may contain both source and binary breakages**
* Some libraries may take a harder stance on maintaining source compatibility, bumping the major version number for ANY source incompatibility even if they are binary compatible
Member

MAJOR could be bumped for "severe" source breaking changes (for some definition of "severe"). Removing public APIs could be considered "severe", for example. But we cannot give any objective metric about this.


@jatcwang
Contributor Author

Changes pushed. I addressed everything apart from @sjrd's comment about JVM classloading and bullet points 😛

@jatcwang jatcwang changed the title [WIP] Guide on binary compatibility for library authors Guide on binary compatibility for library authors Nov 12, 2017
@jvican
Member

jvican commented Dec 7, 2017

@jatcwang Thanks ❤️.

Can you check if the document is in good shape now? @sjrd


## MiMa - Checking binary compatibility against previous library versions

The [Migration Manager for Scala](https://github.com/typesafehub/migration-manager) (MiMa) is a tool for diagnosing binary incompatibilities between different library versions.


* If backward **binary compatibility** is broken, **major version number** must be increased.
* If backward **source compatibility** is broken, **minor version number** must be increased.
* A change in **patch version number** signals **no binary nor source incompatibility**.

The wording is a bit weird here. This should be "neither binary nor source incompatibility" or "no binary or source incompatibility".

In the following section, we will outline a versioning scheme based on Semantic Versioning that we **strongly encourage** you to adopt for your libraries. The rules listed below are **in addition** to
Semantic Versioning v2.0.0.

### Recommended Versioning Scheme

I would also like to encourage projects to document their versioning scheme and binary compatibility rules. Can we add a short section suggesting how a project might do this? If it follows this scheme, we should recommend linking to this document in the README. If not, the README should ideally include a short section explaining the rules for that project.

Contributor Author

I'm trying to keep the word count down to not dilute the main takeaways of this tutorial.

I think having another "maintaining an OSS Scala project" guide will be a good place to cover topics such as publishing to repositories and documenting versioning schemes. Speaking of which, I think it's a good idea to name this versioning scheme (e.g. "Scala Binary SemVer") instead of "that versioning scheme mentioned in that tutorial"

Member

I agree, while @gmethvin's suggestion is good I think it may be too much for this guide.

## Versioning Scheme - Communicating compatibility breakages

Library authors use versioning schemes to communicate compatibility guarantees between library releases to their users. Versioning schemes like [Semantic Versioning](http://semver.org/) (SemVer) allow
users to easily reason about the impact of updating a library, without needing to read the detailed release note.

"...release notes."

* Need to fetch and bundle multiple library versions when only one is actually used
* Unexpected runtime behavior if the order of class files changes

Therefore, when resolving JARs to use for compilation and packaging, most build tools will pick only one version of each library and **evict** the rest.

It might be useful to explain that the default scheme for this in sbt (and gradle, and many other build tools) is to take the latest version, since most libraries are backwards but not forwards compatible.

Contributor Author

I did initially mention this but was asked to remove it. What do you think @jvican?

Member

I believe you can add "The most common solution in build tools is to take the latest version (sbt, gradle, etc)." and will still be good 👍

Contributor Author

done
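
The "take the latest version" strategy discussed in this thread can be sketched as below. This is a deliberately simplified model, not how sbt or Gradle actually resolve dependencies; the `Eviction` object and its `latest` method are hypothetical names, and only plain `MAJOR.MINOR.PATCH` version strings are handled.

```scala
object Eviction {
  // Pick the numerically latest version among those requested for one library;
  // in this simplified model every other version is considered evicted.
  def latest(requested: Seq[String]): String =
    requested.maxBy { v =>
      val Array(major, minor, patch) = v.stripPrefix("v").split('.').map(_.toInt)
      (major, minor, patch) // compared component by component, as numbers
    }
}
```

Comparing the components as numbers matters: with the requests `1.2.3` and `1.10.0`, a plain string comparison would wrongly keep `1.2.3`.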

@jvican
Member

jvican commented Dec 25, 2017

Okay, I think it’s about time to merge this PR, what’s your take on this? @jatcwang

@jatcwang
Contributor Author

@jvican, I've addressed some stuff @gmethvin picked up. (there's one question about mentioning SBT/maven's default eviction strategy - I can add it if you think it's good)

Also just want to make sure that the page is indexable by search engine bots, but we can just merge it and see whether it appears :)

@heathermiller
Member

@jvican please let me know when we can merge this :)

Member

@sjrd sjrd left a comment

Just one minor comment. Good enough for me otherwise.

* `v1.0.0 -> v2.0.0` is <span style="color: #c10000">binary incompatible</span>.
End users and library maintainers need to update all their dependency graph to remove all dependency on `v1.0.0`.
* `v1.0.0 -> v1.1.0` is <span style="color: #2b2bd4">binary compatible</span>. Classpath can safely contain both `v1.0.0` and `v1.1.0`. End user may need to fix minor source breaking changes introduced
* `v1.0.0 -> v1.0.1` is <span style="color: #2b2bd4">binary compatible</span>. This is a safe upgrade that does not introduce binary or source incompatibilities.
Member

is source and binary compatible

Contributor Author

done :)
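
As a rough illustration of how a reader might apply the recommended scheme to the upgrade examples quoted in this thread, here is a small sketch. All names (`Version`, `Guarantee`, `VersionScheme`) are hypothetical and exist only for this example; the logic encodes just the rules stated in the guide, including the special case for major version `0`.

```scala
final case class Version(major: Int, minor: Int, patch: Int)

sealed trait Guarantee
case object MayBreakBinary extends Guarantee  // binary (and source) breakages possible
case object MayBreakSource extends Guarantee  // binary compatible, source may break
case object FullyCompatible extends Guarantee // neither binary nor source breakages

object VersionScheme {
  // Parses "v1.2.3" or "1.2.3"; anything else is rejected.
  def parse(s: String): Version = s.stripPrefix("v").split('.') match {
    case Array(a, b, c) => Version(a.toInt, b.toInt, c.toInt)
    case _ => throw new IllegalArgumentException(s"not a MAJOR.MINOR.PATCH version: $s")
  }

  // What the version numbers alone signal about moving between two releases.
  def upgrade(from: Version, to: Version): Guarantee =
    if (from.major != to.major) MayBreakBinary
    else if (from.major == 0 && from.minor != to.minor) MayBreakBinary
    else if (from.minor != to.minor) MayBreakSource
    else FullyCompatible
}
```

For instance, `VersionScheme.upgrade(VersionScheme.parse("v1.0.0"), VersionScheme.parse("v1.1.0"))` yields `MayBreakSource`, matching the second bullet of the quoted list.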

@jvican
Member

jvican commented Jan 9, 2018

Well, after 153 comments (without including this one), it's time to ship this valuable guide. Once again, I thank you for this awesome work. This is an unprecedented contribution from the Community and I'm really happy to see that there exist tenacious people out there who help us improve Scala 😄.

:shipit:

@jvican jvican merged commit d8c24c9 into scala:master Jan 9, 2018
@SethTisue
Member

kudos to @jatcwang for tackling this and to everybody for the inspiring group effort here

@jatcwang
Contributor Author

Thanks for all your help. Every suggestion and comment was appreciated :) and also thanks for all your awesome work in the scala ecosystem - many silent users like me appreciate it :)

8 participants