Multiple versions of a library in dependency tree. #771

Open
avodonosov opened this issue Nov 19, 2021 · 11 comments

Comments

@avodonosov

avodonosov commented Nov 19, 2021

The Semantic Versioning document does not take into account that one library often appears many times in a dependency tree.

In most programming languages it is impossible to load different versions of the same library simultaneously. Therefore, introducing breaking changes according to the current SemVer rules (incrementing the major version number) will break such applications anyway.

For example:

my-application
  web-server 1.1.1
    commons-logging 1.1.1
  db-client 1.1.1
    commons-logging 1.1.1
  authentication 1.1.1
    commons-logging 1.1.1

Then commons-logging changes its API incompatibly and is released as commons-logging 2.0.1. Authentication adopts commons-logging 2.0.1 while other libraries still depend on 1.1.1:

my-application
  web-server 1.1.1
    commons-logging 1.1.1
  db-client 1.1.1
    commons-logging 1.1.1
  authentication 1.1.2
    commons-logging 2.0.1

Now my-application is broken, because the dependency tree includes two versions of commons-logging that share package, class, and function names, and thus cannot be loaded simultaneously.
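The situation above can be detected mechanically. The following is a minimal sketch, not any real build tool's algorithm: the tuple layout `(name, version, children)` and the helper names `collect_versions` and `major_conflicts` are assumptions made for illustration, and versions are assumed to be plain MAJOR.MINOR.PATCH strings.

```python
from collections import defaultdict

# Hypothetical encoding of the second tree above: (name, version, children).
# The root application has no version in the example, so None is used.
TREE = ("my-application", None, [
    ("web-server", "1.1.1", [("commons-logging", "1.1.1", [])]),
    ("db-client", "1.1.1", [("commons-logging", "1.1.1", [])]),
    ("authentication", "1.1.2", [("commons-logging", "2.0.1", [])]),
])


def parse(version):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def collect_versions(node, seen=None):
    """Walk the tree and record every version seen for each library name."""
    if seen is None:
        seen = defaultdict(set)
    name, version, children = node
    if version is not None:
        seen[name].add(version)
    for child in children:
        collect_versions(child, seen)
    return seen


def major_conflicts(tree):
    """Return libraries that appear in the tree with more than one major version."""
    conflicts = {}
    for name, versions in collect_versions(tree).items():
        if len({parse(v)[0] for v in versions}) > 1:
            conflicts[name] = sorted(versions, key=parse)
    return conflicts


print(major_conflicts(TREE))  # {'commons-logging': ['1.1.1', '2.0.1']}
```

Whether such a conflict actually breaks the application depends on whether the platform can load both versions side by side, which is the point under discussion.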

The main problem with the current Semantic Versioning doc is that, by not mentioning this issue, it encourages developers to break backward compatibility, giving them the impression that an increased major version number will protect clients from trouble.

Instead, breaking changes should be discouraged - in the majority of cases it is trivial to keep backward compatibility. (For example, instead of altering a function's behavior, keep the old function as it is and introduce a new function.)
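As a minimal sketch of that parenthetical advice, the example below keeps an old function untouched and adds the stricter behavior under a new name. Both `parse_port` and `parse_port_strict` are hypothetical functions invented for this illustration; they do not come from any library mentioned in the thread.

```python
def parse_port(text):
    """Original behavior, kept unchanged: return None on non-numeric input."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_port_strict(text):
    """New behavior under a new name: raise on bad or out-of-range input."""
    port = int(text)  # raises ValueError for non-numeric text
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


# Existing callers keep working; new callers opt in to the stricter function.
print(parse_port("not-a-port"))    # None
print(parse_port_strict("8080"))   # 8080
```

Callers written against the old function are unaffected, so this kind of release would need only a minor version increment under SemVer.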

If a breaking change is still needed, the developer has to make sure the new version of the library can be loaded simultaneously with the old version. In many languages that will require giving the library a new identifier and using a new namespace / package in the library source code.

An example of such a proper release of incompatible versions is the Java library Apache commons-lang (https://commons.apache.org/proper/commons-lang/). New versions not only increase the major version number, but also have a new artifact ID - commons-lang3, and a new Java package - org.apache.commons.lang3.

If commons-lang had simply followed the current Semantic Versioning rules, just increasing the major version number, it would have resulted in breakages all over the Java ecosystem.

It's interesting to note that the need to create a new artifact ID for the library, and a new namespace for its code, makes the notion of a major version number redundant. Instead of releasing library Ladder version 3.1.0, the developer could release Ladder3 version 1.0.

It's also important to note that even when simultaneous loading of multiple versions is possible, some potential for trouble remains, for example when an object created by one version of the library travels through the application layer and gets passed to another version of the library. That's why preserving backward compatibility should be preferred over releasing an incompatible version.
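The object-crossing problem can be illustrated with a small simulation. This is a sketch under stated assumptions: `make_logging_lib` is a hypothetical factory standing in for "loading one copy of a library", and the `Event` class and `write` function are invented for the example rather than taken from commons-logging.

```python
from types import SimpleNamespace


def make_logging_lib(version):
    """Stand-in for one loaded copy of a library; each call creates distinct classes."""

    class Event:
        def __init__(self, message):
            self.message = message

    def write(event):
        # Each copy only recognises Event objects created by itself.
        if not isinstance(event, Event):
            raise TypeError(
                f"logging {version}: unexpected object of type {type(event).__name__}"
            )
        return f"[{version}] {event.message}"

    return SimpleNamespace(Event=Event, write=write)


# Two versions loaded side by side under different names.
logging_v1 = make_logging_lib("1.1.1")
logging_v2 = make_logging_lib("2.0.1")

event = logging_v1.Event("user signed in")
print(logging_v1.write(event))  # accepted by the copy that created it

try:
    logging_v2.write(event)  # same class name, but a different class object
except TypeError as exc:
    print("rejected:", exc)
```

Both copies load without conflict, yet an object handed from one to the other is still rejected, which is why side-by-side loading alone does not remove every failure mode.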

@ljharb
Contributor

ljharb commented Nov 20, 2021

This issue is not caused by semver; it's caused by package managers that have the flaw of only having a single flat pool of dependencies. Package managers that lack this flaw - npm, cargo, nix - don't have this problem. I'm not sure what the point would be of mentioning this in the semver spec - surely package authors in the ecosystems that have this problem are well aware of the limitations.

@avodonosov
Author

avodonosov commented Nov 21, 2021

@ljharb, believe me, the issue is subtle enough that many programmers don't even know about it. And when the issue is explained, even those willing to think about it often don't understand it initially, as your comment above demonstrates.

Nix will not let you have multiple versions of the same C library in one application.

JavaScript is a relatively rare case among languages, where module practices evolved and were standardized in a form that allows multiple versions to be loaded (module internals are essentially defined within a closure, so there are no global points of interference). But even in JavaScript there are libraries whose functionality is accessed through global objects, like new goog.crypt.Sha1(), and they of course can't have multiple versions loaded simultaneously. In other programming languages simultaneous loading is often more difficult or practically impossible.

SemVer aims to "make dependency hell a thing of the past", and the limited recommendation "When breaking the API, increase the major version" is not enough to achieve this goal. It would be better to say "When breaking the API, increase the major version and make sure the new version can be loaded simultaneously with the old version. Or better, don't break the API".

I also extended the issue description with a note on why, even when simultaneous loading is supported, keeping compatibility should be preferred over releasing a new incompatible version.

@jwdonahue
Contributor

@avodonosov

The SemVer spec provides a simple mechanism for conveying risk information via the version string. The original author may have got a bit carried away in the sales pitch section of the document (not part of the spec), but your quote "make dependency hell a thing of the past" is out of context. The first sentence of the relevant paragraph reads:

> A simple example will demonstrate how Semantic Versioning can make dependency hell a thing of the past.

It then describes a perfectly legitimate example of how SemVer's communication of breaking change information solves one part of the dependency hell problem. The spec doesn't mention the diamond dependency problem because it was not intended to solve that problem, at least not directly. It does enable enlightened tooling to detect that a dependency tree contains breaking changes, relative to an earlier version of the tree. Circular and diamond dependency problems are separate issues from breaking/non-breaking change in a particular API or package, though they can certainly cause breakage. They can cause problems even if none of the new versions in the tree were breaking changes. The extent to which they are a problem, and how best to deal with them, depends on the many varied tool-chains and operating environments in existence. The fact that SemVer has been heavily adopted in most of those varied environments is a testament to its generality and usefulness.

The propriety of your preferred solution to the diamond dependency problem is not something that SemVer should weigh in on. Doing so would render it less general and therefore less useful. Instead, other standards should address these problems, building on SemVer or other versioning or naming schemes as appropriate.
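The tooling mentioned above, which detects breaking changes relative to an earlier version of a dependency tree, might look roughly like the sketch below. It is an illustration under simplifying assumptions: each snapshot is a flat, hypothetical `{library: version}` mapping for one consumer (here, what `authentication` depends on), versions are plain MAJOR.MINOR.PATCH strings, and the function name `breaking_upgrades` is invented.

```python
def major(version):
    """Return the major component of a 'MAJOR.MINOR.PATCH' string."""
    return int(version.split(".")[0])


def breaking_upgrades(before, after):
    """Compare two {library: version} snapshots and list major-version increases."""
    flagged = []
    for name, new_version in sorted(after.items()):
        old_version = before.get(name)
        if old_version is not None and major(new_version) > major(old_version):
            flagged.append((name, old_version, new_version))
    return flagged


# Dependencies of `authentication` before and after its 1.1.2 release,
# taken from the example in the issue description.
before = {"commons-logging": "1.1.1"}
after = {"commons-logging": "2.0.1"}

print(breaking_upgrades(before, after))  # [('commons-logging', '1.1.1', '2.0.1')]
```

The check only reports what the version strings declare; whether the reported upgrade is actually harmful still depends on the platform and on how the rest of the tree uses the library.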

@avodonosov
Author

avodonosov commented Dec 15, 2021

@jwdonahue, thank you for the feedback, but no. Read the Introduction as well; it presents semantic versioning as a solution to the dependency hell problem.

In its current form the document is an invitation to break backwards compatibility at any time, under a false promise that changing the major version number solves all dependency hell issues.

> The propriety of your preferred solution to the diamond dependency problem is not something that SemVer should weigh in on.

It's important to keep in mind that the person who experiences dependency hell in the diamond dependency situation, when compatibility is broken, is not the person who introduced the issue by breaking compatibility and who is in the best position to fix it.

A security scanner tells a developer to upgrade a library because a new version has fixes for recently discovered vulnerabilities. The developer migrates to the new minor version and the application breaks. The application developer has no good way to move forward; he is blocked, because he does not control the libraries. It is very expensive. And this happens all the time.

And the person who broke compatibility in the lower-level library, which is used multiple times in the application dependency tree, honestly did what he knows to be the best engineering practice - he used semantic versioning. He does not know he is causing huge trouble for the library's users.

If this issue is fixed, the overall value for humanity will be huge.

@jwdonahue
Contributor

@avodonosov How do you propose to fix it?

@avodonosov
Author

@jwdonahue, I propose to remove the wording that can be understood to mean that a mere increase of the major version solves dependency hell, and to add several sentences of advice, as suggested in the issue description.

@jwdonahue
Contributor

> Now my-application is broken, because the dependency tree includes two versions of commons-logging that share package, class, and function names, and thus cannot be loaded simultaneously.

Not true in all environments. SemVer is platform/language agnostic.

> If a breaking change is still needed, the developer has to make sure the new version of the library can be loaded simultaneously with the old version. In many languages that will require giving the library a new identifier and using a new namespace / package in the library source code.

It's up to publishers whether they wish to release breaking changes. It's up to consumers to decide whether to take them.

> If a breaking change is still needed, the developer has to make sure the new version of the library can be loaded simultaneously with the old version. In many languages that will require giving the library a new identifier and using a new namespace / package in the library source code.

This level of detail should be left to platform and language standards and conventions, not specified in a platform/language agnostic spec.

> An example of such a proper release of incompatible versions is the Java library Apache commons-lang (https://commons.apache.org/proper/commons-lang/). New versions not only increase the major version number, but also have a new artifact ID - commons-lang3, and a new Java package - org.apache.commons.lang3.
>
> If commons-lang had simply followed the current Semantic Versioning rules, just increasing the major version number, it would have resulted in breakages all over the Java ecosystem.

Sounds like a platform/language convention to me. Nothing in the SemVer spec prohibits layering additional requirements and conventions in with it.

> It's interesting to note that the need to create a new artifact ID for the library, and a new namespace for its code, makes the notion of a major version number redundant. Instead of releasing library Ladder version 3.1.0, the developer could release Ladder3 version 1.0.

Neither convention solves all dependency issues. Combining them doesn't either, and seems rather silly to me, but I've never been a fan of Java, so I could be missing something.

> It's also important to note that even when simultaneous loading of multiple versions is possible, some potential for trouble remains, for example when an object created by one version of the library travels through the application layer and gets passed to another version of the library. That's why preserving backward compatibility should be preferred over releasing an incompatible version.

I have long been a proponent of simply renaming the product and moving on, but it's not necessary. The major version bump is a sufficient barrier and preserves the brand the publisher has been building around their product. Consumers are under no obligation to accept breaking changes, whether indicated by a major version bump or a product name change.

Discouragement of breaking changes is tantamount to suppression of innovation. Breaking things is often the whole point of innovating. No standards body is ever going to successfully suppress it anyway, so why not just embrace it?

The real problem here is lack of adequate tooling in most environments. We work with languages and tool chains that are decades old, whose designers probably didn't really have today's version of DevOps in mind. Dependency hell as currently defined by Wikipedia is a provably hard problem to solve, but tractable on modern hardware. All you need to know about any particular dependency is what is in its subgraph?

@avodonosov you are welcome to fork the repo and issue a PR for any changes you like. It sounds like what you are proposing however, would be one of those breaking changes (SemVer 3 or NewSemVer 1?) you seem to claim we should all be avoiding.

@arithex

arithex commented Jan 2, 2022

I agree semver isn't meant to be a magic bullet to solve all versioning problems -- just a coherent way to talk about versioning changes, and communicate changes/expectations.

But I also agree this issue is a real problem, and worthy of some discussion or guidance to library vendors (e.g. allow major-version breaking changes to run side-by-side with older versions, whenever possible, through conventions appropriate for the language and runtime).

Sometimes the package-name-bump approach works (e.g. "FooLib3 ver. 1.0.0"). Other times, when the library is ubiquitous and has API surface that reaches up through your dependency stack -- like JSON serialization, or changes to the programming language or runtime itself -- it's not enough to rename the product. Some poor program manager is going to have to coordinate multiple teams to migrate, test and (maybe) deploy their code, in unison.

(I suspect this is a problem mostly faced by people working in large organizations.)

@jozefizso

These transitive package dependency issues must be solved at the package manager and programming language level. This is not something SemVer should define.

@avodonosov
Author

avodonosov commented Feb 3, 2022

That's indeed a possible alternative: to simply write "Breaking compatibility may lead to dependency hell. The details are out of scope of this document, SemVer is merely a notation to distinguish compatible versions from incompatible ones".

However, that's not friendly to users. Why be mysterious? Several sentences about multiple versions in the dependency tree can illustrate the problem and give useful hints for the case when people really need to break compatibility.

It's not a goal to have a 100% exhaustive guide to compatibility breakages, but I believe the described scenario covers most of the issues application developers encounter in real life.

> Circular dependencies

I think it's a rare and very specific situation, so not worth a detailed explanation in the semver document. At most, something like "other issues possible sometimes" and a link.

> They [multiple versions of a library] can cause problems even if none of the new versions in the tree were breaking changes.

The presence of multiple compatible versions is resolved by choosing the latest of them, which is done automatically by build tools in semver-aware environments (e.g. Cargo), or manually by the application developer. That's not dependency hell.
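The resolution rule described above can be sketched in a few lines. This is a simplified illustration, not how Cargo or any other tool is actually implemented: the function name `resolve` is hypothetical, versions are assumed to be plain MAJOR.MINOR.PATCH strings with a major version of 1 or higher, and pre-release tags are ignored.

```python
def parse(version):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def resolve(requested):
    """Pick one version for a library that is requested at several versions.

    Returns the highest version when all requests share a major version,
    and None when the majors differ and no single version satisfies everyone.
    """
    majors = {parse(v)[0] for v in requested}
    if len(majors) > 1:
        return None
    return max(requested, key=parse)


print(resolve(["1.1.1", "1.4.0", "1.10.2"]))  # 1.10.2
print(resolve(["1.1.1", "2.0.1"]))            # None
```

The first call is the unproblematic case of several compatible versions; the second is the case from the issue description, where no automatic choice exists.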

Once again, the point of this ticket is to:

  1. Remove the false promise that SemVer solves dependency hell by simply increasing the major version.
  2. Discourage unnecessary compatibility breakages, when it's trivial to maintain compatibility.
  3. Add several sentences of advice for the cases when compatibility breakage is really needed.

@avodonosov
Author

Related: #341, #414

6 participants
@avodonosov @ljharb @jozefizso @jwdonahue @arithex and others