Are we happy with CalVer? Is there any easy way to move past it? #372

Open
fjetter opened this issue Apr 8, 2024 · 5 comments

@fjetter
Member

fjetter commented Apr 8, 2024

A while ago I read an excellent blog post about version schemes that introduces a scheme called "Intended Effort Versioning" (EffVer), which I quite like.

I'm not particularly happy with CalVer since it hides way too much information. As an example, we recently changed the default DataFrame backend to use dask-expr, which by any measure should be considered a major change. It has vast potential for improvement for many users and, if we are honest, will likely break stuff for some users.
Who knows, by heart, which version this was released in? Guess what, I was the release manager on that one and even I got the version number wrong when I tested myself just now. For the record, it was 2024.3.0. This was possibly one of the most important releases Dask has ever had, definitely one of the most meaningful in months if not years, but there is nothing to set it apart from a mere maintenance fix.

I don't care that much about semantics, but I would like to use the version number to communicate awesomeness and/or risk, and the EffVer scheme sounds like it addresses most of the concerns (about compatibility and ambiguity) that led Dask to adopt CalVer in the first place. We've been using CalVer for about three and a half years, and I think that's enough time to have collected some experience worth talking about.

How happy are folks generally with CalVer?

And most importantly... If we were to adopt another versioning scheme (even if it's not EffVer), what would that look like? There are version epochs, but I would hate it if users had to specify a version like 1!1.4.2, since the epoch identifier is pretty rare. Are there other possibilities?
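To illustrate why an epoch comes up at all, here is a minimal sketch of how epoch-aware ordering works. It is a simplified model rather than a full PEP 440 implementation: it only handles plain `N!X.Y.Z` strings, ignores pre/post/dev segments, and does not zero-pad release segments. The helper name `sort_key` and the sample versions other than 2024.3.0 and 1!1.4.2 are made up for illustration.

```python
# Simplified model of version ordering with epochs (assumption: plain
# numeric releases only, no pre/post/dev/local segments, no zero-padding).

def sort_key(version):
    """Return (epoch, release_tuple) for 'N!X.Y.Z' or 'X.Y.Z'."""
    epoch_text, sep, release_text = version.partition("!")
    if not sep:
        # No '!' in the string: the epoch is implicitly 0.
        epoch_text, release_text = "0", version
    release = tuple(int(part) for part in release_text.split("."))
    return int(epoch_text), release


# Hypothetical release history: CalVer releases, then a switch to 1.4.2.
versions = ["2024.3.0", "1.4.2", "1!1.4.2", "2023.12.1"]
ordered = sorted(versions, key=sort_key)
print(ordered)
```

In this model a bare 1.4.2 sorts below every CalVer release because 1 < 2023, so it would never be picked as the newest version. Only the epoch-qualified 1!1.4.2 sorts above 2024.3.0, which is why moving from CalVer to smaller version numbers runs into epochs.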

@jacobtomlinson
Member

jacobtomlinson commented Apr 8, 2024

Thanks for raising this @fjetter. I'm glad you liked my blog post! For folks interested in thinking more about the challenges of CalVer I also wrote this blog post a while ago.

I generally have the same feelings about epochs. In theory they sound like a good way to change scheme, but I'm not quite sure how it would work in practice. I know @minrk was exploring how it works in this repo after having a similar conversation last year, so that may be a useful resource.

I think you only need to specify the 1! part at publish time. To continue your example, users should be able to pip install distributed>=1.4, omit the epoch, and have it resolve correctly. Assuming this is true, we may want to introduce a little automation around the release to handle this. If we had a GitHub Action that was triggered on tags and pushed to PyPI, it could also handle prepending the epoch, so the tag would be 1.4.2 but PyPI would be told 1!1.4.2 by the Action. (I was wrong about this, see #372 (comment).)

What I don't know is how Conda Forge handles this. Maybe @jakirkham has some thoughts?

If we do want to go down this road we could choose a less prominent project like dask-kubernetes to experiment with and I'd be happy to take the lead on trying things out. That repo also has the publish on tags workflow set up already which is convenient.

@jacobtomlinson
Member

Thinking more about Dask changing scheme, I would be tempted to suggest that we review it on a repo-by-repo basis. For most library-style projects (distributed, dask-kubernetes, dask-jobqueue, etc.) I think something like EffVer (or SemVer) would make the most sense. For projects that hold only documentation (dask-tutorial, dask-examples, etc.) I think CalVer is still a fine choice.

For dask/dask I'm a little more torn because it is a library, but it implements a variety of popular APIs from other libraries, which makes it something of a distribution. So in some ways I think CalVer does make some sense here. But in other ways it is just a library so maybe EffVer would be a good choice.

@fjetter
Member Author

fjetter commented Apr 8, 2024

Using a different versioning scheme for docs-only projects makes sense. I'm less convinced about decoupling the schemes for dask/dask and distributed, since the two are still tightly coupled. Getting rid of that hard pin is an entirely different topic.

@jacobtomlinson
Member

jacobtomlinson commented Apr 23, 2024

I went back over the epicepoch experiment that @minrk did last year and I decided to push my own test package and play with epochs some more.

You can find the full results here https://github.com/jacobtomlinson/epochexperiments.

The key things I've learned from playing around with it are:

  • Both publishers and users must always specify the epoch. This means if we switched to Dask v4 users would need to pip install dask>=1!4.0.0, which kinda sucks.
  • Omitting the epoch adds an implicit 0! epoch in some cases, like when using wildcards, which can lead to unintuitive version resolution, like epochexperiments>=2024.* resolving to the newest CalVer release and not the latest release overall.
    • This also means epochexperiments>=2024.0 and epochexperiments>=2024.* resolve to different things.
  • Epochs must be specified even if there isn't a package for the corresponding 0! epoch. For example if you have released v1!4.0.1 you cannot do pip install epochexperiments==4.0.1, even though 0!4.0.1 doesn't exist.
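The non-wildcard cases above can be sketched with a toy matcher. This is a simplified model, not pip's resolver: it treats a missing epoch as 0, compares (epoch, release) tuples, and supports only `==` and `>=`. The wildcard case is left out because its behavior is only known here from the experiments described above. The function names and the 2024.4.1 release are hypothetical.

```python
# Toy model of epoch-aware specifier matching (assumption: plain numeric
# versions, '==' and '>=' only; this is not how pip is implemented).

def parse(version):
    """Return (epoch, release_tuple); a missing epoch is treated as 0."""
    epoch_text, sep, release_text = version.partition("!")
    if not sep:
        epoch_text, release_text = "0", version
    return int(epoch_text), tuple(int(p) for p in release_text.split("."))


def satisfies(candidate, operator, wanted):
    """Check one released version against one simple specifier."""
    left, right = parse(candidate), parse(wanted)
    if operator == "==":
        return left == right
    if operator == ">=":
        return left >= right
    raise ValueError(f"unsupported operator: {operator}")


# Hypothetical releases: two CalVer versions, then an epoch-1 version.
released = ["2024.3.0", "2024.4.1", "1!4.0.1"]


def candidates(operator, wanted):
    return [v for v in released if satisfies(v, operator, wanted)]


print(candidates("==", "4.0.1"))     # spec means 0!4.0.1, which was never released
print(candidates("==", "1!4.0.1"))   # epoch given explicitly
print(candidates(">=", "2024.0"))    # epoch-1 release also satisfies this
```

In this model `==4.0.1` finds nothing because the specifier is read as 0!4.0.1, while `>=2024.0` admits 1!4.0.1 because any epoch-1 version compares higher than every epoch-0 version.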

These quirks are enough for me to say that we shouldn't go down the road of using epochs; they would add too much burden and confusion for users. Which pretty much means we are stuck with CalVer, unless we want to go to Dask 3000!

@fjetter
Member Author

fjetter commented Apr 23, 2024

> These quirks are enough for me to say that we shouldn't go down the road of using epochs, they will add too much burden and confusion for users.

I agree. Thanks for checking.

2 participants