
Version numbering for next release #305

Closed
mrocklin opened this issue Jul 31, 2019 · 12 comments

Comments
@mrocklin
Member

Hi All,

As part of an upcoming release I would like to release dask-jobqueue. What should the version number be? Some options:

  1. 0.6.2 - this will be a minor version change in functionality
  2. 1.0 - this library is fairly stable, might as well call it 1.0
  3. 2.2 - let's just match Dask version numbers going forward

What should we use?

@willirath
Collaborator

+1 for 2.2

@guillaumeeb
Member

No strong opinion from me. Let's do the same as other Dask-related libraries.

@mrocklin
Member Author

@lesteve made a good comment here dask/dask#5168 (comment)

dask-jobqueue doesn't need much attention these days, so its version is likely to fall behind if we try to keep things in sync. I think that for the upcoming release (which I'd like to do in the next hour or so if no one objects) I'll just bump the minor version number so that we can leave this discussion open.

@lesteve
Member

lesteve commented Jul 31, 2019

My preference would be 0.6.2. I feel the pace of development of dask/distributed is a lot quicker than that of dask-jobqueue, so keeping versions in sync does not make complete sense to me.

I made a similar comment in dask/dask#5168 (comment).

Something I have been thinking about while seeing the changes needed for distributed >= 2, and now distributed >= 2.2 (thanks a lot for this @mrocklin, by the way!), is that dask-jobqueue is exposed to what feel like internal dask/distributed details.

My feeling is that one of the main reasons is that at one point we decided to create dask_jobqueue.core.ClusterManager, so our base cluster class JobQueueCluster does not inherit from distributed.cluster.Cluster (see #187 for the PR and #170 for some of the motivations). I think we should revisit this decision. I may well be wrong about the rest of this message because, unfortunately, I have not had as much time as I wanted to look at this... more than open to comments/complaints/corrections!

It would be great to have @guillaumeeb's opinion on this because he was the main person involved in this IIRC.

I seem to remember there were two reasons for the dask_jobqueue.core.ClusterManager change:

  • we wanted to do some dask-jobqueue-specific experimentation, for example running the scheduler on a different node than the main script, or having the cluster object manage different kinds of workers. As we are all quite busy with other things, this did not really happen as far as I know. Also, some things have happened on the distributed front, e.g. SpecCluster (although I have not followed it very closely, it seems like it could cover the heterogeneous-workers use case).
  • we wanted to have different arguments for scale (min_cores, min_memory). I don't think that was one of the main reasons. See "Scale using number of cores or memory with ClusterManager" #184 for the PR.

@mrocklin
Member Author

I encourage people to take a look at SpecCluster, how it was used with an SSH library to make a simple (but fully featured) SSHCluster, and how @jacobtomlinson is using it to rewrite KubeCluster.

I think that we'll be able to do the same thing with dask-jobqueue. This will allow us to drop a lot of the code in dask-jobqueue that handles adaptivity, cluster management and so on, and focus on how to correctly launch individual jobs.
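For readers less familiar with SpecCluster, here is a minimal, purely local sketch of the idea, assuming the distributed 2.x API (illustration only, not dask-jobqueue code): the cluster is described declaratively as a scheduler spec plus a dict of worker specs, and SpecCluster handles scaling, adaptivity, and lifecycle on top of that description.

```python
# Sketch only: local processes/threads stand in for batch jobs.
from distributed import Scheduler, Worker, SpecCluster
from dask.distributed import Client

scheduler = {"cls": Scheduler, "options": {}}
workers = {
    "worker-0": {"cls": Worker, "options": {"nthreads": 1}},
    "worker-1": {"cls": Worker, "options": {"nthreads": 1}},
}

cluster = SpecCluster(scheduler=scheduler, workers=workers)
client = Client(cluster)

# For dask-jobqueue, the idea would be to replace Worker with a class whose
# start submits a batch job (sbatch/qsub/...) rather than a local process,
# and let SpecCluster handle adaptivity and cluster management.
```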

@guillaumeeb
Member

Yeah, ClusterManager no longer has a point with SpecCluster. The solutions provided by Dask distributed have taken the lead; we now need to adopt them. I wish I could do some of this work, but that doesn't seem realistic right now... We'll see!

@mrocklin
Member Author

For now I've pushed out a 0.6.2 release. It contains nothing except for the recent compatibility commit and some recent work on documentation from @lesteve.

@lesteve
Member

lesteve commented Jul 31, 2019

Thanks for your insights @guillaumeeb and for the release @mrocklin!

If I understand @mrocklin's dask/dask#5168 (comment) correctly, it seems like he may be volunteering to do some of the SpecCluster work in dask-jobqueue at some point, which would of course be greatly appreciated!

@mrocklin if you start working on this and you have questions about dask-jobqueue specificities do let us know!

@willirath
Collaborator

> likely to fall behind if we try to keep things in sync

To me, this sounds like an argument for syncing versions. As a user, I'd love to immediately be able to tell what is the highest version of dask and dask.distributed that I can safely use with dask-jobqueue.

@lesteve
Member

lesteve commented Aug 1, 2019

The main assumption I am making is that, in the future, there will be far fewer breaking changes in dask that need dask-jobqueue changes compared to what we have seen recently (e.g. the dask 2.0 changes). This assumption may well be too idealistic, and I may be strongly biased by my past involvement in projects (e.g. scikit-learn) where backward compatibility with old dependency versions was not negotiable.

With this assumption in mind:

  • in most cases there is nothing to indicate since the current release of dask-jobqueue is already compatible with the new release of dask
  • if not much has happened in dask-jobqueue, it feels slightly weird to bump the dask-jobqueue version just to indicate that it is compatible with the new dask version (which it may already be anyway, see the previous bullet point)

Also, I feel the usual mechanisms to indicate compatibility between different projects are, in no particular order: README/changelog, dependency constraints (in conda meta.yml or pip setup.py), and runtime checks at import time that tell you that your version is too old and needs to be updated. I agree this is not as immediate as you would probably like it to be.
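To illustrate the last of these mechanisms, a sketch of what a runtime check at import time could look like (the minimum version here is hypothetical, not an actual dask-jobqueue requirement, and the packaging library is assumed to be available):

```python
# Hypothetical import-time check that the installed distributed is recent enough.
import distributed
from packaging.version import parse

_MIN_DISTRIBUTED = "2.2.0"  # illustrative minimum only

if parse(distributed.__version__) < parse(_MIN_DISTRIBUTED):
    raise ImportError(
        "dask-jobqueue requires distributed >= {}, but {} is installed; "
        "please update distributed.".format(_MIN_DISTRIBUTED, distributed.__version__)
    )
```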

@jacobtomlinson
Member

Just to chime in here. This project, like the other cluster managers, is here to bring two technologies together: Dask and batch job schedulers. Likewise, dask-kubernetes is Dask and Kubernetes, and dask-yarn is Dask and Hadoop.

Therefore I feel that version numbering should be independent of both technologies rather than pinned to one of them. If a new batch job scheduler is added to this library, that should indicate a feature revision. If there are breaking changes in Dask that need addressing, that should also indicate a feature revision. If some of those breaking changes propagate to this library and affect usage, then that should indicate a major revision.

It's harder to communicate these things if you are pinning to the Dask version. As @lesteve says, there are already mechanisms in Python for setting dependencies and ensuring compatibility. One thing we could do is strict major pinning (>=2,<3 for example), so that if Dask does a major release with breaking changes, it is not supported automatically.
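As an example, strict major pinning could look roughly like this in setup.py (a sketch; the exact pins are illustrative, not the project's actual constraints):

```python
# setup.py excerpt (illustrative pins only).
from setuptools import setup, find_packages

setup(
    name="dask-jobqueue",
    packages=find_packages(),
    install_requires=[
        "dask>=2,<3",         # any 2.x is accepted; a future 3.0 is rejected automatically
        "distributed>=2,<3",
    ],
)
```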

@guillaumeeb
Member

I think we can close this one for now; consensus was reached to keep independent version numbering.
