More frequent DataFusion releases to crates.io (discussion) #2327

Closed
alamb opened this issue Apr 24, 2022 · 7 comments
Labels
enhancement New feature or request

Comments

@alamb
Contributor

alamb commented Apr 24, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
DataFusion's mission statement (from https://arrow.apache.org/datafusion/user-guide/introduction.html#why-datafusion) is to be "easy to embed".

Within the Rust ecosystem, publishing to crates.io is a key way to make something easy to embed (other projects can then use it with a single line of TOML). However, we have been releasing about once every three months, which limits how quickly new features reach crates.io for use in downstream projects.

Also, each three-month release has substantial changes to the interfaces, requiring substantial work from downstream crates to keep up with the API churn.

The most active projects using DataFusion today use a fork or work off the master branch directly (see the presentation linked in #2323).

Describe the solution you'd like
It would be awesome to release DataFusion more frequently, perhaps in a semver compatible way, to encourage more community use.

For example, as suggested by @martinitus in #37 (comment), when there is a new major version of arrow-rs:

I think general best practice (given you use semver) is that if you have a major increase in a dependency that appears in your public API, then you also need a major increase in your library version, as the stuff that appear in your public API may break backwards compatibility for the users of your API.

Describe alternatives you've considered

Keep with the current approximately quarterly release schedule, at least until the APIs have stabilized more.

Additional context

Managing regular releases is a substantial undertaking:

  1. Managing the actual release process requires significant time
  2. Managing which code changes can go where while still maintaining semver is non-trivial (e.g. backporting 'semver' compatible changes; see the discussion in "More frequent major releases for arrow-rs", arrow-rs#1120, for some flavor). We (I) abandoned doing this in arrow-rs due to lack of time.

I don't personally have the bandwidth at this time to organize / manage this process; however, I wanted to get it on our collective radar, and I would be happy to help guide and support anyone who wants to invest the time in making this happen for the community.

@alamb alamb added the enhancement New feature or request label Apr 24, 2022
@andygrove
Member

andygrove commented Apr 25, 2022

I would like to volunteer to help with this. I think that the dask-sql project is going to benefit from more frequent releases of DataFusion.

@martinitus

Another side note: Other rust projects used pre v1.0.0 minor commits to stabilize their APIs for a while. During that phase, they break compatibility also during minor version increments. Once they are "happy" with things they stabilize it in a version 1.0.0 promising strong backwards compatibility from there on.

I guess examples for this would be tokio or hyper?!

That said, I see no real downsides in frequent major releases during stabilization :)
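
For anyone following along, here is a rough sketch of the compatibility rule this discussion hinges on. It assumes Cargo's default (caret) version requirements, where an upgrade counts as compatible only if the leftmost nonzero component stays the same, which is why a 0.x minor bump is treated as breaking. The names `Version`, `parse`, and `is_compatible_upgrade` are made up for this illustration and are not part of Cargo or DataFusion; pre-release and build metadata are ignored.

```rust
/// A plain `major.minor.patch` version (hypothetical helper type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Parse a string such as "7.1.0". Returns `None` unless there are exactly
/// three dot-separated non-negative integers.
pub fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let version = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(version)
}

/// Returns true if moving from `from` to `to` would be accepted by a caret
/// requirement on `from`: `to` is not older, and the leftmost nonzero
/// component of `from` is unchanged.
pub fn is_compatible_upgrade(from: Version, to: Version) -> bool {
    if to < from {
        return false;
    }
    if from.major > 0 {
        // 1.0.0 and later: only a major bump is breaking.
        to.major == from.major
    } else if from.minor > 0 {
        // 0.x.y: a minor bump is treated as breaking.
        to.major == 0 && to.minor == from.minor
    } else {
        // 0.0.z: every release is treated as breaking.
        to == from
    }
}
```

Under these assumptions, going from 7.0.0 to 7.1.0 is a compatible upgrade, 7.1.0 to 8.0.0 is not, and for a pre-1.0 crate 0.5.0 to 0.5.3 is compatible while 0.5.0 to 0.6.0 is not. The version numbers are only illustrative.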

Awesome work! <3

@alamb
Contributor Author

alamb commented Apr 27, 2022

Another side note: Other rust projects used pre v1.0.0 minor commits to stabilize their APIs for a while. During that phase, they break compatibility also during minor version increments. Once they are "happy" with things they stabilize it in a version 1.0.0 promising strong backwards compatibility from there on.

Yes, I agree this would be the ideal versioning scheme. The reason DataFusion is already at a version above 1.0.0 is partly historical: in the past it was released with the rest of the arrow implementations and matched their version numbers.

That said, I see no real downsides in frequent major releases during stabilization :)

The only downside I see is that there may be some misaligned expectations with some users. However, it hasn't seemed to cause any major issues yet, that I know of.

Awesome work! <3

❤️ thank you

@alamb alamb mentioned this issue May 2, 2022
11 tasks
@andygrove
Member

Another option to consider would be Calendar Versioning

@jychen7
Contributor

jychen7 commented May 8, 2022

More frequent DataFusion releases to crates.io

Do we mean more frequent major releases from the master branch?

Another option to consider would be Calendar Versioning

I like it!

andygrove commented 12 days ago
I would like to volunteer to help with this

I can also help with this. However, judging from the last 7.1.0 release (PR), although I can help prepare the release notes, it still requires a committer to take the time to prepare the release candidate artifacts.

@alamb
Contributor Author

alamb commented May 8, 2022

I also can help with this, but from last time 7.1.0 release (#2187), though I can help prepare release note, it still require committe

Thank you @jychen7, your assistance for 7.1.0 was very helpful.

I find the mechanics of creating the artifacts and voting take relatively little time compared to creating release notes and ensuring everything is ready. The assistance of the community on release notes and documentation is amazing.

@alamb
Contributor Author

alamb commented Jan 22, 2023

Given @andygrove is now running releases every month or so, I think we can close this particular issue and open other tickets for other discussions.

Discussion of "nightly" releases: #5023

@alamb alamb closed this as completed Jan 22, 2023