Implement biweekly releases for arrow-rs, parquet-rs #292

Closed
7 of 8 tasks
alamb opened this issue May 15, 2021 · 18 comments
Labels
development-process Related to development process of arrow-rs

Comments

@alamb
Contributor

alamb commented May 15, 2021

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Implement the process that will allow us to release to crates.io every 2 weeks as described in the proposal

See official vote email chain

I plan to use this ticket to track the work related to implementing this new process

High level plan / updates:

@alamb alamb added the enhancement Any new improvement worthy of a entry in the changelog label May 15, 2021
@alamb alamb self-assigned this May 15, 2021
@alamb
Contributor Author

alamb commented May 15, 2021

@jorgecarleitao is working on changelog creation in #274

There is a packaging issue here: #212

This was referenced May 15, 2021
@alamb
Contributor Author

alamb commented May 15, 2021

Here is what I am thinking for the release workflow. It is based on the main Apache release flow; the major difference is that there is no Release Candidate.

Proposal:

  1. Update the version on the active_release branch with the proposed release number
  2. Create a tag from the active_release branch with the proposed release number
  3. Make a candidate release .tar.gz file with signatures and upload it to the Apache dist dev area, e.g. https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-4.1.0/
  4. Propose release of that candidate tarball on the mailing list
  5. If approved, move the tarball to release, e.g. https://dist.apache.org/repos/dist/release/arrow/apache-arrow-rs-4.1.0/
  6. If not approved, fix whatever the problem is and make a new tag with a new minor release (i.e. don't make "RC" / release candidates, just skip minor versions)

We will then treat uploading to crates.io as a post-release task, done from the officially released source.

I hope to have a script for most of this process (other than the email / approval).
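
As a rough illustration of the naming scheme in the proposal above, here is a minimal sketch. The helper names `dist_url` and `next_minor` are hypothetical; only the two URL patterns are taken from the proposal, and resetting the patch number to 0 when skipping a minor version is an assumption.

```python
# Sketch only: helper names are hypothetical, URL patterns come from the proposal.
DIST_ROOT = "https://dist.apache.org/repos/dist"


def dist_url(stage: str, version: str) -> str:
    """Return the dist URL for a version; stage is "dev" (candidate) or "release"."""
    if stage not in ("dev", "release"):
        raise ValueError(f"unknown stage: {stage!r}")
    return f"{DIST_ROOT}/{stage}/arrow/apache-arrow-rs-{version}/"


def next_minor(version: str) -> str:
    """Skip to the next minor version after a failed vote (step 6).

    Assumption: the patch number resets to 0.
    """
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"{major}.{minor + 1}.0"


if __name__ == "__main__":
    print(dist_url("dev", "4.1.0"))
    print(next_minor("4.1.0"))
```

Skipping versions instead of numbering candidates means every tag maps to exactly one tarball name, so the dev and release URLs differ only in the stage component.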

@alamb
Contributor Author

alamb commented May 17, 2021

I need help verifying the proposed source tarball format for the Arrow Rust releases.

Specifically, can someone please:

  1. Download the example files and ensure you can successfully validate the signatures
  2. Ensure that the contents of this tarball could be used to publish to crates.io

Background: I have been working on the new release process for
arrow-rs (updates in [2]). The contents and changelog in this example
release tarball are from [3] and were created using the scripts /
instructions in [1].

[1] #299
[2] #292
[3] #305

Here is an example output (including Vote Email) generated by script in [1]:

Example output:


cd /Users/alamb/Software/arrow-rs/ && ./dev/release/create-tarball.sh  0.0.3
Attempting to create /Users/alamb/Software/arrow-rs/dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz from tag 0.0.3
Draft email for dev@arrow.apache.org mailing list

---------------------------------------------------------
To: dev@arrow.apache.org
Subject: [VOTE][RUST] Release Apache Arrow

Hi,

I would like to propose a release of Apache Arrow Rust
Implementation, version 0.0.3.

This release candidate is based on commit: f3959f59a6119dab23818e6eef87e0d7b58c820e [1]

The proposed release tarball and signatures are hosted at [2].
The changelog is located at [3].

Please download, verify checksums and signatures, run the unit tests,
and vote on the release.

The vote will be open for at least 72 hours.

[ ] +1 Release this as Apache Arrow Rust
[ ] +0
[ ] -1 Do not release this as Apache Arrow Rust  because...

[1]: https://github.com/apache/arrow-rs/tree/f3959f59a6119dab23818e6eef87e0d7b58c820e
[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-0.0.3
[3]: https://github.com/apache/arrow-rs/blob/f3959f59a6119dab23818e6eef87e0d7b58c820e/CHANGELOG.md
---------------------------------------------------------
Running rat license checker on /Users/alamb/Software/arrow-rs/dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz
OK
No unapproved licenses
Signing tarball and creating checksums
Uploading to apache dist/dev to https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-0.0.3
Checked out revision 47764.
A         dev/dist/apache-arrow-rs-0.0.3
A  (bin)  dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz
A         dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz.sha512
A         dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz.asc
A         dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz.sha256
Adding         dev/dist/apache-arrow-rs-0.0.3
Adding  (bin)  dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz
Adding         dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz.asc
Adding         dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz.sha256
Adding         dev/dist/apache-arrow-rs-0.0.3/apache-arrow-rs-0.0.3-f3959f59a.tar.gz.sha512
Transmitting file data ....done
Committing transaction...
Committed revision 47765.

Also in mailing list: https://lists.apache.org/thread.html/re4a67bda78f0636bec589bafc9fb502058159852480f3e61a00ae6c1%40%3Cdev.arrow.apache.org%3E

@andygrove
Member

The tarball contains the directory 0.0.3 and it would be better if that were renamed to apache-arrow-rs-0.0.3.
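
A check for the directory naming suggested here could be sketched as below. The function names are hypothetical, and the expected root `apache-arrow-rs-<version>` is the name proposed in this comment, not necessarily what the release script produces. Requires Python 3.9 or later for `str.removeprefix`.

```python
import tarfile


def top_level_dirs(tarball_path: str) -> set:
    """Return the distinct first path components of all archive members."""
    with tarfile.open(tarball_path, "r:gz") as archive:
        # Some archives prefix members with "./"; drop it before splitting.
        names = [name.removeprefix("./") for name in archive.getnames()]
    return {name.split("/", 1)[0] for name in names if name not in ("", ".")}


def has_expected_root(tarball_path: str, version: str) -> bool:
    """True if everything unpacks under apache-arrow-rs-<version>/."""
    return top_level_dirs(tarball_path) == {f"apache-arrow-rs-{version}"}
```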

@andygrove
Member

andygrove commented May 17, 2021

I ran the following steps:

  • Downloaded tarball
  • Checked signatures
  • Untarred tarball (see note above suggesting a change to the directory name)
  • Ran cargo test
  • From the arrow directory, ran cargo publish --dry-run. Looked good.
  • I cannot test publishing other crates because they depend on the arrow crate being published but I ran some grep commands to check that there were no SNAPSHOT versions and that looked good.

So I would say that this format looks ready to go.
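
The SNAPSHOT check in the last bullet was done with grep commands that are not shown in the thread. As an approximation only, the same idea in Python might look like this, assuming the versions of interest live in `Cargo.toml` files under the unpacked tarball (the function name is hypothetical):

```python
from pathlib import Path


def find_snapshot_versions(root: str) -> list:
    """List (file, line number, line) for Cargo.toml lines mentioning SNAPSHOT."""
    hits = []
    for manifest in sorted(Path(root).rglob("Cargo.toml")):
        lines = manifest.read_text().splitlines()
        for number, line in enumerate(lines, start=1):
            if "SNAPSHOT" in line:
                hits.append((str(manifest), number, line.strip()))
    return hits
```

An empty result corresponds to the "no SNAPSHOT versions" outcome reported above.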

@alamb
Contributor Author

alamb commented May 17, 2021

Thank you @andygrove

@alamb
Contributor Author

alamb commented May 17, 2021

I'll wait until tomorrow for any other comments, then prepare a release candidate build and send it to the mailing list.

@alamb
Contributor Author

alamb commented May 18, 2021

> The tarball contains the directory 0.0.3 and it would be better if that were renamed to apache-arrow-rs-0.0.3.

Sorry @andygrove, I just saw this comment (after I made a release candidate). I will update my script so subsequent releases are named this way.

@alamb
Contributor Author

alamb commented May 24, 2021

Update: we have released version 4.1.0 to crates.io 🎉

This week I will begin putting in place the code / process for the next release (4.2.0 perhaps?) so we can test the process and tools

@alamb
Contributor Author

alamb commented May 24, 2021

I started figuring out how to backport individual PRs from master to active_release

Some notes:

Here is the common ancestor commit:

git merge-base  apache/master apache/active_release
c863a2c44bffa5c092a49e07910d5e9225483193

Thus, you can get the list of commits that are on master but not on active_release via:

git rev-list `git merge-base  apache/master apache/active_release`..apache/master

Here is the list in pretty format:

git rev-list --pretty=oneline `git merge-base  apache/master apache/active_release`..apache/master

5295e25 Document and automate new release process (#299)
5ac771a respect offset in utf8 and list casts (#335)
4c17ac8 feature gate ipc reader/writer (#336)
b2de544 parquet: Speed up BitReader/DeltaBitPackDecoder (#325)
91ef8e9 Enable wasm32 as a target architecture for the SIMD feature (#324)
b88ef80 fix comparison of dictionaries with different values arrays (#332) (#333)
f316798 Fix undefined behavior in FFI (#323)
a25cafb Add ported Rust release verification script (#331)
71c2159 return reference from DictionaryArray::values() (#313) (#314)
dde86b9 feature gate csv functionality (#312)
f042191 fix invalid null handling in filter (#296)
e18b356 Doctests for StringArray and LargeStringArray. (#330)
087cf17 inline PrimitiveArray::value (#329)
7f37a7f Mutablebuffer::shrink_to_fit (#318)
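
To work through a listing like this programmatically, one option is to parse each line into a SHA, a title, and a PR number. The sketch below is not part of the scripts referenced in this thread; the `Commit` type and function names are hypothetical, and it assumes the trailing `(#NNN)` on each line is the PR number (so `(#332) (#333)` yields 333).

```python
import re
from typing import NamedTuple, Optional


class Commit(NamedTuple):
    sha: str
    title: str
    pr: Optional[int]


# A hex SHA, the commit title, and an optional trailing "(#123)" reference.
ONELINE = re.compile(r"^(?P<sha>[0-9a-f]+) (?P<title>.*?)(?: \(#(?P<pr>\d+)\))?$")


def parse_oneline(line: str) -> Commit:
    """Parse one line of `--pretty=oneline` style output."""
    match = ONELINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a oneline log entry: {line!r}")
    pr = match.group("pr")
    return Commit(match.group("sha"), match.group("title"), int(pr) if pr else None)


def oldest_first(log_output: str) -> list:
    """The listing is newest first; reverse it to backport from the bottom up."""
    commits = [parse_oneline(line) for line in log_output.splitlines() if line.strip()]
    return list(reversed(commits))
```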

I have started working through this list (from the bottom), creating PRs using the script in #339, with an initial PR #344:

ARROW_GITHUB_API_TOKEN=$ARROW_GITHUB_API_TOKEN CHECKOUT_ROOT=/tmp/arrow-rs CHERRY_PICK_SHA=7f37a7f2a119dd83c497766265707a64d9b82307 python3 dev/release/cherry-pick-pr.py

@alamb
Contributor Author

alamb commented May 25, 2021

I made several more PRs (#354, #355, #356, #357, #358, #359) as an attempt to cherry-pick to the release branch.

@jorgecarleitao
Member

IMO there is no need to PR these? We could just cherry-pick and merge based on whether they have the label "breaking-change" or not. Alternatively, use an inverse label, "minor-change" (i.e. if we want to be explicit about what to include vs. what not to include).
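
The label-based selection suggested here could be sketched as a simple filter. The PR records below are a hypothetical shape (a dict with `number` and `labels`), not the GitHub API response format; only the label name "breaking-change" comes from the comment.

```python
BREAKING_LABEL = "breaking-change"


def backport_candidates(pull_requests: list) -> list:
    """Return the numbers of PRs that do not carry the breaking-change label."""
    return [
        pr["number"]
        for pr in pull_requests
        if BREAKING_LABEL not in pr["labels"]
    ]


if __name__ == "__main__":
    # Made-up records for illustration only.
    merged = [
        {"number": 1, "labels": ["bug"]},
        {"number": 2, "labels": ["breaking-change", "enhancement"]},
        {"number": 3, "labels": []},
    ]
    print(backport_candidates(merged))
```

With the inverse "minor-change" label, the condition would flip to an explicit opt-in instead of an opt-out.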

@alamb
Contributor Author

alamb commented May 26, 2021

@jorgecarleitao I will think about this. The extra human control / auditing of "we explicitly wanted to include this change in the release" is important, I think. However, there may be too many PRs with this method 🤔

@alamb
Contributor Author

alamb commented May 26, 2021

@jorgecarleitao, upon more thought I plan to keep doing PRs for the following reasons:

  1. It is a natural place to run the CI tests to make sure there are no logical conflicts
  2. It offers a place for the original author / committers to comment and say it should/should not be backported.
  3. It offers a way to make cleanups / fixups and get approval (if needed) for PRs that are not clean cherry-picks

For clean cherry-pick PRs, I plan on just merging the backport PRs when they are green in CI rather than waiting for a re-approval (given the code was already approved and we will have another "approve the whole release" vote). If a PR needs changes, we can get it approved prior to merging.

@jorgecarleitao
Member

Sounds good 👍. One idea would be to have a single PR with all the cherry-picks, so that we only have to review one PR. We could then merge them without squashing. No strong feelings, though; I was just trying to avoid a bunch of PRs that were already reviewed. ^_^

@alamb
Contributor Author

alamb commented May 26, 2021

👍 Let's give this PR-per-backport approach a try for a while and see how it goes

@alamb
Contributor Author

alamb commented Jun 4, 2021

I have added some labels to help the process here: #409

@alamb
Contributor Author

alamb commented Jul 17, 2021

This is about as done as I expect it to be. Closing this ticket 🎉

@alamb alamb closed this as completed Jul 17, 2021

3 participants