Skip to content

feat(cicd): build, test, and release github actions#5

Merged
gforsyth merged 6 commits intosubstrait-io:mainfrom
danepitkin:danepitkin/github-actions
May 30, 2023
Merged

feat(cicd): build, test, and release github actions#5
gforsyth merged 6 commits intosubstrait-io:mainfrom
danepitkin:danepitkin/github-actions

Conversation

@danepitkin
Copy link
Copy Markdown
Contributor

@danepitkin danepitkin commented May 16, 2023

  • Build the Substrait Python package
  • Locally install the package and run pytests
  • Version tag triggers github release
  • Version tag triggers PyPi publish
  • Push to main triggers Test PyPi publish

Requires GH secrets to be added for PYPI_API_TOKEN and TEST_PYPI_API_TOKEN.

@danepitkin
Copy link
Copy Markdown
Contributor Author

This is my first time implementing GH actions, so please review carefully!

Copy link
Copy Markdown
Member

@gforsyth gforsyth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @danepitkin -- this looks good to me overall, and I think the logic is sound.

What do you think about splitting out the release and publish workflows into a separate file? It's still possible to use needs across files and it can avoid certain footguns.

@danepitkin
Copy link
Copy Markdown
Contributor Author

danepitkin commented May 16, 2023

Hey @danepitkin -- this looks good to me overall, and I think the logic is sound.

What do you think about splitting out the release and publish workflows into a separate file? It's still possible to use needs across files and it can avoid certain footguns.

I would also prefer this approach, but it started to get a bit hairy w/o myself having a good understanding of how to test these actions. AFAIK needs can only be used in the same workflow. When you have separate workflows (files), you need to use workflow_run[1]. e.g.:

name: Release Python package

on:
  workflow_run:
    workflows: [build, test]
    types: [completed]
    branches: [main]

I can give it a shot! I tried using https://github.com/nektos/act for testing locally, but ran into issues trying to get the upload-artifact action to work locally.

[1] https://stackoverflow.com/questions/72869054/github-action-add-needs-with-separate-workflow-file

@danepitkin
Copy link
Copy Markdown
Contributor Author

I could also just switch to a test.yaml that runs for PRs and a release.yaml that does the build + publish.

@jorisvandenbossche
Copy link
Copy Markdown

I would also assume that we would have a separate CI workflow for just the tests (that doesn't necessarily creates wheels first), and which would test on different OS, python versions, possibly versions of dependencies, etc

And in that case, I think keeping a simple test step (essentially to verify that the build artifacts are OK) in the build and release workflow might be OK without splitting it.

@jorisvandenbossche
Copy link
Copy Markdown

Yes, so essentially what you also said in your last comment

- name: Install package dependencies
run: |
python -m pip install --upgrade pip
pip install ".[test]"
Copy link
Copy Markdown

@jorisvandenbossche jorisvandenbossche May 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep a test step for this "release" workflow, we probably want to explicitly install one of the built packages (in /dist)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the new test.yml to install from dist/. It runs on push as well, but its not gating. Probably good to rely on tests to pass before running the release workflow?

Comment thread .github/workflows/build.yml Outdated
@@ -0,0 +1,34 @@
name: Build
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Build workflow will always run on PRs, main branch, and version tags. It will upload the build artifact that can be accessed by downstream workflows with 1 day retention (default is 30 days).

Comment thread .github/workflows/test.yml
Comment thread .github/workflows/release.yml Outdated
@@ -0,0 +1,55 @@
name: Release
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Release workflow will run when the Test workflow completes AND is successful AND we are on the main branch. If the git ref is a git tag, create a Github release AND upload package to PyPi. If the git ref is the main branch, upload to Test PyPi.

@danepitkin danepitkin mentioned this pull request May 16, 2023
Copy link
Copy Markdown

@assignUser assignUser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice to see such an automatic release process :D

One major security issue on the test workflow and some additional comments.

The only way to really test these is to merge them to main on your fork and test them there (the pypi steps would fail but as you are relying on an action there it should be fine to assume they will work given the right creds)

Comment thread .github/workflows/test.yml Outdated

on:
workflow_run:
workflows: [build]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unable to confirm in the docs but iirc this need to be either the full name: value or the yaml file name including extension,

Comment thread .github/workflows/test.yml Outdated
Comment on lines +26 to +30
- name: Download artifact
uses: actions/download-artifact@v3
with:
name: dist
path: dist/
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not work as the artifact actions can only be used within the same run but workflow_run is disjunct from the triggering run. You have to use the api to get the artifact (see docs)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I had assumed the github action would have used the API under the hood, but I hadn't actually managed to test it yet.

Comment thread .github/workflows/test.yml Outdated
on:
workflow_run:
workflows: [build]
types: [completed]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a security risk as the workflow_run trigger starts a run with elevated permissions, which means that running any kind of untrusted code (e.g. from an outside PR).

The default permissions on the GITHUB_TOKEN are rw on almost everything, so even with no secrets exposed in the env a lot of damage could be done.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My recommendation would be to move this into the build workflow as a secondary job with needs: as I don't really see a need to keep this in a separate file.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just have this test workflow not rely on the build workflow. I think we can install the package from source here in one of the first steps (that essentially do that same, but without the need to upload/download artifacts, and this is a very fast step anyway for a small pure python package like we have)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove all workflow_run usage. They need a big red security warning in the docs to it's more obvious!

Comment thread .github/workflows/build.yml Outdated
pull_request:
push:
branches: [ main ]
tags: [ 'v*.*.*' ]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm that this is what you want: this will trigger the workflow off of any 'v*' tag on any branch (which would make sense if maintenance branches are planned).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I might update to a more strict regex.

Comment thread .github/workflows/release.yml Outdated
release:
name: Release
runs-on: ubuntu-latest
if: ${{ github.event.workflow_run.conclusion == 'success' && startsWith(github.ref, 'refs/tags/') }}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if: ${{ github.event.workflow_run.conclusion == 'success' && startsWith(github.ref, 'refs/tags/') }}
if: ${{ github.event.workflow_run.conclusion == 'success' && startsWith(github.ref, 'refs/tags/v') }}

Maybe better to be specific (probably a bit paranoid 🤷 :D)

Comment thread .github/workflows/release.yml
- name: Publish package to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
password: ${{ secrets.PYPI_API_TOKEN }}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice pypi supports OICD which might be worth a look: https://docs.pypi.org/trusted-publishers/

Comment thread .github/workflows/release.yml
Comment thread .github/workflows/build.yml Outdated
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.x"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not make sense to build on the lowest supported version?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our use case, that shouldn't matter

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.x was also recommended by the official docs, which is why I chose it.

Comment thread .github/workflows/build.yml Outdated
python -m pip install build --user
- name: Build package
run: |
python -m build
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this is a pure python project at the moment and will build a none-any wheel? Otherwise cibuildwheel would be my goto :D

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this will build a single pure python wheel, so cibuildwheel would be overkill at the moment

Comment thread .github/workflows/test.yml Outdated
- name: Install wheel and test dependencies
run: |
python -m pip install --upgrade pip
find ./dist/*.whl | xargs python -m pip install
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
find ./dist/*.whl | xargs python -m pip install
python -m pip install .

Install from source (which will also go through building a wheel, but just on the fly without having to download the pre-built wheel)

@danepitkin
Copy link
Copy Markdown
Contributor Author

danepitkin commented May 17, 2023

I've updated the code to remove workflow_run and was able to test the github actions locally for test.yml and the build + publish_test jobs in release.yml (eventually failing due to no Test PyPI token). This was on MacOS (ARM) using nektos/act:

act --container-architecture linux/amd64 --artifact-server-path /tmp/artifacts

Copy link
Copy Markdown

@assignUser assignUser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now. A minor change that would mostly be for comparability future changes to the default token permissions would be to explicitly set the minimum required permissions: https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs

Copy link
Copy Markdown
Member

@gforsyth gforsyth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together @danepitkin !

And thanks for the security check @assignUser !

@jorisvandenbossche -- do you want to take another look over this? If not, I'll merge this in later today.

@danepitkin
Copy link
Copy Markdown
Contributor Author

Thanks all for the review! @gforsyth any recommendations on how to get PyPI secrets configured? Are there substrait-owned secrets we can reuse?

@gforsyth
Copy link
Copy Markdown
Member

We can ask @cpcloud to add them to the repository once we've created them. I usually create a package on PyPI with a (near) empty pyproject.toml and then scope the API token to cover only that package.

I don't think there will be existing substrait secrets because there aren't any Python offerings (yet!)

@gforsyth
Copy link
Copy Markdown
Member

gforsyth commented May 18, 2023

Ok, I've created https://pypi.org/project/substrait/ and invited @jacques-n and @cpcloud as owners of the package. @westonpace -- I couldn't find a PyPI username for you, if you have one, let me know and I'm happy to add you.

The OIDC setup looks pretty straightforward, but will require someone who has access to the repository settings (and that's not me).

Similarly, using a project-scoped API token (which I can create), will also require someone on the PMC to add to the repo secrets.

@westonpace
Copy link
Copy Markdown
Member

@westonpace -- I couldn't find a PyPI username for you, if you have one, let me know and I'm happy to add you.

@gforsyth

I don't have one. However, if me having one would help in any way then please don't hesitate to reach out and I'll add one.

@westonpace
Copy link
Copy Markdown
Member

closes #3

@westonpace westonpace linked an issue May 18, 2023 that may be closed by this pull request
@gforsyth
Copy link
Copy Markdown
Member

I don't have one. However, if me having one would help in any way then please don't hesitate to reach out and I'll add one.

All good! I just didn't want to miss you if you did have one.
I think with 2/3 of the PMC having access we should be good.

@danepitkin
Copy link
Copy Markdown
Contributor Author

Do we think its OK merging now, or should we wait for the secrets to be added to the repo?

Copy link
Copy Markdown

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

name: Publish to Test PyPI
runs-on: ubuntu-latest
needs: build
if: github.ref == 'refs/heads/main'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to push to the test pypi server for every commit on main?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not -- but we can leave it in place for the initial commit (after credentials are setup) to make sure everything is working, then strip it out

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be triggered manually via workflow_dispatch if that would be useful?

Copy link
Copy Markdown
Contributor Author

@danepitkin danepitkin May 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This workflow was recommended by the Python packaging docs[1] themselves:

Now, whenever you push a tagged commit to your Git repository remote on GitHub, this workflow will publish it to PyPI. And it’ll publish any push to TestPyPI which is useful for providing test builds to your alpha users as well as making sure that your release pipeline remains healthy!

[1] https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/

@gforsyth
Copy link
Copy Markdown
Member

Do we think its OK merging now, or should we wait for the secrets to be added to the repo?

I think we wait, because we'll want to make sure the credentials are working without a bunch of extraneous commits (there will almost certainly be some).

@gforsyth
Copy link
Copy Markdown
Member

I've created the tokens and added them to the repository secrets -- going to merge this and see what happens.

@gforsyth gforsyth merged commit ff60671 into substrait-io:main May 30, 2023
@gforsyth
Copy link
Copy Markdown
Member

Looks like the release failed because of some unexpected yaml: https://github.com/substrait-io/substrait-python/actions/runs/5123352674/workflow

@danepitkin
Copy link
Copy Markdown
Contributor Author

Looks like the release failed because of some unexpected yaml: https://github.com/substrait-io/substrait-python/actions/runs/5123352674/workflow

Looking into it now! I must have misread the documentation on how this is configured..

@danepitkin
Copy link
Copy Markdown
Contributor Author

Created #8

@gforsyth
Copy link
Copy Markdown
Member

#8 Fixed the yaml issue and the release pipeline ran. We're getting close!

Now it's that the versioneer style version isn't compliant:
https://github.com/substrait-io/substrait-python/actions/runs/5123895480/jobs/9215062884

 ERROR    HTTPError: 400 Bad Request from https://test.pypi.org/legacy/          
         '0.1.dev1+g2e6f6a5' is an invalid value for Version. Error: Can't use  
         PEP 440 local versions. See                                            
         https://packaging.python.org/specifications/core-metadata for more     
         information.  

@danepitkin
Copy link
Copy Markdown
Contributor Author

Added #9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Adding CIs

5 participants