Skip to content

Deploy action #27

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 35 commits into from
Aug 4, 2023
Merged

Deploy action #27

merged 35 commits into from
Aug 4, 2023

Conversation

billbrod
Copy link
Member

@billbrod billbrod commented Jul 7, 2023

This PR adds notes on building and deploying python packages, as well as two potential actions for doing so on Github Actions, depending on whether the project is pure python or not.

Included changes:

  • Adds content to 02-packaging.md to describe building and deployment.
  • Adds two yml files, deploy-cibw.yml and deploy-pure-python.yml. They're described in the note.
  • Uses setuptools_scm to dynamically determine version number. Updates pyproject.toml to reflect this, as well as __init__.py
  • Starts CONTRIBUTING.md, with some example text on how releases work. This references binder, which we're not currently using. So this can be changed or sections of it removed, if people think that's confusing.

Two bigger issues that need to be resolved:

  • Note that we can't actually test this properly until after the merge: I had tried triggering the action on a PR (e.g., this one), but I'm pretty sure the github commit SHA during a pull request is one off from the local head, see Merge commit ref shouldn't stay same after changing base branch for a PR actions/checkout#919 (comment). That is, it creates a "fake commit". This is a problem because setuptools_scm determines the tag based on how many commits you are from the most recent tag, and thus always thought I was one commit ahead. If you look at this action, under the test setuptools_scm step, you'll see the tag is 0.0.2 but setuptools_scm creates a tag of 0.0.3.dev1 (line 69). You can read its docs for details on how it determines this (I use python-simplified-semver and no-local-version, specified in pyproject.toml), but basically it's the "distance and clean" case: it sees there's been 1 commit, so it increments the semantic version (0.0.2 -> 0.0.3) and adds dev1 (to signify one commit). None of this should be a problem if we trigger the action from a release in main, but we can't test until after merge.
  • Additionally, this project is a "pure python" one. That means the cibuildwheel will always fail and thus we can't test the deploy-cibw.yml file. Should we include two fake packages here, one pure python and one with a little C code saying "hello world" so we can test both?
    • In particular, I'm worried about the interaction between cibuildwheel and setuptools_scm, so I'd like to test they behave as expected.

Closes #22

@billbrod billbrod marked this pull request as ready for review July 7, 2023 21:10
@BalzaniEdoardo
Copy link
Collaborator

BalzaniEdoardo commented Jul 24, 2023

Hei Billy, very good job. I learned a lot on deployment by following your instructions. Minor changes a part, I think my main comment are:

  1. I would split the manual build into pure-python and native deps, as in the other sessions
  2. I would use the manual build to explain what happens in the action (as you did) but not the other way around.
  3. I am confused on how the release should trigger a test vs regular pypi deployment. maybe we should recommend to add a tag to the pre-release (v1.0.0-alpha or something) so that one can then proceed to use the actual tag when publishing to pypi

fetch-depth: 0

- name: Build sdist
run: pipx run build --sdist
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naive question: is python already setup when you run this job?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, no. I got this from the CI build wheel docs, which says: "The GitHub Actions runners have pipx installed, so you can easily build in just one line." So in general I think you need to run the setup python step, but pipx is ready to go (and creates virtual environments with specific python versions as needed)


### Optional dependencies
- Source distribution are the raw files, as stored on GitHub (or whatever platform you use), and will require the user's computer to build the package itself. For pure python packages, this is probably not a problem, as the user will need to have python anyway. However, if you have non-python compiled dependencies (called "native dependencies" for some reason), this will get more complicated. For example, if you depend on C or C++ code, the user will need to have a C compiler, which Windows machines do not have by default.
- Binary files are the already built / compiled source files. They include everything necessary and are ready to be installed directly. They're thus faster to install and less likely to run into user- or operating system-specific issues. However, this means that the developer must build and upload to PyPI separate wheels for each operating system they wish to support. However, if you have no compiled extensions, you can produce a "universal" or ["pure python" wheel](https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/#pure-python-wheels) (so-called because all of your code is in python), which will work on all operating systems.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when you say "compiled extensions", do you mean "native dependencies"? If it is the same concept introduced in the first point, I would use the same terminology (naive deps or compiled deps)

testing notes](05-linters-and-tests.md) for more info.

## Build
In this note, we'll discuss how to build and upload your library using a GitHub action that handles as much of this as possible for you: it will build the source distribution and wheel files, check that installation is possible, run some tests, and upload to PyPI.

## pyproject.toml Configuration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## pyproject.toml Configuration
## Configuring Your Python Project with the pyproject.toml

I was trying to come up with a more descriptive title for the section, let me know if you like this proposal

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this should be more specific, how about: "Specifying project metadata and build instructions with pyproject.toml ", what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it


This will build the source distribution and wheel for your system. If your code is pure python, as discussed above, then you're done!

If you have native dependencies, then you're using `deploy-cibw.yml`, and you might want to be able to run `cibuildwheel` locally, in order to check against what's happening in the Github action. To do so, see [the cibuildwheel documentation](https://cibuildwheel.readthedocs.io/en/stable/setup/#local) for how to install and run it. Note that you'll need [docker](https://www.docker.com/products/docker-desktop) installed to build the Linux wheel (which is possible from any OS); if you wish to build the macOS or Windows wheel, you'll need that operating system (see cibuildwheel docs for more info).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken, this is the first instance where we mention the specific deploy action. In order to avoid potential confusion, I recommend that we don't mention the 'git-action' while detailing the manual procedure. I would concentrate solely on the build tutorial at this stage. then we could leverage the understanding gained from the manual process later on to clarify what occurs within the action

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you could separate the procedure in two, as in the background. One procedure would be for pure-python builds, and another for native deps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried doing this, let me know what you think


### Understanding the action file

#### Pure python
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we split the background, and the git-action in pure-python vs native deps, I would follow the same structure in the manual deployment procedure


### Why is publish a separate job?

In both of these actions, `publish` is a separate job, rather than a step within the deploy job, and you may be wondering why. The main reason is because you might be running the tests multiple times in parallel (e.g., in `deploy-cibw.yml`, we build the wheel separately for each OS), but we only want to upload to PyPI once, including all files. To do this, we make use of Github's [upload and download artifact](https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts) actions, which allows us to share files between jobs. For the pure python build, the `build` action is only run on one OS, but using the artifact actions allows you to download the built package for examination on your local machine, if you'd like.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean "...a step within the build job"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!


The action defined in `.github/workflows/deploy-pure-python.yml` builds and deploys a pure python project. This section is based on [this guide](https://packaging.python.org/en/latest/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/) from the python packaging authority and makes use of [their action](https://github.com/pypa/gh-action-pypi-publish).

This action gets triggered on every github release and does the following:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another naive Q: if I trigger the deploy action on test-PyPI by tagging the repo , how do I deploy the actual PyPI with the same tag? how should i trigger again the deploy?

I have a proposal for how we could handle the test/actual pypi deployment:
should we suggest to tag test-pypy release with the version + some other label like, alpha/beta...?
v1.0.0-alpha

Then finally tag it without the label to trigger PyPi?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyPI and Test PyPI are completely separate indices, so there's no problem with conflicting names or tags. Thus, to deploy with the same tag, I'd remove thewith lines, delete the tag and go again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be worried about putting too much string-parsing logic like that in the job. My intention was that people would go to Test PyPI just in the beginning, to make sure the action is working and it looks like they want, and then deploy right to PyPI from then on, rather than always deploying to both.

The only use case I can think of for why they'd want to go back to TestPyPI afterwards would be if they're adding/removing a bunch of files and want to check it all again. But in that case, they could modify the test line at the end of build job or do it manually (e.g., build the wheel locally and investigate it / upload to Test PyPI)

Copy link
Collaborator

@BalzaniEdoardo BalzaniEdoardo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the caveats about 'breaking change' being hard to pin down, though I do think the original specification of semantic versioning is pretty much good (once semantic vagueness is taken into account). The newer proposal seems less helpful to me as it focuses on one dimension from one perspective, is less nuanced (also the order is reversed from major->batch in the two).

Copy link
Member Author

@billbrod billbrod Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the order reversing, I'll change that.

For the discussion of semantic versioning, here's Henry Schreiner's notes, which basically is this xkcd. He argues that, if you have enough users, someone will be using your package in an unexpected way (relying on a private attribute or method being the most common one; this is probably more true with scientists as users, who might not realize they shouldn't rely on private attributes/methods), and so you can't easily decide what's breaking or not. Thus, it's better to clarify that SemVer is about authorial intent, rather than sounding like you're making guarantees about whether changes will break or not. I hear what you're saying though, this version sounds mushier (though more nuanced, IMO).

Should I add the explanation I gave above to the packaging notes and/or contributing stub?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gleaned that from the doc which was very clear, and I liked the caveat given about how hard it can be, and I liked that you included the xkcd. To me the standard semver description seems sufficient, when given with the caveats that this can be really hard to specify in practice and these aren't really hard and fast divisions (because xkcd).

I'm just not convinced you need to actually go any further with an attempt to make it more explicit (it seems almost like a Penrose tiling problem semantically). But, if you like it, it's not that big a deal to me: it fits with the overall explanatory tenor of this doc, and you were clear that you weren't really coming down with a strong claim with the "regardless of what you decide" qualification.

One option would be to give the usual standard, and the caveats, and something like "...some devs have even tried to accommodate these concerns by X an alternate system" (With a link to the Schriener notes but without giving a whole block quote which to me just seems to give it too much weight).

But if you prefer it as is, I really don't have a strong feeling about it: this is obviously a pretty minor nit to pick. 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha (I thought I had included the xkcd in the docs but forgot it was workflow.md, not packaging.md, so I couldn't find it and thus posted it again). I think you're right here, especially because our packages are on the smaller side. And there's certainly an advantage to using the standard. I pushed changes that integrated it.

And it's a good nit to pick!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me know what you think of the current version

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

python -m build --outdir dist/ --sdist --wheel
```

This will build the source distribution and wheel for your system. If your code is pure python, as discussed above, then you're done! You can double check this by running `ls dist/*.whl`; if your package is pure python, the name will look like `package-version-py3-none-any.whl` (replacing `package` and `version` with the name and version, respectively, of your package). Otherwise, you have native dependencies.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what it means to say "native dependencies" (from context it seems to mean that it isn't pure Python, but is that all it means? I think it means it depends on OS-specific, compiled code). But maybe just spell this out? E.g., if you have a yaml file that isn't python is it a native dependency (you know what I'm saying).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, it means "non-python compiled dependencies", which I define on line 16

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah missed that, so it's all good.


We deploy to PyPI every release, and we follow [semantic versioning](https://semver.org/) for release labeling (see [workflow notes](00-workflow.md) for more details on when to release and how this labeling works; this repo's `CONTRIBUTING.md` document also includes some suggested language for explaining the procedure to ccontributors). We do this by making use of continuous integration (CI) on github actions. See the [CI](06-ci.md) notes for more on CI and Github actions in general; in the following sections we discuss specifically how to use our `deploy` actions.

We provide two different Github actions for handling building and deployment: `deploy-pure-python.yml` and `deploy-cibw.yml` (`cibw` = "CI build wheel"). If your package is pure python, as described earlier in this note, use `deploy-pure-python.yml`, otherwise use `deploy-cibw.yml`. If you're not sure, manually build your package as described [earlier](#build) (`python -m build --wheel`) and then `ls dist/*.whl` to examine your built wheels. If your output looks like `package-version-py3-none-any.whl`, it's pure python. If it includes the name of your OS in the filename, it is package-specific and you should use `deploy-cibw.yml`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this is all super helpful stuff and great!

@billbrod billbrod merged commit 48f08cc into main Aug 4, 2023
@billbrod billbrod deleted the deploy_action branch August 4, 2023 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deploy action
3 participants