Add Python build type specification #151

Merged
TomHennen merged 6 commits into main from python-build-spec
Apr 24, 2026

@TomHennen
Owner

Summary

Spec for build/actions/python/ — the Python build type for wrangle v0.2.

Key design decisions

  • uv as first-class: detects uv.lock → uses uv sync/uv build/uv run pytest; falls back to standard PEP 517 (python -m build)
  • No tokens needed: PyPI Trusted Publishing via OIDC (adopter configures trusted publisher on PyPI, no secrets in repo)
  • Two attestation layers:
    • PEP 740 attestations via pypa/gh-action-pypi-publish (Sigstore, PyPI-native) — proves who published
    • SLSA provenance via actions/attest-build-provenance (GitHub attestation store) — proves how it was built
  • CycloneDX SBOM from resolved dependencies
  • pytest gating before publish (skips gracefully if no tests found)
  • No separate SLSA generator job: actions/attest-build-provenance works for any artifact and avoids the tag-vs-SHA issue we hit with the container builder's SLSA generator
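
The uv detection described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the action's implementation: the helper name `choose_commands` is hypothetical, and the fallback test command (`python -m pytest`) is an assumption, since the summary only names the fallback build command.

```python
from pathlib import Path


def choose_commands(project_dir: str) -> dict[str, list[str]]:
    """Pick build tooling based on whether uv.lock is present (hypothetical helper)."""
    if (Path(project_dir) / "uv.lock").is_file():
        # uv is first-class when a lockfile exists.
        return {
            "install": ["uv", "sync"],
            "build": ["uv", "build"],
            "test": ["uv", "run", "pytest"],
        }
    # Fallback: standard PEP 517 build front end.
    # Assumption: tests run via "python -m pytest"; the summary does not say.
    return {
        "build": ["python", "-m", "build"],
        "test": ["python", "-m", "pytest"],
    }
```

Keying the decision on the lockfile alone keeps detection cheap and means adopters opt in to uv simply by committing `uv.lock`.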

Related

Test plan

  • Spec-only PR, no code changes

🤖 Generated with Claude Code

Spec for build/actions/python/ covering:
- pyproject.toml-based builds with uv detection and PEP 517 fallback
- PyPI Trusted Publishing (OIDC, no tokens)
- Two attestation layers: PEP 740 (Sigstore, PyPI-native) + SLSA
  provenance (actions/attest-build-provenance, GitHub attestation store)
- CycloneDX SBOM generation
- pytest test gating

See #150 for the unified build results vision.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@TomHennen TomHennen temporarily deployed to integration-test April 23, 2026 01:04 — with GitHub Actions Inactive
Comment thread build/actions/python/SPEC.md Outdated

The Python build type builds, tests, and publishes Python packages to PyPI with Sigstore attestations. It follows PyPI's Trusted Publishing model (OIDC, no API tokens) and PEP 740 attestations for supply chain provenance.

Unlike the container build type, Python packages don't need a separate SLSA generator or Cosign signing step — PyPI's native Sigstore integration (via `pypa/gh-action-pypi-publish`) handles attestation and signing in a single publish step.
Owner Author

Not true for SLSA provenance?

Owner Author

You're right — PEP 740 attestations prove publisher identity (who published), not SLSA provenance (how it was built). I've already added actions/attest-build-provenance for SLSA provenance in step 7, and added a "Two layers of attestation" section explaining the distinction. But I should fix this overview paragraph to not imply PEP 740 covers SLSA. Will update.

Comment thread build/actions/python/SPEC.md Outdated

### Attestations are built-in

PEP 740 attestations are generated and uploaded by `pypa/gh-action-pypi-publish` when `attestations: true` is set. The attestations are Sigstore-based (Fulcio certificates, Rekor transparency log) and verified natively by PyPI. No separate signing or provenance step is needed.
Owner Author

Again, not SLSA attestations.

Owner Author

Agreed — this paragraph conflates PEP 740 attestations with SLSA attestations. PEP 740 is Sigstore-based publisher identity, not SLSA provenance. Will fix the language to be precise about what PEP 740 covers and note that SLSA provenance is handled separately.

Comment thread build/actions/python/SPEC.md Outdated
| Input | Required | Default | Description |
|-------|----------|---------|-------------|
| `path` | No | `.` | Relative path to the directory containing `pyproject.toml` |
| `python-version` | No | `3.12` | Python version to use for building |
Owner Author

Should we have anything in place to change the default over time? If not, should this be required? Otherwise it will get out of date.

If we do change it, people's projects might break. On the other hand, regular adopters won't need to do anything (beyond revving their wrangle commit) to get the latest version.

Owner Author

Good catch. A hardcoded 3.12 will go stale. Better options: (1) detect from pyproject.toml's requires-python field, (2) use actions/setup-python's python-version-file feature which reads from pyproject.toml natively, or (3) make it required with no default so adopters are explicit.

I think (2) is cleanest — actions/setup-python already supports python-version-file: pyproject.toml and picks the right version. We could default python-version to empty and use the version-file approach, with the explicit input as an override. That way it stays current without wrangle needing to track Python releases.
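
A rough sketch of option (2), assuming the action simply forwards one of two inputs to actions/setup-python. The helper name `setup_python_inputs` is hypothetical; `python-version` and `python-version-file` are the setup-python inputs named above.

```python
def setup_python_inputs(python_version: str = "") -> dict[str, str]:
    """Return the inputs to forward to actions/setup-python (hypothetical helper)."""
    explicit = python_version.strip()
    if explicit:
        # An explicit input from the adopter overrides detection.
        return {"python-version": explicit}
    # Default: let setup-python read the version from pyproject.toml.
    return {"python-version-file": "pyproject.toml"}
```

With an empty default, the common case stays current without wrangle tracking Python releases, and adopters who need a pinned interpreter can still pass one.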

Owner Author

Oh yeah 2 sounds good.

Comment thread build/actions/python/SPEC.md Outdated
|-------|----------|---------|-------------|
| `path` | No | `.` | Relative path to the directory containing `pyproject.toml` |
| `python-version` | No | `3.12` | Python version to use for building |
| `publish` | No | `true` | Whether to publish to PyPI. Set to `false` for PR builds or dry runs |
Owner Author

Should we detect when it's a PR build or dry run so that people don't mess up?

Owner Author

Yes — wrangle should auto-detect this. On pull_request events, publishing should default to false regardless of the input value (you never want to publish from a PR). We could gate it: publish only takes effect on push to a tag or the default branch. If someone explicitly sets publish: true on a PR, we should either warn or override. The container builder already has a similar pattern with its provenance guard (if: ! startsWith(github.event_name, 'pull_')).

Owner Author

Ok, let's encode that here.

Comment thread build/actions/python/SPEC.md
## Known limitations

- **Trusted Publisher setup is manual.** The adopter must configure the trusted publisher on PyPI (project settings → publishing → add GitHub). Wrangle cannot automate this. The adoption docs must include this step.
- **TestPyPI requires separate trusted publisher config.** Publishing to TestPyPI for testing requires a separate trusted publisher on test.pypi.org.
Owner Author

Should we standardize this for people or should we leave it to them to decide when to publish to test?

Owner Author

I think we should make it easy but not mandatory. A reasonable approach: wrangle could offer a test-pypi-url input (or just let adopters override repository-url to test.pypi.org). But standardizing "always publish to TestPyPI first" adds complexity and requires a separate Trusted Publisher config on test.pypi.org. Leave it to the adopter — document how to use TestPyPI, but do not require it.

Owner Author

sounds good

Comment thread build/actions/python/SPEC.md Outdated
└── test_example.py
```

The fixture should NOT publish to real PyPI. Set `publish: false` in the companion template's `test-python` job. Publishing is tested separately (manually or via TestPyPI).
Owner Author

If we don't test publishing, have we really tested it? What if we publish to test.pypi.org?

Owner Author

Good point — publish: false means we never actually test the publish path. Publishing to test.pypi.org in integration tests is the right call. The companion repo fixture could configure a Trusted Publisher on test.pypi.org and publish there on every integration run. The package name would be something disposable like wrangle-test-fixture. I will update the integration testing section to use TestPyPI instead of skipping publish entirely.

Owner Author

TomHennen left a comment

Responding to individual comments.

- Fix overview: distinguish PEP 740 (publisher identity) from SLSA
  provenance (build metadata), be explicit about SLSA Build L2
- Add onboarding section: Trusted Publisher setup, disable legacy
  uploads (ultralytics-style attack mitigation)
- Remove python-version default: auto-detect from pyproject.toml via
  actions/setup-python's python-version-file feature
- Remove publish input: gate on event context instead (no publish on
  pull_request), preventing accidental PR publishes
- Change SBOM format from CycloneDX to SPDX for consistency with
  container build type; OSV-Scanner supports both
- Use hardcoded SHA in workflow example (not ./) per #136
- Integration testing: publish to TestPyPI instead of skipping,
  exercises full publish path including OIDC and attestations
- Note L3 path: wrangle's reusable workflow is the trust boundary,
  but actions/attest-build-provenance is L2 only
- Document /legacy/ URL as historical artifact, not deprecated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@TomHennen TomHennen temporarily deployed to integration-test April 24, 2026 00:57 — with GitHub Actions Inactive
A supply chain security tool should ship L3 when L3 is available.
Replace actions/attest-build-provenance (L2) with slsa-github-generator
(L3), matching the container build type's architecture: separate
provenance job, isolated non-falsifiable builder, tag-based invocation.

The build job outputs base64-encoded artifact hashes for the generator.
The provenance job is gated on non-PR events, matching the publish gate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@TomHennen TomHennen temporarily deployed to integration-test April 24, 2026 01:01 — with GitHub Actions Inactive
Comment thread build/actions/python/SPEC.md Outdated
| `repository-url` | No | `https://upload.pypi.org/legacy/` | PyPI upload URL. The `/legacy/` path is a historical artifact, not deprecated — it is the current and only supported upload endpoint. Override for TestPyPI (`https://test.pypi.org/legacy/`) or private registries. |
| `run-tests` | No | `true` | Whether to run pytest before publishing |

Note: there is no `publish` input. Publishing is controlled by event context: the build action publishes on `push` to a tag or the default branch, and skips publishing on `pull_request` events. This prevents accidental publishes from PR builds without requiring adopters to configure it correctly. The guard follows the same pattern as the container builder's provenance gate (`if: ! startsWith(github.event_name, 'pull_')`).
Owner Author

This feels like it would be bad for some folks. Like they might not want to publish on every push to main.

Owner Author

Agreed — updated the spec. Wrangle does not decide when to publish. The publish job only gates on "not a PR" to prevent accidental publishes. The adopter controls publish cadence via their calling workflow triggers — e.g., on: push: tags: ["v*"] for tag-only releases, or on: push: branches: ["main"] for every-merge. This is the adopter's choice, not wrangle's.
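
The "not a PR" gate amounts to a prefix check on the event name, mirroring the `startsWith(github.event_name, 'pull_')` expression quoted in the spec. A minimal sketch (the function name `publish_allowed` is hypothetical):

```python
def publish_allowed(event_name: str) -> bool:
    """Allow publishing for any event whose name does not start with 'pull_'."""
    # Blocks pull_request, pull_request_target, and the pull_request_review events.
    return not event_name.startswith("pull_")
```

Everything else, such as `push`, `release`, or `workflow_dispatch`, passes the gate, so the adopter's own workflow triggers decide how often a publish actually happens.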

Comment thread build/actions/python/SPEC.md Outdated
### 6. Generate SBOM

- Generate a CycloneDX SBOM from the project's dependencies using `cyclonedx-python` (or `syft` as fallback). Write to `metadata/python/<shortname>/sbom.cdx.json`.
+ Generate an SPDX SBOM from the project's dependencies using `syft`. Write to `metadata/python/<shortname>/sbom.spdx.json`. SPDX is used for consistency with the container build type (which uses BuildKit-native SPDX). OSV-Scanner supports scanning both SPDX and CycloneDX SBOMs.
Owner Author

Is there a built-in way to generate an SBOM with python? If it's just cyclonedx that might be fine. I think the spec says that build time SBOMs are better than those generated with scanning?

Owner Author

No built-in Python SBOM generator — unlike Docker BuildKit which produces SPDX natively. The options are external tools: syft (produces SPDX directly, preferred) or cyclonedx-python (CycloneDX native, would need conversion). Updated the spec to use syft for SPDX consistency with the container builder, and to generate from the resolved environment at build time rather than scanning lockfiles — build-time SBOMs are more accurate per the main spec.
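
As an illustration of the syft step, the command line might be assembled like this. The helper name `syft_sbom_command` and the choice of scan directory are assumptions (the spec says "resolved environment", which could be the project's virtual environment); the output path follows the spec, and `-o spdx-json=<file>` is syft's format-to-file output syntax.

```python
def syft_sbom_command(source_dir: str, shortname: str) -> list[str]:
    """Build a syft invocation that writes an SPDX JSON SBOM (hypothetical helper)."""
    output = f"metadata/python/{shortname}/sbom.spdx.json"
    # Assumption: source_dir is the resolved environment to catalogue,
    # for example the project's virtual environment directory.
    return ["syft", f"dir:{source_dir}", "-o", f"spdx-json={output}"]
```

The resulting list can be passed to `subprocess.run(..., check=True)` once the output directory exists.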

Comment thread build/actions/python/SPEC.md
The build step must output base64-encoded SHA-256 hashes of the
built artifacts in the format slsa-github-generator expects
(base64(sha256:HASH FILENAME)). This is what the provenance job
consumes via needs.build.outputs.hashes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
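
A minimal sketch of how the build step could produce that `hashes` output. It assumes the generator consumes base64-encoded `sha256sum`-style lines (hex digest, two spaces, file name); if the expected line format differs, only the line template changes. The function name `encode_subjects` is hypothetical.

```python
import base64
import hashlib
from pathlib import Path


def encode_subjects(dist_dir: str) -> str:
    """Base64-encode sha256sum-style lines for every file in dist_dir."""
    lines = []
    for artifact in sorted(Path(dist_dir).iterdir()):
        if artifact.is_file():
            digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
            # Assumed line format: "<hex digest>  <file name>".
            lines.append(f"{digest}  {artifact.name}")
    payload = "\n".join(lines) + "\n"
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")
```

Sorting the artifacts keeps the encoded value stable across runs, which makes it easier to compare the build job's output with what the provenance job attests.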
Address review comments:
- Split build/publish/provenance into three jobs so the build runs
  with minimal permissions (contents: read only). Publish and
  provenance get elevated permissions but are gated on non-PR events.
- Adopters control publish cadence via their calling workflow's
  triggers (e.g., on: push: tags: ['v*']), not wrangle.
- Clarify SBOM: Python has no built-in generator; use syft for
  build-time SPDX from the resolved environment (more accurate than
  scan-time inference from lockfiles).
- File #155 for GitHub attestation store consideration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@TomHennen TomHennen temporarily deployed to integration-test April 24, 2026 01:13 — with GitHub Actions Inactive
permissions:
  actions: read
  id-token: write
  contents: write # for uploading provenance to release assets
Owner Author

where does this get uploaded to?

Owner Author

The SLSA generator uploads provenance as a GitHub Actions workflow artifact by default (no contents: write needed). To also upload as a GitHub Release asset (permanent, discoverable alongside the release), it needs contents: write. The current spec has contents: write on the provenance job for release asset upload, but this only runs on non-PR events. Will document the rationale and make the upload-to-release behavior explicit.

Provenance is uploaded as a workflow artifact (always, no extra
permissions) and as a GitHub Release asset (on tag pushes, requires
contents: write). Release assets are permanent; workflow artifacts
expire after 90 days. The contents: write permission on the
provenance job is scoped to non-PR events only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
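
The upload rule in this commit message can be summarized in a few lines. This is only a sketch of the decision, not of the generator itself; the function name and the destination labels are hypothetical.

```python
def provenance_destinations(ref: str) -> list[str]:
    """List where provenance ends up for a given git ref (hypothetical helper)."""
    # The workflow artifact is always produced and needs no extra permissions.
    destinations = ["workflow-artifact"]
    if ref.startswith("refs/tags/"):
        # Tag pushes also attach provenance to the release (needs contents: write).
        destinations.append("release-asset")
    return destinations
```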
@TomHennen TomHennen temporarily deployed to integration-test April 24, 2026 01:25 — with GitHub Actions Inactive
@TomHennen TomHennen merged commit 87206f0 into main Apr 24, 2026
6 checks passed
@TomHennen TomHennen mentioned this pull request Apr 24, 2026
2 tasks