
feat(ci): GitHub Actions PR gate + multi-arch GHCR publish #2

Merged
ancongui merged 1 commit into main from feat/github-actions-multiarch-cicd
May 14, 2026
Conversation

@ancongui
Contributor

Adds PR-gate (lint, typecheck advisory, unit, docker-build smoke) and a multi-arch publish workflow (linux/amd64 + linux/arm64) that pushes to ghcr.io/firefly-operationos/flydesk-idp with SemVer + sha + latest tagging plus SLSA provenance and CycloneDX SBOM. Sibling firefly-framework repos are cloned into ./vendor/ at workflow time so the Dockerfile's --build-context references resolve identically in CI and locally. Refreshes README/deployment/overview/architecture to match the EDA + W3C tracing + probes work that landed in #1, adds docs/cicd.md, and ships .pre-commit-config.yaml with a local no-anthropic-keys hook. All YAML validates and 94 unit tests stay green.

Adds two workflows under ``.github/workflows/`` and refreshes the
prose docs to match the EDA / W3C / probes work that landed in #1.

CI/CD

* ``pr-gate.yaml`` -- runs on PRs:
    - ``lint``: ruff check + format
    - ``typecheck`` (advisory): pyright src/flydesk_idp
    - ``unit``: pytest tests/unit (94 tests, in-memory SQLite + EDA)
    - ``docker-build``: single-arch Docker build smoke (no push)
  Each job clones the sibling firefly framework repos
  (fireflyframework-pyfly + fireflyframework-agentic) into
  ``./vendor/`` and rewrites ``pyproject.toml`` path sources so the
  Dockerfile's ``--build-context`` references resolve identically in
  CI and locally.
* ``docker-publish.yaml`` -- multi-arch image push on main + tags:
    - platforms: linux/amd64, linux/arm64 (QEMU + buildx)
    - registry: ``ghcr.io/firefly-operationos/flydesk-idp``
                (owner is normalised to lower-case at runtime because
                GHCR rejects the mixed-case ``firefly-operationOS``)
    - tags: ``main`` / ``sha-<short>`` / ``latest`` (default branch),
      ``vX.Y.Z`` / ``vX.Y`` / ``vX`` (on git tag),
      ``manual-<run_id>`` (workflow_dispatch)
    - SLSA provenance + CycloneDX SBOM attached to every manifest
    - GHA cache backend for fast warm builds
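
The tag matrix above can be read as a small decision function. The sketch below is only an illustration in Python, not the logic in ``docker-publish.yaml`` itself. The names ``image_name`` and ``tags_for``, the 7-character short SHA, and the choice that a manual run gets only its ``manual-<run_id>`` tag are all assumptions made for the example.

```python
import re

REGISTRY = "ghcr.io"
# Matches refs such as refs/tags/v1.2.3 (plain SemVer, no pre-release suffix).
SEMVER_TAG = re.compile(r"^refs/tags/v(\d+)\.(\d+)\.(\d+)$")


def image_name(owner: str, repo: str) -> str:
    """Build the image name with owner and repo forced to lower-case."""
    return f"{REGISTRY}/{owner.lower()}/{repo.lower()}"


def tags_for(event: str, ref: str, sha: str, run_id: int,
             default_branch: str = "main") -> list[str]:
    """Return the image tags for one workflow run, following the list above."""
    if event == "workflow_dispatch":
        # Assumption: a manual run is tagged with its run id only.
        return [f"manual-{run_id}"]
    match = SEMVER_TAG.match(ref)
    if match:
        major, minor, patch = match.groups()
        return [f"v{major}.{minor}.{patch}", f"v{major}.{minor}", f"v{major}"]
    if ref == f"refs/heads/{default_branch}":
        # Assumption: the short SHA is the first 7 characters.
        return [default_branch, f"sha-{sha[:7]}", "latest"]
    return []
```

For example, ``image_name("firefly-operationOS", "flydesk-idp")`` yields the lower-case registry path listed above, and a push of ``refs/tags/v1.2.3`` maps to ``v1.2.3``, ``v1.2`` and ``v1``.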

PR template + pre-commit

* ``.github/pull_request_template.md`` with a verification checklist
  (lint, unit tests, docker boot, EDA smoke).
* ``.pre-commit-config.yaml`` with the standard guards
  (detect-private-key, check-merge-conflict, yaml/toml linters, ruff)
  plus a local ``no-anthropic-keys`` hook implemented as
  ``scripts/precommit_no_anthropic_keys.sh`` so a stray ``sk-ant-…``
  key can never enter the commit history.
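
The ``no-anthropic-keys`` hook itself is a shell script and is not reproduced here. As a rough illustration of the idea, the Python sketch below scans the files pre-commit passes in and reports anything shaped like a key. The regular expression (the ``sk-ant-`` prefix followed by at least eight key-like characters) and the function names are assumptions for the example, not the script's actual pattern.

```python
import re
import sys
from pathlib import Path

# Assumption: a hit is the "sk-ant-" prefix followed by 8+ key-like characters.
KEY_PATTERN = re.compile(r"sk-ant-[A-Za-z0-9_\-]{8,}")


def find_key_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines containing something shaped like a key."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if KEY_PATTERN.search(line)
    ]


def main(paths: list[str]) -> int:
    """Scan the given files; a non-zero return value would block the commit."""
    failed = False
    for name in paths:
        try:
            text = Path(name).read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable or deleted path: nothing to scan
        for lineno in find_key_lines(text):
            print(f"{name}:{lineno}: possible Anthropic API key", file=sys.stderr)
            failed = True
    return 1 if failed else 0
```

Wired up as a local hook, the entry point would call ``sys.exit(main(sys.argv[1:]))`` so that any hit fails the commit.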

Docs

* ``docs/cicd.md`` -- full walkthrough of the two workflows, tag
  matrix, image consumption, release flow, secrets needed for
  private-repo sibling checkouts.
* ``README.md`` -- new CI badges, GHCR image reference,
  ``EDA / async jobs`` row flipped to Postgres default + W3C tracing
  + probes + multi-arch container.
* ``docs/deployment.md`` -- topology rewritten for the Postgres-only
  default (Redis is now optional), ``env`` block flipped, new §3.1
  "From the registry" with the multi-arch pull command, §5
  "Health + readiness" expanded with the actual indicator names.
* ``docs/overview.md`` -- async-path diagram updated to show the
  outbox + LISTEN/NOTIFY flow instead of ``JobQueue.publish``.
* ``docs/architecture.md`` -- boot sequence note that EDA
  ``EventPublisher`` comes from pyfly's auto-configuration; DI
  paragraph updated.

Verified

* All YAML files (workflows + pre-commit + pyfly.yaml + compose)
  parse cleanly.
* 94 unit tests still green.
@ancongui ancongui merged commit 4b8aeac into main May 14, 2026
0 of 3 checks passed
@ancongui ancongui deleted the feat/github-actions-multiarch-cicd branch May 14, 2026 14:59
ancongui pushed a commit that referenced this pull request May 14, 2026
Two follow-ups that unblock the PR-gate and publish workflows the
multi-arch CI/CD PR (#2) added.

Ruff

* Added ``[tool.ruff.lint.per-file-ignores]`` exempting five areas
  from the rules that fight the public-API contract:
    - ``src/flydesk_idp/interfaces/**``: ``N801``, ``N811``, ``N815``,
      ``N818`` (camelCase DTOs are the wire format).
    - ``src/flydesk_idp/core/services/jobs/**`` +
      ``…/webhook/**``: ``N818`` (controller-advice maps these
      exception names by spelling).
    - ``src/flydesk_idp/core/services/judge/**``: ``N815`` (re-binds
      the public field names locally).
    - ``src/flydesk_idp/core/services/validation/**``: ``N811``
      (``PyUUID`` aliases stdlib ``UUID`` to avoid colliding with the
      project's column type).
    - ``tests/**``: ``E501`` -- test fixtures routinely build long
      inline DTOs.
* Ran ``ruff check --fix --unsafe-fixes`` + ``ruff format`` against
  the tree to normalise the imports and quoting that fell out of
  the unit-test fix in 99ec5f1.
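
To make the effect of the per-file ignores concrete, the sketch below models the lookup in Python: given a path, which rule codes are suppressed. It is an approximation only. Ruff does its own glob matching, and ``fnmatch`` is used here as a stand-in. The truncated ``…/webhook/**`` entry is left out rather than guessed, and ``ignored_rules`` is a hypothetical helper name.

```python
from fnmatch import fnmatch

# Patterns and rule codes copied from the list above; the truncated
# webhook path is omitted rather than reconstructed.
PER_FILE_IGNORES = {
    "src/flydesk_idp/interfaces/**": {"N801", "N811", "N815", "N818"},
    "src/flydesk_idp/core/services/jobs/**": {"N818"},
    "src/flydesk_idp/core/services/judge/**": {"N815"},
    "src/flydesk_idp/core/services/validation/**": {"N811"},
    "tests/**": {"E501"},
}


def ignored_rules(path: str) -> set[str]:
    """Return the union of rule codes suppressed for ``path``."""
    codes: set[str] = set()
    for pattern, rules in PER_FILE_IGNORES.items():
        # fnmatch's "*" also matches "/", so "**" behaves like a recursive glob here.
        if fnmatch(path, pattern):
            codes |= rules
    return codes
```

A file outside every listed pattern gets an empty set, so the full rule set still applies to the rest of ``src/``.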

Workflows

* ``docker-publish.yaml`` -- the ``actions/attest-build-provenance``
  step now runs with ``continue-on-error: true``. It requires the
  GitHub *Build & Validate Attestations* feature (paid / Enterprise
  or a public repo) and previously failed with a 403 for the
  free-plan ``firefly-operationOS`` org. The buildkit-emitted SLSA
  provenance attached via ``provenance: true`` on
  ``docker/build-push-action`` is the canonical provenance record; this step
  is just a belt-and-braces upload.

Verified

* ``uv run ruff check . && uv run ruff format --check .`` clean.
* ``uv run pytest -q tests/unit`` -- 94 passed.
* The prior docker-publish run already pushed the multi-arch image
  (digest ``sha256:e47002c0…``); only the attestation step failed.
ancongui added a commit that referenced this pull request May 14, 2026
* fix(tests): skip pricing test when agentic cost module is absent

CI clones fireflyframework-agentic at ``main`` to build the workflow
environment. When that ref doesn't yet export
``fireflyframework_agentic.observability.cost``, the
``test_genai_prices_resolves_our_anthropic_models`` test fails with
ImportError -- even though service behaviour is unaffected, since
``outbound_log._extract_usage_fields`` already swallows the same
import silently and falls back to a zero-cost log line.
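
The swallow-and-fall-back behaviour described for ``outbound_log._extract_usage_fields`` follows a common pattern, sketched below. This is not the project's code: ``estimate_cost_usd`` and the ``cost.estimate(...)`` call are hypothetical, since the cost module's API is not shown in this PR.

```python
import importlib

COST_MODULE = "fireflyframework_agentic.observability.cost"


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int,
                      module_name: str = COST_MODULE) -> float:
    """Return a cost estimate, or 0.0 when the cost module cannot be imported."""
    try:
        cost = importlib.import_module(module_name)
    except ImportError:
        # Zero-cost fallback: the caller can still write its log line.
        return 0.0
    # Hypothetical call: the real module's function names are an assumption.
    return float(cost.estimate(model, input_tokens, output_tokens))
```

Because the fallback is silent, a missing module never surfaces at runtime, which is why the test side needs an explicit skip to make the absence visible.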

Use ``pytest.importorskip`` so the test exercises the resolver chain
when the module is reachable (the common local case) and skips
cleanly when it isn't. The skip is informative -- it carries a reason
that points at the most likely cause -- so a regression that loses
the cost feature shows up as a rise in the skip count instead of a silent
fall-through.
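
A minimal sketch of the ``pytest.importorskip`` pattern described above, assuming pytest is installed. The test name and reason string are illustrative and are not copied from ``tests/unit/test_observability_usage.py``.

```python
import pytest


def test_cost_module_is_reachable():
    # Skips (rather than fails) when the module is missing from the checkout.
    cost = pytest.importorskip(
        "fireflyframework_agentic.observability.cost",
        reason="agentic checkout does not export observability.cost",
    )
    # Only reached when the import succeeded; real assertions would go here.
    assert cost is not None
```

With ``pytest -rs`` the reason string is printed next to each skipped test, which is what makes the skip informative.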

Verified locally: 15/15 in tests/unit/test_observability_usage.py.

* fix(ci): ruff per-file-ignores + advisory attestation step

Two follow-ups that unblock the PR-gate and publish workflows the
multi-arch CI/CD PR (#2) added.

Ruff

* Added ``[tool.ruff.lint.per-file-ignores]`` exempting five areas
  from the rules that fight the public-API contract:
    - ``src/flydesk_idp/interfaces/**``: ``N801``, ``N811``, ``N815``,
      ``N818`` (camelCase DTOs are the wire format).
    - ``src/flydesk_idp/core/services/jobs/**`` +
      ``…/webhook/**``: ``N818`` (controller-advice maps these
      exception names by spelling).
    - ``src/flydesk_idp/core/services/judge/**``: ``N815`` (re-binds
      the public field names locally).
    - ``src/flydesk_idp/core/services/validation/**``: ``N811``
      (``PyUUID`` aliases stdlib ``UUID`` to avoid colliding with the
      project's column type).
    - ``tests/**``: ``E501`` -- test fixtures routinely build long
      inline DTOs.
* Ran ``ruff check --fix --unsafe-fixes`` + ``ruff format`` against
  the tree to normalise the imports and quoting that fell out of
  the unit-test fix in 99ec5f1.

Workflows

* ``docker-publish.yaml`` -- the ``actions/attest-build-provenance``
  step now runs with ``continue-on-error: true``. It requires the
  GitHub *Build & Validate Attestations* feature (paid / Enterprise
  or a public repo) and previously failed with a 403 for the
  free-plan ``firefly-operationOS`` org. The buildkit-emitted SLSA
  provenance attached via ``provenance: true`` on
  ``docker/build-push-action`` is the canonical provenance record; this step
  is just a belt-and-braces upload.

Verified

* ``uv run ruff check . && uv run ruff format --check .`` clean.
* ``uv run pytest -q tests/unit`` -- 94 passed.
* The prior docker-publish run already pushed the multi-arch image
  (digest ``sha256:e47002c0…``); only the attestation step failed.

* docs(cicd): note GHCR visibility default + attestation gating

* fix(ci): exclude ./vendor/ from ruff so framework clones aren't linted

The PR-gate workflow clones fireflyframework-pyfly and
fireflyframework-agentic into ./vendor/ so the Dockerfile's BuildKit
``--build-context`` references resolve in CI the same way they do
locally. ``uv run ruff check .`` then descends into those clones and
fails on framework code we don't own. ``extend-exclude = ["vendor"]``
keeps the framework trees out of the lint scope while still letting
the docker job pick them up.

---------

Co-authored-by: ancongui <andres.contreras@soon.es>
ancongui added a commit that referenced this pull request May 31, 2026
ancongui added a commit that referenced this pull request May 31, 2026