Improve CI Workflow #887

Open
Symmetricity wants to merge 4 commits into systemed:master from Symmetricity:ci/workflow-improvements

@Symmetricity commented May 13, 2026

Improve CI Workflow

This PR was AI generated.

Summary

This updates the GitHub Actions CI workflow so it is faster, less repetitive,
and checks more than whether tilemaker compiles.

  • update GitHub Actions to current major versions
  • add explicit workflow permissions and concurrency cancellation
  • download and checksum the Geofabrik Liechtenstein fixture once, then share it
    with build jobs as an artifact
  • skip build-heavy jobs when a change cannot affect runtime behavior
  • use runner-image-scoped vcpkg binary caches instead of caching the installed
    vcpkg tree directly
  • add Docker Buildx cache and split PR image builds from master publishes
  • generate MBTiles and PMTiles outputs from each CI build path
  • fail immediately if a tilemaker invocation fails or does not create the
    expected output archive
  • run tile generation twice per build path and verify repeatability
  • upload generated tile archives for inspection
  • verify PMTiles archive structure with pmtiles verify
  • compare generated tile contents semantically, not just byte-for-byte
  • normalize MVT geometry encodings during semantic comparison, including
    point ordering and polygon ring rotation while preserving polygon ring
    grouping
  • keep the published Docker action output in repeatability checks without
    treating it as a cross-runner reference for PR-built binaries

Rationale

The old workflow downloaded the same PBF independently in several jobs and only
verified that the build commands completed. That made CI slower than necessary
and left important runtime behavior unchecked.

The shared fixture job makes every build use the same input file and verifies
the Geofabrik .md5 before the fixture is used. If Geofabrik is unavailable or
the checksum does not match, the workflow reports that tile generation was not
verified rather than making the failure look like a code regression.
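The checksum step described above can be sketched as follows. This is a minimal illustration, not the workflow's actual code: it assumes the .md5 file uses the common md5sum layout (hex digest, whitespace, file name), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_md5_file(text: str) -> str:
    """Return the digest from an md5sum-style line: '<digest>  <filename>'."""
    parts = text.split()
    if not parts:
        raise ValueError("empty .md5 file")
    digest = parts[0].lower()
    if len(digest) != 32 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError(f"not an MD5 digest: {digest!r}")
    return digest


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large PBF is never fully loaded into memory."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def fixture_is_valid(pbf: Path, md5_file: Path) -> bool:
    """True when the downloaded fixture matches the published checksum."""
    return md5_of(pbf) == parse_md5_file(md5_file.read_text())
```

A caller would treat a False result, or a failed download, as "tile generation not verified" rather than as a build failure.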

The vcpkg cache now stores vcpkg binary packages and includes the GitHub runner
image identity in the key. That avoids blindly restoring an installed tree from
a different runner image while still allowing warm dependency restores on stable
runners. Docker builds also use the GitHub Actions build cache.
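To illustrate the idea of a runner-image-scoped cache key, here is a hedged sketch in Python. The real workflow presumably builds its key in YAML expressions; this only shows the shape of such a key. It assumes the runner exposes its image identity through the ImageOS and ImageVersion environment variables, and the key prefix and function name are hypothetical.

```python
import hashlib
import os


def vcpkg_cache_key(manifest_bytes: bytes, env=None) -> str:
    """Build a cache key that changes with the runner image or the dependency manifest."""
    env = os.environ if env is None else env
    # Assumption: the runner sets ImageOS / ImageVersion; fall back to 'unknown'.
    image = f"{env.get('ImageOS', 'unknown')}-{env.get('ImageVersion', 'unknown')}"
    manifest_hash = hashlib.sha256(manifest_bytes).hexdigest()[:16]
    return f"vcpkg-bincache-{image}-{manifest_hash}"
```

With a key like this, a new runner image starts from a cold cache instead of restoring binaries built against a different toolchain.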

Generated-tile verification exists because the vector-tile archive is
tilemaker's most important output. A build can compile cleanly while still
producing invalid PMTiles, non-repeatable tile content, or different content
between build paths. CI should catch those cases.

Each tilemaker invocation is now checked as soon as it returns. This avoids
multi-command CI steps masking a failed archive generation when a later command
succeeds, and it makes the failing output visible in the generation job rather
than later as a missing artifact or verifier failure.
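The per-invocation check can be sketched like this. The workflow itself likely does this in shell; the Python below is only an illustration under that caveat, and run_and_check is a hypothetical name.

```python
import subprocess
from pathlib import Path


def run_and_check(cmd: list, output: Path) -> None:
    """Run one command and fail at once if it errors or leaves no usable output."""
    # Remove any stale file so an old archive cannot hide a failed run.
    output.unlink(missing_ok=True)
    result = subprocess.run(cmd)
    if result.returncode != 0:
        raise RuntimeError(f"command failed with exit code {result.returncode}: {cmd}")
    if not output.is_file() or output.stat().st_size == 0:
        raise RuntimeError(f"expected output missing or empty: {output}")
```

Checking each call separately means a later successful command in the same step cannot mask an earlier failure.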

Output verification

The new verifier checks two levels of output equality:

  • raw decompressed tile bytes, which catches byte-for-byte differences after
    removing archive compression noise
  • semantic MVT content, which canonicalizes layer order, feature order, and tag
    order before comparing. It also decodes geometry command streams so equivalent
    point ordering and polygon ring rotation do not fail as content changes, while
    preserving polygon outer-ring and inner-ring grouping.
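The two levels of equality could be sketched as below. This is not the verifier's code: it assumes tiles have already been decoded from protobuf into plain dicts, and the dict layout (name, features, type, tags, geometry) is a hypothetical stand-in for whatever the script uses internally.

```python
import gzip
import hashlib
import json


def raw_tile_hash(blob: bytes) -> str:
    """Hash tile bytes after undoing gzip, so compression settings do not matter."""
    if blob[:2] == b"\x1f\x8b":  # gzip magic number
        blob = gzip.decompress(blob)
    return hashlib.sha256(blob).hexdigest()


def canonical_tile(layers: list) -> list:
    """Sort tags, features, and layers so ordering alone cannot change the hash."""
    canon = []
    for layer in layers:
        features = [
            {
                "type": f["type"],
                "tags": sorted(f["tags"].items()),  # tag keys are unique per feature
                "geometry": f["geometry"],
            }
            for f in layer["features"]
        ]
        features.sort(key=lambda f: json.dumps(f, sort_keys=True))
        canon.append({"name": layer["name"], "features": features})
    canon.sort(key=lambda layer: layer["name"])
    return canon


def semantic_tile_hash(layers: list) -> str:
    """Hash the canonical form of a decoded tile."""
    payload = json.dumps(canonical_tile(layers), sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()
```

Two tiles with equal semantic hashes but different raw hashes differ only in ordering or encoding, which is the case reported as a notice.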

Raw byte differences with identical semantic content are reported as notices.
Semantic tile-content differences fail the job and include the first differing
tile. When layer feature counts differ, the error also reports the affected
layer and both counts.
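The notice-versus-failure split might look roughly like this. It is a sketch under assumptions: each run is represented as a mapping from a (z, x, y) tile id to a (raw hash, semantic hash) pair, and both function names are hypothetical.

```python
def compare_runs(first: dict, second: dict):
    """Return (tiles that differ only in raw bytes, first semantically differing tile)."""
    notices = []
    first_failure = None
    for tile in sorted(set(first) | set(second)):
        a, b = first.get(tile), second.get(tile)
        if a is None or b is None or a[1] != b[1]:
            # Missing tile or different semantic hash: a real content difference.
            if first_failure is None:
                first_failure = tile
        elif a[0] != b[0]:
            # Same content, different bytes: report but do not fail.
            notices.append(tile)
    return notices, first_failure


def feature_count_differences(a_counts: dict, b_counts: dict) -> list:
    """Describe layers whose feature counts differ between two versions of a tile."""
    lines = []
    for layer in sorted(set(a_counts) | set(b_counts)):
        a, b = a_counts.get(layer, 0), b_counts.get(layer, 0)
        if a != b:
            lines.append(f"layer {layer}: {a} features vs {b}")
    return lines
```

For the poi example from the fork runs, feature_count_differences({"poi": 102}, {"poi": 101}) yields a single line naming the poi layer and both counts.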

PMTiles archives are also checked with pmtiles verify before their tile content
is compared. The PMTiles verifier output is included in the GitHub annotation so
the failure is visible in the web UI without digging through raw logs.
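Surfacing verifier output as an annotation comes down to printing a GitHub workflow command with the message escaped. The sketch below shows one way to format it; the function names are hypothetical, and the message passed in is assumed to be the captured output of pmtiles verify.

```python
def _escape_data(text: str) -> str:
    """Escape a workflow-command message so multi-line output survives."""
    return text.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def _escape_property(text: str) -> str:
    """Escape a workflow-command property value such as the title."""
    return _escape_data(text).replace(":", "%3A").replace(",", "%2C")


def error_annotation(title: str, message: str) -> str:
    """Format an ::error workflow command; printing it creates the annotation."""
    return f"::error title={_escape_property(title)}::{_escape_data(message)}"
```

Printing the returned string to stdout from a job step is what makes the message appear in the run summary.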

The GitHub Action job is still generated and checked for repeatability, but it
uses the published ghcr.io/systemed/tilemaker:master Docker image. Because
that is not necessarily the same binary as the PR-built CMake and Makefile
outputs, the verifier excludes that artifact from cross-runner equivalence
checks.
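Excluding an artifact from cross-runner comparison while still checking it for repeatability can be sketched as a simple filter. The artifact names and the one-hash-per-artifact representation are hypothetical simplifications.

```python
from itertools import combinations


def cross_runner_mismatches(hashes: dict, excluded: set) -> list:
    """Return pairs of comparable artifacts whose semantic hashes differ."""
    comparable = sorted(name for name in hashes if name not in excluded)
    return [
        (a, b)
        for a, b in combinations(comparable, 2)
        if hashes[a] != hashes[b]
    ]
```

The excluded artifact is still hashed and compared against its own repeat run elsewhere; it is only skipped when comparing different build paths.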

Current findings

The verifier was run locally against artifacts downloaded from fork CI runs.
Those tests showed that the new check is catching real output issues, not just
formatting or compression differences:

  • Windows PMTiles archives failed structural verification with a header length
    mismatch reported by pmtiles verify.
  • Repeat runs produced semantic tile-content differences, including examples
    where the same tile/layer had different feature counts, such as poi having
    102 features in one run and 101 in the repeat run.
  • Many byte-level differences were only ordering differences, which is why the
    verifier now separates raw-byte mismatches from semantic MVT mismatches.
  • Some raw MVT geometry differences were equivalent encodings of the same
    feature geometry; the verifier now normalizes those before calculating
    semantic hashes.
  • One fork CI run uploaded an incomplete generated-tile artifact after a build
    job produced only three of the four expected archives. The workflow now fails
    at the tile generation step if an expected archive is missing or empty.

These findings are not fixed by this PR. This PR adds CI coverage that exposes
them consistently.
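The geometry normalization mentioned in the findings can be illustrated with a small decoder for the MVT command stream (MoveTo = 1, LineTo = 2, ClosePath = 7, zigzag-encoded deltas). This is a simplified sketch rather than the verifier's implementation: it rotates each polygon ring to a canonical start and sorts multipoint members, and it preserves ring grouping simply by never reordering rings.

```python
def zigzag_decode(n: int) -> int:
    """Decode a zigzag-encoded parameter integer into a signed delta."""
    return (n >> 1) ^ -(n & 1)


def decode_geometry(commands: list) -> list:
    """Decode an MVT command stream into paths of absolute (x, y) points."""
    paths, current = [], []
    x = y = 0
    i = 0
    while i < len(commands):
        cmd_id, count = commands[i] & 0x7, commands[i] >> 3
        i += 1
        if cmd_id == 7:  # ClosePath takes no parameters
            continue
        for _ in range(count):
            x += zigzag_decode(commands[i])
            y += zigzag_decode(commands[i + 1])
            i += 2
            if cmd_id == 1:  # MoveTo starts a new path
                current = []
                paths.append(current)
            current.append((x, y))
    return paths


def rotate_ring(ring: list) -> list:
    """Pick the smallest rotation so the start vertex no longer matters.

    Direction is kept, so winding order (outer vs inner ring) is unchanged.
    """
    if not ring:
        return ring
    return min(ring[i:] + ring[:i] for i in range(len(ring)))


def normalize_geometry(geom_type: int, paths: list) -> list:
    """Canonicalize decoded geometry for hashing."""
    if geom_type == 1:  # POINT: member order in a multipoint is not significant
        return sorted(paths)
    if geom_type == 3:  # POLYGON: rotate rings, keep ring order and grouping
        return [rotate_ring(ring) for ring in paths]
    return paths
```

For example, the command streams [9, 6, 12, 18, 10, 12, 24, 44, 15] and [9, 16, 24, 18, 24, 44, 33, 55, 15] describe the same triangle starting from different vertices, and both normalize to [[(3, 6), (8, 12), (20, 34)]].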

Related issues and PRs

This PR does not include the build-fix changes from #886. If master still has
those build failures when this PR is tested, this PR should be applied after
#886 or rebased once equivalent fixes land.

Directly related:

This PR may improve future detection and debugging for output-related issues,
but does not claim to fix them:

No directly relevant GitHub Discussions were found for deterministic generated
tile output checks.

Testing

  • actionlint .github/workflows/ci.yml
  • python3 -B -c 'import py_compile; py_compile.compile(".github/scripts/verify-generated-tiles.py", cfile="/tmp/verify-generated-tiles.pyc", doraise=True)'
  • git diff --check
  • local verifier run against downloaded fork CI tile-output artifacts

The clean branch currently contains only CI-related files:

  • .github/workflows/ci.yml
  • .github/scripts/verify-generated-tiles.py
  • .gitignore

Update the workflow defaults before adding deeper output checks. This adds explicit workflow permissions, run cancellation, shared fixture download and verification, newer GitHub Actions versions, runner-scoped vcpkg binary caches, and Docker build caching.

This reduces duplicated work across jobs, avoids using an unchecked PBF download independently on each runner, makes cache reuse less brittle across runner image changes, and keeps PR Docker builds separate from master publishes.

Generate MBTiles and PMTiles artifacts from each CI build path, run the generation twice, and compare the results after the build jobs complete.

The verification step checks PMTiles archive structure, records raw decompressed tile-byte hashes, and also canonicalizes MVT layer, feature, and tag ordering so ordering-only changes are reported separately from semantic tile-content differences.

This turns CI into a correctness check for generated output, not just a compiler check. Current runs show that the new check is finding real issues: Windows PMTiles archives fail structural verification, and repeat runs can produce semantic differences such as different feature counts in the same layer/tile.

Because this adds a Python CI helper, ignore Python bytecode and cache directories produced by local validation.

Decode MVT geometry before calculating semantic hashes so equivalent point ordering and polygon ring rotation do not fail CI as content changes.

Keep the published Docker action output in repeat-output checks, but exclude it from cross-runner comparisons because that action uses the ghcr.io master image rather than the PR-built binaries.

@Symmetricity force-pushed the ci/workflow-improvements branch from fd48625 to 57b1c31 Compare May 13, 2026 05:11
@Symmetricity mentioned this pull request May 13, 2026

CI generated tile artifacts can otherwise be incomplete when one tilemaker invocation fails but a later command in the same step succeeds. This made the Windows CMake job upload only three of the four expected tile files while the job itself still passed.

Wrap direct tilemaker calls so each output is checked immediately, and verify generated files before artifact upload. This makes the failing output visible at the generation step instead of deferring the problem to the verifier or silently uploading partial artifacts.