[16.0-stable] Backport CI improvements #5724

Merged: rene merged 15 commits into lf-edge:16.0-stable from europaul:backport/16.0/ci-improvements on Apr 1, 2026

@europaul
Contributor

@europaul europaul commented Mar 31, 2026

Description

Backport of #5534, #5551, #5583, #5657, #5662, #5665, #5700, #5702, #5709, #5713, #5714

Cherry-picked CI and build workflow improvements from master to 16.0-stable.

How to test and validate this PR

  • CI workflows pass on the PR itself
  • Verify cache keys in packages job match what eve job expects

Changelog notes

No user-facing changes. CI/build infrastructure improvements only.

Checklist

  • I've provided a proper description
  • I've added a reference link to the original PR
  • PR's title follows the template

europaul and others added 11 commits March 31, 2026 15:43
ci: fix eden test status reporting in PRs

This change updates the expected job title prefix in `status_ui` and
`subjob_statuses` to include the branch suffix, restoring visibility of
individual test…

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit 4cf0daa)
Updates the eden-trusted workflow to reflect the recent renaming of the
16.0 branch to 16.0-stable, ensuring tests run correctly for this branch.

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit d4cf13c)
`gh run list` limits output to 20 results by default. Since we can have
more than that in our PR checks, increase the limit to 100 to ensure we
capture all relevant runs.

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit bde2bed)
Fixes "Conditional expression contains literal text outside replacement
tokens. This will cause the expression to always evaluate to truthy. Did
you mean to put the entire expression inside ${{ }}?"

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit 481dcb4)
And change the docker login step.

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit dfbbe87)
The eve job was rebuilding arm64 packages from scratch instead of
using the ones already built by the packages job. Investigating
the root cause revealed several interrelated issues.

1. Redundant 'pkgs' target in the eve build command

   The eve job ran 'make pkgs eve', but the packages job already
   builds and caches all packages. Since the eve job restores the
   cache first, the 'pkgs' target should be a no-op. Removed it.

2. arm64 packages were never restored from cache

   The cache restore logic had a conditional: if the runner arch
   matched the matrix arch, it skipped both clearing the linuxkit
   cache and restoring the target arch cache. The assumption was
   that the first cache restore (for tool images) already had the
   right packages. But that first restore always fetched the amd64
   generic cache — even on arm64 runners. So arm64 jobs were left
   with amd64 packages in the cache, and 'make pkgs' (issue #1)
   was silently rebuilding everything for arm64.

3. Tool images were hardcoded to amd64

   The cache key for loading tool images (mkconf, mkimage-raw-efi,
   mkrootfs-squash, etc.) into docker was hardcoded to amd64. On
   arm64 runners this is wrong — they need arm64 tool images. Since
   for native builds the target cache already contains these tools,
   we now load them directly from the target cache. The two-cache
   dance (load tools from one arch, then restore packages from
   another) is only needed for riscv64 cross-builds on amd64.

4. The 'rt' platform maps to generic packages

   No build-rt.yml files exist anywhere in pkg/, so PLATFORM=rt
   produces identical packages to PLATFORM=generic. Rather than
   adding a redundant amd64/rt entry to the packages matrix, we
   map 'rt' to 'generic' in the cache key.

The fix simplifies the eve job's cache handling:
- Native builds (amd64, arm64): restore target cache, load tools, build
- Cross-builds (riscv64): restore amd64 cache, load tools, clear,
  restore riscv64 cache, build

The "Arch Runner is Matrix" step is removed as it is no longer used.

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit e1cc105)
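
The simplified cache handling described in this commit message can be sketched as two small shell functions. This is an illustration only, not the workflow's actual code: the function names `cache_platform` and `eve_cache_plan` and the step labels they print are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: map the build platform to the platform used in the
# package cache key. Per the commit message, 'rt' produces the same packages
# as 'generic', so it reuses the generic cache.
cache_platform() {
    case "$1" in
        rt) echo generic ;;
        *)  echo "$1" ;;
    esac
}

# Hypothetical helper: print the ordered cache steps for a target arch.
# The step labels are illustrative, not real workflow step names.
eve_cache_plan() {
    case "$1" in
        amd64|arm64)
            # Native build: the target cache already contains the tool images.
            echo "restore-cache:$1 load-tools build" ;;
        riscv64)
            # Cross-build on an amd64 runner: tools come from the amd64 cache,
            # packages from the riscv64 cache.
            echo "restore-cache:amd64 load-tools clear-cache restore-cache:riscv64 build" ;;
        *)
            echo "unsupported arch: $1" >&2
            return 1 ;;
    esac
}

eve_cache_plan arm64
eve_cache_plan riscv64
cache_platform rt
```

The point of the sketch is that only the riscv64 path still needs two restores; native builds collapse to a single restore of the target cache.
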
Introduce mk/linuxkit.mk (included from the top-level Makefile) to
consolidate all linuxkit build logic in one place and support three
acquisition modes, selected in priority order:

  1. LINUXKIT_SRC=/path   — build from a local source tree; make
                            tracks Go source changes as dependencies.
  2. LINUXKIT_GIT_URL set — clone at LINUXKIT_GIT_REF (commit hash)
                            and build the binary.
  3. (neither)            — download the upstream release binary.

Mode 2 requires a commit hash (not a branch name) for reproducibility.
The hash is used directly as the versioned binary name — no network
call at parse time.  Make skips the recipe if the binary already
exists (cached).  Use git ls-remote to update the pinned hash.

Defaults to upstream linuxkit at the current master tip:
  LINUXKIT_GIT_URL = https://github.com/linuxkit/linuxkit
  LINUXKIT_GIT_REF = 4cfb70d3cc9256024bb0d4de760c631df0ad06f6
Set LINUXKIT_GIT_URL="" to revert to the release download (mode 3).

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
(cherry picked from commit 91c60bb)
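
The priority order of the three acquisition modes can be sketched in shell as follows. This is not the content of mk/linuxkit.mk: the function names and the printed mode labels are hypothetical, and the 40-hex-character check is an assumed way to tell a full commit hash from a branch name.

```sh
#!/bin/sh
# Sketch of the mode selection. Only LINUXKIT_SRC, LINUXKIT_GIT_URL and
# LINUXKIT_GIT_REF come from the commit message; everything else is
# illustrative.
linuxkit_mode() {
    if [ -n "${LINUXKIT_SRC:-}" ]; then
        echo "local-source"      # mode 1: build from a local source tree
    elif [ -n "${LINUXKIT_GIT_URL:-}" ]; then
        echo "git-clone"         # mode 2: clone at LINUXKIT_GIT_REF and build
    else
        echo "release-download"  # mode 3: fetch the upstream release binary
    fi
}

# Assumption: a pinned ref is a full 40-character lowercase hex SHA-1.
is_commit_hash() {
    case "$1" in
        *[!0-9a-f]*) return 1 ;;   # any non-hex character disqualifies it
    esac
    [ "${#1}" -eq 40 ]
}

# With the defaults from the commit message, mode 2 is selected.
LINUXKIT_SRC=""
LINUXKIT_GIT_URL="https://github.com/linuxkit/linuxkit"
LINUXKIT_GIT_REF="4cfb70d3cc9256024bb0d4de760c631df0ad06f6"
linuxkit_mode
is_commit_hash "$LINUXKIT_GIT_REF" && echo "ref is a pinned commit hash"
```

To refresh the pinned hash, something like `git ls-remote https://github.com/linuxkit/linuxkit refs/heads/master` prints the current tip, which can then be copied into LINUXKIT_GIT_REF.
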
linuxkit now supports environment variables for flags previously
threaded through every linuxkit call site in this Makefile.

Remove LINUXKIT_ORG_TARGET and BUILDKIT_CONFIG_OPTS workarounds:
- Drop BUILDKIT_CONFIG_FILE / BUILDKIT_CONFIG_OPTS variables and the
  manifest-mode guard; replace with LINUXKIT_BUILDER_CONFIG env var
  which defaults to /etc/buildkit/buildkitd.toml if the file exists
- Drop LINUXKIT_ORG_TARGET from all linuxkit call sites (9 locations)
- Drop REGISTRY make variable; use LINUXKIT_PKG_ORG directly

Update docs/BUILD.md to document the new variables:
- LINUXKIT_PKG_ORG for pushing to a custom registry
- LINUXKIT_MIRROR for pull-through proxy configuration

Migration:
  make REGISTRY=myregistry  ->  LINUXKIT_PKG_ORG=myregistry/lfedge make

For buildkit registry mirrors, LINUXKIT_BUILDER_CONFIG auto-detects
/etc/buildkit/buildkitd.toml as before; override by setting
LINUXKIT_BUILDER_CONFIG explicitly in the environment.
For pull-side mirrors, set LINUXKIT_MIRROR in the runner environment.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit 39d43d5)
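
The migration and the LINUXKIT_BUILDER_CONFIG default described in this commit message can be sketched in shell. This only mimics the described behavior and is not the Makefile's or linuxkit's actual logic: both function names are hypothetical, and the optional path argument exists only so the default can be tried without touching /etc.

```sh
#!/bin/sh
# Hypothetical helper: translate an old REGISTRY value into the
# LINUXKIT_PKG_ORG form shown in the migration note.
pkg_org_from_registry() {
    echo "$1/lfedge"
}

# Hypothetical helper: an explicit LINUXKIT_BUILDER_CONFIG wins; otherwise
# the default file is used only if it exists; otherwise nothing is printed.
resolve_builder_config() {
    default_path="${1:-/etc/buildkit/buildkitd.toml}"
    if [ -n "${LINUXKIT_BUILDER_CONFIG:-}" ]; then
        echo "$LINUXKIT_BUILDER_CONFIG"
    elif [ -f "$default_path" ]; then
        echo "$default_path"
    fi
}

# Old:  make REGISTRY=myregistry
# New:
echo "LINUXKIT_PKG_ORG=$(pkg_org_from_registry myregistry) make"
```
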
…ve permissions

Some container images (e.g., wwan/ModemManager) have directories with no
write permission (dr-x------). When collect-sources.sh extracts the rootfs
tar and later tries to remove the temporary directory, rm fails with
"Permission denied" because the non-writable directories prevent file
deletion inside them.

Add chmod -R u+w before rm to ensure the current user has write access
to all extracted files and directories before cleanup.

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit aac852b)
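
The failure mode and the fix can be reproduced with a throwaway directory. The snippet below is a minimal sketch and not the actual collect-sources.sh code: the function name `cleanup_tmpdir` and the directory layout are hypothetical.

```sh
#!/bin/sh
# Grant the current user write access everywhere first, so that files inside
# read-only directories can be unlinked, then remove the tree.
cleanup_tmpdir() {
    chmod -R u+w "$1" && rm -rf "$1"
}

# Reproduce the situation: a directory with mode dr-x------ holding a file.
tmp=$(mktemp -d)
mkdir -p "$tmp/rootfs/locked"
: > "$tmp/rootfs/locked/file"
chmod 500 "$tmp/rootfs/locked"

# As a non-root user, a plain 'rm -rf "$tmp"' would fail here with
# "Permission denied"; with the chmod in front it succeeds.
cleanup_tmpdir "$tmp"
```
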
When building EVE with HV=k, the Makefile includes additional packages
(kube, external-boot-image) and builds pillar with build-k.yml instead
of build.yml. Without a separate package build step for HV=k, the eve
job fails because these packages are missing from the linuxkit cache.

Add an amd64/generic/k entry to the packages matrix so that k-specific
packages are built and cached separately. Update all cache keys to
include the hv dimension (using 'default' for non-k builds and 'k' for
the kubernetes variant).

Signed-off-by: Paul Hendry <phendry@zededa.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit a219c30)
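
How the extra hv dimension keeps k-specific packages in their own cache entry can be sketched as below. The key layout and both function names are hypothetical, since the real key format is not shown in this PR text. Only the 'default' and 'k' values come from the commit message, and 'kvm' is used merely as an example of a non-k value.

```sh
#!/bin/sh
# Map the HV setting to the cache key dimension: 'k' for the kubernetes
# variant, 'default' for everything else (including an unset HV).
hv_dimension() {
    case "${1:-}" in
        k) echo k ;;
        *) echo default ;;
    esac
}

# Hypothetical key layout: $1 = arch, $2 = platform, $3 = HV (may be empty).
package_cache_key() {
    echo "linuxkit-$1-$2-$(hv_dimension "${3:-}")"
}

package_cache_key amd64 generic k
package_cache_key arm64 generic kvm
```

Because both the packages job and the eve job would derive the key from the same function, a k build can never pick up a cache entry that lacks the kube and external-boot-image packages.
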
Update to include the fix from linuxkit/linuxkit#4212

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
(cherry picked from commit 691bb26)
@europaul europaul changed the title from "Backport CI improvements to 16.0-stable" to "[16.0-stable] Backport CI improvements" Mar 31, 2026
@rucoder
Contributor

rucoder commented Mar 31, 2026

@europaul you forgot

1703e59c5 pkg/pillar: remove unsupported linuxkit build.yml fields
380941573 ci: update GitHub Actions to Node.js 24 compatible versions

rucoder and others added 4 commits March 31, 2026 16:21
Remove security_opt and ports fields from build-dev.yml and
build-k-dev.yml. These are Docker Compose-specific fields that
have never been valid in linuxkit package build.yml files and
were silently ignored by older linuxkit versions.

The upcoming linuxkit upgrade introduces strict YAML validation
that rejects unknown fields, so these must be removed first.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
(cherry picked from commit 1703e59)
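
A quick way to spot leftover fields of this kind before a linuxkit upgrade is a plain text search. The sketch below is not part of the PR and is not linuxkit's validation: it uses grep as a stand-in, only looks for the two field names mentioned in the commit message, and the function name and sample file contents are made up.

```sh
#!/bin/sh
# Print matching lines with their line numbers. The exit status is 0 when at
# least one match is found and non-zero otherwise.
find_compose_only_fields() {
    grep -nE '^[[:space:]]*(security_opt|ports):' "$@"
}

# Example run against a throwaway file with invented contents.
sample=$(mktemp)
printf 'image: example\nsecurity_opt:\n  - seccomp:unconfined\nports:\n  - "2345:2345"\n' > "$sample"
find_compose_only_fields "$sample"
rm -f "$sample"
```
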
GitHub is deprecating Node.js 20 actions. Starting June 2nd, 2026,
actions will be forced to run with Node.js 24 by default, and Node.js
20 will be removed from runners on September 16th, 2026.

Update all GitHub Actions in CI workflows to the latest versions that
support Node.js 24 and pin them to commit SHAs for supply-chain
security:

- actions/checkout v5.0.0 -> v6.0.2
- actions/cache v4.3.0 -> v5.0.4
- actions/upload-artifact v5.0.0 -> v7.0.0
- actions/download-artifact v6.0.0 -> v8.0.1
- actions/setup-go v6.0.0 -> v6.3.0
- docker/login-action v3.6.0 -> v4.0.0
- docker/setup-buildx-action v3 (unpinned) -> v4.0.0 (pinned)
- github/codeql-action v4.31.3 -> v4.35.1
- codecov/codecov-action v5.5.1 -> v6.0.0
- zizmorcore/zizmor-action v0.2.0 -> v0.5.2
- google/osv-scanner-action v1.9.2 -> v2.3.5

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
(cherry picked from commit bdfbdad)
The top-level Makefile hardcodes GOOS=linux and exports it to all
sub-makes and child processes.  This is correct for EVE packages
(they run on Linux), but breaks host tools on macOS: the exported
value overrides each tool's own host-OS detection, producing Linux
binaries that cannot execute on the build host.

The immediate symptom is linuxkit (built from source via
mk/linuxkit.mk) getting GOOS=linux instead of GOOS=darwin, but the
same bug is latent in dockerfile-from-checker and GOSOURCES.

Fix:
- Remove the global GOOS=linux assignment (line 337) and drop GOOS
  and GOARCH from the export list.
- In DOCKER_GO, hardcode -e GOOS=linux -e GOARCH=$(ZARCH) so the
  Linux container always gets the correct values explicitly.
- Remove the now-redundant GOOS=$(LOCAL_GOOS) overrides from
  get-deps and compare-sbom-sources — with no global export, Go
  defaults to the host OS automatically.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
(cherry picked from commit a364d1b)
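
The underlying mechanism, an exported variable leaking into every child process, can be shown without a Go toolchain. In this sketch `host_tool_goos` is a hypothetical stand-in for a host tool build that falls back to host-OS detection when GOOS is absent from its environment.

```sh
#!/bin/sh
# Stand-in for a host tool build: a child process that reports the GOOS it
# inherited, or 'host-default' if none was passed down.
host_tool_goos() {
    sh -c 'echo "${GOOS:-host-default}"'
}

# Before the fix: a global export reaches every child process.
before=$(export GOOS=linux; host_tool_goos)

# After the fix: GOOS is no longer exported globally, so host tools fall back
# to their own detection. Only the Linux build container receives GOOS
# explicitly (the commit passes -e GOOS=linux in DOCKER_GO).
after=$(unset GOOS; host_tool_goos)

echo "before: $before"
echo "after:  $after"
```
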
The TestPciLongExists unit test relies on the presence of specific
PCI devices (e.g., 0000:00:00.0) on the host system running the tests.
This causes failures in CI environments or on non-x86 architectures
where such devices may not exist or be accessible.

This change skips the test to avoid false negatives.

Signed-off-by: Shahriyar Jalayeri <shahriyar@posteo.de>
(cherry picked from commit 9b7cc85)
@rene rene merged commit 1922eda into lf-edge:16.0-stable Apr 1, 2026
32 of 34 checks passed