
fix: ZapLink size limit, signed checksum updater verification, execute process hardening #660

Merged
wizzomafizzo merged 20 commits into main from security/zaplink-sigstore-execute-hardening
Apr 13, 2026
Conversation


@wizzomafizzo wizzomafizzo commented Apr 12, 2026

Summary

  • ZapLink size limit: cap getRemoteZapScript response body to MaxResponseBodySize (1 MiB) using io.LimitReader, matching existing limits in FetchWellKnown and redeemClaimToken
  • Signed checksum updater verification: verify update checksums.txt against an ed25519 signature (checksums.txt.sig) before trusting checksums — CI signs at release time, public key embedded in binary and protected by CODEOWNERS
  • Execute process hardening: wrap spawned commands in platform-specific containment (Windows Job Objects with CREATE_SUSPENDED, Linux process groups with prlimit, macOS process groups with RLIMIT_CPU)

Summary by CodeRabbit

  • New Features

    • Platform-specific process execution now runs in isolated jobs with enforced CPU/memory constraints.
    • Self-update now validates signed checksums for integrity.
  • Bug Fixes

    • Remote script fetches are now size-bounded to prevent excessive memory use.
  • Tests

    • Added cross-platform unit tests covering the new process-execution and signed-checksum behaviors.
  • Chores

    • Updated Go module versions and CI/CD workflows; added cross-lint/task automation and build tooling updates.

getRemoteZapScript() used io.ReadAll without a size limit, allowing a
malicious ZapLink target to return an unbounded response and exhaust
memory. Wrap with io.LimitReader using the existing MaxResponseBodySize
constant (1 MiB), consistent with FetchWellKnown() and
redeemClaimToken().
Chain Sigstore bundle verification after the existing checksum
validation in the self-updater. After downloading a release binary and
verifying its SHA-256 checksum, the updater now fetches the
corresponding attestation from GitHub's attestation API and verifies it
against the Sigstore Public Good trusted root.

Verification enforces that the signing certificate was issued by GitHub
Actions for a workflow in the ZaparooProject/zaparoo-core repository.
If verification fails, the update is rejected (fail closed).

Adds sigstore/sigstore-go v1.1.4 dependency.
…jects

On Windows, wrap execute command child processes in a Job Object that
enforces: kill-on-job-close (cleanup guarantee), active process limit of
1 (prevents spawning further children), and a 512 MiB per-process memory
cap. This contains the blast radius of malicious execute commands.

On non-Windows platforms, RunInJob is a passthrough to cmd.Run().
Extend RunInJob to all platforms instead of Windows-only:

Windows: Use CREATE_SUSPENDED to close TOCTOU race between process
creation and job assignment. Bump ActiveProcessLimit to 2 (allows shell
wrappers). Switch from PROCESS_MEMORY to JOB_MEMORY to cap total job.

Linux: New process group (Setpgid) with Pdeathsig for parent-death
cleanup. Apply RLIMIT_AS (512 MiB) and RLIMIT_CPU (5s) via prlimit64
on the child PID. Override cmd.Cancel to kill the entire process group.

macOS: New process group (Setpgid) with process group kill. Set
RLIMIT_CPU before fork (inherited by child, mutex-protected).
RLIMIT_AS skipped due to unreliable macOS enforcement.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 12, 2026

📝 Walkthrough

Walkthrough

Adds cross-platform process execution isolation via platform-specific RunInJob implementations, introduces Ed25519-signed checksum validation for updater flows, integrates RunInJob into zapscript, and updates CI/CD (cross-platform linting and signed manifest generation) alongside dependency and tooling updates.

Changes

Cohort / File(s) Summary
Process Execution Isolation
pkg/helpers/command/runinjob_darwin.go, pkg/helpers/command/runinjob_linux.go, pkg/helpers/command/runinjob_windows.go, pkg/helpers/command/runinjob_darwin_test.go, pkg/helpers/command/runinjob_linux_test.go, pkg/helpers/command/runinjob_windows_test.go
Added platform-specific RunInJob(cmd *exec.Cmd) error implementations and tests. Darwin uses serialized RLIMIT_CPU adjustments via a mutex; Linux applies prlimit64 to the child (RLIMIT_AS, RLIMIT_CPU) and Pdeathsig; Windows uses Job Objects, CREATE_SUSPENDED, assignment to job, and thread resume logic.
Signed Checksum Validation
pkg/service/updater/signed_checksum.go, pkg/service/updater/signed_checksum_test.go, pkg/service/updater/update_signing.pub, pkg/service/updater/updater.go
Embedded Ed25519 public key and ed25519-based validator; newSignedChecksumValidator() composes signature check for checksums.txt.sig with checksum validation for other assets; makeUpdater now accepts a context and uses the signed validator.
Zapscript Integration
pkg/zapscript/utils.go, pkg/zapscript/zaplinks.go, pkg/zapscript/zaplinks_test.go
cmdExecute now calls command.RunInJob(); getRemoteZapScript now limits body reads via io.LimitReader and added tests for oversized/normal responses.
CI/CD & Tasks
.github/workflows/lint-and-test.yml, .github/workflows/publish-update-manifest.yml, .github/actionlint.yaml, Taskfile.dist.yml, scripts/tasks/cross-lint.yml, scripts/zigcc/Dockerfile, scripts/zigcc/VERSION
Consolidated CI into a single ci job with platform matrix, added cross-lint via a Zig-based Docker image, enabled signed checksums in publish workflow (sign/verify checksums.txt.sig), updated zigcc SDK/version.
Build & Dependency Changes
go.mod, .github/CODEOWNERS
Updated multiple Go module versions and indirect deps; added CODEOWNERS entry for pkg/service/updater/update_signing.pub.
Test Adjustments
pkg/service/reader_manager_test.go
Adjusted a timeout-expiry test to send two removals to account for nondeterministic select ordering.

Sequence Diagram(s)

sequenceDiagram
  participant Z as Zapscript (caller)
  participant C as command.RunInJob
  participant O as OS / Kernel (processes)
  Z->>C: exec.Cmd with SysProcAttr (new pg / suspended / limits)
  C->>O: fork/Start child (new process group / job object)
  Note right of O: apply resource limits (prlimit/Setrlimit or Job config)
  C->>O: on cancel -> send SIGKILL to -pid / close job handle
  O-->>C: process exit
  C-->>Z: return result (err or nil)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I dug a tunnel through OS and time,
I fenced rogue pups and signed every line,
Checksums snug under moonlight's gleam,
Linted across platforms — what a dream!
✨🔐

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 30.19%, below the required 80.00% threshold. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title accurately and specifically summarizes the three main changes: ZapLink size limit, signed checksum verification, and process hardening across platforms.




Fixes GHSA-p77j-4mvh-x3m3 (authorization bypass via missing leading
slash in :path) flagged by dependency review on the grpc v1.79.1
transitive dependency from sigstore-go.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (3)
pkg/zapscript/zaplinks_test.go (1)

690-690: Use the shared size constant instead of a hardcoded literal.

Line 690 hardcodes 1<<20; using helpers.MaxResponseBodySize avoids drift if the limit changes.

♻️ Proposed tweak
 import (
 	"context"
 	"errors"
 	"io"
 	"net/http"
 	"net/http/httptest"
 	"net/url"
 	"runtime"
 	"strings"
 	"testing"
 
 	"github.com/ZaparooProject/zaparoo-core/v2/pkg/database"
+	"github.com/ZaparooProject/zaparoo-core/v2/pkg/helpers"
 	testhelpers "github.com/ZaparooProject/zaparoo-core/v2/pkg/testing/helpers"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/mock"
 	"github.com/stretchr/testify/require"
 )
@@
-	assert.Len(t, body, 1<<20, "response body should be limited to MaxResponseBodySize (1 MiB)")
+	assert.Len(t, body, helpers.MaxResponseBodySize, "response body should be limited to MaxResponseBodySize")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/zapscript/zaplinks_test.go` at line 690, Replace the hardcoded literal
1<<20 in the test assertion with the shared constant
helpers.MaxResponseBodySize: locate the assertion in zaplinks_test.go (the
assert.Len call that checks response body size) and change the expected length
argument from 1<<20 to helpers.MaxResponseBodySize, and update the assertion
message if needed to reference MaxResponseBodySize rather than the numeric
literal.
pkg/helpers/command/runinjob_darwin.go (1)

62-75: Consider logging when RLIMIT_CPU restoration fails.

On line 74, the error from unix.Setrlimit during restoration is silently ignored. While the child has already started, a restoration failure could leave the parent with unexpected CPU limits affecting subsequent operations.

🔧 Proposed fix to log restoration failures
 	startErr := cmd.Start()

 	// Restore parent limits and release the mutex immediately after fork.
-	_ = unix.Setrlimit(unix.RLIMIT_CPU, &oldCPU)
+	if err := unix.Setrlimit(unix.RLIMIT_CPU, &oldCPU); err != nil {
+		log.Debug().Err(err).Msg("failed to restore RLIMIT_CPU after fork")
+	}
 	rlimitMu.Unlock()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/command/runinjob_darwin.go` around lines 62 - 75, The restoration
of RLIMIT_CPU after forking currently ignores errors; update the restoration
call that uses unix.Setrlimit(&oldCPU) (after cmd.Start()) to capture its
returned error and log it (e.g., with log.Debug().Err(err).Msg("failed to
restore RLIMIT_CPU after fork") or similar). Keep the surrounding
rlimitMu.Unlock() and ensure this logging uses the same logger used elsewhere in
this file so parent-limit restoration failures (in the block that manipulates
rlimitMu, oldCPU, unix.Getrlimit, unix.Setrlimit, and cmd.Start) are recorded.
pkg/service/updater/sigstore.go (1)

144-147: Consider adding a timeout to the HTTP client.

Using http.DefaultClient means the request has no explicit timeout. If the GitHub attestation API becomes slow or unresponsive, this could block the update process indefinitely.

🔧 Proposed fix to add timeout
+var attestationClient = &http.Client{
+	Timeout: 30 * time.Second,
+}
+
 func doFetchAttestations(ctx context.Context, url string) ([]json.RawMessage, error) {
 	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, http.NoBody)
 	if err != nil {
 		return nil, fmt.Errorf("creating request: %w", err)
 	}
 	req.Header.Set("Accept", "application/json")

-	resp, err := http.DefaultClient.Do(req)
+	resp, err := attestationClient.Do(req)
 	if err != nil {
 		return nil, fmt.Errorf("requesting attestations: %w", err)
 	}

Note: You'll also need to add "time" to the imports.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/service/updater/sigstore.go` around lines 144 - 147, The code uses
http.DefaultClient.Do(req) (in the resp, err := http.DefaultClient.Do(req) call)
which has no timeout; replace it by constructing a dedicated *http.Client with a
Timeout (e.g. client := &http.Client{Timeout: 10 * time.Second}) and call
client.Do(req) instead, or alternatively wrap the request with a
context.WithTimeout and use req = req.WithContext(ctx); also add "time" to the
imports so the timeout constant compiles. Ensure the new client/context is used
in the function that performs the attestation HTTP request (the block
referencing resp, err and http.DefaultClient).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 191: The go.mod currently pins the transitive dependency
google.golang.org/grpc at v1.79.1 which is vulnerable; update the version
specifier in go.mod from "google.golang.org/grpc v1.79.1" to at least v1.79.3
(or later), then run go get google.golang.org/grpc@v1.79.3 and go mod tidy (or
go get ./... and go mod tidy) to update the module graph and vendor files so the
transitive dependency is patched.

In `@pkg/helpers/command/runinjob_linux_test.go`:
- Around line 58-69: Update TestRunInJob_ProcessGroupKill so it verifies
descendant cleanup by capturing the background sleep child's PID and asserting
it no longer exists after RunInJob returns: run the shell with a command that
backgrounds sleep and echoes its PID (e.g. "sleep 60 & echo $!; wait"),
read/parse that PID from the command's stdout before calling RunInJob, invoke
RunInJob(cmd) as before, then after RunInJob returns use a Unix existence check
(e.g. syscall.Kill(pid, 0) or similar) to assert the child process is gone; keep
the test using TestRunInJob_ProcessGroupKill and RunInJob to locate the change.

In `@pkg/helpers/command/runinjob_linux.go`:
- Around line 56-60: The code currently only logs a warning when
applyResourceLimits(pid) fails, leaving the child running without enforced
limits; change this to fail closed by killing the child and returning the setup
error: after applyResourceLimits(pid) returns an error, call cmd.Process.Kill()
(or cmd.Process.Signal(os.Kill)) and then cmd.Wait() to reap the process, log
the failure with Err(err) and pid, and return the error from the surrounding
function instead of continuing; update the block that references cmd.Process.Pid
and applyResourceLimits to implement this termination-and-return behavior.

In `@pkg/helpers/command/runinjob_windows_test.go`:
- Around line 56-64: Add a new Windows test that verifies the KILL_ON_JOB_CLOSE
cleanup path by starting a long-lived process tree via RunInJob, canceling its
context, and asserting all descendants are terminated; specifically, create a
TestRunInJob_KillOnClose (or extend TestRunInJob_ProcessTreeLimited) that uses
exec.CommandContext with a command that spawns a child/descendant (e.g., nested
"cmd /c ping -t" or a simple sleep loop), call RunInJob(cmd), then cancel the
test context and wait/poll for the original process and its spawned child to
exit (assert they are no longer running and that Wait returns an error if
appropriate), thereby ensuring RunInJob honors KILL_ON_JOB_CLOSE and cleans up
the process tree. Ensure you reference RunInJob and the
ActiveProcessLimit/KILL_ON_JOB_CLOSE behavior when locating the code to change.

In `@pkg/helpers/command/runinjob_windows.go`:
- Around line 128-152: The thread enumeration currently swallows
OpenThread/ResumeThread failures and treats any Thread32Next error as normal;
update the loop in the function that calls Thread32First/Thread32Next
(referencing Thread32First, Thread32Next, OpenThread, ResumeThread,
entry.ThreadID, entry.OwnerProcessID) to track whether at least one ResumeThread
succeeded and return an error if none did, propagate OpenThread/ResumeThread
errors instead of ignoring them, and ensure CloseHandle is still called when a
thread handle is opened; also change the Thread32Next error handling to only
treat windows.ERROR_NO_MORE_FILES as normal completion and return a wrapped
error (e.g., fmt.Errorf("enumerating threads: %w", err)) for any other error.

In `@pkg/service/updater/sigstore.go`:
- Around line 176-196: The newSigstoreVerifier function uses a non-existent
root.FetchTrustedRoot(); replace that with creating a TUF client for the
Sigstore Public Good instance (using tuf.ProductionRoot and
tuf.ProductionMirror) and call root.GetTrustedRoot(tufClient) to obtain the
trusted root before constructing the verifier; update the error handling around
tuf client creation and root.GetTrustedRoot, and keep the rest of the function
(the verify.NewVerifier call and options) unchanged so newSigstoreVerifier
returns the verifier created from the trusted root.

---

Nitpick comments:
In `@pkg/helpers/command/runinjob_darwin.go`:
- Around line 62-75: The restoration of RLIMIT_CPU after forking currently
ignores errors; update the restoration call that uses unix.Setrlimit(&oldCPU)
(after cmd.Start()) to capture its returned error and log it (e.g., with
log.Debug().Err(err).Msg("failed to restore RLIMIT_CPU after fork") or similar).
Keep the surrounding rlimitMu.Unlock() and ensure this logging uses the same
logger used elsewhere in this file so parent-limit restoration failures (in the
block that manipulates rlimitMu, oldCPU, unix.Getrlimit, unix.Setrlimit, and
cmd.Start) are recorded.

In `@pkg/service/updater/sigstore.go`:
- Around line 144-147: The code uses http.DefaultClient.Do(req) (in the resp,
err := http.DefaultClient.Do(req) call) which has no timeout; replace it by
constructing a dedicated *http.Client with a Timeout (e.g. client :=
&http.Client{Timeout: 10 * time.Second}) and call client.Do(req) instead, or
alternatively wrap the request with a context.WithTimeout and use req =
req.WithContext(ctx); also add "time" to the imports so the timeout constant
compiles. Ensure the new client/context is used in the function that performs
the attestation HTTP request (the block referencing resp, err and
http.DefaultClient).

In `@pkg/zapscript/zaplinks_test.go`:
- Line 690: Replace the hardcoded literal 1<<20 in the test assertion with the
shared constant helpers.MaxResponseBodySize: locate the assertion in
zaplinks_test.go (the assert.Len call that checks response body size) and change
the expected length argument from 1<<20 to helpers.MaxResponseBodySize, and
update the assertion message if needed to reference MaxResponseBodySize rather
than the numeric literal.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e1883787-1808-4dd5-b6c1-7143730a6254

📥 Commits

Reviewing files that changed from the base of the PR and between 6c13bc3 and faaedab.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • go.mod
  • pkg/helpers/command/runinjob.go
  • pkg/helpers/command/runinjob_darwin.go
  • pkg/helpers/command/runinjob_darwin_test.go
  • pkg/helpers/command/runinjob_linux.go
  • pkg/helpers/command/runinjob_linux_test.go
  • pkg/helpers/command/runinjob_windows.go
  • pkg/helpers/command/runinjob_windows_test.go
  • pkg/service/updater/sigstore.go
  • pkg/service/updater/sigstore_test.go
  • pkg/service/updater/updater.go
  • pkg/zapscript/utils.go
  • pkg/zapscript/zaplinks.go
  • pkg/zapscript/zaplinks_test.go

Comment thread go.mod Outdated
Comment on lines +58 to +69
func TestRunInJob_ProcessGroupKill(t *testing.T) {
t.Parallel()

// Short timeout so the context cancellation triggers cmd.Cancel, which
// kills the entire process group (not just the top-level PID).
ctx, cancel := context.WithTimeout(t.Context(), 500*time.Millisecond)
defer cancel()

cmd := exec.CommandContext(ctx, "sh", "-c", "sleep 60 & wait")
err := RunInJob(cmd)
assert.Error(t, err)
}

⚠️ Potential issue | 🟡 Minor

This timeout test doesn't prove descendant cleanup.

RunInJob would still return an error here if only sh died and the background sleep leaked. Please capture the child PID and assert it no longer exists after RunInJob returns so the process-group kill path is actually covered. As per coding guidelines, "Write tests for all new code — see TESTING.md and pkg/testing/README.md".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/command/runinjob_linux_test.go` around lines 58 - 69, Update
TestRunInJob_ProcessGroupKill so it verifies descendant cleanup by capturing the
background sleep child's PID and asserting it no longer exists after RunInJob
returns: run the shell with a command that backgrounds sleep and echoes its PID
(e.g. "sleep 60 & echo $!; wait"), read/parse that PID from the command's stdout
before calling RunInJob, invoke RunInJob(cmd) as before, then after RunInJob
returns use a Unix existence check (e.g. syscall.Kill(pid, 0) or similar) to
assert the child process is gone; keep the test using
TestRunInJob_ProcessGroupKill and RunInJob to locate the change.

Comment thread pkg/helpers/command/runinjob_linux.go Outdated
Comment thread pkg/helpers/command/runinjob_windows_test.go
Comment thread pkg/helpers/command/runinjob_windows.go
Comment thread pkg/service/updater/sigstore.go Outdated
The shared runinjob.go caused unused-constant lint failures on
platforms that don't use all constants (e.g., executeMemoryLimit
unused on macOS, executeCPULimit unused on Windows). Move constants
to their respective platform files where they're actually used.
Comment thread pkg/helpers/command/runinjob_darwin.go Outdated

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
pkg/helpers/command/runinjob_linux.go (1)

66-68: ⚠️ Potential issue | 🟠 Major

Fail closed when prlimit setup fails.

Lines 66-68 still only warn and continue, so the child can run without the advertised CPU and memory caps whenever applyResourceLimits fails. This hardening should terminate the process group and return the setup error instead of degrading to best-effort.

Suggested change
  pid := cmd.Process.Pid
  if err := applyResourceLimits(pid); err != nil {
- 	log.Warn().Err(err).Int("pid", pid).Msg("failed to apply resource limits")
+ 	_ = syscall.Kill(-pid, syscall.SIGKILL)
+ 	_ = cmd.Wait()
+ 	return fmt.Errorf("applying resource limits: %w", err)
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/command/runinjob_linux.go` around lines 66 - 68, The current code
only logs a warning when applyResourceLimits(pid) fails, allowing the child to
run without enforced caps; change this to fail-closed by stopping the child and
returning the error: after applyResourceLimits(pid) returns an error, send a
termination signal to the process group (using pid) to ensure the child cannot
continue, wait/cleanup as done elsewhere in this function (same mechanism used
for other failure paths), and then return the applyResourceLimits error instead
of continuing; update the error handling in the run-in-job flow (the block that
calls applyResourceLimits) so it performs the group kill and returns the error
immediately.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@go.mod`:
- Line 160: Update the indirect rekor dependency to v1.5.0 by upgrading the
module entry for github.com/sigstore/rekor to v1.5.0; run the module upgrade
command (e.g., go get github.com/sigstore/rekor@v1.5.0) and then run go mod tidy
to refresh go.mod and go.sum so the indirect upgrade is recorded and vendored
artifacts are updated; verify no code changes are required and that CI/go build
passes after updating the github.com/sigstore/rekor version.

In `@pkg/helpers/command/runinjob_linux.go`:
- Around line 51-52: The use of cmd.SysProcAttr.Pdeathsig in the Start path is
unsafe because Pdeathsig is tied to the creating OS thread; either remove the
Pdeathsig assignment or ensure Start runs on a locked OS thread — call
runtime.LockOSThread before setting cmd.SysProcAttr.Pdeathsig and starting the
process and unlock after Start returns (and handle panic/return paths), or
simply delete the line setting cmd.SysProcAttr.Pdeathsig to avoid the broken
guarantee; locate the assignment to cmd.SysProcAttr.Pdeathsig and adjust the
Start/Run path accordingly (use runtime.LockOSThread around the code that sets
Pdeathsig and calls cmd.Start, or remove the Pdeathsig assignment).

---

Duplicate comments:
In `@pkg/helpers/command/runinjob_linux.go`:
- Around line 66-68: The current code only logs a warning when
applyResourceLimits(pid) fails, allowing the child to run without enforced caps;
change this to fail-closed by stopping the child and returning the error: after
applyResourceLimits(pid) returns an error, send a termination signal to the
process group (using pid) to ensure the child cannot continue, wait/cleanup as
done elsewhere in this function (same mechanism used for other failure paths),
and then return the applyResourceLimits error instead of continuing; update the
error handling in the run-in-job flow (the block that calls applyResourceLimits)
so it performs the group kill and returns the error immediately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3fee884e-8bb6-45d6-afc7-456afdea2df6

📥 Commits

Reviewing files that changed from the base of the PR and between faaedab and 93ba6e8.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • go.mod
  • pkg/helpers/command/runinjob_darwin.go
  • pkg/helpers/command/runinjob_linux.go
  • pkg/helpers/command/runinjob_windows.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/helpers/command/runinjob_windows.go

Comment thread go.mod Outdated
Comment thread pkg/helpers/command/runinjob_darwin.go Outdated
Comment thread pkg/helpers/command/runinjob_linux.go
After fakeClock.Advance fires guardTimeout, both the timeout channel
and scanQueue are ready in the select. Go picks randomly, so a single
sendScan barrier may consume scanQueue first, leaving guardTimeout
unprocessed when the test continues. Add a second barrier to guarantee
both the timeout and the nil removal are consumed before re-tapping.
- Linux: fail closed when prlimit cannot apply resource limits (ESRCH
  exempted for already-exited processes)
- Linux: lock OS thread for Pdeathsig reliability (signal is tied to
  creating thread, not process)
- Windows: propagate thread resume errors and distinguish
  ERROR_NO_MORE_FILES from real enumeration failures
- Darwin: only lower soft CPU limit (hard limit is irreversible for
  unprivileged processes), skip restore if limit was never changed
- Sigstore: use dedicated HTTP client with 30s timeout for attestation
  fetch
- Upgrade go-tuf/v2 v2.3.0→v2.4.1 (GO-2026-4377 path traversal fix)
- Upgrade rekor v1.4.3→v1.5.0 (GHSA-273p-m2cw-6833, GHSA-4c4x-jm2x-pf9j)
- Use helpers.MaxResponseBodySize constant in test assertion

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
pkg/helpers/command/runinjob_windows.go (1)

168-175: Consider whether partial resume failures should be fatal.

Currently, if some threads are resumed but others fail (e.g., resumed > 0 but lastErr != nil), the function returns nil. This could leave the process in a partially-resumed state with some threads still suspended. Depending on the use case, this might be acceptable (main thread resumed is enough) or could cause subtle issues.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/command/runinjob_windows.go` around lines 168 - 175, The current
resume logic sometimes returns nil when some threads were resumed but others
failed; update the end of Resume/RunInJob routine so that any non-nil lastErr is
treated as a failure even if resumed > 0: if lastErr != nil return
fmt.Errorf("some threads failed to resume (%d resumed): %w", resumed, lastErr);
keep the existing no-threads-handled branch (resumed==0) returning the original
errors.New message, or alternatively expose a boolean/option to allow
partial-success semantics if callers explicitly want it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/helpers/command/runinjob_windows.go`:
- Around line 168-175: The current resume logic sometimes returns nil when some
threads were resumed but others failed; update the end of Resume/RunInJob
routine so that any non-nil lastErr is treated as a failure even if resumed > 0:
if lastErr != nil return fmt.Errorf("some threads failed to resume (%d resumed):
%w", resumed, lastErr); keep the existing no-threads-handled branch (resumed==0)
returning the original errors.New message, or alternatively expose a
boolean/option to allow partial-success semantics if callers explicitly want it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ecd8b000-f92b-4c51-984d-92cd6c7ac573

📥 Commits

Reviewing files that changed from the base of the PR and between 9d1c604 and 9390af8.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • go.mod
  • pkg/helpers/command/runinjob_darwin.go
  • pkg/helpers/command/runinjob_linux.go
  • pkg/helpers/command/runinjob_windows.go
  • pkg/service/updater/sigstore.go
  • pkg/zapscript/zaplinks_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/zapscript/zaplinks_test.go
  • go.mod

- Fix gosec G103 (unsafe pointer for Windows API interop)
- Fix gosec G115 (int→uint32 PID conversion)
- Fix govet shadow (err variable in Start call)

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
pkg/helpers/command/runinjob_windows.go (1)

141-175: ⚠️ Potential issue | 🟠 Major

Fail fast if any target thread cannot be resumed.

resumeProcessThreads() still returns nil when one thread resumes but another OpenThread/ResumeThread call fails, because Lines 150-156 only stash the error and the function only reports it when resumed == 0. That means this helper can leave the process partially suspended while RunInJob() proceeds as if startup completed.

🔧 Suggested change
-	var resumed int
-	var lastErr error
+	var resumed int
 	for {
 		if entry.OwnerProcessID == pid {
 			thread, openErr := windows.OpenThread(
 				windows.THREAD_SUSPEND_RESUME,
 				false,
 				entry.ThreadID,
 			)
 			if openErr != nil {
-				lastErr = fmt.Errorf("opening thread %d: %w", entry.ThreadID, openErr)
+				return fmt.Errorf("opening thread %d: %w", entry.ThreadID, openErr)
 			} else {
 				if _, resumeErr := windows.ResumeThread(thread); resumeErr != nil {
-					lastErr = fmt.Errorf("resuming thread %d: %w", entry.ThreadID, resumeErr)
+					_ = windows.CloseHandle(thread)
+					return fmt.Errorf("resuming thread %d: %w", entry.ThreadID, resumeErr)
 				} else {
 					resumed++
 				}
 				_ = windows.CloseHandle(thread)
 			}
@@
 	if resumed == 0 {
-		if lastErr != nil {
-			return fmt.Errorf("no threads resumed: %w", lastErr)
-		}
 		return errors.New("no threads found for process")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/command/runinjob_windows.go` around lines 141 - 175, The loop in
resumeProcessThreads() silences individual thread OpenThread/ResumeThread errors
by setting lastErr and only returning if resumed==0; change it to fail fast:
when windows.OpenThread or windows.ResumeThread returns an error, immediately
return fmt.Errorf("resuming thread %d: %w", entry.ThreadID, err) (or a similarly
descriptive message) instead of assigning lastErr, ensuring you still call
windows.CloseHandle(thread) when appropriate; keep the existing
Thread32Next/error-no-more-files handling and only use the resumed counter for
the "no threads found" branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@pkg/helpers/command/runinjob_windows.go`:
- Around line 141-175: The loop in resumeProcessThreads() silences individual
thread OpenThread/ResumeThread errors by setting lastErr and only returning if
resumed==0; change it to fail fast: when windows.OpenThread or
windows.ResumeThread returns an error, immediately return fmt.Errorf("resuming
thread %d: %w", entry.ThreadID, err) (or a similarly descriptive message)
instead of assigning lastErr, ensuring you still call
windows.CloseHandle(thread) when appropriate; keep the existing
Thread32Next/error-no-more-files handling and only use the resumed counter for
the "no threads found" branch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e65d7b1-dead-4bec-b1f4-0a747d376029

📥 Commits

Reviewing files that changed from the base of the PR and between 9390af8 and bea772c.

📒 Files selected for processing (1)
  • pkg/helpers/command/runinjob_windows.go

Remove sigstore-go and ~60 transitive deps (gRPC, protobuf, AWS SDK,
go-tuf, rekor) which caused severe CI slowdowns due to race-detector
compilation overhead on the crypto dependency tree.

Replace with ed25519-signed checksums using only stdlib crypto. CI
signs checksums.txt with a project-controlled key at release time,
and the updater verifies the signature before trusting checksums.

- Add ed25519Validator implementing go-selfupdate Validator interface
- Chain via PatternValidator: signature on checksums.txt, then SHA256
  on release archives
- Update publish-update-manifest workflow to sign in both stable and
  beta jobs with key validation and restricted file permissions
- Public key embedded in binary, private key in UPDATE_SIGNING_KEY
  GitHub secret
Move embedded ed25519 public key to its own file (update_signing.pub)
loaded via go:embed. Add CODEOWNERS rule requiring
@ZaparooProject/admins review for changes to the key file.
@wizzomafizzo wizzomafizzo changed the title fix: ZapLink size limit, Sigstore updater verification, execute process hardening fix: ZapLink size limit, signed checksum updater verification, execute process hardening Apr 12, 2026
The build cache restore key included hashFiles('go.mod'), so any PR
that changes go.mod got a total cache miss. This forced full
recompilation of the stdlib and all deps with -race instrumentation,
taking 15+ minutes on CI runners.

Add a broad fallback restore key (${{ runner.os }}-go-) that matches
any previous main cache regardless of go.mod changes. Most cached
race-instrumented packages are still valid across minor dep bumps.
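As a sketch, the fallback described above looks roughly like this in an actions/cache step (paths and key names are illustrative, not the repo's exact values):

```yaml
- uses: actions/cache@v4
  with:
    path: |
      ~/.cache/go-build
      ~/go/pkg/mod
    # Exact match misses whenever go.mod changes.
    key: ${{ runner.os }}-go-${{ hashFiles('go.mod') }}
    # Broad fallback: reuse any previous cache for this OS so most
    # race-instrumented packages survive minor dependency bumps.
    restore-keys: |
      ${{ runner.os }}-go-
```

restore-keys are matched by prefix, so a go.mod bump now degrades to a mostly-warm cache instead of a cold one.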

sentry bot commented Apr 12, 2026

Codecov Report

❌ Patch coverage is 72.41379% with 16 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/helpers/command/runinjob_linux.go | 68.75% | 7 Missing and 3 partials ⚠️ |
| pkg/service/updater/signed_checksum.go | 76.47% | 2 Missing and 2 partials ⚠️ |
| pkg/service/updater/updater.go | 71.42% | 1 Missing and 1 partial ⚠️ |

…ners

Collapse 7 parallel jobs (lint×3, quick-test×3, full-test×1) into a single
combined ci job with dynamic matrix. PRs run Linux-only on Ubicloud paid
runners for fast feedback. Main push runs all 3 OSes on native runners.

Add cross-lint job that runs golangci-lint for Windows and macOS inside the
zigcc Docker container using zig as the C cross-compiler. This catches
type/lint errors in platform-specific code without needing real macOS/Windows
runners on PRs.

Add govulncheck SARIF integration that uploads findings to GitHub's Security
tab alongside CodeQL results.

Update macOS SDK from 12.3 to 14.5 in zigcc image to fix
SecTrustCopyCertificateChain missing from tbd stubs (required by Go 1.26+
crypto/tls) and kIOMainPortDefault header availability.

Add local cross-lint tasks (task cross-lint:windows, cross-lint:mac,
cross-lint:linux) for developer use.
Job-level permissions completely override workflow-level ones, which
caused the SARIF upload step to lose pull-requests:read needed by
other steps in the ci job.
Cross-lint failed to pull zigcc image because packages:read was missing
from workflow permissions. Also restructure the cross-lint matrix to use
simple target names (windows, darwin) instead of full CC/CXX strings,
which were making GitHub check names unreadably long.
Replace ubicloud-standard-4 with ubuntu-latest for all Linux CI jobs.
Add hashFiles check to govulncheck SARIF upload step to handle the case
where govulncheck fails before creating the SARIF file.
Remove actionlint.yaml (no custom runner labels needed).
Use ubicloud-standard-4 (4 vCPU) for ci and cross-lint jobs. Keep
ci-status on ubuntu-latest since it only runs a shell check.
Ubicloud runners use a separate cache store from GitHub-hosted runners.
The cache was only saved on refs/heads/main, so Ubicloud's cache never
got populated — causing full recompilation with race instrumentation on
every run (~20 min). Save cache on all branches to bootstrap it.

Also remove reusable workflow call (ensure-zigcc) that caused
cancel-in-progress to stall (known GitHub bug, community #30708).
Cross-lint now pulls the zigcc image directly with a fallback build.

Bump ci timeout to 30 min for the first cold-cache run.
Race-instrumented compilation from cold cache takes >30 minutes.
82/119 packages completed in 2 minutes but remaining packages
were still compiling when the timeout hit. Once the cache is
populated, subsequent runs should complete in ~5 minutes.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/publish-update-manifest.yml (1)

99-135: ⚠️ Potential issue | 🔴 Critical

Don't re-sign checksums.txt pulled from the CDN.

publish-beta merges _existing/checksums.txt from updates.zaparoo.org into the new file and then signs the merged result. If that CDN copy is ever tampered with, the next beta publish will bless attacker-controlled hashes with the project signing key. Merge from a trusted source, or verify the downloaded checksum file with its detached signature before re-signing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/publish-update-manifest.yml around lines 99 - 135, The
workflow currently merges and re-signs _existing/checksums.txt from the CDN
(file _existing/checksums.txt) which lets an attacker who tampers with the CDN
get their hashes re-signed; change the logic so you never blindly re-sign
unverified CDN data: when _existing/checksums.txt is present also download its
detached signature (e.g., _existing/checksums.txt.sig) and verify that signature
using a trusted public verification key provided via a new secret/env (e.g.,
UPDATE_VERIFICATION_PUBKEY); only merge _existing/checksums.txt into
_assets/checksums.txt if signature verification succeeds, otherwise fail the job
(or skip the merge and proceed only with local release assets), and ensure the
signing step continues to sign only the trusted, merged output (REFERENCE:
_existing/checksums.txt, _existing/checksums.txt.sig, _assets/checksums.txt,
UPDATE_SIGNING_KEY, and the new UPDATE_VERIFICATION_PUBKEY).
🧹 Nitpick comments (1)
pkg/helpers/command/runinjob_windows.go (1)

52-121: Consider setting cmd.Cancel for immediate job termination on context cancellation.

The Linux implementation overrides cmd.Cancel to kill the entire process group immediately when context is cancelled. Currently, on Windows, if the caller's context times out, the default cmd.Cancel only terminates the main process—child processes continue until the job handle closes in the defer after cmd.Wait() returns.

For more immediate termination behavior consistent with Linux, consider overriding cmd.Cancel to terminate the job:

♻️ Optional enhancement for immediate cancellation
 	cmd.SysProcAttr.CreationFlags |= windows.CREATE_SUSPENDED
+
+	// Override Cancel to terminate all processes in the job immediately.
+	cmd.Cancel = func() error {
+		return windows.TerminateJobObject(job, 1)
+	}
+	cmd.WaitDelay = 2 * time.Second

 	startErr := cmd.Start()

Note: This requires passing job into the closure, which is valid since job is declared before and remains in scope.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/command/runinjob_windows.go` around lines 52 - 121, In RunInJob,
set cmd.Cancel to a closure that closes the job handle to immediately terminate
all job processes on context cancellation (mirroring Linux behavior); locate
RunInJob and after creating job (variable job) and before calling
cmd.Start()/cmd.SysProcAttr, assign cmd.Cancel = func() { _ =
windows.CloseHandle(job) } (ensuring it captures job), and keep the existing
deferred CloseHandle(job) to avoid leaks; ensure the Cancel closure uses the
same semantics as other cancellations (swallow error or log) and does not race
with the deferred close.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/lint-and-test.yml:
- Around line 28-37: The PR workflow's include expression limits pull_request
runs to Ubuntu only, removing native Windows/macOS test coverage; update the
conditional expression used in the include (the multiline expression containing
github.event_name == 'pull_request' and fromJSON(...)) so that pull_request uses
the full matrix (ubuntu, macos, windows) rather than forcing the single ubuntu
entry — ensure the job/matrix that references this include (the include key in
lint-and-test.yml) returns the three-platform array for PRs so platform-specific
execute hardening is exercised pre-merge.

In `@scripts/tasks/cross-lint.yml`:
- Around line 12-88: The cross-OS lint tasks (tasks named "windows", "mac",
"linux", and the aggregator "all") expose local cross-target linting by setting
GOOS/GOARCH and CGO flags (e.g., GOOS, GOARCH, CGO_ENABLED, CGO_CFLAGS,
CGO_LDFLAGS) which we must not allow locally; remove or disable these per-OS
tasks from scripts/tasks/cross-lint.yml and replace them with a single local
lint task that runs golangci-lint natively (no GOOS/CGO overrides), and move the
platform-specific zigcc-based commands into CI pipelines instead (keep the EXEC
install+run logic but only in CI configs), ensuring no local task sets
GOOS/CGO_* environment vars or provides IMAGE_NAME/PLATFORM overrides for
cross-targeting.

---

Outside diff comments:
In @.github/workflows/publish-update-manifest.yml:
- Around line 99-135: The workflow currently merges and re-signs
_existing/checksums.txt from the CDN (file _existing/checksums.txt) which lets
an attacker who tampers with the CDN get their hashes re-signed; change the
logic so you never blindly re-sign unverified CDN data: when
_existing/checksums.txt is present also download its detached signature (e.g.,
_existing/checksums.txt.sig) and verify that signature using a trusted public
verification key provided via a new secret/env (e.g.,
UPDATE_VERIFICATION_PUBKEY); only merge _existing/checksums.txt into
_assets/checksums.txt if signature verification succeeds, otherwise fail the job
(or skip the merge and proceed only with local release assets), and ensure the
signing step continues to sign only the trusted, merged output (REFERENCE:
_existing/checksums.txt, _existing/checksums.txt.sig, _assets/checksums.txt,
UPDATE_SIGNING_KEY, and the new UPDATE_VERIFICATION_PUBKEY).

---

Nitpick comments:
In `@pkg/helpers/command/runinjob_windows.go`:
- Around line 52-121: In RunInJob, set cmd.Cancel to a closure that closes the
job handle to immediately terminate all job processes on context cancellation
(mirroring Linux behavior); locate RunInJob and after creating job (variable
job) and before calling cmd.Start()/cmd.SysProcAttr, assign cmd.Cancel = func()
{ _ = windows.CloseHandle(job) } (ensuring it captures job), and keep the
existing deferred CloseHandle(job) to avoid leaks; ensure the Cancel closure
uses the same semantics as other cancellations (swallow error or log) and does
not race with the deferred close.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 63c02436-f251-485a-a76c-c8896e3429d2

📥 Commits

Reviewing files that changed from the base of the PR and between 9390af8 and 9db7e81.

⛔ Files ignored due to path filters (2)
  • go.sum is excluded by !**/*.sum
  • pkg/service/updater/update_signing.pub is excluded by !**/*.pub
📒 Files selected for processing (13)
  • .github/CODEOWNERS
  • .github/actionlint.yaml
  • .github/workflows/lint-and-test.yml
  • .github/workflows/publish-update-manifest.yml
  • Taskfile.dist.yml
  • go.mod
  • pkg/helpers/command/runinjob_windows.go
  • pkg/service/updater/signed_checksum.go
  • pkg/service/updater/signed_checksum_test.go
  • pkg/service/updater/updater.go
  • scripts/tasks/cross-lint.yml
  • scripts/zigcc/Dockerfile
  • scripts/zigcc/VERSION
✅ Files skipped from review due to trivial changes (5)
  • .github/actionlint.yaml
  • scripts/zigcc/VERSION
  • scripts/zigcc/Dockerfile
  • .github/CODEOWNERS
  • Taskfile.dist.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod

Comment on lines +28 to +37
include: >-
${{
github.event_name == 'pull_request'
&& fromJSON('[{"os":"ubuntu","runner":"ubicloud-standard-4"}]')
|| fromJSON('[
{"os":"ubuntu","runner":"ubicloud-standard-4"},
{"os":"macos","runner":"macos-latest"},
{"os":"windows","runner":"windows-latest"}
]')
}}

⚠️ Potential issue | 🟠 Major

Pull requests no longer get native Windows/macOS test coverage.

PRs now run ci on Ubuntu only, and cross-lint is static analysis only. That means the new platform-specific execute hardening is exercised on Windows/macOS only after merge.

Based on learnings "NEVER run builds, lints, or tests for another OS (e.g., GOOS=windows) — CGO dependencies. Rely on CI"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/lint-and-test.yml around lines 28 - 37, The PR workflow's
include expression limits pull_request runs to Ubuntu only, removing native
Windows/macOS test coverage; update the conditional expression used in the
include (the multiline expression containing github.event_name == 'pull_request'
and fromJSON(...)) so that pull_request uses the full matrix (ubuntu, macos,
windows) rather than forcing the single ubuntu entry — ensure the job/matrix
that references this include (the include key in lint-and-test.yml) returns the
three-platform array for PRs so platform-specific execute hardening is exercised
pre-merge.

Comment on lines +12 to +88
windows:
desc: Run golangci-lint targeting Windows via zigcc
cmds:
- task: setup
- task: :docker:run
vars:
IMAGE_NAME: zaparoo/zigcc
PLATFORM: windows
BUILD_OS: windows
BUILD_ARCH: amd64
CC: "zig cc -w --target=x86_64-windows-gnu"
CXX: "zig c++ -w --target=x86_64-windows-gnu"
EXEC: >-
bash -c 'curl -sSfL https://golangci-lint.run/install.sh | sh -s -- -b /home/build/bin 2>&1 | tail -1
&& PATH="/home/build/bin:$PATH" golangci-lint run --timeout=5m'
EXTRA_DOCKER_ARGS: >-
-e GOOS=windows
-e GOARCH=amd64
-e CGO_ENABLED=1
-e CGO_LDFLAGS="-Wl,--subsystem,windows"

mac:
desc: Run golangci-lint targeting macOS via zigcc
cmds:
- task: setup
- task: :docker:run
vars:
IMAGE_NAME: zaparoo/zigcc
PLATFORM: mac
BUILD_OS: darwin
BUILD_ARCH: amd64
CC: >-
zig cc -w --target=x86_64-macos
-I/opt/macosx-sdk/usr/include
-L/opt/macosx-sdk/usr/lib
-F/opt/macosx-sdk/System/Library/Frameworks
CXX: >-
zig c++ -w --target=x86_64-macos
-I/opt/macosx-sdk/usr/include
-L/opt/macosx-sdk/usr/lib
-F/opt/macosx-sdk/System/Library/Frameworks
EXEC: >-
bash -c 'curl -sSfL https://golangci-lint.run/install.sh | sh -s -- -b /home/build/bin 2>&1 | tail -1
&& PATH="/home/build/bin:$PATH" golangci-lint run --timeout=5m'
EXTRA_DOCKER_ARGS: >-
-e GOOS=darwin
-e GOARCH=amd64
-e CGO_ENABLED=1

linux:
desc: Run golangci-lint targeting Linux via zigcc (for non-Linux developers)
cmds:
- task: setup
- task: :docker:run
vars:
IMAGE_NAME: zaparoo/zigcc
PLATFORM: linux
BUILD_OS: linux
BUILD_ARCH: amd64
CC: "zig cc -target x86_64-linux-gnu"
CXX: "zig c++ -target x86_64-linux-gnu"
EXEC: >-
bash -c 'curl -sSfL https://golangci-lint.run/install.sh | sh -s -- -b /home/build/bin 2>&1 | tail -1
&& PATH="/home/build/bin:$PATH" golangci-lint run --timeout=5m'
EXTRA_DOCKER_ARGS: >-
-e GOOS=linux
-e GOARCH=amd64
-e CGO_ENABLED=1
-e CGO_CFLAGS="-I/opt/libnfc/x86_64-linux-gnu/include"
-e CGO_LDFLAGS="-L/opt/libnfc/x86_64-linux-gnu/lib -L/opt/deps/x86_64-linux-gnu/lib /opt/libnfc/x86_64-linux-gnu/lib/libnfc.a /opt/deps/x86_64-linux-gnu/lib/libusb.a /opt/deps/x86_64-linux-gnu/lib/libusb-1.0.a -ldl"

all:
desc: Run golangci-lint for all platforms via zigcc
cmds:
- task: windows
- task: mac
- task: linux

🛠️ Refactor suggestion | 🟠 Major

Avoid exposing cross-OS lint as a local task.

These tasks add local Windows/macOS/Linux lint entrypoints via GOOS/CGO, which is the exact workflow we avoid in this repo because native CI runners are the source of truth for cross-platform validation. Please keep this logic in CI instead of encouraging local cross-target lint runs.

Based on learnings "NEVER run builds, lints, or tests for another OS (e.g., GOOS=windows) — CGO dependencies. Rely on CI"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tasks/cross-lint.yml` around lines 12 - 88, The cross-OS lint tasks
(tasks named "windows", "mac", "linux", and the aggregator "all") expose local
cross-target linting by setting GOOS/GOARCH and CGO flags (e.g., GOOS, GOARCH,
CGO_ENABLED, CGO_CFLAGS, CGO_LDFLAGS) which we must not allow locally; remove or
disable these per-OS tasks from scripts/tasks/cross-lint.yml and replace them
with a single local lint task that runs golangci-lint natively (no GOOS/CGO
overrides), and move the platform-specific zigcc-based commands into CI
pipelines instead (keep the EXEC install+run logic but only in CI configs),
ensuring no local task sets GOOS/CGO_* environment vars or provides
IMAGE_NAME/PLATFORM overrides for cross-targeting.

… to Windows job

Beta update manifest workflow now verifies the ed25519 signature on
existing CDN checksums before merging them with new release checksums.
Prevents an attacker who compromises the CDN from getting tampered
hashes re-signed with the real key.

Windows RunInJob now sets cmd.Cancel to close the job handle on context
cancellation, immediately terminating all processes in the job instead
of waiting for the defer chain.

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
pkg/helpers/command/runinjob_windows_test.go (1)

68-79: ⚠️ Potential issue | 🟡 Minor

Strengthen cancel test to assert descendant cleanup, not only returned error.

assert.Error here confirms cancellation surfaced, but it does not prove child/descendant processes are terminated after job-handle close. Please assert spawned descendant PID(s) are no longer running after cancel to lock in KILL_ON_JOB_CLOSE behavior.

As per coding guidelines, "Write tests for all new code — see TESTING.md and pkg/testing/README.md".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/command/runinjob_windows_test.go` around lines 68 - 79, The test
TestRunInJob_ContextCancelKillsJob currently only asserts an error; update it to
also assert that spawned descendant PIDs are no longer running after the context
cancellation to prove KILL_ON_JOB_CLOSE. Modify the test to have the command
spawn a child process that writes its PID (or PIDs) to a temp file or stdout (so
the test can capture them), call RunInJob(cmd), wait for cancellation, then read
the descendant PID(s) and verify they are terminated using a Windows-appropriate
check (e.g., OpenProcess + GetExitCodeProcess or an isProcessRunning helper) and
assert they are not running; keep references to the existing
TestRunInJob_ContextCancelKillsJob and RunInJob so reviewers can locate the
change.
🧹 Nitpick comments (1)
pkg/helpers/command/runinjob_windows.go (1)

93-96: Chain existing cmd.Cancel instead of replacing it outright.

Overwriting cmd.Cancel drops any prior cancel behavior from exec.CommandContext. Chaining keeps job-close semantics and preserves fallback termination behavior.

♻️ Suggested patch
-	cmd.Cancel = func() error {
-		closeJob()
-		return nil
-	}
+	prevCancel := cmd.Cancel
+	cmd.Cancel = func() error {
+		closeJob()
+		if prevCancel != nil {
+			return prevCancel()
+		}
+		return nil
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/command/runinjob_windows.go` around lines 93 - 96, The current
assignment replaces cmd.Cancel and discards any existing cancellation behavior
created by exec.CommandContext; instead, capture the previous cancel function
(e.g., oldCancel := cmd.Cancel) and set cmd.Cancel to a wrapper that first calls
closeJob() and then, if oldCancel != nil, calls and returns oldCancel()
(propagating its error), otherwise returns nil. This preserves the original
exec.CommandContext termination semantics while ensuring the job is closed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@pkg/helpers/command/runinjob_windows_test.go`:
- Around line 68-79: The test TestRunInJob_ContextCancelKillsJob currently only
asserts an error; update it to also assert that spawned descendant PIDs are no
longer running after the context cancellation to prove KILL_ON_JOB_CLOSE. Modify
the test to have the command spawn a child process that writes its PID (or PIDs)
to a temp file or stdout (so the test can capture them), call RunInJob(cmd),
wait for cancellation, then read the descendant PID(s) and verify they are
terminated using a Windows-appropriate check (e.g., OpenProcess +
GetExitCodeProcess or an isProcessRunning helper) and assert they are not
running; keep references to the existing TestRunInJob_ContextCancelKillsJob and
RunInJob so reviewers can locate the change.

---

Nitpick comments:
In `@pkg/helpers/command/runinjob_windows.go`:
- Around line 93-96: The current assignment replaces cmd.Cancel and discards any
existing cancellation behavior created by exec.CommandContext; instead, capture
the previous cancel function (e.g., oldCancel := cmd.Cancel) and set cmd.Cancel
to a wrapper that first calls closeJob() and then, if oldCancel != nil, calls
and returns oldCancel() (propagating its error), otherwise returns nil. This
preserves the original exec.CommandContext termination semantics while ensuring
the job is closed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bd668cba-873c-4af6-8f68-6980500c401b

📥 Commits

Reviewing files that changed from the base of the PR and between 9db7e81 and 3a6e863.

📒 Files selected for processing (3)
  • .github/workflows/publish-update-manifest.yml
  • pkg/helpers/command/runinjob_windows.go
  • pkg/helpers/command/runinjob_windows_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/publish-update-manifest.yml

@wizzomafizzo wizzomafizzo merged commit e7d3a98 into main Apr 13, 2026
11 checks passed
@wizzomafizzo wizzomafizzo deleted the security/zaplink-sigstore-execute-hardening branch April 13, 2026 05:05
