
avoid race conditions on multiple instances on arch specific builds #5004

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from droslean:bug-race on Mar 12, 2026

Conversation

@droslean
Member

@droslean droslean commented Mar 11, 2026

/cc @openshift/test-platform

Summary by CodeRabbit

  • Refactor
    • Manifest publishing now interacts directly with the cluster API for more reliable multi-architecture image handling and digest/tag resolution.
    • Better handling when target image metadata is absent to avoid unnecessary pushes.
  • Tests
    • Added tests for manifest entry generation and target reference parsing to validate behavior and prevent regressions.

@openshift-ci-robot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci openshift-ci bot requested a review from a team March 11, 2026 14:21
@coderabbitai

coderabbitai bot commented Mar 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 330d6316-6d9b-4861-8676-0e882c98f1fa

📥 Commits

Reviewing files that changed from the base of the PR and between 482ef97 and 8dc3285.

📒 Files selected for processing (6)
  • pkg/controller/multiarchbuildconfig/multiarchbuildconfig.go
  • pkg/manifestpusher/manifestpusher.go
  • pkg/manifestpusher/manifestpusher_test.go
  • pkg/steps/build_client.go
  • pkg/steps/source.go
  • pkg/steps/source_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/steps/build_client.go

Walkthrough

Injects a controller-runtime client into ManifestPusher, which now parses target image refs, fetches ImageStreamTag objects via the client, and merges their DockerImageManifests with the source images. Callers are updated to provide the client.
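
The target reference parsing mentioned in the walkthrough can be sketched roughly as follows. This is a minimal illustration that assumes references have the form namespace/name:tag (as in the "ns/pipeline:src" examples in this review); splitRef is a hypothetical stand-in, and the real splitImageStreamTagRef may have a different signature and different validation rules.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRef is a hypothetical helper, not the PR's splitImageStreamTagRef.
// It splits "namespace/name:tag" into the namespace and the ImageStreamTag
// object name ("name:tag"), rejecting references that lack either part.
func splitRef(ref string) (string, string, error) {
	namespace, nameAndTag, ok := strings.Cut(ref, "/")
	if !ok || namespace == "" || strings.Contains(nameAndTag, "/") {
		return "", "", fmt.Errorf("invalid reference %q: want namespace/name:tag", ref)
	}
	name, tag, ok := strings.Cut(nameAndTag, ":")
	if !ok || name == "" || tag == "" {
		return "", "", fmt.Errorf("invalid reference %q: want namespace/name:tag", ref)
	}
	return namespace, nameAndTag, nil
}

func main() {
	ns, name, err := splitRef("ns/pipeline:src")
	fmt.Println(ns, name, err) // prints "ns pipeline:src <nil>"
}
```

Returning the combined "name:tag" string matches how ImageStreamTag objects are named in the test examples in this conversation (Name: "pipeline:src").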

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| ManifestPusher Core Implementation: pkg/manifestpusher/manifestpusher.go | Constructor now accepts a ctrlruntimeclient.Client and stores it. Added manifestEntries(builds, targetImageRef) and splitImageStreamTagRef; PushImageWithManifest uses manifestEntries and returns propagated errors. |
| ManifestPusher Tests: pkg/manifestpusher/manifestpusher_test.go | New tests (TestManifestEntries, TestSplitImageStreamTagRef) using a fake client and ImageStreamTag objects to validate manifest entry generation and target ref parsing, including error cases. |
| BuildClient Interface: pkg/steps/build_client.go | Added Client() loggingclient.LoggingClient to the BuildClient interface and implemented it on buildClient to expose the embedded LoggingClient. |
| Source Step Call Site Updates: pkg/steps/source.go, pkg/steps/source_test.go | handleBuilds now passes buildClient.Client() into NewManifestPusher. Test fake client updated to implement Client(). |
| MultiarchBuildConfig Call Site: pkg/controller/multiarchbuildconfig/multiarchbuildconfig.go | AddToManager updated to pass mgr.GetClient() as the new fourth argument to NewManifestPusher. |
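
The wiring described in the table (an accessor that exposes an embedded client, and a constructor that receives it) might look roughly like the sketch below. All types here are simplified, hypothetical stand-ins: Getter replaces loggingclient.LoggingClient, Pusher replaces ManifestPusher, and the map-backed fake only illustrates how a test could supply a client without a cluster. None of the signatures are taken from the actual code.

```go
package main

import "fmt"

// Getter is a hypothetical, much narrower stand-in for the real client type.
type Getter interface {
	Get(key string) (string, bool)
}

// mapGetter is an in-memory fake, similar in spirit to a fake client in tests.
type mapGetter map[string]string

func (m mapGetter) Get(key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

// BuildClient embeds Getter and additionally exposes it through Client().
type BuildClient interface {
	Getter
	Client() Getter
}

type buildClient struct {
	Getter // embedded client, returned as-is by Client()
}

func (c *buildClient) Client() Getter { return c.Getter }

// Pusher stores the injected client alongside its other settings.
type Pusher struct {
	registryURL string
	client      Getter
}

// NewPusher receives the client as an explicit constructor argument.
func NewPusher(registryURL string, client Getter) *Pusher {
	return &Pusher{registryURL: registryURL, client: client}
}

// Existing looks up what the cluster already knows about a target reference.
func (p *Pusher) Existing(ref string) (string, bool) {
	return p.client.Get(ref)
}

func main() {
	var bc BuildClient = &buildClient{Getter: mapGetter{"ns/pipeline:src": "sha256:abc"}}
	p := NewPusher("registry", bc.Client())
	digest, ok := p.Existing("ns/pipeline:src")
	fmt.Println(digest, ok) // prints "sha256:abc true"
}
```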

Sequence Diagram(s)

sequenceDiagram
    participant Steps as Steps.handleBuilds
    participant Pusher as ManifestPusher
    participant Builds as Builds (data)
    participant Kube as Kube API (ImageStreamTag)
    participant Registry as Image Registry

    Steps->>Pusher: NewManifestPusher(logger, registryURL, dockercfg, client)
    Steps->>Pusher: PushImageWithManifest(builds, targetImageRef)
    Pusher->>Pusher: manifestEntries(builds, targetImageRef)
    Pusher->>Kube: Get ImageStreamTag(namespace/name:tag)
    Kube-->>Pusher: ImageStreamTag (or NotFound)
    Pusher->>Pusher: merge srcImages + DockerImageManifests -> entries
    Pusher->>Registry: push manifest list for entries
    Registry-->>Pusher: digest
    Pusher-->>Steps: result / error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ❓ Inconclusive | The custom check is written for Ginkgo test code, but this repository uses standard Go testing with table-driven tests and t.Run() subtests. | Clarify whether the check should evaluate Go standard testing patterns instead of Ginkgo-specific patterns, or confirm if the repository should adopt Ginkgo testing. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main objective of the changeset, which is to avoid race conditions when multiple instances run arch-specific builds by introducing a Kubernetes client-driven approach to fetch and manage image manifests. |
| Stable And Deterministic Test Names | ✅ Passed | Tests use standard Go testing with t.Run() subtests rather than Ginkgo. All test names are static, descriptive, and free of dynamic information such as timestamps, UUIDs, random suffixes, pod names, node names, or IP addresses. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 11, 2026
@droslean droslean force-pushed the bug-race branch 2 times, most recently from 4454d50 to dc8096d Compare March 11, 2026 14:38

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/manifestpusher/manifestpusher_test.go (1)

27-96: Consider expanding test coverage for edge cases.

The test cases cover the happy path (merging entries) and invalid reference error, but several edge cases are missing:

  1. ImageStreamTag not found - Verify behavior when the IST doesn't exist yet (first-time multi-arch build)
  2. Empty DockerImageManifests - Verify behavior when IST exists but has no manifest list
  3. Empty builds array - Verify behavior when no new builds are provided
  4. Architecture collision - Verify behavior when a new build has the same architecture as an existing manifest

These scenarios are important for validating the race condition fix mentioned in the PR title.

💡 Example test case for IST not found
{
    name:      "returns only new builds when ImageStreamTag not found",
    targetRef: "ns/pipeline:src",
    builds: []buildv1.Build{
        {
            Spec: buildv1.BuildSpec{
                CommonSpec: buildv1.CommonSpec{
                    NodeSelector: map[string]string{nodeArchitectureLabel: "amd64"},
                    Output: buildv1.BuildOutput{
                        To: &corev1.ObjectReference{Namespace: "ns", Name: "pipeline:src-amd64"},
                    },
                },
            },
        },
    },
    objects: []ctrlruntimeclient.Object{}, // No IST exists
    want: []types.ManifestEntry{
        {
            Image: "registry/ns/pipeline:src-amd64",
            Platform: ocispec.Platform{
                OS:           "linux",
                Architecture: "amd64",
            },
        },
    },
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/manifestpusher/manifestpusher_test.go` around lines 27 - 96, Add unit
tests to the testCases slice in manifestpusher_test.go to cover the missing edge
cases: (1) "ImageStreamTag not found" — add a case with objects:
[]ctrlruntimeclient.Object{} and expect only the new build entry in want; (2)
"Empty DockerImageManifests" — add an IST object with
Image.DockerImageManifests: nil or empty slice and expect only new build entries
in want; (3) "Empty builds array" — set builds: []buildv1.Build{} and provide an
existing IST in objects, asserting want equals the existing manifest entries (or
empty if none); (4) "Architecture collision" — provide an existing IST with a
manifest having the same Architecture as a new build and a build in builds with
the same nodeArchitectureLabel value, then assert that the resulting want
deduplicates or replaces correctly per intended behavior. Use the existing field
names (testCases, targetRef, builds, objects, want, wantErr) and
nodeArchitectureLabel to mirror other cases so the test harness
(manifestpusher_test.go) picks them up.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/manifestpusher/manifestpusher.go`:
- Around line 85-95: The current logic in the ImageStreamTag fetch
(m.client.Get) and subsequent manifest check returns nil,nil when the IST is not
found or when ist.Image.DockerImageManifests is empty, which drops the newly
built srcImages; change the branches so that on kerrors.IsNotFound(err) you
return srcImages (not nil,nil) and likewise when
len(ist.Image.DockerImageManifests) == 0 return srcImages, preserving the
existing error return for other Get failures; locate the ImageStreamTag handling
around m.client.Get, the ist variable, and the check of
ist.Image.DockerImageManifests and replace the nil,nil returns with returning
the srcImages slice so new builds are pushed.
- Around line 97-112: The merge currently appends existing manifests from
ist.Image.DockerImageManifests onto mergedEntries without filtering, causing
duplicate platform entries when srcImages already contains the same
architecture; create a set of built architectures from srcImages (inspect
srcImages[].Platform.Architecture), then when iterating
ist.Image.DockerImageManifests skip any manifest whose Architecture is present
in that set (also keep the existing checks for empty OS/Architecture/Digest)
before appending to mergedEntries so PushManifestList won't receive duplicate
platforms.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 24883af4-af9d-4e25-add9-3ccd32ea50e3

📥 Commits

Reviewing files that changed from the base of the PR and between a6ff4ff and 4454d50.

📒 Files selected for processing (6)
  • pkg/controller/multiarchbuildconfig/multiarchbuildconfig.go
  • pkg/manifestpusher/manifestpusher.go
  • pkg/manifestpusher/manifestpusher_test.go
  • pkg/steps/build_client.go
  • pkg/steps/source.go
  • pkg/steps/source_test.go

@droslean
Member Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 11, 2026
Signed-off-by: Nikolaos Moraitis <nmoraiti@redhat.com>

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
pkg/manifestpusher/manifestpusher_test.go (1)

27-76: Test coverage is insufficient to catch the bugs in the implementation.

The single test case only covers the happy path with non-overlapping architectures (new amd64 build + existing arm64 manifest). Consider adding test cases for:

  1. IST not found - verifies new builds are returned (currently they're discarded)
  2. IST with empty DockerImageManifests - same issue
  3. Overlapping architecture - e.g., rebuilding amd64 when IST already has amd64 manifest (would expose the duplicate platform bug)
💡 Example additional test cases
{
    name:      "returns srcImages when IST not found",
    targetRef: "ns/pipeline:src",
    builds: []buildv1.Build{
        {
            Spec: buildv1.BuildSpec{
                CommonSpec: buildv1.CommonSpec{
                    NodeSelector: map[string]string{nodeArchitectureLabel: "amd64"},
                    Output: buildv1.BuildOutput{
                        To: &corev1.ObjectReference{Namespace: "ns", Name: "pipeline:src-amd64"},
                    },
                },
            },
        },
    },
    objects: []ctrlruntimeclient.Object{}, // No IST exists
    want: []types.ManifestEntry{
        {
            Image: "registry/ns/pipeline:src-amd64",
            Platform: ocispec.Platform{
                OS:           "linux",
                Architecture: "amd64",
            },
        },
    },
},
{
    name:      "rebuilding same architecture replaces existing manifest",
    targetRef: "ns/pipeline:src",
    builds: []buildv1.Build{
        {
            Spec: buildv1.BuildSpec{
                CommonSpec: buildv1.CommonSpec{
                    NodeSelector: map[string]string{nodeArchitectureLabel: "amd64"},
                    Output: buildv1.BuildOutput{
                        To: &corev1.ObjectReference{Namespace: "ns", Name: "pipeline:src-amd64"},
                    },
                },
            },
        },
    },
    objects: []ctrlruntimeclient.Object{
        &imagev1.ImageStreamTag{
            ObjectMeta: metav1.ObjectMeta{Namespace: "ns", Name: "pipeline:src"},
            Image: imagev1.Image{
                DockerImageManifests: []imagev1.ImageManifest{
                    {Digest: "sha256:oldamd64digest", OS: "linux", Architecture: "amd64"},
                },
            },
        },
    },
    want: []types.ManifestEntry{
        {
            Image: "registry/ns/pipeline:src-amd64",
            Platform: ocispec.Platform{
                OS:           "linux",
                Architecture: "amd64",
            },
        },
        // Should NOT include the old amd64 manifest
    },
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/manifestpusher/manifestpusher_test.go` around lines 27 - 76, Add more
unit tests to cover IST-not-found, IST-with-empty-DockerImageManifests, and
overlapping-architecture rebuilds by extending the testCases slice in
manifestpusher_test.go (the existing variable testCases) with cases that: (1)
set objects to an empty slice and assert the new build's ManifestEntry is
returned, (2) set objects to an ImageStreamTag whose Image.DockerImageManifests
is empty and assert the new build's ManifestEntry is returned, and (3) set
objects to an ImageStreamTag that already contains the same Architecture (e.g.,
amd64) and assert the returned want slice contains the new build entry but not
the old duplicate manifest; use the same targetRef/build structure as existing
tests and compare against types.ManifestEntry entries to validate
deduplication/replacement behavior for the function under test (the tests in
this file that iterate testCases).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 342953ad-57a3-4198-8b19-1dadbacac471

📥 Commits

Reviewing files that changed from the base of the PR and between 4454d50 and 482ef97.

📒 Files selected for processing (6)
  • pkg/controller/multiarchbuildconfig/multiarchbuildconfig.go
  • pkg/manifestpusher/manifestpusher.go
  • pkg/manifestpusher/manifestpusher_test.go
  • pkg/steps/build_client.go
  • pkg/steps/source.go
  • pkg/steps/source_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/controller/multiarchbuildconfig/multiarchbuildconfig.go

@openshift-ci-robot
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 12, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 12, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: droslean, jmguzik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@droslean
Member Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 12, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 12, 2026

@droslean: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/breaking-changes | 8dc3285 | link | false | /test breaking-changes |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 55bbe1c into openshift:main Mar 12, 2026
13 of 14 checks passed
@droslean droslean deleted the bug-race branch March 13, 2026 00:10
Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.

3 participants