Skip to content

feat(plugindefinitions): implement image replication for registry mirrors#1826

Open
mikolajkucinski wants to merge 13 commits intomainfrom
feat/image-replication
Open

feat(plugindefinitions): implement image replication for registry mirrors#1826
mikolajkucinski wants to merge 13 commits intomainfrom
feat/image-replication

Conversation

@mikolajkucinski
Copy link
Contributor

@mikolajkucinski mikolajkucinski commented Feb 27, 2026

Description

This PR adds chart replication support for registry mirrors. When a PluginDefinition's HelmChart becomes ready, the controller replicates the chart OCI artifact to the configured mirror registry via crane.Manifest. Image parsing uses go-containerregistry/pkg/name for correct OCI reference handling. All mirror logic is consolidated in internal/imagemirror.

What type of PR is this? (check all applicable)

  • 🍕 Feature
  • 🐛 Bug Fix
  • 📝 Documentation Update
  • 🎨 Style
  • 🧑‍💻 Code Refactor
  • 🔥 Performance Improvements
  • ✅ Test
  • 🤖 Build
  • 🔁 CI
  • 📦 Chore (Release)
  • ⏩ Revert

Related Tickets & Documents

Added tests?

  • 👍 yes
  • 🙅 no, because they aren't needed
  • 🙋 no, because I need help
  • Separate ticket for tests # (issue/pr)

Added to documentation?

  • 📜 README.md
  • 🤝 Documentation pages updated
  • 🙅 no documentation needed
  • (if applicable) generated OpenAPI docs for CRD changes

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes

Summary by CodeRabbit

  • New Features

    • Added OCI chart replication support to automatically mirror container images to configured registries.
    • Enhanced registry mirror configuration to include authentication via Kubernetes secrets.
  • Chores

    • Updated dependencies for improved container registry tooling and OCI image handling.

@mikolajkucinski mikolajkucinski changed the title Feat/image replication feat(plugindefinition): implement image replication for registry mirrors Feb 27, 2026
@mikolajkucinski mikolajkucinski changed the title feat(plugindefinition): implement image replication for registry mirrors feat(plugindefinitions): implement image replication for registry mirrors Feb 27, 2026
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Feb 27, 2026
@github-actions github-actions bot added size/XL and removed size/XXL labels Mar 3, 2026
@coderabbitai

This comment was marked as spam.

@mikolajkucinski mikolajkucinski force-pushed the feat/image-replication branch 4 times, most recently from 17895e0 to 458dcaf Compare March 5, 2026 08:05
mikolajkucinski and others added 10 commits March 6, 2026 18:04
…pport

Extract image parsing and registry mirror config from the plugin
controller into shared packages. Add on-demand image replication
that pulls manifests to trigger mirror sync.

Bump Go to 1.25.6 required by go-containerregistry.

On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
Use go-containerregistry/pkg/name.ParseReference for image ref parsing.

On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
Use DescribeTable for table-driven buildMirroredImageRef tests.

On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
Move image parsing, registry mirror config, and replication code from
internal/common and internal/replication into a single internal/imagemirror package.

On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
PluginDefinition controller now replicates the Helm chart OCI artifact
directly instead of templating the chart and replicating container
images. Container image replication belongs in the Plugin controller
which has the final option values.

On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
@github-actions github-actions bot removed the size/XL label Mar 6, 2026
On-behalf-of: @SAP <mikolaj.kucinski@sap.com>
@mikolajkucinski mikolajkucinski force-pushed the feat/image-replication branch from 93cbfcf to 9d5c163 Compare March 6, 2026 17:21
@mikolajkucinski mikolajkucinski marked this pull request as ready for review March 9, 2026 15:01
@mikolajkucinski mikolajkucinski requested a review from a team as a code owner March 9, 2026 15:01
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (7)
internal/ocimirror/ociref.go (2)

13-13: Regex may miss some valid YAML image field patterns.

The regex pattern captures common cases but may not handle:

  • Images specified with YAML block scalars
  • Images in nested structures like spec.containers[].image without the literal - image: prefix

For the current use case of extracting images from Helm-rendered manifests, this is likely acceptable. If broader coverage is needed later, consider using a proper YAML parser.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/ocimirror/ociref.go` at line 13, The current regex variable
imageFieldPattern may miss valid YAML image declarations (e.g., block scalars or
nested keys like spec.containers[].image); update the extraction to use a YAML
parser instead of regex for robustness: replace usages of imageFieldPattern with
parsing the manifest via a YAML library (unmarshal each document and walk
maps/arrays to collect values for the "image" key, including nested paths like
spec.containers[].image), or if you must keep regex, add a clear comment on
imageFieldPattern documenting its limitations and intended scope for
Helm-rendered manifests only.

34-38: Silent failure on parse error may lead to incorrect behavior.

When name.ParseReference fails, the function returns the original imageRef as the repository with an empty tagOrDigest. This silent fallback could cause downstream issues where callers process invalid references without knowing the parse failed. Consider either returning an error or logging a warning.

♻️ Alternative: return error to signal parse failure
-func SplitOCIRef(imageRef string) (registry, repository, tagOrDigest string) {
+func SplitOCIRef(imageRef string) (registry, repository, tagOrDigest string, err error) {
 	ref, err := name.ParseReference(imageRef)
 	if err != nil {
-		return "docker.io", imageRef, ""
+		return "", "", "", fmt.Errorf("failed to parse OCI reference %q: %w", imageRef, err)
 	}

If the current fallback behavior is intentional for graceful degradation, consider adding a comment explaining why and what the expected caller behavior is.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/ocimirror/ociref.go` around lines 34 - 38, SplitOCIRef currently
swallows name.ParseReference errors and returns a fallback ("docker.io",
imageRef, "") which can hide invalid refs; change SplitOCIRef to surface parse
failures by returning an error instead of silently falling back: update the
SplitOCIRef signature to include an error return, propagate the parse error from
name.ParseReference, and update all callers to handle the error (or, if you
prefer graceful degradation, add a clear comment explaining the fallback and
emit a warning log via the package logger when ParseReference fails). Locate the
function SplitOCIRef and the name.ParseReference call to implement this change
and update call sites accordingly.
internal/ocimirror/ociref_test.go (1)

62-104: Consider adding test cases for error/edge scenarios.

The tests cover common valid inputs well. Consider adding tests for edge cases:

  • Empty string input
  • Completely malformed references (e.g., ":::invalid")
  • References with ports (e.g., "localhost:5000/myimage:tag")

This would help document the expected behavior of SplitOCIRef when parsing fails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/ocimirror/ociref_test.go` around lines 62 - 104, Add unit tests for
SplitOCIRef to cover edge cases: call ocimirror.SplitOCIRef with an empty string
and assert it does not panic and returns empty reg/repo and empty tagOrDigest;
call it with a malformed input like ":::invalid" and assert it returns a
non-panicking result (e.g., reg == "" and repo == ":::invalid" or whatever the
current implementation returns) to lock in current behavior; and call it with a
registry containing a port "localhost:5000/myimage:tag" and assert reg ==
"localhost:5000", repo == "myimage", tagOrDigest == ":tag". Name the tests
clearly and use the same Expect style as existing It blocks to document expected
behavior of SplitOCIRef.
internal/ocimirror/replicator.go (1)

105-115: Consider documenting the opts slice mutation behavior.

The append on line 108 mutates the original extraOpts slice's backing array if it has spare capacity. This is typically safe here since callers pass variadic arguments, but documenting this or using slices.Clone would make the intent clearer.

♻️ Optional: defensive copy of options
 func (r *OCIReplicator) TriggerReplication(ctx context.Context, mirroredRef string, extraOpts ...crane.Option) error {
 	log.FromContext(ctx).V(1).Info("triggering replication", "ref", mirroredRef)
-	opts := append([]crane.Option{crane.WithAuth(r.auth)}, extraOpts...)
+	opts := make([]crane.Option, 0, 1+len(extraOpts))
+	opts = append(opts, crane.WithAuth(r.auth))
+	opts = append(opts, extraOpts...)
 	_, err := r.manifestFetcher(mirroredRef, opts...)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/ocimirror/replicator.go` around lines 105 - 115, TriggerReplication
currently builds opts via opts := append([]crane.Option{crane.WithAuth(r.auth)},
extraOpts...) which can mutate the caller's extraOpts backing array; change this
to make a defensive copy instead (e.g., allocate a new slice and copy extraOpts
or use slices.Clone) so appending the auth option cannot alter the caller's
slice, and update the comment to note that TriggerReplication creates an
internal copy of options before appending; reference: TriggerReplication,
extraOpts, opts, manifestFetcher.
internal/controller/plugindefinition/utils.go (2)

235-237: Consider adding platform option for consistency.

TriggerReplication is called here without the platform option, but in ReplicateOCIArtifacts (replicator.go:75), the linux/amd64 platform is explicitly specified via crane.WithPlatform.

Per the PR objectives, the replication should use "linux/amd64 platform". Consider adding the platform option here for consistency:

♻️ Add platform option for consistency
+import (
+	"github.com/google/go-containerregistry/pkg/crane"
+	v1 "github.com/google/go-containerregistry/pkg/v1"
+)

-	if err := replicator.TriggerReplication(ctx, mirroredRef); err != nil {
+	if err := replicator.TriggerReplication(ctx, mirroredRef, crane.WithPlatform(&v1.Platform{OS: "linux", Architecture: "amd64"})); err != nil {
 		return failReplication(err, "chart replication failed")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/plugindefinition/utils.go` around lines 235 - 237, The
call to replicator.TriggerReplication(ctx, mirroredRef) omits the platform
option and should replicate using the linux/amd64 platform like
ReplicateOCIArtifacts does; update the TriggerReplication invocation (in
utils.go) to pass the platform option (e.g., the same
crane.WithPlatform("linux/amd64") or the replicator's platform option parameter)
so the replicator uses the linux/amd64 platform consistently with
ReplicateOCIArtifacts.

210-218: Consider a more robust skip-check mechanism.

The current logic uses strings.Contains(replicationCond.Message, chartRef) to determine if replication can be skipped. This is brittle because:

  1. Message format changes could break the check
  2. If a chart name is a substring of another (e.g., app:v1 vs app:v1.0), false matches could occur

Consider storing the replicated chart reference in a dedicated status field or using exact message matching.

♻️ Safer alternative using exact suffix match
 	if replicationCond != nil && replicationCond.IsTrue() &&
 		replicationCond.Reason == greenhousev1alpha1.OCIReplicationSucceededReason &&
-		strings.Contains(replicationCond.Message, chartRef) {
+		strings.HasSuffix(replicationCond.Message, chartRef) {
 		logger.V(1).Info("chart already replicated, skipping", "chart", chartRef)
 		return nil
 	}

Alternatively, consider adding a ReplicatedChartRef field to the status for explicit tracking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/plugindefinition/utils.go` around lines 210 - 218,
Replace the brittle substring check using replicationCond.Message with an
explicit, robust check: add or use a dedicated status field (e.g.,
Status.ReplicatedChartRef) on the plugin definition and change the skip logic in
the function that calls
pluginDef.GetConditions()/GetConditionByType(OCIReplicationReadyCondition) to
first compare pluginDef.Status.ReplicatedChartRef == chartRef (exact equality),
and only fall back to a strict exact message match (replicationCond.Message ==
expectedMessage) if the status field is not present; ensure updates to set
Status.ReplicatedChartRef when replication succeeds so future checks use the
exact equality.
internal/controller/plugindefinition/plugindefinition_controller.go (1)

100-106: Add a comment explaining the intentional error handling pattern for chart replication failures.

The replication error is intentionally swallowed here (//nolint:nilerr), allowing the reconciliation to return lifecycle.Success while ensureChartReplication sets the OCIReplicationReadyCondition to False and logs the error internally. This enables "eventually consistent" behavior: the ReadyCondition propagates the replication failure (via setReadyCondition), the resource requeues after 1 minute, and retry attempts continue without blocking the reconciliation.

While this pattern is intentional and correct, consider adding an inline comment explaining why replication failures are non-blocking and how the condition propagation works, so future maintainers understand this is deliberate design rather than oversight.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/plugindefinition/plugindefinition_controller.go` around
lines 100 - 106, Add an inline comment above the ensureChartReplication call
explaining that replication errors are intentionally not returned to the
controller: note that ensureChartReplication sets the
OCIReplicationReadyCondition (via setReadyCondition) to False and logs errors
internally, the reconciliation returns lifecycle.Success with nolint:nilerr and
requeues after 1 minute to allow eventual consistency and retries, and therefore
swallowing the error here is deliberate to avoid blocking other reconciliation
logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/ocimirror/registry_mirror.go`:
- Around line 84-91: ResolveOCIRef lacks a nil-receiver guard and will panic if
called on a nil *RegistryMirrorConfig (e.g., when GetRegistryMirrorConfig
returns (nil, nil)); add an early check at the start of
RegistryMirrorConfig.ResolveOCIRef (if c == nil) to return nil, and optionally
also handle a nil c.RegistryMirrors map safely before lookup, so the method
returns nil instead of panicking.

---

Nitpick comments:
In `@internal/controller/plugindefinition/plugindefinition_controller.go`:
- Around line 100-106: Add an inline comment above the ensureChartReplication
call explaining that replication errors are intentionally not returned to the
controller: note that ensureChartReplication sets the
OCIReplicationReadyCondition (via setReadyCondition) to False and logs errors
internally, the reconciliation returns lifecycle.Success with nolint:nilerr and
requeues after 1 minute to allow eventual consistency and retries, and therefore
swallowing the error here is deliberate to avoid blocking other reconciliation
logic.

In `@internal/controller/plugindefinition/utils.go`:
- Around line 235-237: The call to replicator.TriggerReplication(ctx,
mirroredRef) omits the platform option and should replicate using the
linux/amd64 platform like ReplicateOCIArtifacts does; update the
TriggerReplication invocation (in utils.go) to pass the platform option (e.g.,
the same crane.WithPlatform("linux/amd64") or the replicator's platform option
parameter) so the replicator uses the linux/amd64 platform consistently with
ReplicateOCIArtifacts.
- Around line 210-218: Replace the brittle substring check using
replicationCond.Message with an explicit, robust check: add or use a dedicated
status field (e.g., Status.ReplicatedChartRef) on the plugin definition and
change the skip logic in the function that calls
pluginDef.GetConditions()/GetConditionByType(OCIReplicationReadyCondition) to
first compare pluginDef.Status.ReplicatedChartRef == chartRef (exact equality),
and only fall back to a strict exact message match (replicationCond.Message ==
expectedMessage) if the status field is not present; ensure updates to set
Status.ReplicatedChartRef when replication succeeds so future checks use the
exact equality.

In `@internal/ocimirror/ociref_test.go`:
- Around line 62-104: Add unit tests for SplitOCIRef to cover edge cases: call
ocimirror.SplitOCIRef with an empty string and assert it does not panic and
returns empty reg/repo and empty tagOrDigest; call it with a malformed input
like ":::invalid" and assert it returns a non-panicking result (e.g., reg == ""
and repo == ":::invalid" or whatever the current implementation returns) to lock
in current behavior; and call it with a registry containing a port
"localhost:5000/myimage:tag" and assert reg == "localhost:5000", repo ==
"myimage", tagOrDigest == ":tag". Name the tests clearly and use the same Expect
style as existing It blocks to document expected behavior of SplitOCIRef.

In `@internal/ocimirror/ociref.go`:
- Line 13: The current regex variable imageFieldPattern may miss valid YAML
image declarations (e.g., block scalars or nested keys like
spec.containers[].image); update the extraction to use a YAML parser instead of
regex for robustness: replace usages of imageFieldPattern with parsing the
manifest via a YAML library (unmarshal each document and walk maps/arrays to
collect values for the "image" key, including nested paths like
spec.containers[].image), or if you must keep regex, add a clear comment on
imageFieldPattern documenting its limitations and intended scope for
Helm-rendered manifests only.
- Around line 34-38: SplitOCIRef currently swallows name.ParseReference errors
and returns a fallback ("docker.io", imageRef, "") which can hide invalid refs;
change SplitOCIRef to surface parse failures by returning an error instead of
silently falling back: update the SplitOCIRef signature to include an error
return, propagate the parse error from name.ParseReference, and update all
callers to handle the error (or, if you prefer graceful degradation, add a clear
comment explaining the fallback and emit a warning log via the package logger
when ParseReference fails). Locate the function SplitOCIRef and the
name.ParseReference call to implement this change and update call sites
accordingly.

In `@internal/ocimirror/replicator.go`:
- Around line 105-115: TriggerReplication currently builds opts via opts :=
append([]crane.Option{crane.WithAuth(r.auth)}, extraOpts...) which can mutate
the caller's extraOpts backing array; change this to make a defensive copy
instead (e.g., allocate a new slice and copy extraOpts or use slices.Clone) so
appending the auth option cannot alter the caller's slice, and update the
comment to note that TriggerReplication creates an internal copy of options
before appending; reference: TriggerReplication, extraOpts, opts,
manifestFetcher.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2f793cfd-cb80-46df-98c5-3e21681f318c

📥 Commits

Reviewing files that changed from the base of the PR and between 76f0768 and 5bbd391.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (18)
  • api/v1alpha1/plugindefinition_types.go
  • charts/manager/crds/greenhouse.sap_catalogs.yaml
  • charts/manager/crds/greenhouse.sap_clusterplugindefinitions.yaml
  • go.mod
  • internal/controller/plugin/image_mirror.go
  • internal/controller/plugin/oci_mirror.go
  • internal/controller/plugin/oci_mirror_test.go
  • internal/controller/plugin/plugin_controller_flux.go
  • internal/controller/plugindefinition/cluster_plugindefinition_controller.go
  • internal/controller/plugindefinition/plugindefinition_controller.go
  • internal/controller/plugindefinition/utils.go
  • internal/ocimirror/ociref.go
  • internal/ocimirror/ociref_test.go
  • internal/ocimirror/registry_mirror.go
  • internal/ocimirror/registry_mirror_test.go
  • internal/ocimirror/replicator.go
  • internal/ocimirror/replicator_test.go
  • internal/ocimirror/suite_test.go
💤 Files with no reviewable changes (1)
  • internal/controller/plugin/image_mirror.go

Comment on lines +84 to +91
func (c *RegistryMirrorConfig) ResolveOCIRef(imageRef string) *ResolvedOCIRef {
registry, repo, tagOrDigest := SplitOCIRef(imageRef)
mirror, ok := c.RegistryMirrors[registry]
if !ok {
return nil
}
return &ResolvedOCIRef{Registry: registry, Mirror: mirror, Repository: repo, TagOrDigest: tagOrDigest}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add nil receiver check to prevent potential panic.

ResolveOCIRef is a method on *RegistryMirrorConfig but doesn't check for a nil receiver. If GetRegistryMirrorConfig returns (nil, nil) (which it does when no config exists), calling ResolveOCIRef on that nil pointer would panic.

🛡️ Proposed fix to add nil check
 // ResolveOCIRef looks up the mirror configuration for an OCI reference.
 func (c *RegistryMirrorConfig) ResolveOCIRef(imageRef string) *ResolvedOCIRef {
+	if c == nil {
+		return nil
+	}
 	registry, repo, tagOrDigest := SplitOCIRef(imageRef)
 	mirror, ok := c.RegistryMirrors[registry]
 	if !ok {
 		return nil
 	}
 	return &ResolvedOCIRef{Registry: registry, Mirror: mirror, Repository: repo, TagOrDigest: tagOrDigest}
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (c *RegistryMirrorConfig) ResolveOCIRef(imageRef string) *ResolvedOCIRef {
registry, repo, tagOrDigest := SplitOCIRef(imageRef)
mirror, ok := c.RegistryMirrors[registry]
if !ok {
return nil
}
return &ResolvedOCIRef{Registry: registry, Mirror: mirror, Repository: repo, TagOrDigest: tagOrDigest}
}
func (c *RegistryMirrorConfig) ResolveOCIRef(imageRef string) *ResolvedOCIRef {
if c == nil {
return nil
}
registry, repo, tagOrDigest := SplitOCIRef(imageRef)
mirror, ok := c.RegistryMirrors[registry]
if !ok {
return nil
}
return &ResolvedOCIRef{Registry: registry, Mirror: mirror, Repository: repo, TagOrDigest: tagOrDigest}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/ocimirror/registry_mirror.go` around lines 84 - 91, ResolveOCIRef
lacks a nil-receiver guard and will panic if called on a nil
*RegistryMirrorConfig (e.g., when GetRegistryMirrorConfig returns (nil, nil));
add an early check at the start of RegistryMirrorConfig.ResolveOCIRef (if c ==
nil) to return nil, and optionally also handle a nil c.RegistryMirrors map
safely before lookup, so the method returns nil instead of panicking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEAT] - Image Replication via configured pull through cache mirror

1 participant