Skip to content

Conversation

@mangelajo
Copy link
Member

@mangelajo mangelajo commented Oct 13, 2025

  • Deploys endpoints for various types of services but still missing:
  1. Ingress creation
  2. Openshift Route creation
  • Creates secrets automatically, keeping backwards compatibility with existing helm secrets

  • Creates configmap, and standardizes on config structures instead of raw yaml creation/parsing.

  • Deploys one router deployment per replica, with the corresponding endpoints

  • Deploys a controller

Runs on CI

Summary by CodeRabbit

  • New Features

    • Operator-based deployment: new CRDs (Client, Exporter, Lease, ExporterAccessPolicy), endpoint address format with templating, and ClusterIP support
    • Operator now provisions services for endpoints (ClusterIP/LoadBalancer/NodePort) including per-replica router services
  • Chores

    • Binaries now include build metadata (version/commit/date) surfaced at startup; build and Docker flows updated for caching and build-args; operator image default moved to quay.io; new Make targets and local deployment helpers/scripts added
  • Tests

    • Added extensive unit and envtest integration tests and test harness improvements

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 13, 2025

Walkthrough

Embeds build/version metadata into builds and Docker; adds operator multi-stage build and Make targets; replaces operator scaffold with a full reconciler (RBAC, deployments, endpoint services) and tests; extends CRDs (Client/Exporter/Lease/ExporterAccessPolicy) and endpoint schema (address + clusterIP); improves keepalive parsing and local deployment tooling.

Changes

Cohort / File(s) Summary
Build & CI
Makefile, .github/workflows/build.yaml, .github/workflows/pr-kind.yaml, .github/workflows/e2e.yaml
Introduce GIT_VERSION/GIT_COMMIT/BUILD_DATE LDFLAGS and pass as Docker build-args; add deploy-with-operator job; update e2e action to release tag and rename job key; export build args step added.
Docker / Image builds
Dockerfile, Dockerfile.operator, deploy/operator/Dockerfile (deleted), deploy/operator/Makefile
Add build ARGs and embed ldflags into manager/router builds; add Dockerfile.operator multi-stage UBI9 image, remove old operator Dockerfile; operator Makefile updated to use quay image, pass build-args, and expose operator-specific targets.
Manager & startup
cmd/main.go, cmd/router/main.go
Add version/gitCommit/buildDate variables and startup logging; main gains getWatchNamespace(), configures namespace-scoped cache and leader-election namespace; router parses flags earlier.
Operator API types & CRDs
deploy/operator/api/v1alpha1/jumpstarter_types.go, .../zz_generated.deepcopy.go, deploy/operator/config/crd/bases/*.yaml, deploy/operator/config/crd/kustomization.yaml
Replace Endpoint.Hostname with Endpoint.Address; add ClusterIPConfig and ClusterIP field; update deepcopy; add CRDs for Client, Exporter, Lease, ExporterAccessPolicy and register them in kustomization.
Operator controller & reconcilers
deploy/operator/cmd/main.go, deploy/operator/internal/controller/jumpstarter/..., deploy/operator/internal/controller/jumpstarter/endpoints/*
Replace scaffold with full JumpstarterReconciler (RBAC reconciliation, deployments, per-replica router handling), add EndpointReconciler with createOrUpdateService and service reconciliation helpers, add compare/diff helpers.
Operator tests & testutils
deploy/operator/internal/controller/jumpstarter/*_test.go, .../endpoints/*_test.go, deploy/operator/internal/controller/testutils/envtest.go
Add Ginkgo/Gomega tests and suite bootstrapping for endpoints and jumpstarter reconciler; add envtest helper GetFirstFoundEnvTestBinaryDir and adjust suites to use it.
Kustomize / Manager manifests & RBAC
deploy/operator/config/manager/kustomization.yaml, deploy/operator/config/manager/manager.yaml, deploy/operator/config/rbac/role.yaml
Add image override and update manager image/pullPolicy; expand RBAC rules to cover additional API groups/resources.
Config types & keepalive parsing
internal/config/types.go, internal/config/grpc.go, internal/config/types_test.go
Add YAML-tagged types (K8s/Internal/Keepalive/Router) and ParseDuration; update LoadGrpcConfiguration to parse durations with defaults and wrapped errors; add round-trip/parse tests.
Service logs & router split
internal/service/controller_service.go, internal/service/router_service.go, cmd/router/main.go
Startup logs now include explicit ports; router binary gets build-time version vars and logs them; flag.Parse() moved earlier.
Shell tooling & deployment scripts
hack/utils, hack/deploy_vars, hack/deploy_with_operator.sh, hack/deploy_with_helm.sh, hack/kind_cluster.yaml, hack/get_ext_ip.sh (deleted)
Add hack/utils helpers (external IP, kind image loader, ingress installer, readiness checks), deploy_vars, new deploy_with_operator script, refactor Helm deploy script to use utilities, update kind config, remove old get_ext_ip.sh.
Operator logging levels
deploy/operator/internal/log/levels.go
Add exported log verbosity constants: LevelInfo, LevelDebug, LevelTrace.

Sequence Diagram(s)

sequenceDiagram
    participant Manager as ctrl.Manager
    participant Jump as JumpstarterReconciler
    participant Endpoints as EndpointReconciler
    participant K8s as Kubernetes API

    Manager->>Jump: Reconcile(ctx, Request)
    activate Jump
    Jump->>K8s: Get Jumpstarter CR
    K8s-->>Jump: Jumpstarter resource
    Jump->>Jump: reconcileRBAC()
    Jump->>K8s: CreateOrUpdate(ServiceAccount/Role/RoleBinding)
    Jump->>Jump: reconcileDeployments(), reconcileConfigMaps(), ensureSecrets()
    loop per-controller endpoint
      Jump->>Endpoints: ReconcileControllerEndpoint(...)
      Endpoints->>K8s: CreateOrUpdate Service(s)
      K8s-->>Endpoints: Service created/updated
    end
    loop per-router replica
      Jump->>Endpoints: ReconcileRouterReplicaEndpoint(...)
      Endpoints->>K8s: CreateOrUpdate Service(s)
      K8s-->>Endpoints: Service created/updated
    end
    Jump-->>Manager: Requeue after 30m
    deactivate Jump
Loading
sequenceDiagram
    participant Main as cmd/main.go
    participant FS as ServiceAccount File

    Main->>Main: getWatchNamespace()
    alt NAMESPACE env set
        Main->>Main: use NAMESPACE
    else serviceaccount file present
        Main->>FS: read namespace file
        FS-->>Main: namespace
    else
        Main->>Main: exit (no deterministic namespace)
    end
    Main->>Manager: NewManager(mgrOptions scoped by namespace)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Files/areas needing extra attention:
    • deploy/operator/internal/controller/jumpstarter/* — reconciliation logic, per-replica router handling, ownership and cleanup.
    • deploy/operator/internal/controller/jumpstarter/endpoints/* — createOrUpdateService, NodePort preservation, naming conventions.
    • CRD schema changes and deepcopy updates (deploy/operator/api & config/crd/bases/*) for API compatibility.
    • Build/tooling cross-cutting changes: Makefile, Dockerfile.operator, workflows to confirm LDFLAGS/build-args propagate correctly.

Possibly related PRs

Suggested reviewers

  • kirkbrauer
  • bennyz

"🐰
I stitched the build tags into every single crate,
I scoped the watch namespace before the manager wakes,
I hop to reconcile services, per-replica and neat,
With ClusterIP tucked where addresses now meet,
A carrot of versions, a hop — deploys complete."

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Operator: initial implementation" directly reflects the primary objective of this PR. The changeset is dominated by operator-related code in the deploy/operator/ directory, including a new JumpstarterReconciler, EndpointReconciler, RBAC configuration, multiple CRD definitions, and operator deployment logic. Supporting changes to core application code (versioning, configuration, logging) and CI/build infrastructure serve to enable the operator implementation. The title is concise and sufficiently specific—a teammate reviewing the commit history would clearly understand this PR introduces operator functionality. While the title could be more detailed, it effectively communicates the main change without noise.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch operator-implementation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mangelajo mangelajo marked this pull request as draft October 13, 2025 16:26
Base automatically changed from add-operator to main October 15, 2025 09:15
@mangelajo mangelajo force-pushed the operator-implementation branch 3 times, most recently from 4ed0651 to fc5e26c Compare October 15, 2025 13:10
@mangelajo mangelajo changed the title WIP: (follow up on add-operator) Operator implementation Operator implementation: service + ingress + route reconciliation Oct 15, 2025
@mangelajo mangelajo changed the title Operator implementation: service + ingress + route reconciliation Operator: service + ingress + route reconciliation Oct 15, 2025
@mangelajo mangelajo force-pushed the operator-implementation branch from fc5e26c to 39b394f Compare October 15, 2025 15:45
@mangelajo mangelajo force-pushed the operator-implementation branch from 71c4ed1 to 576f6ad Compare October 20, 2025 15:55
@mangelajo mangelajo force-pushed the operator-implementation branch 3 times, most recently from 99e01ce to 1d170e0 Compare October 22, 2025 11:50
@mangelajo mangelajo marked this pull request as ready for review October 22, 2025 11:51
@mangelajo mangelajo changed the title Operator: service + ingress + route reconciliation Operator: initial implementaiton Oct 22, 2025
@mangelajo mangelajo changed the title Operator: initial implementaiton Operator: initial implementation Oct 22, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)

377-414: Update comment to reflect actual mutual exclusion constraints and clarify Route/Ingress handling.

The comment claiming "Multiple methods can be configured simultaneously for the same address" is inaccurate. The implementation enforces that LoadBalancer and NodePort cannot both be enabled (validated in serviceDetailsForEndpoint). Additionally, Route and Ingress fields in the Endpoint struct are never processed in ReconcileEndpoint—only one service type (LoadBalancer, NodePort, or ClusterIP) is created per endpoint.

Update the struct comment to clarify:

  • Only ONE of LoadBalancer or NodePort may be enabled per endpoint (mutually exclusive)
  • Route and Ingress configurations are present in the struct but not currently handled by this reconciler
  • ClusterIP is the default when neither LoadBalancer nor NodePort is enabled
🧹 Nitpick comments (31)
hack/kind_cluster.yaml (1)

20-22: Clarify the purpose of port 30010 for consistency.

Line 20 labels port 30010 as "grpc nodeport" without specifying a service or replica, while the new entries (lines 23, 26, 29) explicitly label their ports as "grpc router nodeport (replica N)". If port 30010 serves a different purpose (e.g., exporter or another service), a comment clarification would improve maintainability and align with the new labeling convention.

Consider updating the comment to:

-  - containerPort: 30010 # grpc nodeport
+  - containerPort: 30010 # grpc exporter nodeport

(or whatever service it actually serves)

deploy/operator/internal/utils/maps.go (1)

19-30: <!-- [scratchpad_start]
[task verification_results_analysis]

[observations]

  • Script 1 found MapsEqual IS USED in the codebase (contrary to AI summary)
  • Usage locations: deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go
  • Used at lines 142, 170, 171 to compare map[string]string values
  • Script 2 confirms Go version is 1.24.0 (maps.Equal available since Go 1.21)

[analysis]
Original review comment concern was INCORRECT:

  • AI summary claimed no usage in related diffs
  • Actual result: function is actively used in endpoints.go
  • Implementation is correct and handles all cases properly

New finding:

  • Go 1.24.0 supports maps.Equal from standard library
  • Custom MapsEqual function duplicates stdlib functionality
  • Could simplify codebase by using stdlib instead

[decision]

  • Function is not dead code (revert that concern)
  • Implementation is correct (approve logic)
  • Recommend optional refactor to use maps.Equal from stdlib
  • Tag:

[scratchpad_end]
-->

Function is in active use; consider migrating to standard library.

The custom MapsEqual implementation is correct and is actively used in deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (lines 142, 170-171). However, since the project targets Go 1.24.0, you can simplify this by using the standard library's maps.Equal() function available since Go 1.21:

import "maps"

// Replace usages of:
// utils.MapsEqual(existing.Spec.Selector, desired.Spec.Selector)
// With:
// maps.Equal(existing.Spec.Selector, desired.Spec.Selector)

Then remove the custom MapsEqual function.

internal/config/types.go (2)

52-80: Document or enforce defaults for the other Keepalive fields.

Comments list defaults (Timeout, IntervalTime, etc.) but only MinTime is defaulted in loader. Either state “no defaulting applied; comments indicate typical values” or add defaulting where configuration is loaded.


89-97: Consider a lightweight validation helper for RouterEntry.

A Validate() error on RouterEntry (e.g., non-empty Endpoint) would catch bad ConfigMaps early. Optional, but helpful for operator UX.

internal/config/types_test.go (3)

165-170: Assert ‘default’ router presence here as well.

If “default” is missing, this block won’t fail. Add an else to surface the error.

   if defaultEntry, exists := router["default"]; exists {
     if len(defaultEntry.Labels) != 0 {
       t.Errorf("Default router should have no labels, got %d", len(defaultEntry.Labels))
     }
-  }
+  } else {
+    t.Error("Missing 'default' router")
+  }

180-206: Expand ParseDuration test cases (whitespace; optional d/w if enabled).

Add cases to harden the table:

   tests := []struct {
@@
-    {"invalid", true, ""},
+    {"invalid", true, ""},
+    {" 1s ", false, "1s"},
+    // Uncomment if ParseDuration supports days/weeks per code change:
+    // {"1d", false, "24h0m0s"},
+    // {"0.5w", false, "84h0m0s"},

10-64: Optional: add a couple more round-trip assertions.

Also assert K8s.Enabled and one more keepalive field (e.g., Timeout) to exercise YAML tags fully.

internal/config/grpc.go (1)

10-31: Add a focused test to lock in defaulting vs explicit “0s”.

I can add internal/config/grpc_test.go:

package config

import (
	"testing"
	"google.golang.org/grpc"
)

func TestLoadGrpcConfiguration_MinTimeDefaulting(t *testing.T) {
	// Unset: expect default to apply
	opt, err := LoadGrpcConfiguration(Grpc{Keepalive: Keepalive{MinTime: ""}})
	if err != nil {
		t.Fatal(err)
	}
	if opt == nil {
		t.Fatal("expected non-nil ServerOption")
	}

	// Explicit zero: expect no defaulting
	_, err = LoadGrpcConfiguration(Grpc{Keepalive: Keepalive{MinTime: "0s"}})
	if err != nil {
		t.Fatalf("unexpected error parsing 0s: %v", err)
	}
	// (Optionally inspect internal state via reflect, or just assert no panic/error.)
}

Say the word and I’ll push this.

deploy/operator/internal/controller/jumpstarter/suite_test.go (2)

69-69: Consider using a more robust path construction.

The relative path with multiple .. components is fragile and will break if the test file is moved or if the directory structure changes.

Consider using filepath.Join with a well-known base or using Go's //go:embed directive for embedding the CRD directory location, or using environment variables to make this more robust.


104-104: Highly brittle path construction.

The path construction with six levels of .. is extremely fragile and will break if this test file moves. This creates a tight coupling between the test file location and the binary directory structure.

Consider one of these approaches:

  • Use an environment variable (e.g., KUBEBUILDER_ASSETS) to point to the binary directory
  • Use go list -m -f '{{.Dir}}' to find the module root and build from there
  • Make the path configurable via a test flag
deploy/operator/internal/controller/jumpstarter/endpoints/suite_test.go (1)

69-69: Consider using a more robust path construction.

Same issue as in suite_test.go: the relative path with multiple .. components is fragile and will break if the test file is moved or if the directory structure changes.

Consider using a more robust approach such as environment variables or filepath.Join with a well-known base.

deploy/operator/config/crd/bases/jumpstarter.dev_exporters.yaml (1)

9-17: Improve CRD UX: add shortNames and printer columns

Short names and columns make kubectl discovery/ops easier.

 spec:
   group: jumpstarter.dev
   names:
     kind: Exporter
     listKind: ExporterList
     plural: exporters
     singular: exporter
+    shortNames:
+    - exp
   scope: Namespaced
   versions:
   - name: v1alpha1
     schema:
       openAPIV3Schema:
         description: Exporter is the Schema for the exporters API
@@
     served: true
     storage: true
+    additionalPrinterColumns:
+    - name: Endpoint
+      type: string
+      jsonPath: .status.endpoint
+    - name: LastSeen
+      type: date
+      jsonPath: .status.lastSeen
+    - name: Ready
+      type: string
+      jsonPath: .status.conditions[?(@.type=="Ready")].status
     subresources:
       status: {}

Also applies to: 18-21, 159-162

deploy/operator/config/crd/bases/jumpstarter.dev_clients.yaml (3)

20-47: Docs use “Identity/identities” — should be “Client/clients”

Update Go type comments (Client, ClientSpec, ClientStatus) to use correct nouns and regenerate CRDs to avoid confusing users.


48-66: Remove placeholder text from published CRD

Drop “INSERT ADDITIONAL STATUS FIELD …” from status. Fix in Go comments and re-run controller-gen.


9-17: Add conditions, shortNames, and printer columns for operability

Parity with Exporter improves ‘kubectl wait’ and listing.

 spec:
   group: jumpstarter.dev
   names:
     kind: Client
     listKind: ClientList
     plural: clients
     singular: client
+    shortNames:
+    - cli
@@
   - name: v1alpha1
     schema:
       openAPIV3Schema:
@@
-    served: true
-    storage: true
+    served: true
+    storage: true
+    additionalPrinterColumns:
+    - name: Endpoint
+      type: string
+      jsonPath: .status.endpoint
+    - name: Ready
+      type: string
+      jsonPath: .status.conditions[?(@.type=="Ready")].status
     subresources:
       status: {}

Note: add .status.conditions to the Go types (ClientStatus) and regenerate to support the Ready column.

Also applies to: 18-21, 68-71

deploy/operator/internal/controller/jumpstarter/jumpstarter_controller_test.go (3)

111-116: Add assertions for RBAC artifacts created during reconciliation

Assert the ServiceAccount/Role/RoleBinding exist to validate reconcile end-to-end.

   _, err := controllerReconciler.Reconcile(ctx, reconcile.Request{
     NamespacedName: typeNamespacedName,
   })
   Expect(err).NotTo(HaveOccurred())
-  // TODO(user): Add more specific assertions depending on your controller's reconciliation logic.
+  // Assert RBAC resources
+  Eventually(func(g Gomega) {
+    sa := &corev1.ServiceAccount{}
+    g.Expect(k8sClient.Get(ctx, types.NamespacedName{
+      Name: fmt.Sprintf("%s-controller-manager", resourceName), Namespace: "default",
+    }, sa)).To(Succeed())
+  }).Should(Succeed())
+  // Add similar checks for Role and RoleBinding if desired

Also applies to: 118-124


41-45: Avoid hardcoding “default” — create and use a per-test namespace

Prevents cross-test interference and eases cleanup.

- Namespace: "default", // TODO(user):Modify as needed
+ Namespace: ns.Name,

Create the namespace in BeforeEach and delete it in AfterEach:

ns := &corev1.Namespace{ObjectMeta: metav1.ObjectMeta{GenerateName: "js-e2e-"}}
Expect(k8sClient.Create(ctx, ns)).To(Succeed())
DeferCleanup(func() { _ = k8sClient.Delete(ctx, ns) })

Also applies to: 101-109


128-142: ensurePort: consider adding IPv4 and empty-address corner cases to table

Add cases for “1.2.3.4” and whitespace-only to lock behavior down.

deploy/operator/internal/controller/jumpstarter/endpoints/endpoints_test.go (1)

120-152: Consider verifying the NodePort value in the test.

The test sets Port: 30090 in the NodePortConfig but only verifies annotations and labels. Consider adding an assertion to verify that service.Spec.Ports[0].NodePort equals the configured port to ensure the reconciler correctly applies the NodePort value.

Apply this diff to add NodePort verification:

 		Expect(err).NotTo(HaveOccurred())
 		Expect(service.Spec.Type).To(Equal(corev1.ServiceTypeNodePort))
+		Expect(service.Spec.Ports[0].NodePort).To(Equal(int32(30090)))
 		Expect(service.Annotations["nodeport.kubernetes.io/port"]).To(Equal("30090"))
 		Expect(service.Labels["nodeport"]).To(Equal("true"))
hack/utils (1)

139-153: Consider quoting command variables for safety.

Lines 145, 158, and 159 use ${GRPCURL}, ${GRPC_ENDPOINT}, and ${GRPC_ROUTER_ENDPOINT} without quotes. While unlikely in this context, consider quoting these variables to prevent issues if paths contain spaces.

Apply this diff:

-  while ! ${GRPCURL} -insecure "${endpoint}" list; do
+  while ! "${GRPCURL}" -insecure "${endpoint}" list; do

And similarly for lines 158-159:

-  wait_for_grpc_endpoint "${GRPC_ENDPOINT}"
-  wait_for_grpc_endpoint "${GRPC_ROUTER_ENDPOINT}"
+  wait_for_grpc_endpoint "${GRPC_ENDPOINT}"
+  wait_for_grpc_endpoint "${GRPC_ROUTER_ENDPOINT}"
hack/deploy_with_helm.sh (4)

16-20: Ingress install should key off NETWORKING_MODE (single source of truth).

Today you gate install with INGRESS_ENABLED while Helm flags switch on NETWORKING_MODE. This can drift. Prefer using NETWORKING_MODE=="ingress" for both.


24-29: Use --set-string and quote values containing dots/colons.

GRPC endpoints (host:port) and baseDomain (dots) can be misparsed by Helm --set. Use --set-string or values.yaml.

Apply:

-HELM_SETS="${HELM_SETS} --set global.baseDomain=${BASEDOMAIN}"
-HELM_SETS="${HELM_SETS} --set jumpstarter-controller.grpc.endpoint=${GRPC_ENDPOINT}"
-HELM_SETS="${HELM_SETS} --set jumpstarter-controller.grpc.routerEndpoint=${GRPC_ROUTER_ENDPOINT}"
-HELM_SETS="${HELM_SETS} --set jumpstarter-controller.image=${IMAGE_REPO}"
-HELM_SETS="${HELM_SETS} --set jumpstarter-controller.tag=${IMAGE_TAG}"
+HELM_SETS="${HELM_SETS} --set-string global.baseDomain=${BASEDOMAIN}"
+HELM_SETS="${HELM_SETS} --set-string jumpstarter-controller.grpc.endpoint=${GRPC_ENDPOINT}"
+HELM_SETS="${HELM_SETS} --set-string jumpstarter-controller.grpc.routerEndpoint=${GRPC_ROUTER_ENDPOINT}"
+HELM_SETS="${HELM_SETS} --set-string jumpstarter-controller.image=${IMAGE_REPO}"
+HELM_SETS="${HELM_SETS} --set-string jumpstarter-controller.tag=${IMAGE_TAG}"

31-35: Suffix consistency: use '-np' to match chart names.

NodePort suffix elsewhere is '-np'. Keep flags/Service names aligned to avoid surprises in templating and cleanups.


37-40: Quote variables and function args.

Avoid word-splitting/globbing; also verify kind_load_image is sourced.

-echo -e "${GREEN}Loading the ${IMG} in kind ...${NC}"
-kind_load_image ${IMG}
+echo -e "${GREEN}Loading the ${IMG} in kind ...${NC}"
+kind_load_image "${IMG}"
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (3)

261-265: NodePort autoincrement may exceed allowed range.

Port + replicaIndex can fall outside kube-apiserver NodePort range. Guard or clamp.

-    svcPort.NodePort = endpoint.NodePort.Port + int32(i)
+    next := endpoint.NodePort.Port + int32(i)
+    if next < 30000 || next > 32767 { // typical defaults; make configurable
+        next = endpoint.NodePort.Port
+    }
+    svcPort.NodePort = next

164-166: Unconditional requeue every 10s.

Let watches drive reconciliation; requeue on transient errors or pending conditions only. This reduces churn.

-    return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
+    return ctrl.Result{}, nil

418-424: Status is stubbed.

Expose basic Ready/Replicas/Endpoints and wire conditions for controller/router Deployments.

I can sketch a minimal status with Available condition and observedGeneration.

deploy/operator/config/crd/bases/operator.jumpstarter.dev_jumpstarters.yaml (2)

898-904: Avoid defaulting ingress.class to 'default'.

This value often doesn’t exist; better to omit default and let users set their controller class.

-                                class:
-                                  default: default
+                                class:

Apply similarly in routers/restApi sections.


980-987: Enforce NodePort lower bound (or relax the doc).

Schema allows 1..65535 while the description says 30000–32767 (typical). Either set minimum: 30000 or adjust the text.

-                                  minimum: 1
+                                  minimum: 30000

(Apply to both NodePort.port occurrences.)

Also applies to: 1318-1325

deploy/operator/Makefile (2)

199-211: Bootstrap the buildx builder.

Some environments require an explicit bootstrap before the first multi-arch build.

 $(CONTAINER_TOOL) buildx use jumpstarter-operator-builder
+$(CONTAINER_TOOL) buildx inspect --bootstrap

180-186: Propagate version args only when present.

Minor: to improve reproducibility, consider pinning BUILD_DATE to SOURCE_DATE_EPOCH when set.

-    --build-arg BUILD_DATE=$(BUILD_DATE) \
+    --build-arg BUILD_DATE=$(BUILD_DATE) \
+    --build-arg SOURCE_DATE_EPOCH=$${SOURCE_DATE_EPOCH:-} \
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ca8e81 and 1d170e0.

⛔ Files ignored due to path filters (1)
  • deploy/operator/go.sum is excluded by !**/*.sum
📒 Files selected for processing (42)
  • .github/workflows/build.yaml (2 hunks)
  • .github/workflows/e2e.yaml (1 hunks)
  • Dockerfile (2 hunks)
  • Dockerfile.operator (1 hunks)
  • Makefile (7 hunks)
  • cmd/main.go (5 hunks)
  • deploy/operator/Dockerfile (0 hunks)
  • deploy/operator/Makefile (6 hunks)
  • deploy/operator/api/v1alpha1/jumpstarter_types.go (3 hunks)
  • deploy/operator/api/v1alpha1/zz_generated.deepcopy.go (3 hunks)
  • deploy/operator/cmd/main.go (3 hunks)
  • deploy/operator/config/crd/bases/jumpstarter.dev_clients.yaml (1 hunks)
  • deploy/operator/config/crd/bases/jumpstarter.dev_exporteraccesspolicies.yaml (1 hunks)
  • deploy/operator/config/crd/bases/jumpstarter.dev_exporters.yaml (1 hunks)
  • deploy/operator/config/crd/bases/jumpstarter.dev_leases.yaml (1 hunks)
  • deploy/operator/config/crd/bases/operator.jumpstarter.dev_jumpstarters.yaml (3 hunks)
  • deploy/operator/config/crd/kustomization.yaml (1 hunks)
  • deploy/operator/config/manager/kustomization.yaml (1 hunks)
  • deploy/operator/config/manager/manager.yaml (1 hunks)
  • deploy/operator/config/rbac/role.yaml (2 hunks)
  • deploy/operator/go.mod (3 hunks)
  • deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/endpoints/endpoints_test.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/endpoints/suite_test.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller_test.go (4 hunks)
  • deploy/operator/internal/controller/jumpstarter/rbac.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/suite_test.go (3 hunks)
  • deploy/operator/internal/controller/jumpstarter_controller.go (0 hunks)
  • deploy/operator/internal/log/levels.go (1 hunks)
  • deploy/operator/internal/utils/maps.go (1 hunks)
  • hack/deploy_vars (1 hunks)
  • hack/deploy_with_helm.sh (2 hunks)
  • hack/deploy_with_operator.sh (1 hunks)
  • hack/get_ext_ip.sh (0 hunks)
  • hack/kind_cluster.yaml (1 hunks)
  • hack/utils (1 hunks)
  • internal/config/grpc.go (1 hunks)
  • internal/config/types.go (1 hunks)
  • internal/config/types_test.go (1 hunks)
  • internal/service/controller_service.go (1 hunks)
  • internal/service/router_service.go (1 hunks)
💤 Files with no reviewable changes (3)
  • deploy/operator/internal/controller/jumpstarter_controller.go
  • hack/get_ext_ip.sh
  • deploy/operator/Dockerfile
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-02-17T20:31:08.553Z
Learnt from: NickCao
PR: jumpstarter-dev/jumpstarter-controller#101
File: internal/controller/client_controller_test.go:152-152
Timestamp: 2025-02-17T20:31:08.553Z
Learning: In the jumpstarter-controller repository, the `TokenKey` constant is defined within the `internal/controller` package and is accessible to all files within the same package.

Applied to files:

  • deploy/operator/internal/controller/jumpstarter/suite_test.go
📚 Learning: 2025-05-13T19:57:56.811Z
Learnt from: NickCao
PR: jumpstarter-dev/jumpstarter-controller#137
File: hack/deploy_with_helm.sh:26-34
Timestamp: 2025-05-13T19:57:56.811Z
Learning: The jumpstarter project uses a custom kind cluster configuration with an expanded NodePort range (3000-32767) and explicit port mappings for ingress (5080/5443) and gRPC services (30010/30011 mapped to 8082/8083).

Applied to files:

  • hack/kind_cluster.yaml
📚 Learning: 2025-10-10T13:32:54.936Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#169
File: deploy/operator/Makefile:62-67
Timestamp: 2025-10-10T13:32:54.936Z
Learning: Podman now supports the buildx command, making it compatible with Docker's buildx syntax for multi-platform builds. No special gating or conditional logic is needed when using buildx with CONTAINER_TOOL=podman in Makefiles.

Applied to files:

  • deploy/operator/Makefile
🧬 Code graph analysis (13)
internal/service/router_service.go (1)
internal/log/levels.go (1)
  • Info (61-63)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints_test.go (2)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (2)
  • Reconciler (35-38)
  • NewReconciler (49-54)
deploy/operator/api/v1alpha1/jumpstarter_types.go (3)
  • Endpoint (380-414)
  • LoadBalancerConfig (476-496)
  • NodePortConfig (454-473)
deploy/operator/internal/controller/jumpstarter/rbac.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (49-53)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Jumpstarter (525-531)
deploy/operator/internal/controller/jumpstarter/endpoints/suite_test.go (1)
api/v1alpha1/client_types.go (1)
  • Client (44-50)
internal/config/types_test.go (1)
internal/config/types.go (10)
  • Config (11-15)
  • Authentication (19-23)
  • Internal (26-33)
  • K8s (36-39)
  • Provisioning (42-44)
  • Grpc (47-49)
  • Keepalive (53-81)
  • Router (87-87)
  • RouterEntry (90-97)
  • ParseDuration (100-105)
internal/service/controller_service.go (1)
internal/log/levels.go (1)
  • Info (61-63)
deploy/operator/api/v1alpha1/zz_generated.deepcopy.go (1)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • ClusterIPConfig (499-511)
internal/config/grpc.go (1)
internal/config/types.go (3)
  • Grpc (47-49)
  • Keepalive (53-81)
  • ParseDuration (100-105)
deploy/operator/cmd/main.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (49-53)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • NewReconciler (49-54)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (4)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • Reconciler (35-38)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Endpoint (380-414)
deploy/operator/internal/log/levels.go (2)
  • LevelTrace (17-17)
  • LevelDebug (12-12)
internal/config/types.go (9)
  • Config (11-15)
  • Provisioning (42-44)
  • Grpc (47-49)
  • Keepalive (53-81)
  • Authentication (19-23)
  • Internal (26-33)
  • K8s (36-39)
  • Router (87-87)
  • RouterEntry (90-97)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (2)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Endpoint (380-414)
deploy/operator/internal/utils/maps.go (1)
  • MapsEqual (20-30)
internal/config/types.go (1)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
  • JWTAuthenticator (174-180)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller_test.go (2)
deploy/operator/api/v1alpha1/jumpstarter_types.go (5)
  • JumpstarterSpec (130-154)
  • ControllerConfig (192-230)
  • GRPCConfig (244-260)
  • Endpoint (380-414)
  • RoutersConfig (158-187)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • NewReconciler (49-54)
🪛 actionlint (1.7.8)
.github/workflows/build.yaml

44-44: shellcheck reported issue in this script: SC2086:info:3:36: Double quote to prevent globbing and word splitting

(shellcheck)


44-44: shellcheck reported issue in this script: SC2086:info:4:36: Double quote to prevent globbing and word splitting

(shellcheck)

🪛 OSV Scanner (2.2.3)
deploy/operator/go.mod

[HIGH] 10-10: github.com/golang-jwt/jwt/v5 5.2.1: Excessive memory allocation during header parsing in github.com/golang-jwt/jwt

(GO-2025-3553)


[HIGH] 10-10: github.com/golang-jwt/jwt/v5 5.2.1: jwt-go allows excessive memory allocation during header parsing

(GHSA-mh63-6h87-95cp)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: e2e-tests (ubuntu-24.04-arm)
  • GitHub Check: e2e-tests (ubuntu-24.04)
  • GitHub Check: e2e-tests-release-0-7
  • GitHub Check: tests
  • GitHub Check: deploy-kind
  • GitHub Check: lint-go
🔇 Additional comments (34)
hack/kind_cluster.yaml (1)

23-31: Port mapping changes look good.

The replica numbering in comments (lines 23, 26, 29) is clear and consistent. The sequential NodePort (30011–30013) and host port (8083–8085) mappings are well-organized and support the multi-router deployment model described in the PR objectives.

internal/config/grpc.go (1)

24-31: Nice cleanup on wrapping the policy, and error context.

The explicit EnforcementPolicy var and wrapped error improve clarity.

internal/config/types.go (1)

58-61: Comment accurately identifies the default value mismatch; decide on implementation approach.

The bool zero-value for PermitWithoutStream when omitted from config is false, but the comment claims "Default: true". There is no loader-side defaulting logic (unlike MinTime in internal/config/grpc.go:20-22), so the documented default doesn't match actual behavior.

However, evidence suggests the intent is true: kubebuilder annotation (jumpstarter_types.go:276), hardcoded true values throughout the codebase, and fallback default in config.go:87.

Choose one approach:

  • Update comment to "Default: false" (if omitted field should be false), or
  • Make field *bool and default to true when nil (matches documented intent and operator behavior), or
  • Add defaulting logic in internal/config/grpc.go (but loses ability to distinguish omitted vs. explicit false).

Recommend option 2 for consistency with existing operator schema and intent.

internal/service/controller_service.go (1)

748-748: LGTM! Improved observability.

The log message now explicitly states the port number, improving operational clarity and making it easier to verify service startup configuration.

internal/service/router_service.go (1)

138-138: LGTM! Consistent observability improvement.

The log message enhancement matches the change in controller_service.go, providing better visibility into service configuration at startup.

deploy/operator/api/v1alpha1/zz_generated.deepcopy.go (3)

4-4: LGTM! Auto-generated copyright year update.

The copyright year update is appropriate for 2025.


54-81: LGTM! Properly generated deepcopy methods.

The auto-generated DeepCopyInto and DeepCopy methods for ClusterIPConfig correctly handle nil-safe copying of the Annotations and Labels maps.


126-130: LGTM! Correct integration with Endpoint deepcopy.

The Endpoint.DeepCopyInto method properly handles the new optional ClusterIP field with nil-safe deepcopy.

deploy/operator/internal/log/levels.go (1)

1-18: LGTM! Clear and well-documented log level constants.

The log level constants follow controller-runtime's logr conventions and provide clear documentation for each verbosity level. This centralized definition will improve logging consistency across the codebase.

cmd/main.go (4)

60-64: LGTM! Standard versioning variables.

The version variables follow best practices for injecting build metadata via ldflags, enabling runtime version reporting.


71-93: LGTM! Robust namespace detection logic.

The namespace detection follows a sensible priority order (explicit env var → auto-detected from service account → fail). The logging at each step aids troubleshooting.


128-133: LGTM! Useful version logging at startup.

Version information at startup is valuable for debugging and operational tracking.


184-194: Namespace requirement properly enforced with intelligent fallback mechanism.

The code correctly enforces single-namespace operation: it first checks the NAMESPACE environment variable, then falls back to the Kubernetes service account namespace file (automatically mounted in every pod), and finally errors if neither is available. The deployment will work correctly in Kubernetes clusters without explicit NAMESPACE configuration, as the service account mount is automatically provided by Kubernetes.

deploy/operator/go.mod (1)

6-6: LGTM! Local dependency replace directive in place.

The direct dependency on jumpstarter-controller is properly managed with the replace directive on line 17, ensuring local development workflow.

deploy/operator/config/manager/manager.yaml (1)

66-67: LGTM! Image configuration updated appropriately.

The image reference now uses the official quay.io registry, and imagePullPolicy: IfNotPresent is a sensible default for the latest tag, reducing unnecessary pulls while still allowing updates.

deploy/operator/config/manager/kustomization.yaml (1)

3-8: LGTM! Proper Kustomize configuration.

The addition of apiVersion/kind metadata and the images block follows Kustomize best practices. The image override is consistent with the changes in manager.yaml and enables easy tag management for different environments.

.github/workflows/e2e.yaml (1)

22-28: LGTM!

The addition of a release-specific e2e test job is a good practice for ensuring compatibility with the release-0.7 version.

.github/workflows/build.yaml (2)

42-50: LGTM!

The build args collection correctly computes and exports version metadata. The shellcheck warnings about missing quotes are false positives in this context, as git rev-parse HEAD and date outputs don't contain characters that would cause word splitting or globbing issues.


92-95: LGTM!

Build args are correctly passed to the Docker build context, enabling version metadata injection into the built images.

deploy/operator/config/crd/kustomization.yaml (1)

6-10: LGTM!

The addition of new CRD bases aligns with the introduction of new custom resources (Client, Exporter, Lease, ExporterAccessPolicy) mentioned in the PR objectives.

Dockerfile (2)

5-7: LGTM!

Adding build arguments for version metadata enables proper version tracking in the built binaries.


14-17: LGTM!

Cache mounts will significantly improve build performance by persisting module downloads and build artifacts across builds.

Dockerfile.operator (1)

1-47: LGTM!

The operator Dockerfile correctly implements version metadata injection via ldflags and utilizes cache mounts for improved build performance. The multi-stage build follows best practices.

deploy/operator/cmd/main.go (3)

50-53: LGTM! Good practice for version tracking.

The build-time version information variables follow best practices and enable proper version tracking in production deployments.


98-103: LGTM! Helpful startup diagnostics.

Logging version information at startup provides valuable diagnostics for troubleshooting and version verification in production.


218-222: LGTM! Clean dependency injection.

The EndpointReconciler is properly wired into the JumpstarterReconciler using constructor injection, following good dependency management practices.

deploy/operator/config/crd/bases/jumpstarter.dev_exporteraccesspolicies.yaml (1)

1-166: LGTM! Well-structured CRD manifest.

The ExporterAccessPolicy CRD is properly defined with comprehensive OpenAPI schema validation, label selectors, and appropriate metadata annotations.

hack/deploy_with_operator.sh (2)

29-37: LGTM! Safe restart sequence.

The restart logic properly scales down, waits for pod deletion with timeout, then scales back up. The || true on the wait command prevents spurious failures if pods don't exist yet.


53-85: LGTM! Clean conditional endpoint configuration.

The endpoint configuration properly adapts to ingress vs nodeport modes with clear comments explaining the port mappings.

deploy/operator/api/v1alpha1/jumpstarter_types.go (1)

498-511: LGTM! Consistent ClusterIP configuration structure.

The ClusterIPConfig struct follows the same pattern as NodePortConfig and LoadBalancerConfig, maintaining consistency across service type configurations.

hack/utils (1)

92-133: LGTM! Robust resource readiness checking.

The wait logic properly handles asynchronous resource creation with reasonable timeouts and clear progress messages.

Makefile (3)

10-18: LGTM! Proper version metadata setup.

The version information variables and LDFLAGS are correctly configured to embed build metadata into binaries, following Go build best practices.


66-66: LGTM! Maintains CRD consistency.

Copying CRDs to the operator config directory ensures both Helm and operator deployment methods use the same CRD definitions.


120-124: LGTM! Version metadata propagated to container builds.

The build-args correctly pass version information to the Dockerfile, enabling the same metadata to be embedded in containerized builds.

github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
github.com/goccy/go-json v0.10.2 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang-jwt/jwt/v5 v5.2.1 // indirect
Copy link
Contributor

@coderabbitai coderabbitai bot Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify JWT library version for security vulnerability.

Static analysis flagged golang-jwt/jwt/v5 v5.2.1 with a HIGH severity vulnerability (GO-2025-3553, GHSA-mh63-6h87-95cp) related to excessive memory allocation during header parsing. Since this is a transitive dependency, check if a patched version is available and consider updating the direct dependencies that pull it in.


🏁 Script executed:

#!/bin/bash
# Check which direct dependencies require jwt and if newer versions are available
echo "=== Checking go.mod for direct dependencies that might pull in jwt ==="
rg -n "github.com/golang-jwt/jwt" deploy/operator/go.mod

echo ""
echo "=== Checking parent go.mod ==="
rg -n "github.com/golang-jwt/jwt" go.mod

echo ""
echo "=== Check latest version of jwt library ==="
curl -s https://api.github.com/repos/golang-jwt/jwt/releases/latest | jq -r '.tag_name'

Length of output: 264


I need to verify the vulnerability status. Let me search for details on the specific CVE and whether it's fixed in the newer version:

Update github.com/golang-jwt/jwt/v5 from v5.2.1 to v5.2.2 or later to patch HIGH severity vulnerability.

The current version v5.2.1 is affected by GHSA-mh63-6h87-95cp, which is fixed in v5.2.2. This vulnerability causes excessive memory allocation during header parsing when processing untrusted JWT tokens, with a CVSS score of 7.5 (High). Latest available version is v5.3.0. Update the direct dependencies that pull in JWT to resolve this transitive dependency issue.

🤖 Prompt for AI Agents
In deploy/operator/go.mod around line 57, the module
github.com/golang-jwt/jwt/v5 is pinned to v5.2.1 which is vulnerable
(GHSA-mh63-6h87-95cp); update it to at least v5.2.2 (preferably v5.3.0) to patch
the excessive memory allocation issue. Edit go.mod to bump the version to v5.2.2
or v5.3.0, run go get github.com/golang-jwt/jwt/v5@v5.3.0 (or @v5.2.2) and then
run go mod tidy and go test to ensure dependency resolution and tests pass; if
jwt is only a transitive dependency, identify and update the direct dependency
that requires the older jwt version or add a replace directive to force the
fixed version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will handle in a follow up PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mangelajo, understood! Since this is a HIGH severity security vulnerability, would you like me to create a GitHub issue to track updating github.com/golang-jwt/jwt/v5 to v5.2.2+ in the follow-up PR? That way it won't get lost.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes please, do it for both go.mods , the base project on the repository and deploy/operator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@mangelajo, I'll create a GitHub issue to track this security update for both go.mod files.


🏁 Script executed:

#!/bin/bash
# Create GitHub issue for JWT library security update

gh issue create \
  --title "Security: Update github.com/golang-jwt/jwt/v5 to fix HIGH severity vulnerability" \
  --body "## Summary
Update \`github.com/golang-jwt/jwt/v5\` from v5.2.1 to v5.2.2 or later (v5.3.0 available) to address HIGH severity vulnerability GHSA-mh63-6h87-95cp.

## Vulnerability Details
- **CVE**: GHSA-mh63-6h87-95cp
- **Severity**: HIGH (CVSS 7.5)
- **Issue**: Excessive memory allocation during header parsing when processing untrusted JWT tokens
- **Current Version**: v5.2.1 (vulnerable)
- **Fixed In**: v5.2.2+
- **Latest Version**: v5.3.0

## Affected Files
- \`go.mod\` (repository root)
- \`deploy/operator/go.mod\`

## Remediation Steps
Since this is a transitive dependency, identify and update the direct dependencies that pull it in, or add a replace directive to force the fixed version:

\`\`\`bash
# For root go.mod
cd <repo-root>
go get github.com/golang-jwt/jwt/v5@v5.3.0
go mod tidy
go test ./...

# For operator go.mod
cd deploy/operator
go get github.com/golang-jwt/jwt/v5@v5.3.0
go mod tidy
go test ./...
\`\`\`

## References
- Original PR: https://github.com/jumpstarter-dev/jumpstarter-controller/pull/170
- Discussion: https://github.com/jumpstarter-dev/jumpstarter-controller/pull/170#discussion_r2451864479

---
_This issue was created as a follow-up to the security review in PR #170 at the request of @mangelajo._" \
  --assignee mangelajo \
  --label security

Length of output: 88


Let me retry without skipping the repository clone:


🏁 Script executed:

#!/bin/bash
# Create GitHub issue for JWT library security update

gh issue create \
  --title "Security: Update github.com/golang-jwt/jwt/v5 to fix HIGH severity vulnerability" \
  --body "## Summary
Update \`github.com/golang-jwt/jwt/v5\` from v5.2.1 to v5.2.2 or later (v5.3.0 available) to address HIGH severity vulnerability GHSA-mh63-6h87-95cp.

## Vulnerability Details
- **CVE**: GHSA-mh63-6h87-95cp
- **Severity**: HIGH (CVSS 7.5)
- **Issue**: Excessive memory allocation during header parsing when processing untrusted JWT tokens
- **Current Version**: v5.2.1 (vulnerable)
- **Fixed In**: v5.2.2+
- **Latest Version**: v5.3.0

## Affected Files
- \`go.mod\` (repository root)
- \`deploy/operator/go.mod\`

## Remediation Steps
Since this is a transitive dependency, identify and update the direct dependencies that pull it in, or add a replace directive to force the fixed version:

\`\`\`bash
# For root go.mod
cd <repo-root>
go get github.com/golang-jwt/jwt/v5@v5.3.0
go mod tidy
go test ./...

# For operator go.mod
cd deploy/operator
go get github.com/golang-jwt/jwt/v5@v5.3.0
go mod tidy
go test ./...
\`\`\`

## References
- Original PR: https://github.com/jumpstarter-dev/jumpstarter-controller/pull/170
- Discussion: https://github.com/jumpstarter-dev/jumpstarter-controller/pull/170#discussion_r2451864479

---
_This issue was created as a follow-up to the security review in PR #170 at the request of @mangelajo._" \
  --assignee mangelajo \
  --label security

Length of output: 68


@mangelajo, done! I've created issue #175 to track updating github.com/golang-jwt/jwt/v5 in both go.mod files:

🔗 #175

The issue includes:

  • Vulnerability details (GHSA-mh63-6h87-95cp, HIGH severity)
  • Both affected go.mod files
  • Remediation steps for updating to v5.3.0
  • Backlinks to this PR and discussion

The issue has been assigned to you and labeled as security.

Comment on lines 124 to 135
existingService.Spec = desiredService.Spec
existingService.Annotations = desiredService.Annotations
existingService.Labels = desiredService.Labels

if err := r.Client.Update(ctx, existingService); err != nil {
return false, err
}
return true, nil
}

return false, nil
}
Copy link
Contributor

@coderabbitai coderabbitai bot Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve immutable Service fields on update.

Updating Spec wholesale can fail (clusterIP/clusterIPs/ipFamilies/ipFamilyPolicy are immutable). Copy them from the existing Service before Update.

-        existingService.Spec = desiredService.Spec
+        // Preserve immutable fields
+        desired := desiredService.Spec
+        desired.ClusterIP = existingService.Spec.ClusterIP
+        desired.ClusterIPs = existingService.Spec.ClusterIPs
+        desired.IPFamilies = existingService.Spec.IPFamilies
+        desired.IPFamilyPolicy = existingService.Spec.IPFamilyPolicy
+        existingService.Spec = desired
🤖 Prompt for AI Agents
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go lines
124-135: the code blindly replaces existingService.Spec with desiredService.Spec
which can fail because some Service fields (clusterIP, clusterIPs, ipFamilies,
ipFamilyPolicy) are immutable; before overwriting, copy those immutable fields
from existingService.Spec into desiredService.Spec (or preserve them on the
object you send to Update) so they remain unchanged, then assign/update; ensure
you handle nil slices/fields safely and only copy the specific immutable fields
rather than the whole Spec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are already doing that because we are taking the existing service, and copying the fields we want to mutate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mangelajo, I understand your point, but there's a subtle issue with the current implementation. Let me clarify:

Current flow:

  1. existingService is fetched from the cluster (lines 100-101) — it has all immutable fields like ClusterIP, ClusterIPs, IPFamilies, IPFamilyPolicy set
  2. NodePort is correctly preserved by copying from existingService to desiredService (lines 114-120) ✅
  3. Line 124: existingService.Spec = desiredService.Spec — This replaces the entire Spec, including the immutable fields

The problem:
desiredService is created fresh in methods like ReconcileEndpoint and typically doesn't have ClusterIP, ClusterIPs, IPFamilies, or IPFamilyPolicy set. When you assign desiredService.Spec to existingService.Spec, you're losing these immutable fields that were originally in existingService.

The fix:
Before the assignment on line 124, copy the immutable fields from existingService.Spec to desiredService.Spec:

// Preserve immutable fields
desiredService.Spec.ClusterIP = existingService.Spec.ClusterIP
desiredService.Spec.ClusterIPs = existingService.Spec.ClusterIPs
desiredService.Spec.IPFamilies = existingService.Spec.IPFamilies
desiredService.Spec.IPFamilyPolicy = existingService.Spec.IPFamilyPolicy

existingService.Spec = desiredService.Spec

This ensures that when you send the update to the Kubernetes API server, the immutable fields remain unchanged, preventing potential rejection errors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mangelajo, thank you for addressing the immutable Service fields preservation! This will prevent potential API server rejections when updating existing services.

🐰✨


If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

@mangelajo mangelajo requested a review from bennyz October 22, 2025 13:47
@mangelajo mangelajo force-pushed the operator-implementation branch from 1d170e0 to 0855c26 Compare October 24, 2025 09:26
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/build.yaml (1)

61-64: Release branch detection never matches due to quoted glob.

In bash, pattern matching on the RHS of [[ ... == pattern ]] requires it to be unquoted. Using "refs/heads/release-*" makes it a literal string, so release branch tags are never added. Fix both occurrences.

Apply this diff:

-          if [[ "${{ github.ref }}" == "refs/heads/release-*" ]]; then
+          if [[ "${{ github.ref }}" == refs/heads/release-* ]]; then
             RELEASE_BRANCH_NAME=$(basename "${{ github.ref }}")
             TAGS="$TAGS,${{ env.REGISTRY }}/${{ matrix.image_name }}:${RELEASE_BRANCH_NAME}"
           fi

And in the Helm step:

-          if [[ "${{ github.ref }}" == "refs/heads/release-*" ]]; then
+          if [[ "${{ github.ref }}" == refs/heads/release-* ]]; then
             RELEASE_BRANCH_NAME=$(basename "${{ github.ref }}")
             helm chart save jumpstarter-*.tgz ${{ env.QUAY_ORG }}/helm:${RELEASE_BRANCH_NAME}
             helm chart push ${{ env.QUAY_ORG }}/helm:${RELEASE_BRANCH_NAME}
           fi

Also applies to: 130-134

cmd/main.go (1)

218-231: Use watchNamespace instead of os.Getenv("NAMESPACE") for config key.

If the namespace was auto-detected from the service account, config.LoadConfiguration still uses the env var, which may be empty. This will fail to load the config.

Apply this diff:

-    client.ObjectKey{
-      Namespace: os.Getenv("NAMESPACE"),
-      Name:      "jumpstarter-controller",
-    },
+    client.ObjectKey{
+      Namespace: watchNamespace,
+      Name:      "jumpstarter-controller",
+    },
♻️ Duplicate comments (10)
internal/config/types.go (2)

30-32: Documentation mentions unsupported "30d" format.

This concern was already raised in a previous review. The comment example "30d" is not supported by time.ParseDuration. Either update the documentation to remove "30d" or extend ParseDuration to support day/week suffixes as previously suggested.


99-105: ParseDuration lacks support for duration formats mentioned in documentation.

This was previously flagged. The function delegates directly to time.ParseDuration, which rejects "d" (days) and "w" (weeks) suffixes mentioned in the documentation. Consider implementing the suggested enhancement to support these units or update all documentation to reflect the actual supported formats.

internal/config/grpc.go (1)

15-22: Defaulting logic incorrectly overrides explicit "0s" and uses magic number.

This issue was previously identified. The current code checks minTime == 0 after parsing, which means an explicit "0s" from the user will be overridden by the 1s default. Additionally, the magic number 1e9 should be replaced with time.Second for clarity. The fix should check ka.MinTime == "" before parsing to detect truly unset fields.

deploy/operator/config/rbac/role.yaml (1)

7-192: Address RBAC scope concerns from previous review.

The ClusterRole permissions remain very broad. Previous reviews identified several issues that should be addressed:

  1. Unnecessary get verb on status subresources (lines 32-36, 52-56, 99-102, 130-133, 188-192): The get verb is not valid for status subresources. Only update and patch work.

  2. Excessive delete permissions (lines 16, 43, 63): The delete verb on secrets, serviceaccounts, and services is rarely needed for operators and violates least-privilege principles.

  3. OpenShift-specific rules in base manifest (lines 160-192): The route.openshift.io rules should be moved to a kustomize overlay to avoid granting unexpected permissions on vanilla Kubernetes clusters.

deploy/operator/go.mod (1)

57-57: Update JWT library to patch HIGH severity vulnerability.

The github.com/golang-jwt/jwt/v5 v5.2.1 dependency has a HIGH severity vulnerability (GHSA-mh63-6h87-95cp, CVSS 7.5) causing excessive memory allocation during header parsing. This was flagged in a previous review and remains unresolved.

Update to at least v5.2.2 (preferably v5.3.0):

# If this is a transitive dependency, identify and update the direct dependency
go get -u github.com/golang-jwt/jwt/v5@v5.3.0
go mod tidy

Or add a replace directive if the direct dependency hasn't been updated:

replace github.com/golang-jwt/jwt/v5 => github.com/golang-jwt/jwt/v5 v5.3.0
deploy/operator/internal/controller/jumpstarter/rbac.go (2)

145-149: Remove delete verb from secrets - contradicts documented behavior.

Line 148 grants the delete verb on secrets, but this contradicts the documented behavior and violates least-privilege principles. According to comments in the controller code, secrets are only created and never deleted.

Apply this fix:

 {
   APIGroups: []string{""},
   Resources: []string{"secrets"},
-  Verbs:     []string{"get", "list", "watch", "create", "update", "patch", "delete"},
+  Verbs:     []string{"get", "list", "watch", "create", "update", "patch"},
 },

151-164: Add missing status and finalizers rules for exporteraccesspolicies.

The Role grants CRUD verbs for exporteraccesspolicies (line 152) but omits the status and finalizers subresources that are present for other CRDs (clients, exporters, leases). This inconsistency will prevent the controller from updating status and finalizers for exporteraccesspolicies.

Apply this fix:

 {
   APIGroups: []string{"jumpstarter.dev"},
-  Resources: []string{"clients/status", "exporters/status", "leases/status"},
+  Resources: []string{"clients/status", "exporters/status", "leases/status", "exporteraccesspolicies/status"},
   Verbs:     []string{"get", "update", "patch"},
 },
 {
   APIGroups: []string{"jumpstarter.dev"},
-  Resources: []string{"clients/finalizers", "exporters/finalizers", "leases/finalizers"},
+  Resources: []string{"clients/finalizers", "exporters/finalizers", "leases/finalizers", "exporteraccesspolicies/finalizers"},
   Verbs:     []string{"update"},
 },
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)

124-127: Controller Service selector should include controller=.

Selector only matches app=jumpstarter-controller; it can pick pods from other Jumpstarter CRs in the same namespace. Include controller=<owner.GetName()>.

-	podSelector := map[string]string{
-		"app": "jumpstarter-controller",
-	}
+	podSelector := map[string]string{
+		"app":        "jumpstarter-controller",
+		"controller": owner.GetName(),
+	}
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (2)

563-566: Typo: use --metrics-bind-address.

Single dash may not be parsed by pflag.

-								"-metrics-bind-address=:8080",
+								"--metrics-bind-address=:8080",

1034-1037: Service cleanup omits '-ing' and '-route' variants.

You create ClusterIP services with '-ing' and '-route' suffixes but never delete them on scale-down.

-	suffixes := []string{"", "-lb", "-np"}
+	suffixes := []string{"", "-lb", "-np", "-ing", "-route"}
🧹 Nitpick comments (24)
internal/config/types_test.go (3)

10-64: Consider expanding test coverage for round-trip validation.

The test validates only three fields after the YAML round-trip (Internal.Prefix, Keepalive.MinTime, and Keepalive.PermitWithoutStream). Consider adding assertions for other populated fields like K8s.Enabled, Provisioning.Enabled, TokenLifetime, and the remaining Keepalive fields to ensure complete round-trip integrity.


139-178: LGTM with optional enhancement.

The test effectively validates parsing of ConfigMap-style YAML. Consider verifying the actual label values (e.g., router-index equals "1") rather than just checking label existence for more thorough validation.


180-206: Consider expanding ParseDuration test coverage.

The tests validate basic functionality but miss several edge cases:

  • Whitespace handling (e.g., " 1s ", "\t10s")
  • Negative durations (e.g., "-1s")
  • Large values that might overflow
  • Day/week suffixes mentioned in documentation (though these currently fail, tests should either verify the error or document the limitation)

Additionally, note that the tests only cover formats supported by time.ParseDuration, which is inconsistent with the "30d" example in the documentation.

deploy/operator/internal/controller/testutils/envtest.go (1)

37-57: Add depth parameter validation and consider error signaling.

The function doesn't validate the depth parameter. A negative value would be accepted but produce an incorrect path, and callers won't be able to distinguish between "path not found" and "no subdirectories found" since both return an empty string.

Consider this enhancement:

 func GetFirstFoundEnvTestBinaryDir(depth int) string {
+	if depth < 0 {
+		logf.Log.Info("Invalid depth parameter", "depth", depth)
+		return ""
+	}
 	// Build the path based on depth
 	pathComponents := make([]string, 0, depth+2)
 	for i := 0; i < depth; i++ {
 		pathComponents = append(pathComponents, "..")
 	}
 	pathComponents = append(pathComponents, "bin", "k8s")
 
 	basePath := filepath.Join(pathComponents...)
 	entries, err := os.ReadDir(basePath)
 	if err != nil {
 		logf.Log.Error(err, "Failed to read directory", "path", basePath)
 		return ""
 	}
 	for _, entry := range entries {
 		if entry.IsDir() {
 			return filepath.Join(basePath, entry.Name())
 		}
 	}
+	logf.Log.Info("No ENVTEST binary directories found", "path", basePath)
 	return ""
 }
.github/workflows/build.yaml (1)

42-51: Quote GITHUB_OUTPUT redirections (SC2086) and use printf for robustness.

Avoid word-splitting/globbing and handle newlines safely.

Apply this diff:

-          echo "git_commit=${GIT_COMMIT}" >> $GITHUB_OUTPUT
-          echo "build_date=${BUILD_DATE}" >> $GITHUB_OUTPUT
-          echo "GIT_COMMIT=${GIT_COMMIT}"
-          echo "BUILD_DATE=${BUILD_DATE}"
+          printf 'git_commit=%s\n' "$GIT_COMMIT" >> "$GITHUB_OUTPUT"
+          printf 'build_date=%s\n' "$BUILD_DATE" >> "$GITHUB_OUTPUT"
+          printf 'GIT_COMMIT=%s\n' "$GIT_COMMIT"
+          printf 'BUILD_DATE=%s\n' "$BUILD_DATE"

Please re-run actionlint/shellcheck locally or in CI to confirm warnings are resolved.

deploy/operator/internal/controller/jumpstarter/suite_test.go (1)

74-76: Verify envtest binaries discovery across IDE/CI.

Depth 5 should resolve to bin/k8s under repo root. If running from different CWDs, consider falling back through multiple depths (e.g., 3..6) before giving up.

Would you like a helper to probe a depth range and log which one resolved?

deploy/operator/config/crd/bases/jumpstarter.dev_exporters.yaml (1)

1-162: CRD schema looks sound; consider minor polish.

  • Quote the Unknown enum for consistency.
  • Optionally add additionalPrinterColumns for endpoint/lastSeen to improve kubectl UX.

Example tweaks:

-                      - Unknown
+                      - "Unknown"

And later (optional):

spec:
  versions:
  - name: v1alpha1
    additionalPrinterColumns:
    - name: Endpoint
      type: string
      jsonPath: .status.endpoint
    - name: LastSeen
      type: date
      jsonPath: .status.lastSeen
cmd/main.go (1)

155-158: Comment contradicts behavior; clarify single-namespace requirement.

The comment suggests “watch all namespaces,” but the code exits if none is set. Update the comment to reflect the enforced single-namespace mode since 0.8.0.

Apply this diff:

-// Get the namespace to watch. Try to auto-detect from the pod's service account,
-// fall back to NAMESPACE environment variable, or watch all namespaces if neither is available
+// Get the namespace to watch. Try NAMESPACE env var first, then the pod's service account.
+// Since 0.8.0 single-namespace mode is required; if none is detected the process exits.
hack/deploy_with_operator.sh (3)

97-111: Avoid emitting empty image fields; gate on IMAGE_REPO being set

If IMAGE_REPO is unset/empty, the CR will contain empty image values, potentially overriding defaults. Gate the image fields or omit them when empty.

Apply:

-  controller:
-    image: ${IMAGE_REPO}
-    imagePullPolicy: IfNotPresent
+  controller:
+$( [ -n "${IMAGE_REPO:-}" ] && printf "    image: %s\n    imagePullPolicy: IfNotPresent\n" "${IMAGE_REPO}" )

Repeat similarly for routers (Lines 109-111).


20-23: Guard image loading when variables are unset

If IMG/OPERATOR_IMG are empty, kind_load_image will fail.

-kind_load_image "${IMG}"
-kind_load_image "${OPERATOR_IMG}"
+[ -n "${IMG:-}" ] && kind_load_image "${IMG}"
+[ -n "${OPERATOR_IMG:-}" ] && kind_load_image "${OPERATOR_IMG}"

39-44: Increase readiness timeout for operator

120s is tight on CI. Consider 300s to reduce flakes.

-  --timeout=120s
+  --timeout=300s
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints_test.go (3)

90-101: Assert OwnerReferences to ensure GC behavior

Add an assertion that created Services are controlled by the owner object.

-Expect(service.Spec.Type)...
+Expect(service.Spec.Type)...
+Expect(metav1.IsControlledBy(service, owner)).To(BeTrue())

Apply to ClusterIP, LoadBalancer, and NodePort verifications.

Also applies to: 124-135, 160-171


167-171: Validate explicit NodePort allocation

Since endpoint.NodePort.Port is specified (30090), assert the Service exposes that NodePort to catch allocator drift.

-Expect(service.Spec.Selector["app"]).To(Equal("jumpstarter-controller"))
+Expect(service.Spec.Selector["app"]).To(Equal("jumpstarter-controller"))
+Expect(service.Spec.Ports).To(HaveLen(1))
+Expect(service.Spec.Ports[0].NodePort).To(Equal(int32(30090)))

257-272: Namespace lifecycle: prefer DeferCleanup or delete namespace

Avoid test cross‑talk. Use Ginkgo DeferCleanup to remove the test namespace or explicitly delete it in AfterEach.

-BeforeEach(func() {
+BeforeEach(func() {
   ...
-  err := k8sClient.Create(ctx, ns)
+  err := k8sClient.Create(ctx, ns)
   ...
-})
+  DeferCleanup(func() { _ = k8sClient.Delete(ctx, ns) })
+})

Also applies to: 347-362

hack/utils (3)

6-11: Make get_script_dir portable (macOS lacks readlink -f)

Provide a fallback to realpath/python for macOS.

-get_script_dir() {
-  dirname "$(readlink -f "$0")"
-}
+get_script_dir() {
+  if command -v readlink >/dev/null 2>&1 && readlink -f "$0" >/dev/null 2>&1; then
+    dirname "$(readlink -f "$0")"
+  elif command -v realpath >/dev/null 2>&1; then
+    dirname "$(realpath "$0")"
+  else
+    python - <<'PY'
+import os,sys
+print(os.path.dirname(os.path.realpath(sys.argv[0])))
+PY
+  fi
+}

63-86: Ingress install: add OS guard for ip_tables and pin manifest version

lsmod/ip_tables check fails on macOS; remote HEAD is not reproducible.

-  lsmod | grep ip_tables || \
-    (echo "ip_tables module not loaded needed by nginx ingress, please run 'sudo modprobe ip_tables'" && exit 1)
+  if [[ "$(uname -s)" == "Linux" ]]; then
+    lsmod | grep -q '^ip_tables' || {
+      echo "ip_tables kernel module required for ingress on Linux. Run: sudo modprobe ip_tables"; exit 1; }
+  fi
-  kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml
+  kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/helm-chart-4.11.2/deploy/static/provider/kind/deploy.yaml

155-160: Fail fast if GRPC endpoints are unset

Add checks to avoid grpcurl calls with empty endpoints.

check_grpc_endpoints() {
+  : "${GRPC_ENDPOINT:?GRPC_ENDPOINT not set}"
+  : "${GRPC_ROUTER_ENDPOINT:?GRPC_ROUTER_ENDPOINT not set}"
  echo -e "${GREEN}Waiting for grpc endpoints to be ready:${NC}"
  wait_for_grpc_endpoint "${GRPC_ENDPOINT}"
  wait_for_grpc_endpoint "${GRPC_ROUTER_ENDPOINT}"
}

Also applies to: 169-176

deploy/operator/Makefile (1)

201-211: Harden sed for cross-Dockerfile FROM insertion

Use a simpler, portable pattern to prefix the first FROM only.

- sed -e '1 s/\(^FROM\)/FROM --platform=\$$\{BUILDPLATFORM\}/; t' -e ' 1,// s//FROM --platform=\$$\{BUILDPLATFORM\}/' ../../Dockerfile.operator > ../../Dockerfile.operator.cross
+ sed -e '0,/^FROM/{s//FROM --platform=\${BUILDPLATFORM}/}' ../../Dockerfile.operator > ../../Dockerfile.operator.cross

Based on learnings.

hack/deploy_with_helm.sh (2)

23-29: Use --set-string for endpoints to avoid type casting issues.

Addresses like host:port can be parsed as numbers/booleans by Helm’s --set. Use --set-string for endpoint-like values.

-HELM_SETS="${HELM_SETS} --set jumpstarter-controller.grpc.endpoint=${GRPC_ENDPOINT}"
-HELM_SETS="${HELM_SETS} --set jumpstarter-controller.grpc.routerEndpoint=${GRPC_ROUTER_ENDPOINT}"
+HELM_SETS="${HELM_SETS} --set-string jumpstarter-controller.grpc.endpoint=${GRPC_ENDPOINT}"
+HELM_SETS="${HELM_SETS} --set-string jumpstarter-controller.grpc.routerEndpoint=${GRPC_ROUTER_ENDPOINT}"

2-2: Harden shell flags.

Prevent unbound variable surprises from sourced files.

-set -eo pipefail
+set -euo pipefail
Makefile (1)

180-182: Parallel deploy may race prerequisites.

Running deploy-with-operator with -j can start docker-build, build-operator, cluster, grpcurl concurrently. If build-operator assumes CRDs or cluster state, this can flake.

  • Keep -j for independent compile steps, but serialize cluster and deploy script:
    • Split into prep (builds) and apply (cluster+deploy) targets; run apply without -j.
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (2)

70-76: Preserve NodePorts by port name, not index.

Index-based copy breaks if ports are reordered or counts change, reassigning wrong NodePorts.

-			if service.Spec.Type == corev1.ServiceTypeNodePort || service.Spec.Type == corev1.ServiceTypeLoadBalancer {
-				for i := range existingService.Spec.Ports {
-					if existingService.Spec.Ports[i].NodePort != 0 && i < len(service.Spec.Ports) {
-						service.Spec.Ports[i].NodePort = existingService.Spec.Ports[i].NodePort
-					}
-				}
-			}
+			if service.Spec.Type == corev1.ServiceTypeNodePort || service.Spec.Type == corev1.ServiceTypeLoadBalancer {
+				byName := make(map[string]int32, len(existingService.Spec.Ports))
+				for _, p := range existingService.Spec.Ports {
+					if p.NodePort != 0 && p.Name != "" {
+						byName[p.Name] = p.NodePort
+					}
+				}
+				for i := range service.Spec.Ports {
+					if np, ok := byName[service.Spec.Ports[i].Name]; ok {
+						service.Spec.Ports[i].NodePort = np
+					}
+				}
+			}

Also applies to: 86-86


104-110: Log the reconciled selector.

You log service.Spec.Selector, but the API object you persist is existingService. Log the selector from the object being updated.

-		"selector", service.Spec.Selector,
+		"selector", existingService.Spec.Selector,
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)

326-330: Validate NodePort range when offsetting by replica.

Blindly adding the replica index can exceed 32767. Clamp or skip setting when out of range to avoid API errors.

-					svcPort.NodePort = endpoint.NodePort.Port + i
+					next := endpoint.NodePort.Port + i
+					if next >= 30000 && next <= 32767 {
+						svcPort.NodePort = next
+					} else {
+						// Leave 0 to let Kubernetes assign; logs help in E2E
+						log.V(loglevels.LevelDebug).Info("NodePort out of range; letting K8s assign", "requested", next, "replica", i)
+					}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d170e0 and 0855c26.

⛔ Files ignored due to path filters (1)
  • deploy/operator/go.sum is excluded by !**/*.sum
📒 Files selected for processing (44)
  • .github/workflows/build.yaml (2 hunks)
  • .github/workflows/e2e.yaml (1 hunks)
  • .github/workflows/pr-kind.yaml (1 hunks)
  • Dockerfile (2 hunks)
  • Dockerfile.operator (1 hunks)
  • Makefile (7 hunks)
  • cmd/main.go (5 hunks)
  • deploy/operator/Dockerfile (0 hunks)
  • deploy/operator/Makefile (7 hunks)
  • deploy/operator/api/v1alpha1/jumpstarter_types.go (3 hunks)
  • deploy/operator/api/v1alpha1/zz_generated.deepcopy.go (3 hunks)
  • deploy/operator/cmd/main.go (3 hunks)
  • deploy/operator/config/crd/bases/jumpstarter.dev_clients.yaml (1 hunks)
  • deploy/operator/config/crd/bases/jumpstarter.dev_exporteraccesspolicies.yaml (1 hunks)
  • deploy/operator/config/crd/bases/jumpstarter.dev_exporters.yaml (1 hunks)
  • deploy/operator/config/crd/bases/jumpstarter.dev_leases.yaml (1 hunks)
  • deploy/operator/config/crd/bases/operator.jumpstarter.dev_jumpstarters.yaml (3 hunks)
  • deploy/operator/config/crd/kustomization.yaml (1 hunks)
  • deploy/operator/config/manager/kustomization.yaml (1 hunks)
  • deploy/operator/config/manager/manager.yaml (1 hunks)
  • deploy/operator/config/rbac/role.yaml (2 hunks)
  • deploy/operator/go.mod (3 hunks)
  • deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/endpoints/endpoints_test.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/endpoints/suite_test.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller_test.go (4 hunks)
  • deploy/operator/internal/controller/jumpstarter/rbac.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/suite_test.go (3 hunks)
  • deploy/operator/internal/controller/jumpstarter_controller.go (0 hunks)
  • deploy/operator/internal/controller/testutils/envtest.go (1 hunks)
  • deploy/operator/internal/log/levels.go (1 hunks)
  • deploy/operator/internal/utils/maps.go (1 hunks)
  • hack/deploy_vars (1 hunks)
  • hack/deploy_with_helm.sh (2 hunks)
  • hack/deploy_with_operator.sh (1 hunks)
  • hack/get_ext_ip.sh (0 hunks)
  • hack/kind_cluster.yaml (1 hunks)
  • hack/utils (1 hunks)
  • internal/config/grpc.go (1 hunks)
  • internal/config/types.go (1 hunks)
  • internal/config/types_test.go (1 hunks)
  • internal/service/controller_service.go (1 hunks)
  • internal/service/router_service.go (1 hunks)
💤 Files with no reviewable changes (3)
  • hack/get_ext_ip.sh
  • deploy/operator/Dockerfile
  • deploy/operator/internal/controller/jumpstarter_controller.go
🚧 Files skipped from review as they are similar to previous changes (14)
  • .github/workflows/e2e.yaml
  • deploy/operator/internal/controller/jumpstarter/endpoints/suite_test.go
  • deploy/operator/internal/log/levels.go
  • deploy/operator/config/crd/bases/jumpstarter.dev_leases.yaml
  • hack/deploy_vars
  • deploy/operator/config/crd/bases/jumpstarter.dev_exporteraccesspolicies.yaml
  • internal/service/controller_service.go
  • Dockerfile.operator
  • deploy/operator/api/v1alpha1/zz_generated.deepcopy.go
  • deploy/operator/config/crd/kustomization.yaml
  • Dockerfile
  • hack/kind_cluster.yaml
  • deploy/operator/config/manager/manager.yaml
  • deploy/operator/config/crd/bases/jumpstarter.dev_clients.yaml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-10T13:32:54.936Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#169
File: deploy/operator/Makefile:62-67
Timestamp: 2025-10-10T13:32:54.936Z
Learning: Podman now supports the buildx command, making it compatible with Docker's buildx syntax for multi-platform builds. No special gating or conditional logic is needed when using buildx with CONTAINER_TOOL=podman in Makefiles.

Applied to files:

  • deploy/operator/Makefile
🧬 Code graph analysis (11)
internal/config/grpc.go (1)
internal/config/types.go (3)
  • Grpc (47-49)
  • Keepalive (53-81)
  • ParseDuration (100-105)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints_test.go (2)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (2)
  • Reconciler (34-37)
  • NewReconciler (40-45)
deploy/operator/api/v1alpha1/jumpstarter_types.go (4)
  • Endpoint (380-414)
  • ClusterIPConfig (499-511)
  • LoadBalancerConfig (476-496)
  • NodePortConfig (454-473)
deploy/operator/internal/controller/jumpstarter/rbac.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (54-58)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Jumpstarter (525-531)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (5)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • Reconciler (34-37)
deploy/operator/api/v1alpha1/jumpstarter_types.go (2)
  • Jumpstarter (525-531)
  • Endpoint (380-414)
internal/log/levels.go (2)
  • Info (61-63)
  • Error (51-53)
deploy/operator/internal/log/levels.go (2)
  • LevelTrace (17-17)
  • LevelDebug (12-12)
internal/config/types.go (9)
  • Config (11-15)
  • Provisioning (42-44)
  • Grpc (47-49)
  • Keepalive (53-81)
  • Authentication (19-23)
  • Internal (26-33)
  • K8s (36-39)
  • Router (87-87)
  • RouterEntry (90-97)
internal/service/router_service.go (1)
internal/log/levels.go (1)
  • Info (61-63)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Endpoint (380-414)
internal/config/types_test.go (1)
internal/config/types.go (10)
  • Config (11-15)
  • Authentication (19-23)
  • Internal (26-33)
  • K8s (36-39)
  • Provisioning (42-44)
  • Grpc (47-49)
  • Keepalive (53-81)
  • Router (87-87)
  • RouterEntry (90-97)
  • ParseDuration (100-105)
deploy/operator/cmd/main.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (54-58)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • NewReconciler (40-45)
deploy/operator/internal/controller/jumpstarter/suite_test.go (1)
deploy/operator/internal/controller/testutils/envtest.go (1)
  • GetFirstFoundEnvTestBinaryDir (37-57)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller_test.go (2)
deploy/operator/api/v1alpha1/jumpstarter_types.go (5)
  • JumpstarterSpec (130-154)
  • ControllerConfig (192-230)
  • GRPCConfig (244-260)
  • Endpoint (380-414)
  • RoutersConfig (158-187)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • NewReconciler (40-45)
internal/config/types.go (1)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
  • JWTAuthenticator (174-180)
🪛 actionlint (1.7.8)
.github/workflows/build.yaml

44-44: shellcheck reported issue in this script: SC2086:info:3:36: Double quote to prevent globbing and word splitting

(shellcheck)


44-44: shellcheck reported issue in this script: SC2086:info:4:36: Double quote to prevent globbing and word splitting

(shellcheck)

🪛 OSV Scanner (2.2.3)
deploy/operator/go.mod

[HIGH] 10-10: github.com/golang-jwt/jwt/v5 5.2.1: Excessive memory allocation during header parsing in github.com/golang-jwt/jwt

(GO-2025-3553)


[HIGH] 10-10: github.com/golang-jwt/jwt/v5 5.2.1: jwt-go allows excessive memory allocation during header parsing

(GHSA-mh63-6h87-95cp)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: tests
  • GitHub Check: e2e-tests-release-0-7
  • GitHub Check: e2e-tests (ubuntu-24.04-arm)
  • GitHub Check: e2e-tests (ubuntu-24.04)
  • GitHub Check: deploy-kind
  • GitHub Check: deploy-with-operator
  • GitHub Check: lint-go
🔇 Additional comments (20)
internal/service/router_service.go (1)

138-138: LGTM! Improved log message clarity.

The updated log message now includes the port number, making it more informative for operational monitoring. The port 8083 correctly matches the listener configuration on line 133.

internal/config/types_test.go (1)

66-137: LGTM!

Comprehensive test that validates Router YAML round-trips with good coverage of different label scenarios (no labels, single label, multiple labels). The logging on line 93 is helpful for debugging.

deploy/operator/api/v1alpha1/jumpstarter_types.go (1)

498-511: LGTM!

The ClusterIPConfig type follows Kubernetes conventions and is consistent with other service configuration types in the file (NodePortConfig, LoadBalancerConfig, etc.).

deploy/operator/config/crd/bases/operator.jumpstarter.dev_jumpstarters.yaml (1)

846-883: LGTM!

The CRD schema correctly reflects the API changes from jumpstarter_types.go:

  • hostname field replaced with address field
  • Address validation pattern supports hostnames, IPv4, IPv6, ports, and templating
  • ClusterIP configuration properly defined with standard Kubernetes metadata fields

The schema is consistent across all three endpoint groups (controller.grpc, controller.restApi, routers.grpc).

deploy/operator/internal/controller/jumpstarter/rbac.go (1)

22-48: Good improvement on reconciliation logic.

The ServiceAccount reconciliation now properly implements the mutate function (lines 39-42) to update labels and annotations, addressing the previous review concern about empty mutate functions. The pattern is correctly applied to Role and RoleBinding as well.

deploy/operator/config/manager/kustomization.yaml (1)

1-8: LGTM!

The kustomization file correctly adds metadata and image override configuration. The image override to quay.io/jumpstarter-dev/jumpstarter-operator:latest aligns with the operator deployment strategy introduced in this PR.

.github/workflows/pr-kind.yaml (1)

21-30: LGTM!

The new deploy-with-operator job correctly parallels the existing deployment workflow and will test the operator-based deployment path. The fetch-depth: 0 is appropriate for version metadata generation referenced in the PR objectives.

.github/workflows/build.yaml (1)

92-95: Build args wiring looks correct.

The build-args align with Dockerfiles expecting GIT_VERSION, GIT_COMMIT, BUILD_DATE.

Confirm both Dockerfiles (root and deploy/operator) declare matching ARG names.

deploy/operator/internal/controller/jumpstarter/suite_test.go (1)

69-71: CRD path depth looks right for operator tree.

Going up three dirs resolves to deploy/operator/config/crd/bases; LGTM.

cmd/main.go (1)

183-195: Single-namespace scoping and leader-election namespace look good.

Manager cache scoping and LeaderElectionNamespace set to the watch namespace are appropriate for the new requirement.

Ensure RBAC and Lease namespace are provisioned in the same namespace (or adjust if you intend a dedicated leader election namespace).

deploy/operator/cmd/main.go (2)

98-104: Version metadata logging: LGTM.

Matches build args and helps with supportability.


218-222: EndpointReconciler wiring: LGTM.

Constructor injection is clean and keeps JumpstarterReconciler testable.

deploy/operator/internal/controller/jumpstarter/jumpstarter_controller_test.go (2)

112-116: LGTM: EndpointReconciler wiring

Constructor injection looks correct and keeps concerns separated.


168-171: No issues found—ldflags variables properly configured

The verification confirms all three target symbols exist in cmd/main.go and are correctly declared for ldflags injection. Variables version, gitCommit, and buildDate are defined at the package level (lines 61-63) with proper string types and sensible defaults, allowing the build system to override them via -X flags without errors.

deploy/operator/Makefile (3)

32-54: LGTM: image registry switch and IMG default

Switch to quay.io namespace is consistent; defaults align with kustomize image substitution.


199-209: Podman buildx: good usage

Using podman buildx is supported; build args passed through correctly.


168-171: No changes needed—variables are correctly defined.

The ldflags variables version, gitCommit, and buildDate are already properly defined as package-level variables in deploy/operator/cmd/main.go and are being correctly targeted by the -X main.version, -X main.gitCommit, and -X main.buildDate injection flags in the Makefile.

hack/deploy_with_operator.sh (1)

52-67: Code matches IngressConfig field specification—no action required

Verification confirms the YAML field class: "" in the deploy script correctly corresponds to the IngressConfig struct's Class field with json tag "class,omitempty" (jumpstarter_types.go, line 442). The field will be properly recognized and not ignored.

deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)

287-304: Multiple controller endpoints write the same Service names.

The loop uses a constant servicePort.Name ("controller-grpc"), so all endpoints reconcile the same set of services. If multiple controller endpoints are supported, consider unique names per endpoint or dedup at the spec level.

Would you prefer to enforce a single controller endpoint or append an index to servicePort.Name (e.g., controller-grpc-0)? I can provide a patch.

hack/deploy_with_helm.sh (1)

37-40: No action needed — helper is properly defined and sourced.

hack/utils defines kind_load_image() with full implementation, and hack/deploy_with_helm.sh sources it on line 6 before calling it on line 40. The function will be available at runtime.

Likely an incorrect or invalid review comment.

@mangelajo mangelajo force-pushed the operator-implementation branch 2 times, most recently from f1f9a3b to b79e1e8 Compare October 24, 2025 09:56
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (5)
deploy/operator/go.mod (1)

58-58: Patch vulnerable golang-jwt/jwt/v5 (GHSA-mh63-6h87-95cp).

Version v5.2.1 is flagged HIGH for excessive memory allocation during header parsing. Force >= v5.2.2 (or latest) via an explicit require or replace to avoid pulling the vulnerable indirect.

Two equivalent fixes (pick one):

@@
 require (
-    github.com/golang-jwt/jwt/v5 v5.2.1 // indirect
+    github.com/golang-jwt/jwt/v5 v5.2.2 // indirect
 )

or

@@
 replace github.com/jumpstarter-dev/jumpstarter-controller => ../../
+replace github.com/golang-jwt/jwt/v5 => github.com/golang-jwt/jwt/v5 v5.2.2

Please confirm the latest patched version and whether any transitive pulls pin it back. If helpful, I can generate a script to list the jwt module graph edges.

deploy/operator/internal/controller/jumpstarter/rbac.go (2)

146-149: Least‑privilege: remove “delete” on Secrets.

Controller logic never deletes Secrets (and comments say it shouldn’t). Drop “delete” from this rule.

-    Verbs:     []string{"get", "list", "watch", "create", "update", "patch", "delete"},
+    Verbs:     []string{"get", "list", "watch", "create", "update", "patch"},

156-164: Add exporteraccesspolicies status/finalizers.

Status and finalizers for exporteraccesspolicies are missing; other CRDs include them. Add for parity so controllers can patch status and manage finalizers.

       {
         APIGroups: []string{"jumpstarter.dev"},
-        Resources: []string{"clients/status", "exporters/status", "leases/status"},
+        Resources: []string{"clients/status", "exporters/status", "leases/status", "exporteraccesspolicies/status"},
         Verbs:     []string{"get", "update", "patch"},
       },
       {
         APIGroups: []string{"jumpstarter.dev"},
-        Resources: []string{"clients/finalizers", "exporters/finalizers", "leases/finalizers"},
+        Resources: []string{"clients/finalizers", "exporters/finalizers", "leases/finalizers", "exporteraccesspolicies/finalizers"},
         Verbs:     []string{"update"},
       },
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (2)

576-579: Typo: use “--metrics-bind-address”.

-                                "-metrics-bind-address=:8080",
+                                "--metrics-bind-address=:8080",

1060-1062: Service cleanup misses “-ing” and “-route” variants.

Router reconciliation creates “-ing” and “-route” ClusterIP services, but cleanup doesn’t delete them when scaling down.

-    suffixes := []string{"", "-lb", "-np"}
+    suffixes := []string{"", "-lb", "-np", "-ing", "-route"}
🧹 Nitpick comments (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (2)

197-202: Use structured logger instead of fmt.Printf for diffs.

Prefer log.V with structured fields to integrate with controller-runtime logging.

-            fmt.Printf("\n=== Deployment differences detected ===\n")
-            fmt.Printf("Name: %s\n", existingDeployment.Name)
-            fmt.Printf("Namespace: %s\n", existingDeployment.Namespace)
-            fmt.Printf("\n%s\n", diff)
-            fmt.Printf("========================================\n\n")
+            log.V(1).Info("Deployment differences detected",
+                "name", existingDeployment.Name,
+                "namespace", existingDeployment.Namespace,
+                "diff", diff)

And for the router block:

-            fmt.Printf("\n=== Router deployment differences detected ===\n")
-            fmt.Printf("Name: %s\n", existingDeployment.Name)
-            fmt.Printf("Namespace: %s\n", existingDeployment.Namespace)
-            fmt.Printf("Replica: %d\n", i)
-            fmt.Printf("\n%s\n", diff)
-            fmt.Printf("==============================================\n\n")
+            log.V(1).Info("Router deployment differences detected",
+                "name", existingDeployment.Name,
+                "namespace", existingDeployment.Namespace,
+                "replica", i,
+                "diff", diff)

Also applies to: 255-261


67-71: Least‑privilege: consider dropping “delete” on Secrets in manager RBAC.

The controller does not delete Secrets; aligning kubebuilder RBAC with actual behavior tightens posture.

-// +kubebuilder:rbac:groups="",resources=secrets,verbs=get;list;watch;create;update;patch;delete
+// +kubebuilder:rbac:groups="",resources=secrets,verbs=get;list;watch;create;update;patch
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0855c26 and b79e1e8.

📒 Files selected for processing (5)
  • deploy/operator/go.mod (3 hunks)
  • deploy/operator/internal/controller/jumpstarter/compare.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/rbac.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
deploy/operator/internal/controller/jumpstarter/compare.go (1)
internal/log/levels.go (1)
  • Info (61-63)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (5)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • Reconciler (34-37)
deploy/operator/api/v1alpha1/jumpstarter_types.go (2)
  • Jumpstarter (525-531)
  • Endpoint (380-414)
internal/log/levels.go (2)
  • Info (61-63)
  • Error (51-53)
deploy/operator/internal/log/levels.go (2)
  • LevelTrace (17-17)
  • LevelDebug (12-12)
internal/config/types.go (9)
  • Config (11-15)
  • Provisioning (42-44)
  • Grpc (47-49)
  • Keepalive (53-81)
  • Authentication (19-23)
  • Internal (26-33)
  • K8s (36-39)
  • Router (87-87)
  • RouterEntry (90-97)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Endpoint (380-414)
deploy/operator/internal/controller/jumpstarter/rbac.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (54-58)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Jumpstarter (525-531)
🪛 OSV Scanner (2.2.3)
deploy/operator/go.mod

[HIGH] 10-10: github.com/golang-jwt/jwt/v5 5.2.1: Excessive memory allocation during header parsing in github.com/golang-jwt/jwt

(GO-2025-3553)


[HIGH] 10-10: github.com/golang-jwt/jwt/v5 5.2.1: jwt-go allows excessive memory allocation during header parsing

(GHSA-mh63-6h87-95cp)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: deploy-with-operator
  • GitHub Check: deploy-kind
  • GitHub Check: e2e-tests (ubuntu-24.04-arm)
  • GitHub Check: e2e-tests (ubuntu-24.04)
  • GitHub Check: e2e-tests-release-0-7
🔇 Additional comments (2)
deploy/operator/go.mod (1)

3-3: Verify Go toolchain version.

The module targets go 1.24.0. Ensure CI/build images actually run Go 1.24.x; otherwise bump down to the current CI version to avoid toolchain mismatch.

If you want, I can generate a quick script to read CI config and check go versions across workflows.

deploy/operator/internal/controller/jumpstarter/compare.go (1)

247-256: Nice selective-diff logging.

Granular update detection with selective comparisons looks solid and low‑noise. No changes requested.

@mangelajo mangelajo force-pushed the operator-implementation branch from b79e1e8 to b60f0cc Compare October 24, 2025 11:35
some objects stay on a endless reconciliation loops because we
don't configure all the fields and they will differ later when
k8s updates them to defaults.

this commit fixes such problem.
@mangelajo mangelajo force-pushed the operator-implementation branch from b60f0cc to 77df0f4 Compare October 24, 2025 11:38
@mangelajo mangelajo force-pushed the operator-implementation branch from 094920f to 91f4bec Compare October 24, 2025 12:01
@mangelajo mangelajo force-pushed the operator-implementation branch from 91f4bec to 9392eb9 Compare October 24, 2025 12:04
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
deploy/operator/internal/controller/jumpstarter/rbac.go (1)

203-211: Add missing exporteraccesspolicies status and finalizers subresources.

The Role grants CRUD permissions for exporteraccesspolicies at lines 198-201, but the status (lines 203-206) and finalizers (lines 208-211) subresources only include clients, exporters, and leases. This asymmetry means the controller cannot update status or finalizers for exporteraccesspolicies resources.

Apply this fix to add the missing subresources:

 			{
 				APIGroups: []string{"jumpstarter.dev"},
-				Resources: []string{"clients/status", "exporters/status", "leases/status"},
+				Resources: []string{"clients/status", "exporters/status", "leases/status", "exporteraccesspolicies/status"},
 				Verbs:     []string{"get", "update", "patch"},
 			},
 			{
 				APIGroups: []string{"jumpstarter.dev"},
-				Resources: []string{"clients/finalizers", "exporters/finalizers", "leases/finalizers"},
+				Resources: []string{"clients/finalizers", "exporters/finalizers", "leases/finalizers", "exporteraccesspolicies/finalizers"},
 				Verbs:     []string{"update"},
 			},
🧹 Nitpick comments (2)
hack/deploy_with_operator.sh (2)

5-9: Add error handling for sourced files.

The script sources hack/utils and hack/deploy_vars without checking if they exist, which could lead to silent failures or unclear errors if either file is missing or moved.

Consider adding explicit checks:

source "${SCRIPT_DIR}/utils" || { echo "Error: Failed to source utils"; exit 1; }
source "${SCRIPT_DIR}/deploy_vars" || { echo "Error: Failed to source deploy_vars"; exit 1; }

53-85: Add validation for required variables.

The endpoint configuration generation assumes BASEDOMAIN is set. If this variable is empty or unset, the generated YAML will be malformed, causing a silent failure at kubectl apply time later in the script.

Add validation early in the script:

: "${BASEDOMAIN:?BASEDOMAIN is required but not set}"
: "${NETWORKING_MODE:?NETWORKING_MODE is required but not set}"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 77df0f4 and 9392eb9.

📒 Files selected for processing (5)
  • Dockerfile (2 hunks)
  • Makefile (7 hunks)
  • cmd/router/main.go (2 hunks)
  • deploy/operator/internal/controller/jumpstarter/rbac.go (1 hunks)
  • hack/deploy_with_operator.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • Dockerfile
  • cmd/router/main.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-24T11:57:13.455Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#170
File: deploy/operator/internal/controller/jumpstarter/rbac.go:193-196
Timestamp: 2025-10-24T11:57:13.455Z
Learning: In the Jumpstarter operator codebase (deploy/operator/internal/controller/jumpstarter/rbac.go), the Role created by `createRole()` defines RBAC permissions for the managed Jumpstarter controller application, not for the operator itself. The managed controller needs delete permissions on secrets for its runtime operations.

Applied to files:

  • deploy/operator/internal/controller/jumpstarter/rbac.go
🧬 Code graph analysis (1)
deploy/operator/internal/controller/jumpstarter/rbac.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (54-58)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Jumpstarter (525-531)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: deploy-kind
  • GitHub Check: deploy-with-operator
  • GitHub Check: e2e-tests-release-0-7
  • GitHub Check: e2e-tests (ubuntu-24.04)
  • GitHub Check: e2e-tests (ubuntu-24.04-arm)
  • GitHub Check: tests
  • GitHub Check: lint-go
🔇 Additional comments (6)
Makefile (4)

10-18: Version metadata implementation is solid.

The use of git commands with fallback to "unknown" is defensive and follows good practices for embedding build metadata. The LDFLAGS definition is correctly structured for use across multiple binaries.


98-100: New operator build and deployment targets are well-structured.

The build-operator, deploy-with-operator targets appropriately depend on prerequisite targets (docker-build, build-operator, cluster, grpcurl) and delegate to external scripts. Dependency ordering is correct.

Also applies to: 177-179


181-183: Operator logs target is useful.

The operator-logs target provides clear access to operator pod logs with the label selector and follow flag for convenient debugging during development.


221-221: Config is already v2.5.0 compatible; no migration needed.

The .golangci.yml file is already using v2 format (version: "2", linters.default: none, explicit enable list) and does not use the deprecated enable-all/disable-all syntax mentioned in the review. All enabled linters are valid in v2.5.0, and the v2.5.0 changelog reports no breaking changes affecting the current configuration.

Likely an incorrect or invalid review comment.

hack/deploy_with_operator.sh (1)

125-131: No issues found.

All three helper functions—wait_for_jumpstarter_resources(), check_grpc_endpoints(), and print_deployment_success()—are defined in hack/utils. The code is correct as written.

deploy/operator/internal/controller/jumpstarter/rbac.go (1)

20-63: ServiceAccount reconciliation looks good.

The implementation correctly avoids setting an owner reference to prevent garbage collection of the ServiceAccount when the Jumpstarter CR is deleted. The mutate logic properly handles both creation and update paths with appropriate change detection.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (3)

100-100: Add missing RBAC markers for exporteraccesspolicies subresources.

The RBAC marker for exporteraccesspolicies is missing status and finalizers subresources, which are present for other CRDs (clients, exporters, leases). This prevents the managed controller from updating status or managing finalizers.

Add the missing markers after line 100:

 // +kubebuilder:rbac:groups=jumpstarter.dev,resources=exporteraccesspolicies,verbs=get;list;watch;create;update;patch;delete
+// +kubebuilder:rbac:groups=jumpstarter.dev,resources=exporteraccesspolicies/status,verbs=get;update;patch
+// +kubebuilder:rbac:groups=jumpstarter.dev,resources=exporteraccesspolicies/finalizers,verbs=update

1113-1113: Add missing service suffixes to cleanup logic.

The cleanup logic only handles "", "-lb", and "-np" suffixes, but ReconcileRouterReplicaEndpoint also creates services with "-ing" and "-route" suffixes (see endpoints.go lines 195, 203). These services will leak when scaling down.

Apply this fix:

-    suffixes := []string{"", "-lb", "-np"}
+    suffixes := []string{"", "-lb", "-np", "-ing", "-route"}

612-612: Fix flag syntax: use double-dash for metrics-bind-address.

The flag uses a single dash (-metrics-bind-address=:8080) which is inconsistent with the other flags and may not parse correctly with pflag.

Apply this fix:

-                                "-metrics-bind-address=:8080",
+                                "--metrics-bind-address=:8080",
deploy/operator/internal/controller/jumpstarter/rbac.go (1)

200-209: Add missing status and finalizers rules for exporteraccesspolicies.

The Role grants CRUD for exporteraccesspolicies but omits status and finalizers subresources, which are present for other CRDs (clients, exporters, leases). This is inconsistent and prevents the managed controller from updating status or managing finalizers.

Apply this fix:

             {
                 APIGroups: []string{"jumpstarter.dev"},
-                Resources: []string{"clients/status", "exporters/status", "leases/status"},
+                Resources: []string{"clients/status", "exporters/status", "leases/status", "exporteraccesspolicies/status"},
                 Verbs:     []string{"get", "update", "patch"},
             },
             {
                 APIGroups: []string{"jumpstarter.dev"},
-                Resources: []string{"clients/finalizers", "exporters/finalizers", "leases/finalizers"},
+                Resources: []string{"clients/finalizers", "exporters/finalizers", "leases/finalizers", "exporteraccesspolicies/finalizers"},
                 Verbs:     []string{"update"},
             },
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9392eb9 and f1668b0.

📒 Files selected for processing (3)
  • deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/rbac.go (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-24T11:57:13.455Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#170
File: deploy/operator/internal/controller/jumpstarter/rbac.go:193-196
Timestamp: 2025-10-24T11:57:13.455Z
Learning: In the Jumpstarter operator codebase (deploy/operator/internal/controller/jumpstarter/rbac.go), the Role created by `createRole()` defines RBAC permissions for the managed Jumpstarter controller application, not for the operator itself. The managed controller needs delete permissions on secrets for its runtime operations.

Applied to files:

  • deploy/operator/internal/controller/jumpstarter/rbac.go
📚 Learning: 2025-10-24T11:57:23.777Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#170
File: deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go:328-333
Timestamp: 2025-10-24T11:57:23.777Z
Learning: In the jumpstarter-controller operator (deploy/operator/), the design allows only one Jumpstarter CR per namespace, which will be enforced by a validation webhook. This constraint eliminates concerns about resource name collisions within a namespace.

Applied to files:

  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go
  • deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go
🧬 Code graph analysis (3)
deploy/operator/internal/controller/jumpstarter/rbac.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (54-58)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Jumpstarter (525-531)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (5)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • Reconciler (34-37)
deploy/operator/api/v1alpha1/jumpstarter_types.go (2)
  • Jumpstarter (525-531)
  • Endpoint (380-414)
internal/log/levels.go (1)
  • Info (61-63)
deploy/operator/internal/log/levels.go (2)
  • LevelTrace (17-17)
  • LevelDebug (12-12)
internal/config/types.go (9)
  • Config (11-15)
  • Provisioning (42-44)
  • Grpc (47-49)
  • Keepalive (53-81)
  • Authentication (19-23)
  • Internal (26-33)
  • K8s (36-39)
  • Router (87-87)
  • RouterEntry (90-97)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Endpoint (380-414)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: e2e-tests (ubuntu-24.04-arm)
  • GitHub Check: e2e-tests (ubuntu-24.04)
  • GitHub Check: e2e-tests-release-0-7
  • GitHub Check: lint-go
  • GitHub Check: tests
  • GitHub Check: deploy-kind
  • GitHub Check: deploy-with-operator

Comment on lines +359 to +362
if endpoint.NodePort != nil && endpoint.NodePort.Enabled && endpoint.NodePort.Port > 0 {
// increase nodeport numbers based in replica, not perfect because it needs to be
// consecutive, but this is mostly for E2E testing.
svcPort.NodePort = endpoint.NodePort.Port + i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

NodePort replica increment is fragile.

Adding the replica index to the NodePort (line 362) assumes consecutive ports are available and could fail if ports are already allocated. While the comment acknowledges this limitation for E2E testing, consider using a more robust allocation strategy or validating port availability.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (3)

361-364: NodePort allocation is fragile for production use.

Adding the replica index to the NodePort assumes consecutive ports are available and could fail if ports are already allocated. While acknowledged as acceptable for E2E testing, consider documenting this limitation or implementing validation.


611-615: Fix the single-dash flag typo.

The -metrics-bind-address flag uses a single dash, which may not parse correctly with pflag. It should use double dashes like the other flags.

Apply this diff:

                             Args: []string{
                                 "--leader-elect",
                                 "--health-probe-bind-address=:8081",
-                                "-metrics-bind-address=:8080",
+                                "--metrics-bind-address=:8080",
                             },

1113-1115: Service cleanup may miss ClusterIP variants with -ing and -route suffixes.

The suffixes array only includes ["", "-lb", "-np"], which could leave ClusterIP services with -ing and -route suffixes orphaned when scaling down.

Apply this diff if those service types are created:

-    suffixes := []string{"", "-lb", "-np"}
+    suffixes := []string{"", "-lb", "-np", "-ing", "-route"}
🧹 Nitpick comments (1)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)

210-214: Consider using structured logging instead of fmt.Printf.

The deployment diff is printed to stdout using fmt.Printf. Kubernetes operators typically use the structured logger for all output. Consider using log.Info() with the diff as a field value instead.

Apply this approach:

-            fmt.Printf("\n=== Controller deployment differences detected ===\n")
-            fmt.Printf("Name: %s\n", existingDeployment.Name)
-            fmt.Printf("Namespace: %s\n", existingDeployment.Namespace)
-            fmt.Printf("\n%s\n", diff)
-            fmt.Printf("========================================\n\n")
+            log.Info("Controller deployment differences detected",
+                "name", existingDeployment.Name,
+                "namespace", existingDeployment.Namespace,
+                "diff", diff)

Note: The same pattern appears for router deployments at lines 277-282.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f1668b0 and cdc9403.

📒 Files selected for processing (2)
  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/rbac.go (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-24T11:57:23.777Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#170
File: deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go:328-333
Timestamp: 2025-10-24T11:57:23.777Z
Learning: In the jumpstarter-controller operator (deploy/operator/), the design allows only one Jumpstarter CR per namespace, which will be enforced by a validation webhook. This constraint eliminates concerns about resource name collisions within a namespace.

Applied to files:

  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go
📚 Learning: 2025-10-24T11:57:13.455Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#170
File: deploy/operator/internal/controller/jumpstarter/rbac.go:193-196
Timestamp: 2025-10-24T11:57:13.455Z
Learning: In the Jumpstarter operator codebase (deploy/operator/internal/controller/jumpstarter/rbac.go), the Role created by `createRole()` defines RBAC permissions for the managed Jumpstarter controller application, not for the operator itself. The managed controller needs delete permissions on secrets for its runtime operations.

Applied to files:

  • deploy/operator/internal/controller/jumpstarter/rbac.go
🧬 Code graph analysis (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (4)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • Reconciler (34-37)
deploy/operator/api/v1alpha1/jumpstarter_types.go (2)
  • Jumpstarter (525-531)
  • Endpoint (380-414)
deploy/operator/internal/log/levels.go (2)
  • LevelTrace (17-17)
  • LevelDebug (12-12)
internal/config/types.go (9)
  • Config (11-15)
  • Provisioning (42-44)
  • Grpc (47-49)
  • Keepalive (53-81)
  • Authentication (19-23)
  • Internal (26-33)
  • K8s (36-39)
  • Router (87-87)
  • RouterEntry (90-97)
deploy/operator/internal/controller/jumpstarter/rbac.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (54-58)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Jumpstarter (525-531)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: deploy-with-operator
  • GitHub Check: e2e-tests (ubuntu-24.04)
  • GitHub Check: e2e-tests (ubuntu-24.04-arm)
  • GitHub Check: deploy-kind
  • GitHub Check: e2e-tests-release-0-7
  • GitHub Check: tests
  • GitHub Check: lint-go
🔇 Additional comments (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)

100-102: LGTM! ExporterAccessPolicy RBAC permissions are complete.

The status and finalizers subresources for exporteraccesspolicies are now properly included in the RBAC markers.

deploy/operator/internal/controller/jumpstarter/rbac.go (1)

16-155: LGTM! RBAC reconciliation is well-implemented.

The RBAC reconciliation logic correctly:

  • Avoids setting controller reference on ServiceAccount to prevent garbage collection (lines 21-22, 35, 49)
  • Sets controller references on Role and RoleBinding in both creation and update paths (lines 78, 93, 123, 139)
  • Includes exporteraccesspolicies status and finalizers permissions (lines 202, 207)
  • Uses proper update checks before applying changes
  • Maintains secret delete permissions for the managed controller's runtime operations

Based on learnings.

@mangelajo mangelajo force-pushed the operator-implementation branch from cdc9403 to 4d5c98d Compare October 27, 2025 11:09
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (3)

361-365: NodePort auto-increment is fragile.

Assumes consecutive availability; OK for E2E, but add a comment or preflight check if used beyond tests.


1113-1116: Clean up all service variants (-ing, -route).

Current suffixes miss Ingress/Route clusterIP helpers; leaks on scale‑down.

-	suffixes := []string{"", "-lb", "-np"}
+	suffixes := []string{"", "-lb", "-np", "-ing", "-route"}

611-616: Typo: use double dash for metrics flag.

Single dash may not parse with pflag.

-								"-metrics-bind-address=:8080",
+								"--metrics-bind-address=:8080",
🧹 Nitpick comments (5)
deploy/operator/config/rbac/role.yaml (3)

7-21: Trim operator RBAC: drop delete on secrets/serviceaccounts (least privilege).

Operator code intentionally never deletes Secrets or ServiceAccounts; granting delete here is unnecessary risk. Split the rule so delete applies only to ConfigMaps/Services, not to Secrets/ServiceAccounts. Based on learnings.

Apply:

-  resources:
-  - configmaps
-  - secrets
-  - serviceaccounts
-  - services
-  verbs:
-  - create
-  - delete
-  - get
-  - list
-  - patch
-  - update
-  - watch
+  resources:
+  - configmaps
+  - services
+  verbs:
+  - create
+  - delete
+  - get
+  - list
+  - patch
+  - update
+  - watch
+---
+- apiGroups:
+  - ""
+  resources:
+  - secrets
+  - serviceaccounts
+  verbs:
+  - create
+  - get
+  - list
+  - patch
+  - update
+  - watch

175-194: Gate OpenShift Route permissions behind an overlay.

To avoid surprising perms on vanilla clusters, move route.openshift.io rules to an OpenShift kustomize overlay. Keep base ClusterRole Kubernetes‑only.

Also applies to: 188-194


30-36: Remove unnecessary "get" verb from status subresource permissions.

The operator does not read the status of Kubernetes native resources (Services, Deployments, Ingresses). Removing the "get" verb reduces the attack surface while maintaining necessary write access for patch/update operations.

Apply the three diffs as suggested:

-  - services/status
-  verbs:
-  - get
-  - patch
-  - update
+  - services/status
+  verbs:
+  - patch
+  - update
-  - deployments/status
-  verbs:
-  - get
-  - patch
-  - update
+  - deployments/status
+  verbs:
+  - patch
+  - update
-  - ingresses/status
-  verbs:
-  - get
-  - patch
-  - update
+  - ingresses/status
+  verbs:
+  - patch
+  - update

Also update the corresponding kubebuilder:rbac annotations in deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (lines 66, 68, 83) to remove "get" from the status verbs to keep annotations synchronized with the actual RBAC policy.

deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)

206-216: Prefer structured logging over fmt.Printf.

Use controller-runtime logger (with verbosity) so logs are leveled and captured.

-			fmt.Printf("\n=== Controller deployment differences detected ===\n")
-			fmt.Printf("Name: %s\n", existingDeployment.Name)
-			fmt.Printf("Namespace: %s\n", existingDeployment.Namespace)
-			fmt.Printf("\n%s\n", diff)
-			fmt.Printf("========================================\n\n")
+			log.V(1).Info("Controller deployment diff", "name", existingDeployment.Name, "namespace", existingDeployment.Namespace, "diff", diff)

Analogous change for the router diff block.

Also applies to: 272-284

deploy/operator/internal/controller/jumpstarter/rbac.go (1)

184-221: Secrets delete in managed Role is acceptable — add a brief rationale.

Per project design, the managed controller deletes secrets at runtime. Add a clarifying comment near this rule to avoid future least‑privilege “cleanup” attempts. Based on learnings.

Also applies to: 191-199

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cdc9403 and 4d5c98d.

📒 Files selected for processing (3)
  • deploy/operator/config/rbac/role.yaml (2 hunks)
  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1 hunks)
  • deploy/operator/internal/controller/jumpstarter/rbac.go (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-24T11:57:23.777Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#170
File: deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go:328-333
Timestamp: 2025-10-24T11:57:23.777Z
Learning: In the jumpstarter-controller operator (deploy/operator/), the design allows only one Jumpstarter CR per namespace, which will be enforced by a validation webhook. This constraint eliminates concerns about resource name collisions within a namespace.

Applied to files:

  • deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go
📚 Learning: 2025-10-24T11:57:13.455Z
Learnt from: mangelajo
PR: jumpstarter-dev/jumpstarter-controller#170
File: deploy/operator/internal/controller/jumpstarter/rbac.go:193-196
Timestamp: 2025-10-24T11:57:13.455Z
Learning: In the Jumpstarter operator codebase (deploy/operator/internal/controller/jumpstarter/rbac.go), the Role created by `createRole()` defines RBAC permissions for the managed Jumpstarter controller application, not for the operator itself. The managed controller needs delete permissions on secrets for its runtime operations.

Applied to files:

  • deploy/operator/internal/controller/jumpstarter/rbac.go
  • deploy/operator/config/rbac/role.yaml
🧬 Code graph analysis (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (4)
deploy/operator/internal/controller/jumpstarter/endpoints/endpoints.go (1)
  • Reconciler (34-37)
deploy/operator/api/v1alpha1/jumpstarter_types.go (2)
  • Jumpstarter (525-531)
  • Endpoint (380-414)
deploy/operator/internal/log/levels.go (2)
  • LevelTrace (17-17)
  • LevelDebug (12-12)
internal/config/types.go (9)
  • Config (11-15)
  • Provisioning (42-44)
  • Grpc (47-49)
  • Keepalive (53-81)
  • Authentication (19-23)
  • Internal (26-33)
  • K8s (36-39)
  • Router (87-87)
  • RouterEntry (90-97)
deploy/operator/internal/controller/jumpstarter/rbac.go (2)
deploy/operator/internal/controller/jumpstarter/jumpstarter_controller.go (1)
  • JumpstarterReconciler (54-58)
deploy/operator/api/v1alpha1/jumpstarter_types.go (1)
  • Jumpstarter (525-531)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: deploy-with-operator
  • GitHub Check: deploy-kind
  • GitHub Check: tests
  • GitHub Check: e2e-tests (ubuntu-24.04)
  • GitHub Check: e2e-tests-release-0-7
  • GitHub Check: e2e-tests (ubuntu-24.04-arm)
  • GitHub Check: lint-go
🔇 Additional comments (1)
deploy/operator/internal/controller/jumpstarter/rbac.go (1)

71-79: Good: CreateOrUpdate now sets ownerRef on create.

Convergence and GC are correct for Role and RoleBinding.

Also applies to: 115-124

@mangelajo mangelajo merged commit 37c64f8 into main Oct 27, 2025
9 checks passed
@mangelajo mangelajo deleted the operator-implementation branch October 27, 2025 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants