
Configure secrets as env variables #503

Merged
menakaj merged 9 commits into wso2:main from hanzjk:secrets-ui-changes
Mar 12, 2026

Conversation

Contributor

@hanzjk hanzjk commented Mar 6, 2026

Purpose

Describe the problems, issues, or needs driving this feature/fix and include links to related issues in the following format: Resolves issue1, issue2, etc.

resolves #432

This PR contains the following UI changes to configure env variable values as secrets. Additionally, it refactors the backend logic for secret syncing.

Env provisioning at creation flow

Screenshot 2026-03-06 at 13 44 06

Env view at deploy pane

Screenshot 2026-03-06 at 13 43 26

Editing an existing secret when deploying

Screenshot 2026-03-06 at 13 43 47 1

Adding a new env variable when deploying

Screenshot 2026-03-06 at 13 43 34

Goals

Describe the solutions that this feature/fix will introduce to resolve the problems described above

Approach

Describe how you are implementing the solutions. Include an animated GIF or screenshot if the change affects the UI (email documentation@wso2.com to review all UI text). Include a link to a Markdown file or Google doc if the feature write-up is too long to paste here.

User stories

Summary of user stories addressed by this change

Release note

Brief description of the new feature or bug fix as it will appear in the release notes

Documentation

Link(s) to product documentation that addresses the changes of this PR. If no doc impact, enter "N/A" plus a brief explanation of why there's no doc impact

Training

Link to the PR for changes to the training content in https://github.com/wso2/WSO2-Training, if applicable

Certification

Type "Sent" when you have provided new/updated certification questions, plus four answers for each question (correct answer highlighted in bold), based on this change. Certification questions/answers should be sent to certification@wso2.com and NOT pasted in this PR. If there is no impact on certification exams, type "N/A" and explain why.

Marketing

Link to drafts of marketing content that will describe and promote this feature, including product page changes, technical articles, blog posts, videos, etc., if applicable

Automation tests

  • Unit tests

    Code coverage information

  • Integration tests

    Details about the test cases and coverage

Security checks

Samples

Provide high-level details about the samples related to this feature

Related PRs

List any other related PRs

Migrations (if applicable)

Describe migration steps and platforms on which migration has been tested

Test environment

List all JDK versions, operating systems, databases, and browser/versions on which this feature/fix was tested

Learning

Describe the research phase and any blog posts, patterns, libraries, or add-ons you used to solve the problem.

Summary by CodeRabbit

  • New Features

    • Mark environment variables and configuration items as sensitive with secretRef support, UI masking/edit flows, and secret-preservation rules
    • Create and fetch workflow runs via new workflow-run APIs
    • New/updated build & publish workflow templates for image builds and workload generation
  • Bug Fixes / Improvements

    • Simplified local setup: consolidated scripts and removed legacy Thunder and Docker kubeconfig steps
    • Secret storage unified to KV v2 upsert semantics for safer updates
  • Configuration Updates

    • Defaults: KV v2 secret provider, cluster naming and gateway/TLS adjustments

@coderabbitai
Contributor

coderabbitai Bot commented Mar 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Removed legacy CR-based resource ops and observer spec; migrated OpenChoreo client to typed OpenAPI workflow-run and secret-reference APIs, standardized secret storage to KV v2 Upsert, added secretRef/isSensitive fields across models and UI, and reorganized setup scripts into modular utilities and new Argo workflow templates.

Changes

  • Top-level build & setup
    Files: Makefile, deployments/scripts/*, deployments/scripts/utils.sh, deployments/scripts/setup-prerequisites.sh, deployments/scripts/setup-k3d.sh, deployments/scripts/setup-openchoreo.sh, deployments/scripts/setup-platform.sh, deployments/scripts/teardown.sh
    Summary: Consolidated setup refactor: removed Thunder/docker-kubeconfig targets & scripts, standardized the cluster name to openchoreo-local-setup, added utils.sh helpers, introduced setup-prerequisites.sh, modularized install flows, and adjusted k3d/colima handling.

  • Makefile adjustments
    Files: Makefile
    Summary: Removed setup-thunder and setup-kubeconfig-docker from PHONY/help; setup-k3d now chains ./setup-k3d.sh && ./setup-prerequisites.sh; updated dev-up prerequisites.

  • OpenChoreo client API surface
    Files: agent-manager-service/clients/openchoreosvc/client/*.go (client.go, generic-workflows.go, secret_references.go, errors.go, types.go, etc.)
    Summary: Replaced generic Apply/Get/Delete resource ops with typed OpenAPI methods (Create/Get WorkflowRun) and full SecretReference CRUD; introduced typed ErrorResponses and moved from map/CR handling to typed request/response models.

  • Client implementations & removals
    Files: agent-manager-service/clients/openchoreosvc/client/*, .../client/resources.go (removed), .../client/utils.go (removed)
    Summary: Removed the legacy resource-based client and endpoint-parsing utilities; added dedicated secret-reference and workflow-run implementations and converters.

  • Secret management & providers
    Files: agent-manager-service/clients/secretmanagersvc/*, .../providers/openbao/*, agent-manager-service/config/*, wiring/*
    Summary: Switched Create/Update to UpsertSecret, unified the OpenBao provider to KV v2 (removed version), simplified paths and config (removed OPENBAO_VERSION and KubeConfig).

  • Mocks & tests
    Files: agent-manager-service/clients/clientmocks/*, agent-manager-service/tests/*, tests/apitestutils/mock_clients.go
    Summary: Updated mocks to new signatures (CreateWorkflowRun/GetWorkflowRun, InjectTracingEnvVars, UpsertSecret/GetSecret), adjusted call-tracking and test assertions accordingly.

  • Service layer
    Files: agent-manager-service/services/*
    Summary: Adjusted services to use UpsertSecret and typed OpenChoreo APIs; added error-translation helpers; updated the monitor executor/scheduler to use workflow-run APIs and removed CR cleanup.

  • Models & OpenAPI
    Files: agent-manager-service/spec/*, models/*, docs/api_v1_openapi.yaml
    Summary: Added secretRef/isSensitive to env/config models; made EnvironmentVariable.Value optional; added InputInterface BasePath/Visibility; updated OpenAPI docs.

  • Console UI & types
    Files: console/workspaces/libs/shared-component/..., console/workspaces/libs/views/..., console/workspaces/libs/types/..., console/workspaces/pages/*
    Summary: Rewrote the EnvironmentVariable UI to support the secret lifecycle (mask/edit/new secret), introduced EnvVariableItem with isSensitive/secretRef, updated schemas and payload handling to preserve unedited secrets.

  • Helm & Argo workflows
    Files: deployments/helm-charts/wso2-amp-build-extension/..., .../wso2-amp-platform-resources-extension/...
    Summary: Added ClusterWorkflowTemplates (checkout, build variants, generate-workload, publish), inlined workflows into component workflows, added ExternalSecret/registry push resources and TLS/authz templates; removed legacy CI workflow YAMLs.

  • Removed files / specs
    Files: agent-manager-service/clients/observabilitysvc/observer-api.yaml, agent-manager-service/clients/openchoreosvc/client/resources.go, deployments/scripts/setup-amp-thunder.sh, deployments/scripts/generate-docker-kubeconfig.sh, multiple old Argo CI templates
    Summary: Deleted the Observer OpenAPI spec, the legacy resource-based client, deprecated scripts, and multiple old Argo CI workflow templates.

  • Minor & config updates
    Files: assorted Makefiles, deployments/docker-compose.yml, deployments/single-cluster/*, evaluation-job/Makefile, python-instrumentation-provider/Makefile, traces-observer-service/Makefile
    Summary: Standardized cluster naming across Makefiles/scripts, updated compose and single-cluster port mappings/hosts, adjusted the default OTEL endpoint and environment defaults.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Service as Agent Manager Service
    participant SecretMgr as Secret Manager (KV v2)
    participant OpenChoreo as OpenChoreo API
    participant DB as Database

    Service->>SecretMgr: UpsertSecret(location, data)
    activate SecretMgr
    SecretMgr-->>Service: secretPath
    deactivate SecretMgr

    Service->>OpenChoreo: CreateSecretReference(namespace, CreateSecretReferenceRequest{secretPath})
    activate OpenChoreo
    OpenChoreo-->>Service: SecretReferenceInfo
    deactivate OpenChoreo

    Service->>OpenChoreo: CreateWorkflowRun(namespace, CreateWorkflowRunRequest{workflow, parameters})
    activate OpenChoreo
    OpenChoreo-->>Service: WorkflowRunResponse{name, status}
    deactivate OpenChoreo

    Service->>DB: Persist run record (runID, workflowName, secretRef...)
    DB-->>Service: OK

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through signatures, turned CRs into calls,
Secrets now nest where KV v2 gently sprawls,
UI hides whispers and marks what is true,
Setup scripts hum in parallel — tidy and new,
A little rabbit applauds the tidy code halls.


@hanzjk hanzjk marked this pull request as ready for review March 6, 2026 14:08
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 18

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
agent-manager-service/clients/secretmanagersvc/providers/openbao/client.go (1)

95-109: ⚠️ Potential issue | 🟡 Minor

Metadata written before secret data may leave orphan metadata on failure.

If the secret write at line 106 fails, the metadata written at lines 97-104 will remain as an orphan entry. Consider reversing the order (write secret first, then metadata) or adding cleanup logic on secret write failure.

🛠️ Proposed fix to reverse order
-	// Write metadata separately for v2
-	metaPath := c.buildMetadataPath(key)
-	_, err = c.client.Logical().WriteWithContext(ctx, metaPath, map[string]interface{}{
-		"custom_metadata": map[string]string{
-			"managed-by": metadata.ManagedBy,
-		},
-	})
-	if err != nil {
-		return fmt.Errorf("failed to write metadata: %w", err)
-	}
-
 	_, err = c.client.Logical().WriteWithContext(ctx, secretPath, secretToPush)
 	if err != nil {
 		return fmt.Errorf("failed to write secret: %w", err)
 	}

+	// Write metadata after secret to avoid orphans on failure
+	metaPath := c.buildMetadataPath(key)
+	_, err = c.client.Logical().WriteWithContext(ctx, metaPath, map[string]interface{}{
+		"custom_metadata": map[string]string{
+			"managed-by": metadata.ManagedBy,
+		},
+	})
+	if err != nil {
+		return fmt.Errorf("failed to write metadata: %w", err)
+	}
+
 	return nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/secretmanagersvc/providers/openbao/client.go`
around lines 95 - 109, The current code writes metadata (metaPath via
c.client.Logical().WriteWithContext) before writing the secret (secretPath),
which can leave orphan metadata if the secret write fails; change the sequence
to write the secret first (call c.client.Logical().WriteWithContext with
secretPath and secretToPush), then write metadata using
buildMetadataPath/metaPath, or if you must keep the metadata-first approach add
cleanup logic to delete the metadata on secret write failure (call
c.client.Logical().Delete/appropriate API with metaPath) and return any combined
error; ensure both operations use ctx and propagate errors with fmt.Errorf
wrapping.
agent-manager-service/services/agent_configuration_service.go (1)

1344-1353: ⚠️ Potential issue | 🟠 Major

Make the provider secret path config-specific before using UpsertSecret.

This SecretLocation is scoped only by org/project/agent/environment/provider handle, so two configs for the same agent/environment/provider will now upsert the same KV entry. Rotating or deleting one config will overwrite or remove the other config's provider credential. Using proxyName (or config.UUID) in the location would keep each config isolated.

💡 Minimal fix
 			secretLoc := secretmanagersvc.SecretLocation{
 				OrgName:         config.OrganizationName,
 				ProjectName:     config.ProjectName,
 				AgentName:       config.AgentID,
 				EnvironmentName: envUUID.String(), // Use UUID for stable, consistent KV path
-				ComponentName:   provider.Artifact.Handle,
+				ComponentName:   proxyName,
 				SecretKey:       secretmanagersvc.SecretKeyAPIKey,
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/services/agent_configuration_service.go` around lines
1344 - 1353, The SecretLocation is currently too coarse and will collide across
configs; update the secretLoc construction before calling
s.secretClient.UpsertSecret to include the config-specific identifier (e.g.,
config.UUID or proxyName) so each config gets its own KV path. Concretely,
modify the SecretLocation used in secretLoc (or its ComponentName) to
incorporate config.UUID/proxyName (for example by appending "/"+config.UUID to
provider.Artifact.Handle or setting a dedicated field if available) so the
UpsertSecret call targets a config-scoped path.
agent-manager-service/services/llm_provider_service.go (1)

189-197: ⚠️ Potential issue | 🟠 Major

UpsertSecret is unsafe in the create path with the current rollback logic.

If two requests create the same provider handle concurrently, both write the same secret location, one DB insert wins, and the loser hits the unique constraint. The loser then executes the compensating delete at Lines 236-245 and can remove the winner's secret. Create() needs create-only secret semantics, or rollback must prove that this request exclusively owns the path before deleting it.

Also applies to: 236-245

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/services/llm_provider_service.go` around lines 189 -
197, The Create() flow uses s.secretClient.UpsertSecret with
secretmanagersvc.SecretLocation/SecretKeyAPIKey which is unsafe on concurrent
creates; change this to a create-only operation (e.g.,
s.secretClient.CreateSecret) so a second concurrent request fails without
overwriting an existing path, or if the secret client has no create-only API,
generate a unique request token stored in the secret value and after Upsert
check that the returned kvPath/metadata matches this request token before
proceeding; also update the rollback/delete logic (the compensating delete
currently invoked around the block at lines 236-245) to verify ownership
(compare returned kvPath or the stored request token/owner id) and only call
s.secretClient.DeleteSecret when the secret was created by this request to avoid
deleting another request's secret.
deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-types/agent-api.yaml (1)

107-117: ⚠️ Potential issue | 🟠 Major

Don't hardcode the upstream rewrite to /.

This strips the workload's configured base path. Chat agents expect /chat, and custom API agents can define their own base path, so rewriting every request to / will send traffic to the wrong upstream route.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-types/agent-api.yaml`
around lines 107 - 117, The URLRewrite filter currently hardcodes
ReplacePrefixMatch to "/", which strips the workload's configured base path and
breaks upstream routes; update the URLRewrite configuration (the filter with
type: URLRewrite and nested urlRewrite.path.type: ReplacePrefixMatch) to use the
component's configured base path or avoid rewriting to "/" — e.g., replace the
hardcoded replacePrefixMatch="/" with the dynamic base path value (derived from
the same metadata used for componentName or the workload's basePath setting) or
remove the ReplacePrefixMatch filter so the original request path is preserved
for chat agents and custom API agents.
deployments/scripts/setup-openchoreo.sh (1)

367-370: ⚠️ Potential issue | 🟠 Major

sed -i '' breaks on GNU sed / Linux environments.

This BSD/macOS-specific syntax fails on Linux devcontainers and CI runners using GNU sed. GNU sed does not accept an empty extension argument to -i.

Use a portable approach instead. The simplest cross-platform solution:

-sed -i '' 's|http://amp-api.wso2-amp.svc.cluster.local:9000/auth/external/jwks.json|http://host.docker.internal:9000/auth/external/jwks.json|g' "${SCRIPT_DIR}/../values/api-platform-operator-local-config.yaml"
+sed -i.bak 's|http://amp-api.wso2-amp.svc.cluster.local:9000/auth/external/jwks.json|http://host.docker.internal:9000/auth/external/jwks.json|g' "${SCRIPT_DIR}/../values/api-platform-operator-local-config.yaml" && rm -f "${SCRIPT_DIR}/../values/api-platform-operator-local-config.yaml.bak"

This works on both GNU sed (Linux) and BSD sed (macOS) because both accept a non-empty backup suffix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/setup-openchoreo.sh` around lines 367 - 370, The sed
invocation using the BSD-only form sed -i '' will fail on GNU sed; update the
replace operation in the block that copies
api-platform-operator-local-config.yaml (referencing SCRIPT_DIR and
api-platform-operator-local-config.yaml) to use a portable approach: invoke sed
with a non-empty backup suffix (so it works on both GNU and BSD sed) and then
remove the created backup file afterward, ensuring the replacement of the JWKS
URI remains the same; adjust the script to handle the backup file cleanup so no
extra files remain.
🟡 Minor comments (3)
agent-manager-service/clients/openchoreosvc/client/projects.go (1)

125-132: ⚠️ Potential issue | 🟡 Minor

Missing JSON400 in update error handling.

The CreateProject error handling (lines 52-58) includes JSON400 for validation errors, but UpdateProjectWithResponse error handling omits it. If the API returns 400 for invalid update payloads, it won't be properly mapped.

🐛 Proposed fix
 	if updateResp.StatusCode() != http.StatusOK {
 		return handleErrorResponse(updateResp.StatusCode(), ErrorResponses{
+			JSON400: updateResp.JSON400,
 			JSON401: updateResp.JSON401,
 			JSON403: updateResp.JSON403,
 			JSON404: updateResp.JSON404,
 			JSON500: updateResp.JSON500,
 		})
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/projects.go` around lines
125 - 132, The UpdateProjectWithResponse error branch is missing mapping for
validation errors; update the non-OK handling for updateResp in
UpdateProjectWithResponse to include JSON400 in the ErrorResponses (i.e., add
JSON400: updateResp.JSON400) so 400 responses are properly passed to
handleErrorResponse; locate the block that checks updateResp.StatusCode() !=
http.StatusOK and add the JSON400 field to the returned ErrorResponses struct.
deployments/scripts/utils.sh (1)

1-9: ⚠️ Potential issue | 🟡 Minor

Add a shebang or shell directive.

Static analysis flagged that the shell type is unknown. Since this is a utility script meant to be sourced, add a shell directive comment at the top.

🔧 Proposed fix
+#!/bin/bash
+# shellcheck shell=bash
+
 # Util: Check if a command is installed
 check_command() {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/utils.sh` around lines 1 - 9, The script lacks a shell
directive so static analysis can't determine the shell; add a shell directive
comment at the top (for example a ShellCheck directive like "# shellcheck
shell=bash" or similar) so analyzers know the intended shell for this file;
update the top of deployments/scripts/utils.sh above the check_command function
to include the directive and keep the rest (check_command) unchanged.
agent-manager-service/services/monitor_scheduler_test.go (1)

79-83: ⚠️ Potential issue | 🟡 Minor

Return a typed not-found error from the default workflow-run mock.

Line 81 currently returns a plain fmt.Errorf, so any scheduler branch that checks errors.Is(err, utils.ErrNotFound) will behave differently under test than it does against the real client.

🧪 Suggested fix
 		ocClient: &clientmocks.OpenChoreoClientMock{
 			GetWorkflowRunFunc: func(ctx context.Context, namespaceName string, runName string) (*client.WorkflowRunResponse, error) {
-				return nil, fmt.Errorf("mock: workflow run not found")
+				return nil, fmt.Errorf("%w: mock workflow run not found", utils.ErrNotFound)
 			},
 		},

Also add the utils import.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/services/monitor_scheduler_test.go` around lines 79 -
83, The mock GetWorkflowRunFunc currently returns a generic fmt.Errorf which
prevents error comparisons against utils.ErrNotFound; update the mock in
OpenChoreoClientMock so it returns (nil, utils.ErrNotFound) instead of
fmt.Errorf(...), and add the utils import to the test file so the typed
not-found error is used and error checks like errors.Is(err, utils.ErrNotFound)
behave correctly.
🧹 Nitpick comments (21)
deployments/helm-charts/wso2-amp-platform-resources-extension/templates/environment.yaml (1)

12-14: Consider templating the data plane name.

Line 14 still hardcodes default, which makes this chart harder to reuse in clusters where the target DataPlane has a different name. Since the rest of the resource is already values-driven, wiring dataPlaneRef.name through Helm values would keep this template environment-agnostic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@deployments/helm-charts/wso2-amp-platform-resources-extension/templates/environment.yaml`
around lines 12 - 14, The template hardcodes dataPlaneRef.name as "default";
change it to read from Helm values so the DataPlane name is configurable.
Replace the literal under dataPlaneRef.name with the Helm value reference you
use elsewhere (e.g., .Values.dataPlane.name or similar), ensure a sensible
default via default/required function if desired, and update any chart
values.yaml to include the dataPlane.name key so templates referencing
dataPlaneRef.name (dataPlaneRef.name) will use the provided value at install
time.
deployments/single-cluster/values-op.yaml (1)

9-13: Clarify whether service.type is needed when enabled: false.

If OpenSearch is disabled, the service.type: LoadBalancer configuration may be unused. Consider adding a comment explaining whether this is intentional (e.g., for documentation or easy re-enablement) or remove it to reduce confusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/single-cluster/values-op.yaml` around lines 9 - 13, Clarify or
remove the unused service.type setting under openSearch when openSearch.enabled
is false: either add a comment next to openSearch.enabled/service.type stating
that service.type: LoadBalancer is intentionally kept for documentation or quick
re-enablement, or delete the service block entirely to avoid confusion; update
the openSearch -> enabled and openSearch -> service.type entries accordingly so
future readers know whether service.type is intentionally preserved or should be
removed.
deployments/helm-charts/wso2-amp-platform-resources-extension/templates/authz-cluster-role-binding.yaml (1)

17-20: Consider removing inline defaults that duplicate values.yaml.

Lines 17 and 20 provide inline defaults (| default "amp-api-client" and | default "super-admin") that duplicate the defaults already defined in values.yaml. This creates a maintenance risk if the defaults are updated in one place but not the other.

♻️ Suggested simplification
 spec:
   effect: allow
   entitlement:
     claim: sub
-    value: {{ .Values.authz.ampApiClient.clientId | default "amp-api-client" }}
+    value: {{ .Values.authz.ampApiClient.clientId }}
   roleRef:
     kind: AuthzClusterRole
-    name: {{ .Values.authz.ampApiClient.role | default "super-admin" }}
+    name: {{ .Values.authz.ampApiClient.role }}

Since values.yaml already defines these defaults, the inline defaults are redundant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@deployments/helm-charts/wso2-amp-platform-resources-extension/templates/authz-cluster-role-binding.yaml`
around lines 17 - 20, Remove the redundant inline defaults in the template so
the chart relies on values.yaml defaults: edit the
authz-cluster-role-binding.yaml template to stop using the Helm default operator
on .Values.authz.ampApiClient.clientId and .Values.authz.ampApiClient.role
(remove `| default "amp-api-client"` and `| default "super-admin"`), leaving the
raw value references so updates in values.yaml are authoritative.
deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/balleina-buildpack-build.yaml (1)

39-39: Add a timeout to the Podman readiness check.

The polling loop will hang indefinitely if Podman fails to start. Consider adding a timeout to fail gracefully with a clear error message.

⏱️ Suggested fix with timeout
-            until podman info --format '{{ "{{" }}.Host.RemoteSocket.Exists{{ "}}" }}' 2>/dev/null | grep -q true; do sleep 1; done
+            TIMEOUT=60
+            ELAPSED=0
+            until podman info --format '{{ "{{" }}.Host.RemoteSocket.Exists{{ "}}" }}' 2>/dev/null | grep -q true; do
+              sleep 1
+              ELAPSED=$((ELAPSED + 1))
+              if [ "$ELAPSED" -ge "$TIMEOUT" ]; then
+                echo "Error: Podman daemon failed to start within ${TIMEOUT}s"
+                exit 1
+              fi
+            done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/balleina-buildpack-build.yaml`
at line 39, The podman readiness loop using "until podman info --format
'{{{{.Host.RemoteSocket.Exists}}}}' 2>/dev/null | grep -q true; do sleep 1;
done" can hang indefinitely; modify this loop to include a timeout (e.g., a
MAX_WAIT_SECONDS variable or use the timeout command), increment a counter each
second and break/fail when exceeded, and on timeout print a clear error message
and exit non-zero so the container/job fails fast; update the same shell block
that contains the until ... podman info command to implement this timeout and
error handling.
agent-manager-service/Makefile (1)

57-62: Make remote spec downloads fail fast and reproducible.

Lines 59 and 69 fetch from moving release branches with curl -sL. A 404/500 still exits 0, regenerating later can pull a different spec without any repo change, and the fixed /tmp/*.yaml filenames can race across parallel runs. Prefer immutable revisions plus curl -fsSL and mktemp/trap cleanup for both targets.

♻️ Proposed pattern
-	@curl -sL "https://raw.githubusercontent.com/openchoreo/openchoreo/release-v0.16/openapi/openchoreo-api.yaml" -o /tmp/openchoreo-api.yaml
-	@oapi-codegen -config clients/openchoreosvc/gen/oapi-codegen.yaml /tmp/openchoreo-api.yaml
-	@oapi-codegen -config clients/openchoreosvc/gen/oapi-codegen-client.yaml /tmp/openchoreo-api.yaml
-	@rm /tmp/openchoreo-api.yaml
+	@spec_file=$$(mktemp); \
+	  trap 'rm -f "$$spec_file"' EXIT; \
+	  curl -fsSL "https://raw.githubusercontent.com/openchoreo/openchoreo/<pinned-commit>/openapi/openchoreo-api.yaml" -o "$$spec_file"; \
+	  oapi-codegen -config clients/openchoreosvc/gen/oapi-codegen.yaml "$$spec_file"; \
+	  oapi-codegen -config clients/openchoreosvc/gen/oapi-codegen-client.yaml "$$spec_file"

Apply the same pattern to gen-observer-client.

Also applies to: 67-72

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/Makefile` around lines 57 - 62, Update the
gen-oc-client and gen-observer-client make targets to fail fast and avoid races:
replace curl -sL with curl -fsSL, fetch the OpenAPI spec from an immutable
commit/tag (not moving release branch), write to a unique temporary file created
with mktemp, and register a trap to rm the temp file on exit; then pass that
temp filename to the existing oapi-codegen invocations (the targets
gen-oc-client and gen-observer-client reference the /tmp/*.yaml file and the
oapi-codegen -config calls), ensuring cleanup and non-zero exit on HTTP errors.
Makefile (1)

1-1: Static analysis hints: consider adding standard targets.

The static analyzer flags missing all, clean, and test phony targets. These are conventional Makefile targets that improve discoverability. Consider adding them if they would benefit the development workflow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 1, Add conventional phony targets to the Makefile to
satisfy static analysis: add "all" that depends on the default dev-up or build
target, "clean" that removes generated artifacts/containers/volumes used by your
workflow, and "test" that runs the project's test suite or CI checks; update the
existing .PHONY list to include all, clean, and test and ensure the
corresponding target names (all, clean, test) are implemented in the file so the
analyzer no longer flags them.
deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/google-buildpack-build.yaml (1)

66-66: Hardcoded lifecycle version may drift out of sync.

The lifecycle version 0.20.5 is hardcoded here, while the image is pulled dynamically from a Helm include. If the pulled lifecycle image version changes, this tag will become stale and cause runtime failures.

Consider parameterizing this version or deriving it from the same source as the LIFECYCLE_IMG variable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/google-buildpack-build.yaml`
at line 66, The podman tag command uses a hardcoded lifecycle version "0.20.5"
which can drift from the dynamically provided LIFECYCLE_IMG; change the tag to
be derived from the same source as LIFECYCLE_IMG (or parameterize it) so they
stay in sync. Update the tag invocation that references LIFECYCLE_IMG to compute
or use a LIFECYCLE_TAG/parameter (or extract the image tag from LIFECYCLE_IMG)
and use that variable instead of the fixed "0.20.5" so the pushed tag always
matches the pulled image.
console/workspaces/libs/shared-component/src/components/DeploymentConfig.tsx (1)

46-52: Consider reusing EnvVariableItem type.

The inline state type duplicates the EnvVariableItem interface exported from EnvironmentVariable.tsx. Reusing the exported type improves maintainability and ensures consistency.

♻️ Suggested fix
+import { EnvironmentVariable, EnvVariableItem } from "./EnvironmentVariable";
-import { EnvironmentVariable } from "./EnvironmentVariable";
 ...
-    const [envVariables, setEnvVariables] = useState<Array<{
-        key: string;
-        value: string;
-        isSensitive?: boolean;
-        secretRef?: string;
-        isSecretEdited?: boolean;
-    }>>([]);
+    const [envVariables, setEnvVariables] = useState<Array<EnvVariableItem>>([]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@console/workspaces/libs/shared-component/src/components/DeploymentConfig.tsx`
around lines 46 - 52, The state declaration for envVariables duplicates the
EnvVariableItem shape; import and use the exported EnvVariableItem type from
EnvironmentVariable (e.g., add an import for EnvVariableItem from
EnvironmentVariable.tsx) and change the useState generic to use
EnvVariableItem[] for envVariables and setEnvVariables so the component
references the single source of truth (EnvVariableItem) instead of the inline
type.
console/workspaces/libs/shared-component/src/components/EnvironmentVariable.tsx (2)

154-159: Avoid using array index as React key.

Using index as the key can cause issues when items are reordered or deleted, leading to incorrect component state preservation. Consider using a unique identifier like envVar.key combined with index as a fallback.

♻️ Suggested fix
-            <Card key={index} variant="outlined" sx={{ p: 2 }}>
+            <Card key={`${envVar.key}-${index}`} variant="outlined" sx={{ p: 2 }}>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@console/workspaces/libs/shared-component/src/components/EnvironmentVariable.tsx`
around lines 154 - 159, The Card elements are using the array index as the React
key (in the envVariables.map block), which can break state when items change;
update the key prop to use a stable unique identifier from the item (e.g.,
envVar.key or envVar.name) and only fall back to a deterministic composite with
the index (e.g., `${envVar.key}-${index}`) if the unique id is missing; adjust
the Card key assignment in the envVariables.map callback that references
editingIndex and isSecret to use that stable identifier instead of plain index.

55-58: Unused props declared in interface.

The props keyFieldsDisabled and isValueSecret are declared in EnvironmentVariableProps but are never destructured or used in the component implementation. Consider removing them if not needed, or implementing their intended behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@console/workspaces/libs/shared-component/src/components/EnvironmentVariable.tsx`
around lines 55 - 58, EnvironmentVariableProps declares keyFieldsDisabled and
isValueSecret but EnvironmentVariable doesn't use them; either remove these
unused props from EnvironmentVariableProps or implement their behavior by
destructuring keyFieldsDisabled and isValueSecret in the EnvironmentVariable
component and applying them: use keyFieldsDisabled to set the disabled attribute
(or readOnly) on the key input/control(s) (look for input, TextField or KeyInput
in the component) and use isValueSecret to render the value input as a
password/secret field or toggle masking (apply type="password" or secret
rendering to the value input control, e.g., ValueInput/textarea); update prop
types and any callers accordingly.
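If the intent is to implement the props rather than delete them, a framework-agnostic sketch of how the two flags could map onto input attributes (the helper name and return shape here are hypothetical, not the component's real markup):

```typescript
interface EnvironmentVariableProps {
    keyFieldsDisabled?: boolean;
    isValueSecret?: boolean;
}

// Derive the attributes the component would spread onto its key and
// value fields (e.g. MUI TextField props): disable the key input when
// keyFieldsDisabled is set, mask the value input when isValueSecret is set.
function deriveInputProps({ keyFieldsDisabled, isValueSecret }: EnvironmentVariableProps) {
    return {
        keyInputProps: { disabled: keyFieldsDisabled ?? false },
        valueInputProps: { type: isValueSecret ? "password" : "text" },
    };
}
```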
deployments/scripts/utils.sh (1)

64-100: Quote variable expansions to prevent word splitting.

Variables like ${CLUSTER_NAME} and ${CLUSTER_CONTEXT} should be quoted to handle values containing spaces or special characters safely.

♻️ Proposed fix
 refresh_kubeconfig() {
     echo "🔄 Refreshing kubeconfig..."
-    k3d kubeconfig merge ${CLUSTER_NAME} --kubeconfig-merge-default --kubeconfig-switch-context
+    k3d kubeconfig merge "${CLUSTER_NAME}" --kubeconfig-merge-default --kubeconfig-switch-context
 }

 wait_for_cluster() {
     echo "⏳ Waiting for cluster to be ready..."
     for i in {1..30}; do
-        if kubectl cluster-info --context ${CLUSTER_CONTEXT} --request-timeout=5s &>/dev/null; then
+        if kubectl cluster-info --context "${CLUSTER_CONTEXT}" --request-timeout=5s &>/dev/null; then
             echo "✅ Cluster is now ready"
             return 0
         fi
         echo "   Attempt $i/30..."
         sleep 2
     done
     return 1
 }

 ensure_cluster_accessible() {
     refresh_kubeconfig

     echo "🔍 Checking cluster accessibility..."
-    if kubectl cluster-info --context ${CLUSTER_CONTEXT} --request-timeout=10s &>/dev/null; then
+    if kubectl cluster-info --context "${CLUSTER_CONTEXT}" --request-timeout=10s &>/dev/null; then
         echo "✅ Cluster is running and accessible"
         return 0
     fi

     echo "⚠️  Cluster not accessible. Restarting..."
-    k3d cluster stop ${CLUSTER_NAME} 2>/dev/null || true
-    k3d cluster start ${CLUSTER_NAME}
+    k3d cluster stop "${CLUSTER_NAME}" 2>/dev/null || true
+    k3d cluster start "${CLUSTER_NAME}"

     refresh_kubeconfig
     wait_for_cluster
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/utils.sh` around lines 64 - 100, In refresh_kubeconfig,
wait_for_cluster, and ensure_cluster_accessible, quote all variable expansions
to prevent word-splitting and globbing: replace occurrences of ${CLUSTER_NAME}
and ${CLUSTER_CONTEXT} used in k3d and kubectl commands with "$CLUSTER_NAME" and
"$CLUSTER_CONTEXT" (e.g., in refresh_kubeconfig's k3d kubeconfig merge,
wait_for_cluster's kubectl cluster-info call, and ensure_cluster_accessible's
kubectl/k3d cluster stop/start and subsequent calls); keep the rest of the logic
unchanged.
deployments/scripts/setup-colima.sh (1)

51-55: Consider making Colima configuration overridable via environment variables.

The CPU, memory, and VM type are hardcoded. Consider allowing environment variable overrides for flexibility in different development environments.

♻️ Proposed enhancement
-COLIMA_CPU=4
-COLIMA_MEMORY=8
-COLIMA_VM_TYPE="vz"
+COLIMA_CPU="${COLIMA_CPU:-4}"
+COLIMA_MEMORY="${COLIMA_MEMORY:-8}"
+COLIMA_VM_TYPE="${COLIMA_VM_TYPE:-vz}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/setup-colima.sh` around lines 51 - 55, The colima start
invocation currently hardcodes VM type, CPU and memory via COLIMA_VM_TYPE,
COLIMA_CPU and COLIMA_MEMORY; make these values overridable by using environment
variable fallbacks (e.g., default values if not set) and reference them in the
colima start command so callers can set COLIMA_VM_TYPE, COLIMA_CPU,
COLIMA_MEMORY (or alternate names you choose) externally; ensure PROFILE remains
used and add a short comment or default assignment near the top of the script to
document the defaults for colima start.
deployments/scripts/setup-platform.sh (1)

49-51: Consider replacing fixed sleep with health check polling.

The fixed sleep 5 may be insufficient for slower environments or excessive for faster ones. A polling loop checking service health would be more robust.

♻️ Proposed enhancement with health check polling
 echo ""
 echo "⏳ Waiting for services to be healthy..."
-sleep 5
+MAX_WAIT=60
+WAIT_INTERVAL=2
+elapsed=0
+while [ $elapsed -lt $MAX_WAIT ]; do
+    if docker compose -f "$COMPOSE_FILE" ps | grep -q "healthy"; then
+        break
+    fi
+    sleep $WAIT_INTERVAL
+    elapsed=$((elapsed + WAIT_INTERVAL))
+done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/setup-platform.sh` around lines 49 - 51, Replace the
hardcoded sleep by polling actual service health: remove the single "sleep 5"
following the echo "⏳ Waiting for services to be healthy..." and implement a
retry loop that queries the services' health endpoints (e.g., via curl or
kubectl get endpoints) until they return success or a configurable timeout is
reached; check the same health endpoints for all required services, wait a short
interval between retries, log each attempt, and exit non‑zero if the timeout
elapses so callers can fail fast.
agent-manager-service/tests/monitor_test.go (1)

1159-1204: This test still validates API behavior, not DB cleanup.

After the delete, it only re-GETs through the API. That can still pass if rows remain soft-deleted or related records are orphaned, so either query the DB directly here or rename the test to match what it actually covers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/tests/monitor_test.go` around lines 1159 - 1204, The
test TestDeleteMonitor_DBCleanup currently validates API-level deletion only
(via apitestutils.MakeAppClientWithDeps and HTTP calls) but claims to verify DB
cleanup; either rename the test to reflect API behavior or update it to assert
DB state directly: after calling DELETE using the same HTTP flow, add a direct
database query (using the project/agent DB handle used by wiring.TestClients or
the repository methods that back the monitors) to check that monitor rows and
related records are actually removed (or not soft-deleted/orphaned), referencing
the same monitorName used in TestDeleteMonitor_DBCleanup; if you choose
renaming, change the test name and comment to e.g.
TestDeleteMonitor_APIBehavior.
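A sketch of the direct-DB variant of the assertion; the store interface below is a stand-in for whatever repository or DB handle the test wiring actually exposes, not a real API in this codebase:

```go
package main

import "fmt"

// monitorCounter stands in for the repository backing the monitors table;
// in the real test this would be the DB handle behind the test wiring.
type monitorCounter interface {
	CountByName(name string) (int, error)
}

type fakeStore map[string]int

func (s fakeStore) CountByName(name string) (int, error) { return s[name], nil }

// assertMonitorRowsDeleted is the kind of check the test should run after
// the HTTP DELETE, instead of (or in addition to) re-GETting via the API.
func assertMonitorRowsDeleted(db monitorCounter, name string) error {
	n, err := db.CountByName(name)
	if err != nil {
		return fmt.Errorf("querying monitor rows: %w", err)
	}
	if n != 0 {
		return fmt.Errorf("expected 0 rows for monitor %q after delete, found %d", name, n)
	}
	return nil
}

func main() {
	db := fakeStore{"stale-monitor": 1}
	fmt.Println(assertMonitorRowsDeleted(db, "stale-monitor"))
}
```

This catches soft-deleted or orphaned rows that an API-level GET would never see.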
deployments/scripts/setup-prerequisites.sh (1)

21-26: Validate kubectl and helm in this standalone script.

Unlike setup-k3d.sh, this file can be executed directly but never checks that its required binaries exist. Adding the same guards here keeps failures actionable.

🔍 Suggested change
 # Check prerequisites
+check_command kubectl
+check_command helm
 if ! kubectl cluster-info --context $CLUSTER_CONTEXT &> /dev/null; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/setup-prerequisites.sh` around lines 21 - 26, This script
lacks checks that required binaries exist before using them; add guard checks at
the top to verify kubectl and helm are on PATH (e.g. use "command -v kubectl
>/dev/null || { echo ...; exit 1; }" and similarly for helm) before the existing
kubectl cluster-info call so failures are actionable; keep messages consistent
with the current echo style and reference CLUSTER_CONTEXT only after confirming
kubectl exists.
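The check_command calls in the suggested change assume a helper that may not be sourced when this script runs standalone; a minimal definition it could carry (message style is an assumption to match the script's echo conventions):

```shell
#!/usr/bin/env bash
# Minimal guard; relies only on POSIX `command -v`.
check_command() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "❌ Required command '$1' not found in PATH" >&2
        return 1
    }
}

# In the script, call these before the kubectl cluster-info check:
#   check_command kubectl || exit 1
#   check_command helm || exit 1
check_command sh && echo "sh is available"
```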
agent-manager-service/tests/apitestutils/mock_clients.go (1)

161-163: Consider a more descriptive error message.

The generic fmt.Errorf("error") provides minimal context when debugging test failures. Consider including the secret path for better diagnostics:

 GetSecretFunc: func(ctx context.Context, secretPath string) (map[string]string, error) {
-    return nil, fmt.Errorf("error")
+    return nil, fmt.Errorf("secret not found at path: %s", secretPath)
 },

This is a minor nit since tests overriding this mock will provide their own implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/tests/apitestutils/mock_clients.go` around lines 161 -
163, Replace the generic error in the mock GetSecretFunc with a descriptive
message that includes the secretPath so failures show which secret failed;
update the anonymous function assigned to GetSecretFunc to return
fmt.Errorf("failed to get secret %s: %w", secretPath, err) style (or
fmt.Errorf("mock GetSecretFunc: secret %s not found", secretPath) if no wrapped
err) so test logs clearly identify the secret involved when GetSecretFunc is
invoked.
agent-manager-service/clients/openchoreosvc/client/components.go (5)

121-121: Remove commented-out dead code.

This commented line appears to be leftover from a previous implementation and should be removed to improve code clarity.

-	// containerPort, basePath := getInputInterfaceConfig(req)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/components.go` at line
121, Remove the dead commented-out line "containerPort, basePath :=
getInputInterfaceConfig(req)" from components.go to clean up leftover code;
locate the comment inside the relevant function or block where
getInputInterfaceConfig was previously used (search for that identifier or
nearby variable names) and delete the single commented line so the file contains
only active code and meaningful comments.

1415-1427: Redundant type conversions in helper functions.

The string(key) conversion in getAnnotation (line 1419) and getLabel (line 1426) is unnecessary since key is already of type string. These are no-op conversions.

♻️ Suggested cleanup
 func getAnnotation(annotations *map[string]string, key string) string {
 	if annotations == nil {
 		return ""
 	}
-	return (*annotations)[string(key)]
+	return (*annotations)[key]
 }

 func getLabel(labels *map[string]string, key string) string {
 	if labels == nil {
 		return ""
 	}
-	return (*labels)[string(key)]
+	return (*labels)[key]
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/components.go` around
lines 1415 - 1427, The getAnnotation and getLabel helpers use unnecessary no-op
conversions (string(key)) when key is already a string; update both functions
(getAnnotation and getLabel) to remove the redundant string(...) casts and
directly index the map with key (i.e., use (*annotations)[key] and
(*labels)[key]) while keeping the existing nil checks intact.

1232-1236: Missing endpoint URL produces potentially malformed URL.

If endpointURLs[endpointName] is not found (endpoint missing from release binding status), the URL will be constructed as just the basePath without a host, which may be invalid.

Consider handling the missing case explicitly:

🛡️ Suggested fix
+			endpointURL, hasURL := endpointURLs[endpointName]
+			fullURL := ""
+			if hasURL {
+				fullURL = fmt.Sprintf("%s%s", endpointURL, basePath)
+			}
 			details := models.EndpointsResponse{
 				Endpoint: models.Endpoint{
 					Name: endpointName,
-					URL:  fmt.Sprintf("%s%s", endpointURLs[endpointName], basePath),
+					URL:  fullURL,
 				},
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/components.go` around
lines 1232 - 1236, The code builds an endpoint URL using
endpointURLs[endpointName] without checking presence, so if the key is missing
the URL becomes just basePath; update the logic around creating
models.EndpointsResponse / models.Endpoint to first lookup endpointURLs using
val, ok := endpointURLs[endpointName] and handle the missing case (e.g., log or
return an error, skip adding this endpoint, or set Endpoint.URL to an explicit
empty/invalid-marker) rather than concatenating an absent host; adjust callers
of the endpoint construction accordingly to avoid emitting malformed URLs and
ensure consistent behavior when endpointName is not present.

296-297: Exported mutable slice variable can be accidentally modified.

DefaultEndpointVisibility is exported as a mutable slice. Any code importing this package can modify its contents, and Go slices are not safe for concurrent mutation, so shared mutable state like this can cause unexpected behavior.

Consider making it immutable by using a function that returns a fresh copy:

♻️ Suggested refactor
-// DefaultEndpointVisibility is the default visibility for endpoints
-var DefaultEndpointVisibility = []string{string(gen.WorkloadEndpointVisibilityExternal)}

+// DefaultEndpointVisibility returns the default visibility for endpoints
+func DefaultEndpointVisibility() []string {
+	return []string{string(gen.WorkloadEndpointVisibilityExternal)}
+}

Then update usages on lines 312 and 324:

-"visibility":    DefaultEndpointVisibility,
+"visibility":    DefaultEndpointVisibility(),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/components.go` around
lines 296 - 297, DefaultEndpointVisibility is an exported mutable slice that
callers can modify; make it immutable by replacing the exported variable with an
unexported const slice and an exported getter that returns a fresh copy.
Introduce an unexported variable defaultEndpointVisibility :=
[]string{string(gen.WorkloadEndpointVisibilityExternal)} and add func
DefaultEndpointVisibility() []string that allocates a new slice, copies
defaultEndpointVisibility into it, and returns the copy; then update all call
sites that referenced the variable to call DefaultEndpointVisibility() instead.

840-848: Silent continuation on conversion errors may hide data issues.

When convertComponentFromTyped fails, the error is logged but the component is silently skipped. This could mask data corruption or API contract violations. Consider either:

  1. Failing the entire operation for conversion errors
  2. Returning partial results with an indication of failures
  3. At minimum, including the skipped component count in the response metadata
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/components.go` around
lines 840 - 848, The loop silently skips components when
convertComponentFromTyped fails; change this to fail fast by returning an error
instead of continuing: when convertComponentFromTyped(&resp.JSON200.Items[i])
returns err, wrap the error with context (component name
resp.JSON200.Items[i].Metadata.Name and index i) and return from the enclosing
function (update its signature to return ([]*models.AgentResponse, error) if
needed) so callers can handle the failure; ensure you remove the continue path
and propagate the error instead of appending partial results to components.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ba07a147-470f-46e5-9211-bccea8dd0cd9

📥 Commits

Reviewing files that changed from the base of the PR and between 5d04694 and c548af6.

⛔ Files ignored due to path filters (5)
  • agent-manager-service/clients/observabilitysvc/gen/client.gen.go is excluded by !**/gen/**
  • agent-manager-service/clients/observabilitysvc/gen/types.gen.go is excluded by !**/gen/**
  • agent-manager-service/clients/openchoreosvc/gen/client.gen.go is excluded by !**/gen/**
  • agent-manager-service/clients/openchoreosvc/gen/types.gen.go is excluded by !**/gen/**
  • agent-manager-service/go.sum is excluded by !**/*.sum
📒 Files selected for processing (89)
  • Makefile
  • agent-manager-service/.env.example
  • agent-manager-service/Makefile
  • agent-manager-service/clients/clientmocks/openchoreo_client_fake.go
  • agent-manager-service/clients/clientmocks/secret_mgmt_client_fake.go
  • agent-manager-service/clients/observabilitysvc/observer-api.yaml
  • agent-manager-service/clients/openchoreosvc/client/builds.go
  • agent-manager-service/clients/openchoreosvc/client/client.go
  • agent-manager-service/clients/openchoreosvc/client/components.go
  • agent-manager-service/clients/openchoreosvc/client/deployments.go
  • agent-manager-service/clients/openchoreosvc/client/errors.go
  • agent-manager-service/clients/openchoreosvc/client/generic-workflows.go
  • agent-manager-service/clients/openchoreosvc/client/infrastructure.go
  • agent-manager-service/clients/openchoreosvc/client/projects.go
  • agent-manager-service/clients/openchoreosvc/client/resources.go
  • agent-manager-service/clients/openchoreosvc/client/secret_references.go
  • agent-manager-service/clients/openchoreosvc/client/types.go
  • agent-manager-service/clients/openchoreosvc/client/utils.go
  • agent-manager-service/clients/openchoreosvc/openchoreo-api.yaml
  • agent-manager-service/clients/secretmanagersvc/client.go
  • agent-manager-service/clients/secretmanagersvc/providers/openbao/client.go
  • agent-manager-service/clients/secretmanagersvc/providers/openbao/provider.go
  • agent-manager-service/clients/secretmanagersvc/types.go
  • agent-manager-service/config/config.go
  • agent-manager-service/config/config_loader.go
  • agent-manager-service/controllers/agent_controller.go
  • agent-manager-service/docs/api_v1_openapi.yaml
  • agent-manager-service/models/build_deployment.go
  • agent-manager-service/services/agent_configuration_service.go
  • agent-manager-service/services/agent_manager.go
  • agent-manager-service/services/llm_provider_service.go
  • agent-manager-service/services/monitor_executor.go
  • agent-manager-service/services/monitor_manager.go
  • agent-manager-service/services/monitor_scheduler.go
  • agent-manager-service/services/monitor_scheduler_test.go
  • agent-manager-service/spec/model_configuration_item.go
  • agent-manager-service/spec/model_environment_variable.go
  • agent-manager-service/tests/apitestutils/mock_clients.go
  • agent-manager-service/tests/create_agent_test.go
  • agent-manager-service/tests/monitor_executor_test.go
  • agent-manager-service/tests/monitor_test.go
  • agent-manager-service/utils/errors.go
  • agent-manager-service/wiring/wire.go
  • agent-manager-service/wiring/wire_gen.go
  • console/workspaces/libs/shared-component/src/components/DeploymentConfig.tsx
  • console/workspaces/libs/shared-component/src/components/EnvironmentVariable.tsx
  • console/workspaces/libs/types/src/api/common.ts
  • console/workspaces/libs/types/src/api/deployments.ts
  • console/workspaces/libs/views/src/component/EnvVariableEditor/EnvVariableEditor.tsx
  • console/workspaces/pages/add-new-agent/src/components/EnvironmentVariable.tsx
  • console/workspaces/pages/add-new-agent/src/form/schema.ts
  • console/workspaces/pages/add-new-agent/src/utils/buildAgentPayload.ts
  • deployments/docker-compose.yml
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/balleina-buildpack-build.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/checkout-source.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/docker-buildpack-build.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/generate-workload.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/google-buildpack-build.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/publish-image.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/workflow-templates/amp-ballerina-buildpack-ci.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/workflow-templates/amp-docker-ci.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/workflow-templates/amp-google-cloud-buildpack-ci.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/authz-cluster-role-binding.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-types/agent-api.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-ballerina-buildpack.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-docker.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-google-cloud-buildpacks.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/environment.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/gateway-tls-certificate.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/values.yaml
  • deployments/helm-charts/wso2-amp-thunder-extension/values.yaml
  • deployments/scripts/env.sh
  • deployments/scripts/generate-docker-kubeconfig.sh
  • deployments/scripts/setup-amp-thunder.sh
  • deployments/scripts/setup-colima.sh
  • deployments/scripts/setup-k3d.sh
  • deployments/scripts/setup-openchoreo.sh
  • deployments/scripts/setup-platform.sh
  • deployments/scripts/setup-prerequisites.sh
  • deployments/scripts/teardown.sh
  • deployments/scripts/utils.sh
  • deployments/single-cluster-config.yaml
  • deployments/single-cluster/values-bp.yaml
  • deployments/single-cluster/values-cp.yaml
  • deployments/single-cluster/values-dp.yaml
  • deployments/single-cluster/values-op.yaml
  • evaluation-job/Makefile
  • python-instrumentation-provider/Makefile
  • traces-observer-service/Makefile
💤 Files with no reviewable changes (10)
  • agent-manager-service/.env.example
  • deployments/single-cluster/values-bp.yaml
  • deployments/scripts/setup-amp-thunder.sh
  • deployments/scripts/generate-docker-kubeconfig.sh
  • agent-manager-service/clients/openchoreosvc/client/utils.go
  • agent-manager-service/clients/observabilitysvc/observer-api.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/workflow-templates/amp-google-cloud-buildpack-ci.yaml
  • agent-manager-service/clients/openchoreosvc/client/resources.go
  • deployments/helm-charts/wso2-amp-build-extension/templates/workflow-templates/amp-docker-ci.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/workflow-templates/amp-ballerina-buildpack-ci.yaml

Comment thread agent-manager-service/clients/openchoreosvc/client/deployments.go Outdated
Comment thread agent-manager-service/clients/openchoreosvc/client/secret_references.go Outdated
Comment thread agent-manager-service/docs/api_v1_openapi.yaml
Comment thread agent-manager-service/docs/api_v1_openapi.yaml
Comment thread agent-manager-service/services/agent_manager.go
Comment thread deployments/scripts/setup-k3d.sh
Comment thread deployments/scripts/setup-openchoreo.sh
Comment thread deployments/scripts/setup-prerequisites.sh Outdated
Comment thread deployments/single-cluster/values-dp.yaml
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (1)
deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-google-cloud-buildpacks.yaml (1)

148-149: ⚠️ Potential issue | 🟠 Major

Namespace mismatch between ExternalSecret and Workflow.

The ExternalSecret is created in ${metadata.namespace} (line 149) but the Workflow runs in ${metadata.namespaceName} (line 58). If these placeholders resolve to different values, the secret won't be mountable by workflow pods since Kubernetes secrets cannot be mounted across namespaces.

The publish-image step likely mounts this secret with optional: true, causing silent fallback to unauthenticated image pushes.

🔧 Proposed fix
         metadata:
           name: ${metadata.workflowRunName}-registry-push-secret
-          namespace: ${metadata.namespace}
+          namespace: ${metadata.namespaceName}
#!/bin/bash
# Verify how metadata.namespace vs metadata.namespaceName are used across workflow templates

echo "=== Checking namespace placeholder usage in component-workflows ==="
rg -n 'metadata\.namespace|metadata\.namespaceName' deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/ --type yaml

echo ""
echo "=== Checking if publish-image template uses registry-push-secret ==="
cat deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/publish-image.yaml | grep -A10 -B2 'registry-push-secret\|optional:'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-google-cloud-buildpacks.yaml`
around lines 148 - 149, The ExternalSecret is created in ${metadata.namespace}
while the Workflow runs in ${metadata.namespaceName}, causing a namespace
mismatch that prevents the workflow pods from mounting the secret; locate the
ExternalSecret resource (name: ${metadata.workflowRunName}-registry-push-secret)
and make its namespace match the workflow namespace placeholder (use
${metadata.namespaceName} instead of ${metadata.namespace}), and also inspect
the publish-image template (registry-push-secret reference) to ensure the secret
mount is not set optional: true—set optional to false or remove the optional
flag so image push fails loudly if the secret is unavailable.
🧹 Nitpick comments (4)
agent-manager-service/clients/openchoreosvc/client/secret_references.go (1)

152-163: Consider extracting shared helper for data source construction.

The dataSources and labels building logic (lines 154-169) is duplicated from CreateSecretReference. Extracting a small helper could reduce repetition:

func buildSecretDataSources(kvPath string, keys []string) []gen.SecretDataSource { ... }
func buildSecretReferenceLabels(projectName, componentName string) map[string]string { ... }

This is minor since both functions remain clear as-is.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/secret_references.go`
around lines 152 - 163, The dataSources and labels construction in
UpdateSecretReference duplicates logic in CreateSecretReference; extract two
small helpers—e.g., buildSecretDataSources(kvPath string, keys []string)
[]gen.SecretDataSource and buildSecretReferenceLabels(projectName, componentName
string) map[string]string—move the loop that creates gen.SecretDataSource (using
RemoteReference with Property set to each key) into buildSecretDataSources and
the label map creation into buildSecretReferenceLabels, then replace the inline
code in both UpdateSecretReference and CreateSecretReference with calls to those
helpers to remove duplication and keep behavior identical.
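A sketch of the two helpers; the struct fields and label keys below are illustrative, not the real gen.SecretDataSource shape or the labels this service actually applies:

```go
package main

import "fmt"

// secretDataSource approximates gen.SecretDataSource for illustration.
type secretDataSource struct {
	RemoteKey string // the KV path
	Property  string // the key within the KV entry
}

// buildSecretDataSources centralizes the loop currently duplicated
// between CreateSecretReference and UpdateSecretReference.
func buildSecretDataSources(kvPath string, keys []string) []secretDataSource {
	sources := make([]secretDataSource, 0, len(keys))
	for _, k := range keys {
		sources = append(sources, secretDataSource{RemoteKey: kvPath, Property: k})
	}
	return sources
}

// buildSecretReferenceLabels centralizes the label map construction.
func buildSecretReferenceLabels(projectName, componentName string) map[string]string {
	return map[string]string{
		"project-name":   projectName, // label keys are assumptions
		"component-name": componentName,
	}
}

func main() {
	fmt.Println(buildSecretDataSources("kv/app/secrets", []string{"API_KEY", "DB_PASS"}))
	fmt.Println(buildSecretReferenceLabels("proj", "comp"))
}
```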
agent-manager-service/services/agent_manager.go (2)

1601-1625: Preserved secrets that don't exist in KV are silently dropped.

When GetSecret returns ErrSecretNotFound or when a preserved key doesn't exist in existingData, the loop at lines 1612-1618 silently skips that key. This means if a secret was deleted from OpenBao (manually or by another process), the "preserved" reference will be lost without warning.

Consider adding a warning log when a preserved key is missing from KV:

🔧 Suggested improvement
 for _, key := range preservedSecretKeys {
     if existingData != nil {
         if val, ok := existingData[key]; ok {
             dataToWrite[key] = val
+        } else {
+            s.logger.Warn("Preserved secret key not found in KV, will be removed", "key", key, "kvPath", kvPath)
         }
+    } else {
+        s.logger.Warn("No existing secrets found in KV, preserved keys will be removed", "preservedKeys", preservedSecretKeys, "kvPath", kvPath)
+        break
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/services/agent_manager.go` around lines 1601 - 1625,
When merging preserved secrets in the block that calls
s.secretMgmtClient.GetSecret (handling secretmanagersvc.ErrSecretNotFound), add
explicit warning logs: log a warning if GetSecret returned ErrSecretNotFound
(include kvPath and preservedSecretKeys) and log a per-key warning inside the
loop when a preserved key from preservedSecretKeys is not found in existingData
(include the missing key and kvPath), while still continuing to build
dataToWrite from newSecretData so behavior remains unchanged.

1559-1582: Best-effort cleanup may leave orphaned resources.

When totalSecretCount == 0, both KV deletion and SecretReference deletion failures are logged as warnings but don't cause the function to return an error. This means a deployment could succeed while leaving orphaned secrets in KV or orphaned SecretReference CRs.

This appears intentional to avoid blocking deployments, but consider whether these failures should be surfaced differently (e.g., returned as a non-fatal warning to the caller).
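If the failures are to be surfaced, one way is to accumulate them with `errors.Join` (Go 1.20+) so both deletions are still attempted. A minimal sketch, with hypothetical stand-ins for the real `s.secretMgmtClient.DeleteSecretByPath` and `s.ocClient.DeleteSecretReference` calls:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the two delete calls named in the finding.
func deleteKV(path string) error        { return fmt.Errorf("kv delete failed: %s", path) }
func deleteSecretRef(name string) error { return nil }

// cleanupSecrets attempts both deletions and returns the combined error,
// so the caller sees cleanup failures instead of a silent nil.
func cleanupSecrets(kvPath, secretRefName string) error {
	var errs []error
	if err := deleteKV(kvPath); err != nil {
		errs = append(errs, fmt.Errorf("delete KV path %q: %w", kvPath, err))
	}
	if err := deleteSecretRef(secretRefName); err != nil {
		errs = append(errs, fmt.Errorf("delete SecretReference %q: %w", secretRefName, err))
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	err := cleanupSecrets("secret/data/app", "app-secret-ref")
	fmt.Println(err != nil) // the stubbed KV delete fails, so err is non-nil
}
```

`errors.Join` returns nil for an empty slice, so the happy path is unchanged; callers that must tolerate cleanup failure can still log and discard the result.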

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/services/agent_manager.go` around lines 1559 - 1582,
The cleanup path when totalSecretCount == 0 currently logs failures from
s.secretMgmtClient.DeleteSecretByPath and s.ocClient.DeleteSecretReference as
warnings and still returns nil, which can leave orphaned resources; change this
to collect any delete errors (from location.KVPath(), DeleteSecretByPath and
DeleteSecretReference for secretRefName) into a single returned error (or a
wrapped/multi-error) so the caller can see cleanup failures while still
attempting both deletions, e.g., attempt KV deletion then CR deletion,
accumulate any non-nil errs and return a combined error instead of always
returning nil after logging.
deployments/scripts/setup-prerequisites.sh (1)

88-101: Avoid checking a fixed admin password into the fake secret store.

This stores the OpenSearch credential in cleartext in a cluster-scoped resource, and every local cluster gets the same admin / ThisIsTheOpenSearchPassword1 pair. At minimum, source these values from env.sh or generate a per-cluster value during setup instead of committing them here.

🔐 One way to externalize the values
+: "${OPENSEARCH_USERNAME:?set OPENSEARCH_USERNAME in env.sh}"
+: "${OPENSEARCH_PASSWORD:?set OPENSEARCH_PASSWORD in env.sh}"
+
 if kubectl --context "${CLUSTER_CONTEXT}" apply -f - <<EOF
 apiVersion: external-secrets.io/v1
 kind: ClusterSecretStore
 metadata:
   name: default
 spec:
   provider:
     fake:
       data:
       - key: opensearch-username
-        value: "admin"
+        value: "${OPENSEARCH_USERNAME}"
       - key: opensearch-password
-        value: "ThisIsTheOpenSearchPassword1"
+        value: "${OPENSEARCH_PASSWORD}"
 EOF
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/setup-prerequisites.sh` around lines 88 - 101, The
ClusterSecretStore YAML written by the kubectl apply here-doc contains hardcoded
opensearch-username and opensearch-password values; change the script so the
values are not committed: read OPENSEARCH_USERNAME/OPENSEARCH_PASSWORD from an
external env file (e.g., source env.sh) or generate a per-cluster random
password at runtime (e.g., create a random value and export it), then inject
those shell variables into the here-doc used by the kubectl --context
"${CLUSTER_CONTEXT}" apply -f - <<EOF so the YAML uses "${OPENSEARCH_USERNAME}"
and "${OPENSEARCH_PASSWORD}" (and persist or print the generated password for
operator use) instead of the fixed "admin"/"ThisIsTheOpenSearchPassword1".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agent-manager-service/clients/openchoreosvc/client/deployments.go`:
- Around line 317-320: The code currently sets lastDeployedAt from
binding.Metadata.CreationTimestamp (used for LastDeployedAt) which doesn't
change on redeploy; instead inspect binding.Status.Conditions to find the most
recent transition time for the deployment condition (e.g., condition.Type ==
"Ready" or the appropriate release condition) and use that condition's
LastTransitionTime as lastDeployedAt; update the logic around where
lastDeployedAt is assigned (replace use of binding.Metadata.CreationTimestamp)
and ensure it falls back to CreationTimestamp only if no suitable
Status.Conditions entry is found — reference binding.Status.Conditions,
LastTransitionTime, binding.Metadata.CreationTimestamp, and
UpdateReleaseBindingWithResponse to locate related flows.

In
`@deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/balleina-buildpack-build.yaml`:
- Around line 38-39: The podman readiness loop (starting the background "podman
system service --time=0 &" and the "until podman info --format
'{{.Host.RemoteSocket.Exists}}' ... | grep -q true; do sleep 1; done" loop) can
hang forever; modify this block to bound the wait with a timeout (e.g., track
elapsed seconds or use a wait-until-with-timeout pattern) and exit non‑zero with
a clear error message when the timeout elapses so the workflow step fails fast
and surfaces the startup problem.
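A bounded-wait helper along these lines would replace the unbounded `until` loop. The check command is parameterized so this sketch stays runnable; in the template it would be the `podman info ... | grep -q true` probe:

```shell
#!/usr/bin/env bash
# Bounded readiness wait: retry a check command for up to N seconds, then fail
# loudly so the workflow step surfaces the startup problem instead of hanging.
wait_until() {
    local timeout="$1"; shift
    local elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "ERROR: readiness check did not pass within ${timeout}s" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

wait_until 5 true && echo "ready"
wait_until 2 false || echo "timed out as expected"
```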

In
`@deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/google-buildpack-build.yaml`:
- Line 75: The CMD_ARGS assignment currently contains unescaped inner double
quotes around $RUN_IMG which break the quoted string; update the CMD_ARGS value
(the CMD_ARGS variable assignment) to either remove the inner quotes around
$RUN_IMG or escape them (so the outer quotes remain intact), e.g., use
--run-image $RUN_IMG or --run-image \"$RUN_IMG\" to ensure the argument string
is constructed correctly.
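Both fixes can be seen side by side in a small quoting demo (the image reference below is illustrative):

```shell
#!/usr/bin/env bash
# Inside a double-quoted string, unescaped inner double quotes terminate the
# string early; either drop them or escape them with backslashes.
RUN_IMG="gcr.io/buildpacks/run:latest"

# Fixed: no inner quotes needed, the outer quotes already protect the expansion.
CMD_ARGS="--run-image $RUN_IMG"
echo "$CMD_ARGS"

# Equivalent with escaped inner quotes, as suggested in the review.
CMD_ARGS_ESCAPED="--run-image \"$RUN_IMG\""
echo "$CMD_ARGS_ESCAPED"
```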

In
`@deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-ballerina-buildpack.yaml`:
- Line 55: The ExternalSecret resource is created in ${metadata.namespace} while
the workflow and the publish-image step run in ${metadata.namespaceName},
causing a namespace mismatch; update the ExternalSecret's namespace field to use
${metadata.namespaceName} (or otherwise ensure both the workflow and the
ExternalSecret use the same namespace variable) so the publish-image step can
access the secret; locate the ExternalSecret resource definition and the
publish-image step in the template and make the namespace variable consistent
(reference ${metadata.namespaceName} in the ExternalSecret or change the
workflow namespace to ${metadata.namespace}) to resolve the scope mismatch.

In `@deployments/scripts/setup-prerequisites.sh`:
- Around line 46-49: The helm_install_if_not_exists helper currently omits kube
context so Helm may act on the caller's active context; update the function
(helm_install_if_not_exists) to add --kube-context "${CLUSTER_CONTEXT}" to the
helm status check and to the helm install command so both commands explicitly
target the verified ${CLUSTER_CONTEXT}; ensure both the conditional "helm status
..." invocation and the "helm install ..." invocation include the --kube-context
"${CLUSTER_CONTEXT}" argument.

---

Duplicate comments:
In
`@deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-google-cloud-buildpacks.yaml`:
- Around line 148-149: The ExternalSecret is created in ${metadata.namespace}
while the Workflow runs in ${metadata.namespaceName}, causing a namespace
mismatch that prevents the workflow pods from mounting the secret; locate the
ExternalSecret resource (name: ${metadata.workflowRunName}-registry-push-secret)
and make its namespace match the workflow namespace placeholder (use
${metadata.namespaceName} instead of ${metadata.namespace}), and also inspect
the publish-image template (registry-push-secret reference) to ensure the secret
mount is not set optional: true—set optional to false or remove the optional
flag so image push fails loudly if the secret is unavailable.

---

Nitpick comments:
In `@agent-manager-service/clients/openchoreosvc/client/secret_references.go`:
- Around line 152-163: The dataSources and labels construction in
UpdateSecretReference duplicates logic in CreateSecretReference; extract two
small helpers—e.g., buildSecretDataSources(kvPath string, keys []string)
[]gen.SecretDataSource and buildSecretReferenceLabels(projectName, componentName
string) map[string]string—move the loop that creates gen.SecretDataSource (using
RemoteReference with Property set to each key) into buildSecretDataSources and
the label map creation into buildSecretReferenceLabels, then replace the inline
code in both UpdateSecretReference and CreateSecretReference with calls to those
helpers to remove duplication and keep behavior identical.

In `@agent-manager-service/services/agent_manager.go`:
- Around line 1601-1625: When merging preserved secrets in the block that calls
s.secretMgmtClient.GetSecret (handling secretmanagersvc.ErrSecretNotFound), add
explicit warning logs: log a warning if GetSecret returned ErrSecretNotFound
(include kvPath and preservedSecretKeys) and log a per-key warning inside the
loop when a preserved key from preservedSecretKeys is not found in existingData
(include the missing key and kvPath), while still continuing to build
dataToWrite from newSecretData so behavior remains unchanged.
- Around line 1559-1582: The cleanup path when totalSecretCount == 0 currently
logs failures from s.secretMgmtClient.DeleteSecretByPath and
s.ocClient.DeleteSecretReference as warnings and still returns nil, which can
leave orphaned resources; change this to collect any delete errors (from
location.KVPath(), DeleteSecretByPath and DeleteSecretReference for
secretRefName) into a single returned error (or a wrapped/multi-error) so the
caller can see cleanup failures while still attempting both deletions, e.g.,
attempt KV deletion then CR deletion, accumulate any non-nil errs and return a
combined error instead of always returning nil after logging.

In `@deployments/scripts/setup-prerequisites.sh`:
- Around line 88-101: The ClusterSecretStore YAML written by the kubectl apply
here-doc contains hardcoded opensearch-username and opensearch-password values;
change the script so the values are not committed: read
OPENSEARCH_USERNAME/OPENSEARCH_PASSWORD from an external env file (e.g., source
env.sh) or generate a per-cluster random password at runtime (e.g., create a
random value and export it), then inject those shell variables into the here-doc
used by the kubectl --context "${CLUSTER_CONTEXT}" apply -f - <<EOF so the YAML
uses "${OPENSEARCH_USERNAME}" and "${OPENSEARCH_PASSWORD}" (and persist or print
the generated password for operator use) instead of the fixed
"admin"/"ThisIsTheOpenSearchPassword1".

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a25056fd-f677-4557-954e-33c572cd2ccd

📥 Commits

Reviewing files that changed from the base of the PR and between c548af6 and 54298b3.

📒 Files selected for processing (11)
  • agent-manager-service/clients/openchoreosvc/client/deployments.go
  • agent-manager-service/clients/openchoreosvc/client/secret_references.go
  • agent-manager-service/services/agent_manager.go
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/balleina-buildpack-build.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/google-buildpack-build.yaml
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/publish-image.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-ballerina-buildpack.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-docker.yaml
  • deployments/helm-charts/wso2-amp-platform-resources-extension/templates/component-workflows/amp-google-cloud-buildpacks.yaml
  • deployments/scripts/setup-k3d.sh
  • deployments/scripts/setup-prerequisites.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/publish-image.yaml


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (1)
agent-manager-service/clients/openchoreosvc/client/deployments.go (1)

67-95: ⚠️ Potential issue | 🔴 Critical

Allocate an empty env slice for the clear-all path.

Line 69 still uses var envVars []gen.EnvVar, so an explicit empty req.Env leaves the slice nil. That makes workload.Spec.Container.Env = &envVars serialize as null rather than [], which breaks the advertised “empty slice means clear all” behavior and can leave stale env vars behind.

Patch sketch
-	if req.Env != nil {
-		var envVars []gen.EnvVar
+	if req.Env != nil {
+		envVars := make([]gen.EnvVar, 0, len(req.Env))
 		for _, env := range req.Env {
 			genEnvVar := gen.EnvVar{
 				Key: env.Key,
 			}
In Go's encoding/json package, how are a nil slice, an empty slice, and a non-nil pointer to a nil slice marshaled to JSON?
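The question above can be answered by running `json.Marshal` directly: a nil slice marshals to `null`, an empty slice to `[]`, and a non-nil pointer to a nil slice is dereferenced and again yields `null`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshal returns the JSON encoding of v as a string, ignoring errors for brevity.
func marshal(v any) string {
	b, _ := json.Marshal(v)
	return string(b)
}

func main() {
	var nilSlice []int
	emptySlice := make([]int, 0)
	ptrToNil := &nilSlice

	fmt.Println(marshal(nilSlice))   // null
	fmt.Println(marshal(emptySlice)) // []
	fmt.Println(marshal(ptrToNil))   // null (the pointer is followed to a nil slice)
}
```

This is exactly why `var envVars []gen.EnvVar` plus `workload.Spec.Container.Env = &envVars` serializes as `null` rather than `[]` on the clear-all path.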
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/deployments.go` around
lines 67 - 95, The env-building code in deployments.go uses "var envVars
[]gen.EnvVar" which leaves envVars nil when req.Env is an explicit empty slice;
allocate an actual empty slice instead so clearing envs serializes as [] not
null. Replace the declaration in the block that builds envVars (around the loop
that creates genEnvVar and sets workload.Spec.Container.Env) with an explicit
non-nil empty slice (e.g., make([]gen.EnvVar, 0)) so workload.Spec.Container.Env
= &envVars yields [] in JSON; update only the envVars declaration inside that
branch of the function handling req.Env.
🧹 Nitpick comments (4)
deployments/scripts/utils.sh (4)

24-31: Add error handling for helm install failure.

If helm install fails, the function still prints the success message. Consider checking the exit status.

Proposed fix
     echo "📦 Installing $release_name..."
-    helm install "$release_name" "$chart" \
+    if helm install "$release_name" "$chart" \
         --namespace "$namespace" \
         --create-namespace \
         --kube-context "${CLUSTER_CONTEXT}" \
-        "${extra_args[@]}"
-    echo "✅ $release_name installed successfully"
+        "${extra_args[@]}"; then
+        echo "✅ $release_name installed successfully"
+    else
+        echo "❌ Failed to install $release_name"
+        return 1
+    fi
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/utils.sh` around lines 24 - 31, The helm install block
prints success regardless of failure; update the function that runs helm install
(the block invoking helm install with "$release_name" "$chart" and args using
$namespace, ${CLUSTER_CONTEXT}, and "${extra_args[@]}") to check the exit status
($?) or use set -o pipefail && command || capture exit code, and if helm fails,
print a clear error message including $release_name and the helm exit
code/output and exit non-zero (or return the error) instead of printing the
success line; only print "✅ $release_name installed successfully" when the
install command succeeded.

42-46: Consider using jq for JSON parsing.

Parsing JSON with grep/sed is fragile and could break with formatting changes. If jq is available (common in deployment environments), it's more robust.

Proposed fix using jq
     local nodes
-    nodes=$(echo "$json_output" \
-        | grep -o '"name"[[:space:]]*:[[:space:]]*"[^"]*"' \
-        | sed 's/"name"[[:space:]]*:[[:space:]]*"//;s/"$//' \
-        | grep "^k3d-${cluster_name}-")
+    nodes=$(echo "$json_output" | jq -r '.[].name' | grep "^k3d-${cluster_name}-")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/utils.sh` around lines 42 - 46, The current extraction of
node names from json_output using grep/sed into the nodes variable is fragile;
replace that pipeline with a jq-based extraction that reads json_output and
selects .items[].metadata.name, then filters names that start with
"k3d-${cluster_name}-" to populate the nodes variable; update the assignment to
nodes and ensure json_output and cluster_name are used as inputs to the jq
filter so the new command produces the same newline-separated list of node
names.

308-309: Consider documenting or mitigating eval usage.

Using eval "$func" can be risky if inputs are not controlled. While this is an internal utility where tasks come from trusted code, consider adding a comment about this assumption or using a safer invocation pattern.

Alternative using direct invocation (if functions don't require complex argument parsing)
         # Run function in background, capturing output
-        eval "$func" > "$log_file" 2>&1 &
+        # Note: Tasks must be trusted as they are executed directly
+        $func > "$log_file" 2>&1 &

If functions need complex arguments, the current eval approach is acceptable for internal scripts, but consider documenting the trust requirement in the function header.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/utils.sh` around lines 308 - 309, The use of eval "$func"
> "$log_file" 2>&1 & is potentially unsafe — update the utils.sh location where
the variables func and log_file are used: either (A) add a clear comment above
that line documenting the trust assumption (that func comes only from
internal/trusted code) and why eval is acceptable, or (B) replace eval with a
safer direct invocation pattern (call the target function by name with explicit
arguments or use a command array to avoid shell parsing) so arbitrary input
can't be executed; pick one approach and apply it to the background-run helper
that spawns commands.

65-101: Quote variable expansions and document external dependencies.

Variables ${CLUSTER_NAME} and ${CLUSTER_CONTEXT} are used unquoted in multiple places, which could cause word-splitting issues. Additionally, these external dependencies should be validated or documented.

Proposed fix
 # Util: Refresh kubeconfig for k3d cluster
+# Requires: CLUSTER_NAME to be set
 refresh_kubeconfig() {
     echo "🔄 Refreshing kubeconfig..."
-    k3d kubeconfig merge ${CLUSTER_NAME} --kubeconfig-merge-default --kubeconfig-switch-context
+    k3d kubeconfig merge "${CLUSTER_NAME}" --kubeconfig-merge-default --kubeconfig-switch-context
 }

 # Util: Wait for cluster to be ready (max 30 attempts, 2s interval)
+# Requires: CLUSTER_CONTEXT to be set
 wait_for_cluster() {
     echo "⏳ Waiting for cluster to be ready..."
     for i in {1..30}; do
-        if kubectl cluster-info --context ${CLUSTER_CONTEXT} --request-timeout=5s &>/dev/null; then
+        if kubectl cluster-info --context "${CLUSTER_CONTEXT}" --request-timeout=5s &>/dev/null; then
             echo "✅ Cluster is now ready"
             return 0
         fi
@@ -85,15 +88,16 @@
 }

 # Util: Ensure cluster is accessible (refresh kubeconfig + wait)
+# Requires: CLUSTER_NAME and CLUSTER_CONTEXT to be set
 ensure_cluster_accessible() {
     refresh_kubeconfig

     echo "🔍 Checking cluster accessibility..."
-    if kubectl cluster-info --context ${CLUSTER_CONTEXT} --request-timeout=10s &>/dev/null; then
+    if kubectl cluster-info --context "${CLUSTER_CONTEXT}" --request-timeout=10s &>/dev/null; then
         echo "✅ Cluster is running and accessible"
         return 0
     fi

     echo "⚠️  Cluster not accessible. Restarting..."
-    k3d cluster stop ${CLUSTER_NAME} 2>/dev/null || true
-    k3d cluster start ${CLUSTER_NAME}
+    k3d cluster stop "${CLUSTER_NAME}" 2>/dev/null || true
+    k3d cluster start "${CLUSTER_NAME}"

     refresh_kubeconfig
     wait_for_cluster
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deployments/scripts/utils.sh` around lines 65 - 101, The functions
refresh_kubeconfig, wait_for_cluster, and ensure_cluster_accessible use unquoted
expansions (${CLUSTER_NAME} and ${CLUSTER_CONTEXT}) and lack validation of
external dependencies; update these functions to quote all variable expansions
(e.g., "${CLUSTER_NAME}" and "${CLUSTER_CONTEXT}"), add a pre-check that
required env vars are set (return with a clear error if [ -z "${CLUSTER_NAME}" ]
or [ -z "${CLUSTER_CONTEXT}" ]), and verify external tools are present (use
command -v kubectl and command -v k3d and error out if missing); also add a
short comment near these functions documenting the external dependencies (k3d,
kubectl) and that CLUSTER_NAME/CLUSTER_CONTEXT must be exported.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agent-manager-service/clients/openchoreosvc/client/deployments.go`:
- Around line 247-249: The helper determineDeploymentStatus incorrectly returns
DeploymentStatusNotDeployed whenever binding.Status or binding.Status.Conditions
is nil; update it so only a nil binding (no ReleaseBinding found) returns
DeploymentStatusNotDeployed, while an existing ReleaseBinding with a nil or
empty Status/Conditions should return the pending state (e.g.,
DeploymentStatusPending or equivalent) so fresh/reconciling deployments are
shown as pending; locate function determineDeploymentStatus and adjust the nil
checks around gen.ReleaseBinding, binding.Status, and binding.Status.Conditions
and swap the returned constant accordingly.
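The intended nil-check ordering can be sketched with simplified stand-in types; the real generated `gen.ReleaseBinding` and status constants will differ:

```go
package main

import "fmt"

// Simplified stand-ins for the generated binding types named in the finding.
type Status struct{ Conditions []string }
type ReleaseBinding struct{ Status *Status }

const (
	StatusNotDeployed = "NotDeployed"
	StatusPending     = "Pending"
	StatusReady       = "Ready"
)

// determineDeploymentStatus: only a missing binding means "not deployed";
// a binding whose status has not been populated yet is still reconciling.
func determineDeploymentStatus(binding *ReleaseBinding) string {
	if binding == nil {
		return StatusNotDeployed
	}
	if binding.Status == nil || len(binding.Status.Conditions) == 0 {
		return StatusPending
	}
	return StatusReady // the real code would inspect the conditions here
}

func main() {
	fmt.Println(determineDeploymentStatus(nil))               // NotDeployed
	fmt.Println(determineDeploymentStatus(&ReleaseBinding{})) // Pending
}
```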

In `@deployments/scripts/utils.sh`:
- Around line 1-9: Add a shebang line (#!/usr/bin/env bash) at the top of the
script to document the expected shell and satisfy linters, and update the
check_command() function's install hint so it is portable: replace the hardcoded
"brew install" suggestion with a generic, OS-agnostic message (e.g., "install
via your package manager: apt/yum/dnf/brew") or include conditional guidance
based on uname/OSTYPE; ensure the function still exits with non-zero status on
missing commands.
- Line 207: The YAML key assignment uses an unquoted shell variable observer_url
(observerURL: $observer_url), which can break if the URL contains special
characters; update the template to quote the variable value so the generated
YAML contains observerURL: "$observer_url" (or single quotes if preferred) to
ensure proper quoting/escaping when rendered by the shell and YAML parser.
- Around line 268-285: The script currently assigns CA_CRT, TLS_CRT and TLS_KEY
from kubectl get and may proceed with empty values if the source ConfigMap or
Secret is missing; modify the block that retrieves CA_CRT (kubectl get configmap
cluster-gateway-ca) and TLS_CRT/TLS_KEY (kubectl get secret cluster-gateway-ca)
to check for command failures and empty values: capture kubectl exit status and
validate that CA_CRT, TLS_CRT and TLS_KEY are non-empty after base64 decoding,
log a clear error (e.g. via echo or a logging function) and exit non-zero if any
are missing, and avoid applying the new ConfigMap/Secret when validation fails
so that create secret/configmap steps (the kubectl create ... | kubectl apply -f
- lines) are only executed when CA_CRT, TLS_CRT and TLS_KEY are present and
valid.
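A fail-fast validation helper for that block might look like the sketch below. `kubectl` is stubbed with a fixed base64 payload so the sketch is runnable; in the real script the value comes from the `cluster-gateway-ca` ConfigMap/Secret:

```shell
#!/usr/bin/env bash
# Validate cert material before applying it: exit non-zero with a clear error
# when any required value is empty, instead of silently applying empty data.
kubectl() { printf 'LS0tLS1CRUdJTi0tLS0t'; }   # stub: base64 of "-----BEGIN-----"

CA_CRT="$(kubectl get configmap cluster-gateway-ca -o jsonpath='{.data.ca\.crt}' | base64 -d)"

require_nonempty() {
    local name="$1" value="$2"
    if [ -z "$value" ]; then
        echo "ERROR: $name is empty; refusing to apply certs" >&2
        exit 1
    fi
}

require_nonempty CA_CRT "$CA_CRT"
echo "CA_CRT validated"
```

The same `require_nonempty` guard would run for TLS_CRT and TLS_KEY before any `kubectl create ... | kubectl apply -f -` step.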

---

Duplicate comments:
In `@agent-manager-service/clients/openchoreosvc/client/deployments.go`:
- Around line 67-95: The env-building code in deployments.go uses "var envVars
[]gen.EnvVar" which leaves envVars nil when req.Env is an explicit empty slice;
allocate an actual empty slice instead so clearing envs serializes as [] not
null. Replace the declaration in the block that builds envVars (around the loop
that creates genEnvVar and sets workload.Spec.Container.Env) with an explicit
non-nil empty slice (e.g., make([]gen.EnvVar, 0)) so workload.Spec.Container.Env
= &envVars yields [] in JSON; update only the envVars declaration inside that
branch of the function handling req.Env.

---

Nitpick comments:
In `@deployments/scripts/utils.sh`:
- Around line 24-31: The helm install block prints success regardless of
failure; update the function that runs helm install (the block invoking helm
install with "$release_name" "$chart" and args using $namespace,
${CLUSTER_CONTEXT}, and "${extra_args[@]}") to check the exit status ($?) or use
set -o pipefail && command || capture exit code, and if helm fails, print a
clear error message including $release_name and the helm exit code/output and
exit non-zero (or return the error) instead of printing the success line; only
print "✅ $release_name installed successfully" when the install command
succeeded.
- Around line 42-46: The current extraction of node names from json_output using
grep/sed into the nodes variable is fragile; replace that pipeline with a
jq-based extraction that reads json_output and selects .[].name (a top-level
array of node objects, matching the suggested diff),
then filters names that start with "k3d-${cluster_name}-" to populate the nodes
variable; update the assignment to nodes and ensure json_output and cluster_name
are used as inputs to the jq filter so the new command produces the same
newline-separated list of node names.
- Around line 308-309: The use of eval "$func" > "$log_file" 2>&1 & is
potentially unsafe — update the utils.sh location where the variables func and
log_file are used: either (A) add a clear comment above that line documenting
the trust assumption (that func comes only from internal/trusted code) and why
eval is acceptable, or (B) replace eval with a safer direct invocation pattern
(call the target function by name with explicit arguments or use a command array
to avoid shell parsing) so arbitrary input can't be executed; pick one approach
and apply it to the background-run helper that spawns commands.
- Around line 65-101: The functions refresh_kubeconfig, wait_for_cluster, and
ensure_cluster_accessible use unquoted expansions (${CLUSTER_NAME} and
${CLUSTER_CONTEXT}) and lack validation of external dependencies; update these
functions to quote all variable expansions (e.g., "${CLUSTER_NAME}" and
"${CLUSTER_CONTEXT}"), add a pre-check that required env vars are set (return
with a clear error if [ -z "${CLUSTER_NAME}" ] or [ -z "${CLUSTER_CONTEXT}" ]),
and verify external tools are present (use command -v kubectl and command -v k3d
and error out if missing); also add a short comment near these functions
documenting the external dependencies (k3d, kubectl) and that
CLUSTER_NAME/CLUSTER_CONTEXT must be exported.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 769e9405-9437-4b3e-9315-d227c40d668d

📥 Commits

Reviewing files that changed from the base of the PR and between 54298b3 and 9129c2b.

📒 Files selected for processing (3)
  • agent-manager-service/clients/openchoreosvc/client/deployments.go
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/google-buildpack-build.yaml
  • deployments/scripts/utils.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • deployments/helm-charts/wso2-amp-build-extension/templates/cluster-workflow-templates/google-buildpack-build.yaml


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
agent-manager-service/clients/openchoreosvc/client/components.go (2)

302-313: ⚠️ Potential issue | 🔴 Critical

Avoid dereferencing req.InputInterface for chat agents.

This branch assumes req.InputInterface is always present and will panic on req.InputInterface.BasePath if the request relies on the chat-agent defaults instead of sending an explicit input-interface block.

Suggested fix
 	if req.AgentType.Type == string(utils.AgentTypeAPI) && req.AgentType.SubType == string(utils.AgentSubTypeChatAPI) {
+		basePath := ""
+		if req.InputInterface != nil {
+			basePath = req.InputInterface.BasePath
+		}
 		schemaContent, err := getDefaultChatAPISchema()
 		if err != nil {
 			return nil, fmt.Errorf("failed to read Chat API schema: %w", err)
 		}
 		endpoints = append(endpoints, map[string]any{
 			"name":          fmt.Sprintf("%s-endpoint", req.Name),
 			"port":          config.GetConfig().DefaultChatAPI.DefaultHTTPPort,
 			"type":          string(utils.InputInterfaceTypeHTTP),
-			"basePath":      req.InputInterface.BasePath,
+			"basePath":      basePath,
 			"visibility":    DefaultEndpointVisibility,
 			"schemaType":    SchemaTypeREST,
 			"schemaContent": schemaContent,
 		})
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/components.go` around
lines 302 - 313, The code appends a chat-agent endpoint but dereferences
req.InputInterface.BasePath without checking for nil; update the endpoint
construction in the branch that handles AgentTypeAPI/SubTypeChatAPI to first
check if req.InputInterface != nil and use req.InputInterface.BasePath,
otherwise fall back to the chat default base path (e.g.
config.GetConfig().DefaultChatAPI.DefaultBasePath) when building the map used in
endpoints; keep the existing use of getDefaultChatAPISchema(),
DefaultEndpointVisibility and SchemaTypeREST.

1046-1089: ⚠️ Potential issue | 🟠 Major

Replace stored env vars instead of merging them.

This keeps every previously persisted key that is missing from envVars, so removals never stick. Deleting an env var in the deploy flow will leave it in the component and it will reappear on the next build/deploy.

Suggested fix
-	// Get existing environment variables
-	existingEnvVars := make([]map[string]any, 0)
-	if envVarsInterface, ok := workflowParams["environmentVariables"].([]interface{}); ok {
-		for _, env := range envVarsInterface {
-			if envMap, ok := env.(map[string]interface{}); ok {
-				existingEnvVars = append(existingEnvVars, envMap)
-			}
-		}
-	}
-
-	// Build merged environment variables map
-	envMap := make(map[string]map[string]any)
-	for _, env := range existingEnvVars {
-		if name, ok := env["name"].(string); ok {
-			envMap[name] = env
-		}
-	}
+	// Replace the stored environment variables with the desired set from the caller.
+	updatedEnvVars := make([]map[string]any, 0, len(envVars))
 	for _, newEnv := range envVars {
 		envVar := map[string]any{
 			"name": newEnv.Key,
 		}
 		if newEnv.ValueFrom != nil && newEnv.ValueFrom.SecretKeyRef != nil {
@@
 		} else {
 			// Plain value
 			envVar["value"] = newEnv.Value
 		}
-		envMap[newEnv.Key] = envVar
+		updatedEnvVars = append(updatedEnvVars, envVar)
 	}
-
-	// Convert map to slice
-	mergedEnvVars := make([]map[string]any, 0, len(envMap))
-	for _, env := range envMap {
-		mergedEnvVars = append(mergedEnvVars, env)
-	}
-
-	// Update workflow parameters
-	workflowParams["environmentVariables"] = mergedEnvVars
+	workflowParams["environmentVariables"] = updatedEnvVars
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/components.go` around
lines 1046 - 1089, The current logic builds envMap from existingEnvVars then
overlays new envVars, which preserves any previously persisted keys that are
omitted (so deletions don't stick); instead, stop pre-populating envMap from
existingEnvVars and treat envVars as the source of truth: build
envMap/mergedEnvVars solely from envVars (using newEnv.Key and value/valueFrom
handling as in the block that creates envVar) and then assign
workflowParams["environmentVariables"] = mergedEnvVars; update/remove references
to existingEnvVars to avoid retaining old keys (refer to variables
existingEnvVars, envVars, envMap, mergedEnvVars and the
workflowParams["environmentVariables"] assignment).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agent-manager-service/tests/apitestutils/mock_clients.go`:
- Around line 161-163: The default mock's GetSecretFunc currently returns a
generic error which breaks the preserve-existing-secret flow; change the mock so
GetSecretFunc returns nil and ErrSecretNotFound (the package-level sentinel for
missing secrets) instead of fmt.Errorf("error") so tests model a missing secret
rather than a fatal read error. Ensure you reference and return the existing
ErrSecretNotFound symbol from the codebase in the GetSecretFunc implementation.

---

Outside diff comments:
In `@agent-manager-service/clients/openchoreosvc/client/components.go`:
- Around line 302-313: The code appends a chat-agent endpoint but dereferences
req.InputInterface.BasePath without checking for nil; update the endpoint
construction in the branch that handles AgentTypeAPI/SubTypeChatAPI to first
check if req.InputInterface != nil and use req.InputInterface.BasePath,
otherwise fall back to the chat default base path (e.g.
config.GetConfig().DefaultChatAPI.DefaultBasePath) when building the map used in
endpoints; keep the existing use of getDefaultChatAPISchema(),
DefaultEndpointVisibility and SchemaTypeREST.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 83215257-1f12-42f3-a2ed-6cd9ece18656

📥 Commits

Reviewing files that changed from the base of the PR and between 9129c2b and 6f95604.

📒 Files selected for processing (6)
  • agent-manager-service/clients/clientmocks/openchoreo_client_fake.go
  • agent-manager-service/clients/openchoreosvc/client/client.go
  • agent-manager-service/clients/openchoreosvc/client/components.go
  • agent-manager-service/services/agent_manager.go
  • agent-manager-service/tests/apitestutils/mock_clients.go
  • agent-manager-service/tests/create_agent_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • agent-manager-service/tests/create_agent_test.go

Comment thread agent-manager-service/tests/apitestutils/mock_clients.go
@hanzjk hanzjk changed the title Save env variables as Secrets Configure secrets as env variables Mar 9, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
agent-manager-service/clients/openchoreosvc/client/components.go (1)

302-315: ⚠️ Potential issue | 🔴 Critical

Guard the chat-agent endpoint builder against a nil InputInterface.

The custom-API branch already checks req.InputInterface != nil, but the chat-agent branch dereferences it on Line 311. mapInputInterface in agent-manager-service/services/agent_manager.go returns nil when the request omits that field, so the normal chat-agent create flow can panic here before the component request is sent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/clients/openchoreosvc/client/components.go` around
lines 302 - 315, The chat-agent endpoint builder dereferences
req.InputInterface.BasePath without guarding for nil; change the branch handling
AgentTypeAPI/AgentSubTypeChatAPI (the block that calls getDefaultChatAPISchema
and appends to endpoints) to first check req.InputInterface != nil and either
use a safe default (e.g., empty string) or return a clear error if
InputInterface is required; update the map entry for "basePath" to use the
guarded value so the append to endpoints cannot panic (this mirrors the
custom-API branch and the behavior of mapInputInterface).
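The nil guard the finding describes can be sketched like this. The types and the `defaultChatBasePath` constant are assumptions standing in for `req.InputInterface` and `config.GetConfig().DefaultChatAPI.DefaultBasePath`; the shape of the check mirrors what the custom-API branch reportedly already does.

```go
package main

import "fmt"

// Illustrative stand-ins for the request types in components.go.
type InputInterface struct {
	BasePath string
}

type ComponentRequest struct {
	InputInterface *InputInterface
}

// Assumed default, standing in for the configured chat API base path.
const defaultChatBasePath = "/chat"

// chatEndpointBasePath guards the dereference: a request that omits
// InputInterface falls back to the default instead of panicking.
func chatEndpointBasePath(req ComponentRequest) string {
	if req.InputInterface != nil && req.InputInterface.BasePath != "" {
		return req.InputInterface.BasePath
	}
	return defaultChatBasePath
}

func main() {
	fmt.Println(chatEndpointBasePath(ComponentRequest{}))
	fmt.Println(chatEndpointBasePath(ComponentRequest{
		InputInterface: &InputInterface{BasePath: "/v1/agent"},
	}))
}
```

Since `mapInputInterface` returns nil when the field is omitted, this guard is what keeps the ordinary chat-agent create flow from panicking before the component request is ever sent.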
♻️ Duplicate comments (2)
agent-manager-service/services/agent_manager.go (2)

91-137: ⚠️ Potential issue | 🟠 Major

Stop translating ErrNotFound before the parent resource is known to exist.

These helpers still collapse any generic not-found into the leaf error, and several changed callsites use them before validating the project. That now misreports a missing project as ErrAgentNotFound or ErrDeploymentPipelineNotFound in paths like GetAgent, GetBuild, GetAgentConfigurations, ListAgentBuilds, CreateAgent, and generateAgentAPIKey. Either validate the parent first or only translate after the parent lookup has already succeeded.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/services/agent_manager.go` around lines 91 - 137, The
translate* helpers (translateAgentError, translateBuildError,
translatePipelineError, etc.) are converting generic utils.ErrNotFound into
leaf-specific errors too early and causing misreports when parent resources
(e.g., project or organization) haven't been validated; update the call sites
(GetAgent, GetBuild, GetAgentConfigurations, ListAgentBuilds, CreateAgent,
generateAgentAPIKey) to first verify the parent resource exists (e.g., perform
the project/organization lookup and ensure it succeeded) before calling the
repository method whose error you then pass into the appropriate
translateXError, or alternatively stop using translateXError before parent
validation and only apply translation after the parent lookup returns nil;
ensure translateAgentError, translateBuildError, and translatePipelineError are
only applied post-parent-validation so a missing parent is reported correctly.

1342-1350: ⚠️ Potential issue | 🔴 Critical

Treat omitted env as “no change”, not “clear everything”.

When req.Env == nil, this still calls processEnvVars, which drives syncSecrets down the cleanup path. The later ReplaceComponentEnvVars then persists the empty/tracing-only slice back to the component, so an image-only deploy can wipe stored secrets and user env vars.

🛠️ Suggested fix
-	// Always call processEnvVars to ensure secrets cleanup happens when all env vars are removed
-	envVars, err := s.processEnvVars(ctx, orgName, projectName, lowestEnv, agentName, req.Env)
-	if err != nil {
-		s.logger.Error("Failed to process environment variables", "agentName", agentName, "error", err)
-		return "", fmt.Errorf("failed to process environment variables: %w", err)
-	}
-	deployReq.Env = envVars
+	if req.Env != nil {
+		envVars, err := s.processEnvVars(ctx, orgName, projectName, lowestEnv, agentName, req.Env)
+		if err != nil {
+			s.logger.Error("Failed to process environment variables", "agentName", agentName, "error", err)
+			return "", fmt.Errorf("failed to process environment variables: %w", err)
+		}
+		deployReq.Env = envVars
+	}
@@
-	s.logger.Debug("Replacing component workflow parameters with environment variables", "agentName", agentName, "envVarCount", len(deployReq.Env))
-	if err := s.ocClient.ReplaceComponentEnvVars(ctx, orgName, projectName, agentName, deployReq.Env); err != nil {
-		s.logger.Warn("Failed to replace component workflow parameters with env vars", "agentName", agentName, "error", err)
-		// Continue with deploy even if this fails - env vars will still be applied to the workload
+	if req.Env != nil {
+		s.logger.Debug("Replacing component workflow parameters with environment variables", "agentName", agentName, "envVarCount", len(deployReq.Env))
+		if err := s.ocClient.ReplaceComponentEnvVars(ctx, orgName, projectName, agentName, deployReq.Env); err != nil {
+			s.logger.Warn("Failed to replace component workflow parameters with env vars", "agentName", agentName, "error", err)
+			// Continue with deploy even if this fails - env vars will still be applied to the workload
+		}
 	}

Also applies to: 1422-1428

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent-manager-service/services/agent_manager.go` around lines 1342 - 1350,
The current flow treats a nil req.Env as an explicit clearance because
processEnvVars is always called; change it so processEnvVars is only invoked
when req.Env is non-nil (i.e., the caller explicitly provided env changes); if
req.Env == nil skip processing and leave deployReq.Env unchanged so
ReplaceComponentEnvVars won't persist an empty/tracing-only slice and trigger
secret cleanup. Apply the same guard wherever you call processEnvVars (the other
occurrence around the ReplaceComponentEnvVars flow noted in the comment) so only
explicit empty slices (not nil) cause env/secrets replacement.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agent-manager-service/services/agent_manager.go`:
- Around line 1615-1639: When preserving secret keys, the code currently drops
preserved keys that are absent from existingData (building dataToWrite) without
erroring, leaving secretKeys referencing missing keys; update the merge logic
(the block that iterates preservedSecretKeys and populates dataToWrite using
existingData) to detect any preserved key that is not present in existingData
and return an error (or propagate a clear failure) instead of silently omitting
it; apply the same check in the subsequent similar merge block referenced by
preservedSecretKeys/secretKeys so that missing preserved keys are surfaced
before continuing with upsert.
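The fail-loudly merge the comment asks for can be sketched like this. Function and variable names are illustrative (standing in for `preservedSecretKeys`, `existingData`, and `dataToWrite`); the behaviour shown is the one requested — a preserved key absent from the existing secret data returns an error instead of being silently dropped.

```go
package main

import "fmt"

// mergePreserved builds the data to write from preserved keys plus new
// keys, surfacing an error for any preserved key missing from existing
// data rather than omitting it and leaving dangling references.
func mergePreserved(existing map[string]string, preservedKeys, newKeys []string, newData map[string]string) (map[string]string, error) {
	out := make(map[string]string, len(preservedKeys)+len(newKeys))
	for _, k := range preservedKeys {
		v, ok := existing[k]
		if !ok {
			return nil, fmt.Errorf("preserved secret key %q missing from existing data", k)
		}
		out[k] = v
	}
	for _, k := range newKeys {
		out[k] = newData[k]
	}
	return out, nil
}

func main() {
	existing := map[string]string{"API_KEY": "old"}
	if _, err := mergePreserved(existing, []string{"API_KEY", "TOKEN"}, nil, nil); err != nil {
		fmt.Println("surfaced:", err)
	}
}
```

Silently dropping the key, as the current code reportedly does, leaves the component's `secretKeys` list referencing entries that no longer exist in the secret, which only fails later at deploy time.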


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6a42183d-a9a7-4842-87c8-09205660dae2

📥 Commits

Reviewing files that changed from the base of the PR and between 6f95604 and df00e94.

📒 Files selected for processing (4)
  • agent-manager-service/clients/clientmocks/openchoreo_client_fake.go
  • agent-manager-service/clients/openchoreosvc/client/client.go
  • agent-manager-service/clients/openchoreosvc/client/components.go
  • agent-manager-service/services/agent_manager.go

Comment thread agent-manager-service/services/agent_manager.go
@hanzjk hanzjk force-pushed the secrets-ui-changes branch from 9136630 to 818e201 Compare March 9, 2026 14:47
@menakaj menakaj merged commit acff78b into wso2:main Mar 12, 2026
17 checks passed

Development

Successfully merging this pull request may close these issues.

UI modifications for saving env variables as secrets

2 participants