Walkthrough

Adds a testrunner CLI and worker binary, refactors Makefile build/docker targets, extends configuration and DB schema for plugin/policy support, implements S3-backed artifacts and extensive Kubernetes orchestration, and adds many k8s/minikube/production manifests plus unit and integration test utilities.
Note: Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution: Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/queue/tasks.go (1)
7-17: ⚠️ Potential issue | 🟠 Major

PluginAPIKey should not be passed as plaintext in the queue payload.
The API key is JSON-marshaled and stored unencrypted in the queue backend for 24 hours (lines 20, 29 in producer.go). Store a secret reference, encrypted token, or temporary credential instead.
Evidence

- handler.go:79: Raw API key from request is placed directly into the payload
- producer.go:20: Payload is JSON-marshaled without encryption
- producer.go:29: Task retained for 24 hours in the backend
- tasks.go:17: PluginAPIKey is a plain string field
- consumer.go:42: Unmarshaled into a plain string (no decryption)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/queue/tasks.go` around lines 7 - 17, The TestRunPayload currently includes PluginAPIKey as a plaintext string; change the payload to carry a secret reference or an encrypted token instead (e.g. rename PluginAPIKey -> PluginAPIKeyRef or PluginAPIKeyEncrypted) and update all call sites: in the request handler (handler.go) store a secret reference or encrypt the key before placing it on the payload, in producer.go continue JSON-marshalling but with the reference/encrypted value, and in consumer.go retrieve/decrypt the secret at runtime using the reference (or fetch from the secrets manager) before using the API key; ensure all symbols referencing PluginAPIKey (TestRunPayload, handler code that assigns it, producer code that marshals it, consumer code that unmarshals it) are updated accordingly and any temporary credentials are short-lived.
🟠 Major comments (27)
deploy/minikube/postgres.yaml-20-26 (1)
20-26: ⚠️ Potential issue | 🟠 Major

Avoid hard-coded Postgres credentials in the manifest.
These values will be committed to the repo and can leak into CI logs. Prefer a Secret and reference it via valueFrom.

🔐 Suggested change (reference a Secret)
```diff
 env:
 - name: POSTGRES_USER
-  value: vultisig
+  valueFrom:
+    secretKeyRef:
+      name: postgres-credentials
+      key: username
 - name: POSTGRES_PASSWORD
-  value: vultisig
+  valueFrom:
+    secretKeyRef:
+      name: postgres-credentials
+      key: password
 - name: POSTGRES_DB
   value: plugin-tests
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/minikube/postgres.yaml` around lines 20 - 26, Replace the hard-coded env values for POSTGRES_USER, POSTGRES_PASSWORD, and POSTGRES_DB in the Deployment/Pod spec with references to a Kubernetes Secret: create a Secret containing keys for user, password, and db, then change each env entry to use valueFrom.secretKeyRef pointing at that Secret (keeping the env names POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB) so credentials are not stored in the manifest or logs.

deploy/minikube/server.yaml-22-41 (1)
22-41: ⚠️ Potential issue | 🟠 Major

Move DB/S3 credentials into Secrets to match production environment.
Inline credentials in deploy/minikube/server.yaml are a security risk and should be sourced from Kubernetes Secrets. The production environment already follows this pattern with separate Secret resources for postgres, redis, and minio; minikube should match this approach.

🔐 Suggested change (align with production pattern)
```diff
 - name: PLUGIN_TESTS_API_DATABASE_DSN
-  value: "postgres://vultisig:vultisig@postgres:5432/plugin-tests?sslmode=disable"
+  valueFrom:
+    secretKeyRef:
+      name: postgres
+      key: dsn
 - name: PLUGIN_TESTS_API_ARTIFACT_S3_ACCESS_KEY
-  value: "minioadmin"
+  valueFrom:
+    secretKeyRef:
+      name: minio
+      key: root-user
 - name: PLUGIN_TESTS_API_ARTIFACT_S3_SECRET_KEY
-  value: "minioadmin"
+  valueFrom:
+    secretKeyRef:
+      name: minio
+      key: root-password
```

Create or update deploy/minikube/secrets.yaml with Secret definitions for postgres and minio (following the structure in deploy/production/secrets.yaml).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/minikube/server.yaml` around lines 22 - 41, The YAML currently hardcodes DB and S3 credentials (env vars like PLUGIN_TESTS_API_DATABASE_DSN, PLUGIN_TESTS_API_QUEUE_REDIS_HOST/PORT, PLUGIN_TESTS_API_ARTIFACT_S3_ACCESS_KEY, PLUGIN_TESTS_API_ARTIFACT_S3_SECRET_KEY, PLUGIN_TESTS_API_ARTIFACT_S3_BUCKET) in deploy/minikube/server.yaml; instead create a deploy/minikube/secrets.yaml defining Kubernetes Secrets (matching production Secret names such as the postgres and minio secrets) with the corresponding keys (e.g., database dsn, redis host/port if desired, s3 access_key/secret_key/bucket) and then update deploy/minikube/server.yaml to remove inline values and use valueFrom: secretKeyRef (or envFrom) to reference those Secret keys by name (ensure Secret keys match the env var names expected by the app).

deploy/minikube/redis.yaml-16-33 (1)
16-33: ⚠️ Potential issue | 🟠 Major

Include complete securityContext with runAsUser for Redis.
The redis:7 image creates a redis user (UID 999) and runs as non-root via the entrypoint, but you must also specify runAsUser: 999 and runAsGroup: 999 in securityContext; setting only runAsNonRoot: true without the UID may cause permission failures on the /data volume.

🛡️ Corrected hardening
```diff
 - name: redis
   image: redis:7
   imagePullPolicy: IfNotPresent
+  securityContext:
+    runAsNonRoot: true
+    runAsUser: 999
+    runAsGroup: 999
+    allowPrivilegeEscalation: false
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/minikube/redis.yaml` around lines 16 - 33, The Redis container spec (container name "redis") is missing a complete securityContext and only using runAsNonRoot which can still lead to /data permission issues; add a securityContext on the container (or pod) for the "redis" container that sets runAsUser: 999, runAsGroup: 999 and runAsNonRoot: true (optionally also set fsGroup if using a shared volume) so the process runs as the redis UID/GID and can write to the mounted /data volume.

deploy/minikube/minio.yaml-15-40 (1)
15-40: ⚠️ Potential issue | 🟠 Major

Add container securityContext for MinIO.

Defaults allow privilege escalation and may run as root. Set a restrictive security context to align with least-privilege and address static-analysis findings.
🔒 Suggested securityContext
```diff
 containers:
 - name: minio
   image: minio/minio:latest
+  securityContext:
+    allowPrivilegeEscalation: false
+    runAsNonRoot: true
+    runAsUser: 10001
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/minikube/minio.yaml` around lines 15 - 40, Add a restrictive container-level securityContext for the MinIO container (the container with name "minio") to prevent running as root and to disable privilege escalation: set runAsNonRoot: true and a non-root runAsUser (e.g., 1000), set allowPrivilegeEscalation: false, drop all Linux capabilities, and enable readOnlyRootFilesystem: true (and optionally set runAsGroup/fsGroup at the pod level if needed); place this securityContext under the "minio" container spec so it applies to the container while leaving existing readinessProbe, ports, env and resource settings intact.

scripts/fixture-gen/main.go-18-20 (1)
18-20: ⚠️ Potential issue | 🟠 Major

Don’t embed a default vault password in the CLI.
Hardcoding a real-looking password risks accidental exposure and incorrect usage. Default to empty and require explicit input when decryption is needed.
🔐 Suggested change
```diff
-password := flag.String("password", "Saggy@Commotion@Occupier@Registry1", "vault encryption password")
+password := flag.String("password", "", "vault encryption password (required if encrypted)")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/fixture-gen/main.go` around lines 18 - 20, Remove the hardcoded default password by changing the password flag default to an empty string (update the password := flag.String("password", "", "...") declaration and its help text) and add a post-flag.Parse() validation that checks the password variable when a vultFile is provided (vultFile)—if decryption is required and password is empty, log an error and exit so callers must explicitly supply the vault password rather than relying on an embedded default.

internal/testrunner/jwt.go-17-29 (1)
17-29: ⚠️ Potential issue | 🟠 Major

Validate expireHours to prevent immediately-expired tokens.

expireHours <= 0 yields tokens that are already expired (or expiring instantly), which will break auth flows unpredictably.

✅ Suggested guard
```diff
 func GenerateJWT(secret, pubkey, tokenID string, expireHours int) (string, error) {
 	if secret == "" || pubkey == "" {
 		return "", fmt.Errorf("secret and pubkey are required")
 	}
+	if expireHours <= 0 {
+		return "", fmt.Errorf("expireHours must be greater than zero")
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/testrunner/jwt.go` around lines 17 - 29, GenerateJWT currently accepts expireHours which if <= 0 produces an already-expired token; update GenerateJWT to validate expireHours and reject or normalize invalid values. Inside GenerateJWT (function GenerateJWT), check the expireHours parameter at the start and either return a clear error (e.g., "expireHours must be > 0") or set a sensible default positive TTL before computing expirationTime; ensure the validation happens before computing expirationTime and creating Claims so no token is ever created with ExpiresAt in the past.

deploy/minikube/minio.yaml-21-25 (1)
21-25: ⚠️ Potential issue | 🟠 Major

Avoid default MinIO root credentials in the manifest.
minioadmin/minioadmin is a known default and should not be committed. Use a Secret (or external secret manager) and rotate.

🔐 Suggested change
```diff
-- name: MINIO_ROOT_USER
-  value: minioadmin
-- name: MINIO_ROOT_PASSWORD
-  value: minioadmin
+- name: MINIO_ROOT_USER
+  valueFrom:
+    secretKeyRef:
+      name: minio-secrets
+      key: root_user
+- name: MINIO_ROOT_PASSWORD
+  valueFrom:
+    secretKeyRef:
+      name: minio-secrets
+      key: root_password
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/minikube/minio.yaml` around lines 21 - 25, The manifest currently hardcodes MINIO_ROOT_USER and MINIO_ROOT_PASSWORD as "minioadmin" — remove those literal env entries and instead reference a Kubernetes Secret; create a Secret (e.g., name minio-creds) holding keys MINIO_ROOT_USER and MINIO_ROOT_PASSWORD and update the env entries in deploy/minikube/minio.yaml to use valueFrom.secretKeyRef for MINIO_ROOT_USER and MINIO_ROOT_PASSWORD (or use envFrom: secretRef) so no credentials are committed; ensure you generate strong values and document rotation/storage in your secret manager.

deploy/minikube/worker.yaml-16-81 (1)
16-81: ⚠️ Potential issue | 🟠 Major

Set a restrictive securityContext for the worker container.

The pod currently relies on default privileges. Add allowPrivilegeEscalation: false and run as non-root to match least-privilege guidance and avoid Checkov findings.

🔒 Suggested securityContext
```diff
 containers:
 - name: worker
   image: plugin-tests-worker:latest
+  securityContext:
+    allowPrivilegeEscalation: false
+    runAsNonRoot: true
+    runAsUser: 10001
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/minikube/worker.yaml` around lines 16 - 81, The worker container currently runs with default privileges; add a restrictive securityContext under the container named "worker" to prevent privilege escalation and ensure non-root execution: set allowPrivilegeEscalation: false, runAsNonRoot: true, specify a non-root runAsUser (e.g. 1000), and drop all Linux capabilities via capabilities: { drop: ["ALL"] } (optionally add seccompProfile: { type: "RuntimeDefault" }) to the container spec to enforce least-privilege.

deploy/minikube/worker.yaml-23-64 (1)
23-64: ⚠️ Potential issue | 🟠 Major

Move secrets/credentials to Kubernetes Secrets.
DB DSN, encryption/JWT secrets, API keys, and S3 credentials are hardcoded in the manifest. This risks leakage and makes rotation difficult. Use valueFrom.secretKeyRef (or an external secret manager) and keep plaintext out of the Deployment.

🔐 Suggested pattern (apply to all sensitive env vars)
```diff
-- name: PLUGIN_TESTS_WORKER_DATABASE_DSN
-  value: "postgres://vultisig:vultisig@postgres:5432/plugin-tests?sslmode=disable"
+- name: PLUGIN_TESTS_WORKER_DATABASE_DSN
+  valueFrom:
+    secretKeyRef:
+      name: worker-secrets
+      key: database_dsn
 ...
-- name: PLUGIN_TESTS_WORKER_K8S_JWT_SECRET
-  value: mysecret
+- name: PLUGIN_TESTS_WORKER_K8S_JWT_SECRET
+  valueFrom:
+    secretKeyRef:
+      name: worker-secrets
+      key: jwt_secret
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/minikube/worker.yaml` around lines 23 - 64, The manifest exposes secrets like PLUGIN_TESTS_WORKER_DATABASE_DSN, PLUGIN_TESTS_WORKER_K8S_ENCRYPTION_SECRET, PLUGIN_TESTS_WORKER_K8S_JWT_SECRET, PLUGIN_TESTS_WORKER_K8S_PLUGIN_API_KEY, PLUGIN_TESTS_WORKER_ARTIFACT_S3_ACCESS_KEY and PLUGIN_TESTS_WORKER_ARTIFACT_S3_SECRET_KEY; replace each env var's value with valueFrom.secretKeyRef referencing keys in a Kubernetes Secret (create a Secret containing dsn, encryption_secret, jwt_secret, plugin_api_key, s3_access_key, s3_secret_key or similar), update the Deployment to use those secretKeyRefs for the corresponding env names (e.g., PLUGIN_TESTS_WORKER_DATABASE_DSN -> secretKeyRef: name: plugin-tests-secrets, key: database_dsn), and ensure any plaintext copies are removed from the manifest and documented for rotation or external secret manager use.

go.mod-5-206 (1)
5-206: ⚠️ Potential issue | 🟠 Major

Upgrade vulnerable dependencies to patched versions.
Static analysis correctly identifies high-severity vulnerabilities in key dependencies:
- github.com/ethereum/go-ethereum (v1.15.11): Remote DoS via p2p messages (GO-2026-4314, GO-2026-4315). Upgrade to v1.16.8+.
- github.com/cometbft/cometbft (v0.38.17 indirect): Two advisories unpatched; upgrade to v0.38.19 (GO-2025-4025) and v0.38.21 (GO-2026-4361).
- github.com/cosmos/cosmos-sdk (v0.50.11 indirect): Three advisories unpatched; upgrade to v0.50.12 (GO-2025-3476), v0.50.13 (GO-2025-3516), and v0.50.14 (GO-2025-3803), affecting group, staking, and distribution modules.
- github.com/consensys/gnark-crypto (v0.16.0 indirect): OOM/DoS via unchecked vector deserialization (GO-2025-4087). Upgrade to v0.18.1+ or v0.19.2+.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@go.mod` around lines 5 - 206, The go.mod pins vulnerable dependencies: bump github.com/ethereum/go-ethereum to v1.16.8 or newer, update github.com/cometbft/cometbft to at least v0.38.21 (or v0.38.19+ as needed), upgrade github.com/cosmos/cosmos-sdk to v0.50.14, and raise github.com/consensys/gnark-crypto to v0.18.1+ or v0.19.2+; then run go get <module>@<version> (or edit the require blocks), run go mod tidy to fix indirect deps, rebuild and run tests to ensure compatibility, and address any API changes surfaced by functions/types in those modules.

internal/testrunner/schema/schema.sql-36-47 (1)
36-47: ⚠️ Potential issue | 🟠 Major

Add an FK for plugin_policies.plugin_id.

Without a foreign key, orphaned policies can be created. Consider referencing plugins(id) (and cascading on delete if that’s the intended lifecycle).

🛠️ Suggested constraint
```diff
 CREATE TABLE plugin_policies (
     id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
     public_key TEXT NOT NULL,
-    plugin_id plugin_id NOT NULL,
+    plugin_id plugin_id NOT NULL REFERENCES plugins(id) ON DELETE CASCADE,
     plugin_version TEXT NOT NULL,
     policy_version INTEGER NOT NULL,
     signature TEXT NOT NULL,
     recipe TEXT NOT NULL,
     active BOOLEAN NOT NULL DEFAULT true,
     created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
     updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
 );
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/testrunner/schema/schema.sql` around lines 36 - 47, Add a foreign key constraint on plugin_policies.plugin_id referencing plugins(id) to prevent orphaned policy rows; modify the CREATE TABLE or add an ALTER TABLE for plugin_policies to reference plugins(id) (optionally WITH ON DELETE CASCADE if policies should be removed when a plugin is deleted) and ensure the constraint name is descriptive (e.g., plugin_policies_plugin_id_fkey).

deploy/production/ingress.yaml-6-48 (1)
6-48: ⚠️ Potential issue | 🟠 Major

TLS secret/host mismatch with cert-manager resources.
The Ingress uses secretName: server-tls and hosts tests.plugins.vultisig.com / tests-dev.plugins.vultisig.com, but the cert-manager manifests define secretName: wildcard-test-plugins-tls and wildcard DNS under *.test.plugins... / *.test-dev.plugins.... This mismatch will prevent valid TLS issuance/termination. Please align the Ingress hosts and TLS secret with the Certificate resources (or update the Certificates to match these hosts).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/ingress.yaml` around lines 6 - 48, The Ingress TLS settings are mismatched with your cert-manager Certificate resources: the Ingress uses secretName: server-tls and hosts tests.plugins.vultisig.com / tests-dev.plugins.vultisig.com while the Certificates use secretName: wildcard-test-plugins-tls and wildcard DNS entries; fix by making the Ingress TLS secretName and host entries match the Certificate resources (or update the Certificate resources to use secretName: server-tls and explicit hosts tests.plugins.vultisig.com and tests-dev.plugins.vultisig.com), targeting the Ingress named "server" in namespace "plugin-tests-dev" and the cert-manager secret names (wildcard-test-plugins-tls or server-tls) so cert-manager can issue/terminate TLS correctly.

deploy/production/secrets.yaml-1-133 (1)
1-133: ⚠️ Potential issue | 🟠 Major

Production secrets are placeholders; add guardrails.
These manifests include "CHANGE_ME" placeholders for production namespaces. If applied as-is, services will start with invalid credentials. Consider moving this to a *-example.yaml, using ExternalSecrets/SealedSecrets, or enforcing a CI gate that rejects CHANGE_ME values before deployment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/secrets.yaml` around lines 1 - 133, The production Secret manifests contain "CHANGE_ME" placeholders for Secret resources (names: postgres, redis, minio, encryption, jwt, vault, ghcr-pull-secret) and must not be deployable as-is; update by moving these manifests to a *-example.yaml (or remove production entries), replace the production Secret definitions with references to SealedSecrets/ExternalSecrets (or Kubernetes Secret templates that expect real values from your secret backend), and add a CI pre-deploy validation that scans Secret fields (e.g., stringData.dsn, stringData.password, stringData.uri, stringData["root-user"], stringData["root-password"], stringData.secret, stringData["vault-b64"], data[".dockerconfigjson"]) to fail the pipeline if any value equals "CHANGE_ME" or "CHANGE_ME_BASE64" to prevent accidental production rollout.

deploy/production/redis.yaml-16-39 (1)
16-39: ⚠️ Potential issue | 🟠 Major

Harden Redis container security context.
The deployment doesn’t constrain privilege escalation or root usage. Add a security context to reduce blast radius.
🔒 Suggested securityContext hardening
```diff
 spec:
   containers:
   - name: redis
     image: redis:7
+    securityContext:
+      allowPrivilegeEscalation: false
+      runAsNonRoot: true
+      runAsUser: 1000
     command: ["redis-server", "--requirepass", "$(REDIS_PASSWORD)"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/redis.yaml` around lines 16 - 39, Add a hardened securityContext to the Redis container (the entry under containers with name: redis) to prevent privilege escalation and running as root: set securityContext.allowPrivilegeEscalation: false, securityContext.runAsNonRoot: true with securityContext.runAsUser: 1000 (or a non-root UID used in your environment), securityContext.readOnlyRootFilesystem: true, and drop capabilities via securityContext.capabilities.drop: ["ALL"]; apply these keys under the same container spec (next to env/ports/readinessProbe) so the redis container is constrained from escalating privileges or using root.

deploy/production/server.yaml-16-76 (1)
16-76: ⚠️ Potential issue | 🟠 Major

Harden server container security context.
The deployment doesn’t constrain privilege escalation or root usage. Add a securityContext to reduce risk.
🔒 Suggested securityContext hardening
```diff
 containers:
 - name: server
   image: ghcr.io/vultisig/plugin-tests/server:latest
+  securityContext:
+    allowPrivilegeEscalation: false
+    runAsNonRoot: true
+    runAsUser: 1000
   command: ["/app/main"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/server.yaml` around lines 16 - 76, Add a securityContext to harden the server container: under the container with name "server" set a container-level securityContext that disables privilege escalation (allowPrivilegeEscalation: false), drops all capabilities (capabilities.drop: ["ALL"]), enforces a non-root user (runAsNonRoot: true and runAsUser: a non-zero UID), and enables a read-only root filesystem (readOnlyRootFilesystem: true); optionally also add a pod-level securityContext to set fsGroup/runAsUser defaults if needed and ensure image user matches the chosen runAsUser before deploying.

deploy/production/rbac.yaml-13-41 (1)
13-41: ⚠️ Potential issue | 🟠 Major

ClusterRole permissions are overly broad for a worker.
This role can create/delete namespaces and create secrets and workloads cluster‑wide. If the worker is compromised, it can affect the whole cluster. Please scope to the minimum required resources/verbs (prefer namespaced Roles/RoleBindings) and avoid namespace delete unless strictly necessary.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/rbac.yaml` around lines 13 - 41, The ClusterRole plugin-tests-worker grants overly broad, cluster-wide permissions (e.g., creating/deleting namespaces, creating secrets and workloads); change it to the minimum necessary by replacing the ClusterRole with a namespaced Role (or create a separate Role per target namespace) and use RoleBinding(s) instead of ClusterRole/ClusterRoleBinding, remove the "delete" verb for namespaces, remove cluster-wide "namespaces" create permission, narrow "secrets" and workload permissions to only required verbs and specific resources (e.g., limit to get/create if needed) and restrict resources to the specific namespaces the worker operates in; update the role named plugin-tests-worker and its rules for apiGroups [""] / ["apps"] / ["batch"] / ["networking.k8s.io"] accordingly so it only allows the exact verbs and resources required.

deploy/production/worker.yaml-20-22 (1)
20-22: ⚠️ Potential issue | 🟠 Major

Avoid :latest for production images.

Using latest makes deployments non-reproducible and can introduce untested changes. Pin to a version or digest produced by your build pipeline.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/worker.yaml` around lines 20 - 22, The deployment uses an unpinned image for the container named "worker" (image: ghcr.io/vultisig/plugin-tests/worker:latest); change the image reference to a reproducible tag or digest provided by your build pipeline (for example replace ":latest" with a CI-produced semantic version tag or an immutable digest) so the "worker" container image is pinned and deployments are reproducible.

deploy/production/postgres.yaml-16-48 (1)
16-48: ⚠️ Potential issue | 🟠 Major

Harden the Postgres pod/container security context.
This pod runs with default privileges. Recommend explicitly setting non‑root execution and disabling privilege escalation to reduce blast radius.
🔒 Suggested hardening
```diff
 spec:
+  securityContext:
+    runAsNonRoot: true
+    runAsUser: 999
+    runAsGroup: 999
+    fsGroup: 999
   containers:
   - name: postgres
     image: postgres:15
+    securityContext:
+      allowPrivilegeEscalation: false
+      capabilities:
+        drop: ["ALL"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/postgres.yaml` around lines 16 - 48, The Postgres container is running with default privileges; update the deployment to add securityContext settings to the Pod and the container for the postgres container (name: postgres) to enforce non‑root and restrict privileges: set pod-level securityContext runAsNonRoot: true and a runAsUser (e.g., 999), and add container-level securityContext with runAsNonRoot: true, runAsUser matching the pod, allowPrivilegeEscalation: false, capabilities: drop: ["ALL"], and readOnlyRootFilesystem: true (and optionally seccompProfile:type: RuntimeDefault) to the container spec to harden the Postgres pod.

deploy/production/worker.yaml-15-23 (1)
15-23: ⚠️ Potential issue | 🟠 Major

Harden the worker pod/container security context.
Default privileges allow escalation and root execution. Please lock this down explicitly.
🔒 Suggested hardening
```diff
 spec:
+  securityContext:
+    runAsNonRoot: true
   containers:
   - name: worker
+    securityContext:
+      allowPrivilegeEscalation: false
+      capabilities:
+        drop: ["ALL"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/worker.yaml` around lines 15 - 23, The worker pod currently has no security context; add explicit pod- and container-level hardened settings to prevent root and privilege escalation: under spec add a podSecurityContext with fsGroup and runAsNonRoot if needed, and inside the containers entry for name: worker add a securityContext that sets runAsNonRoot: true, runAsUser to a non-root uid, allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, drop capabilities (capabilities.drop: ["ALL"]), and a seccompProfile (e.g., type: RuntimeDefault) to lock down syscalls; ensure these keys are placed under the existing spec and the containers -> name: worker block so Kubernetes will enforce them.

internal/testrunner/tests.go-108-115 (1)
108-115: ⚠️ Potential issue | 🟠 Major

Add HTTP status assertions to avoid false positives.
Several tests only check connectivity and ignore non‑2xx responses, so failures can be missed. Please assert expected status codes (e.g., 2xx for health/endpoints; specific expected codes where appropriate).
✅ Example pattern to apply
```diff
 resp, err := s.pluginCli.GET("/healthz")
 if err != nil {
 	return fmt.Errorf("plugin unreachable: %w", err)
 }
 defer resp.Body.Close()
+if resp.StatusCode < 200 || resp.StatusCode >= 300 {
+	return fmt.Errorf("expected 2xx, got %d", resp.StatusCode)
+}
 return nil
```

Also applies to: 203-224, 238-263
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/testrunner/tests.go` around lines 108 - 115, The test only verifies connectivity for PluginHealth but not the HTTP status, which can mask non-2xx responses; update the s.run("PluginHealth", ...) handler to assert the response status (e.g., ensure resp.StatusCode is in the 2xx range or equals http.StatusOK) and return an error including the status code (and body snippet if useful) when it is not expected; apply the same pattern to the other test handlers that call s.pluginCli.GET/POST (the blocks around lines 203-224 and 238-263) so every HTTP call checks resp.StatusCode and fails the test on non-success codes.

internal/api/handler.go-184-194 (1)
184-194: ⚠️ Potential issue | 🟠 Major

Validate status filter values to avoid DB enum errors.

buildFilterParams accepts any string and passes it to the DB layer; invalid values will likely throw DB errors and return 500. Validate against the known status set and return a 400 on invalid input.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/handler.go` around lines 184 - 194, buildFilterParams currently accepts any status string and forwards it to the DB causing enum errors; update it to validate the incoming status against the known set of allowed values (use the existing queries.TestRunStatus constants or an explicit map/set of allowed strings) before constructing queries.NullTestRunStatus. Change buildFilterParams to return (filterParams, error) (or an additional validation error) so callers can return a 400 Bad Request on invalid status; inside the function, if the trimmed c.QueryParam("status") is not in the allowed set, return a descriptive validation error so the handler can respond with HTTP 400 instead of allowing an invalid queries.TestRunStatus to reach the DB. Ensure you still construct queries.NullTestRunStatus only when the value is valid.

deploy/production/minio.yaml-16-46 (1)
16-46: ⚠️ Potential issue | 🟠 Major

Harden the MinIO container security context.
The container currently allows privilege escalation and runs as root. Add a restrictive securityContext to reduce blast radius in production.
Note: The upstream minio/minio:latest image defaults to root (UID 0), so validate that running as non-root is compatible with your storage setup. Include fsGroup: 1000 for proper volume mount permissions if using the suggested UID.

🔒 Example hardening
```diff
 containers:
 - name: minio
   image: minio/minio:latest
+  securityContext:
+    allowPrivilegeEscalation: false
+    runAsNonRoot: true
+    runAsUser: 1000
+    runAsGroup: 1000
+    fsGroup: 1000
+    seccompProfile:
+      type: RuntimeDefault
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/production/minio.yaml` around lines 16 - 46, Add a restrictive securityContext to the MinIO container block (container name "minio") to prevent privilege escalation and avoid running as root: set securityContext.allowPrivilegeEscalation to false, set securityContext.runAsNonRoot to true (and runAsUser/runAsGroup to the suggested UID/GID if compatible, e.g., 1000), and add pod-level securityContext.fsGroup: 1000 for proper volume permissions; ensure these fields are added next to the existing container definition and validate the image supports the chosen UID before deploying.

internal/worker/manifests.go-238-291 (1)
238-291: ⚠️ Potential issue | 🟠 Major

Secrets are embedded in ConfigMaps/env vars.

JWT secrets, encryption secrets, and S3 credentials are rendered into ConfigMaps and plain env vars, which is a security risk in shared clusters. Consider moving them to Kubernetes Secrets and referencing via secretKeyRef / mounted volumes.

Also applies to: 510-544
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/worker/manifests.go` around lines 238 - 291, verifierConfigMap and workerConfigMap currently serialize sensitive values (v.Server.JWTSecret, v.EncryptionSecret, w.VaultService.EncryptionSecret, BlockStorage.AccessKey/Secret, DB DSN, etc.) into ConfigMaps/envs; change these to read secrets from Kubernetes Secrets instead: create Secret objects for JWT/encryption/S3/DB credentials and then update verifierConfigMap and workerConfigMap to not inline sensitive fields but reference them via secretKeyRef (for env vars) or mount the Secret as a file and populate the JSON from those files at startup (or inject non-sensitive placeholders into the ConfigMap and assemble full config at runtime), ensuring json.MarshalIndent no longer includes secret values.

internal/worker/consumer.go-40-136 (1)
40-136: ⚠️ Potential issue | 🟠 Major
Namespace cleanup is fully disabled.
The deferred cleanup path is always skipped, which will leak namespaces/resources in steady state. Please re-enable cleanup or gate it behind an explicit config/debug flag before release.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/worker/consumer.go` around lines 40 - 136, The deferred namespace cleanup is currently skipped; add a config flag (e.g., Consumer.cfg.EnableNamespaceCleanup bool) and modify the defer to run cleanup only when createdNS is true and cfg.EnableNamespaceCleanup is true: restore the commented cleanup block inside the defer but wrap it in a conditional that checks c.cfg.EnableNamespaceCleanup, create a timeout context (context.WithTimeout) for deleteNamespace(ctx, c.k8s, nsName), call deleteNamespace and log errors with log.WithError(delErr).Warn(...) or log.Info(...) on success; ensure the defer closes over nsName, c.k8s and log correctly and that the new config flag is honored when constructing Consumer.
internal/testrunner/seeder.go (1)
45-109: ⚠️ Potential issue | 🟠 Major
Guard against a nil fixture before seeding the vault token.
`SeedDatabase` dereferences `s.config.Fixture` without validation, which will panic if config is incomplete.
🛠️ Suggested fix
```diff
 func (s *Seeder) SeedDatabase(ctx context.Context) error {
+	if s.config.Fixture == nil {
+		return fmt.Errorf("fixture is required for seeding")
+	}
+	if s.config.Fixture.Vault.PublicKey == "" {
+		return fmt.Errorf("fixture vault public_key is required")
+	}
 	pool, err := pgxpool.New(ctx, s.config.DSN)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/testrunner/seeder.go` around lines 45 - 109, SeedDatabase currently dereferences s.config.Fixture and s.config.Fixture.Vault when calling queries.UpsertVaultToken, which panics if Fixture is nil; update Seeder.SeedDatabase to guard against this by checking if s.config.Fixture == nil || s.config.Fixture.Vault == nil before constructing the UpsertVaultToken params and either return a descriptive error (e.g., fmt.Errorf("missing Fixture.Vault in config")) or skip vault-token seeding with a logged warning; ensure the check is placed just before the now := time.Now() / tokenExpiry block so UpsertVaultToken is never called with a nil pointer.
internal/testrunner/participant.go (1)
159-181: ⚠️ Potential issue | 🟠 Major
Add a timeout to the delete-existing-plugin call.
`http.DefaultClient` has no timeout, and the caller passes `context.Background()` (line 73), so this can hang indefinitely on network issues. Add both a context deadline and client timeout:
🛠️ Suggested fix
```diff
 func deleteExistingPlugin(ctx context.Context, verifierURL, jwtToken, pluginID string, logger *logrus.Logger) error {
+	ctx, cancel := context.WithTimeout(ctx, 15*time.Second)
+	defer cancel()
 	url := verifierURL + "/plugin/" + pluginID
 	req, err := http.NewRequestWithContext(ctx, http.MethodDelete, url, nil)
 	if err != nil {
 		return fmt.Errorf("failed to create request: %w", err)
 	}
 	req.Header.Set("Authorization", "Bearer "+jwtToken)
-	resp, err := http.DefaultClient.Do(req)
+	client := &http.Client{Timeout: 15 * time.Second}
+	resp, err := client.Do(req)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/testrunner/participant.go` around lines 159 - 181, The deleteExistingPlugin function can hang because it uses http.DefaultClient with no timeout and the passed ctx may be background; wrap the incoming ctx with a deadline (e.g., context.WithTimeout and defer cancel) and use that derived context when creating the request, and create a local http.Client with a sensible Timeout (instead of http.DefaultClient) and call client.Do(req); keep existing error handling and resp.Body.Close() and still limit reading with maxErrorResponseBytes.
internal/testrunner/seeder.go (1)
120-149: ⚠️ Potential issue | 🟠 Major
Distinguish bucket creation errors: ignore only "already owned" errors, not all failures.
Currently, both `CreateBucket` calls treat all errors identically (log at Debug level and continue), which masks credential/endpoint configuration issues and falsely reports successful seeding even if no buckets were created.
The AWS SDK v1 distinguishes between `BucketAlreadyOwnedByYou` (safe to ignore: bucket already exists in your account) and `BucketAlreadyExists` (critical: bucket name taken globally) via error code inspection. Other errors (credentials, network, endpoint) should propagate.
Refactor to check error codes and return non-idempotent errors:
Example pattern (AWS SDK v1)
```go
if err != nil {
	if aerr, ok := err.(awserr.Error); ok {
		switch aerr.Code() {
		case s3.ErrCodeBucketAlreadyOwnedByYou:
			// Idempotent: bucket already exists in your account
			return nil
		}
	}
	return fmt.Errorf("failed to create bucket: %w", err)
}
```
Verify each finding against the current code and only fix it if needed. In `@internal/testrunner/seeder.go` around lines 120 - 149, In SeedVaults, don't treat all CreateBucket errors as ignorable: after each s3Client.CreateBucket call in Seeder.SeedVaults, assert the error is an awserr.Error and inspect aerr.Code() and only ignore s3.ErrCodeBucketAlreadyOwnedByYou (idempotent); for any other code including s3.ErrCodeBucketAlreadyExists or non-awserr errors, return a wrapped error (e.g., fmt.Errorf("failed to create bucket %s: %w", bucketName, err)) so credential/endpoint/network issues propagate; apply this check for both CreateBucket calls that create s.config.S3.Bucket and "vultisig-dca".
🟡 Minor comments (7)
deploy/minikube/postgres.yaml (1)
16-40: ⚠️ Potential issue | 🟡 Minor
Complete the securityContext hardening with explicit UID/GID for postgres:15.
The postgres:15 image creates a `postgres` user (UID/GID 999) but doesn't set it as the default USER. Using `runAsNonRoot: true` alone will fail; you must specify the numeric UID explicitly.
🛡️ Correct hardening for postgres:15
```diff
 - name: postgres
   image: postgres:15
   imagePullPolicy: IfNotPresent
+  securityContext:
+    allowPrivilegeEscalation: false
+    runAsNonRoot: true
+    runAsUser: 999
+    runAsGroup: 999
+    fsGroup: 999
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy/minikube/postgres.yaml` around lines 16 - 40, The container spec for the postgres container (name: postgres, image: postgres:15) must include an explicit securityContext to harden UID/GID; add a securityContext block on the postgres container with runAsUser: 999, runAsGroup: 999, fsGroup: 999 and keep runAsNonRoot: true so the pod runs as the numeric postgres user/group used by the image; update the container spec (the postgres container entry) to include these fields.
internal/testrunner/mpc/wrappers.go (1)
138-145: ⚠️ Potential issue | 🟡 Minor
Guard negative `index` before the uint32 cast in the edDSA path.
When a negative `index` is cast to `uint32`, it wraps silently (e.g., -1 becomes 4294967295), which can select the wrong party without any error. While current callers pass non-negative indices, defensive validation prevents this risk.
🛡️ Suggested guard
```diff
 func (w *Wrapper) SignSessionMessageReceiver(h Handle, message []byte, index int) (string, error) {
 	if w.isEdDSA {
+		if index < 0 {
+			return "", fmt.Errorf("index must be non-negative")
+		}
 		b, err := eddsaSession.SchnorrSignSessionMessageReceiver(eddsaSession.Handle(h), message, uint32(index))
 		return string(b), err
 	}
 	b, err := session.DklsSignSessionMessageReceiver(session.Handle(h), message, index)
 	return string(b), err
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/testrunner/mpc/wrappers.go` around lines 138 - 145, The edDSA branch of SignSessionMessageReceiver casts index to uint32 without validating it, which silently wraps negative values; before calling eddsaSession.SchnorrSignSessionMessageReceiver validate index >= 0 and return an error if negative (do not perform the uint32 conversion when invalid), otherwise cast to uint32 and call eddsaSession.SchnorrSignSessionMessageReceiver(eddsaSession.Handle(h), message, uint32(index)); keep the non-edDSA path unchanged.
internal/storage/postgres/sqlc/test_runs.sql (1)
11-13: ⚠️ Potential issue | 🟡 Minor
Guard against empty-string filters being treated as real values.
`sqlc.narg` only ignores `NULL`. If callers pass `""` for `plugin_id` or `status`, these queries will filter by empty string and return no rows. Ensure callers coerce empty strings to `NULL` (e.g., `sql.NullString{Valid:false}`), or adjust the SQL to treat `''` as `NULL`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/storage/postgres/sqlc/test_runs.sql` around lines 11 - 13, The query in test_runs uses sqlc.narg('plugin_id') and sqlc.narg('status') which treat only NULL as absent, so empty-string filters will be applied; update the WHERE clauses to treat empty strings as NULL by wrapping the sqlc.narg calls with NULLIF(...,'') (and keep the status cast to test_run_status), e.g., use NULLIF(sqlc.narg('plugin_id')::text,'') in place of sqlc.narg('plugin_id')::text and NULLIF(sqlc.narg('status')::text,'')::test_run_status for the status check so empty strings are ignored the same as NULL.
internal/api/templates/list.html (1)
6-18: ⚠️ Potential issue | 🟡 Minor
Add accessible labels for filter controls.
The `<select>` elements lack labels; screen readers won't have descriptive names. Add `aria-label` or explicit `<label for>`.
🧩 Minimal change
```diff
- <select name="plugin_id">
+ <select name="plugin_id" aria-label="Plugin">
 ...
- <select name="status">
+ <select name="status" aria-label="Status">
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/templates/list.html` around lines 6 - 18, The two filter selects in the form (select name="plugin_id" and select name="status" inside the filters form) lack accessible labels; add descriptive accessible names by either adding explicit <label for="..."> elements that reference new ids on the selects (e.g., id="plugin_id" and id="status") or by adding aria-label attributes (e.g., aria-label="Plugin filter" and aria-label="Status filter") to the select elements so screen readers can identify each control (update the template where range .PluginIDs and range .Statuses are rendered).
cmd/testrunner/main.go (1)
123-152: ⚠️ Potential issue | 🟡 Minor
Inconsistent handling of `ENCRYPTION_SECRET` across subcommands.
In `runSeed` (line 55), `ENCRYPTION_SECRET` is fetched via `requireEnv` (fatal on empty). In `runIntegration` (line 133), it uses `os.Getenv` directly, making it silently optional. If the integration flow actually requires the encryption secret (e.g., for decrypt/verify steps), this will produce confusing downstream errors instead of a clear "variable not set" message at startup.
If it's truly optional here, consider adding a comment explaining why. Otherwise, use `requireEnv`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/testrunner/main.go` around lines 123 - 152, runIntegration currently reads ENCRYPTION_SECRET with os.Getenv while runSeed uses requireEnv, causing inconsistent behavior; decide whether the secret is required and then make handling consistent: if required, replace the os.Getenv call in runIntegration with requireEnv("ENCRYPTION_SECRET") and pass that into testrunner.InstallConfig.EncryptionSecret; if optional, add a clear comment above the os.Getenv call explaining why it's optional and ensure downstream code checks for empty EncryptionSecret before using it (see functions runIntegration, runSeed, requireEnv, and the InstallConfig.EncryptionSecret field).
cmd/worker/main.go (1)
71-75: ⚠️ Potential issue | 🟡 Minor
`srv.Run` error after graceful shutdown may trigger `Fatal` exit.
When `ctx` is cancelled, the goroutine at line 74 calls `srv.Shutdown()`. If `srv.Run(mux)` subsequently returns an error (e.g., asynq returns a sentinel error on shutdown), `logger.Fatal` at line 99 will call `os.Exit(1)`, bypassing defers (like `db.Close()` on line 46).
Consider checking whether the error is due to intentional shutdown before calling `Fatal`:
Proposed approach
```diff
 err = srv.Run(mux)
 if err != nil {
+	if ctx.Err() != nil {
+		logger.Info("worker stopped")
+		return
+	}
 	logger.WithError(err).Fatal("worker failed")
 }
```
Also applies to: 92-100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/worker/main.go` around lines 71 - 75, The srv.Run(mux) error handling should ignore expected shutdown errors so logger.Fatal doesn't call os.Exit and skip defers (e.g., db.Close). Update the post-run error path to check the returned error from srv.Run(mux) with errors.Is(..., context.Canceled) and also against the server shutdown sentinel (e.g., asynq.ErrServerClosed) before calling logger.Fatal; for shutdown-related errors log an Info or Debug and return normally so deferred cleanup (like db.Close) runs, but only use logger.Fatal for unexpected errors. Ensure you reference srv.Run(mux), srv.Shutdown(), ctx.Done(), logger.Fatal and db.Close when making the change.
internal/testrunner/client.go (1)
104-115: ⚠️ Potential issue | 🟡 Minor
Properly handle response bodies to prevent connection leaks in polling loops.
`net/http` can return a non-nil response even when an error occurs (e.g., from redirect policy failures, though those already have closed bodies). More importantly, when a response is successfully received but the status is not OK, the body should be fully consumed or explicitly closed before the next iteration to allow connection reuse.
🛠️ Suggested fix
```diff
 resp, err := c.GET("/plugins")
+if resp != nil {
+	defer resp.Body.Close()
+}
 if err == nil {
-	resp.Body.Close()
 	if resp.StatusCode == http.StatusOK {
 		return nil
 	}
 }
```
Alternatively, consider draining the body to enable HTTP keep-alive:
```diff
 resp, err := c.GET("/plugins")
 if err == nil {
+	defer io.Copy(io.Discard, resp.Body)
 	resp.Body.Close()
 	if resp.StatusCode == http.StatusOK {
 		return nil
 	}
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/testrunner/client.go` around lines 104 - 115, In TestClient.WaitForHealth, ensure you always close or fully drain resp.Body to avoid connection leaks: after calling c.GET("/plugins") handle the case where err != nil but resp != nil by draining (io.Copy(io.Discard, resp.Body)) and then resp.Body.Close(), and likewise when resp.StatusCode != http.StatusOK drain and close before sleeping; update the logic in WaitForHealth to call this drain+close pattern for any non-nil resp and still close bodies on the success path as well.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (67)
- Makefile
- cmd/server/main.go
- cmd/testrunner/main.go
- cmd/worker/main.go
- config/config.go
- deploy/minikube/minio.yaml
- deploy/minikube/namespace.yaml
- deploy/minikube/postgres.yaml
- deploy/minikube/rbac.yaml
- deploy/minikube/redis.yaml
- deploy/minikube/server.yaml
- deploy/minikube/worker.yaml
- deploy/production/cert-manager.yaml
- deploy/production/configmaps.yaml
- deploy/production/ingress.yaml
- deploy/production/minio.yaml
- deploy/production/namespace.yaml
- deploy/production/postgres.yaml
- deploy/production/rbac.yaml
- deploy/production/redis.yaml
- deploy/production/secrets.yaml
- deploy/production/server.yaml
- deploy/production/worker.yaml
- go.mod
- internal/api/artifacts.go
- internal/api/handler.go
- internal/api/response.go
- internal/api/results.go
- internal/api/server.go
- internal/api/templates/detail.html
- internal/api/templates/layout.html
- internal/api/templates/list.html
- internal/queue/redis_test.go
- internal/queue/tasks.go
- internal/storage/postgres/migrations/20260216000001_create_test_runs.sql
- internal/storage/postgres/queries/test_runs.sql.go
- internal/storage/postgres/schema/schema.sql
- internal/storage/postgres/sqlc/test_runs.sql
- internal/testrunner/client.go
- internal/testrunner/db/db.go
- internal/testrunner/db/models.go
- internal/testrunner/db/seeder.sql.go
- internal/testrunner/fixture.json
- internal/testrunner/fixtures.go
- internal/testrunner/jwt.go
- internal/testrunner/jwt_test.go
- internal/testrunner/mpc/wrappers.go
- internal/testrunner/participant.go
- internal/testrunner/policy.go
- internal/testrunner/queries/seeder.sql
- internal/testrunner/schema/schema.sql
- internal/testrunner/seeder.go
- internal/testrunner/sqlc.yaml
- internal/testrunner/tests.go
- internal/types/test_run_test.go
- internal/worker/artifacts.go
- internal/worker/consumer.go
- internal/worker/janitor.go
- internal/worker/k8s.go
- internal/worker/k8s_test.go
- internal/worker/manifests.go
- internal/worker/manifests_test.go
- internal/worker/naming.go
- internal/worker/naming_test.go
- internal/worker/runner.go
- internal/worker/runner_test.go
- scripts/fixture-gen/main.go
💤 Files with no reviewable changes (2)
- internal/storage/postgres/migrations/20260216000001_create_test_runs.sql
- internal/storage/postgres/schema/schema.sql
Actionable comments posted: 2
🧹 Nitpick comments (4)
internal/worker/manifests_test.go (2)
161-170: `TestSmokeJob` has an undocumented TTL-zero sentinel and missing container assertions.
Two issues:
Passing `0` for TTL and asserting `nil` on `TTLSecondsAfterFinished` relies on an implicit convention that `0` means "disabled", which inverts Kubernetes semantics (K8s treats `ttlSecondsAfterFinished: 0` as "clean up immediately"). A brief comment or a named constant (e.g., `noTTL = 0`) would make this contract explicit.
Unlike `TestSeederJob`, this sub-test never asserts the container name ("testrunner"), the container image ("testrunner:v1"), or that `c.Env` is non-empty. These omissions leave the smoke-job builder under-tested.
💡 Suggested additions
```diff
 func TestSmokeJob(t *testing.T) {
 	envVars := testrunnerEnvVars(testK8sCfg)
 	job := smokeJob("ns", "testrunner:v1", "my-secret", nil, envVars, 0, nil)
 	assert.Equal(t, "smoke", job.Name)
 	require.Len(t, job.Spec.Template.Spec.Containers, 1)
-	assert.Equal(t, []string{"smoke"}, job.Spec.Template.Spec.Containers[0].Args)
-	assert.Nil(t, job.Spec.TTLSecondsAfterFinished)
+	c := job.Spec.Template.Spec.Containers[0]
+	assert.Equal(t, "testrunner", c.Name)
+	assert.Equal(t, "testrunner:v1", c.Image)
+	assert.Equal(t, []string{"smoke"}, c.Args)
+	assert.NotEmpty(t, c.Env)
+	// TTL=0 is treated as "no TTL": job.Spec.TTLSecondsAfterFinished is left nil
+	// (K8s TTL=0 would mean immediate deletion, so 0 acts as a sentinel for disabled).
+	assert.Nil(t, job.Spec.TTLSecondsAfterFinished)
 	require.Len(t, job.Spec.Template.Spec.ImagePullSecrets, 1)
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/worker/manifests_test.go` around lines 161 - 170, TestSmokeJob relies on an undocumented TTL-zero sentinel and omits container assertions: introduce a named constant (e.g., noTTL = 0) and use it when calling smokeJob("ns", "testrunner:v1", "my-secret", nil, envVars, noTTL, nil) or add a brief comment near the call to document that 0 means "disabled" for the test, then update assertions around TTLSecondsAfterFinished on the returned job.Spec to reflect the intended "disabled" contract; additionally, extend the test assertions for the first container (job.Spec.Template.Spec.Containers[0]) to check its Name equals "testrunner", Image equals "testrunner:v1", and that Env (c.Env) is non-empty to match the coverage of TestSeederJob.
172-187: `POSTGRES_DSN` and `MINIO_ENDPOINT` are only checked for key presence, not value.
Unlike `ENCRYPTION_SECRET`, `JWT_SECRET`, `PLUGIN_ENDPOINT`, and `VERIFIER_URL`, which all assert specific values, `POSTGRES_DSN` and `MINIO_ENDPOINT` are validated only with `assert.Contains`. Adding value assertions (e.g., confirming the DSN points to the in-namespace `postgres` host and Minio to `http://minio:9000`) would catch mis-wiring between the manifest builder and the expected in-cluster topology.
💡 Suggested additions
```diff
 assert.Contains(t, envMap, "POSTGRES_DSN")
+assert.Contains(t, envMap["POSTGRES_DSN"], "postgres") // should target the in-namespace postgres service
 assert.Contains(t, envMap, "MINIO_ENDPOINT")
+assert.Equal(t, "http://minio:9000", envMap["MINIO_ENDPOINT"])
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/worker/manifests_test.go` around lines 172 - 187, The test TestTestrunnerEnvVars currently only checks that POSTGRES_DSN and MINIO_ENDPOINT keys exist; update it to also assert their expected values by reading envMap (built from testrunnerEnvVars) — e.g., assert that envMap["MINIO_ENDPOINT"] equals "http://minio:9000" and that envMap["POSTGRES_DSN"] targets the in-cluster postgres host (assert.Contains t, envMap["POSTGRES_DSN"], "postgres:5432" or assert.Contains t, envMap["POSTGRES_DSN"], "@postgres") so the manifest builder wiring is validated.
internal/api/handler.go (2)
22-28: Normalize new request fields alongside existing trims.
Keeps `PluginEndpoint`/`PluginAPIKey` consistent with other request fields in case they're used elsewhere later.
♻️ Suggested change
```diff
 req.Version = strings.TrimSpace(req.Version)
 req.RequestedBy = strings.TrimSpace(req.RequestedBy)
+req.PluginEndpoint = strings.TrimSpace(req.PluginEndpoint)
+req.PluginAPIKey = strings.TrimSpace(req.PluginAPIKey)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/handler.go` around lines 22 - 28, The new fields PluginEndpoint and PluginAPIKey on CreateTestRunRequest must be normalized (trimmed) like the other request fields; find where the request is parsed/validated (code handling CreateTestRunRequest, e.g., the CreateTestRun handler or a normalize/validate helper) and apply the same strings.TrimSpace normalization to PluginEndpoint and PluginAPIKey as is done for PluginID, ProposalID, Version, and RequestedBy so they remain consistent if used elsewhere.
188-209: Consider case-insensitive `status` filtering for client ergonomics.
Allowing `queued`/`Queued` avoids unnecessary 400s without changing behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/handler.go` around lines 188 - 209, The status filter in buildFilterParams currently rejects values with different casing because it looks up the raw query string in validStatuses; normalize the incoming status (e.g., v := strings.ToUpper(strings.TrimSpace(c.QueryParam("status")))) before validating against validStatuses and before constructing queries.NullTestRunStatus/queries.TestRunStatus so inputs like "queued" or "Queued" are accepted while preserving the same stored enum values; update the validation and the TestRunStatus assignment to use the normalized uppercase value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/api/handler.go`:
- Around line 72-80: The payload currently stores PluginAPIKey in plain text on
TestRunPayload.PluginAPIKey before queuing; remove sensitive data from the
queued payload or encrypt it before marshaling. Fix by (A) deleting PluginAPIKey
from queue payload construction in the handler that builds TestRunPayload and
instead store the key in a secure store (e.g., secrets table or vault) keyed by
RunID/PluginID, then have the consumer lookup the secret after dequeuing; or (B)
if you must include it, encrypt TestRunPayload.PluginAPIKey with a service
encryption key prior to enqueueing and decrypt in the consumer, ensuring
trimming/validation (strings.TrimSpace) happens before encryption and never
logging the decrypted value. Ensure changes touch the code paths that build and
enqueue TestRunPayload and the consumer path that processes it so the consumer
can retrieve/decrypt the secret by RunID/PluginID.
- Around line 151-155: The offset parsed via strconv.Atoi is cast to int32 and
can overflow; add const maxQueryOffset = 1<<31 - 1 and validate the parsed
offset before casting: if parsed > maxQueryOffset return a 400 error (bad
request) saying offset too large. Apply this check in both places where offset
is parsed and cast to int32 (the code that sets QueryOffset and the other
occurrence in results.go), ensuring you reject oversized offsets instead of
allowing the int32 cast to wrap.
---
Nitpick comments:
In `@internal/api/handler.go`:
- Around line 22-28: The new fields PluginEndpoint and PluginAPIKey on
CreateTestRunRequest must be normalized (trimmed) like the other request fields;
find where the request is parsed/validated (code handling CreateTestRunRequest,
e.g., the CreateTestRun handler or a normalize/validate helper) and apply the
same strings.TrimSpace normalization to PluginEndpoint and PluginAPIKey as is
done for PluginID, ProposalID, Version, and RequestedBy so they remain
consistent if used elsewhere.
- Around line 188-209: The status filter in buildFilterParams currently rejects
values with different casing because it looks up the raw query string in
validStatuses; normalize the incoming status (e.g., v :=
strings.ToUpper(strings.TrimSpace(c.QueryParam("status")))) before validating
against validStatuses and before constructing
queries.NullTestRunStatus/queries.TestRunStatus so inputs like "queued" or
"Queued" are accepted while preserving the same stored enum values; update the
validation and the TestRunStatus assignment to use the normalized uppercase
value.
In `@internal/worker/manifests_test.go`:
- Around line 161-170: TestSmokeJob relies on an undocumented TTL-zero sentinel
and omits container assertions: introduce a named constant (e.g., noTTL = 0) and
use it when calling smokeJob("ns", "testrunner:v1", "my-secret", nil, envVars,
noTTL, nil) or add a brief comment near the call to document that 0 means
"disabled" for the test, then update assertions around TTLSecondsAfterFinished
on the returned job.Spec to reflect the intended "disabled" contract;
additionally, extend the test assertions for the first container
(job.Spec.Template.Spec.Containers[0]) to check its Name equals "testrunner",
Image equals "testrunner:v1", and that Env (c.Env) is non-empty to match the
coverage of TestSeederJob.
- Around line 172-187: The test TestTestrunnerEnvVars currently only checks that
POSTGRES_DSN and MINIO_ENDPOINT keys exist; update it to also assert their
expected values by reading envMap (built from testrunnerEnvVars) — e.g., assert
that envMap["MINIO_ENDPOINT"] equals "http://minio:9000" and that
envMap["POSTGRES_DSN"] targets the in-cluster postgres host (assert.Contains t,
envMap["POSTGRES_DSN"], "postgres:5432" or assert.Contains t,
envMap["POSTGRES_DSN"], "@postgres") so the manifest builder wiring is
validated.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- internal/api/handler.go
- internal/testrunner/jwt.go
- internal/testrunner/schema/schema.sql
- internal/testrunner/tests.go
- internal/worker/manifests_test.go
- internal/worker/naming_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/testrunner/jwt.go
- internal/testrunner/tests.go
- internal/testrunner/schema/schema.sql
🧹 Nitpick comments (2)
internal/api/handler.go (2)
189-210: Consider making status filter case-insensitive for better UX.
Normalizing to uppercase before validation would let clients pass `queued`, `Queued`, etc., without 400s.
Suggested tweak
```diff
 if v := strings.TrimSpace(c.QueryParam("status")); v != "" {
+	v = strings.ToUpper(v)
 	if !validStatuses[v] {
 		return fp, fmt.Errorf("invalid status: %s", v)
 	}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/handler.go` around lines 189 - 210, The status filter is case-sensitive and rejects values like "queued" or "Queued"; update buildFilterParams to normalize the status query param to a canonical form before validating and assigning it: trim and convert the value to strings.ToUpper(v) (or equivalent) prior to checking against validStatuses and prior to setting queries.NullTestRunStatus/TestRunStatus so the stored value uses the uppercase constant; ensure you reference validStatuses, buildFilterParams, queries.NullTestRunStatus and queries.TestRunStatus when making the change.
73-81: Trim plugin fields alongside other request fields for consistency.
Right now `PluginEndpoint`/`PluginAPIKey` are trimmed only at payload construction. Consider trimming them with the rest of the request to keep `req` consistently sanitized and avoid double `TrimSpace`.
Suggested refactor
```diff
@@
 req.PluginID = strings.TrimSpace(req.PluginID)
 req.ProposalID = strings.TrimSpace(req.ProposalID)
 req.Version = strings.TrimSpace(req.Version)
 req.RequestedBy = strings.TrimSpace(req.RequestedBy)
+req.PluginEndpoint = strings.TrimSpace(req.PluginEndpoint)
+req.PluginAPIKey = strings.TrimSpace(req.PluginAPIKey)
@@
 payload := queue.TestRunPayload{
 	RunID:          result.ID.String(),
 	PluginID:       req.PluginID,
 	ProposalID:     req.ProposalID,
 	Version:        req.Version,
 	Suite:          defaultSuite,
 	RequestedBy:    req.RequestedBy,
-	PluginEndpoint: strings.TrimSpace(req.PluginEndpoint),
-	PluginAPIKey:   strings.TrimSpace(req.PluginAPIKey),
+	PluginEndpoint: req.PluginEndpoint,
+	PluginAPIKey:   req.PluginAPIKey,
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/handler.go` around lines 73 - 81, Trim PluginEndpoint and PluginAPIKey on the incoming request object instead of only when building the payload: locate where the request variable req is unmarshalled/validated (the handler that constructs queue.TestRunPayload) and apply strings.TrimSpace to req.PluginEndpoint and req.PluginAPIKey early (alongside the other sanitized fields) so req is consistently sanitized; then remove the redundant TrimSpace calls in the queue.TestRunPayload construction.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/api/handler.go`:
- Around line 189-210: The status filter is case-sensitive and rejects values
like "queued" or "Queued"; update buildFilterParams to normalize the status
query param to a canonical form before validating and assigning it: trim and
convert the value to strings.ToUpper(v) (or equivalent) prior to checking
against validStatuses and prior to setting
queries.NullTestRunStatus/TestRunStatus so the stored value uses the uppercase
constant; ensure you reference validStatuses, buildFilterParams,
queries.NullTestRunStatus and queries.TestRunStatus when making the change.
- Around line 73-81: Trim PluginEndpoint and PluginAPIKey on the incoming
request object instead of only when building the payload: locate where the
request variable req is unmarshalled/validated (the handler that constructs
queue.TestRunPayload) and apply strings.TrimSpace to req.PluginEndpoint and
req.PluginAPIKey early (alongside the other sanitized fields) so req is
consistently sanitized; then remove the redundant TrimSpace calls in the
queue.TestRunPayload construction.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- internal/api/handler.go
- internal/api/results.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/api/results.go
Summary by CodeRabbit
New Features
Infrastructure