Trigger evaluation run on past monitor update #438
📝 Walkthrough

The PR adds automatic evaluation run triggering for past monitors during creation and update, introduces GitHub Actions CI/CD workflows for evaluation-job, updates deployment configurations to support evaluation extension installation with new secrets (encryption key and publisher API key), and enhances code quality tooling for the evaluation-job component.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant MonitorManager as Monitor Manager
    participant Database as Database
    participant Evaluator as Evaluator/Runner
    User->>MonitorManager: CreateMonitor (Past)
    MonitorManager->>Database: Save monitor config
    alt Monitor is Past Type
        MonitorManager->>Evaluator: ExecuteMonitorRun<br/>(monitor.TraceStart/End,<br/>evaluators)
        alt Run succeeds
            Evaluator-->>MonitorManager: Run result
            MonitorManager->>Database: Update latestRun
            MonitorManager-->>User: Monitor created with run
        else Run fails
            Evaluator-->>MonitorManager: Error
            MonitorManager->>Database: Rollback monitor entry
            MonitorManager-->>User: Creation failed
        end
    else Monitor is Future Type
        MonitorManager-->>User: Monitor created<br/>(scheduler will run)
    end
    User->>MonitorManager: UpdateMonitor (Past)
    MonitorManager->>Database: Apply updates to monitor
    alt Monitor is Past Type
        MonitorManager->>Evaluator: ExecuteMonitorRun<br/>(updated config,<br/>monitor.TraceStart/End)
        Evaluator-->>MonitorManager: Run result
        MonitorManager->>Database: Update latestRun
        MonitorManager-->>User: Monitor updated with new run
    else Monitor is Future Type
        MonitorManager-->>User: Monitor updated<br/>(no immediate run)
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
evaluation-job/test_main.py (1)
167-168: ⚠️ Potential issue | 🟡 Minor: Inaccurate version note in comment.
`datetime.datetime.fromisoformat("2026-01-15")` has accepted date-only strings since Python 3.7, not 3.11+. The comment should read `# Python 3.7+` (or simply be removed, since this is the minimum version already).

🔧 Proposed fix
```diff
- # datetime.fromisoformat accepts date-only strings in Python 3.11+
+ # datetime.fromisoformat accepts date-only strings in Python 3.7+
  assert validate_time_format("2026-01-15") is True
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@evaluation-job/test_main.py` around lines 167 - 168, the comment above the assertion incorrectly claims datetime.fromisoformat accepts date-only strings starting in Python 3.11; update that comment to the correct minimum version (Python 3.7+) or remove the comment entirely. Locate the assertion referencing validate_time_format("2026-01-15") and replace the comment text "# datetime.fromisoformat accepts date-only strings in Python 3.11+" with either "# datetime.fromisoformat accepts date-only strings since Python 3.7+" or delete the comment so the test reads without the inaccurate version note.

agent-manager-service/config/config_loader.go (1)
171-176: ⚠️ Potential issue | 🟠 Major: Keep a fallback for `INTERNAL_API_KEY` during migration.

Reading only `PUBLISHER_API_KEY` can silently fall back to the dev default in existing setups that still provide `INTERNAL_API_KEY`, causing auth drift and a weaker security posture.

Suggested migration-safe pattern

```diff
 config.InternalServer = InternalServerConfig{
 	Host:                r.readOptionalString("INTERNAL_SERVER_HOST", ""),
 	Port:                int(r.readOptionalInt64("INTERNAL_SERVER_PORT", 9243)),
 	CertDir:             r.readOptionalString("INTERNAL_SERVER_CERT_DIR", "./data/certs"),
-	APIKey:              r.readOptionalString("PUBLISHER_API_KEY", "dev-publisher-api-key"),
+	APIKey:              "",
 	ReadTimeoutSeconds:  int(r.readOptionalInt64("INTERNAL_SERVER_READ_TIMEOUT_SECONDS", 10)),
 	WriteTimeoutSeconds: int(r.readOptionalInt64("INTERNAL_SERVER_WRITE_TIMEOUT_SECONDS", 90)),
 	IdleTimeoutSeconds:  int(r.readOptionalInt64("INTERNAL_SERVER_IDLE_TIMEOUT_SECONDS", 60)),
 	MaxHeaderBytes:      int(r.readOptionalInt64("INTERNAL_SERVER_MAX_HEADER_BYTES", 65536)),
 }
+config.InternalServer.APIKey = r.readOptionalString("PUBLISHER_API_KEY", "")
+if config.InternalServer.APIKey == "" {
+	config.InternalServer.APIKey = r.readOptionalString("INTERNAL_API_KEY", "dev-publisher-api-key")
+	if os.Getenv("INTERNAL_API_KEY") != "" {
+		slog.Warn("INTERNAL_API_KEY is deprecated — use PUBLISHER_API_KEY")
+	}
+}
```

Also applies to: 241-243
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent-manager-service/config/config_loader.go` around lines 171 - 176, The config loader currently only reads PUBLISHER_API_KEY into config.InternalServer.APIKey which can silently fallback to the dev default and ignore an existing INTERNAL_API_KEY; update the logic that sets config.InternalServer.APIKey (the call to r.readOptionalString for "PUBLISHER_API_KEY") to prefer reading "INTERNAL_API_KEY" first and fall back to "PUBLISHER_API_KEY" (and then to the existing dev default), using the same pattern wherever you set API keys (including the similar APIKey assignment later in the file around the other config block) so migration-era environments providing INTERNAL_API_KEY are honored.
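The suggested-fix diff above prefers `PUBLISHER_API_KEY` and falls back to `INTERNAL_API_KEY`. That lookup chain can be sketched in Python for clarity (the service itself is Go; the helper name and warning mechanism here are illustrative, only the env var names come from the review):

```python
import os
import warnings

def resolve_publisher_api_key(default: str = "dev-publisher-api-key") -> str:
    """Migration-safe lookup: prefer the new variable, warn when the legacy one is used."""
    key = os.environ.get("PUBLISHER_API_KEY", "")
    if not key:
        key = os.environ.get("INTERNAL_API_KEY", "")
        if key:
            warnings.warn("INTERNAL_API_KEY is deprecated; use PUBLISHER_API_KEY")
    return key or default
```

With neither variable set, the dev default is returned; a legacy-only environment keeps working but emits a deprecation warning.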
🧹 Nitpick comments (3)
.github/workflows/evaluation-job-pr-checks.yaml (2)
55-57: `pytest-cov` is installed but no coverage flags are passed to `pytest`.

Either add `--cov=.` (and optionally `--cov-report=xml`) to the `pytest` invocation to actually collect coverage, or remove `pytest-cov` from the install step to avoid the unused dependency.

♻️ Option A — enable coverage reporting

```diff
- pytest test_main.py -v
+ pytest test_main.py -v --cov=. --cov-report=xml
```

♻️ Option B — drop unused install

```diff
- pip install requests litellm pytest pytest-cov
+ pip install requests litellm pytest
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/evaluation-job-pr-checks.yaml around lines 55 - 57, The workflow installs pytest-cov but never passes coverage flags to pytest; either add coverage flags to the pytest invocation (e.g., update the step that runs "pytest" to include "--cov=. --cov-report=xml" so pytest-cov actually collects and emits coverage) or remove "pytest-cov" from the pip install line to avoid the unused dependency; update the relevant CI job command that currently runs "pytest" to include the coverage options if you choose Option A, or delete "pytest-cov" from the pip installs if you choose Option B.
29-30: Pin the `ruff` version to avoid non-deterministic CI failures.
`pip install ruff` without a version pin means a new ruff release can introduce additional rules and break CI on PRs that haven't changed any linted code. Pin to a known-good version (e.g., `ruff==0.9.x`) and update it deliberately.

♻️ Proposed fix

```diff
- pip install ruff
+ pip install ruff==0.9.10  # update intentionally when upgrading
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/evaluation-job-pr-checks.yaml around lines 29 - 30, the workflow currently installs ruff with an unpinned command ("pip install ruff"); change that to pin ruff to a known-good release (e.g., replace "pip install ruff" with "pip install ruff==0.9.x" or a specific patch like "ruff==0.9.0") so CI is deterministic; update the pip install line in the same job step that runs "python -m pip install --upgrade pip" and remember to intentionally bump this pinned version when you want to adopt a newer ruff release.

evaluation-job/main.py (1)
45-45: Prefer built-in generics over `typing` aliases for the Python 3.11 target.
`Dict` and `List` from `typing` are soft-deprecated since Python 3.9 (PEP 585). With `target-version = "py311"` in `pyproject.toml`, using the built-in forms is idiomatic and removes the import.

♻️ Proposed fix

```diff
-from typing import Dict, List, Any
+from typing import Any
```

Then update the type hints in `publish_scores`:

```diff
-def publish_scores(
-    monitor_id: str,
-    run_id: str,
-    scores: Dict[str, EvaluatorSummary],
-    display_name_to_identifier: Dict[str, str],
-    api_endpoint: str,
-    api_key: str,
-) -> bool:
+def publish_scores(
+    monitor_id: str,
+    run_id: str,
+    scores: dict[str, EvaluatorSummary],
+    display_name_to_identifier: dict[str, str],
+    api_endpoint: str,
+    api_key: str,
+) -> bool:
```

And the local variable annotations inside the function body:

```diff
-    individual_scores: List[Dict[str, Any]] = []
-    aggregated_scores: List[Dict[str, Any]] = []
+    individual_scores: list[dict[str, Any]] = []
+    aggregated_scores: list[dict[str, Any]] = []
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@evaluation-job/main.py` at line 45, replace the typing aliases Dict and List with built-in generics: remove Dict and List from the import and keep only Any (i.e., change "from typing import Dict, List, Any" to "from typing import Any"); then update all type hints in the publish_scores function (and any local variable annotations in its body) from Dict[...] to dict[...] and List[...] to list[...], preserving Any where used. Ensure the function signature and internal variable annotations use the native dict/list generics throughout.
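The switch the comment asks for is purely syntactic; a small self-contained illustration of PEP 585 built-in generics (the function name and shape are illustrative, not the real `publish_scores` signature):

```python
from typing import Any

# On Python 3.9+ the built-in dict/list act as generic types,
# so typing.Dict / typing.List imports are unnecessary.
def summarize(scores: dict[str, float]) -> list[dict[str, Any]]:
    return [{"name": name, "score": value} for name, value in scores.items()]

print(summarize({"accuracy": 0.9}))  # [{'name': 'accuracy', 'score': 0.9}]
```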
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/evaluation-job-pr-checks.yaml:
- Around line 13-14: The workflow uses a self-hosted runner in the runs-on field
for lint/type-check which will stall on fork PRs; update the runs-on value for
the lint and type-check jobs (the entries currently using
"codebuild-wso2_agent-manager-${{ github.run_id }}-${{ github.run_attempt }}")
to "ubuntu-latest" to ensure fork PRs run, or alternatively document/configure
your repository's self-hosted runner approval policy so those jobs can run for
external contributions; locate and change the runs-on fields for the lint and
type-check jobs accordingly.
In `@agent-manager-service/services/monitor_manager.go`:
- Around line 324-336: The update path dereferences monitor.TraceStart /
monitor.TraceEnd for models.MonitorTypePast and triggers ExecuteMonitorRun
(ExecuteMonitorRunParams) after persisting, which can crash or leave invalid
state; add validation before persisting and before calling ExecuteMonitorRun:
ensure monitor.TraceStart and monitor.TraceEnd are non-nil, verify
traceStart.Before(traceEnd) (and any domain constraints you need), return a
clear error without writing the DB if invalid, and only then proceed to persist
and call s.executor.ExecuteMonitorRun with the validated values.
In `@deployments/docker-compose.yml`:
- Around line 60-61: Remove the hardcoded ENCRYPTION_KEY value from
docker-compose defaults and require it be provided externally; replace the
literal assignment for ENCRYPTION_KEY with a reference to an environment
variable (so runtime will read process env or an env_file) or configure it to
come from Docker secrets, and add a note/example in .env.example or the README
that ENCRYPTION_KEY must be set (do not commit real keys). Update any service
environment entries that mention ENCRYPTION_KEY and ensure the compose config
will fail-fast if the variable is missing.
In `@deployments/scripts/setup-openchoreo.sh`:
- Around line 214-216: The helm upgrade command installs
"amp-evaluation-workflows-extension" and overrides
ampEvaluation.image.repository with a short name "amp-evaluation-monitor", which
can cause image-pull failures in clean environments; update the override to a
fully-qualified image repository (e.g., include registry and namespace) or
remove the --set ampEvaluation.image.repository override so the chart default is
used, ensuring the ampEvaluation.image.repository value passed to helm (in the
upgrade/install invocation) is a fully qualified registry/path.
In `@docs/install/managed-cluster.md`:
- Around line 563-572: The second helm install snippet (the one setting
ampEvaluation.publisher.apiKey) reads like a follow-up step and could make users
run two installs for the same release; mark it explicitly as an alternative
command (e.g., "Alternative: use this command instead if you changed
publisherApiKey.value") and add a short note that it replaces the default
install to avoid duplicate-release errors; reference the values publisher.apiKey
and publisherApiKey.value and the amp-evaluation-extension helm install so
reviewers can locate and update the text accordingly.
In `@docs/install/self-hosted-cluster.md`:
- Around line 506-515: The install block presents a second helm install as if
sequential, which can fail; mark it explicitly as an alternative (e.g., start
with "Alternatively," or "To override the default API key, run:") and clarify it
is an alternative to the previous helm install for the same release name;
mention the specific values to change (ampEvaluation.publisher.apiKey vs Agent
Manager's publisherApiKey.value) so users know to use this command only when
overriding the default amp-internal-api-key.
---
Outside diff comments:
In `@agent-manager-service/config/config_loader.go`:
- Around line 171-176: The config loader currently only reads PUBLISHER_API_KEY
into config.InternalServer.APIKey which can silently fallback to the dev default
and ignore an existing INTERNAL_API_KEY; update the logic that sets
config.InternalServer.APIKey (the call to r.readOptionalString for
"PUBLISHER_API_KEY") to prefer reading "INTERNAL_API_KEY" first and fall back to
"PUBLISHER_API_KEY" (and then to the existing dev default), using the same
pattern wherever you set API keys (including the similar APIKey assignment later
in the file around the other config block) so migration-era environments
providing INTERNAL_API_KEY are honored.
In `@evaluation-job/test_main.py`:
- Around line 167-168: The comment above the assertion in test_main.py
incorrectly claims datetime.fromisoformat accepts date-only strings starting in
Python 3.11; update that comment to the correct minimum version (Python 3.7+) or
remove the comment entirely. Locate the assertion referencing
validate_time_format("2026-01-15") and replace the comment text "#
datetime.fromisoformat accepts date-only strings in Python 3.11+" with either "#
datetime.fromisoformat accepts date-only strings since Python 3.7+" or delete
the comment so the test reads without the inaccurate version note.
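The version claim above is easy to verify directly with a quick standalone snippet (not part of the test suite):

```python
from datetime import datetime

# Date-only ISO strings have parsed since Python 3.7; Python 3.11 only
# broadened fromisoformat to more ISO 8601 variants (e.g. a trailing "Z").
d = datetime.fromisoformat("2026-01-15")
print(d)  # 2026-01-15 00:00:00
```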
---
Nitpick comments:
In @.github/workflows/evaluation-job-pr-checks.yaml:
- Around line 55-57: The workflow installs pytest-cov but never passes coverage
flags to pytest; either add coverage flags to the pytest invocation (e.g.,
update the step that runs "pytest" to include "--cov=. --cov-report=xml" so
pytest-cov actually collects and emits coverage) or remove "pytest-cov" from the
pip install line to avoid the unused dependency; update the relevant CI job
command that currently runs "pytest" to include the coverage options if you
choose Option A, or delete "pytest-cov" from the pip installs if you choose
Option B.
- Around line 29-30: The workflow currently installs ruff with an unpinned
command ("pip install ruff"); change that to pin ruff to a known-good release
(e.g., replace "pip install ruff" with "pip install ruff==0.9.x" or a specific
patch like "ruff==0.9.0") so CI is deterministic; update the pip install line in
the same job step that runs "python -m pip install --upgrade pip" and remember
to intentionally bump this pinned version when you want to adopt a newer ruff
release.
In `@evaluation-job/main.py`:
- Line 45: Replace the typing aliases Dict and List with built-in generics in
evaluation-job/main.py: remove Dict and List from the import and keep only Any
(i.e., change "from typing import Dict, List, Any" to "from typing import Any");
then update all type hints in the publish_scores function (and any local
variable annotations in its body) from Dict[...] to dict[...] and List[...] to
list[...], preserving Any where used. Ensure function signature and internal
variable annotations use the native dict/list generics throughout.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
📒 Files selected for processing (20)

- .github/workflows/evaluation-job-pr-checks.yaml
- agent-manager-service/.env.example
- agent-manager-service/config/config_loader.go
- agent-manager-service/services/monitor_manager.go
- agent-manager-service/tests/monitor_test.go
- deployments/docker-compose.yml
- deployments/helm-charts/wso2-agent-manager/templates/agent-manager-service/deployment.yaml
- deployments/helm-charts/wso2-agent-manager/templates/agent-manager-service/secret.yaml
- deployments/helm-charts/wso2-agent-manager/values.yaml
- deployments/helm-charts/wso2-amp-evaluation-extension/values.yaml
- deployments/quick-start/install-helpers.sh
- deployments/quick-start/install.sh
- deployments/scripts/setup-openchoreo.sh
- docs/install/managed-cluster.md
- docs/install/self-hosted-cluster.md
- evaluation-job/Makefile
- evaluation-job/main.py
- evaluation-job/pyproject.toml
- evaluation-job/test_main.py
- libs/amp-evaluation/src/amp_evaluation/py.typed
```yaml
    runs-on:
      - codebuild-wso2_agent-manager-${{ github.run_id }}-${{ github.run_attempt }}
```
🧩 Analysis chain
🏁 Script executed:

```shell
cat -n .github/workflows/evaluation-job-pr-checks.yaml
```

Repository: wso2/agent-manager
Length of output: 2693
🌐 Web query:
GitHub Actions self-hosted runners fork pull requests unavailable behavior
💡 Result:
GitHub Actions treating pull requests from forks “as unavailable” on self-hosted runners is usually expected security behavior:
- Public repositories: PR workflows from forks may not run automatically and can require a maintainer approval first (policy: first-time contributors / all outside collaborators). Until approved, the run won’t start, which can look like runner unavailability. [1][2]
- Private/internal repositories: Fork PR workflows are disabled by default unless you enable the "Run workflows from fork pull requests" policy (set at enterprise/org level first, then repo). When enabled, fork PR runs get a read-only `GITHUB_TOKEN` and no secrets (unless you explicitly allow write tokens/secrets). [3][2]
- Why GitHub is strict here: Self-hosted runners are not guaranteed to be ephemeral/clean and can be persistently compromised by untrusted PR code—especially from forks—so GitHub strongly cautions against using self-hosted runners for public repos. [4]
Most common fixes/checks
- For public repos: repo/org settings → Actions → General → set/confirm the fork PR approval policy, then approve pending runs. [1]
- For private/internal repos: enable Run workflows from fork pull requests (and optionally other fork policies) at the org/enterprise level, then the repo. [3]
Sources
[1] GitHub Docs – Managing Actions settings / fork PR approvals in public repos (docs.github.com)
[2] GitHub Docs – Enterprise policy: fork PR workflows & notes (docs.github.com)
[3] GitHub Enterprise Server Docs – Fork PR workflows in private repos (run/write/secrets/approval) (docs.github.com)
[4] GitHub Docs – Secure use reference (self-hosted runner hardening) (docs.github.com)
Citations:
- 1: https://docs.github.com/github/administering-a-repository/managing-repository-settings/disabling-or-limiting-github-actions-for-a-repository?utm_source=openai
- 2: https://docs.github.com/enterprise-cloud%40latest/admin/policies/enforcing-policies-for-your-enterprise/enforcing-policies-for-github-actions-in-your-enterprise?utm_source=openai
- 3: https://docs.github.com/en/enterprise-server%403.14/organizations/managing-organization-settings/disabling-or-limiting-github-actions-for-your-organization?utm_source=openai
- 4: https://docs.github.com/en/actions/how-tos/security-for-github-actions/security-guides/using-githubs-security-features-to-secure-your-use-of-github-actions?utm_source=openai
Lint and type-check use CodeBuild self-hosted runners while test uses ubuntu-latest — this will cause lint and type-check to stall for fork PRs.
Self-hosted runners require explicit approval or configuration to run on fork pull requests. Without this, the lint and type-check jobs will hang indefinitely on external contributor PRs, while the test job (on ubuntu-latest) will complete successfully, creating a false sense of safety. Either use ubuntu-latest for all three jobs for consistency and to ensure fork PRs complete, or document and configure the fork PR runner approval policy in your repository settings.
Also applies to: 65-66
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/evaluation-job-pr-checks.yaml around lines 13 - 14, The
workflow uses a self-hosted runner in the runs-on field for lint/type-check
which will stall on fork PRs; update the runs-on value for the lint and
type-check jobs (the entries currently using "codebuild-wso2_agent-manager-${{
github.run_id }}-${{ github.run_attempt }}") to "ubuntu-latest" to ensure fork
PRs run, or alternatively document/configure your repository's self-hosted
runner approval policy so those jobs can run for external contributions; locate
and change the runs-on fields for the lint and type-check jobs accordingly.
```go
	if monitor.Type == models.MonitorTypePast {
		// Past monitors: trigger a new evaluation run with updated config
		result, err := s.executor.ExecuteMonitorRun(ctx, ExecuteMonitorRunParams{
			OrgName:    orgName,
			Monitor:    monitor,
			StartTime:  *monitor.TraceStart,
			EndTime:    *monitor.TraceEnd,
			Evaluators: monitor.Evaluators,
		})
		if err != nil {
			s.logger.Error("Failed to trigger past monitor run after update", "name", monitorName, "error", err)
			return nil, fmt.Errorf("monitor updated but failed to trigger evaluation run: %w", err)
		}
```
Validate past-monitor trace window before running (and before persisting).
The update path dereferences TraceStart/TraceEnd directly and triggers execution after the DB write. Missing/invalid windows can produce unstable behavior and leave invalid persisted state before returning an error.
Suggested fix
```diff
+	if monitor.Type == models.MonitorTypePast {
+		if monitor.TraceStart == nil || monitor.TraceEnd == nil {
+			return nil, fmt.Errorf("traceStart and traceEnd are required for past monitors: %w", utils.ErrInvalidInput)
+		}
+		if !monitor.TraceEnd.After(*monitor.TraceStart) {
+			return nil, fmt.Errorf("traceEnd must be after traceStart: %w", utils.ErrInvalidInput)
+		}
+		if monitor.TraceEnd.After(time.Now()) {
+			return nil, fmt.Errorf("traceEnd must not be in the future: %w", utils.ErrInvalidInput)
+		}
+	}
 	if err := s.monitorRepo.UpdateMonitor(monitor); err != nil {
 		return nil, fmt.Errorf("failed to update monitor: %w", err)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@agent-manager-service/services/monitor_manager.go` around lines 324 - 336,
The update path dereferences monitor.TraceStart / monitor.TraceEnd for
models.MonitorTypePast and triggers ExecuteMonitorRun (ExecuteMonitorRunParams)
after persisting, which can crash or leave invalid state; add validation before
persisting and before calling ExecuteMonitorRun: ensure monitor.TraceStart and
monitor.TraceEnd are non-nil, verify traceStart.Before(traceEnd) (and any domain
constraints you need), return a clear error without writing the DB if invalid,
and only then proceed to persist and call s.executor.ExecuteMonitorRun with the
validated values.
```yaml
      # Encryption Configuration (AES-256-GCM, hex-encoded 32-byte key)
      - ENCRYPTION_KEY=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
```
Avoid committing a fixed ENCRYPTION_KEY in compose defaults.
A hardcoded shared key weakens confidentiality for encrypted-at-rest data across environments. Require it via external env instead.
Suggested fix
```diff
-      # Encryption Configuration (AES-256-GCM, hex-encoded 32-byte key)
-      - ENCRYPTION_KEY=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
+      # Encryption Configuration (AES-256-GCM, hex-encoded 32-byte key)
+      - ENCRYPTION_KEY=${ENCRYPTION_KEY:?Set ENCRYPTION_KEY (openssl rand -hex 32)}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
      # Encryption Configuration (AES-256-GCM, hex-encoded 32-byte key)
      - ENCRYPTION_KEY=${ENCRYPTION_KEY:?Set ENCRYPTION_KEY (openssl rand -hex 32)}
```
🧰 Tools
🪛 Gitleaks (8.30.0)
[high] 61-61: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deployments/docker-compose.yml` around lines 60 - 61, Remove the hardcoded
ENCRYPTION_KEY value from docker-compose defaults and require it be provided
externally; replace the literal assignment for ENCRYPTION_KEY with a reference
to an environment variable (so runtime will read process env or an env_file) or
configure it to come from Docker secrets, and add a note/example in .env.example
or the README that ENCRYPTION_KEY must be set (do not commit real keys). Update
any service environment entries that mention ENCRYPTION_KEY and ensure the
compose config will fail-fast if the variable is missing.
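What "hex-encoded 32-byte key" implies can be made concrete with a small fail-fast loader (an illustrative helper, not code from this PR):

```python
import binascii
import os

def load_encryption_key(env=os.environ) -> bytes:
    """Fail fast when ENCRYPTION_KEY is missing or malformed (AES-256 needs 32 bytes)."""
    raw = env.get("ENCRYPTION_KEY")
    if not raw:
        raise RuntimeError("ENCRYPTION_KEY must be set, e.g. generated via `openssl rand -hex 32`")
    try:
        key = binascii.unhexlify(raw)
    except binascii.Error as exc:
        raise RuntimeError("ENCRYPTION_KEY must be hex-encoded") from exc
    if len(key) != 32:
        raise RuntimeError("ENCRYPTION_KEY must decode to exactly 32 bytes")
    return key
```

A 64-character hex string decodes to the required 32 bytes; anything else is rejected at startup rather than failing later during encryption.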
```shell
helm upgrade --install amp-evaluation-workflows-extension "${SCRIPT_DIR}/../helm-charts/wso2-amp-evaluation-extension" --namespace openchoreo-build-plane \
  --set ampEvaluation.image.repository="amp-evaluation-monitor" \
  --set ampEvaluation.publisher.endpoint="http://agent-manager-service:8080" \
```
Use a fully qualified evaluation image repository.
This override sets a short repository name, which can resolve to an unintended registry and cause image pull failures in clean environments. Prefer the chart’s full repo value (or remove the override if the chart default is already correct).
Suggested fix
```diff
 helm upgrade --install amp-evaluation-workflows-extension "${SCRIPT_DIR}/../helm-charts/wso2-amp-evaluation-extension" --namespace openchoreo-build-plane \
-  --set ampEvaluation.image.repository="amp-evaluation-monitor" \
+  --set ampEvaluation.image.repository="ghcr.io/wso2/amp-evaluation-monitor" \
   --set ampEvaluation.publisher.endpoint="http://agent-manager-service:8080" \
   --set ampEvaluation.publisher.apiKey="dev-publisher-api-key"
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```shell
helm upgrade --install amp-evaluation-workflows-extension "${SCRIPT_DIR}/../helm-charts/wso2-amp-evaluation-extension" --namespace openchoreo-build-plane \
  --set ampEvaluation.image.repository="ghcr.io/wso2/amp-evaluation-monitor" \
  --set ampEvaluation.publisher.endpoint="http://agent-manager-service:8080" \
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@deployments/scripts/setup-openchoreo.sh` around lines 214 - 216, The helm
upgrade command installs "amp-evaluation-workflows-extension" and overrides
ampEvaluation.image.repository with a short name "amp-evaluation-monitor", which
can cause image-pull failures in clean environments; update the override to a
fully-qualified image repository (e.g., include registry and namespace) or
remove the --set ampEvaluation.image.repository override so the chart default is
used, ensuring the ampEvaluation.image.repository value passed to helm (in the
upgrade/install invocation) is a fully qualified registry/path.
**Note:** The default `publisher.apiKey` must match the `publisherApiKey.value` configured in the Agent Manager chart (Step 4). Both default to `amp-internal-api-key`. If you changed the Agent Manager's `publisherApiKey.value`, override it here:

```bash
helm install amp-evaluation-extension \
  oci://${HELM_CHART_REGISTRY}/wso2-amp-evaluation-extension \
  --version 0.0.0-dev \
  --namespace ${BUILD_CI_NS} \
  --set ampEvaluation.publisher.apiKey="your-custom-key" \
  --timeout 1800s
```
Clarify this as an alternative command (not a follow-up).
As written, users may run both helm install commands for the same release, and the second one will fail because the release already exists.
✏️ Suggested wording tweak
```diff
-**Note:** The default `publisher.apiKey` must match the `publisherApiKey.value` configured in the Agent Manager chart (Step 4). Both default to `amp-internal-api-key`. If you changed the Agent Manager's `publisherApiKey.value`, override it here:
+**Alternative (use this instead of the previous install command when using a custom key):**
+The `ampEvaluation.publisher.apiKey` value must match `agentManagerService.config.publisherApiKey.value` from the Agent Manager chart (Step 4).
```

Based on learnings: In docs/install/, show a default helm install command and a separate alternate helm install command for customized scenarios (not sequential steps).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/install/managed-cluster.md` around lines 563 - 572, The second helm
install snippet (the one setting ampEvaluation.publisher.apiKey) reads like a
follow-up step and could make users run two installs for the same release; mark
it explicitly as an alternative command (e.g., "Alternative: use this command
instead if you changed publisherApiKey.value") and add a short note that it
replaces the default install to avoid duplicate-release errors; reference the
values publisher.apiKey and publisherApiKey.value and the
amp-evaluation-extension helm install so reviewers can locate and update the
text accordingly.
**Note:** The default `publisher.apiKey` must match the `publisherApiKey.value` configured in the Agent Manager chart (Step 3). Both default to `amp-internal-api-key`. If you changed the Agent Manager's `publisherApiKey.value`, override it here:

```bash
helm install amp-evaluation-extension \
  oci://${HELM_CHART_REGISTRY}/wso2-amp-evaluation-extension \
  --version 0.0.0-dev \
  --namespace ${BUILD_CI_NS} \
  --set ampEvaluation.publisher.apiKey="your-custom-key" \
  --timeout 1800s
```
Mark this install block as an alternative command.
This currently reads like a second sequential helm install for the same release name, which will fail if users already ran the default command above.
✏️ Suggested wording tweak
```diff
-**Note:** The default `publisher.apiKey` must match the `publisherApiKey.value` configured in the Agent Manager chart (Step 3). Both default to `amp-internal-api-key`. If you changed the Agent Manager's `publisherApiKey.value`, override it here:
+**Alternative (use this instead of the previous install command when using a custom key):**
+The `ampEvaluation.publisher.apiKey` value must match `agentManagerService.config.publisherApiKey.value` from the Agent Manager chart (Step 3).
```

Based on learnings: In docs/install/, present a default helm install command and an alternate customized helm install command as alternatives, not sequential steps.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/install/self-hosted-cluster.md` around lines 506 - 515, The install
block presents a second helm install as if sequential, which can fail; mark it
explicitly as an alternative (e.g., start with "Alternatively," or "To override
the default API key, run:") and clarify it is an alternative to the previous
helm install for the same release name; mention the specific values to change
(ampEvaluation.publisher.apiKey vs Agent Manager's publisherApiKey.value) so
users know to use this command only when overriding the default
amp-internal-api-key.
Problem
When a past monitor is updated (e.g. evaluators, trace window, or sampling rate changed), no new evaluation run is triggered. Unlike future monitors which have a scheduler, past monitors only triggered a run at creation time — leaving users with stale results after updates.
Goal
Ensure past monitors trigger a fresh evaluation run on update, consistent with the create behavior.
Approach
- `UpdateMonitor()`: after the DB update succeeds, check whether the monitor is a past monitor and call `ExecuteMonitorRun` to trigger a new evaluation run
- `CreateMonitor()`: replaced the if/else block with a direct past-monitor check; no else branch is needed since future monitors are handled by the scheduler

Fixes #437
Test plan
- `TestUpdatePastMonitor_TriggersNewRun` — confirms a new WorkflowRun CR is created on past monitor update
- `TestUpdateFutureMonitor_DoesNotTriggerRun` — confirms no extra CR on future monitor update
- Ran `make dev-test`

Summary by CodeRabbit
New Features
Documentation
Chores