NO-JIRA: Document CEL over webhooks policy for AI agents #8478
AI agents frequently attempt to add validation logic to the admission webhook. This documents the team's deliberate choice to use CEL instead, with the rationale around konnectivity latency in hosted-on-hosted topologies, availability coupling, and operational overhead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@bryan-cox: This pull request explicitly references no jira issue.
Skipping CI for Draft Pull Request.
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
This pull request adds documentation clarifying that the HostedCluster admission webhook files (…
Add new validation as CEL markers on the API types in `api/hypershift/v1beta1/` and cover them with envtests. See `api/AGENTS.md` and `test/envtest/README.md`.

The exception is CAPI resources, where webhooks are used unconditionally during the v1beta2 migration period.
it's only the conversion webhook that is required for CAPI
Good catch, updated to "a conversion webhook is required" in 7339dfe.
/retest
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8478      +/-   ##
==========================================
+ Coverage   37.49%   37.55%   +0.05%
==========================================
  Files         751      751
  Lines       91984    92029      +45
==========================================
+ Hits        34487    34557      +70
+ Misses      54854    54830      -24
+ Partials     2643     2642       -1
see 8 files with indirect coverage changes
/lgtm
## What to Do Instead

Add new validation as CEL markers on the API types in `api/hypershift/v1beta1/` and cover them with envtests. See `api/AGENTS.md` and `test/envtest/README.md`.
nit: this is valid for `api/`, not only `api/hypershift/v1beta1`
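As a rough illustration of the guidance quoted above, a CEL rule can be attached to an API type with a kubebuilder `XValidation` marker. The sketch below assumes a hypothetical `ExampleSpec` type and rule; neither comes from this PR or from the HyperShift API. The `satisfiesRule` helper only mirrors the rule in plain Go so the intent can be run locally. In a real cluster the API server would evaluate the CEL expression itself, and the rule would be covered by envtests rather than by this helper.

```go
package main

import "fmt"

// ExampleSpec is a hypothetical API type used only for illustration.
// The marker below asks controller-gen to emit a CEL validation rule into the
// generated CRD schema, so the API server can enforce it without a webhook call.
//
// +kubebuilder:validation:XValidation:rule="self.minReplicas <= self.maxReplicas",message="minReplicas must not exceed maxReplicas"
type ExampleSpec struct {
	// MinReplicas is the lower bound (hypothetical field).
	MinReplicas int32 `json:"minReplicas"`
	// MaxReplicas is the upper bound (hypothetical field).
	MaxReplicas int32 `json:"maxReplicas"`
}

// satisfiesRule mirrors the CEL rule above in plain Go.
// It is an illustration aid, not how the rule is enforced in a cluster.
func satisfiesRule(s ExampleSpec) bool {
	return s.MinReplicas <= s.MaxReplicas
}

func main() {
	valid := ExampleSpec{MinReplicas: 1, MaxReplicas: 3}
	invalid := ExampleSpec{MinReplicas: 5, MaxReplicas: 3}

	fmt.Println("valid spec accepted:", satisfiesRule(valid))
	fmt.Println("invalid spec accepted:", satisfiesRule(invalid))
}
```

Keeping the rule next to the field definitions means the constraint travels with the generated CRD, which is the property the policy relies on to avoid a webhook round trip.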
🧹 Nitpick comments (1)
.claude/rules/webhook-validation.md (1)
24-24: 💤 Low value. Consider mentioning the CAPI exception in the other guidance files.
The exception for CAPI resources during the v1beta2 migration period is documented here but not mentioned in `api/AGENTS.md` or `AGENTS.md`. If this exception is important for contributors to know, consider adding a brief note in those files as well for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In @.claude/rules/webhook-validation.md:
- Line 24: Add a short note to api/AGENTS.md and AGENTS.md mentioning the CAPI
exception for webhooks during the v1beta2 migration period (as documented in
.claude/rules/webhook-validation.md) so contributors see the caveat in the agent
guidance; update both files' relevant webhook or migration sections with one
concise sentence pointing to the existing CAPI exception and linking or
referencing the .claude/rules/webhook-validation.md entry for details.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 6ef88b4d-5fa0-45c9-8174-2221d18c48ce
📒 Files selected for processing (3)
.claude/rules/webhook-validation.md
AGENTS.md
api/AGENTS.md
/verified bypass
/override e2e-aks
@bryan-cox: /override requires failed status contexts, check run or a prowjob name to operate on.
Only the following failed contexts/checkruns were expected:
If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.
/override ci/prow/e2e-aks
@bryan-cox: Overrode contexts on behalf of bryan-cox: ci/prow/e2e-aks, ci/prow/e2e-aws, ci/prow/e2e-aws-upgrade-hypershift-operator, ci/prow/e2e-azure-self-managed, ci/prow/e2e-kubevirt-aws-ovn-reduced, ci/prow/e2e-v2-aws
/override ci/prow/e2e-azure-v2-self-managed
@bryan-cox: Overrode contexts on behalf of bryan-cox: ci/prow/e2e-azure-v2-self-managed
/override ci/prow/e2e-aks
Confirmed: release imports for …
Test Failure Analysis Complete
Job Information
Test Failure Analysis
Error
Summary
This is a transient CI infrastructure failure unrelated to the PR content. The job failed during the release image import phase, before any test code executed. ci-operator launched a pod (…
Root Cause
Race condition in ci-operator's pod termination message retrieval on the CI build cluster. The precise sequence of events:
Why n4minor succeeded but n1minor didn't: Both pods completed at nearly the same time on the same node and both were garbage-collected. However, ci-operator processed n4minor's termination message first (the "Waiting to import tag[0] on imagestream stable-n4minor:cli" line appears at 14:31:39, before the pod was deleted). For n1minor, ci-operator was still processing other pods and didn't read n1minor's termination message before the pod was removed. This namespace (…
This failure is completely unrelated to PR #8478, which is a documentation-only change ("Document CEL over webhooks policy for AI agents"). No test code was ever executed; the failure occurred in the CI infrastructure's release import machinery.
Recommendations
Evidence
@bryan-cox: Overrode contexts on behalf of bryan-cox: ci/prow/e2e-aks
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bryan-cox, csrwng
@bryan-cox: all tests passed!
## Summary
- `.claude/rules/` rule that triggers when agents touch the webhook files, redirecting them to CEL
- `api/AGENTS.md` reinforcing CEL as the correct validation mechanism

## Context
AI agents frequently attempt to extend the webhook in `hostedcluster_webhook.go` when adding validation. The webhook exists almost exclusively for KubeVirt platform-specific needs. The rationale for avoiding webhooks:

## Test plan
- `.claude/rules/webhook-validation.md` loads when Claude works with `hostedcluster_webhook.go`
- `AGENTS.md` and `api/AGENTS.md` render correctly

🤖 Generated with Claude Code