CRD-driven CEL admission rules with async validator #374
Introduce AdmissionCelEvent and AdmissionCelUserInfo structs that wrap admission.Attributes into plain Go structs suitable for CEL evaluation. NewAdmissionCelEvent extracts kind, name, namespace, operation, user info, and unstructured object/oldObject/options maps from the attributes.
Docs-exempt: new internal package, no user-facing behavior change yet
Signed-off-by: Ben <ben@armosec.io>
CEL evaluation engine for admission webhook events. Uses cel-go native
types to register AdmissionCelEvent/AdmissionCelUserInfo, with
object/oldObject/options as top-level map variables (cel-go NativeTypes
does not support map[string]interface{} struct fields). Includes
RWMutex-protected program cache, expression compilation with nil-caching
for compile failures, and EvaluateRuleWithContext that implements AND
semantics with short-circuit and event-type filtering.
Docs-exempt: new internal package, no user-facing behavior change yet
Signed-off-by: Ben <ben@armosec.io>
… rules Signed-off-by: Ben <ben@armosec.io>
Implements watcher.Adaptor that watches the Rules CRD, extracts rules with k8s-admission expressions, and syncs them to the CelRuleCreator via the RuleSyncer interface. Optionally calls RBCacheRefresher after sync. Adds docs/features/cel-admission-rules.md documenting the full CEL admission rules architecture. Signed-off-by: Ben <ben@armosec.io>
…only
- NewCache now accepts a rules.RuleCreator parameter instead of hardcoding rulesv1.NewRuleCreator()
- Added RefreshRules() method to RBCache for rebuilding rule mappings when rules change
- Removed the rulesv1 import from cache package (no longer needed)
- Validator now logs all matching rules and sends alerts without blocking requests
- Updated cache_test.go to pass RuleCreatorMock to NewCache
Signed-off-by: Ben <ben@armosec.io>
Wire AdmissionCEL, CelRuleCreator, and RulesWatcher into main.go so the admission controller now loads rules from CRDs at runtime. Delete the hardcoded R2000/R2001 rule files and factory; fix CelRuleCreator to satisfy the rules.RuleCreator interface with correct return types. Signed-off-by: Ben <ben@armosec.io>
Signed-off-by: Ben <ben@armosec.io>
…sion Signed-off-by: Ben <ben@armosec.io>
The dynamic watcher dispatches every event from every watched GVR to every registered adaptor. After RulesWatcher was added alongside RBCache, the RBCache started receiving Rules CRD events and tried to parse them as RuntimeAlertRuleBinding, producing 'cannot convert int64 to string' errors (the Rules CRD has integer severity; the binding has string severity). Add an isRuleBinding kind check at the entry of each handler so RBCache ignores cross-talk from other watched resources. Signed-off-by: Ben <ben@armosec.io>
Before any rule evaluation, the validator now drops requests whose UserInfo.Name matches the operator's own ServiceAccount subject. This is a hard guarantee against positive feedback loops: the operator may write to the API (status updates, leader election, alert exports), and we never want those writes to re-enter the rule pipeline. The subject is read at construction time by parsing the unverified 'sub' claim from the projected ServiceAccount token at the standard mount path. If reading fails (e.g. running outside a pod), the validator logs a warning and continues without the short-circuit — degraded but functional. Signed-off-by: Ben <ben@armosec.io>
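The self-subject check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `subjectFromToken` and `isSelf` are hypothetical names, the ServiceAccount subject in `main` is invented, and the token is passed in as a string rather than read from the mount path (conventionally `/var/run/secrets/kubernetes.io/serviceaccount/token`). The signature is deliberately not verified, matching the commit's "unverified 'sub' claim".

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// subjectFromToken extracts the "sub" claim from a JWT without verifying
// its signature. Hypothetical helper, not the PR's actual function.
func subjectFromToken(token string) (string, error) {
	parts := strings.Split(strings.TrimSpace(token), ".")
	if len(parts) != 3 {
		return "", errors.New("token is not a three-part JWT")
	}
	// JWT segments are base64url-encoded without padding.
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return "", fmt.Errorf("decoding payload: %w", err)
	}
	var claims struct {
		Sub string `json:"sub"`
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return "", fmt.Errorf("parsing claims: %w", err)
	}
	if claims.Sub == "" {
		return "", errors.New("token has no sub claim")
	}
	return claims.Sub, nil
}

// isSelf reports whether a request was issued by the operator itself.
// An empty selfSubject (token unreadable) disables the short-circuit.
func isSelf(selfSubject, requestUser string) bool {
	return selfSubject != "" && selfSubject == requestUser
}

func main() {
	// Build a fake, unsigned token; the subject below is an invented example.
	const subject = "system:serviceaccount:kubescape:operator"
	payload := base64.RawURLEncoding.EncodeToString([]byte(`{"sub":"` + subject + `"}`))
	token := "e30." + payload + ".sig"

	sub, err := subjectFromToken(token)
	fmt.Println(sub, err)
	fmt.Println(isSelf(sub, subject))
}
```

Treating an empty subject as "never matches" mirrors the degraded mode in the commit message: if the token cannot be read, no request is dropped by this check.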
Add a KindFilter to CelRuleCreator built from the loaded rule expressions at SyncRules time. The validator consults the creator (via a small KindAcceptor interface) before any evaluation work and drops events whose Kind no rule could possibly match. The filter is conservative: expressions with || or no event.Kind == constraint collapse the filter to wildcard mode (accept everything), preserving correctness at the cost of skipping the optimization. Only expressions tagged EventType k8s-admission are inspected; other event types are ignored entirely by this operator. Signed-off-by: Ben <ben@armosec.io>
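One way to picture the conservative filter is the sketch below. It assumes a regular-expression scan for `event.Kind == "X"`; the PR may well extract kinds differently, and the `KindFilter` fields, `BuildKindFilter`, and `Accepts` are illustrative names only. The wildcard fallback for `||` and for expressions without a Kind pin follows the commit text.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// kindEq matches `event.Kind == "X"` (illustrative pattern, an assumption).
var kindEq = regexp.MustCompile(`event\.Kind\s*==\s*["']([^"']+)["']`)

// KindFilter is a hypothetical stand-in for the PR's filter type.
type KindFilter struct {
	wildcard bool
	kinds    map[string]struct{}
}

// BuildKindFilter inspects k8s-admission expressions and returns a filter
// that only rejects kinds no expression could match.
func BuildKindFilter(expressions []string) KindFilter {
	f := KindFilter{kinds: map[string]struct{}{}}
	for _, expr := range expressions {
		matches := kindEq.FindAllStringSubmatch(expr, -1)
		// A disjunction or a missing Kind pin means nothing can be ruled out.
		if strings.Contains(expr, "||") || len(matches) == 0 {
			return KindFilter{wildcard: true}
		}
		for _, m := range matches {
			f.kinds[m[1]] = struct{}{}
		}
	}
	return f
}

// Accepts reports whether an event of the given kind needs evaluation.
func (f KindFilter) Accepts(kind string) bool {
	if f.wildcard {
		return true
	}
	_, ok := f.kinds[kind]
	return ok
}

func main() {
	pinned := BuildKindFilter([]string{
		`event.Kind == "PodExecOptions" && event.Operation == "CONNECT"`,
	})
	fmt.Println(pinned.Accepts("PodExecOptions"), pinned.Accepts("Secret"))

	loose := BuildKindFilter([]string{
		`event.Kind == "Pod" || event.Kind == "Secret"`,
	})
	fmt.Println(loose.Accepts("ConfigMap"))
}
```

With no expressions loaded, the filter is non-wildcard and empty, so it accepts nothing. This sketch does not handle negation such as `!(event.Kind == "Pod")`; a real implementation would need to treat that case as wildcard too.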
The validator's Validate() now snapshots the request and enqueues onto a buffered channel, returning nil immediately so the API server never waits on CEL evaluation. A pool of worker goroutines (10 by default, 1000-slot queue) drains the channel and runs the full match + alert pipeline out-of-band. When the queue is full the request is dropped and an atomic counter is incremented; drops are logged at exponentially throttled intervals to avoid flooding the operator log under burst load. The counter is exposed via DropCount() so it can be surfaced as a Prometheus metric in a follow-up. Workers are bound to the webhook's serverContext and exit on cancellation. Tests cover: enqueue+process, drop-on-full, context-cancel shutdown, and short-circuit drops (self-pod, kind filter, nil object) staying out of the queue entirely. Signed-off-by: Ben <ben@armosec.io>
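The enqueue-or-drop path can be sketched as below. The types and names (`job`, `asyncValidator`, `newAsyncValidator`) are hypothetical, and logging on power-of-two drop counts is only one possible reading of "exponentially throttled"; the PR's actual throttling scheme is not shown in this conversation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// job is a hypothetical snapshot of an admission request.
type job struct{ kind, name string }

// asyncValidator is an illustrative stand-in for the PR's validator.
type asyncValidator struct {
	jobs  chan job
	drops atomic.Uint64
}

func newAsyncValidator(queueSize int) *asyncValidator {
	return &asyncValidator{jobs: make(chan job, queueSize)}
}

// Validate never blocks: it enqueues the job or drops it, and always
// returns nil so the caller is never held up by rule evaluation.
func (v *asyncValidator) Validate(j job) error {
	select {
	case v.jobs <- j:
	default:
		n := v.drops.Add(1)
		// Assumed throttle: log on the 1st, 2nd, 4th, 8th, ... drop.
		if n&(n-1) == 0 {
			fmt.Printf("admission queue full, %d requests dropped so far\n", n)
		}
	}
	return nil
}

// DropCount exposes the counter, e.g. for a metrics collector.
func (v *asyncValidator) DropCount() uint64 { return v.drops.Load() }

func main() {
	v := newAsyncValidator(2) // tiny queue so drops are easy to trigger
	for i := 0; i < 5; i++ {
		_ = v.Validate(job{kind: "Pod", name: fmt.Sprintf("p%d", i)})
	}
	fmt.Println("queued:", len(v.jobs), "dropped:", v.DropCount())
}
```

The `select` with a `default` branch is what keeps the send non-blocking; without it, a full channel would stall the webhook handler and therefore the API server.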
📝 Walkthrough
This PR replaces hardcoded admission rules with a CEL-driven, CRD-backed evaluation system. It introduces a CEL expression engine, dynamic rule loading via CRD watchers, and shifts webhook validation to an async worker pipeline with self-subject detection and kind-based pre-filtering.
func NewCelRuleCreator(celEngine *admissioncel.AdmissionCEL) *CelRuleCreator {
	return &CelRuleCreator{
		celEngine: celEngine,
		// Empty creator accepts nothing — the kind filter starts non-wildcard
is it fine to drop all admissions until we have rules? maybe we should only accept HTTP connections after we have rules?
IMHO - yes, I don't want to hold any line up. What if the user doesn't install rules? The whole design idea is that we treat admission webhooks as best-effort event sources
}

// Enrich with K8s details when a cache is available.
if access != nil {
enrichK8sDetails calls rulesv1.GetControllerDetails(attrs, clientset) which does a live clientset.CoreV1().Pods(namespace).Get(...), it only makes sense for a Pod or a Pod subresource.
- if access != nil {
+ if access != nil && (attrs.GetResource().Resource == "pods" || attrs.GetKind().Kind == "PodExecOptions") {
// evaluation of the same expression avoids re-compilation.
type AdmissionCEL struct {
	env          *cel.Env
	programCache map[string]cel.Program
cache is never cleared if some rules are dropped
for {
	select {
	case <-ctx.Done():
		return
you should probably drain the jobs channel before exit, other jobs enqueued just before context cancellation will never get processed
I think you're right
enrichK8sDetails resolves a Pod via clientset.CoreV1().Pods(ns).Get(...) through GetControllerDetails. That call only makes sense for Pod CRUD and Pod subresources (exec, portforward, attach) — for other kinds (NetworkPolicy, RoleBinding, Secret, ...) it either returns NotFound or fetches an unrelated Pod that happens to share the request name. Add a kind/resource gate so enrichment only runs when applicable. The admission alert payload remains complete via buildAdmissionAlert; only the Pod-derived RuntimeAlertK8sDetails fields are now omitted for non-Pod events. Reviewed-by: matthyx (PR #374 line review) Signed-off-by: Ben <ben@armosec.io>
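A gate of the kind described here might look like the following sketch. `shouldEnrich` is a hypothetical helper; the reviewer's suggestion only names the `pods` resource and the `PodExecOptions` kind, so the port-forward and attach option kinds are an assumption added to cover the subresources the commit message lists.

```go
package main

import "fmt"

// shouldEnrich reports whether Pod-derived enrichment applies to a request.
// Hypothetical helper: it takes plain strings so the sketch stays
// independent of the admission.Attributes interface.
func shouldEnrich(resource, kind string) bool {
	// Pod CRUD and Pod subresources are addressed via the "pods" resource.
	if resource == "pods" {
		return true
	}
	// Assumed fallback on the option kinds used by Pod subresources.
	switch kind {
	case "PodExecOptions", "PodPortForwardOptions", "PodAttachOptions":
		return true
	}
	return false
}

func main() {
	fmt.Println(shouldEnrich("pods", "Pod"))
	fmt.Println(shouldEnrich("networkpolicies", "NetworkPolicy"))
}
```

Gating up front avoids the live `Pods(ns).Get(...)` call entirely for kinds where the lookup could only fail or return an unrelated Pod.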
The AdmissionCEL program cache was append-only — once an expression had been compiled it stayed cached forever, even if no loaded rule still referenced it. After enough SyncRules cycles (rules added, removed, edited) the cache would grow without bound. Add an AdmissionCEL.RetainOnly(activeExpressions) that drops cached programs whose expression is not in the active set. CelRuleCreator.SyncRules collects every expression string referenced by the new rule set (each RuleExpression, plus per-rule Message and UniqueID templates) and calls RetainOnly after the swap. Reviewed-by: matthyx (PR #374 line review) Signed-off-by: Ben <ben@armosec.io>
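The pruning step can be illustrated with a small stand-alone cache. This is a sketch under assumptions: `programCache` here is a hypothetical type whose values are strings standing in for compiled `cel.Program` values, and the return value (number of evicted entries) is added for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// programCache is an illustrative stand-in for the engine's cache; the
// string values take the place of compiled CEL programs.
type programCache struct {
	mu       sync.RWMutex
	programs map[string]string
}

// RetainOnly drops every cached entry whose expression is not in the
// active set and returns how many entries were removed.
func (c *programCache) RetainOnly(active []string) int {
	keep := make(map[string]struct{}, len(active))
	for _, expr := range active {
		keep[expr] = struct{}{}
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	removed := 0
	for expr := range c.programs {
		if _, ok := keep[expr]; !ok {
			delete(c.programs, expr) // deleting while ranging is safe in Go
			removed++
		}
	}
	return removed
}

func main() {
	c := &programCache{programs: map[string]string{
		"a == 1": "prog-a",
		"b == 2": "prog-b",
		"c == 3": "prog-c",
	}}
	fmt.Println("removed:", c.RetainOnly([]string{"a == 1"}))
	fmt.Println("remaining:", len(c.programs))
}
```

Calling this after the rule swap, as the commit describes, bounds the cache by the size of the currently loaded rule set rather than by the history of all rules ever seen.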
Previously the worker's select picked between ctx.Done() and av.jobs, returning immediately on cancel. Jobs enqueued just before cancellation sat in the buffered channel forever. On context cancellation each worker now switches to a non-blocking drain pass that processes everything currently in av.jobs before exiting. Drain uses a fresh context with a bounded timeout (10s) — the worker ctx is already canceled and would short-circuit downstream API calls. If the deadline is hit, remaining jobs are abandoned with a warning log including the count. Reviewed-by: matthyx (PR #374 line review) Signed-off-by: Ben <ben@armosec.io>
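The drain-on-cancel behavior can be sketched as follows. All names (`validator`, `runWorker`, `drain`, `runDemo`) are hypothetical, jobs are plain strings, and `evaluate` simply refuses to work on a canceled context to model downstream API calls failing. The 10-second bound comes from the commit message; the up-front `ctx.Err()` check reflects the cancellation-priority suggestion made later in this review.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// validator is an illustrative stand-in for the PR's AdmissionValidator.
type validator struct {
	jobs      chan string
	wg        sync.WaitGroup
	mu        sync.Mutex
	processed []string
}

// evaluate models the match + alert pipeline: it does nothing once its
// context is canceled, as downstream API calls would fail anyway.
func (v *validator) evaluate(ctx context.Context, j string) {
	if ctx.Err() != nil {
		return
	}
	v.mu.Lock()
	v.processed = append(v.processed, j)
	v.mu.Unlock()
}

// drain processes whatever is queued right now, using a fresh context
// with a bounded lifetime instead of the already-canceled worker context.
func (v *validator) drain(timeout time.Duration) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	for {
		if ctx.Err() != nil {
			fmt.Printf("drain deadline hit, abandoning %d jobs\n", len(v.jobs))
			return
		}
		select {
		case j := <-v.jobs:
			v.evaluate(ctx, j)
		default:
			return // queue is empty
		}
	}
}

func (v *validator) runWorker(ctx context.Context) {
	defer v.wg.Done()
	for {
		// Check cancellation first so queued jobs go through drain.
		if ctx.Err() != nil {
			v.drain(10 * time.Second)
			return
		}
		select {
		case <-ctx.Done():
			v.drain(10 * time.Second)
			return
		case j := <-v.jobs:
			v.evaluate(ctx, j)
		}
	}
}

// runDemo queues jobs, cancels before the worker starts, and returns how
// many jobs were still processed via the drain pass.
func runDemo(jobs ...string) int {
	v := &validator{jobs: make(chan string, len(jobs))}
	for _, j := range jobs {
		v.jobs <- j
	}
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate shutdown with work still queued
	v.wg.Add(1)
	go v.runWorker(ctx)
	v.wg.Wait()
	return len(v.processed)
}

func main() {
	fmt.Println("processed after cancel:", runDemo("a", "b", "c"))
}
```

The drain loop is non-blocking on purpose: it handles only what is already in the channel and then exits, so shutdown time is bounded by the queue depth and the timeout rather than by new arrivals.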
Pushed three fixes addressing @matthyx's review:
On the "block admissions until rules are loaded" question (creator.go:26): I'd like to keep the current behavior. The operator is an alert-only audit path, the events are best-effort information, and gating admission on rule readiness would couple API server availability to operator startup. Cheaper to drop the cold-start events than to add a sync+blocking dance.
Tests for all three fixes are in this push.
// compile and cache for the given rules: each RuleExpression, plus the
// per-rule Message and UniqueID templates.
func collectExpressions(rs []armotypes.RuntimeRule) []string {
	out := make([]string, 0, len(rs)*3)
Actionable comments posted: 5
🧹 Nitpick comments (1)
admission/exporter/http_exporter_test.go (1)
31-55: ⚡ Quick win
Test no longer validates SendAdmissionAlert behavior directly.
This currently re-implements internal assembly logic instead of exercising the production SendAdmissionAlert path, so regressions in that method can slip through. Please route this assertion through a real SendAdmissionAlert invocation (e.g., via an httptest receiver) and verify ClusterName/ClusterUID in the emitted alert.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@admission/exporter/http_exporter_test.go` around lines 31 - 55, The test currently reconstructs RuntimeAlert internals instead of exercising SendAdmissionAlert; update the test to call exporter.SendAdmissionAlert through an httptest receiver (e.g., start a test HTTP server that captures the posted alert) using the same ruleFailure input, then assert that the emitted RuntimeAlert contains exporter.ClusterName and exporter.ClusterUID (check fields ClusterName and ClusterUID on the received RuntimeAlert.RuntimeAlertK8sDetails). Ensure you remove the manual k8sDetails injection and use the real SendAdmissionAlert call path and validate the received payload.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@admission/cel/cel.go`:
- Around line 184-189: The current getOrCreateProgram in AdmissionCEL treats a
cached nil cel.Program as a success (returns nil, nil), which hides prior
compile failures; change the caching to record compile errors explicitly (e.g.,
add a programErrCache map[string]error guarded by cacheMu) and when a lookup
finds an entry, check programErrCache[expression] first and return that error if
present; when compilation fails, store the error into
programErrCache[expression] (and a nil/absent program in programCache), and when
compilation succeeds, store the Program in programCache and clear any entry in
programErrCache so subsequent lookups return the real error or the real program
rather than silently returning nil.
In `@admission/rules/cel/evaluator.go`:
- Around line 47-55: The per-binding parameter map stored via
CelRuleEvaluator.SetParameters/GetParameters (e.parameters) is never added to
the CEL evaluation context; update the code that builds the CEL evaluation
context (the method that currently constructs it from admission attributes) to
merge e.parameters into that context so CEL expressions can reference those
parameters, or if merging conflicts occur namespace the parameters (e.g., under
"params") when injecting into the context; ensure SetParameters/GetParameters
remain consistent with the injected key(s).
In `@admission/ruleswatcher/watcher.go`:
- Around line 41-49: NewRulesWatcher currently allows a nil ruleSyncer which
causes syncAllRules to panic when calling w.ruleSyncer.SyncRules; fix by
guarding that dependency: in NewRulesWatcher check if ruleSyncer == nil and
replace it with a small no-op implementation of RuleSyncer (e.g., noopRuleSyncer
implementing SyncRules returning nil), or alternatively add a nil check at the
start of syncAllRules to return early if w.ruleSyncer is nil; update references
to the RulesWatcher.ruleSyncer field and ensure syncAllRules calls are safe.
In `@admission/webhook/validator.go`:
- Around line 223-229: The loop currently may pick a job even after ctx is
canceled; to fix, check cancellation before attempting to receive from av.jobs:
test ctx.Err() (or do a non-blocking select on <-ctx.Done()) immediately before
the select that reads from av.jobs, call av.drainJobs() and return if canceled,
then proceed to receive and call av.evaluate(ctx, job.attrs); reference
av.drainJobs, av.jobs, av.evaluate and ctx.Done to locate the code to change.
In `@docs/features/cel-admission-rules.md`:
- Around line 12-20: The fenced architecture block starting with "Rules CRD
──watch──▶ RulesWatcher ──sync──▶ CelRuleCreator" is unlabeled and triggers
markdownlint MD040; update the opening fence to include a language identifier
(e.g., ```text) so the block becomes "```text" followed by the existing ASCII
diagram and a closing "```", ensuring consistent rendering and lint compliance.
⛔ Files ignored due to path filters (1)
- go.sum is excluded by !**/*.sum
📒 Files selected for processing (31)
- admission/cel/cel.go
- admission/cel/cel_test.go
- admission/cel/event.go
- admission/cel/event_test.go
- admission/cel/integration_test.go
- admission/exporter/http_exporter_test.go
- admission/rulebinding/cache/cache.go
- admission/rulebinding/cache/cache_test.go
- admission/rulebinding/cache/helpers.go
- admission/rulebinding/cache/helpers_test.go
- admission/rules/cel/creator.go
- admission/rules/cel/creator_test.go
- admission/rules/cel/evaluator.go
- admission/rules/cel/evaluator_test.go
- admission/rules/cel/kindfilter.go
- admission/rules/cel/kindfilter_test.go
- admission/rules/v1/factory.go
- admission/rules/v1/r2000_exec_to_pod.go
- admission/rules/v1/r2000_exec_to_pod_test.go
- admission/rules/v1/r2001_portforward.go
- admission/rules/v1/r2001_portforward_test.go
- admission/rules/v1/rule.go
- admission/ruleswatcher/watcher.go
- admission/ruleswatcher/watcher_test.go
- admission/webhook/selfidentity.go
- admission/webhook/selfidentity_test.go
- admission/webhook/validator.go
- admission/webhook/validator_test.go
- docs/features/cel-admission-rules.md
- go.mod
- main.go
💤 Files with no reviewable changes (6)
- admission/rules/v1/r2000_exec_to_pod_test.go
- admission/rules/v1/rule.go
- admission/rules/v1/factory.go
- admission/rules/v1/r2001_portforward.go
- admission/rules/v1/r2001_portforward_test.go
- admission/rules/v1/r2000_exec_to_pod.go
func (c *AdmissionCEL) getOrCreateProgram(expression string) (cel.Program, error) {
	c.cacheMu.RLock()
	if prog, exists := c.programCache[expression]; exists {
		c.cacheMu.RUnlock()
		return prog, nil
	}
Do not downgrade cached compile failures to success on subsequent evaluations.
nil is used as a cache sentinel for compile failures, and later returned as a non-error path. After the first failure, the same invalid expression becomes silently ignored (false/empty string), which hides rule misconfiguration.
Suggested direction
- programCache map[string]cel.Program
+ type cachedProgram struct {
+ prog cel.Program
+ err error
+ }
+ programCache map[string]cachedProgram
...
- if prog, exists := c.programCache[expression]; exists {
- return prog, nil
+ if cp, exists := c.programCache[expression]; exists {
+ return cp.prog, cp.err
}
...
- c.programCache[expression] = nil
- return nil, fmt.Errorf("compiling expression %q: %w", expression, issues.Err())
+ err := fmt.Errorf("compiling expression %q: %w", expression, issues.Err())
+ c.programCache[expression] = cachedProgram{err: err}
+ return nil, err
Also applies to: 204-213
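To make the suggested direction concrete, here is a self-contained sketch of a cache that stores the compile error alongside the program. It is an illustration only: `engine`, `cached`, and `compile` are hypothetical, a string stands in for `cel.Program`, and the fake compiler rejects only empty expressions. The `compiles` counter exists purely to show that a failing expression is compiled once.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// cached pairs a compiled program with the error from compiling it, so a
// cache hit can replay a failure instead of reporting success.
type cached struct {
	prog string // stand-in for cel.Program
	err  error
}

type engine struct {
	mu       sync.RWMutex
	cache    map[string]cached
	compiles int // number of real compilations, for demonstration
}

// compile is a fake compiler: only the empty expression is invalid.
func (e *engine) compile(expr string) (string, error) {
	e.compiles++
	if expr == "" {
		return "", errors.New("empty expression")
	}
	return "compiled:" + expr, nil
}

func (e *engine) getOrCreate(expr string) (string, error) {
	e.mu.RLock()
	c, ok := e.cache[expr]
	e.mu.RUnlock()
	if ok {
		return c.prog, c.err
	}

	e.mu.Lock()
	defer e.mu.Unlock()
	// Re-check: another goroutine may have filled the entry meanwhile.
	if c, ok := e.cache[expr]; ok {
		return c.prog, c.err
	}
	prog, err := e.compile(expr)
	if err != nil {
		err = fmt.Errorf("compiling expression %q: %w", expr, err)
	}
	e.cache[expr] = cached{prog: prog, err: err}
	return prog, err
}

func main() {
	e := &engine{cache: map[string]cached{}}
	_, err1 := e.getOrCreate("")
	_, err2 := e.getOrCreate("")
	fmt.Println(err1 != nil, err2 != nil, e.compiles)
}
```

Both lookups report the failure while the expression is compiled only once, which keeps the benefit of negative caching without hiding the misconfiguration.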
// SetParameters stores per-binding parameter overrides.
func (e *CelRuleEvaluator) SetParameters(parameters map[string]interface{}) {
	e.parameters = parameters
}

// GetParameters returns the per-binding parameter overrides.
func (e *CelRuleEvaluator) GetParameters() map[string]interface{} {
	return e.parameters
}
Per-binding parameters are never applied during CEL evaluation.
Line 47-55 stores per-binding overrides, but Line 65-74 builds evaluation context only from admission attributes. As written, binding parameters are inert, so parameterized rules won’t evaluate as intended. Please inject e.parameters into the CEL evaluation context (or remove/defer the parameter API until it is truly supported).
Also applies to: 65-74
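If the parameters were to be injected, one low-risk shape is to namespace them under a single key so they cannot shadow `event`, `object`, or the other top-level variables. The sketch below assumes a plain `map[string]interface{}` activation and a hypothetical `buildActivation` helper; the key name `params` and the sample parameter are illustrative, not part of the PR.

```go
package main

import "fmt"

// buildActivation copies the event-derived variables and adds the
// per-binding parameters under "params". Hypothetical helper; the real
// evaluator builds its context from admission attributes.
func buildActivation(event, params map[string]interface{}) map[string]interface{} {
	vars := make(map[string]interface{}, len(event)+1)
	for k, v := range event {
		vars[k] = v
	}
	// Copy so later SetParameters calls cannot mutate an in-flight evaluation.
	p := make(map[string]interface{}, len(params))
	for k, v := range params {
		p[k] = v
	}
	vars["params"] = p // always present, even when no parameters are set
	return vars
}

func main() {
	vars := buildActivation(
		map[string]interface{}{"object": map[string]interface{}{"kind": "Pod"}},
		map[string]interface{}{"allowedNamespace": "prod"}, // invented parameter
	)
	fmt.Println(vars["params"])
}
```

With cel-go, the environment would presumably also need a matching declaration for the new variable (for example a string-keyed dynamic map) before expressions could reference `params.allowedNamespace`.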
func NewRulesWatcher(k8sClient k8sclient.K8sClientInterface, ruleSyncer RuleSyncer, cacheRefresher RBCacheRefresher) *RulesWatcher {
	return &RulesWatcher{
		k8sClient:      k8sClient,
		ruleSyncer:     ruleSyncer,
		cacheRefresher: cacheRefresher,
		watchResources: []watcher.WatchResource{
			watcher.NewWatchResource(typesv1.RuleGvr, metav1.ListOptions{}),
		},
	}
Guard against nil ruleSyncer to avoid startup/runtime panic.
syncAllRules unconditionally calls w.ruleSyncer.SyncRules(...). If wiring passes nil, this panics on first event/list sync.
Suggested fix
func NewRulesWatcher(k8sClient k8sclient.K8sClientInterface, ruleSyncer RuleSyncer, cacheRefresher RBCacheRefresher) *RulesWatcher {
+ if ruleSyncer == nil {
+ panic("ruleswatcher: ruleSyncer must not be nil")
+ }
return &RulesWatcher{
k8sClient: k8sClient,
ruleSyncer: ruleSyncer,
cacheRefresher: cacheRefresher,
watchResources: []watcher.WatchResource{
watcher.NewWatchResource(typesv1.RuleGvr, metav1.ListOptions{}),
},
}
}
Also applies to: 94-94
select {
case <-ctx.Done():
	av.drainJobs()
	return
case job := <-av.jobs:
	av.evaluate(ctx, job.attrs)
}
Prioritize cancellation before consuming more jobs.
Once ctx.Done() is ready, this select may still pick job := <-av.jobs, evaluating with an already-canceled context and dropping work due to canceled downstream calls.
Suggested fix
func (av *AdmissionValidator) runWorker(ctx context.Context) {
defer av.wg.Done()
for {
+ // Give cancellation priority so post-cancel jobs are handled by drainCtx.
+ if ctx.Err() != nil {
+ av.drainJobs()
+ return
+ }
select {
case <-ctx.Done():
av.drainJobs()
return
case job := <-av.jobs:
av.evaluate(ctx, job.attrs)
}
}
}
```
Rules CRD ──watch──▶ RulesWatcher ──sync──▶ CelRuleCreator
│
▼
Admission Webhook ──▶ RBCache.ListRulesForObject ──▶ CelRuleEvaluator
│
▼
AdmissionCEL.EvaluateRuleWithContext
```
Add a language identifier to the fenced architecture block.
Line 12 uses an unlabeled fenced block, which triggers markdownlint (MD040) and can reduce renderer/tooling consistency.
📝 Suggested diff
-```
+```text
Rules CRD ──watch──▶ RulesWatcher ──sync──▶ CelRuleCreator
│
▼
Admission Webhook ──▶ RBCache.ListRulesForObject ──▶ CelRuleEvaluator
│
▼
AdmissionCEL.EvaluateRuleWithContext
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 12-12: Fenced code blocks should have a language specified (MD040, fenced-code-language)
Implements designs-and-proposals#3 (CRD-driven CEL admission rules) (merged).

Summary
Replaces the operator's hardcoded admission rules (R2000 Exec, R2001 Port-Forward) with CRD-defined CEL rules. Same Rules CRD schema used by the node-agent for eBPF events, extended with a new k8s-admission event type.

What's in this PR

Core pipeline
- admission/cel/: AdmissionCEL engine + AdmissionCelEvent adapter (exposes event.Kind, event.UserInfo, etc. to CEL; object/oldObject/options as top-level variables due to ext.NativeTypes() map limitation).
- admission/rules/cel/: CelRuleCreator and CelRuleEvaluator replace the legacy Go rule descriptors and factory.
- admission/ruleswatcher/: Dynamic informer over Rules CRDs; filters for enabled rules with at least one k8s-admission expression, calls CelRuleCreator.SyncRules.
- admission/rulebinding/cache/: Now takes the RuleCreator as a constructor argument and exposes RefreshRules() for the watcher to invoke.

Performance and safety (per the design doc)
- Reads the sub claim from the projected ServiceAccount token at startup and drops any admission request whose UserInfo.Name matches. Hard guarantee against feedback loops from the operator's own writes. Falls back gracefully if the token isn't readable.
- CelRuleCreator extracts event.Kind == "X" patterns from loaded expressions at SyncRules time. Validator skips events for Kinds no rule could match. Conservative: falls back to wildcard for expressions with || or no Kind pin.
- Validate() enqueues onto a 1000-slot buffered channel and returns immediately; 10 workers drain it. The API server never waits on CEL. Queue-full → drop with throttled warning log + atomic counter (exposed via DropCount() for Prometheus).

Bug fix
- admission/rulebinding/cache/: Both the RBCache and RulesWatcher adaptors share the dynamic watcher event stream. The RBCache now filters non-RuntimeRuleAlertBinding events before parsing, fixing the "cannot convert int64 to string" spam previously seen after adding the RulesWatcher.

Deleted
- admission/rules/v1/r2000_exec_to_pod.go (+ test)
- admission/rules/v1/r2001_portforward.go (+ test)
- admission/rules/v1/factory.go
- admission/rules/v1/rule.go

Kept: helpers.go (K8s enrichment functions), failureobject.go (GenericRuleFailure), rule_interface.go.

Validation
End-to-end on a kind cluster with the locally built image:
- kubectl exec → R2000 fires, alert in operator logs with K8s enrichment populated
- kubectl port-forward → R2001 fires
- kubectl apply NetworkPolicy → R2002 fires (additional rule from the kind test), event.UserInfo.Username resolves correctly in the message expression

Dependencies
- armoapi-go: EventTypeK8sAdmission constant
- rulelibrary: R2000/R2001 as CEL YAML
- helm-charts: updates (RBAC, webhook config, image bump)

Test plan
- go test -race ./...