Skip to content

path-wildcards: anchored trailing-* + per-endpoint port + R0040 args#807

Open
entlein wants to merge 3 commits into
kubescape:mainfrom
k8sstormcenter:upstream-pr/sbob-path-wildcards
Open

path-wildcards: anchored trailing-* + per-endpoint port + R0040 args#807
entlein wants to merge 3 commits into
kubescape:mainfrom
k8sstormcenter:upstream-pr/sbob-path-wildcards

Conversation

@entlein
Copy link
Copy Markdown

@entlein entlein commented May 15, 2026

Overview

  • ap.was_path_opened_with_suffix/prefix: in pass-through mode, only
    scan Values (concrete paths). Wildcard Patterns are delegated
    to CompareDynamic rather than raw strings.HasPrefix/HasSuffix
    a wildcard pattern can't safely answer a suffix/prefix question
    without dynamic matching.
  • New R0040 rule helper ap.was_executed_with_args: returns true
    when the profile contains an entry whose Path matches AND whose
    Args vector matches (literal-equality, one-position
    wildcard, or * zero-or-more trailing tokens).

Signed-off-by: entlein <einentlein@gmail.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Warning

Rate limit exceeded

@entlein has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 37 minutes and 28 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a1cb5f7-f55a-4cdb-9d25-bfa38ea9d7ab

📥 Commits

Reviewing files that changed from the base of the PR and between cc59fa0 and 70a9d9b.

📒 Files selected for processing (4)
  • pkg/rulemanager/cel/libraries/applicationprofile/exec.go
  • pkg/rulemanager/cel/libraries/applicationprofile/exec_test.go
  • pkg/rulemanager/cel/libraries/applicationprofile/open.go
  • pkg/rulemanager/cel/libraries/applicationprofile/open_test.go
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: entlein <einentlein@gmail.com>
@entlein entlein marked this pull request as ready for review May 15, 2026 20:57
@entlein
Copy link
Copy Markdown
Author

entlein commented May 16, 2026

@CodeRabbit review this please together with kubescape/storage#323

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 16, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Fetch details of the linked storage PR
gh pr view 323 --repo kubescape/storage --json title,body,files,additions,deletions,changedFiles
echo "---FILES---"
gh pr diff 323 --repo kubescape/storage

Length of output: 143814


🏁 Script executed:

#!/bin/bash
# Fetch node-agent PR `#807` diff
gh pr diff 807 --repo kubescape/node-agent
echo "---PR META---"
gh pr view 807 --repo kubescape/node-agent --json title,body,files,additions,deletions,changedFiles

Length of output: 28898


🏁 Script executed:

#!/bin/bash
# Check if CompareExecArgs exists in the storage repo
gh api repos/kubescape/storage/git/trees/main?recursive=1 | jq -r '.tree[].path' | grep -i "compareexec\|compare_exec\|execargs"
echo "---SEARCH CODE---"
# Search for CompareExecArgs in storage repo via code search
gh api "search/code?q=CompareExecArgs+repo:kubescape/storage&per_page=10" | jq -r '.items[].path'

Length of output: 18


🏁 Script executed:

#!/bin/bash
# Get the head branch of storage PR `#323`
gh pr view 323 --repo kubescape/storage --json headRefName,headRefOid
echo "---SEARCH BRANCH---"
# Search for CompareExecArgs on the PR `#323` branch
gh api "repos/kubescape/storage/contents/pkg/registry/file/dynamicpathdetector" \
  --jq '.[].name' 2>/dev/null || echo "API failed"
echo "---FILE LIST ON BRANCH---"
gh api "repos/kubescape/storage/git/refs/heads" | jq -r '.[].ref' | head -20
echo "---NODE-AGENT BRANCH---"
gh pr view 807 --repo kubescape/node-agent --json headRefName,headRefOid
echo "---SEARCH NODE-AGENT BRANCH---"
# Check if CompareExecArgs is defined locally in the node-agent repo
gh api "search/code?q=CompareExecArgs+repo:kubescape/node-agent&per_page=10" | jq -r '.items[].path'

Length of output: 1010


🏁 Script executed:

#!/bin/bash
# Fetch the contents of utils.go from the storage PR branch
gh api "repos/kubescape/storage/contents/pkg/registry/file/dynamicpathdetector/utils.go?ref=upstream-pr/sbob-path-wildcards" \
  | jq -r '.content' | base64 -d

echo "---ANALYZER TAIL (search for CompareExecArgs)---"
# Check if CompareExecArgs appears in any file on the branch
gh api "repos/kubescape/storage/git/trees/upstream-pr%2Fsbob-path-wildcards?recursive=1" \
  | jq -r '.tree[].path' | xargs -I{} sh -c \
  'content=$(gh api "repos/kubescape/storage/contents/{}?ref=upstream-pr/sbob-path-wildcards" 2>/dev/null | jq -r ".content // empty" | base64 -d 2>/dev/null); echo "$content" | grep -l "CompareExecArgs" && echo "FOUND IN: {}"' 2>/dev/null | grep "FOUND IN" || true

echo "---DIRECT GREP via search---"
gh api "search/code?q=CompareExecArgs+repo:kubescape/storage+ref:upstream-pr/sbob-path-wildcards" 2>/dev/null | jq -r '.total_count, (.items[].path)' || echo "search API does not filter by ref"

Length of output: 241


🏁 Script executed:

#!/bin/bash
# Try to get the diff more specifically for utils.go in storage PR `#323` 
gh pr diff 323 --repo kubescape/storage -- "*.go" 2>/dev/null | grep -A 30 "CompareExecArgs" || echo "Not found in storage PR diff"

echo "---CHECK UTILS GO IN STORAGE MAIN---"
gh api "repos/kubescape/storage/contents/pkg/registry/file/dynamicpathdetector/utils.go" \
  2>/dev/null | jq -r '.content' | base64 -d || echo "utils.go not in main"

echo "---CHECK EXEC ARGS IN STORAGE MAIN ALL GO FILES---"
gh api "search/code?q=ExecArgs+repo:kubescape/storage&per_page=10" | jq -r '.total_count, (.items[] | .path + " -> " + .name)'

Length of output: 422


@entlein: Reviewed PR #807 (node-agent) together with kubescape/storage#323. Both are well-structured and the test coverage is extensive. Here are my findings, ordered by severity:


🔴 Critical — Build break

CompareExecArgs is not defined in kubescape/storage

exec.go calls dynamicpathdetector.CompareExecArgs(profileArgs, runtimeArgs) but this function does not exist anywhere in kubescape/storage — confirmed via code search against both main and storage PR #323's diff. This will cause a compilation failure in node-agent.

CompareExecArgs needs to be added to storage PR #323. Based on the exec-test cases (literal equality, single-position wildcard, * zero-or-more trailing tokens), it should mirror compareSegments from the rewritten CompareDynamic, but operating on a []string argument vector rather than a /-split path:

// CompareExecArgs reports whether runtimeArgs is matched by profileArgs.
// profileArgs may contain DynamicIdentifier (⋯, matches exactly one arg)
// or WildcardIdentifier (*, matches zero-or-more trailing args).
func CompareExecArgs(profileArgs, runtimeArgs []string) bool {
    return compareSegments(profileArgs, runtimeArgs)
}

The existing compareSegments in analyzer.go already implements this logic generically over []string. Exposing a thin public wrapper would avoid duplication.

Additionally, the comment in exec_test.go references the wrong repository — it should reference kubescape/storage#323, not k8sstormcenter/storage#23:

// dynamicpathdetector.CompareExecArgs (added in
// k8sstormcenter/storage#23 — ...   ← should be kubescape/storage#323

🟡 Medium — Missing test coverage

was_path_opened_with_flags function loses all direct tests

open_test.go removes TestOpenWithFlagsInProfile, TestOpenWithFlagsNoProfile, and TestOpenWithFlagsCompilation without replacement. The wasPathOpenedWithFlags function is still present in open.go (and even gains a new doc comment in this PR), but now has zero test coverage. If the flags argument is ever wired up in a future spec revision, there will be no baseline to diff against.

Consider retaining at least the compilation test and the no-profile guard test.


🟡 Medium — Misleading PR description / silent false-negatives

Patterns are skipped entirely, not "delegated to CompareDynamic"

The PR description states:

Wildcard Patterns are delegated to CompareDynamic rather than raw strings.HasPrefix/HasSuffix

But the actual change in open.go removes the Patterns loop entirely — patterns don't contribute at all to suffix/prefix answers in pass-through mode, regardless of whether CompareDynamic would match. This is the correct conservative fix (you can't answer "ends with .log?" by running CompareDynamic without a concrete candidate), but the description should be updated to say "Patterns are excluded from suffix/prefix scanning in pass-through mode" to avoid confusion for future rule authors who wonder why a profiled path that was collapsed to doesn't surface in a suffix query.


🔵 Low — Subtle back-compat semantics in wasExecutedWithArgs

In exec.go, both the exact-path and pattern branches return true unconditionally when cp.ExecsByPath[path] is absent:

} else {
    // No ExecsByPath entry for this path — back-compat: treat as
    // "no argv constraint", match.
    return types.Bool(true)
}

This means: any profile built before ExecsByPath projection was active gets unconditional path-only matching, silently ignoring runtimeArgs. While intentional, this creates an invisible two-tier behavior: new profiles enforce arg constraints, old profiles don't. Rule authors writing R0040 expressions may be surprised when a newly-deployed workload with the same path suddenly starts enforcing args that the legacy profile never evaluated. A log-level warning ("back-compat: ExecsByPath absent for %s, matching on path only") would make the fallback observable.


🔵 Low — Storage PR #323 notes

removeEndpoint mutates the backing array in-place (append(s[:i], s[i+1:]...) in MergeDuplicateEndpoints). This is safe because the return value is immediately reassigned, but it's the kind of subtle slice mutation that trips up future readers. A short comment clarifying this intent would help.

effectiveThreshold and FindConfigForPath tiebreak parity: Both correctly use strict > (first entry wins on equal-length prefixes). The comment in effectiveThreshold rightly warns about this, but NewPathAnalyzerWithConfigs could deduplicate or log a warning on duplicate prefixes at construction time to surface misconfiguration earlier.

consolidateOpens quadratic worst-case: The outer loop is O(literals) × O(patterns), and each CompareDynamic call with mid-path * can backtrack O(segment²). This is fine for realistic filesystem profiles but worth noting in the function doc for operators tuning CollapseConfig thresholds for very wide trees.


Summary

Issue Location Severity
CompareExecArgs missing from kubescape/storage exec.go + storage PR #323 🔴 Build break
Stale comment k8sstormcenter/storage#23 exec_test.go 🔵
was_path_opened_with_flags loses all test coverage open_test.go 🟡
PR description misleads on pattern delegation open.go 🟡
Back-compat silent tier in wasExecutedWithArgs exec.go 🔵
removeEndpoint in-place mutation undocumented analyze_endpoints.go 🔵

The storage PR #323 changes are solid — the per-port endpoint isolation, CompareDynamic rewrite with proper trailing-* anchoring, consolidateOpens, bufPool, and the extensive test suite all look correct. The node-agent changes are also logically sound, contingent on resolving the missing CompareExecArgs definition.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

entlein pushed a commit to k8sstormcenter/node-agent that referenced this pull request May 16, 2026
…prefix + exec-args

Two doc-only fixes for CodeRabbit cross-PR advisory:

#7 (NA kubescape#807): ap.was_path_opened_with_suffix / _prefix — explicitly
document the false-negative gap when the projection is in pass-through
mode (cp.Opens.All=true). Wildcard Patterns are skipped from the
suffix/prefix scan because their token-bearing text doesn't safely
answer suffix questions. Rule authors who need wildcard-aware coverage
should either declare an Opens-projection slice (so SuffixHits/PrefixHits
become authoritative for the literals they care about) or use
ap.was_path_opened (which runs CompareDynamic over Patterns).

#8 (NA kubescape#807): wasExecutedWithArgs — document the three states of
ExecsByPath:
  1. Path absent from Execs.Values → exec not allowed, fall through.
  2. Path in Values, ABSENT from ExecsByPath → legacy back-compat
     "no argv constraint", match.
  3. Path in Values, PRESENT with empty arg list [] → explicit
     "ran with no args" constraint, NOT a wildcard.
The distinction is load-bearing for profile authors: an entry of
{Path: ..., Args: []} is a constraint, not a free pass.

No behavioural change. Tests pass unchanged.
Signed-off-by: entlein <einentlein@gmail.com>
@entlein
Copy link
Copy Markdown
Author

entlein commented May 16, 2026

@coderabbitai review

Collective re-review request after applying your atomic-split advisory + rabbit-feedback fixes across the full SBOB upstream PR set. Each PR is now scoped to a single tier per your dependency map:

Tier 0 (opens):

Tier 2 (execs):

Tier 3 (network):

Tier 4 (tamper/sign):

Umbrellas:

Please confirm atomicity per your earlier advisory and flag any remaining cross-PR coupling we missed.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 16, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

I'll fetch all PRs in the set to verify atomicity and cross-PR coupling.


🏁 Script executed:

#!/bin/bash
set -e

# Fetch metadata for all storage PRs
for pr in 322 323 324 325 326; do
  echo "===STORAGE PR $pr==="
  gh pr view $pr --repo kubescape/storage --json title,state,headRefName,baseRefName,additions,deletions,changedFiles 2>/dev/null || echo "NOT FOUND"
done

# Fetch metadata for all node-agent PRs
for pr in 805 806 807 808 809 810 811 812; do
  echo "===NODE-AGENT PR $pr==="
  gh pr view $pr --repo kubescape/node-agent --json title,state,headRefName,baseRefName,additions,deletions,changedFiles 2>/dev/null || echo "NOT FOUND"
done

Length of output: 2908


🏁 Script executed:

#!/bin/bash
# Fetch diffs for storage PRs
for pr in 322 323 324 325 326; do
  echo "===STORAGE DIFF $pr==="
  gh pr diff $pr --repo kubescape/storage 2>/dev/null | head -600 || echo "DIFF FAILED"
  echo "---END STORAGE $pr---"
done

Length of output: 101307


🏁 Script executed:

#!/bin/bash
# Fetch diffs for node-agent PRs (excluding `#807` which we already have)
for pr in 805 806 808 809 810 811 812; do
  echo "===NODE-AGENT DIFF $pr==="
  gh pr diff $pr --repo kubescape/node-agent 2>/dev/null | head -500 || echo "DIFF FAILED"
  echo "---END NODE-AGENT $pr---"
done

Length of output: 93396


@entlein: Good progress on the split. After inspecting all 13 diffs in detail, here is the atomicity verdict and the remaining coupling issues that need resolution before merge ordering can be safely established.


✅ Confirmed Atomic (no cross-PR coupling issues found)

PR Verdict
kubescape/storage#322 ✅ Self-contained — CompareExecArgs + tests, no overlap with other PRs
#805 ✅ Self-contained — busybox-symlink resolveExecPath + 3-arg CEL overload
#809 ✅ Self-contained — cmd/sign-object CLI, no runtime entanglement
#811 ✅ Self-contained — removes Patterns loop from wasPathOpenedWithSuffix/Prefix + doc comments

🔴 Remaining Cross-PR Coupling (must resolve before merging)

1 — storage#324 ↔ storage#325: protobuf test cannot pass without the sibling

storage#324 adds NetworkNeighbor.IPAddresses []string to the struct and a TestNetworkNeighbor_IPAddresses_ProtobufRoundtrip test that calls Marshal() / Unmarshal().

storage#325's generated.pb.go is where the encoding of that field actually lives (tag 0x4a = field 9, inside NetworkNeighbor.MarshalToSizedBuffer and the corresponding Size() / Unmarshal() paths). The roundtrip test in #324 will fail to compile and run on its own because the generated protobuf code for the field is in #325.

Fix options:

  • Option A (preferred): Move the IPAddresses protobuf encoding from generated.pb.go in storage#325 into storage#324 so the test and the encoding ship in the same PR.
  • Option B: Mark #324 as depending on #325 and merge them atomically in CI (same release train).

2 — node-agent#808 ↔ node-agent#812: extractExecsByPath is defined in both

node-agent#812 adds ExecsByPath map[string][]string to ProjectedContainerProfile (in projection_types.go) and adds extractExecsByPath to projection_apply.go.

node-agent#808 also adds an identical extractExecsByPath to projection_apply.go and calls pcp.ExecsByPath = extractExecsByPath(cp).

If #812 merges first (which it must, since #807 reads cp.ExecsByPath), #808 will hit a direct merge conflict on the same function in the same file.

Fix: Rebase #808 on top of #812 and remove its copy of extractExecsByPath. After the rebase, #808 only needs the pcp.ExecsByPath = extractExecsByPath(cp) call line — the function definition is already present from #812.


3 — node-agent#806 (Tier 3) depends on node-agent#808 (Tier 4)

node-agent#806's matchIPField routes wildcard IP entries (CIDRs, *) to field.Patterns and calls networkmatch.MatchIP(field.Patterns, observed). For field.Patterns to contain anything, extractEgressAddresses / extractIngressAddresses must append n.IPAddresses... — and that append is in node-agent#808's projection_apply.go, not in #806 itself.

Without #808 merged first, a profile entry with e.g. IPAddresses: ["10.0.0.0/8"] will never reach Patterns, so matchIPField will always fall through to a miss for CIDR-only entries. Tier 3 silently degrades without Tier 4 being present.

Fix: Promote the extractEgressAddresses / extractIngressAddresses n.IPAddresses... append into node-agent#806 (it logically belongs with the network-wildcard CEL work), and remove it from #808 after rebasing. This restores the Tier 3-before-Tier 4 ordering the map intends.


4 — go.mod downgrades exist only in the umbrella (storage#326)

storage#326 has three dependency version changes absent from all atomic tier PRs:

github.com/anchore/syft         v1.42.3 → v1.32.0
github.com/anchore/stereoscope  v0.1.22 → v0.1.9
github.com/opencontainers/runtime-spec v1.3.0 → v1.2.1

If the tier PRs (#322#325) are merged individually against main, they will carry syft v1.42.3. The umbrella then merges them all and applies the downgrades. This means the umbrella cannot be trivially assembled from its tier PRs — the umbrella is doing real independent work (the dependency pin). Either:

  • Add a thin go.mod-only PR (no code changes) that must land before #326, or
  • Move the downgrades into whichever tier PR first introduces the dependency (likely storage#323 or storage#324 if networkmatch pulls in syft transitively).

⚠️ Minor / Cosmetic

Item Location
storage#323 title still says "R0040 args" That scope moved to node-agent#807; update the title
node-agent#808 still bundles 4 concerns stampOverlayIdentity, tamper detection, IPAddresses extraction, isNetworkIPWildcard/DNS classifiers — acceptable if you want a single Tier 4 PR, but worth noting for future bisect

Revised Merge Ordering (after fixes)

graph TD
  S322["storage#322\nCompareExecArgs"] --> NA807["node-agent#807\nR0040 args"]
  S323["storage#323\npath-wildcards core"] --> NA811["node-agent#811\nwas_path_opened*"]
  S323 --> S325["storage#325\nCollapseConfig CRD"]
  S324["storage#324 + IPAddresses pb\nnetworkmatch"] --> NA806["node-agent#806\nnetwork CEL"]
  S324 --> NA808["node-agent#808\nprojection overlay + tamper"]
  S325 --> NA808
  NA812["node-agent#812\nExecsByPath surface"] --> NA807
  NA812 --> NA808
  NA805["node-agent#805\nresolveExecPath"] --> NA807
  NA808 --> NA806
  NA808 --> S326["storage#326 umbrella"]
  NA810["node-agent#810 umbrella"] --> S326
Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants