feat(awk): add sandboxed awk builtin with review-fix-loop tool#218
feat(awk): add sandboxed awk builtin with review-fix-loop tool#218AlexandreYang wants to merge 73 commits intomainfrom
Conversation
|
@codex review this PR |
|
Codex Review: Didn't find any major issues. Can't wait for the next one! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR implements a sandboxed awk builtin — a full custom lexer, parser, and interpreter written in Go with no external awk binary dependency. The implementation is large (~2500 lines across 9 files) and well-structured. Security blocking (system(), getline, redirects, ENVIRON) is done at parse time, which is the correct approach. Memory limits, loop iteration caps, and context cancellation are properly threaded through. The RE2 regex engine is used throughout, eliminating ReDoS risk.
Overall assessment: needs fixes — two P1 correctness/safety issues should be addressed before merge.
Findings Summary
Detailed Findings
Finding 1 — Printf literal width cap allows ~1 GiB allocation
Location: builtins/awk/printf.go, parsePrintfSpec, width-parsing loop (line ~145)
Description: The guard if spec.width > 1<<30 is exclusive, so width == 1<<30 (1,073,741,824) is accepted. padString/padNumber then call strings.Repeat(" ", padCount) where padCount can be up to ~1 GiB. This strings.Repeat call allocates the full padding string before the sb.Len() > MaxStringBytes guard in awkSprintf ever fires.
Proof-of-concept:
awk 'BEGIN { printf "%1073741824s\n", "x" }'
# Attempts to allocate ~1 GiB of spacesRemediation: Lower the width cap to MaxStringBytes (or a practical maximum like 8192):
if spec.width > MaxStringBytes {
return j, spec, fmt.Errorf("printf: width %d exceeds maximum %d", spec.width, MaxStringBytes)
}
Finding 2 — *-width from awk expression has no upper bound
Location: builtins/awk/printf.go, awkSprintf, *-width resolution (lines 48–55)
Description: When width is taken from a * placeholder, nextInt returns int(v.toNumber()) with no cap. A large positive value flows directly into padString/padNumber → strings.Repeat.
Proof-of-concept:
awk 'BEGIN { printf "%*s\n", 2000000000, "x" }'
# Tries to allocate ~2 GiBRemediation: Clamp spec.width after the *-width resolution:
spec.width = w
if w < 0 {
spec.flagMinus = true
spec.width = -w
}
if spec.width > MaxStringBytes {
spec.width = MaxStringBytes
}
Finding 3 — -F'\t' does not split on tab
Location: builtins/awk/interp.go, setFS (line 96) + splitFields (line 431)
Description: When the user passes -F'\t', the shell delivers the 2-char string \t (backslash + t) to setFS. setFS special-cases s != "\\t" to avoid compiling it as a regex, but stores it verbatim as r.fs = "\\t". splitFields checks r.fs == "\t" (1-byte actual tab), which does not match the 2-byte "\\t", so it falls through to strings.Split(rec, "\\t") — splitting on a literal \t string instead of tab.
Bash awk (gawk) expands backslash-escapes in FS:
printf 'a\tb\tc\n' | awk -F'\t' '{print $2}' # → "b"Our implementation produces $1 = "a\tb\tc" (no split).
Remediation: Expand common backslash escapes in setFS before storing:
func (r *runtime) setFS(s string) error {
// Expand common backslash escapes (gawk/mawk behaviour for command-line FS).
switch s {
case "\\t":
s = "\t"
case "\\n":
s = "\n"
case "\\r":
s = "\r"
}
// ... rest unchangedNote: the existing splitFields special-case for r.fs == "\t" already handles the corrected value correctly. Add a scenario test in tests/scenarios/cmd/awk/flags/ (without skip_assert_against_bash) to lock this in.
Finding 4 — index(s, "") returns 0 instead of 1
Location: builtins/awk/builtins.go, bIndex (line 166)
Description: Our code returns 0 for an empty needle, matching the POSIX "unspecified" allowance. However GNU awk (the reference used by bash) returns 1:
docker run --rm debian:bookworm-slim bash -c 'awk "BEGIN { print index(\"abc\", \"\") }"'
# → 1Since scenario tests run against bash by default, any future scenario test covering this edge case will fail.
Remediation:
if t == "" {
return numValue(1), nil
}
Finding 5 — Missing scenario test for -F'\t'
Location: tests/scenarios/cmd/awk/flags/
Description: fs_tab.yaml covers FS = "\t" set inside the program (which goes through the awk lexer), but there is no test for -F'\t' on the command line (which bypasses the lexer and hits setFS directly). Once Finding 3 is fixed, this path should be regression-tested.
Remediation: Add a scenario tests/scenarios/cmd/awk/flags/fs_tab_flag.yaml:
description: -F'\t' splits tab-separated input correctly.
setup:
files:
- path: tsv.txt
content: "name\tage\tcity\nalice\t30\tNYC\n"
input:
allowed_paths: ["$DIR"]
script: |+
awk -F'\t' 'NR==2 {print $2}' tsv.txt
expect:
stdout: |+
30
stderr: ""
exit_code: 0Test Coverage Summary
| Code path | Scenario test | Go test | Status |
|---|---|---|---|
system() blocked |
security/system_blocked.yaml |
TestAwkPentestSystemCaseVariations |
Covered |
print > file blocked |
security/redirect_write_blocked.yaml |
TestAwkPentestRedirectVariations |
Covered |
getline blocked |
security/getline_blocked.yaml |
TestAwkPentestGetlineVariations |
Covered |
ENVIRON blocked |
security/environ_blocked.yaml |
— | Covered |
| Loop iteration cap | — | TestAwkPentestInfiniteWhile |
Covered |
| Array entry cap | — | TestHardeningArrayEntryLimit |
Covered |
| String length cap | — | TestHardeningStringConcatLimit |
Covered |
| Record size cap | — | TestAwkPentestLineBeyondCap |
Covered |
| Deep nesting rejected | — | TestAwkPentestDeepNestingRejected |
Covered |
| Printf large literal width OOM | — | — | Missing |
Printf *-width OOM |
— | — | Missing |
-F'\t' command-line tab |
— | — | Missing |
index(s, "") empty needle |
— | — | Missing |
Coverage verdict: Gaps found (P1 and P2 paths untested).
Positive Observations
- Parse-time blocking for all dangerous constructs is correct and comprehensive.
- RE2 engine throughout eliminates ReDoS risk.
- Context cancellation is threaded through every loop and I/O read.
callCtx.OpenFileused exclusively — sandbox invariant is intact.floatToInt64Safecorrectly handles NaN/Inf before int64 cast.capReaderproperly bounds non-regular-file reads at 256 MiB.- Import allowlist in
analysis/symbols_builtins.gois complete and accurate. - Fuzz tests cover program text, input data, field separators, and variable assignments.
Iteration 1 self-review resultFindings by severity:
Summary: Two resource-exhaustion bugs (printf width OOM) must be fixed. Tab-FS handling and index() edge case are bash-compatibility issues. |
Implements a complete awk interpreter (lexer, parser, evaluator) as a native Go builtin — no external binary. Covers arithmetic, arrays, patterns, printf, and all standard built-in functions, with security restrictions on getline, system, pipe, redirect, and ENVIRON.
…edle - Cap printf literal width at MaxStringBytes (was > 1<<30, off by one allowing 1 GiB allocation before output guard fires) - Clamp *-width after arg resolution to MaxStringBytes (prevents ~2 GiB allocation from large awk expression values) - Expand \t/\n/\r backslash escapes in setFS so -F'\t' splits on actual tab (matches gawk/mawk behaviour) - Return 1 for index(s,"") empty needle (GNU awk compatibility) - Add fs_tab_flag.yaml scenario test for command-line -F'\t' path
|
[Claude Sonnet 4.6] Addressed all review findings from self-review (review #4219389257):
All tests pass. |
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 294aa22842
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f5ec6f1d9a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f67e090753
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review this PR |
1 similar comment
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b4f165a5b4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review this PR |
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
Scope: Full sandboxed awk builtin (~10 K lines across lexer, parser, AST, interpreter, printf engine, value system, hardening tests, ~100 scenario tests, pentest integration tests) plus a review-fix-loop developer tool and a minor tail cap reduction.
Overall assessment: safe to merge. The sandbox is correctly implemented — no direct filesystem access bypasses callCtx.OpenFile, all dangerous constructs (system(), getline, ENVIRON, redirects, pipes, user functions) are blocked at parse time, and memory/loop/output caps are thorough. The implementation is impressive in scope and quality. The findings below are all P2/P3.
Finding Summary
Test Coverage Table
| Code path | Scenario test | Go test | Status |
|---|---|---|---|
system() blocked |
security/system_blocked.yaml |
TestAwkPentestSystemCaseVariations |
Covered |
getline blocked |
security/getline_blocked.yaml |
TestAwkPentestGetlineVariations |
Covered |
ENVIRON blocked (all forms) |
6× environ_*.yaml |
— | Covered |
| redirects/pipes blocked | redirect_*.yaml, pipe_blocked.yaml |
TestAwkPentestRedirectVariations |
Covered |
| user-defined functions blocked | function_blocked.yaml |
— | Covered |
-f progfile blocked |
none | TestAwkPentestUnknownFlags (covers --no-such but not -f) |
Missing scenario |
| Record cap (1 MiB) | — | TestAwkPentestLineBeyondCap |
Covered |
| Array entry cap | — | TestAwkPentestArrayCap |
Covered |
| Array total byte cap | — | (implicit via above) | Covered |
| String concat cap | — | TestAwkPentestStringDoublingCap |
Covered |
| Loop iteration cap | — | TestAwkPentestInfiniteWhile |
Covered |
| gsub match-count cap (>1M matches) | none | none | Missing |
/dev/zero read cap |
— | TestAwkPentestDevZeroInfiniteStream |
Covered |
| Deep nesting rejected | — | TestAwkPentestDeepNestingRejected |
Covered |
| Printf precision cap | security/printf_precision_cap.yaml |
— | Covered |
| Path traversal | — | TestHardeningPathTraversal |
Covered |
Positive Observations
- No direct filesystem access — every file open in the awk implementation goes through
callCtx.OpenFile, which enforcesAllowedPaths. The single"os"import ininterp.gois used exclusively for theos.O_RDONLYconstant. Excellent discipline. - Parse-time blocking —
system(),getline,ENVIRON, redirects, pipes, and user-defined functions are all rejected at the lexer/parser level before any runtime code runs, preventing any runtime bypass via creative expression evaluation. - RE2 for all regex —
regexp.Compile(RE2 engine) is used throughout; ReDoS attacks are structurally impossible. - Thorough memory caps —
MaxRecordBytes,MaxStringBytes,MaxArrayEntries,MaxArrayTotalBytes,MaxLoopIterations,MaxTotalReadBytes,MaxFieldsall have independent guards with clear error messages. floatToInt64Safe— NaN/±Inf are handled before everyint64()cast, preventing UB from the Go spec's implementation-defined conversion behaviour.- TOCTOU race documented — the race between
StatFileandOpenFileinopenInputis clearly documented with correct analysis: it only affects the non-regular-file cap (defence-in-depth), not the sandbox itself. SetInterspersed(false)— correctly prevents flag injection via arguments after the program text.- Exit code mod-256 wrapping —
uint8(n & 0xFF)matches POSIX/mawk semantics for negative exit codes. - Deterministic
for (k in arr)iteration — sorted key snapshot prevents nondeterministic output across runs.
|
[AI Generated] Self-review iteration 24 complete — see inline review comments for findings. |
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14de466c5c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| case 'g', 'G', 'f', 'F', 'e', 'E': | ||
| if prec < 0 { | ||
| prec = 6 | ||
| } | ||
| return strconv.FormatFloat(f, byte(verb), prec, 64), true |
There was a problem hiding this comment.
Map %F before formatting OFMT/CONVFMT numbers
When a script sets OFMT or CONVFMT to %F, numeric conversion now returns the literal string %F because strconv.FormatFloat does not implement an uppercase F verb. This affects ordinary awk programs such as awk 'BEGIN{OFMT="%F"; print 1.25}', which GNU/mawk print as 1.250000 but this implementation prints %F; fold F to f here as awkSprintf already does for printf.
Useful? React with 👍 / 👎.
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR introduces a sandboxed awk builtin with a full POSIX-subset implementation (lexer → parser → AST → interpreter), plus a developer review-fix-loop automation tool. I reviewed the full diff — all new files plus the tail.go doc-fix.
Overall Assessment: ✅ Safe to merge
The security architecture is sound. All dangerous awk constructs (system(), getline, redirects, pipes, ENVIRON, user-defined functions, -f progfile) are blocked at parse time via blockedNames, before any execution occurs. File access is exclusively routed through callCtx.OpenFile/callCtx.StatFile — no direct os.Open/os.Stat calls exist in the awk package. Memory safety is well-guarded with MaxRecordBytes, MaxStringBytes, MaxArrayEntries, MaxArrayTotalBytes, and MaxLoopIterations. RE2 (linear-time, no backtracking) eliminates ReDoS. Test coverage is excellent: ~100 scenario YAML tests, fuzz corpus, hardening tests, and pentest integration tests.
Only minor P3 observations follow.
Findings Summary
Positive Observations
- Sandbox integrity: Every file read goes through
callCtx.OpenFile; noos.Open/os.Statanywhere in the awk package. The singleosimport is for theO_RDONLYconstant only. - Parse-time rejection:
system(),getline,ENVIRON,ARGV,ARGC,close,fflush, user functions, and all I/O redirects are rejected in the parser before any runtime execution — defense-in-depth against obfuscated inputs. - RE2 regex:
compileEREroutes throughregexp.Compile(Go's RE2 engine), giving linear-time guarantees. ReDoS is structurally impossible. - Memory caps: The layered approach (per-record, per-string, per-array-entry, per-array-total, per-loop-iteration) is well-thought-out. The
arrayTotalBytesaccounting correctly handles large-key attacks that stay within the entry count limit. - gsub match-count cap:
FindAllStringSubmatchIndexresult count is checked againstMaxLoopIterationsbefore the replacement loop, preventing DoS via dense-match patterns on large strings. - TOCTOU race in
openInputbetweenStatFileandOpenFileis explicitly documented with a clear justification — theAllowedPathscheck insideOpenFileis the true sandbox guard; theisRegularflag only controls the 256 MiB streaming cap, whose bypass carries no security consequence. - Intentional divergences from bash (division-by-zero as error, sorted
for-in,RS=""paragraph mode fallback, byte-indexed string functions) are all properly markedskip_assert_against_bash: truewith matching documentation inSHELL_FEATURES.md. review-fix-looptool: All shell commands useexec.Commandwith argument slices (no shell injection). TheprRefinteger is formatted with%d. File paths usefilepath.Join. The concurrentcode-review+triggerCodexgoroutines are race-free becausetriggerCodexis a standalone subprocess that never touches the sharedagentstate.
|
[AI Generated] Self-review iteration 25 complete — see inline review comments for findings. |
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14de466c5c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // as non-clean: if a review or fix agent fails, we cannot be confident the | ||
| // iteration is valid even when thread counts and CI appear clean. | ||
| agentErr := reviewAgentErr != nil || addrAgentErr != nil || fixCIAgentErr != nil | ||
| iterClean := reviewFindings == 0 && unresolved == 0 && ciClean && unresolvedErr == nil && ciErr == nil && !agentErr |
There was a problem hiding this comment.
Treat failed review counts as non-clean
When either the pre- or post-review countUnresolvedThreads call fails (for example due to a transient GitHub GraphQL/rate-limit/auth error), the zero values for beforeReview/afterReview still produce reviewFindings == 0, and this clean check only gates on the later unresolvedErr. If the final unresolved lookup and CI pass, the loop can advance the clean streak and eventually report CLEAN without ever measuring whether the review step opened new threads; include beforeErr/afterErr in the non-clean condition so unknown review results cannot be treated as clean.
Useful? React with 👍 / 👎.
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR introduces a sandboxed awk builtin (plus a review-fix-loop development tool). The awk implementation is large (~1,800 LOC interpreter + parser + lexer + printf engine) with extensive test coverage (scenario YAML tests, Go unit/hardening/fuzz/pentest tests). Overall the security posture is strong.
Overall Assessment: ✅ Safe to merge — with minor notes
Security verdict: No exploitable vulnerabilities found. The sandbox perimeter is intact:
system(),ENVIRON,getline,|pipes, and all file redirections are rejected at parse time — no runtime bypass possible.- All file I/O goes through
callCtx.OpenFile(AllowedPaths enforced); no directos.Open/os.Statcalls outside the sandbox API. - RE2 engine prevents ReDoS.
- Memory bounds are comprehensive: per-record cap (1 MiB), string cap (1 MiB), array entries cap (1 M), array total-bytes cap (256 MiB), loop iteration cap (1 M).
- Context cancellation is checked every loop iteration, every statement, and every record.
Summary of Findings
Positive Observations
- The blocked-names list (
system,ENVIRON,getline,function,close,fflush,ARGV,ARGC) is checked in every parse path:parsePrimary,parseDelete,parseForIn,parseCall,parseInExpr,parsePrimary (tkIdent). Comprehensive. - The
bSub(gsub) zero-width match handling is correct: Go RE2'sFindAllStringSubmatchIndexreturns non-overlapping matches at distinct byte positions, so thelastcursor advances correctly between empty matches. rebuildRecordpre-checks OFS × NF before callingstrings.Jointo prevent a transient allocation ofO(ofsLen × NF)bytes before the cap fires.floatToInt64Safecorrectly handles NaN/±Inf before any int64 cast.- All intentional divergences from bash (
for-insorted order, division-by-zero fatal,RS=""paragraph mode, byte-indexed string functions,index(s,"")=1) are documented inSHELL_FEATURES.mdand covered byskip_assert_against_bash: truescenario tests. - The
deterministicRand(xorshift64*) avoidsmath/randglobal state and is clearly scoped per-runtime. - Fuzz corpus seeds are committed (
builtins/tests/awk/testdata/fuzz/). interp/builtin_awk_pentest_test.gois thorough: exercises/dev/zero, deep nesting, NF overflow, string doubling, path traversal, and all security-blocked forms.
|
[AI Generated] Self-review iteration 26 complete — see inline review comments for findings. |
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14de466c5c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
|
|
||
| func (p *parser) parseAssignment() (expr, error) { | ||
| lhs, err := p.parseTernary() |
There was a problem hiding this comment.
Parse assignments on boolean RHS
For expressions like ok && x = 1 or cond ? x = 1 : 0, mawk/gawk parse the assignment as part of the RHS/branch; this implementation parses the whole ternary/logical expression as lhs before looking for =, so awk 'BEGIN{ if (1&&x=1) print x }' is rejected with invalid assignment target instead of printing 1. This breaks awk idioms that assign inside conditions unless users add extra parentheses.
Useful? React with 👍 / 👎.
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR implements a sandboxed awk builtin (~10 400 lines added) and a review-fix-loop developer tool (~1 400 lines). The review focused primarily on sandbox integrity, memory safety, bash compatibility, and correctness of the awk implementation.
Overall Assessment: ✅ Safe to merge with minor improvements
The security model is sound. The critical invariant — all file access goes through callCtx.OpenFile (not raw os.Open/os.Stat) — is correctly maintained throughout. The parser-level blocking of system(), getline, ENVIRON, file redirects, and pipe redirects is robust and tested. Memory bounds (record size, string size, array entries, array total bytes, loop iterations, gsub match count) are comprehensive and correctly enforced. The RE2 engine eliminates ReDoS risk.
Four findings are raised: two P2 (unintentional bash-compat divergences) and two P3 (code quality). No P0 or P1 issues found.
Findings Summary
Positive Observations
- Sandbox file access: All file I/O routes through
callCtx.OpenFile/callCtx.StatFile. No directos.Open,os.Stat, orioutilcalls anywhere in theawkpackage. This is the most critical invariant and it is correctly maintained. ✅ - Parser-level security blocks:
system(),ENVIRON,getline(all forms), file redirects (>,>>,|),function,close,fflushare rejected at parse time — before any runtime execution — making them immune to any runtime bypass attempts. ✅ - Multiple memory caps working in tandem: per-record (1 MiB), per-string (1 MiB), per-array (1 M entries + 256 MiB total bytes), gsub match count (1 M), loop iterations (1 M), and total non-regular-file reads (256 MiB). The caps are independently enforced and tested. ✅
- RE2 / linear-time regex: All regex compilation uses Go's
regexppackage (RE2), making ReDoS impossible. ✅ - TOCTOU race acknowledged and acceptable: The
StatFile → OpenFilerace for theisRegularflag is well-commented. The worst case is losing the 256 MiB cap for that file;AllowedPathsinsideOpenFilestill enforces path boundaries. ✅ - Parse-depth cap:
maxParseDepth = 256in the recursive-descent parser prevents stack-overflow DoS from deeply nested programs. Tested in both the hardening tests and pentest suite. ✅ - Deterministic PRNG: Custom xorshift64* ensures
rand()/srand()are reproducible in tests without exposing system state. ✅ - Comprehensive test suite: ~100 YAML scenario tests, fuzz corpus, hardening tests, GNU compat tests, and a full pentest integration test covering all blocked constructs, integer edge cases, resource exhaustion, and binary inputs. ✅
|
[AI Generated] Self-review iteration 27 complete — see inline review comments for findings. |
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14de466c5c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| arr := r.arrays[v.arrayVar] | ||
| if arr == nil { | ||
| return numValue(0), nil |
There was a problem hiding this comment.
Preserve scalar-as-array errors for membership tests
When the right-hand side of an in expression has already been used as a scalar, this returns 0 instead of the same fatal scalar-as-array error enforced by index assignment, delete, and for (k in x). For example, awk 'BEGIN { x = 1; print ("a" in x); print "after" }' now prints 0 and continues, while awk treats this as an invalid scalar/array namespace reuse; scripts that rely on that error will silently take the false branch and keep running.
Useful? React with 👍 / 👎.
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR adds a sandboxed awk builtin to the restricted shell plus a developer review-fix-loop tool. The awk implementation is a substantial piece of work: a hand-written lexer, recursive-descent parser, tree-walking interpreter, and a custom printf engine — all within the sandbox invariant.
Overall assessment: Safe to merge with minor fixes. No P0 sandbox-escape or P1 security/correctness issues found. The sandbox invariants are correctly upheld (all file access goes through callCtx.OpenFile/callCtx.StatFile; all dangerous awk constructs are blocked at parse time with multiple defence-in-depth checks; resource limits are consistently applied). The main findings are two P2 unintentional bash-divergence issues and a pair of P3 concerns.
Findings Summary
Coverage Table
| Code path | Scenario test | Go test | Status |
|---|---|---|---|
system() / getline / ENVIRON / redirects blocked at parse |
security/*.yaml | awk_test.go | Covered |
print > "file" / >> "file" / | cmd blocked |
security/*.yaml | awk_test.go | Covered |
| MaxRecordBytes per-record cap | pentest (lineBeyondCap) | awk_test.go:TestLongLineRejected | Covered |
| MaxArrayEntries / MaxArrayTotalBytes caps | hardening | awk_hardening_test.go | Covered |
| MaxLoopIterations per-loop cap | awk_test.go:TestLoopIterationLimit | pentest | Covered |
| MaxStringBytes concat cap | hardening | awk_hardening_test.go | Covered |
| MaxGsubMatches cap | — | — | Missing scenario test |
| gsub empty-pattern (zero-width match) | builtins_fn/gsub_empty_pattern.yaml | — | Covered |
| /dev/zero infinite stream → capReader | — | pentest:TestAwkPentestDevZeroInfiniteStream | Covered |
for (k in ENVIRON) blocked |
security/environ_for_in_blocked.yaml | awk_test.go | Covered |
delete ENVIRON blocked |
security/environ_delete_blocked.yaml | — | Covered |
(k in ENVIRON) blocked |
security/environ_in_expr_blocked.yaml | — | Covered |
| Deeply-nested expr parse-depth cap | — | pentest + hardening | Covered |
| TOCTOU StatFile→OpenFile (accepted risk) | — | — | Documented in code |
next in BEGIN → propagates as error |
edge/next_in_begin.yaml | — | Covered (skip_assert) |
| Directory-as-input → exit 1 vs mawk's 2 | — | hardening + pentest | P2 divergence |
printf "%.*d", MaxStringBytes, 1 cap |
security/printf_precision_cap.yaml (has \n) | pentest | Partially covered |
Positive Observations
- Complete sandbox coverage: every dangerous awk construct (
system(),getline,ENVIRON, file redirects, pipes, user-defined functions) is rejected at parse time with clear error messages. Multiple defence-in-depth checks (blockedNames, parsePrintLike/checkNoRedirect, storeScalar) ensure blocked operations cannot slip through. - RE2 safety: all regexes compile through Go's
regexp(RE2 engine), making ReDoS attacks structurally impossible. - Tight resource caps: per-record 1 MiB, per-string 1 MiB, arrays 1 M entries + 256 MiB total bytes, 1 M loop iterations, 1 M gsub matches, 256 MiB total non-regular-file read, parser depth 256, runtime depth 1024.
- No direct
os.*file access: all file I/O routes throughcallCtx.OpenFile/callCtx.StatFile— the single most critical invariant is upheld throughout. - Context cancellation: every tight loop checks
ctx.Err()at least every 1024 iterations; the gsub match-processing loop also checks it;processFile's record loop checks it each iteration. - Deterministic
for (k in a)iteration: sorted keys prevent output non-determinism in scripts.
|
[AI Generated] Self-review iteration 28 complete — see inline review comments for findings. |
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14de466c5c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| arr := r.arrays[v.arrayVar] | ||
| if arr == nil { | ||
| return numValue(0), nil |
There was a problem hiding this comment.
Evaluate
in keys before checking the array map
When the right-hand array has not been materialized yet, this returns before evaluating the left-hand key expression. In awk the key expression still runs, so cases like awk 'BEGIN { print (++i in a), i }' should print 0 1; this implementation prints 0 with i still uninitialized. The same early return also skips the scalar/array conflict that other array paths enforce for x=1; (1 in x), so moving key evaluation and scalar checks before the nil-array fast path would avoid silently changing program behavior.
Useful? React with 👍 / 👎.
| if fs.Changed("field-separator") { | ||
| if err := runtime.setFS(*fieldSep); err != nil { | ||
| callCtx.Errf("awk: -F: %s\n", err) | ||
| return builtins.Result{Code: 1} | ||
| } | ||
| } | ||
| for _, a := range assignments { |
There was a problem hiding this comment.
Preserve command-line order between
-F and -v
Applying -F before all -v assignments loses awk's left-to-right option semantics when both write FS. I checked GNU awk locally: awk -v FS=: -F, 'BEGIN{print FS}' prints ,, while awk -F, -v FS=: 'BEGIN{print FS}' prints :; this implementation prints : for both because the -v loop always runs after setFS, which also makes field splitting use the wrong separator for scripts that rely on option order.
Useful? React with 👍 / 👎.
AlexandreYang
left a comment
There was a problem hiding this comment.
Security Review — PR #218: feat(awk): add sandboxed awk builtin
Overall assessment: ✅ Safe to merge — no P0 or P1 findings. The implementation shows careful, layered security design with comprehensive testing. A few P2/P3 observations are noted below.
Summary Table
Findings
Known TOCTOU race between StatFile and OpenFile
Severity: P2 · Security · Correctness
Location: builtins/awk/interp.go ~L773 (openInput)
The code calls StatFile to determine whether an input is a regular file (to skip the 256 MiB cumulative cap), then calls OpenFile separately. A file could be swapped to a symlink or special file between these two calls, causing isRegular=true and bypassing the read cap for that input.
The comment in the code correctly analyses this and concludes the risk is accepted:
AllowedPathsinOpenFileis the primary sandbox guard — it still fires.- The 256 MiB cap is defense-in-depth only; bypassing it does not unlock new capabilities.
- Exploitation requires the attacker to also control the
AllowedPathsdirectory (typically read-only in production).
This analysis is sound. Flagging as P2 only to ensure future readers are aware and so the acceptance is explicit in the review record. No code change required; the existing comment is sufficient documentation.
Fragile "is a directory" string match in scanner error path
Severity: P2 · Correctness · Portability
Location: builtins/awk/interp.go ~L536
if errors.Is(scErr, os.ErrInvalid) || strings.Contains(scErr.Error(), "is a directory") {
return &fileOpenError{...}
}The strings.Contains(scErr.Error(), "is a directory") match relies on the OS-specific error message string. On Linux this message comes from the kernel and is stable, but on macOS it is "is a directory" (lowercase) and on Windows it would be something different entirely. If the message ever diverges (e.g. a future OS, a translated locale, or a non-standard file system), this path silently falls through to the return fmt.Errorf(...) line, which is treated as a fatal runtime error (exit 1) rather than a non-fatal file-open error (exit 2), changing the observed exit code.
Remediation: Use os.Lstat or inspect the file mode before opening, or wrap the os.Open error with a syscall.EISDIR check:
var pathErr *os.PathError
if errors.As(scErr, &pathErr) && errors.Is(pathErr.Err, syscall.EISDIR) {
return &fileOpenError{...}
}Alternatively, since callCtx.StatFile is already called in openInput, you could stat the file and detect directories before scanning starts. This is a P2 (not P1) because the current behavior still fails safely — it just exits with the wrong code on non-Linux platforms.
Dead fields: numExpr.src and regexExpr.src
Severity: P3 · Code Quality
Location: builtins/awk/ast.go L30 and L212
type numExpr struct {
val float64
src string // original lexeme, preserved so toString can use it for OFMT
}
type regexExpr struct {
re *regexp.Regexp
src string
}Both src fields are set at parse time but never read anywhere in the codebase. The comment on numExpr.src suggests they were intended for OFMT formatting, but formatNumber / printableString use v.f directly. This is harmless but wastes a small amount of memory per AST node. Consider removing them or adding the planned usage.
Key snapshot in for-in not context-checked
Severity: P3 · DoS (very minor)
Location: builtins/awk/interp.go (forInStmt case)
keys := make([]string, 0, len(arr))
for k := range arr {
keys = append(keys, k)
}
slices.Sort(keys)For an array with MaxArrayEntries (1 M) keys, the snapshot build + sort takes O(N log N) time. ctx.Err() is not checked during the snapshot or sort phases. In practice this is fast (well under 1 s for 1 M string keys on modern hardware), and ctx.Err() is checked on each body iteration immediately after, so execution will be cancelled promptly on the next iteration. The comment in the code correctly notes this. Flagging as P3 only for completeness.
Missing scenario: very-large OFMT/CONVFMT width
Severity: P3 · Test Coverage
Location: tests/scenarios/cmd/awk/
The formatFloatWithFmt function in value.go only handles format strings of the form %[.prec][gGfeEd] and has a precision cap of 64. It silently falls back to %.6g for anything outside that subset (including strings with a width like %1024g or %s). This is safe, but there is no scenario test that exercises OFMT/CONVFMT set to an unusual value (e.g. %1024g, %s, empty string).
Suggested scenario:
description: OFMT set to a non-standard format string falls back to default %.6g.
skip_assert_against_bash: true
input:
script: |
awk -v OFMT="%1024g" 'BEGIN { x = 3.14159; print x }'
expect:
stdout: "3.14159\n"
exit_code: 0Positive Observations
🔒 Layered sandbox enforcement: All I/O-capable awk constructs (system(), getline, print > file, print | cmd, close(), fflush(), ENVIRON, user-defined functions) are rejected at parse time — before any code runs. This is the correct approach: a single point of enforcement that cannot be bypassed by runtime tricks.
🔒 RE2 regex engine: All regex compilation goes through Go's regexp package (RE2), which guarantees linear-time matching with no backtracking. ReDoS attacks are structurally impossible.
🔒 Comprehensive resource caps: Record size (1 MiB), string length (1 MiB), array entries (1 M), array total bytes (256 MiB), loop iterations per construct (1 M), gsub matches (1 M), and total reads from non-regular files (256 MiB) — all independently bounded.
🔒 Context propagation: ctx.Err() is checked at every loop iteration in the main record loop, every statement in execBlock, and every iteration of while/for/do-while/for-in body execution. The 30 s shell execution timeout is reliably honoured.
🔒 Parse-time depth limiting: maxParseDepth = 256 prevents stack-overflow attacks via deeply nested expressions. The runtime adds a second, independent maxRuntimeDepth = 1024 guard.
🔒 Integer safety: floatToInt64Safe correctly maps NaN→0 and clamps ±Inf to int64 bounds before conversion, avoiding undefined behaviour from out-of-range float→int casts.
🔒 Array byte budget: The arrayTotalBytes counter is decremented correctly in delete and reset in split, preventing the subtle large-key DoS that entry-count limits alone cannot stop.
🔒 Fuzz + pentest coverage: The PR ships fuzz seeds, a hardening test file, and a full integration pentest suite covering integer edges, path traversal, resource exhaustion, and all blocked language constructs.
Coverage Summary
| Code path | Scenario test | Go test | Status |
|---|---|---|---|
| All blocked constructs (system, getline, ENVIRON, functions, pipes, redirects) | tests/scenarios/cmd/awk/security/ |
pentest_test.go | ✅ Covered |
| Record size cap | — | awk_hardening_test.go / pentest |
✅ Covered |
| Array entry + memory cap | — | awk_hardening_test.go / pentest |
✅ Covered |
| Loop iteration cap | — | awk_hardening_test.go / pentest |
✅ Covered |
| Division by zero (intentional divergence) | arithmetic/division_by_zero.yaml |
— | ✅ Covered |
for-in sorted order (intentional divergence) |
loops/for_in_sorted.yaml |
— | ✅ Covered |
| RS="" paragraph mode not implemented (intentional divergence) | edge/rs_empty_paragraph.yaml |
— | ✅ Covered |
| OFMT/CONVFMT unusual format strings | — | — | |
strings.Contains("is a directory") on non-Linux |
— | — |
|
[AI Generated] Self-review iteration 29 complete — see inline review comments for findings. |
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14de466c5c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if fs.Changed("field-separator") { | ||
| if err := runtime.setFS(*fieldSep); err != nil { |
There was a problem hiding this comment.
Preserve command-line ordering for -F and -v
When a caller passes -v FS=... before -F..., GNU awk applies these options in argv order, so the later -F value should win. This handler always applies -F here and then replays all -v assignments afterward, making -v FS=... win regardless of the original order; for example, awk -v FS=, -F: '{print $1}' will split on commas instead of colons. Please preserve the relative order of -F and -v assignments when initializing the runtime.
Useful? React with 👍 / 👎.
|
@codex review this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ca312898b0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if i != len(spec)-1 { | ||
| return "", false |
There was a problem hiding this comment.
Honor OFMT/CONVFMT width and flags
When users set OFMT or CONVFMT to a normal awk printf format with width or flags, such as OFMT="%2.2f" or CONVFMT="%08.2f", this parser returns false and formatNumber silently falls back to %.6g. That makes common awk programs print or concatenate the wrong numeric text; for example awk 'BEGIN{OFMT="%2.2f"; print 1.5}' should print 1.50, but this builtin prints 1.5.
Useful? React with 👍 / 👎.
What does this PR do?
Implements
awkas a sandboxed builtin command in the restricted shell interpreter. Also adds areview-fix-loopdeveloper tool that automates iterative PR review cycles.awk builtin
getline,system(),close(), pipe redirects (|), file redirects (>,>>,<),fflush(), user-defined functions, andENVIRONaccesssplit,sub,gsub,match,sprintf, etc.)-Ffield separator,-vvariable assignments,-fprogram filereview-fix-loop tool (
tools/review-fix-loop/)alex-code-reviewandaddress-pr-commentsClaude Code skills in sequenceTesting
builtins/tests/awk/builtins/tests/awk/awk_fuzz_test.gobuiltins/tests/awk/awk_hardening_test.gointerp/builtin_awk_pentest_test.gotests/scenarios/cmd/awk/(~100 YAML files, bash-compared)Checklist
SHELL_FEATURES.md)