fix(controller): cap JSON parser depth/elements and tighten request-boundary state #25
andypost wants to merge 1 commit
Conversation
Code Review
This pull request introduces several security and stability improvements, including hard caps on JSON parser depth and element counts to prevent stack or heap exhaustion. It also addresses a bug in PHP where stale exception states were inherited by worker processes and adds NULL-deref guards for Ruby stream IO handles accessed outside the request lifetime. A review comment highlights that the file-static variable used for tracking JSON recursion depth is not thread-safe, which could be problematic if the code is executed in a multi-threaded context like the router process.
```c
/*
 * file-static suffices.  Reset on every entry to nxt_conf_json_parse()
 * so a stale value from an aborted prior call cannot leak.
 */
static nxt_uint_t  nxt_conf_json_parse_depth;
```
The use of a file-static variable for recursion depth is not thread-safe. While the Unit controller process is currently single-threaded, this file (src/nxt_conf.c) is part of the core library and is also linked into the router process, which supports multi-threading (e.g., via the listen_threads setting). If nxt_conf_json_parse is ever called from a multi-threaded context in the future, this will lead to race conditions and incorrect depth tracking. Consider using thread-local storage (e.g., __thread or _Thread_local) or passing the depth as an argument through the recursive calls.
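The reviewer's thread-local suggestion can be sketched as below. This is a hypothetical illustration, not the module's code: `parse_value`/`json_parse` are toy stand-ins for the real `nxt_conf.c` functions, and only the C11 `_Thread_local` storage class and the PR's cap of 100 are taken from the discussion.

```c
#include <assert.h>   /* used by the sanity checks below */
#include <string.h>

/* Cap matches the PR; one counter per thread instead of per file. */
#define MAX_JSON_DEPTH  100

static _Thread_local unsigned  json_parse_depth;

static int
parse_value(const char *p)                /* toy recursion over "[[[..." */
{
    int  ret;

    if (*p != '[') {
        return 0;                         /* scalar: no recursion */
    }

    if (++json_parse_depth > MAX_JSON_DEPTH) {
        json_parse_depth--;
        return -1;                        /* clean parse error, no overflow */
    }

    ret = parse_value(p + 1);

    json_parse_depth--;
    return ret;
}

static int
json_parse(const char *input)
{
    json_parse_depth = 0;                 /* reset on every entry, as in the PR */
    return parse_value(input);
}
```

With `_Thread_local`, two router threads parsing concurrently would each see their own depth counter, so the reset-on-entry invariant survives a multi-threaded caller.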
Force-pushed b4fac9c to 7cef9c8.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7cef9c8df0
```c
if (req == NULL) {
    return Qnil;
}
```
Prevent stale rack.input from reading the next request
When a Ruby app keeps a rack.input object and reads it while the same worker context is handling a later request, this guard does not fire, because rctx->req has been repointed to the new request in nxt_ruby_request_handler_gvl. Since the IO object is reused from rctx->io_input in the shared env, a stale handle can still read another client's request body instead of returning nil. The boundary check needs to bind the IO object to the request it was handed out for, not just check whether the context is currently idle.
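The ownership check this comment asks for can be sketched roughly as follows. All names here (`run_ctx_t`, `stream_io_t`, `io_gets`) are illustrative stand-ins, not the module's actual types; only the `rctx->req` shape is taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; } request_t;

typedef struct { request_t *req; } run_ctx_t;          /* like rctx */

typedef struct {
    run_ctx_t  *rctx;
    request_t  *owner;       /* request this handle was handed out for */
} stream_io_t;

static const char *
io_gets(stream_io_t *io)
{
    if (io->rctx->req == NULL) {          /* the PR's idle-context guard */
        return NULL;                      /* Ruby side returns Qnil */
    }

    if (io->rctx->req != io->owner) {     /* repointed to a later request */
        return NULL;                      /* stale handle must not read it */
    }

    return "chunk";                       /* stand-in for a real body read */
}
```

The second comparison is the part the NULL-only guard misses: a stale handle fails it even while the context is busy serving someone else's request.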
```c
if (rctx->req == NULL) {
    nxt_unit_alert(NULL, "Ruby: %s", RSTRING_PTR(val));
    return RSTRING_LEN(val);
}
```
Avoid attributing stale rack.errors writes to later requests
If an app keeps rack.errors and writes to it while this same Ruby worker context is already serving a later request, this NULL-only check also does not fire, because rctx->req now points at that later request. The write then falls through to nxt_unit_req_error(rctx->req, ...), so background or late logs from a completed request are emitted under another client's request stream. The IO object needs request ownership/lifetime tracking rather than only the idle-context guard.
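A write-side sketch of the same ownership idea, under the same caveats: the types and the `delivered` counter are illustrative only, and routing late writes somewhere (a global log, or dropping them) is a design choice the comment leaves open.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; } request_t;

typedef struct { request_t *req; } run_ctx_t;

typedef struct {
    run_ctx_t  *rctx;
    request_t  *owner;       /* request this handle was created for */
} error_io_t;

static long  delivered;      /* bytes attributed to the current request */

static long
io_write(error_io_t *io, const char *buf, long len)
{
    (void) buf;

    if (io->rctx->req == NULL || io->rctx->req != io->owner) {
        return len;          /* drop silently, keep IO#write's contract */
    }

    delivered += len;        /* stand-in for nxt_unit_req_error(req, ...) */
    return len;
}
```

Returning `len` even on the drop path preserves Ruby's expectation that `IO#write` reports the byte count, while the ownership test keeps a completed request's logs out of another client's stream.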
CI fix: the clang-ast failure on that run was an unrelated infrastructure error.
Force-pushed 7cef9c8 to ce68dbe.
fix(controller): cap JSON parser depth/elements and tighten request-boundary state

Audit-driven controller robustness + request-boundary state pass (PR-H from security-audit.md / #10). Five findings; no protocol or config-surface changes.

* V4 [High] Unbounded JSON recursion (nxt_conf.c)

  A POST /config with deeply-nested JSON like "[[[[...]]]]" recursed through nxt_conf_json_parse_value → parse_object / parse_array with no depth cap, blowing the controller's stack. Add a file-static depth counter checked at the '{' / '[' arms of parse_value (the only recursion sites); cap at 100, well above any legitimate Unit config (nesting > 6 levels does not occur in practice). Reset at every entry to nxt_conf_json_parse() so a stale value from an aborted prior call cannot leak.

* V4 [High] Unbounded JSON array/object element count (nxt_conf.c)

  parse_object and parse_array looped with `count++` and no cap; a flat [1,2,...,1e6] passes a billion-byte allocation request to nxt_mp_get() at the end. Cap at 100k elements (well above any real config — the largest production configs we've seen have a few thousand routes). Reject with a clean parse error.

* V4 [Medium] Validator trust-model annotation (nxt_conf_validation.c)

  Document at nxt_conf_vldt_app_isolation_members that allowing arbitrary "executable" and isolation = false is intentional: the privilege boundary lives at the control-socket layer (SO_PEERCRED, landed in #14), not in this schema validator. Allow-listing executables here would be a deployment policy decision, not a config-schema concern.

* V7 [Medium] PHP TrueAsync EG exception scrubbing (nxt_php_sapi.c)

  The TrueAsync entrypoint intentionally skips php_request_shutdown() so the callback zval persists across the prototype → worker fork. But if the entrypoint script raised an exception that wasn't caught before HttpServer->onRequest() stored the callback, every forked worker would inherit the exception on its first request. Add a zend_clear_exception() before the function returns. Other EG globals (output buffers, error_reporting, symbol table) are reset implicitly by the worker's per-request init; the exception state is the one that early-exits php_execute_script() on the next request.

* V8 [Medium] Ruby per-request IO NULL-deref hardening (nxt_ruby_stream_io.c)

  rctx->req is cleared after each request handler runs (src/ruby/nxt_ruby.c:657). Apps that capture rack.input or rack.errors across the request boundary — background threads, cached IO handles — would NULL-deref rctx->req when they called gets / read / puts / write. Add a NULL guard at each public entry point; gets/read return Qnil, write silently drops. The audit framed this as "buffers from a prior request remain accessible"; the actual hazard is NULL-deref, but the fix covers both shapes (all reads go through rctx->req, no buffers are held inside the IO object).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
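The element-count cap described above can be illustrated with a toy flat-array counter. The real change lives in parse_object/parse_array in nxt_conf.c; only the 100k limit and the reject-before-allocating shape are taken from the commit message, everything else here is a stand-in.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_JSON_ELEMENTS  100000

/* Returns the element count of a flat array like "[1,2,3]",
 * or -1 once the cap is exceeded, mirroring a clean parse error
 * instead of letting `count` feed a huge allocation request. */
static long
count_array_elements(const char *p)
{
    long  count;

    if (*p++ != '[') {
        return -1;
    }

    if (*p == ']') {
        return 0;                          /* empty array */
    }

    for (count = 1; *p != '\0' && *p != ']'; p++) {
        if (*p == ',' && ++count > MAX_JSON_ELEMENTS) {
            return -1;                     /* reject before allocating */
        }
    }

    return count;
}
```

Checking inside the counting loop, rather than after it, is the point: the oversized input is rejected before its count ever reaches an allocator.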
Force-pushed ce68dbe to 534b274.
Codex P1 addressed.

Root cause. The old NULL-guard only caught the case where a captured handle was used after a request fully ended (rctx->req == NULL); it did not fire once the same context had been repointed at a later request.

Fix. The IO handle is now per-instance bound to the request it was created for, and every read/write checks that binding rather than only whether the context is idle.

ABA-safe. Comparing stored request identity, not just liveness, means a context recycled for a new request cannot satisfy a stale handle's check.

Existing test coverage preserved; local sanity checks pass.
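The PR's exact binding mechanism is elided in this comment, so the sketch below shows just one plausible ABA-safe shape: pair the request pointer with a generation counter bumped on every (re)assignment, so a recycled allocation at the same address still fails the comparison. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    void      *req;          /* current request; memory may be recycled */
    uint64_t   gen;          /* bumped whenever req is repointed */
} run_ctx_t;

typedef struct {
    run_ctx_t  *rctx;
    void       *owner_req;   /* identity captured at handle creation */
    uint64_t    owner_gen;
} io_handle_t;

static void
ctx_attach(run_ctx_t *rctx, void *req)
{
    rctx->req = req;
    rctx->gen++;             /* new identity, even for a reused pointer */
}

static io_handle_t
io_handle_for(run_ctx_t *rctx)
{
    io_handle_t  h = { rctx, rctx->req, rctx->gen };
    return h;
}

static int
io_handle_live(const io_handle_t *h)
{
    return h->rctx->req == h->owner_req
           && h->rctx->gen == h->owner_gen;
}
```

The generation comparison is what defeats ABA: a pointer match alone cannot distinguish "my request" from "a new request that reused my request's memory".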
Summary
Audit-driven controller robustness + request-boundary state pass (PR-H from `security-audit.md` / PR #10). Five findings; no protocol or config-surface changes. Closes the last actionable audit slot — after this, the only remaining items are the two excluded by the maintainer's DoS policy.
Findings addressed
Notes on departures from the audit
Conflicts with parallel PRs
Files changed
5 files, +120 / -3.
Test plan
Upstream
Same fixes apply to `freeunitorg/freeunit`; will forward after merge here.
Generated by Claude Code