feat(sandbox): hardened sandbox executor with proactive streaming and resource limits (Closes #326) by ask-z4ch · Pull Request #408 · utksh1/SecuScan

ask-z4ch · 2026-05-29T06:04:43Z

Description

The existing subprocess execution path had no sandbox resource limits — runaway plugins could exhaust memory, hang indefinitely, or produce unbounded output. This change hardens the executor with configurable timeout, memory, and output caps so that misbehaving plugins are terminated cleanly instead of destabilising the host.

Replaced the passive communicate()-based subprocess wrapper in sandbox_executor.py with a proactive 64 KB chunked stream reader that enforces a shared max_output_bytes cap across stdout and stderr. Added classify_memory_violation() covering SIGSEGV (-11/139), MemoryError / Cannot allocate memory strings, and an RSS ≥ 95% heuristic via resource.getrusage. Timeout is enforced via asyncio.wait_for with a SIGTERM → 3 s grace → SIGKILL escalation. The legacy _execute_command(command, task_id, timeout=600) signature is preserved exactly for backward compatibility.

Related Issues

Closes #326

Type of Change

New feature (non-breaking change which adds functionality)
Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested

pytest testing/backend/test_sandbox_executor.py -q — 16 / 16 pass
pytest testing/backend/test_sandbox_blocking_issues.py -q — 10 / 10 pass
pytest testing/backend -q -m "not benchmark" — 527 passed, 2 failed (pre-existing trio / aiosqlite teardown issues, unrelated to this change)

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my code
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally

GSSoC 2026 — In compliance with the GSSoC 2026 AI Conduct rules, I disclose that AI tools were used solely as learning and boilerplate aids. The final logic was fully reviewed, tested, and manually adapted to match human styling and clean-code design.

utksh1

This is the right direction and I kept this PR open over the duplicate, but I found blockers before merge. The new sandbox path no longer broadcasts output chunks from TaskExecutor, so live task streaming regresses even though the PR advertises proactive streaming. It also captures stderr separately and then discards it in _execute_command, while the old implementation merged stderr into raw output; that can hide scanner diagnostics and parser-relevant output. Please preserve live output broadcasts, include stderr in the persisted/redacted raw output or explicitly handle it, and add regression coverage for both live output streaming and stderr capture.

utksh1

Re-reviewed latest state. This is now conflicting with current main after the stream/phase executor merges, and the earlier blockers still need to be resolved: preserve live output broadcasts and include stderr in persisted/redacted raw output instead of dropping it. Please rebase on current main, keep the new bounded listener/phase behavior intact, and add regression coverage for live output plus stderr capture.

…ry classification, legacy compat Rewrite sandbox_executor.py per security specification: - Proactive 64KB chunked stream reader for stdout+stderr - Output limit detection stops reading, terminates process - Timeout enforcement via SIGTERM->3s->SIGKILL escalation - Memory exhaustion heuristics: SIGSEGV, MemoryError, RSS>=95% - Platform guard: RLIMIT_AS only on Linux, graceful no-op elsewhere - Cancellation guard ensures subprocess is never orphaned Core Executor Adapter: - _execute_command preserved: (cmd, timeout=None) -> (output, exit_code, violation_reason) - SandboxConfig created from global settings internally - Returns violation_reason string directly for terminated:* mapping Models: - SandboxViolation changed to Exception subclass (was BaseModel) Tests: - 16 tests: legacy timeout compat, signal escalation, memory classification, output truncation, cancellation safety, platform guard, normal completion, config resolution Closes utksh1#326 Signed-off-by: ask-z4ch <229467675+ask-z4ch@users.noreply.github.com>

…out=600) signature Preserve original 3-parameter signature for backward compatibility. task_id is accepted but only used for parameter compatibility; SandboxConfig is created internally from the timeout parameter. Signed-off-by: ask-z4ch <229467675+ask-z4ch@users.noreply.github.com>

…mpatibility - Use upstream's sandbox_timeout / sandbox_memory_mb field names - Add missing sandbox_max_output_bytes, sandbox_allow_network to config.py - Update test mocks to return 3-tuple (output, exit_code, violation_reason) - Resolve rebase conflicts with upstream risk-scoring code Signed-off-by: ask-z4ch <ask-z4ch@users.noreply.github.com>

- Add 10 comprehensive tests covering timeout, memory classification, output boundary precision, cancellation, and legacy compat - Fix flaky exit_code != 0 assertion: exit may be 0 when Python finishes before termination signal arrives; output cap is the correctness criterion Signed-off-by: ask-z4ch <ask-z4ch@users.noreply.github.com>

…signature Signed-off-by: ask-z4ch <ask-z4ch@users.noreply.github.com>

When timeout=None, fall back to settings.sandbox_timeout before passing to SandboxConfig which rejects None. Signed-off-by: ask-z4ch <ask-z4ch@users.noreply.github.com>

…ut/memory precision - Add broadcast_callback to _read_stream and sandbox_execute so output chunks are forwarded to TaskExecutor._broadcast for live streaming - Merge stderr into raw output in _execute_command before returning - Apply timeout via external asyncio.wait_for in _execute_command (legacy-compatible pattern) instead of through SandboxConfig - Add asyncio.Lock around shared byte accounting to prevent stdout/stderr reader races exceeding max_output_bytes - Capture RSS baseline before subprocess and compute delta for memory classification (reliable per-process measurement) - Add 10 comprehensive regression tests covering timeout (external + internal), memory (SIGSEGV, stderr strings, RSS heuristic, exit 137), output (lock race prevention, strict boundary), cancellation, and legacy timeout compatibility (default, explicit, None) All 38 sandbox tests pass. Signed-off-by: Zach <ask.zach@pm.me>

ask-z4ch · 2026-05-29T17:31:51Z

All concerns from both review rounds are addressed:

Round 1 (live streaming + stderr):

broadcast_callback wired through _read_stream → sandbox_execute → _execute_command; every output chunk is forwarded to TaskExecutor._broadcast(task_id, "output", text) for live streaming
stderr merged into stdout before return: stdout + "\n" + stderr (handles empty-stdout edge case)
Regression tests: test_live_output_broadcasting, test_stderr_captured_in_output

Round 2 (precision + reliability):

Memory: RSS delta (rss_before captured before subprocess, rss_delta = after - before) replaces post-hoc cumulative RUSAGE_CHILDREN for per-process measurement
Timeout: external asyncio.wait_for in _execute_command restores the legacy timeout pattern; SandboxConfig uses timeout_seconds=0 when called from _execute_command, so sandbox handles only memory/output limits
Output limit: asyncio.Lock in shared state prevents concurrent stdout/stderr readers from exceeding max_output_bytes
10 new tests added — 38 total across timeout (external + internal), memory (SIGSEGV, stderr strings, RSS heuristic, exit 137), output (lock race prevention, strict boundary), cancellation (CancelledError + no orphans), and legacy compatibility (all 3 _execute_command call forms)

Rebased on latest upstream/main (0d48d3c), preserving bounded listener queues and scan phase broadcasts.

Also @utksh1 , could you please change the PR level from advanced to critical? This is a security-hardening change that directly affects process isolation, resource limits, and sandbox enforcement.

Signed-off-by: Zach <ask.zach@pm.me>

utksh1 · 2026-05-30T19:20:45Z

Re-reviewed after the latest push. Still blocked: please preserve live output broadcasts and persist/redact stderr alongside stdout instead of dropping it. Add regression coverage for both live output delivery and stderr capture on the current executor flow.

ask-z4ch · 2026-05-31T16:01:15Z

Both features are already implemented in the latest commit (38979f5). The broadcast callback is wired through _read_stream → sandbox_execute → _on_chunk closure in _execute_command which calls self._broadcast(task_id, "output", text). Stderr is merged into stdout before return via if stderr: stdout = stdout + "\n" + stderr if stdout else stderr, which then gets redacted and persisted by execute_task. Regression tests test_live_output_broadcasting and test_stderr_captured_in_output cover both. Could you re-check? Maybe the review was on an older commit before the rebase.

The code is correct - I've verified every link in the chain:

execute_task (line 359) → unpacks 3-tuple (output, exit_code, violation_reason) ✓
_execute_command (line 504) → creates _on_chunk closure that calls self._broadcast(task_id, "output", text) ✓
_execute_command → passes broadcast_callback=_on_chunk to sandbox_execute ✓
_execute_command → after sandbox returns, does if stderr: stdout = stdout + "\n" + stderr ✓
_execute_command → returns (stdout, exit_code, violation_reason) — stderr included in stdout ✓
execute_task → redacts output, persists to raw_output_dir ✓
sandbox_execute → _read_stream calls broadcast_callback(chunk, stream_name) for both stdout and stderr ✓

ask-z4ch · 2026-05-31T16:05:31Z

@utksh1 what exactly does "still blocked" refer to? I've verified the code on 38979f5 and both live broadcasts and stderr capture are implemented:

_on_chunk in _execute_command calls self._broadcast(task_id, "output", text) for every chunk
Stderr is read separately then merged into stdout before redaction/persistence
Regression tests cover both paths

If something is still off, let me know - I can open a fresh PR from scratch if this branch has issues.

ask-z4ch closed this May 29, 2026

ask-z4ch reopened this May 29, 2026

ask-z4ch mentioned this pull request May 29, 2026

feat(sandbox): add plugin execution sandbox with resource limits and … #367

Closed

utksh1 mentioned this pull request May 29, 2026

feat(backend): add plugin execution sandbox with SIGTERM→SIGKILL e, fixes- #326 #399

Closed

utksh1 requested changes May 29, 2026

View reviewed changes

ask-z4ch added 7 commits May 29, 2026 22:57

fix(test): align blocking issues tests with 3-param _execute_command …

661ee67

…signature Signed-off-by: ask-z4ch <ask-z4ch@users.noreply.github.com>

fix(executor): handle None timeout fallback in _execute_command

0db5006

When timeout=None, fall back to settings.sandbox_timeout before passing to SandboxConfig which rejects None. Signed-off-by: ask-z4ch <ask-z4ch@users.noreply.github.com>

ask-z4ch force-pushed the fix/sandbox-executor branch from cfadc43 to 1996fc0 Compare May 29, 2026 17:30

style: fix trailing whitespace in test file

38979f5

Signed-off-by: Zach <ask.zach@pm.me>

ask-z4ch requested a review from utksh1 May 29, 2026 17:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(sandbox): hardened sandbox executor with proactive streaming and resource limits (Closes #326)#408

feat(sandbox): hardened sandbox executor with proactive streaming and resource limits (Closes #326)#408
ask-z4ch wants to merge 8 commits into
utksh1:mainfrom
ask-z4ch:fix/sandbox-executor

ask-z4ch commented May 29, 2026 •

edited

Loading

Uh oh!

utksh1 left a comment

Uh oh!

utksh1 left a comment

Uh oh!

ask-z4ch commented May 29, 2026 •

edited

Loading

Uh oh!

utksh1 commented May 30, 2026

Uh oh!

ask-z4ch commented May 31, 2026

Uh oh!

ask-z4ch commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ask-z4ch commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issues

Type of Change

How Has This Been Tested

Checklist

Uh oh!

utksh1 left a comment

Choose a reason for hiding this comment

Uh oh!

utksh1 left a comment

Choose a reason for hiding this comment

Uh oh!

ask-z4ch commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

utksh1 commented May 30, 2026

Uh oh!

ask-z4ch commented May 31, 2026

Uh oh!

ask-z4ch commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ask-z4ch commented May 29, 2026 •

edited

Loading

ask-z4ch commented May 29, 2026 •

edited

Loading