fix(modal): surface repo image build failure details by hreiten · Pull Request #652 · ColeMurray/background-agents

hreiten · 2026-05-18T13:30:38Z

Summary

Preserve structured setup/supervisor failure details when a repo image build exits before image_build.complete
Redact the clone token and repo environment variable values before storing build failure messages
Fix supervisor.fatal logging so it does not collide with the reserved LogRecord.message field

Why

Repo image builds currently fall back to Build sandbox exited without completing (exit_code=...) even when the build sandbox already emitted a more useful structured failure such as setup.failed or supervisor.error. That makes failed image builds hard to diagnose from the settings UI.

There is also a separate fatal-error logging path that passes message= through logging.extra, which raises KeyError: "Attempt to overwrite 'message' in LogRecord" before the fatal error can be reported.

Validation

cd packages/modal-infra && ./.venv/bin/ruff check . && ./.venv/bin/ruff format --check . && ./.venv/bin/pytest tests/ -q
cd packages/sandbox-runtime && ./.venv/bin/ruff check . && ./.venv/bin/ruff format --check . && ./.venv/bin/pytest tests/ -q
npm run lint
npm run typecheck

npm test currently fails in unrelated control-plane model tests for GPT 5.5 default reasoning expectations; this PR does not touch TypeScript model metadata or tests.

Summary by CodeRabbit

Bug Fixes
- Improved image build failure reporting: captures concise error messages from build logs, trims long messages, and redacts sensitive values; sandbox exits now report extracted errors instead of generic messages.
- Fixed supervisor fatal-error reporting to use the correct structured field for the error message.
Tests
- Updated and added tests to validate failure reporting, truncation, and redaction behavior.

coderabbitai · 2026-05-18T13:30:46Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 78a41982-c4ff-47ab-92cf-00b2a1809423

📥 Commits

Reviewing files that changed from the base of the PR and between d3a4ce4 and be9088d.

📒 Files selected for processing (4)

packages/modal-infra/src/scheduler/image_builder.py
packages/modal-infra/tests/test_image_builder_v2.py
packages/sandbox-runtime/src/sandbox_runtime/entrypoint.py
packages/sandbox-runtime/tests/test_supervisor_monitor.py

🚧 Files skipped from review as they are similar to previous changes (4)

packages/sandbox-runtime/tests/test_supervisor_monitor.py
packages/sandbox-runtime/src/sandbox_runtime/entrypoint.py
packages/modal-infra/tests/test_image_builder_v2.py
packages/modal-infra/src/scheduler/image_builder.py

📝 Walkthrough

Walkthrough

Streams structured build logs, extracts a concise redacted/truncated failure message when present, returns (sha, complete, error) from the log streamer, and raises BuildError with the extracted message if the sandbox exits without completion; aligns supervisor fatal logging field and updates tests.

Changes

Image Build Error Reporting Enhancement

Layer / File(s)	Summary
Failure message extraction utilities `packages/modal-infra/src/scheduler/image_builder.py`	Adds `Iterable` import, constants `BUILD_FAILURE_MESSAGE_MAX_CHARS`, `BUILD_FAILURE_REDACT_MIN_CHARS`, `_BUILD_LOG_EVENTS`, `_SETUP_FAILURE_EVENTS`, and helpers `_format_build_failure_event` and `_trim_build_failure_message` to extract and redact concise failure messages from structured JSON log events.
Stream build logs refactor `packages/modal-infra/src/scheduler/image_builder.py`	Rewrites `_stream_build_logs(sandbox, redact_values)` to parse structured JSON stdout, capture `head_sha` from `git.sync_complete`, treat `image_build.complete` as completion, select setup/supervisor payloads for concise error text, and return `(head_sha, build_complete, error_message)`.
Build pipeline integration `packages/modal-infra/src/scheduler/image_builder.py`	`build_repo_image` now collects `redact_values` (clone token and injected env values), passes them into `_stream_build_logs`, and raises `BuildError` with the extracted `build_error` if the sandbox exits without completion.
Test coverage for stream logs contract `packages/modal-infra/tests/test_image_builder_v2.py`	Tests updated to await the new 3-tuple and assert `error is None` on success; `_build_handle` expanded to accept `stdout_lines` and `returncode`; new test `test_reports_build_log_failure_when_stream_never_completes` verifies redaction/truncation and failure callback payload.
Fatal error reporting field alignment `packages/sandbox-runtime/src/sandbox_runtime/entrypoint.py`, `packages/sandbox-runtime/tests/test_supervisor_monitor.py`	`SandboxSupervisor._report_fatal_error` now emits the fatal message under `error_message` (not `message`); new test `TestFatalErrorReporting` ensures reporting does not collide with reserved fields.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

ColeMurray/background-agents#636: Modifies build_repo_image and sandbox termination in image build failure handling; closely related to this PR's error extraction and reporting changes.
ColeMurray/background-agents#503: Related changes to how error_message is extracted/redacted from build-log events and surfaced to UI.

Suggested reviewers

open-inspect

Poem

🐇 I stitched the logs with careful art,

hid the secrets, trimmed the part,
masked the tokens, kept the clue,
now builds can tell us what they do —
hop in, reviewers, take a cart!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main change: surfacing detailed error information from failed repo image builds instead of using generic messages.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/modal-infra/src/scheduler/image_builder.py`:
- Around line 53-63: The failure events list is missing "supervisor.fatal", so
update the constants: add "supervisor.fatal" to the _BUILD_LOG_EVENTS tuple and
include it in the _BUILD_FAILURE_EVENTS set (which is _SETUP_FAILURE_EVENTS |
{...}) so supervisor.fatal is treated as a build/fatal error; this will allow
_stream_build_logs and the error-extraction logic that inspects supervisor
events to pick up the structured fatal messages instead of falling back to
generic failures.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 018208cd-7e99-4263-9a04-6fc34f4838bf

📥 Commits

Reviewing files that changed from the base of the PR and between a8a14c4 and 3f5574b.

📒 Files selected for processing (4)

packages/modal-infra/src/scheduler/image_builder.py
packages/modal-infra/tests/test_image_builder_v2.py
packages/sandbox-runtime/src/sandbox_runtime/entrypoint.py
packages/sandbox-runtime/tests/test_supervisor_monitor.py

hreiten changed the title ~~[codex] Surface repo image build failure details~~ fix(modal): surface repo image build failure details May 18, 2026

hreiten marked this pull request as ready for review May 18, 2026 13:32

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

Comment thread packages/modal-infra/src/scheduler/image_builder.py

hreiten force-pushed the fix/repo-image-build-error-details branch from 3f5574b to d3a4ce4 Compare May 18, 2026 17:39

fix: surface repo image build failures

be9088d

hreiten force-pushed the fix/repo-image-build-error-details branch from d3a4ce4 to be9088d Compare May 18, 2026 17:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(modal): surface repo image build failure details#652

fix(modal): surface repo image build failure details#652
hreiten wants to merge 1 commit into
ColeMurray:mainfrom
watchdog-no:fix/repo-image-build-error-details

hreiten commented May 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 18, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hreiten commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Validation

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hreiten commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading