fix(logging): route all console logs to stderr to protect stdout JSON by juanmichelini · Pull Request #497 · OpenHands/benchmarks

juanmichelini · 2026-03-10T00:17:40Z

This PR routes all console logs to stderr instead of stdout in benchmarks/utils/console_logging.py, even when rich formatting is enabled.

Why

Scripts that consume the output of benchmark runners (like swebenchmultimodal) often expect pure JSON on stdout. If logs are printed to stdout, they can corrupt the JSON stream, causing tools like jq to fail.

The swebenchmultimodal benchmark, for example, crashes consistently with jq: parse error: Invalid numeric literal when OpenTelemetry logs an error to stdout.

By ensuring all logs go to stderr, we keep stdout clean for machine-readable output.

Fixes an issue where OpenTelemetry context detachment errors printed to stdout break JSON parsing in downstream scripts.

all-hands-bot

🟢 Good taste - Clean, simple fix that solves a real problem by following Unix conventions (stdout for data, stderr for diagnostics).

Verdict: ✅ Worth merging

Key Insight: The original code had a subtle bug where rich-mode logs went to stdout but plain logs went to stderr—this fix correctly routes everything to stderr, protecting structured output on stdout.

benchmarks/utils/console_logging.py

all-hands-bot

🟢 Good taste - This is the right fix. The original code had a subtle but real bug: rich mode dumped diagnostics to stdout while plain mode correctly used stderr. Your change fixes it properly by following the Unix rule: stdout is for data, stderr is for diagnostics. No special cases, no complexity, just correct I/O redirection. Ship it.

Root cause of swebenchmultimodal jq parse error (exit code 5): _ThreadRoutedConsoleHandler wrote formatted log messages to sys.__stdout__, which got captured by shell $() substitution. OpenTelemetry context-detach errors appearing after JSON output corrupted the stream that run_swebenchmultimodal.sh pipes to jq. Changes: - Route ALL console handler output to sys.__stderr__ (matches main's approach from PR #497) to protect stdout for JSON output parsing - Suppress opentelemetry.context logger (CRITICAL level) to prevent harmless context-detach errors from leaking to any output stream - Extract setup_routed_logging() to set up main-thread logging defaults explicitly, eliminating fallback paths in handlers - Change PendingInstance.start_time to Optional[float] (None while queued) instead of float("inf") sentinel value - Replace _ThreadLocalWriter.__getattr__ with explicit properties (encoding, closed, isatty, fileno) - Add integration tests: test_evaluation_run_end_to_end (4 instances with 1 failure through full Evaluation.run()) and test_evaluation_timeout_cancels_instance - Fix test_workspace_cleanup.py to use _process_one_sync (renamed from _process_one_mp in asyncio refactor) - Add pytest-asyncio to dev dependencies with asyncio_mode = "auto" - Update SDK submodule to latest main - Rebase onto latest main Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(logging): route all console logs to stderr to protect stdout JSON

0e1ce03

all-hands-bot reviewed Mar 10, 2026

View reviewed changes

benchmarks/utils/console_logging.py Show resolved Hide resolved

benchmarks/utils/console_logging.py Show resolved Hide resolved

docs: improve comment clarity on why stderr is used

8da3449

juanmichelini requested a review from all-hands-bot March 10, 2026 00:23

all-hands-bot approved these changes Mar 10, 2026

View reviewed changes

juanmichelini merged commit 51bbbc7 into main Mar 10, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(logging): route all console logs to stderr to protect stdout JSON#497

fix(logging): route all console logs to stderr to protect stdout JSON#497
juanmichelini merged 2 commits intomainfrom
fix-logging-stderr

juanmichelini commented Mar 10, 2026

Uh oh!

all-hands-bot left a comment

Uh oh!

Uh oh!

Uh oh!

all-hands-bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

juanmichelini commented Mar 10, 2026

Why

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants