fix: decode local python output from bytes #7702

Open
bugkeep wants to merge 1 commit into AstrBotDevs:master from bugkeep:codex/fix-local-python-output-decoding

@bugkeep bugkeep commented Apr 21, 2026

Summary

  • run LocalPythonComponent subprocesses in bytes mode so stdout/stderr can use the existing fallback decoder
  • normalize Python tool newlines to preserve the previous text=True behavior
  • add a regression test for non-UTF-8 local Python stdout
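The approach in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `decode_with_fallback`, `normalize_newlines`, and `run_python` are hypothetical stand-ins for the project's helpers, and the encoding list is an assumption. The newline step mirrors what `subprocess.run(..., text=True)` does, since text mode translates `\r\n` and `\r` to `\n`.

```python
import subprocess
import sys


def decode_with_fallback(data: bytes, encodings=("utf-8", "gbk")) -> str:
    """Hypothetical stand-in for the project's fallback decoder."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: never raise, mark undecodable bytes instead.
    return data.decode("utf-8", errors="replace")


def normalize_newlines(text: str) -> str:
    """Mimic text-mode newline translation: CRLF and lone CR become LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def run_python(code: str) -> str:
    """Run code in a child interpreter in bytes mode and decode stdout."""
    # No text=True here, so result.stdout is bytes and decoding is ours to do.
    result = subprocess.run([sys.executable, "-c", code], capture_output=True)
    return normalize_newlines(decode_with_fallback(result.stdout))
```

With `text=True`, the decode happens inside `subprocess` using a single encoding, so GBK bytes on a UTF-8 locale would raise or be mangled before any fallback logic could run.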

Verification

  • confirmed the new regression test fails on current upstream before the fix
  • uv run pytest tests/unit/test_computer.py -q (41 passed)
  • uv run ruff format .
  • uv run ruff check .

Fixes #7695.

Summary by Sourcery

Decode LocalPythonComponent subprocess output from bytes while preserving previous text behavior and ensure non-UTF-8 stdout is handled via the existing fallback decoder.

Bug Fixes:

  • Fix LocalPythonComponent to run subprocesses in bytes mode and decode stdout/stderr using the existing fallback decoder, including non-UTF-8 output.

Tests:

  • Add a regression test verifying LocalPythonComponent correctly decodes non-UTF-8 stdout using the fallback decoder.

@dosubot (bot) added the labels size:S (this PR changes 10-29 lines, ignoring generated files) and area:core (the bug / feature is about astrbot's core, backend) on Apr 21, 2026

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high-level feedback:

  • _normalize_python_output unconditionally converts lone '\r' characters to '\n', which changes behavior for outputs that intentionally use carriage returns (e.g., progress bars); consider limiting normalization to '\r\n' or documenting this change in behavior.
  • Since both stdout and stderr now share the same decode-and-normalize path, consider reusing or generalizing _normalize_python_output/_decode_shell_output so shell and Python components share a single, consistent normalization function rather than splitting this between two helpers.
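To make the first point concrete, here is a small sketch comparing the two normalization strategies on progress-bar-style output. Both function names are hypothetical and neither is the PR's `_normalize_python_output`.

```python
def normalize_all(text: str) -> str:
    """Convert CRLF and lone CR to LF (what text=True effectively did)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def normalize_crlf_only(text: str) -> str:
    """Convert only CRLF to LF, leaving lone carriage returns in place."""
    return text.replace("\r\n", "\n")


# A progress bar that redraws one line with '\r', then ends with CRLF.
progress = "10%\r50%\r100%\r\ndone\r\n"

all_normalized = normalize_all(progress)         # every redraw becomes its own line
crlf_normalized = normalize_crlf_only(progress)  # redraws stay on a single line
```

Which behavior is preferable depends on the consumer: converting every `\r` matches the previous `text=True` output, while the CRLF-only variant preserves the carriage returns a terminal would use to overwrite a line.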


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the local Python execution logic to handle output decoding and normalization more robustly by disabling automatic text mode in subprocess calls and applying a custom normalization function. A new test case is added to verify the handling of non-UTF-8 (GBK) output. Review feedback recommends always capturing stderr to include diagnostic warnings even when the process succeeds and suggests mocking the operating system name in tests to ensure consistent behavior across different environments.

Comment on lines +164 to 168
                stderr = (
                    _normalize_python_output(_decode_shell_output(result.stderr))
                    if result.returncode != 0
                    else ""
                )

medium

Currently, stderr is only captured and processed if the return code is non-zero. This differs from the behavior in LocalShellComponent.exec, which always returns the decoded stderr. Capturing stderr even on success (return code 0) is beneficial for surfacing warnings or diagnostic information that doesn't necessarily trigger a process failure. Since _decode_shell_output handles empty input gracefully, you can simplify this logic to always process result.stderr.

                stderr = _normalize_python_output(_decode_shell_output(result.stderr))
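
The case this suggestion targets, a process that exits with code 0 but still writes to stderr, can be reproduced with a short sketch. `run_and_capture` is a hypothetical helper, and plain UTF-8 decoding stands in for the project's fallback decoder.

```python
import subprocess
import sys


def run_and_capture(code: str) -> tuple[int, str, str]:
    """Run code in a child interpreter and always decode both streams."""
    result = subprocess.run([sys.executable, "-c", code], capture_output=True)
    stdout = result.stdout.decode("utf-8", errors="replace")
    # Decoded unconditionally, so warnings from successful runs are kept.
    stderr = result.stderr.decode("utf-8", errors="replace")
    return result.returncode, stdout, stderr


code = "import sys; print('ok'); print('warning: slow path', file=sys.stderr)"
returncode, stdout, stderr = run_and_capture(code)
```

Here the return code is 0, yet `stderr` still holds the warning text, which the conditional form would have replaced with an empty string.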

Comment on lines +243 to +244
    with patch("astrbot.core.computer.booters.local.subprocess.run", fake_run):
        result = await python.exec("print('中文输出')")

medium

The new test test_exec_decodes_non_utf8_stdout_with_fallback uses GBK encoding for the mocked stdout. However, the _decode_bytes_with_fallback function only explicitly attempts GBK/CP936 decoding when os.name == 'nt'. On non-Windows environments (like many Linux-based CI systems), this test might fail if the default locale is UTF-8, as it will fall back to replacement characters. To ensure the test is platform-independent and correctly exercises the fallback logic, consider patching os.name to 'nt' within the test context.

Suggested change
-     with patch("astrbot.core.computer.booters.local.subprocess.run", fake_run):
-         result = await python.exec("print('中文输出')")
+     with patch("astrbot.core.computer.booters.local.subprocess.run", fake_run), \
+          patch("astrbot.core.computer.booters.local.os.name", "nt"):
+         result = await python.exec("print('中文输出')")
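
The platform dependence described above can be shown in isolation with a self-contained sketch. `decode_platform_aware` is a simplified, hypothetical decoder that only tries GBK when `os.name == "nt"`; the real `_decode_bytes_with_fallback` may differ, for example by also consulting the locale encoding.

```python
import os
from unittest.mock import patch


def decode_platform_aware(data: bytes) -> str:
    """Hypothetical decoder: UTF-8 first, GBK only on Windows."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        pass
    if os.name == "nt":
        try:
            return data.decode("gbk")
        except UnicodeDecodeError:
            pass
    # Non-Windows, or GBK also failed: fall back to replacement characters.
    return data.decode("utf-8", errors="replace")


def decode_as(platform_name: str, data: bytes) -> str:
    """Decode while os.name is temporarily patched to platform_name."""
    with patch("os.name", platform_name):
        return decode_platform_aware(data)


gbk_bytes = "中文输出".encode("gbk")
as_windows = decode_as("nt", gbk_bytes)    # GBK branch is taken
as_posix = decode_as("posix", gbk_bytes)   # falls through to replacement characters
```

Pinning `os.name` inside the test keeps the assertion independent of the CI host. The patch should stay scoped as narrowly as possible, since other standard-library code also reads `os.name`.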
