Subprocess(fix[encoding]): Enforce UTF-8 decoding for tmux output #679

Merged
tony merged 6 commits into master from utf-8-encoding
May 23, 2026

Conversation

@tony tony commented May 23, 2026

Summary

  • Add a regression test reproducing #678 (Enforce UTF-8 encoding): tmux_cmd calls subprocess.Popen(text=True) without encoding="utf-8", which corrupts FORMAT_SEPARATOR (U+241E) on non-UTF-8 locales
  • Fix: add encoding="utf-8" to the subprocess.Popen call in tmux_cmd.__init__
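
A minimal sketch of the change, not the actual tmux_cmd code: a child Python process stands in for tmux and writes raw UTF-8 bytes, and the parent decodes them with an explicit encoding. The `CHILD` command and the `run` helper are hypothetical names used only for illustration, and passing "latin-1" merely simulates what a latin-1 locale fallback would do.

```python
import subprocess
import sys

FORMAT_SEPARATOR = "\u241e"  # the separator named in this PR

# Stand-in for tmux (assumption): a child that writes raw UTF-8 bytes to stdout.
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write('a\\u241eb'.encode('utf-8'))",
]


def run(encoding: str) -> str:
    """Run the child and decode its stdout with an explicit encoding."""
    proc = subprocess.Popen(
        CHILD,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        encoding=encoding,
    )
    stdout, _ = proc.communicate()
    return stdout


fixed = run("utf-8")     # explicit UTF-8, as in the fix
broken = run("latin-1")  # simulates decoding under a latin-1 locale

print(fixed.split(FORMAT_SEPARATOR))   # separator survives, two fields
print(broken.split(FORMAT_SEPARATOR))  # separator lost, one unsplit field
```

Pinning the encoding at the Popen call keeps decoding independent of the caller's environment, which is the property the regression test checks.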

Commit sequence (hermetic proof)

| Commit | What | Test result |
| --- | --- | --- |
| 922a326a | xfail regression test: monkeypatches locale.getencoding to return "latin-1", runs list-sessions through tmux_cmd, asserts FORMAT_SEPARATOR survives | XFAIL: separator corrupted to â\x90\x9e, assertion fails at the right place |
| e05c8ca0 | Fix: encoding="utf-8" in subprocess.Popen, zero test changes | XPASS(strict) → CI fails, proving the fix resolves the reproduction |
| c8279acf | Remove xfail marker, zero code changes | 1257 passed |

Root cause

Commit 1a5e69a2 (2025-02-02) removed console_to_str() (which had an explicit UTF-8 fallback) and switched to bare text=True. Without encoding="utf-8", CPython's subprocess._text_encoding() falls back to locale.getencoding(). On non-UTF-8 locales, the 3-byte UTF-8 sequence for ␞ (U+241E) is decoded as three latin-1 code points (â\x90\x9e), making .split(FORMAT_SEPARATOR) produce a single unsplit element. This cascades to parse_output() where zip(..., strict=True) raises ValueError, and list accessors like server.sessions return empty results.

tmux has mandated UTF-8 since 2015 — hardcoding encoding="utf-8" matches tmux's output contract.
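
To check which encoding a bare text=True would pick on a given machine, the fallback can be approximated as below. This is a sketch under assumptions, not CPython's actual implementation: `fallback_text_encoding` and `is_utf8` are hypothetical helpers, and the Python 3.10 branch assumes the text layer uses locale.getpreferredencoding(False).

```python
import codecs
import locale
import sys


def fallback_text_encoding() -> str:
    """Approximate the encoding used when text=True has no explicit encoding."""
    if sys.flags.utf8_mode:
        # Python UTF-8 mode overrides the locale encoding.
        return "utf-8"
    getencoding = getattr(locale, "getencoding", None)  # Python 3.11+ only
    if getencoding is not None:
        return getencoding()
    # Assumed behaviour on Python 3.10.
    return locale.getpreferredencoding(False)


def is_utf8(name: str) -> bool:
    """Normalize a codec name and compare it with UTF-8."""
    return codecs.lookup(name).name == "utf-8"


encoding = fallback_text_encoding()
print(encoding)
```

If the printed value is not a UTF-8 alias, output containing U+241E is at risk unless the encoding is passed explicitly.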

Fixes: #678
Related: tmux-python/tmuxp#1044

Test plan

  • uv run ruff check . --fix --show-fixes — passes
  • uv run ruff format . — no changes
  • uv run mypy — no issues
  • uv run pytest --reruns 0 -vvv — 1257 passed, 2 skipped
  • just build-docs — builds successfully

codecov Bot commented May 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 51.29%. Comparing base (5089f31) to head (fc5e6c4).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #679   +/-   ##
=======================================
  Coverage   51.29%   51.29%           
=======================================
  Files          25       25           
  Lines        3488     3488           
  Branches      686      686           
=======================================
  Hits         1789     1789           
  Misses       1404     1404           
  Partials      295      295           

@tony tony force-pushed the utf-8-encoding branch 2 times, most recently from d510846 to b9a0223 on May 23, 2026 14:40
@tony tony left a comment

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

@tony tony force-pushed the utf-8-encoding branch 2 times, most recently from 17fb9b1 to c85d4da on May 23, 2026 15:39
tony added 5 commits May 23, 2026 10:43
…tion
why: The previous xfail depended on locale.getencoding(), which only exists on Python 3.11+ and skipped libtmux's supported Python 3.10 runtime.
what:
- Recreate the strict xfail regression test with a temporary LC_CTYPE=C boundary
- Assert FORMAT_SEPARATOR survives list-sessions output before parse_output consumes it
- Keep the test skipped only when Python UTF-8 mode masks locale decoding

why: tmux format output is UTF-8, but text=True without an explicit encoding lets CPython decode with the process locale and corrupts non-ASCII separators under non-UTF-8 locales.
what:
- Pass encoding="utf-8" to subprocess.Popen in tmux_cmd

why: ControlMode opens its own tmux -C subprocess with text=True, so it has the same locale-decoding risk as tmux_cmd but through a separate stdout surface.
what:
- Add a strict xfail regression test for non-ASCII control protocol output under LC_CTYPE=C
- Drive display-message through the control client and assert FORMAT_SEPARATOR survives stdout decoding

why: ControlMode reads tmux control protocol output through its own text-mode subprocess, so locale decoding can corrupt or reject UTF-8 output under non-UTF-8 locales.
what:
- Pass encoding="utf-8" to the control-mode subprocess.Popen call

why: tmux_cmd and ControlMode now decode tmux output as UTF-8 explicitly, so the locale regression checks should fail only on future regressions.
what:
- Remove strict xfail markers from the tmux_cmd and ControlMode UTF-8 regression tests
- Keep the UTF-8-mode skip that avoids masked locale-decoding behavior
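
The control-mode commits apply the same fix to a long-lived, bidirectional subprocess. A rough sketch of that shape, assuming a child Python process as a stand-in for `tmux -C` (the real ControlMode helper and the tmux control protocol are not reproduced here, and `ECHO_CHILD` is a hypothetical name):

```python
import subprocess
import sys

FORMAT_SEPARATOR = "\u241e"

# Stand-in for `tmux -C` (assumption): echoes each stdin line back as raw bytes.
ECHO_CHILD = [
    sys.executable,
    "-c",
    "import sys\n"
    "for line in sys.stdin.buffer:\n"
    "    sys.stdout.buffer.write(line)\n"
    "    sys.stdout.buffer.flush()\n",
]

proc = subprocess.Popen(
    ECHO_CHILD,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
    encoding="utf-8",  # explicit, so the locale cannot change the decoding
)
assert proc.stdin is not None and proc.stdout is not None

# Send one line containing the separator and read the echoed reply.
proc.stdin.write(f"left{FORMAT_SEPARATOR}right\n")
proc.stdin.flush()
reply = proc.stdout.readline()

# Closing stdin ends the child's read loop so it can exit cleanly.
proc.stdin.close()
proc.wait()
proc.stdout.close()

print(reply.rstrip("\n").split(FORMAT_SEPARATOR))
```

Because the encoding applies to both pipes, the separator is encoded on write and decoded on read as UTF-8 regardless of the surrounding locale.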
@tony tony force-pushed the utf-8-encoding branch from c85d4da to 38d2fd2 on May 23, 2026 15:43
why: Document the user-visible bug fix for the 0.58.x release.
what:
- Add Fixes entry for tmux_cmd encoding on non-UTF-8 locales
@tony tony force-pushed the utf-8-encoding branch from 38d2fd2 to fc5e6c4 on May 23, 2026 15:48
@tony tony merged commit cb8384e into master May 23, 2026
13 checks passed
@tony tony deleted the utf-8-encoding branch May 23, 2026 15:52
@tony tony changed the title from "tmux_cmd: Add xfail regression test for UTF-8 encoding on non-UTF-8 locales" to "Subprocess(fix[encoding]): Enforce UTF-8 decoding for tmux output" on May 23, 2026