Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI: fix line buffering with Python 3 (#528) #537

Merged
merged 1 commit into from
Sep 28, 2023

Conversation

thiell
Copy link
Collaborator

@thiell thiell commented Sep 20, 2023

When the code was ported to Python 3, we switched to sys.stdout.buffer and sys.stderr.buffer that use BufferedWriter (bytes) for printing outputs. However, this method doesn't work great for line buffering, which is used a lot by clush and clubak.

Switch back to sys.stdout and sys.stderr, as TextIOWrapper objects, which take strings and benefit from automatic line buffering. If stdout (or stderr) is not a tty, we reconfigure it (only once) to force line buffering. Indeed, clush and clubak might be piped to other tools (like pv) that expects each line to be flushed.

Key points of this patch:

  • write() to stdout/stderr now takes a string and the Display class is in charge of performing the needed bytes-to-string decoding

  • any non-printable character will be decoded to U+FFFD thanks to errors='replace'; U+FFFD is the official REPLACEMENT CHARACTER (this is now tested in CLIDisplayTest.py)

  • when the Display object is initialized, we make sure stdout/stderr are configured with line buffering enabled

  • the print methods from Display like print_line(), print_line_error() and _print_content() still take bytes as argument. Buffers are still mainly handled as bytes internally (by MsgTreeElem).

  • most CLI tests remain unchanged for now: they capture the output (now as string) and compare against expected bytes, but proper encoding is done by TLib in CLI_main(). In the future, we could convert all tests from using bytes to string to make it easier.

Fixes #528.

When the code was ported to Python 3, we switched to sys.stdout.buffer
and sys.stderr.buffer that use BufferedWriter (bytes) for printing
outputs. However, this method doesn't work great for line buffering,
which is used a lot by clush and clubak.

Switch back to sys.stdout and sys.stderr, as TextIOWrapper objects,
which take strings and benefit from automatic line buffering.
If stdout (or stderr) is not a tty, we reconfigure it (only once) to
force line buffering. Indeed, clush and clubak might be piped to other
tools (like pv) that expects each line to be flushed.

Key aspects of this patch:

- write() to stdout/stderr now takes a string and the Display class is
in charge of performing the needed bytes-to-string decoding

- any non-printable character will be decoded to U+FFFD thanks to
errors='replace'; U+FFFD is the official REPLACEMENT CHARACTER (this
is now tested in CLIDisplayTest.py)

- when the Display object is initialized, we make sure stdout/stderr
are configured with line buffering enabled

- the print methods from Display like print_line(), print_line_error()
and _print_content() still take bytes as argument. Buffers are still
mainly handled as bytes internally (by MsgTreeElem)

- most CLI tests remain unchanged for now: they capture the output (now
as string) and compare against expected bytes, but proper encoding is
done by TLib in CLI_main(). In the future, we could convert all tests
from using bytes to string to make it easier.

Fixes cea-hpc#528.
@thiell thiell added this pull request to the merge queue Sep 28, 2023
Merged via the queue into cea-hpc:master with commit aaba98f Sep 28, 2023
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

clush: line buffering broken with Python 3.9.16
1 participant