Handle unreadable stdin in sanitize-string #4

assisted-by-ai · 2025-12-02T09:10:02Z

Summary

handle stdin objects missing a read method or raising errors by emitting clear failures instead of crashing
add regression tests for unreadable stdin scenarios

Testing

PYTHONPATH="$PWD/usr/lib/python3/dist-packages" python3 -m pytest -o 'python_files=tests/*.py' usr/lib/python3/dist-packages/sanitize_string/tests/sanitize_string.py

assisted-by-ai · 2025-12-03T10:38:04Z

Added reusable helper to assert sanitize-string usage output, exit codes, and standardized help text for argument validation scenarios.

Introduced coverage for invalid argument combinations and absence of stdin to ensure graceful handling of edge conditions.

Skipped stdin.reconfigure when the stream lacks that method so sanitize-string can handle custom stdin implementations without crashing.

Added regression coverage to ensure stdin streams without reconfigure still process input correctly even when pytest is absent from sys.modules.

Added safeguards for stdin handling so sanitize-string now reports unreadable streams instead of crashing on missing read or read failures.

Introduced regression tests covering stdin read errors and absent read methods to ensure graceful exits with clear messaging.

ArrayBolt3 · 2025-12-04T01:35:26Z

usr/lib/python3/dist-packages/sanitize_string/tests/sanitize_string.py

+    def test_missing_stdin(self) -> None:
+        """Verify sanitize_string exits cleanly when stdin is unavailable."""
+
+        stdout_buf: StringIO = StringIO()
+        stderr_buf: StringIO = StringIO()
+        args_arr: list[str] = [self.argv0, "nolimit"]
+        with (
+            mock.patch.object(sys, "argv", args_arr),
+            mock.patch.object(sys, "stdin", None),
+            mock.patch.object(sys, "stdout", stdout_buf),
+            mock.patch.object(sys, "stderr", stderr_buf),
+        ):
+            exit_code: int = sanitize_string_main()
+        self.assertEqual(stdout_buf.getvalue(), "")
+        self.assertEqual(stderr_buf.getvalue(), "")
+        self.assertEqual(exit_code, 0)
+        stdout_buf.close()
+        stderr_buf.close()


We already test this in run-test itself. Maybe moving that logic out of the Bash script and into the Python scripts might be a good future goal, but for now we don't need this.

ArrayBolt3 · 2025-12-04T01:45:37Z

usr/lib/python3/dist-packages/sanitize_string/tests/sanitize_string.py

+    def test_stdin_without_reconfigure(self) -> None:
+        """Ensure sanitize_string tolerates stdin objects lacking reconfigure."""
+
+        stdout_buf: StringIO = StringIO()
+        stderr_buf: StringIO = StringIO()
+        stdin_buf: StringIO = StringIO("Sample input")
+        args_arr: list[str] = [self.argv0, "nolimit"]
+        original_pytest_module = sys.modules.pop("pytest", None)
+        try:
+            with (
+                mock.patch.object(sys, "argv", args_arr),
+                mock.patch.object(sys, "stdin", stdin_buf),
+                mock.patch.object(sys, "stdout", stdout_buf),
+                mock.patch.object(sys, "stderr", stderr_buf),
+            ):
+                exit_code: int = sanitize_string_main()
+        finally:
+            if original_pytest_module is not None:
+                sys.modules["pytest"] = original_pytest_module
+        self.assertEqual(stdout_buf.getvalue(), "Sample input")
+        self.assertEqual(stderr_buf.getvalue(), "")
+        self.assertEqual(exit_code, 0)
+        stdout_buf.close()
+        stderr_buf.close()


Is this actually useful? In real-world use stdin will either have reconfigure(), or it will be None, unless something changes in the future so that sys.stdin is no longer a TextIOWrapper. When using StringIO under pytest, that isn't the case, but that's why the sanitize_string main function skips the reconfigure step if pytest is in sys.modules. Testing a case that can only happen in the test suite seems unnecessary.

What would be better is if the test suite actually provided a TextIOWrapper to stdin so that we could always use reconfigure... that's actually possible according to https://stackoverflow.com/questions/44702487/how-to-create-and-write-to-a-textiowrapper-and-readlines, so that's probably what I'll do. Then this whole block can be skipped.

ArrayBolt3 · 2025-12-04T01:49:36Z

usr/lib/python3/dist-packages/sanitize_string/tests/sanitize_string.py

+    def test_unreadable_stdin_raises_error(self) -> None:
+        """Ensure unreadable stdin streams fail gracefully."""
+
+        class ExplodingStdin:
+            def read(self) -> str:  # pragma: no cover - invoked via main
+                raise ValueError("boom")
+
+        stdout_buf: StringIO = StringIO()
+        stderr_buf: StringIO = StringIO()
+        args_arr: list[str] = [self.argv0, "nolimit"]
+        with (
+            mock.patch.object(sys, "argv", args_arr),
+            mock.patch.object(sys, "stdin", ExplodingStdin()),
+            mock.patch.object(sys, "stdout", stdout_buf),
+            mock.patch.object(sys, "stderr", stderr_buf),
+        ):
+            exit_code: int = sanitize_string_main()
+        self.assertEqual(stdout_buf.getvalue(), "")
+        self.assertEqual(
+            stderr_buf.getvalue(),
+            "sanitize-string: failed to read from standard input\n",
+        )
+        self.assertEqual(exit_code, 1)
+        stdout_buf.close()
+        stderr_buf.close()


If stdin raises an error on read, something is very, very wrong most likely in the kernel. Reading from stdin should never fail, it should In that case exiting gracefully would actually be a bad thing.

ArrayBolt3 · 2025-12-04T01:50:06Z

usr/lib/python3/dist-packages/sanitize_string/tests/sanitize_string.py

+    def test_missing_read_attribute_returns_error(self) -> None:
+        """Validate stdin objects without read cause a clean failure."""
+
+        class NoReadStdin:  # pragma: no cover - exercised indirectly
+            pass
+
+        stdout_buf: StringIO = StringIO()
+        stderr_buf: StringIO = StringIO()
+        args_arr: list[str] = [self.argv0, "nolimit"]
+        with (
+            mock.patch.object(sys, "argv", args_arr),
+            mock.patch.object(sys, "stdin", NoReadStdin()),
+            mock.patch.object(sys, "stdout", stdout_buf),
+            mock.patch.object(sys, "stderr", stderr_buf),
+        ):
+            exit_code: int = sanitize_string_main()
+        self.assertEqual(stdout_buf.getvalue(), "")
+        self.assertEqual(
+            stderr_buf.getvalue(),
+            "sanitize-string: standard input is unreadable\n",
+        )
+        self.assertEqual(exit_code, 1)
+        stdout_buf.close()
+        stderr_buf.close()


Again, this can only ever happen in tests, at least if sanitize_string is used properly. This test is unnecessary.

ArrayBolt3 · 2025-12-04T01:52:38Z

usr/lib/python3/dist-packages/sanitize_string/sanitize_string.py

-            if "pytest" not in sys.modules:
-                sys.stdin.reconfigure(errors="ignore")  # type: ignore
-            untrusted_string = sys.stdin.read()
+            if "pytest" not in sys.modules and hasattr(sys.stdin, "reconfigure"):
+                sys.stdin.reconfigure(errors="ignore")  # type: ignore[arg-type]
+            try:
+                read_fn = sys.stdin.read
+            except AttributeError:
+                print(
+                    "sanitize-string: standard input is unreadable", file=sys.stderr
+                )
+                return 1
+            try:
+                untrusted_string = read_fn()
+            except (OSError, ValueError):
+                print(
+                    "sanitize-string: failed to read from standard input",
+                    file=sys.stderr,
+                )
+                return 1


This is again guarding against conditions that must not occur in real-world use, so this isn't useful.

As described above though, the if "pytest" not in sys.modules bit can probably be lost if we patch stdin with an actual TextIOWrapper.

ArrayBolt3 · 2025-12-04T02:28:04Z

Added reusable helper to assert sanitize-string usage output, exit codes, and standardized help text for argument validation scenarios.

Rejected, because we have a reusable helper in strip_markup's tests that we're already using. Instead, augmented the existing helper to accept an exit_code argument.

Introduced coverage for invalid argument combinations and absence of stdin to ensure graceful handling of edge conditions.

Accepted invalid argument combination tests. Rejected test for absence of stdin because it's already implemented in the `run-test script itself.

Skipped stdin.reconfigure when the stream lacks that method so sanitize-string can handle custom stdin implementations without crashing.

Rejected, because a real-world scenario will never have stdin lacking the reconfigure() method. Instead, augmented the mocked streams so that they also support reconfigure(), and removed the pytest module check.

Added regression coverage to ensure stdin streams without reconfigure still process input correctly even when pytest is absent from sys.modules.

Rejected, see above.

Added safeguards for stdin handling so sanitize-string now reports unreadable streams instead of crashing on missing read or read failures.

Rejected, because an unreadable stdin stream would indicate a very critical error in some other software on the system. A crash would be the expected behavior. (Note that in Python, closed stdin streams do not become unreadable, they simply stop returning data when read.)

Introduced regression tests covering stdin read errors and absent read methods to ensure graceful exits with clear messaging.

Rejected, see above.

Useful parts of this PR integrated in ArrayBolt3@ad74840. This PR can be closed.

Handle unreadable stdin in sanitize-string

1277d98

assisted-by-ai added the codex label Dec 2, 2025 — with ChatGPT Codex Connector

ArrayBolt3 reviewed Dec 4, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Handle unreadable stdin in sanitize-string #4

Handle unreadable stdin in sanitize-string #4

Uh oh!

assisted-by-ai commented Dec 2, 2025

Uh oh!

assisted-by-ai commented Dec 3, 2025

Uh oh!

ArrayBolt3 Dec 4, 2025

Uh oh!

ArrayBolt3 Dec 4, 2025

Uh oh!

ArrayBolt3 Dec 4, 2025

Uh oh!

ArrayBolt3 Dec 4, 2025

Uh oh!

ArrayBolt3 Dec 4, 2025

Uh oh!

ArrayBolt3 commented Dec 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Handle unreadable stdin in sanitize-string #4

Are you sure you want to change the base?

Handle unreadable stdin in sanitize-string #4

Uh oh!

Conversation

assisted-by-ai commented Dec 2, 2025

Summary

Testing

Uh oh!

assisted-by-ai commented Dec 3, 2025

Uh oh!

ArrayBolt3 Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

ArrayBolt3 Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

ArrayBolt3 Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

ArrayBolt3 Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

ArrayBolt3 Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

ArrayBolt3 commented Dec 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants