-
Notifications
You must be signed in to change notification settings - Fork 0
Cover remaining C1 controls in stdisplay tests #8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Cover remaining C1 controls in stdisplay tests #8
Conversation
|
stdisplay builds an allowlist of safe characters: printable ASCII (0x20–0x7E), newline, tab, and allowed SGR escape sequences. Everything else—including other control characters, Unicode beyond ASCII, and any escape sequence that is not an allowed SGR—is replaced with underscores. Conclusion Added a dedicated test case class covering hostile non-SGR escape sequences (OSC, DCS, kitty graphics, UTF-8 mode, and C1 CSI) to ensure they are neutralized by stdisplay. Added coverage for APC, PM, SOS, and C1 APC control strings to verify stdisplay redacts them outside the SGR allowlist. Added zero-width and right-to-left override examples to the shared escape-case corpus to verify they are reduced to safe placeholders. Introduced coverage for single C1 control characters and 8-bit string commands to confirm stdisplay neutralizes them. Added Unicode formatting controls (bidi overrides/isolates, zero-width joiner, BOM, and related markers) to the shared escape corpus to confirm they are reduced to safe placeholders. Added coverage for additional C0/C1 controls and nested OSC/DCS clipboard payloads to ensure stdisplay neutralizes them with underscores. Added regression coverage for additional C1 control bytes (NEL, PAD/SS3, and STS–EPA range) to confirm stdisplay replaces them with safe placeholders. |
| ] | ||
|
|
||
|
|
||
| class TestSTDisplayMaliciousCases(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These don't have to go in their own class.
| ("\x1b]0;evil title\x07", "_]0;evil title_"), | ||
| ("\x1bP1;2|malicious\x1b\\", "_P1;2|malicious_\\"), | ||
| ("\x1b_Gf=24,s=1,v=1;AAAA\x1b\\", "__Gf=24,s=1,v=1;AAAA_\\"), | ||
| ("\x1b%Gpayload", "_%Gpayload"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ESC % doesn't seem to be a C1 control sequence that I can tell.
All accepted. (I didn't audit to make sure all of the sequences in this PR are the sequences ChatGPT says it included, but really any weird Unicode and escape sequences should be caught by stdisplay, so I don't care too much about whether the commands being sanitized out are well-formed or even valid. We only cared about this for the CSI commands because stdisplay is supposed to semi-interpret and allow through some CSI commands.) Added the test cases in ArrayBolt3@bb095e1. (Did not use the redundant code from this.) This PR can be closed. |
Summary
Testing
Codex Task