Sanitize control/format characters in console logger output across all formatters #128741
Copilot wants to merge 2 commits into
Tagging subscribers to this area: @dotnet/area-extensions-logging
Co-authored-by: rosebyte <14963300+rosebyte@users.noreply.github.com>
tarekgh left a comment
Review of the control character sanitization changes
The security motivation here is solid. Preventing terminal escape injection (ANSI sequences, bidi overrides, etc.) is worth doing. But the current implementation has several problems that need to be addressed before merging.
\n, \r, and \t should not be escaped
The sanitizer uses UnicodeCategory.Control which catches every character in U+0000-U+001F, including \n, \r, and \t. These are not security threats. They are structural formatting characters that the formatters depend on.
Both SimpleConsoleFormatter and SystemdConsoleFormatter have explicit downstream logic that operates on real newlines:
- SimpleConsoleFormatter.WriteMessage calls message.Replace(Environment.NewLine, _newLineWithMessagePadding) to add indentation padding after each newline in exception text.
- SystemdConsoleFormatter.WriteReplacingNewLine calls message.Replace(Environment.NewLine, " ") to flatten multi-line messages into a single line (required by systemd/journald).
Because the sanitizer runs before these calls, it converts \n to the literal text \u000A. The downstream Replace calls then find no real newlines and become no-ops. This breaks multi-line exception formatting in Simple mode (no padding) and breaks the single-line guarantee in Systemd mode.
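The ordering problem can be reproduced in isolation. The sketch below is not the PR's code: EscapeAllControls is a hypothetical stand-in for a sanitizer that escapes every UnicodeCategory.Control character, and FlattenForSystemd mirrors the Replace call quoted above.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class ReplaceNoOpDemo
{
    // Hypothetical stand-in: escapes every control character as \uXXXX,
    // including \n, \r, and \t.
    public static string EscapeAllControls(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (char.GetUnicodeCategory(c) == UnicodeCategory.Control)
                sb.Append("\\u").Append(((int)c).ToString("X4", CultureInfo.InvariantCulture));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Mirrors the downstream call: replace real newlines with a space.
    public static string FlattenForSystemd(string message) =>
        message.Replace(Environment.NewLine, " ");
}
```

Calling FlattenForSystemd on the raw message replaces the newline with a space. Calling it on the output of EscapeAllControls changes nothing, because the string no longer contains Environment.NewLine, only the literal escape text.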
The fix should target only the actually dangerous characters: ESC (\x1B), BEL (\x07), backspace (\x08), bidi overrides (\u202E, \u202D), and similar. Not \n/\r/\t.
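A minimal sketch of what a narrowed escape set could look like. The class name, the choice of bidi ranges (U+202A-U+202E and U+2066-U+2069), and the decision to escape all remaining C0/C1 controls are assumptions for illustration, not the PR's ConsoleControlCharacterSanitizer.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only.
public static class NarrowSanitizerSketch
{
    // Assumption: \n, \r, and \t are structural and pass through unchanged.
    private static bool IsStructural(char c) => c == '\n' || c == '\r' || c == '\t';

    private static bool ShouldEscape(char c)
    {
        if (IsStructural(c)) return false;
        if (char.IsControl(c)) return true;       // ESC, BEL, backspace, other C0/C1 controls
        return (c >= '\u202A' && c <= '\u202E')   // bidi embeddings and overrides
            || (c >= '\u2066' && c <= '\u2069');  // bidi isolates
    }

    public static string Sanitize(string message)
    {
        if (string.IsNullOrEmpty(message)) return message;

        // Find the first character that needs escaping.
        int first = -1;
        for (int i = 0; i < message.Length; i++)
        {
            if (ShouldEscape(message[i])) { first = i; break; }
        }
        if (first < 0) return message; // fast path: nothing to escape, no allocation

        var sb = new StringBuilder(message.Length + 8);
        sb.Append(message, 0, first);
        for (int i = first; i < message.Length; i++)
        {
            char c = message[i];
            if (ShouldEscape(c))
                sb.Append("\\u").Append(((int)c).ToString("X4", CultureInfo.InvariantCulture));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this shape, a message such as "line1" + Environment.NewLine + "\u001B[31m" keeps its real newline, so the downstream Replace(Environment.NewLine, ...) calls still work, while ESC is written as the literal text \u001B. Ordinary exception strings containing only newlines and tabs take the fast path and return the original instance.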
Double-escaping on the JSON path
Utf8JsonWriter with JavaScriptEncoder.Default already escapes all control characters (U+0000-U+001F) and all non-BasicLatin characters (including \u202E). Pre-sanitizing the strings is redundant and produces double-escaped output.
For example, ESC (\x1B) would normally appear as \u001B in the JSON output. With the sanitizer, it becomes \\u001B, which is a literal backslash followed by u001B. JSON consumers parsing these logs would see the text \u001B instead of the actual ESC character. The test changes in JsonConsoleFormatterTests.cs confirm this: they switched to expecting \\\\u000D\\\\u000A.
The sanitizer should be skipped entirely for the JsonConsoleFormatter path, or at minimum should not run when Utf8JsonWriter is handling the escaping.
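The double-escaping can be checked with Utf8JsonWriter directly. This is a small sketch with hypothetical helper names; it uses the writer's default encoder and a pre-escaped string to stand in for sanitizer output.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

public static class JsonDoubleEscapeDemo
{
    // Writes a single JSON string value with Utf8JsonWriter's default encoder
    // and returns the raw JSON text.
    public static string WriteJsonString(string value)
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            writer.WriteStringValue(value);
            writer.Flush();
        }
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    // What a JSON consumer sees after parsing the written value.
    public static string RoundTrip(string value) =>
        JsonSerializer.Deserialize<string>(WriteJsonString(value));
}
```

Passing "a\u001Bb" (a real ESC) produces JSON in which ESC is already escaped, and RoundTrip returns the original string including the ESC character. Passing the pre-sanitized text "a\\u001Bb" (backslash, u, 0, 0, 1, B) produces JSON with a doubled backslash, and RoundTrip returns the six-character literal instead of ESC.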
Breaking change with default true
Setting SanitizeControlCharacters = true by default changes the output format for every existing application without any opt-in. Exception stack traces go from properly formatted multi-line output to a single blob containing \u000A literals. This will break log parsing tools and dashboards that expect the current format.
Consider either defaulting to false or narrowing the escape set so that \n/\r/\t pass through unchanged (which would make the default safe).
Minor issues
- API review: adding a public property to ConsoleFormatterOptions requires going through the dotnet/runtime API review process.
- Allocations: every exception log triggers a StringBuilder allocation since exception strings always contain \n. Consider string.Create or ValueStringBuilder for the hot path.
- Test coverage: Log_ControlCharacters_SanitizationCanBeDisabled only tests Simple and Systemd formatters. JSON opt-out is not covered.
- Existing test expectations modified: the changes to ConsoleLoggerTest.cs normalize broken formatting as the new expected output rather than preserving the original behavior.
Console logging currently writes untrusted control characters verbatim, allowing terminal escape/control effects and ambiguous output. This change sanitizes control/format characters across Simple, Systemd, and Json formatter paths, with an explicit opt-out for compatibility.
Behavioral change
- ConsoleFormatterOptions.SanitizeControlCharacters (default true).
- Cc/Cf characters as \uXXXX before writing log output.
Implementation
- ConsoleControlCharacterSanitizer.
Public surface
- Microsoft.Extensions.Logging.Console ref API with SanitizeControlCharacters.
Tests
- SanitizeControlCharacters = false) on non-JSON formatters.