C#: Fix whitespace violations in parser #6977
Merged
knutwannheden merged 8 commits into main on Mar 13, 2026
…ssembly-level attributes
Add a WhitespaceValidator that detects non-whitespace content leaking into Space fields, integrated into the test harness so all existing tests catch these parser bugs.
Fix two violation categories:
1. UsingDirective: add `unsafe` keyword support (C# 12+) to the C#-side AST, parser, printer, visitor, and RPC sender/receiver (Java side already had this field).
2. CompilationUnit: add `AttributeLists` field to parse assembly/module-level attributes instead of letting them leak into the EOF space. Updated AST, parser, printer, visitor, and both C#/Java RPC sender/receiver pairs.
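The core check behind such a validator can be sketched as follows. This is a minimal illustration, not the PR's `WhitespaceValidator`: `SpaceCheck` and `FirstViolation` are hypothetical names, and the real visitor walks AST nodes rather than raw strings.

```csharp
// Hypothetical helper (not part of the PR): shows the per-field check a
// whitespace validator would apply to the string stored in a Space.
public static class SpaceCheck
{
    // Returns the index of the first non-whitespace character,
    // or -1 when the string is pure whitespace.
    public static int FirstViolation(string whitespace)
    {
        for (var i = 0; i < whitespace.Length; i++)
        {
            if (!char.IsWhiteSpace(whitespace[i]))
            {
                return i;
            }
        }
        return -1;
    }
}
```

With this sketch, `SpaceCheck.FirstViolation("\n    ")` returns `-1`, while `SpaceCheck.FirstViolation(" unsafe ")` returns `1`, which is the kind of leaked keyword the first violation category describes.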
…terns
Add `Externs` field to C# `CompilationUnit` and `NamespaceDeclaration` to properly parse extern alias directives instead of letting them leak into Space fields. Updated parser, printer, visitor, and both C#/Java RPC sender/receiver pairs.
Also fix file-scoped namespace handling to iterate `fsns.Externs` and `fsns.Usings`, which were previously skipped by the parser.
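For reference, a sample input of the shape this commit targets might look like the following. The alias and type names (`LegacyLib`, `OtherLib`, `Demo`, `Widget`) are made up for illustration; the file parses as C# but only compiles if assemblies are referenced under those aliases.

```csharp
// Compilation-unit level: extern alias directives precede using directives.
extern alias LegacyLib;
using System;

// File-scoped namespace: it can carry its own externs and usings.
namespace Demo;

extern alias OtherLib;
using System.Text;

class Widget { }
```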
…lds to match Java model
The C# model now uses JRightPadded wrappers for externs and members, matching the Java-side Cs.CompilationUnit and BlockScopeNamespaceDeclaration. This eliminates fabricated JRightPadded wrappers in RPC sender/receiver that were discarding pre-semicolon whitespace on round-trips.
The parser now captures the space before semicolons on extern alias and using directives into JRightPadded.After via _pendingSemicolonSpace.
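The idea of keeping pre-semicolon whitespace on the wrapper can be sketched roughly as below. `RightPadded<T>` and `ExternAliasPrinter` are hypothetical stand-ins, not the actual `JRightPadded` type or the PR's printer, and whitespace is modeled as a plain string.

```csharp
// Hypothetical stand-in for a right-padded wrapper: the element plus the
// whitespace that follows it (here, the space before the semicolon).
public sealed record RightPadded<T>(T Element, string After);

public static class ExternAliasPrinter
{
    // Prints `extern alias <name><After>;` so the whitespace before the
    // semicolon is preserved when the directive is printed back out.
    public static string Print(RightPadded<string> alias) =>
        $"extern alias {alias.Element}{alias.After};";
}
```

Under these assumptions, `ExternAliasPrinter.Print(new RightPadded<string>("Lib", " "))` yields `extern alias Lib ;`, whereas a fabricated wrapper with an empty `After` would print `extern alias Lib;` and lose the original spacing.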
where clauses can only constrain declared type parameters, so
`class a where b : c { }` and `class a { void M() where b : c { } }`
are not legal C# and don't need parser support.
…rminator
Property initializers (e.g., `public int X { get; set; } = 10;`) were not
being parsed, causing the initializer text to leak into the accessor block's
End space as whitespace. Now parses `node.Initializer` into the existing
`JLeftPadded<Expression>` field and handles the trailing semicolon through
`_pendingSemicolonSpace`/`PrintStatementTerminator` for both expression-bodied
and initializer properties.
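The two property forms mentioned here look like this in source. The class and member names are illustrative only.

```csharp
public class Point
{
    // Auto-property with an initializer: ` = 10` and the trailing `;`
    // come after the accessor block's closing brace.
    public int X { get; set; } = 10;

    // Expression-bodied property: it also ends with a `;` terminator.
    public int Y => X * 2;
}
```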
The InterfaceSpecifier field already existed on PropertyDeclaration but was always passed as null. Now parses node.ExplicitInterfaceSpecifier using the same pattern as events, indexers, and operators.
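A property with an explicit interface specifier has the following shape. `IShape` and `Square` are made-up names for illustration.

```csharp
interface IShape
{
    double Area { get; }
}

class Square : IShape
{
    // `IShape.` before the property name is the explicit interface specifier.
    double IShape.Area => 4.0;
}
```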
Record types can have both a block body and a trailing semicolon
(`record C { };`). The semicolon was not consumed, leaking into
CompilationUnit.Eof. Now consumed during parsing and printed via
PrintStatementTerminator using a Semicolon marker on the ClassDeclaration.
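A sample input for this case, using the record from the commit message plus a second record to show that the semicolon sits between declarations:

```csharp
// Block body followed by an optional trailing semicolon.
record C { };

// The next declaration starts after that semicolon.
record D { }
```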
ProcessGapDirectives used Space.Format for directive prefixes, which treats everything as raw whitespace. Comments like `//foo` before a `#define` directive leaked into Space.Whitespace. Now uses CachedFormat (which calls FormatWithComments) to properly structure comments.
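A sample input for the directive case might look like this. The symbol `TRACE_PARSER` and the class are made up; the point is that the `//foo` line is a comment preceding the directive, not whitespace.

```csharp
//foo
#define TRACE_PARSER

class Program { }
```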
Summary
Fixes several categories of C# parser whitespace violations where non-whitespace source code content was incorrectly placed into `Space.Whitespace` or `Comment.Suffix` fields. Adds a `WhitespaceValidator` visitor and integrates it into the test infrastructure so all `RewriteTest`-based tests automatically catch whitespace violations.
Changes
- Add `WhitespaceValidator` that walks all AST nodes and asserts Space fields contain only whitespace
- Integrate it into `RewriteTest` so violations are caught automatically
- Fix `unsafe` using directives and assembly-level attributes (categories 1-3)
- Use `JRightPadded` for `CompilationUnit` and `NamespaceDeclaration` fields to match Java model
- Parse property initializers into the `Initializer` field
- Handle semicolons via `PrintStatementTerminator` instead of inline
Whitespace violation categories
- `unsafe` keyword in using directives (19 failures)
- `extern alias` directives (3 failures)
- `using` declaration statements (7 failures)
Test plan