Skip to content

C#: Fix whitespace violations in parser#6977

Merged
knutwannheden merged 8 commits intomainfrom
feral-wasp
Mar 13, 2026
Merged

C#: Fix whitespace violations in parser#6977
knutwannheden merged 8 commits intomainfrom
feral-wasp

Conversation

@knutwannheden
Copy link
Contributor

Summary

Fixes several categories of C# parser whitespace violations where non-whitespace source code content was incorrectly placed into Space.Whitespace or Comment.Suffix fields. Adds a WhitespaceValidator visitor and integrates it into the test infrastructure so all RewriteTest-based tests automatically catch whitespace violations.

Changes

  • Add WhitespaceValidator that walks all AST nodes and asserts Space fields contain only whitespace
  • Integrate validator into RewriteTest so violations are caught automatically
  • Fix unsafe using directives and assembly-level attributes (categories 1-3)
  • Parse extern alias directives and file-scoped namespace usings/externs
  • Use JRightPadded for CompilationUnit and NamespaceDeclaration fields to match Java model
  • Parse property initializers into the existing Initializer field
  • Print property semicolons via PrintStatementTerminator instead of inline

Whitespace violation categories

  • 1. unsafe keyword in using directives (19 failures)
  • 2. Assembly/module-level attributes (10 failures)
  • 3. extern alias directives (3 failures)
  • 4. Type constraints on non-generic types — removed illegal test cases (2 failures)
  • 5. Primary constructor base calls (5 failures)
  • 6. Explicit interface implementations on methods (1+ failures)
  • 7. Explicit interface implementations on properties (1 failure)
  • 8. Property initializers (3 failures)
  • 9. Conversion operator explicit interface implementations (2 failures)
  • 10. using declaration statements (7 failures)
  • 11. Preprocessor directive comments (2+ failures)
  • 12. Miscellaneous edge cases (4 failures)

Test plan

  • All existing tests pass (1204 passing)
  • 38 remaining failures are from known unchecked categories above
  • Fix remaining categories in follow-up commits

…ssembly-level attributes

Add a WhitespaceValidator that detects non-whitespace content leaking
into Space fields, integrated into the test harness so all existing
tests catch these parser bugs. Fix two violation categories:

1. UsingDirective: add `unsafe` keyword support (C# 12+) to the C#-side
   AST, parser, printer, visitor, and RPC sender/receiver (Java side
   already had this field).

2. CompilationUnit: add `AttributeLists` field to parse assembly/module-
   level attributes instead of letting them leak into the EOF space.
   Updated AST, parser, printer, visitor, and both C#/Java RPC
   sender/receiver pairs.
…terns

Add `Externs` field to C# `CompilationUnit` and `NamespaceDeclaration`
to properly parse extern alias directives instead of letting them leak
into Space fields. Updated parser, printer, visitor, and both C#/Java
RPC sender/receiver pairs.

Also fix file-scoped namespace handling to iterate `fsns.Externs` and
`fsns.Usings`, which were previously skipped by the parser.
…lds to match Java model

The C# model now uses JRightPadded wrappers for externs and members,
matching the Java-side Cs.CompilationUnit and BlockScopeNamespaceDeclaration.
This eliminates fabricated JRightPadded wrappers in RPC sender/receiver
that were discarding pre-semicolon whitespace on round-trips.

The parser now captures the space before semicolons on extern alias and
using directives into JRightPadded.After via _pendingSemicolonSpace.
where clauses can only constrain declared type parameters, so
`class a where b : c { }` and `class a { void M() where b : c { } }`
are not legal C# and don't need parser support.
…rminator

Property initializers (e.g., `public int X { get; set; } = 10;`) were not
being parsed, causing the initializer text to leak into the accessor block's
End space as whitespace. Now parses `node.Initializer` into the existing
`JLeftPadded<Expression>` field and handles the trailing semicolon through
`_pendingSemicolonSpace`/`PrintStatementTerminator` for both expression-bodied
and initializer properties.
The InterfaceSpecifier field already existed on PropertyDeclaration but
was always passed as null. Now parses node.ExplicitInterfaceSpecifier
using the same pattern as events, indexers, and operators.
Record types can have both a block body and a trailing semicolon
(`record C { };`). The semicolon was not consumed, leaking into
CompilationUnit.Eof. Now consumed during parsing and printed via
PrintStatementTerminator using a Semicolon marker on the ClassDeclaration.
ProcessGapDirectives used Space.Format for directive prefixes, which
treats everything as raw whitespace. Comments like `//foo` before a
`#define` directive leaked into Space.Whitespace. Now uses CachedFormat
(which calls FormatWithComments) to properly structure comments.
@knutwannheden knutwannheden marked this pull request as ready for review March 13, 2026 12:34
@knutwannheden knutwannheden merged commit 706162c into main Mar 13, 2026
1 check passed
@knutwannheden knutwannheden deleted the feral-wasp branch March 13, 2026 12:34
@github-project-automation github-project-automation bot moved this from In Progress to Done in OpenRewrite Mar 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

1 participant