Add support for writing HTML literals using UTF-8 strings #83457
Open
chsienki wants to merge 14 commits into dotnet:main
Implement auto-detection of UTF-8 WriteLiteral support for legacy .cshtml code generation. When a page's @inherits base class has a callable WriteLiteral(ReadOnlySpan<byte>) overload, HTML literals are emitted as C# UTF-8 string literals ("..."u8).
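To make the overload binding concrete, here is a minimal sketch of how a UTF-8 literal selects the byte-span overload. The types `Utf8PageBase` and `DemoPage` and the `Log` property are hypothetical stand-ins invented for illustration; they are not the base classes or generated code from this PR. It assumes C# 11 or later, since that is where the `u8` suffix was introduced.

```csharp
using System;
using System.Text;

// Hypothetical page base class (not from the PR) exposing both overloads.
public abstract class Utf8PageBase
{
    private readonly StringBuilder _log = new();

    // Records which overload each call bound to, for demonstration only.
    public string Log => _log.ToString();

    // Traditional overload: receives UTF-16 text.
    public void WriteLiteral(string value)
        => _log.Append("string:").Append(value).Append(';');

    // Byte-span overload: receives bytes that are already UTF-8 encoded.
    public void WriteLiteral(ReadOnlySpan<byte> value)
        => _log.Append("utf8:").Append(value.Length).Append(';');
}

// Hypothetical stand-in for a generated page class.
public sealed class DemoPage : Utf8PageBase
{
    public void Execute()
    {
        WriteLiteral("<p>hi</p>");   // plain literal: binds to WriteLiteral(string)
        WriteLiteral("<p>hi</p>"u8); // u8 literal: binds to WriteLiteral(ReadOnlySpan<byte>)
    }
}

public static class Program
{
    public static void Main()
    {
        var page = new DemoPage();
        page.Execute();
        Console.WriteLine(page.Log);
    }
}
```

Because a `u8` literal has type `ReadOnlySpan<byte>` and has no conversion to `string`, emitting it against a base class without the byte-span overload would not compile, which is why the detection step matters.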
- FullyQualifiedInherits: namespaced type with fully-qualified @inherits
- ShortNameInherits_WithUsing: documents that short names don't resolve for UTF-8 detection (GetTypeByMetadataName requires full qualification)
- PartiallyQualifiedInherits: documents partial names don't resolve
- SwitchesWhenOverloadAddedOrRemoved: uses fresh drivers per edit step

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add GetInheritsDirectiveContent and GetUsingDirectives extension methods on RazorCodeDocument for extracting @inherits and @using directives
- Resolve short/aliased type names via augmented compilation with the document's @using directives when GetTypeByMetadataName fails
- Dual-lookup Utf8SupportMap: per-file (filePath -> FQN) + per-type (FQN -> bool) to handle same @inherits text resolving differently
- Use GetFullName() for metadata name formatting
- Call HasCallableUtf8WriteLiteralOverload via string overload to avoid cross-compilation symbol issues
- Add InheritsInfo nested record on DefaultUtf8WriteLiteralFeature
- Tests: short name with @using, alias via _ViewImports, file-level alias, alias shadowing (CS0576 graceful fallback), fully-qualified

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build one probe syntax tree with namespace-scoped usings for all entries that need resolution, instead of creating a separate augmented compilation per entry. This reduces O(N) AddSyntaxTrees calls to O(1).

- Two-pass Create: fast path via GetTypeByMetadataName, then batch slow path
- ResolveTypeNamesWithUsings takes CSharpCompilation directly
- Split pipeline: extract @inherits first, then usings only for files that need it
- Rename GetInheritsDirectiveContent to GetInheritsDirectiveValue
- Make InheritsInfo fields non-nullable

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GetInheritsDirectiveValue() now searches import syntax trees when the main document has no @inherits directive. The most specific _ViewImports wins, and the page's own @inherits overrides everything. Added tests for @inherits in _ViewImports (global and namespaced types) and cascading _ViewImports with override precedence. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The slow path for resolving @inherits type names previously skipped entries with no Razor @using directives. Since .cshtml files always have default MVC imports, this filter was ineffective. Removing it ensures types resolvable via C# global usings or the compilation's existing context are not missed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The slow path now uses GetFullMetadataName() which builds a proper CLR metadata name (with backtick arity for generics and + for nested types) instead of GetFullName() which produces C# display syntax that cannot be resolved by GetTypeByMetadataName. Added tests for generic base classes (single and multiple type params), generics in namespaces, nested generics, and generics from metadata references. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
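The difference between a C# display name and a CLR metadata name can be seen without Roslyn, because reflection's `Type.FullName` uses the same backtick-arity and `+` conventions that `GetTypeByMetadataName` expects. The sketch below uses hypothetical types (`Demo.PageBase<TModel>` and its `Nested` class) purely for illustration; it is not the PR's `GetFullMetadataName()` implementation.

```csharp
using System;

namespace Demo
{
    // Hypothetical generic base class with a nested type (not from the PR).
    public class PageBase<TModel>
    {
        public class Nested { }
    }

    public static class MetadataNames
    {
        // C# display syntax would be "Demo.PageBase<TModel>";
        // the metadata name carries the generic arity after a backtick.
        public static string GenericName => typeof(PageBase<>).FullName!;

        // Nested types are joined to their container with '+' rather than '.'.
        public static string NestedName => typeof(PageBase<>.Nested).FullName!;
    }

    public static class Program
    {
        public static void Main()
        {
            Console.WriteLine(MetadataNames.GenericName); // Demo.PageBase`1
            Console.WriteLine(MetadataNames.NestedName);  // Demo.PageBase`1+Nested
        }
    }
}
```

A lookup keyed on the display form `Demo.PageBase<TModel>` cannot match either of these strings, which is the failure mode the commit above addresses.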
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ToddGrun
reviewed
Apr 28, 2026
ToddGrun
reviewed
Apr 28, 2026
ToddGrun
reviewed
Apr 28, 2026
davidwengier
approved these changes
Apr 28, 2026
Member
davidwengier
left a comment
This looks how I remember
- Use HashCodeCombiner in Utf8SupportMap.GetHashCode
- Replace DescendantNodes() with ChildNodes() over the shallow probe tree
- Drop the .ToArray() and iterate the namespace declarations with a foreach + explicit entryIndex

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds Razor support for emitting HTML literals as C# UTF-8 literals when a page base type exposes WriteLiteral(ReadOnlySpan<byte>), wiring that through the source-generator pipeline, codegen options, and tests.
Changes:
- Precomputes per-file UTF-8 support from @inherits and passes it into code generation via a new IUtf8WriteLiteralFeature.
- Threads a new WriteHtmlUtf8StringLiterals option through lowering and RuntimeNodeWriter so HTML literals can render as "..."u8.
- Adds source-generator, integration, and node-writer coverage plus updated baselines.
Notable review findings:
- UTF-8 literal emission is not gated on C# 11+, so projects pinned to older language versions will get invalid generated code.
- The precomputed support map is built from raw @inherits text before TModel substitution, so common MVC @inherits MyBase<TModel> patterns will miss the new behavior.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/Razor/src/Shared/Microsoft.AspNetCore.Razor.Test.Common/Language/CodeGeneration/TestCodeRenderingContext.cs | Adds test hook for UTF-8 literal option. |
| src/Razor/src/Compiler/test/Microsoft.NET.Sdk.Razor.SourceGenerators.UnitTests/TestFiles/RazorSourceGeneratorCshtmlTests/Utf8HtmlLiterals_WithoutOverload_UsesStringLiterals/Pages/Index_cshtml.g.cs | New baseline for non-UTF-8 emission. |
| src/Razor/src/Compiler/test/Microsoft.NET.Sdk.Razor.SourceGenerators.UnitTests/TestFiles/RazorSourceGeneratorCshtmlTests/Utf8HtmlLiterals_AutoDetectedFromInherits/Pages/Index_cshtml.g.cs | New baseline for UTF-8 emission. |
| src/Razor/src/Compiler/test/Microsoft.NET.Sdk.Razor.SourceGenerators.UnitTests/RazorSourceGeneratorCshtmlTests.cs | Adds source-generator scenario coverage. |
| src/Razor/src/Compiler/perf/Microsoft.AspNetCore.Razor.Microbenchmarks.Generator/Microsoft.AspNetCore.Razor.Microbenchmarks.Generator.csproj | Updates perf harness transport package version. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/SourceGenerators/SourceGeneratorProjectEngine.cs | Passes UTF-8 support map into final generation phase. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/SourceGenerators/RazorSourceGenerator.Helpers.cs | Registers UTF-8 feature on generation engine. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/SourceGenerators/RazorSourceGenerator.cs | Builds and wires the precomputed support map. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/RazorProjectEngine.cs | Adds UTF-8 detection pass to default features. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/RazorCodeGenerationOptions.Flags.cs | Adds option flag bit. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/RazorCodeGenerationOptions.cs | Exposes option and flag plumbing. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/RazorCodeGenerationOptions.Builder.cs | Adds builder surface for the new option. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/RazorCodeDocumentExtensions.cs | Extracts @inherits and @using data from syntax trees. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/DefaultRazorCSharpLoweringPhase.cs | Uses per-document options during lowering. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/CodeGeneration/RuntimeNodeWriter.cs | Emits HTML literals using UTF-8 option. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/CodeGeneration/CodeWriterExtensions.cs | Adds UTF-8 suffix support to string literal writer. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/CSharp/Utf8WriteLiteralDetectionPass.cs | New pass that enables UTF-8 literals per file. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/CSharp/IUtf8WriteLiteralFeature.cs | New feature contract for support checks. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/CSharp/DefaultUtf8WriteLiteralFeature.cs | Implements support-map based detection logic. |
| src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/CSharp/CompilationExtensions.cs | Adds overload detection helpers on compilations. |
| src/Razor/src/Compiler/Microsoft.AspNetCore.Razor.Language/test/RazorProjectEngineTest.cs | Verifies default feature set includes new pass. |
| src/Razor/src/Compiler/Microsoft.AspNetCore.Razor.Language/test/CodeGeneration/RuntimeNodeWriterTest.cs | Adds node-writer UTF-8 output tests. |
| src/Razor/src/Compiler/Microsoft.AspNetCore.Mvc.Razor.Extensions/test/IntegrationTests/CodeGenerationIntegrationTest.cs | Adds integration coverage for UTF-8 literal emission. |
Comment on lines +43 to +46
    var baseTypeName = baseType.BaseType.Content;
    if (_utf8Feature.IsSupported(codeDocument.Source.FilePath, baseTypeName))
    {
        documentNode.Options = documentNode.Options.WithFlags(writeHtmlUtf8StringLiterals: true);
Comment on lines +271 to +285
    var utf8SupportMap = parsedDocuments
        .Select(static (item, _) =>
        {
            var codeDocument = item.Item3.CodeDocument;
            return (codeDocument, InheritsValue: codeDocument.GetInheritsDirectiveValue());
        })
        .Where(static item => item.InheritsValue is not null)
        .Select(static (item, _) => new DefaultUtf8WriteLiteralFeature.InheritsInfo(
            item.codeDocument.Source.FilePath ?? string.Empty, item.InheritsValue!, item.codeDocument.GetUsingDirectives()))
        .Collect()
        .Combine(declCompilation)
        .Select(static (pair, _) =>
        {
            var (inheritsInfos, compilation) = pair;
            return DefaultUtf8WriteLiteralFeature.Utf8SupportMap.Create(inheritsInfos, compilation);
- Utf8SupportMap.Create returns Empty when the consuming compilation is not C# 11+ so older projects don't get invalid 'u8' literals.
- Add 'using TModel = global::System.Object;' to each probe namespace so '@inherits Base<TModel>' (paired with @model) still resolves; WriteLiteral overloads don't depend on the model type argument.
- Add tests covering both fixes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
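A language-version gate of the kind described in the first bullet might look like the following. This is a sketch under stated assumptions: `Utf8Gate` and `SupportsUtf8Literals` are hypothetical names, not the code in `Utf8SupportMap.Create`, and the example needs a reference to the Microsoft.CodeAnalysis.CSharp package to build. It relies only on the public `CSharpCompilation.LanguageVersion` property and the `LanguageVersion.CSharp11` enum member.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper (not from the PR): decides whether 'u8' literals are
// legal in the consuming compilation.
public static class Utf8Gate
{
    public static bool SupportsUtf8Literals(Compilation compilation)
        // Non-C# compilations fail the type pattern; older C# versions
        // fail the relational pattern on the LanguageVersion enum.
        => compilation is CSharpCompilation { LanguageVersion: >= LanguageVersion.CSharp11 };

    // Builds a throwaway compilation whose language version comes from the
    // parse options of its single syntax tree.
    public static Compilation CreateFor(LanguageVersion version)
    {
        var tree = CSharpSyntaxTree.ParseText(
            "class C { }",
            new CSharpParseOptions(version));
        return CSharpCompilation.Create("demo", new[] { tree });
    }
}
```

With this helper, `Utf8Gate.SupportsUtf8Literals(Utf8Gate.CreateFor(LanguageVersion.CSharp10))` is expected to be false and the same call with `LanguageVersion.CSharp11` true, so a caller can return an empty map before doing any type resolution.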
…literals-refactor
# Conflicts:
#	src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/CodeGeneration/RuntimeNodeWriter.cs
#	src/Razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/RazorProjectEngine.cs
Comment on lines +50 to +61
    for (var currentType = type; currentType is not null; currentType = currentType.BaseType)
    {
        foreach (var member in currentType.GetMembers("WriteLiteral"))
        {
            if (member is IMethodSymbol
                {
                    IsStatic: false,
                    ReturnsVoid: true,
                    Parameters: [{ Type: var paramType }]
                } method &&
                SymbolEqualityComparer.Default.Equals(paramType, readOnlySpanOfByte) &&
                compilation.IsSymbolAccessibleWithin(method, type))
Comment on lines +118 to +126
    var usings = new List<string>();
    CollectUsings(syntaxTree, usings);

    if (codeDocument.TryGetImportSyntaxTrees(out var importSyntaxTrees))
    {
        foreach (var importTree in importSyntaxTrees)
        {
            CollectUsings(importTree, usings);
        }
- HasCallableUtf8WriteLiteralOverload no longer treats the base type itself as the lookup context. Private (and assembly-restricted internal) overloads on a referenced base now correctly fall back to string literals instead of producing inaccessible 'u8' calls.
- GetUsingDirectives now returns import usings before page usings, matching DefaultRazorIntermediateNodeLoweringPhase. Probe compilations now bind aliases that originate in _ViewImports.cshtml the same way the final generated code does.
- Add tests for private/internal-on-referenced-assembly fallback, protected detection, and import-defined alias resolution.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
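The accessibility problem in the first bullet can be reproduced with Roslyn's public `Compilation.IsSymbolAccessibleWithin` API. The sketch below is illustrative only: `AccessibilityDemo`, `PageBase`, and `GeneratedPage` are hypothetical names, the derived class is an assumed stand-in for the generated page (the PR does not show which lookup context it switched to), and building it requires the Microsoft.CodeAnalysis.CSharp package.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical demo (not from the PR) of how the 'within' symbol changes
// the answer of an accessibility check.
public static class AccessibilityDemo
{
    public static (bool PrivateFromBase, bool PrivateFromPage, bool ProtectedFromPage) Check()
    {
        const string source = @"
public class PageBase
{
    private void Hidden() { }
    protected void Visible() { }
}
public class GeneratedPage : PageBase { }";

        // typeof(object).Assembly.Location is assumed to be a real file path,
        // which may not hold in single-file deployments.
        var compilation = CSharpCompilation.Create(
            "demo",
            new[] { CSharpSyntaxTree.ParseText(source) },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        var baseType = compilation.GetTypeByMetadataName("PageBase")!;
        var page = compilation.GetTypeByMetadataName("GeneratedPage")!;
        var hidden = baseType.GetMembers("Hidden").Single();
        var visible = baseType.GetMembers("Visible").Single();

        return (
            // Asking from inside the declaring type: private members look callable.
            compilation.IsSymbolAccessibleWithin(hidden, baseType),
            // Asking from the derived page: the private member is not callable.
            compilation.IsSymbolAccessibleWithin(hidden, page),
            // Protected members remain callable from the derived page.
            compilation.IsSymbolAccessibleWithin(visible, page));
    }
}
```

The first result shows why using the base type as its own lookup context over-reports support: a private overload is always accessible from within the type that declares it.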
The Roslyn-wide BannedSymbols.txt forbids the (string, ...) overload in favor of the (SourceText, ...) overload. The Correctness_Analyzers CI leg builds with --runanalyzers --warnaserror so this fails the build. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment on lines +120 to +132
    var usings = new List<string>();

    if (codeDocument.TryGetImportSyntaxTrees(out var importSyntaxTrees))
    {
        foreach (var importTree in importSyntaxTrees)
        {
            CollectUsings(importTree, usings);
        }
    }

    CollectUsings(syntaxTree, usings);

    return [.. usings];
Comment on lines +19 to +31
    /// <summary>
    /// Determines whether the type identified by <paramref name="typeMetadataName"/> has a callable
    /// instance <c>WriteLiteral(ReadOnlySpan<byte>)</c> overload accessible from that type.
    /// </summary>
    public static bool HasCallableUtf8WriteLiteralOverload(this Compilation compilation, string typeMetadataName)
    {
        var type = compilation.GetTypeByMetadataName(typeMetadataName);
        if (type is null || type.TypeKind == TypeKind.Error)
        {
            return false;
        }

        return compilation.HasCallableUtf8WriteLiteralOverload(type);
Comment on lines +59 to +63
    /// <item>Per-file: maps <c>(filePath, rawInheritsText)</c> to a fully-qualified type name</item>
    /// <item>Per-type: maps fully-qualified type name to <see langword="bool"/></item>
    /// </list>
    /// This handles cases where the same <c>@inherits</c> text resolves to different types
    /// in different files (e.g., via <c>@using</c> aliases).
Summary
Builds on #12848 by @DamianEdwards, refactoring the UTF-8 HTML literal detection to use a pipeline-friendly pre-computed map approach.
When a .cshtml page's @inherits base class has a callable WriteLiteral(ReadOnlySpan<byte>) overload, HTML literals are emitted as C# UTF-8 string literals ("..."u8), enabling direct binding to the byte-span overload and avoiding UTF-16→UTF-8 transcoding at runtime.
Key changes from the original PR
- Utf8SupportMap instead of per-file probe compilations -- the source generator extracts @inherits base type names from parsed syntax trees, combines with the declaration compilation to build a value-comparable map, and passes it to ProcessRemaining
- IUtf8WriteLiteralFeature engine feature with DefaultUtf8WriteLiteralFeature implementation backed by the map
- … Utf8WriteLiteralDetectionPass to CSharp namespace
Tests
- … u8 vs string literals)
- … .cshtml files with different @inherits, only one uses UTF-8
- … @inherits directive → string literals (default base class)
- … @inherits detection
Closes dotnet/razor#8429