fix(csharp): decode HTML entities in XML documentation comments #10412

Merged
fern-support merged 4 commits into main from devin/1762542000-fix-csharp-xml-entity-bug
Nov 10, 2025
@devin-ai-integration
Contributor

Description

Fixes a bug where HTML entities in API documentation caused C# compilation errors.

Link to Devin run: https://app.devin.ai/sessions/6ac5baab8cd14fc28a0a95bc894527f5
Requested by: judah@buildwithfern.com (@jsklan)

Problem

When OpenAPI specs or Fern definitions contained HTML entities like &plus;, &minus;, &times;, etc. in documentation strings, the C# generator copied them directly into XML documentation comments. Since XML recognizes only five predefined entities (&lt;, &gt;, &amp;, &quot;, &apos;), this caused compilation errors:

error CS1570: XML comment has badly formed XML -- 'Reference to undeclared entity 'plus'.'
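
To make the failure mode concrete, here is a minimal sketch of a check that lists named entity references in a doc string that XML does not predefine. The function name `findUndeclaredEntities` and the constant `XML_PREDEFINED_NAMES` are hypothetical and are not part of the generator.

```typescript
// Names of the five entities that XML predefines.
const XML_PREDEFINED_NAMES = ["lt", "gt", "amp", "quot", "apos"];

// Hypothetical helper: returns each distinct named entity in `doc`
// that an XML parser would report as undeclared.
function findUndeclaredEntities(doc: string): string[] {
    const pattern = /&([A-Za-z][A-Za-z0-9]*);/g;
    const found: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(doc)) !== null) {
        const entityName = match[1];
        const isPredefined = XML_PREDEFINED_NAMES.indexOf(entityName) !== -1;
        if (!isPredefined && found.indexOf(entityName) === -1) {
            found.push(entityName);
        }
    }
    return found;
}

const undeclared = findUndeclaredEntities("UTC &plus; offset, a &lt; b, 2 &times; 3, &plus;05:30");
console.log(undeclared); // [ 'plus', 'times' ]
```

Any name this returns would be expected to trigger a CS1570 error if it were written into a `///` comment unchanged.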

This affected real-world SDKs like the Square .NET SDK, blocking CI builds and preventing PR merges.

Solution

Added decodeHtmlEntities() method to XmlDocWriter that:

  1. Decodes ~140 common HTML entities to their Unicode characters (e.g., &plus; → +, &times; → ×)
  2. Handles numeric character references (e.g., the numeric references for a space character are decoded to a literal space)
  3. Runs BEFORE XML escaping, so valid XML entities are still properly escaped

Example transformation:

// Before (invalid XML):
/// <summary>Format is UTC &plus; offset (e.g., &plus;05:30)</summary>

// After (valid XML):
/// <summary>Format is UTC + offset (e.g., +05:30)</summary>
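
The three steps above can be sketched roughly as follows. This is not the code from XmlDocWriter.ts. The entity table is a hypothetical abbreviation of the ~140-entry map, and `isXmlChar` and `escapeXml` are illustrative helpers. The sketch makes two further assumptions: numeric references are decoded only when the result is a character that XML 1.0 allows, and the escaping step leaves the five predefined entities untouched.

```typescript
// Hypothetical, abbreviated table; the PR describes a map of roughly 140 entries.
const NAMED_ENTITIES: Record<string, string> = {
    "&plus;": "+",
    "&minus;": "\u2212",
    "&times;": "\u00D7",
    "&divide;": "\u00F7",
    "&nbsp;": "\u00A0",
    "&hellip;": "\u2026",
    "&middot;": "\u00B7",
    "&copy;": "\u00A9",
};

// True when the code point is allowed by the XML 1.0 Char production.
function isXmlChar(cp: number): boolean {
    return (
        cp === 0x9 || cp === 0xa || cp === 0xd ||
        (cp >= 0x20 && cp <= 0xd7ff) ||
        (cp >= 0xe000 && cp <= 0xfffd) ||
        (cp >= 0x10000 && cp <= 0x10ffff)
    );
}

function decodeHtmlEntities(text: string): string {
    let result = text;
    for (const entity of Object.keys(NAMED_ENTITIES)) {
        // split/join replaces every occurrence, like replaceAll with a string pattern.
        result = result.split(entity).join(NAMED_ENTITIES[entity]);
    }
    // Leave a numeric reference as-is when it does not name a legal XML character.
    const decodeRef = (match: string, digits: string, radix: number): string => {
        const cp = parseInt(digits, radix);
        return isXmlChar(cp) ? String.fromCodePoint(cp) : match;
    };
    result = result.replace(/&#(\d+);/g, (m: string, d: string) => decodeRef(m, d, 10));
    result = result.replace(/&#[xX]([0-9a-fA-F]+);/g, (m: string, h: string) => decodeRef(m, h, 16));
    return result;
}

// Assumed escaping step: bare "&" is escaped, the five predefined entities are kept.
function escapeXml(text: string): string {
    return text
        .replace(/&(?!(?:lt|gt|amp|quot|apos);)/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Decoding runs first, then XML escaping.
function escapeXmlDocContent(text: string): string {
    return escapeXml(decodeHtmlEntities(text));
}

console.log(escapeXmlDocContent("Format is UTC &plus; offset (e.g., &plus;05:30)"));
// Format is UTC + offset (e.g., +05:30)
```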

Changes Made

  • Modified generators/csharp/codegen/src/ast/core/XmlDocWriter.ts:
    • Added decodeHtmlEntities() method with comprehensive entity mapping
    • Integrated into escapeXmlDocContent() pipeline
  • Added test fixture csharp-xml-entities to verify the fix
  • Generated seed output demonstrates HTML entities are properly decoded while valid XML entities are preserved

Testing

  • Seed tests pass with new csharp-xml-entities fixture
  • Generated C# code compiles successfully
  • Verified HTML entities (&plus;, &minus;, &times;, &divide;, &nbsp;, &hellip;, &middot;, &copy;) are decoded to Unicode
  • Verified valid XML entities (&lt;, &gt;, &amp;) are preserved correctly

Review Focus

Critical to verify:

  1. Order of operations: HTML entity decoding happens BEFORE XML escaping (see line 117 in XmlDocWriter.ts). Reversing this would break the fix.
  2. Entity map completeness: The map includes ~140 entities covering common mathematical, typographical, Greek, and special characters. Edge cases with obscure entities should fall back to numeric character reference handling.
  3. Generated output: Review seed/csharp-sdk/csharp-xml-entities/no-custom-config/src/SeedCsharpXmlEntities/Types/TimeZoneModel.cs to see the entities properly decoded in the final output.
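
A small sketch of why the order in point 1 matters. The two functions are hypothetical stand-ins for the real pipeline steps. The sketch assumes an escaper that rewrites every `&` that does not start one of the five predefined entities; the real escaper may behave differently.

```typescript
// Hypothetical stand-in for the decoding step (one entity only).
const decodeStep = (s: string): string => s.split("&plus;").join("+");

// Hypothetical stand-in for the XML escaping step.
const escapeStep = (s: string): string =>
    s
        .replace(/&(?!(?:lt|gt|amp|quot|apos);)/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");

const rawDoc = "UTC &plus; offset";

// Intended order: the entity is decoded before the escaper sees it.
const decodeThenEscape = escapeStep(decodeStep(rawDoc));

// Reversed order: the escaper rewrites the leading "&", so the decoder finds no match.
const escapeThenDecode = decodeStep(escapeStep(rawDoc));

console.log(decodeThenEscape); // UTC + offset
console.log(escapeThenDecode); // UTC &amp;plus; offset
```

Under this assumption the reversed pipeline still emits well-formed XML, but the rendered documentation would show the literal text `&plus;` instead of `+`.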

Performance note: The implementation uses replaceAll for each entity in a loop. For typical documentation strings this is negligible, but it's worth noting for very large doc strings.
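
If the per-entity loop ever became a concern, one possible alternative (not what this PR implements) is a single regular-expression pass that looks each matched name up in a table. The names `ENTITY_BY_NAME` and `decodeNamedEntitiesSinglePass` are hypothetical, and the table is abbreviated.

```typescript
// Hypothetical, abbreviated lookup table keyed by entity name (without "&" and ";").
const ENTITY_BY_NAME = new Map<string, string>([
    ["plus", "+"],
    ["minus", "\u2212"],
    ["times", "\u00D7"],
    ["hellip", "\u2026"],
]);

// Scans the text once; names missing from the table are left unchanged.
function decodeNamedEntitiesSinglePass(text: string): string {
    return text.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (match: string, entityName: string) => {
        const decoded = ENTITY_BY_NAME.get(entityName);
        return decoded !== undefined ? decoded : match;
    });
}

console.log(decodeNamedEntitiesSinglePass("2 &times; 3 &lt; 7 &plus; &bogus;"));
// 2 × 3 &lt; 7 + &bogus;
```

This scans the text once regardless of table size, where the loop scans it once per entity. Because each match is replaced exactly once, the output of one replacement is never re-read as input to another.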

Fixes bug where HTML entities like &plus;, &minus;, &times;, etc. in API
descriptions were being copied directly into C# XML documentation comments,
causing compilation errors. XML only recognizes 5 predefined entities
(&lt;, &gt;, &amp;, &quot;, &apos;), so other HTML entities are invalid.

The fix decodes HTML entities to their actual Unicode characters before
writing them to XML documentation, preventing invalid XML entity errors
while preserving valid XML entities.

Added test fixture csharp-xml-entities to verify the fix works correctly.

Co-Authored-By: judah@buildwithfern.com <jsklan.development@gmail.com>
fern-support merged commit 96ca948 into main on Nov 10, 2025
498 of 500 checks passed
fern-support deleted the devin/1762542000-fix-csharp-xml-entity-bug branch on November 10, 2025 20:16