Skip to content

Fix openapi schema xml comments handling for referenced schemas #62213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

desjoerd
Copy link
Contributor

@desjoerd desjoerd commented Jun 2, 2025

Fix openapi schema xml comments handling for nested schemas

Fix applying annotations from xml on properties which would apply to referenced schemas.

Description

Changes:

  • Change the order of applying XML comments on schemas to make sure the property comment is leading (for applying the annotation on the reference)
  • Don't apply property comments on referenced schemas because this causes issues when a schema is referenced multiple times, it would get the description from the last property.

Fixes #61965

@github-actions github-actions bot added the needs-area-label Used by the dotnet-issue-labeler to label those issues which couldn't be triaged automatically label Jun 2, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jun 2, 2025
@martincostello martincostello added feature-openapi area-minimal Includes minimal APIs, endpoint filters, parameter binding, request delegate generator etc and removed needs-area-label Used by the dotnet-issue-labeler to label those issues which couldn't be triaged automatically labels Jun 2, 2025
@martincostello
Copy link
Member

Did you mean to include 14dd22b in this PR?

@desjoerd
Copy link
Contributor Author

desjoerd commented Jun 2, 2025

No, I am planning to rebase/cherry-pick when #62212 is merged. Otherwise I cannot run the unit tests 😅. I marked it as draft because it's also not completely correct at the moment. There needs to be some extra conditions to check whether it's a reference or not (to handle the two different descriptions, on the property with reference and on the schema in components).

I currently only get either the correct description on the reference and wrong on the schema or the other way around.

I am open to have a call about it, maybe tomorrow? Or to continue the discussion here or on the referenced issue.

@desjoerd
Copy link
Contributor Author

desjoerd commented Jun 3, 2025

I've done some more fiddling. With the current ordering of applying schema transformers and resolving references it's as far as I can see impossible to give a different description to the Schema Reference and the Schema itself.

Current behavior for the following classes:

/// <summary>
/// An address.
/// </summary>
public class AddressNested
{
    public string Street { get; set; }
}

public class Company
{
    /// <summary>
    /// Billing address.
    /// </summary>
    public AddressWithSummary BillingAddressClassWithSummary { get; set; }
    
    /// <summary>
    /// Visiting address.
    /// </summary>
    public AddressWithSummary VisitingAddressClassWithSummary { get; set; }
}
  1. Apply schema transformer on Company
  2. Apply schema transformer on AddressWithSummary with the PropertyInfo of BillingAddressClassWithSummary
  3. Apply schema transformer on AddressWithSummary with the PropertyInfo of VisitingAddressClassWithSummary
  4. Recursivly resolve schema references.

If we would copy the description from the AddressWithSummary schema it is either the description from the class or the last property referencing this schema.

I think the optional solution should be:
BillingAddress Property -> SchemaReference with description BillingAddress
VisitingAddress Property -> SchemaReference with description Visiting Address
AddressWithSummary Schema -> Schema with description An Address

Without changing the flow of applying schema transformers and resolving schema references we currently have to maybe go for:

if schema has metadata `x-schema-reference-id`:
    set description based on class
else
    if property has description
        set description based on property comment
    else
        set description based on class

This would mean that property comments are not used for schemas which become references, which in my opinion is the most "valid" option without changing the order.

If we would change the order I propose to go for something like:

  1. Resolve Schema References on Company
  2. Apply schema transformer on Company
    2.1. XmlCommentTransformer Set description on company schema based on class
    2.2. XmlCommentTransformer Apply descriptions on all properties of Company (which are already references)
  3. Apply schema transformer on AddressWithSummary
    2.1. XmlCommentTransformer Set description on address schema based on class
    2.2. XmlCommentTransformer Apply descriptions on all properties of Address which is the Street property

@desjoerd
Copy link
Contributor Author

For now I fixed the issue of having two schema's inherit the last property comment as a description by excluding referenced schemas from getting the description from the property. This can be seen as a downgrade in functionality, but it would otherwise give wrong output.

A schema transformer doesn't get a hold of the actual SchemaReference itself but after transforming it get's resolved to a reference in the OpenApiSchemaService.ResolveReferenceForSchema. To support getting the description on the SchemaReference it requires a very delicate order of the schema transformers. This would be quite a big change.

The SchemaReference description should also ONLY come from the property and not from the type, as that would give duplicate descriptions and a noisy openapi specification.

@desjoerd desjoerd marked this pull request as ready for review July 31, 2025 14:46
@Copilot Copilot AI review requested due to automatic review settings July 31, 2025 14:46
@desjoerd desjoerd requested review from captainsafia and a team as code owners July 31, 2025 14:46
Copilot

This comment was marked as outdated.

@@ -516,9 +516,6 @@ await SnapshotTestHelper.VerifyOpenApi(compilation, document =>
Assert.Equal("This class validates the behavior for mapping\ngeneric types to open generics for use in\ntypeof expressions.", genericParent.Description, ignoreLineEndingDifferences: true);
Assert.Equal("This property is a nullable value type.", genericParent.Properties["id"].Description);
Assert.Equal("This property is a nullable reference type.", genericParent.Properties["name"].Description);
Assert.Equal("This property is a generic type containing a tuple.", genericParent.Properties["taskOfTupleProp"].Description);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to remove these assertions as the schema is a reference and the comment is from a property.

Other solution/fix would be to inline these schemas by adding a CreateSchemaReferenceId which returns null for Tuples.

@desjoerd desjoerd changed the title Fix/openapi schema xml comments ordering Fix openapi schema xml comments handling for referenced schemas Jul 31, 2025
@desjoerd
Copy link
Contributor Author

@captainsafia, let me know if I need to split this PR into "the fix" and "the extra openapi snapshot tests" which I've added. I mostly added them so I can review the documents themselves. It's maybe making this PR bigger and harder to review and merge.

@captainsafia
Copy link
Member

@desjoerd I wonder if we should consider an approach that doesn't rely on encapsulating the behavior entirely in the transformer for now. Since OpenAPI 3.1 permits any keywords adjacent to a $ref we could update the implementation of the XML transformer to support setting property-specific descriptions in metadata.

Then in the reference resolution flow for properties (ref) we can add some special handling that plucks the property-level descriptions from metadata and sets them alongisde the references that are generated there. In 3.1, the following:

var schemaReference = new OpenApiSchemaReference(referenceId, workingDocument);
schemaReference.Description = originalSchema.Metadata["x-xml-description"];

Should produce:

{
  "description": "Description from x-xml-description metadata",
  "$ref": "#/components/schemas/TargetSchema"
}

@desjoerd
Copy link
Contributor Author

desjoerd commented Aug 3, 2025

@captainsafia it works but it would mean creating x-reference-<annotation> for all annotations possible on a reference (https://json-schema.org/understanding-json-schema/reference/annotations#annotations).

I created a PR on top of this one, very quick and dirty, but for the gist: desjoerd#11

I think the issue is also present for [Attributes] in

if (context.PropertyInfo is { AttributeProvider: { } attributeProvider })
{
if (attributeProvider.GetCustomAttributes(inherit: false).OfType<ValidationAttribute>() is { } validationAttributes)
{
schema.ApplyValidationAttributes(validationAttributes);
}
if (attributeProvider.GetCustomAttributes(inherit: false).OfType<DefaultValueAttribute>().LastOrDefault() is DefaultValueAttribute defaultValueAttribute)
{
schema.ApplyDefaultValue(defaultValueAttribute.Value, context.TypeInfo);
}
if (attributeProvider.GetCustomAttributes(inherit: false).OfType<DescriptionAttribute>().LastOrDefault() is DescriptionAttribute descriptionAttribute)
{
schema[OpenApiSchemaKeywords.DescriptionKeyword] = descriptionAttribute.Description;
}
}
because that is also applied on all referenced schemas. When a schema is used multiple times in different properties it would override the description, but it should actually be on the reference.

The moment we add all x-reference-<annotation> we enable modifying the Schema Reference from schema transformers, but it's not a nice API like OpenApiSchemaReference.

I think the choice is between exposing the attributes of OpenApiSchemaReference by using Metadata vs some other way. The other way is harder to do because that would require a .NET Api change instead of a behavioral api change.

@martincostello maybe you have some ideas as well?

@desjoerd
Copy link
Contributor Author

desjoerd commented Aug 3, 2025

I've been thinking, the options I can think of are:

  1. Transform IJsonSchema instead of the actual class which is implemented by both the schema and the reference. It's binary incompatible but it should not affect new compiled code as it has the same properties.

  2. Add the JsonSchemaReference to the context Schema Transformer Context.

  3. Work with Metadata and hoist it over to the reference when resolving.

As a library author and customizer of my OpenApi I would prefer to be able to do almost everything from transformers or the exposed Api via the OpenApi Options. Things like applying data annotations are now internal, but I think they could be done as Schema Transformers. Giving the option also to disable DataAnnotations support for Openapi (for example when using Fluent Validations).

(ps: braindump from my phone)

@captainsafia
Copy link
Member

Yeah, I think in an ideal world we would expand the API for schema transformers to support application on both target schemas and schema references but that would require some considerable API work and likely wouldn't land in .NET 10 at this point. I figure that special casing handling for summary/description on XML comments might be a suitable tactical change.

I do like the idea of shifting more things out of the JsonNode layer to the OpenApiSchema -- particular the validation attributes you mentioned. I could see a model where these defaults are registered as transformers that users can add/remove from their OpenApiOptions.

With regard the the options presented, I considered #1 but wasn't sure if that would be confusing considering there is a legacy of transformers/filters/processors applying on raw schemas and not references. However, that decision does predate the existence of OpenApiSchemaReference.

#2 feels doable. I imagine the new property would be nullable. When non-null it indicates the transformer is applying on a reference. In the implementation, we would have two passes of transformers, on the target and the reference. This one seems doable to fit in for .NET 10 and might solve this problem.

(ditto: brain dump from my phone 😅)

@desjoerd
Copy link
Contributor Author

desjoerd commented Aug 4, 2025

For Option 1: I forgot that only the get is exposed on IOpenApiSchema. So that would be a breaking change for all existing transformers. Requiring a typecheck and cast before transforming. So I think this is unwanted.

Maybe then Option 2 is best. Only weird thing is that it is a property on the Context (which should be treated as read-only), I've seen some confusion around this. Could also be an extra parameter (schema, schemaReference, context) => but that is also a breaking change, but less impactfull then Option 1.
On the multiple passes, it's one extra as there is already a transform for each property, only not for the schema treated as a root itself. In theory for this instance it would not even need an extra pass on the schema alone.

I think we can do either one of the options as a follow up PR. And treat this PR as a mitigation for modifying the referenced schema from the property comments.

I think the follow ups would be:

  • Check for Attributes as well whether the same problem is there for referenced schemas (inheriting comments & metadata from properties where the last one wins)
  • Extend the api (or use metadata) to allow customizing references
  • Enhance the openapi references based on the property metadata.

@captainsafia
Copy link
Member

Only weird thing is that it is a property on the Context (which should be treated as read-only), I've seen some confusion around this. Could also be an extra parameter (schema, schemaReference, context) => but that is also a breaking change, but less impactfull then Option 1.

Another option is to introduce a completely new transformer type IOpenApISchemaReferenceTransformer with it's own context object and transform signature and avoid clobbering over the existing one.

I think the follow ups would be:

Agree with this ordering. For 2, we might have to rely on the metadata approach to start given the need to do API review, especially with RC1 snap coming closer.

@desjoerd
Copy link
Contributor Author

desjoerd commented Aug 4, 2025

I thought so in regards to Api changes.

I was also thinking about a Schema Reference Transformer. I do not know whether it is allowed to have an Api change so late in this stage, else it's for .NET 11.

Tomorrow evening (CEST) I've got some time and will add the schema "Metadata" approach in this PR and do some checks on the Description attribute. Then it will be correct and have the ability to set the description for references.

Only note is for the Metadata approach that people (like me 😅) should not depend on it because it will change when adding a "neat" way. But then again, adding annotations on references is something which is OpenApi 3.1 specific. In Openapi 3.0 a ref completely replaces the whole object.

@captainsafia
Copy link
Member

Only note is for the Metadata approach that people (like me 😅) should not depend on it because it will change when adding a "neat" way.

Yeah, admittedly, we don't do anything very aggressive to guard against people taking a dependency on the values we populate into Metadata. They're just harder to discover because they are never serialized. We may want to put a note in the docs somewhere that populated metadata is an implementation detail and not part of the public API contract.

…d Update tests to write in the InvariantCulture
Update handling of nested schemas and referenced schemas
Update XmlCommentGenerator.Emitter.cs to skip applying property descriptions when it's a schema reference.

Schema transformers get the full schema and not the schema reference. So when the description is updated it would update it for all locations.
@desjoerd desjoerd force-pushed the fix/openapi-schema-xml-comments-ordering branch from 7b131bb to 81df677 Compare August 5, 2025 13:18
@desjoerd desjoerd force-pushed the fix/openapi-schema-xml-comments-ordering branch from d781b57 to c6f4f23 Compare August 5, 2025 16:43
@desjoerd
Copy link
Contributor Author

desjoerd commented Aug 5, 2025

@captainsafia I've pushed my changes.

  • Removed the openapi.json snapshots
  • Add metadata properties for the reference description and example
  • Added unit tests and reverted some removed assertions as eveything arounc XML comments should be working.

Remarks:
I noticed that for the [Description] attribute we've got the same problem around OpenApiReferences (where the last property Description wins).

Also with the current ordering of applying comments and the [Description] attribute would mean that the XML comments win. Where I would say it should be the [Description] attribute should win.

I would say, let's address these in follow ups as well.

@desjoerd desjoerd requested a review from Copilot August 5, 2025 21:06
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an issue with OpenAPI XML comment handling for referenced schemas. The main problem was that property comments were incorrectly applied to referenced schemas, causing conflicts when the same schema was referenced multiple times.

Key changes:

  • Reordered XML comment application to prioritize type comments over property comments for referenced schemas
  • Added logic to differentiate between inlined schemas and schema references
  • Implemented metadata-based approach to store property comments for schema references

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
XmlCommentGenerator.Emitter.cs Modified schema transformer to handle referenced vs inlined schemas differently
OpenApiDocumentExtensions.cs Added logic to apply stored metadata as description/examples on schema references
OpenApiSchemaService.cs Updated validation attribute application to only apply to inlined schemas
OpenApiConstants.cs Added constants for reference description and example annotations
SchemaTests.cs Added comprehensive test for XML comments on schema references
Various snapshot files Updated test snapshots to reflect the new XML comment handling logic

@@ -101,7 +101,8 @@ internal sealed class OpenApiSchemaService(
{
schema.ApplyNullabilityContextInfo(jsonPropertyInfo);
}
if (context.PropertyInfo is { AttributeProvider: { } attributeProvider })
var isInlinedSchema = schema["x-schema-id"] is null;
if (isInlinedSchema && context.PropertyInfo is { AttributeProvider: { } attributeProvider })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is isInlinedSchema added here? I'm slightly surprised this doesn't break any tests since this would mean that we wouldn't apply attributes to properties that are referenced by ref. A scenario like the following:

public class ComplexType
{
  [Description("FooBarBaz")]
  public AnotherComplexType PropertyName { get; }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was suprised as well that the Unit Tests where still passing. I put this in by accident for the follow-up PR to fix the Description annotations.

They are also applied to the Actual Schemas (the last property description wins).

I will revert this to keep it focused.

Copy link
Contributor Author

@desjoerd desjoerd Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted. The follow up will be something like desjoerd#12 (need to add some tests which would fail now :)).

@desjoerd
Copy link
Contributor Author

desjoerd commented Aug 7, 2025

@captainsafia I've reverted the change on the schema service, will be part of a follow-up 😎 (this one: desjoerd#12).

@desjoerd desjoerd requested a review from captainsafia August 7, 2025 10:31
@captainsafia captainsafia merged commit 8d2a495 into dotnet:main Aug 7, 2025
29 checks passed
@dotnet-policy-service dotnet-policy-service bot added this to the 10.0-rc1 milestone Aug 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-minimal Includes minimal APIs, endpoint filters, parameter binding, request delegate generator etc community-contribution Indicates that the PR has been added by a community member feature-openapi
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[OpenApi] .NET 10, XML schema transformer applies property comments on referenced schema
3 participants