Wrong summary for generated regex #114889

Closed
@pawchen

Description

    [GeneratedRegex(@"\b(ifn?def|endif)?")]
    private static partial Regex MyRegex1 { get; }

The generated summary should say that the pattern matches the string `if` before the optional `n`, but instead it says "Match an empty string."

[Image: screenshot of the generated summary]

I examined the generated matching code and it did look for the 'i' then the 'f', so it appears only the summary was wrong.
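For reference, here is a small sketch of how the pattern behaves at run time. It builds the regex with `new Regex(...)` rather than `[GeneratedRegex]`, on the assumption that both engines agree for this pattern. `PatternBehaviorDemo` and `FirstMatch` are names made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternBehaviorDemo
{
    // Same pattern as in the report, constructed at run time
    // instead of through the source generator (assumption: same behavior).
    private static readonly Regex Pattern = new Regex(@"\b(ifn?def|endif)?");

    // Returns the text of the first match. Because the group is optional,
    // a successful match can be the empty string; a failed match also
    // yields "" here, since Match.Value is empty when Success is false.
    public static string FirstMatch(string input)
    {
        Match m = Pattern.Match(input);
        return m.Value;
    }
}
```

`FirstMatch("ifdef FOO")` returns `ifdef`, `FirstMatch("ifndef FOO")` returns `ifndef`, and `FirstMatch("other")` returns the empty string because the optional group matches nothing after the word boundary. So "Match an empty string" describes only one of the possible outcomes, not the whole pattern.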

Reproduction Steps

    [GeneratedRegex(@"\b(ifn?def|endif)?")]
    private static partial Regex MyRegex1 { get; }

Expected behavior

The summary should describe what the pattern actually matches.

Actual behavior

The summary says "Match an empty string." even though the generated matching code looks for `if` first.

Regression?

No response

Known Workarounds

Don't look at it?

Configuration

VS 17.13.6
.NET 9

Other information

No response

Activity

dotnet-policy-service (Contributor) commented on Apr 22, 2025

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

steveharter (Contributor) commented on Apr 25, 2025

pawchen (Author) commented on Apr 25, 2025

@steveharter I believe the source code for the generated summary is in this repo.

huoyaoyuan (Member) commented on Apr 28, 2025

Yes, the summary is generated by the Regex source generator. /cc @stephentoub

self-assigned this on Apr 28, 2025
added and removed the `untriaged` label (New issue has not been triaged by the area owner) on Apr 28, 2025

stephentoub (Member) commented on Apr 28, 2025

The problem is here:

    case RegexNodeKind.Concatenate when child.Child(0) == startingLiteralNode && (startingLiteralNode.Kind is RegexNodeKind.One or RegexNodeKind.Set or RegexNodeKind.Multi):
        // This is a concatenation where its first node is the starting literal we found and that starting literal
        // is one of the nodes above that we know how to handle completely. This is a common
        // enough case that we want to special-case it to avoid duplicating the processing for that character
        // unnecessarily. So, we'll shave off that first node from the concatenation and then handle the remainder.
        // Note that it's critical startingLiteralNode is something we can fully handle above: if it's not,
        // we'll end up losing some of the pattern due to overwriting `remainder`.
        remainder = child;
        child = child.Child(0);
        remainder.ReplaceChild(0, new RegexNode(RegexNodeKind.Empty, remainder.Options));
        goto HandleChild; // reprocess just the first node that was saved; the remainder will then be processed below

The emitter is mutating the regex node tree, for convenience and under the assumption that after this point no one else will be looking at the tree. But the XML comment generator does.
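A minimal sketch of this hazard, using a hypothetical `Node` type rather than the real `RegexNode` API. `Describe` stands in for the XML comment generator and `TakeFirstDestructively` for the emitter step quoted above. All names here are illustrative assumptions, not the source generator's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a regex parse-tree node; NOT the real RegexNode type.
public sealed class Node
{
    public string Kind { get; }
    public string Text { get; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string kind, string text = "", params Node[] children)
    {
        Kind = kind;
        Text = text;
        Children.AddRange(children);
    }
}

public static class TreeMutationDemo
{
    // Stand-in for the comment generator: it only reads the tree.
    public static string Describe(Node node) => node.Kind switch
    {
        "Empty" => "Match an empty string.",
        "Literal" => $"Match '{node.Text}'.",
        _ => string.Join(" ", node.Children.Select(c => Describe(c))),
    };

    // Stand-in for the emitter: it peels the first child off a concatenation
    // and overwrites that slot with an Empty node, mutating the shared tree.
    public static Node TakeFirstDestructively(Node concat)
    {
        Node first = concat.Children[0];
        concat.Children[0] = new Node("Empty");
        return first;
    }

    // Builds a tiny tree: Concatenate(Literal "if", Literal "def").
    public static Node BuildTree() =>
        new Node("Concatenate", "", new Node("Literal", "if"), new Node("Literal", "def"));
}
```

Calling `Describe` on the tree from `BuildTree()` before `TakeFirstDestructively` yields `Match 'if'. Match 'def'.`, while calling it afterwards yields `Match an empty string. Match 'def'.`, which mirrors the symptom reported here. In a sketch like this, describing the tree before any destructive pass runs, or running that pass on a copy, would avoid the problem. Whether either matches the actual fix is not stated in this thread.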

locked and limited conversation to collaborators on Jun 17, 2025
Participants

@pawchen, @stephentoub, @huoyaoyuan, @steveharter

Wrong summary for generated regex · Issue #114889 · dotnet/runtime