Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Obsolete some protected members of Regex{Runner} #62573

Closed
stephentoub opened this issue Dec 9, 2021 · 8 comments · Fixed by #84812
Closed

Obsolete some protected members of Regex{Runner} #62573

stephentoub opened this issue Dec 9, 2021 · 8 comments · Fixed by #84812
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.RegularExpressions breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet
Milestone

Comments

@stephentoub
Copy link
Member

stephentoub commented Dec 9, 2021

Regex and RegexRunner have a relatively large protected surface area in support of the old CompileToAssembly, which is now obsolete. Some of that functionality is used by code generated by the source generator, but some of the surface area is completely defunct, and some was never used (who knows why it was exposed initially).

namespace System.Text.RegularExpressions;

public class Regex
{
+   [Obsolete(...)]
    protected void InitializeReferences();

+   [Obsolete(...)]
    protected bool UseOptionC()

+   [Obsolete(...)]
    protected bool UseOptionR();
}

public abstract class RegexRunner
{
+   [Obsolete(...)]
    protected Match? Scan(Regex regex, string text, int textbeg, int textend, int textstart, int prevlen, bool quick) => Scan(regex, text, textbeg, textend, textstart, prevlen, quick, regex.MatchTimeout);

+   [Obsolete(...)]
    protected internal Match? Scan(Regex regex, string text, int textbeg, int textend, int textstart, int prevlen, bool quick, TimeSpan timeout)

+   [Obsolete(...)]
    protected static bool CharInSet(char ch, string set, string category);
}

I would like to also make that last CharInSet not only obsolete but also throw new NotSupportedException(). This was never used by the implementation, and we've never generated assets that use whatever format it's trying to support.

@stephentoub stephentoub added this to the 7.0.0 milestone Dec 9, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Dec 9, 2021
@ghost
Copy link

ghost commented Dec 9, 2021

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

Regex and RegexRunner have a relatively large protected surface area in support of the old CompileToAssembly, which is now obsolete. Some of that functionality is used by code generated by the source generator, but some of the surface area is defunct (at least for external consumption... some is still used internally). We should consider obsoleting:
RegexRunner.CharInSet
RegexRunner.Crawl
RegexRunner.DoubleCrawl
RegexRunner.DoubleStack
RegexRunner.DoubleTrack
RegexRunner.EnsureStorage
RegexRunner.Scan
Regex.UseOptionC
Regex.UseOptionR

Author: stephentoub
Assignees: -
Labels:

area-System.Text.RegularExpressions

Milestone: 7.0.0

@GrabYourPitchforks
Copy link
Member

If there's any way to remove the RegexRunner.runtext* fields, that would help as well. I've been looking at possible codegen optimizations within the Regex type itself, but they're stymied by the fact that those fields are accessible by external code and thus can't be refactored in a way that would improve throughput.

@stephentoub
Copy link
Member Author

stephentoub commented May 7, 2022

If there's any way to remove the RegexRunner.runtext* fields, that would help as well.

Any in particular?

These are used by external Regex implementations, those created by CompileToAssembly and the source generator. Only some of them are used by the source generated implementations, but even if we found a different way to pass that state in and around, removing these fields would break any existing assemblies generated by CompileToAssembly.

That said, CompileToAssembly is obsoleted as of .NET 7, so we're talking about compat with assemblies generated on .NET Framework. At some point in the future, we could consider dropping that compat support.

We contemplated having the new Scan virtual method accept and return all the necessary state, but I'm not sure that would be a net win. It would also be complicated, e.g. various helpers like Capture need access to it.

Suggestions?

@GrabYourPitchforks
Copy link
Member

Suggestions?

Honestly, I don't know. I'm mostly lamenting that the int fields are true fields rather than properties, since it's preventing me from moving the actual storage to a more optimal location. Maybe it's one of those things that we just accept as overhead for now and revisit if we ever make more radical changes to Regex in the future.

@stephentoub
Copy link
Member Author

@joperezr, were you going to push on this for .NET 7?

@joperezr
Copy link
Member

Sorry @stephentoub I missed the ping. Unless you think otherwise, since this is just obsoleting I'm fine with doing this on 8.0

@joperezr joperezr modified the milestones: 7.0.0, 8.0.0 Jul 18, 2022
@stephentoub stephentoub self-assigned this Apr 1, 2023
@stephentoub stephentoub added the api-ready-for-review API is ready for review, it is NOT ready for implementation label Apr 13, 2023
@bartonjs bartonjs added the breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. label Apr 13, 2023
@ghost ghost added the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Apr 13, 2023
@ghost
Copy link

ghost commented Apr 13, 2023

Added needs-breaking-change-doc-created label because this issue has the breaking-change label.

  1. Create and link to this issue a matching issue in the dotnet/docs repo using the breaking change documentation template, then remove this needs-breaking-change-doc-created label.

Tagging @dotnet/compat for awareness of the breaking change.

@bartonjs
Copy link
Member

bartonjs commented Apr 13, 2023

Video

Because the proposal mentions removing the body of CharInSet, I've marked it as a breaking change.

Since CharInSet is having its body removed, it probably warrants obsolete-as-error, the rest being obsolete-as-warning.

Obsoletions should just use the next available SYSLIB code. (The same code can apply to all of them, even though it's mixing warning and error)

EditorBrowsable-Never is also fine if it makes you happy.

namespace System.Text.RegularExpressions;

public class Regex
{
+   [Obsolete(...)]
    protected void InitializeReferences();

+   [Obsolete(...)]
    protected bool UseOptionC()

+   [Obsolete(...)]
    protected bool UseOptionR();
}

public abstract class RegexRunner
{
+   [Obsolete(...)]
    protected Match? Scan(Regex regex, string text, int textbeg, int textend, int textstart, int prevlen, bool quick) => Scan(regex, text, textbeg, textend, textstart, prevlen, quick, regex.MatchTimeout);

+   [Obsolete(...)]
    protected internal Match? Scan(Regex regex, string text, int textbeg, int textend, int textstart, int prevlen, bool quick, TimeSpan timeout)

+   [Obsolete(...)]
    protected static bool CharInSet(char ch, string set, string category);
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Apr 13, 2023
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 14, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 20, 2023
@dotnet dotnet locked as resolved and limited conversation to collaborators May 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.RegularExpressions breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants