New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overhaul how regex generated code is structured #66432
Conversation
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions Issue DetailsFor every regex, we currently emit a private type next to the partial member we're implementing, and that private type contains all the code for the regex, including all the helpers it needs. There are multiple issues with this:
With this PR, we instead move all of the types into a separate namespace and parent class, which then enables a) having usings inside the namespace declaration so that we can avoid fully-qualifying everything and b) separating out all of the helper methods into a class that's then private to the parent wrapper class, shielding it from the rest of the assembly. Hopefully C# 11 will see a solution to dotnet/csharplang#5529, at which point we can use that to hide all of the separated out generated code other than the partial method definitions (with the changes in this PR, doing so should involve minimal tweaks to the code generation). Until then, we still have a similar problem as we do today for the generated Regex-derived types being visible to other code in the assembly, but at least the shared helpers aren't.
|
// We emit usings here now that we're inside of a namespace block and are no longer emitting code into | ||
// a user's partial type. We can now rely on binding rules mapping to these usings and don't need to | ||
// use global-qualified names for the rest of the implementation. | ||
writer.WriteLine($" using System;"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a new pragma warnings in the headers though? I believe it is common for people to add stylecop rules that require using statements to be outside of a namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such rules by default shouldn't apply to generated code. But even if they did, I don't want us to add pragmas for warnings that aren't in the .NET SDK; otherwise we open the doors to an unending laundry list of arbitrary 3rd-party analyzers we'd need to start considering.
Note there is a hidden diagnostic issued for this by the C# compiler (CS8019), but pragmas appear to be ineffective at disabling it. @sharwell may know why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CS8019
This is a hidden diagnostic used for fading out unnecessary usings in the IDE. It is intentionally reported even in generated files, and will not impact builds. IDE0005 is the complementary rule that is only reported in non-generated files, and might be raised to warning/error in builds.
Do we need a new pragma warnings in the headers though?
As long as you use a hint name ending in .g.cs or have // <auto-generated/>
as the first line of the file, it will be treated as generated code and code style analyzers will not run in the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
It is intentionally reported even in generated files
I was surprised that they were still issued even with:
#pragma warning disable CS8019
at the top of the file. Is that expected?
As long as you use a hint name ending in .g.cs or have // as the first line of the file
We do both.
src/libraries/System.Text.RegularExpressions/gen/RegexGenerator.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Text.RegularExpressions/gen/RegexGenerator.cs
Outdated
Show resolved
Hide resolved
57d603a
to
bfb1e4b
Compare
private static readonly string s_generatedCodeAttribute = $"[global::System.CodeDom.Compiler.GeneratedCodeAttribute(\"{typeof(RegexGenerator).Assembly.GetName().Name}\", \"{typeof(RegexGenerator).Assembly.GetName().Version}\")]"; | ||
/// <summary>Header comments and usings to include at the top of every generated file.</summary> | ||
private static readonly string[] s_headers = new string[] | ||
/// <summary>Emits the definition of the partial method. This method just delegates to the property cache on the generated Regex-derived type.</summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Smal NIT here, should we take the namespace and type declaration out of this method? My question is basically that if you have one partial type which has two source generated methods, it would be nice if the generated code only has one namespace and type declaration which contains both of the generated methods that fetch the instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds reasonable but will require some additional changes, and this was preexisting these changes. We can do it subsequently. Though it will add additional expense to figure out all the groupings... I'm not sure if it's really worth it to avoid the extra partial definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left some comments but this looks great, thanks @stephentoub
bfb1e4b
to
fc06231
Compare
Fixes #63502
Fixes #63512
For every regex, we currently emit a private type next to the partial member we're implementing, and that private type contains all the code for the regex, including all the helpers it needs. There are multiple issues with this:
With this PR, we instead move all of the types into a separate namespace and parent class, which then enables a) having usings inside the namespace declaration so that we can avoid fully-qualifying everything and b) separating out all of the helper methods into a class that's then private to the parent wrapper class, shielding it from the rest of the assembly.
Hopefully C# 11 will see a solution to dotnet/csharplang#5529, at which point we can use that to hide all of the separated out generated code other than the partial method definitions (with the changes in this PR, doing so should involve minimal tweaks to the code generation). Until then, we still have a similar problem as we do today for the generated Regex-derived types being visible to other code in the assembly, but at least the shared helpers aren't.
cc: @joperezr, @chsienki