Trim Zero White Space #609
@@ -125,6 +125,7 @@ public class SourceGenerator : ISourceGenerator

         private const string NativeMethodsTxtAdditionalFileName = "NativeMethods.txt";
         private const string NativeMethodsJsonAdditionalFileName = "NativeMethods.json";
+        private static readonly char[] ZeroWhiteSpace = new char[] { '\uFEFF', '\u200B' };
The first one is the BOM, the second one is a zero width space.
They are ZERO WIDTH SPACE (U+200B) and ZERO WIDTH NO-BREAK SPACE (U+FEFF). Should I add a comment documenting each one? Also, should I only be checking for ZERO WIDTH SPACE (U+200B)?
Yes, comments with these meanings would be great.
That said, I would expect our TextReader to remove the BOM. Is it not doing that?
And I wonder why we'd want to tolerate other strange unicode characters. We already have code elsewhere in the source generator to detect any non-ANSI character and emit a warning (or error maybe) so the user can correct it. This change looks like it would just allow these characters to creep in. Is that really a win?
We ran into it while dealing with a merge conflict via GitHub. I think it would be helpful to sanitize the inputs as best as possible. A warning instead of an error would then pick up the ZWSP for fixing. I don't think it should stop the build from compiling, but it should probably let the user know it's there as well.
Edit: this is kind of what I mean https://en.wikipedia.org/wiki/Robustness_principle
I can also see your point on the ZWSP. But the code already runs a Trim() against the input, which is pretty similar, so ActivateKeyboardLayout will work.
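For context on the Trim() point: on .NET Framework 4 and later, the parameterless string.Trim() removes only characters for which char.IsWhiteSpace returns true, so it leaves U+200B and U+FEFF in place. The sketch below shows one way the two trims could be combined, with the comments requested in the review. The CleanName helper and the sample input are hypothetical and are not taken from the PR.

```csharp
using System;

public static class ZeroWidthTrimExample
{
    // U+FEFF: ZERO WIDTH NO-BREAK SPACE, also used as the byte order mark (BOM).
    // U+200B: ZERO WIDTH SPACE.
    private static readonly char[] ZeroWhiteSpace = new char[] { '\uFEFF', '\u200B' };

    // Hypothetical helper: trim ordinary white space first, then any
    // zero-width characters left at either end of the line.
    public static string CleanName(string line) => line.Trim().Trim(ZeroWhiteSpace);

    public static void Main()
    {
        // Assumed input: a NativeMethods.txt entry with a leading BOM,
        // a trailing zero width space, and trailing spaces.
        string raw = "\uFEFFActivateKeyboardLayout\u200B  ";
        string cleaned = CleanName(raw);

        // Reports whether the cleaned entry matches the plain API name.
        Console.WriteLine(cleaned == "ActivateKeyboardLayout");
    }
}
```

This sketch only handles zero-width characters that end up at the very start or end of the string after the first Trim(). An entry where white space and zero-width characters alternate would need a single Trim call over the combined character set.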
Alright, fair enough. I'll take the PR. Can you just add the comments we talked about?
Fixes #608