Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stricter Contextual Unicode Highlighting: Consider Same Script #143720

Closed
hediet opened this issue Feb 23, 2022 · 4 comments
Closed

Stricter Contextual Unicode Highlighting: Consider Same Script #143720

hediet opened this issue Feb 23, 2022 · 4 comments
Assignees
Milestone

Comments

@hediet
Copy link
Member

hediet commented Feb 23, 2022

This doesn't look like a good and secure implementation to me.
Consider the following code:

if (variable_א || variable_ב || vаriable_ג) {
    admin = true;
}

As far as I understand, nothing will be highlighted after the change. But paste the code to an editor before the change and you'll see why it's bad.

I think it should only skip the highlighting if the word contains only characters from a single language.

Also, there must be a way to disable this word-based highlighting exclusion.

Originally posted by @justanotheranonymoususer in #140960 (comment)

@hediet hediet changed the title This doesn't look like a good and secure implementation to me. Stricter Contextual Unicode Highlighting: Consider Same Script Feb 23, 2022
@hediet
Copy link
Member Author

hediet commented Feb 23, 2022

Thanks for your input!

As far as I understand, nothing will be highlighted after the change.

This is correct.

This doesn't look like a good and secure implementation to me.

To be secure, we recommand configuring "editor.unicodeHighlight.nonBasicASCII": true. We don't guarantee security with the other settings, just heuristic protection against common unicode spoofing attacks.

Also, if you use non-basic ascii characters for identifiers, you are making yourself much more vulnerable for such attack.

I think it should only skip the highlighting if the word contains only characters from a single language.

Might be worth to consider.

The rule would be:

  • Highlight characters that look like ASCII or that are invisible
  • ... unless this character is in a word where at least one character cannot be confused with an ASCII character and all characters in that word have the same script

(bold is new)

@justanotheranonymoususer
Copy link
Contributor

To be secure, we recommand configuring "editor.unicodeHighlight.nonBasicASCII": true.

Is that the default? I'd prefer VSCode to be secure by default.

Also, if you use non-basic ascii characters for identifiers, you are making yourself much more vulnerable for such attack.

The code might be authored by somebody else.

The rule would be:

Sounds good.

@hediet
Copy link
Member Author

hediet commented Feb 23, 2022

Is that the default? I'd prefer VSCode to be secure by default.

This is the default for untrusted workspaces. You should only open a workspace as trusted when you can rule out malicious intent.

The code might be authored by somebody else.

Then you should reject that PR.
In your example, λ and ג might actually be confusable.

@hediet hediet self-assigned this Feb 23, 2022
@hediet
Copy link
Member Author

hediet commented Feb 24, 2022

Also see #143796.

Since this feature is all about ASCII/non-ASCII confusion, I tweaked the algorithm to only skip highlighting if all characters are non-ASCII.

@hediet hediet closed this as completed Feb 24, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Apr 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants