Reproducible in vscode.dev or in VS Code Desktop?
Reproducible in the monaco editor playground?
Monaco Editor Playground Code
monaco.editor.create(document.getElementById('container'), {
  value: '// \\u000a is a line break in Java unicode escapes\n// This comment \\u000a int[] x = {0}; // is actually code\nclass Test {\n // \\u0048\\u0065\\u006C\\u006C\\u006F\n public static void main(String[] args) {}\n}',
  language: 'java'
});
Description
Java translates Unicode escape sequences (\uXXXX) in the very first step of lexical translation, before comments and tokens are recognised (JLS §3.3). This means \u000a inside a // comment is an actual line break that ends the comment, and whatever follows it on the same line is compiled as real code. The syntax highlighter doesn't account for this, so what appears to be a comment can hide real code.
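To make the effect concrete, here is a rough sketch of that pre-pass in JavaScript. The function name `translateUnicodeEscapes` is made up for this issue and this is not javac's implementation; it only follows the rules as I understand them from the JLS (a backslash is eligible only when preceded by an even number of backslashes, and `u` may be repeated). Unlike javac, it leaves malformed escapes untouched instead of reporting an error.

```javascript
// Hypothetical helper: approximates Java's Unicode-escape translation step.
// Assumption: malformed escapes (e.g. "\u12") are copied through unchanged,
// whereas javac would reject them.
function translateUnicodeEscapes(source) {
  let out = '';
  let i = 0;
  let backslashRun = 0; // contiguous raw backslashes immediately before index i

  while (i < source.length) {
    const ch = source[i];
    if (ch === '\\') {
      // Only a backslash preceded by an even number of backslashes is eligible.
      if (backslashRun % 2 === 0) {
        let j = i + 1;
        while (source[j] === 'u') j++; // one or more 'u' characters
        const hex = source.slice(j, j + 4);
        if (j > i + 1 && /^[0-9a-fA-F]{4}$/.test(hex)) {
          out += String.fromCharCode(parseInt(hex, 16));
          i = j + 4;
          backslashRun = 0;
          continue;
        }
      }
      backslashRun++;
    } else {
      backslashRun = 0;
    }
    out += ch;
    i++;
  }
  return out;
}

// The "comment" from the playground sample, as the compiler would see it:
const line = '// This comment \\u000a int[] x = {0}; // is actually code';
console.log(translateUnicodeEscapes(line).split('\n'));
```

After translation the single source line splits into two: the comment ends at the escape, and the array declaration sits on its own line as ordinary code.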
This is a known Java "feature" that can be used to hide malicious code: https://wh0.github.io/2019/11/16/easter-egg-inspection.html
Ideally the Java tokeniser would process \uXXXX sequences the same way javac does, or at minimum flag them visually.
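As a possible starting point for the "flag them visually" option, the sketch below only locates the escapes and marks the ones that decode to a line terminator. `findUnicodeEscapes` and the shape of the returned objects are hypothetical, and nothing here touches Monaco's tokeniser; it is a standalone scan under the same eligibility assumptions as javac's rules (even number of preceding backslashes, one or more `u`, four hex digits).

```javascript
// Hypothetical helper: returns the offsets of every eligible \uXXXX escape.
// Each entry has { start, end, code, isLineTerminator } with `end` exclusive.
function findUnicodeEscapes(source) {
  const found = [];
  let backslashRun = 0; // contiguous raw backslashes immediately before index i
  let i = 0;

  while (i < source.length) {
    if (source[i] !== '\\') {
      backslashRun = 0;
      i++;
      continue;
    }
    if (backslashRun % 2 === 0) {
      let j = i + 1;
      while (source[j] === 'u') j++;
      const hex = source.slice(j, j + 4);
      if (j > i + 1 && /^[0-9a-fA-F]{4}$/.test(hex)) {
        const code = parseInt(hex, 16);
        found.push({
          start: i,
          end: j + 4,
          code,
          // \u000a (LF) and \u000d (CR) end a // comment once translated
          isLineTerminator: code === 0x0a || code === 0x0d,
        });
        i = j + 4;
        backslashRun = 0;
        continue;
      }
    }
    backslashRun++;
    i++;
  }
  return found;
}

const sample = 'int a; // x \\u000a int b; // \\u0048';
for (const e of findUnicodeEscapes(sample)) {
  console.log(sample.slice(e.start, e.end), e.isLineTerminator);
}
```

An integration could turn these offsets into editor ranges and render them as decorations, with the line-terminator ones highlighted more aggressively. That part is left out because it depends on how the maintainers would prefer to wire it in.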
Cross-reference: compiler-explorer/compiler-explorer#4223