Range analysis and useless-comparison query: don't treat all unicode surrogates as if they are U+FFFD #7239
I don't think this is the right fix. Instead I think that |
The predicate |
Force-pushed: a6ccd43 to db39c0b
👍 pushed the in-place fix instead
@@ -731,7 +743,11 @@ class CharacterLiteral extends Literal, @characterliteral {
    * this literal. The result is the same as if the Java code had cast
    * the character to an `int`.
    */
-  int getCodePointValue() { result.toUnicode() = this.getValue() }
+  int getCodePointValue() {
+    if this.getLiteral().matches("'\\u%'")
Let's make the match pattern slightly more precise.
- if this.getLiteral().matches("'\\u%'")
+ if this.getLiteral().matches("'\\u____'")
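The difference between the two patterns can be sketched outside QL. QL's `matches` uses LIKE-style wildcards, where `%` matches any sequence of characters and `_` matches exactly one. The Java snippet below is only an illustration under that assumption: `LikePatternSketch` and `like` are hypothetical names, not CodeQL APIs, and the sample literal `'\u005cn'` (which, as far as I understand, Java reads as the newline character) was chosen just to show a literal that the looser pattern accepts and the stricter one rejects.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; not part of CodeQL or of this pull request.
class LikePatternSketch {
    // Translates a LIKE-style pattern into a regular expression:
    // '%' matches any sequence of characters, '_' matches exactly one.
    static boolean like(String text, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(text).matches();
    }

    public static void main(String[] args) {
        String loose = "'\\u%'";
        String strict = "'\\u____'";
        // Source text of a surrogate literal: quote, backslash, u, four hex digits, quote.
        String surrogate = "'\\uD800'";
        // Source text in which the escape is followed by one more character.
        String longer = "'\\u005cn'";

        System.out.println(like(surrogate, loose));
        System.out.println(like(surrogate, strict));
        System.out.println(like(longer, loose));
        System.out.println(like(longer, strict));
    }
}
```

Both patterns accept the plain surrogate literal; only the looser one also accepts the longer literal, whose text after the escape is not limited to four characters.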
@@ -713,6 +713,18 @@ class DoubleLiteral extends Literal, @doubleliteral {
   override string getAPrimaryQlClass() { result = "DoubleLiteral" }
 }

+// Implementation taken from @p0 at https://github.com/github/codeql/issues/4145
No need for this comment, I think.
@aschackmull done
It's technically possible to construct examples where `getCodePointValue` still returns the wrong value 65533 (when a surrogate literal doesn't match "'\\u____'"), but these examples are extremely obscure, so I'm not sure that it's worth the effort to properly support them until we see an actual need.
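For concreteness, here is a small Java sketch of reading the code point straight from the literal's source text. It is not the implementation in this pull request: `CodePointSketch` and `codePointOfEscape` are hypothetical names. It also accepts a repeated `u` marker, which the Java Language Specification permits in Unicode escapes. Treating that as one of the obscure cases mentioned above is my assumption, not something the comment states.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the names here are illustrative, not CodeQL APIs.
class CodePointSketch {
    // The Unicode replacement character, decimal 65533.
    static final int REPLACEMENT_CHARACTER = 0xFFFD;

    // Quote, backslash, one or more 'u', exactly four hex digits, quote.
    private static final Pattern UNICODE_ESCAPE =
        Pattern.compile("'\\\\u+([0-9a-fA-F]{4})'");

    // Reads the code point from the literal's source text, or returns empty
    // when the text is not a single Unicode escape.
    static OptionalInt codePointOfEscape(String literal) {
        Matcher m = UNICODE_ESCAPE.matcher(literal);
        if (!m.matches()) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(m.group(1), 16));
    }

    public static void main(String[] args) {
        int single = codePointOfEscape("'\\uD800'").getAsInt();
        int doubled = codePointOfEscape("'\\uuD800'").getAsInt();
        System.out.println(single);  // 55296
        System.out.println(doubled); // 55296
        System.out.println(Character.isSurrogate((char) single)); // true
        System.out.println(single == REPLACEMENT_CHARACTER);      // false
        System.out.println(codePointOfEscape("'a'").isPresent()); // false
    }
}
```

Parsing the hex digits directly keeps the surrogate value 55296 (0xD800) instead of collapsing it to 65533 (0xFFFD). The doubled-`u` form has one character too many for the fixed-width pattern "'\\u____'", so it would not match that pattern.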
No description provided.