Ruby: interpret string escape sequences in getConstantValue() #8164
@@ -3,6 +3,7 @@ private import AST
private import Constant
private import TreeSitter
private import codeql.ruby.controlflow.CfgNodes
private import codeql.NumberUtils

int parseInteger(Ruby::Integer i) {
The implementation of parseInteger should probably use NumberUtils.
OK, I'll have a go at that.
I've done this locally, but my implementation is more conservative. My new parseHexInt returns no value for 0xdeadbeef, while the current parseInteger implementation returns -559038737 (i.e. it wraps to a negative number). Ruby treats 0xdeadbeef as the positive integer 3735928559.
Should I update my implementation of parseHexInt to allow wrapping, even though the result would not be correct?
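For reference, the wrapping described above can be reproduced with a few lines of Ruby. This is only an illustrative sketch, not code from the PR; the helper name wrap_to_int32 is hypothetical and simply reinterprets a value as a signed 32-bit integer.

```ruby
# Hypothetical helper (not part of the PR): reinterpret a non-negative
# integer as a signed 32-bit two's-complement value, which is what a
# wrapping parser would produce.
def wrap_to_int32(value)
  low = value & 0xFFFF_FFFF                      # keep the low 32 bits
  low >= 0x8000_0000 ? low - 0x1_0000_0000 : low # high bit set => negative
end

ruby_value = 0xdeadbeef   # Ruby integers are arbitrary precision
wrapped = wrap_to_int32(ruby_value)

puts ruby_value   # 3735928559
puts wrapped      # -559038737
```

The gap between the two printed values is exactly 2^32, which is why a wrapping result misrepresents the literal.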
There is no problem changing the meaning of parseInteger if the old definition is incorrect. Returning no value is probably better than incorrect wrapping.
OK, I've made that change. I made some improvements to the binary/hex/octal parsing predicates in the process. Now they can all parse 2^31-1, return no result for 2^31, and cope with an arbitrary number of leading zeros (e.g. 0x00000000000000000007f). I've also added those edge cases to the tests.
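The intended behaviour of those predicates can be modelled in Ruby roughly as follows. This is a sketch under assumptions, not the QL implementation: parse_prefixed_int and PREFIX_PATTERNS are hypothetical names, nil stands in for "no result", and underscores in literals are ignored for brevity.

```ruby
INT_MAX = 2**31 - 1 # largest value a 32-bit signed QL int can hold

# Hypothetical model of the binary/octal/hex predicates: each pattern
# captures the digits after the prefix, paired with the numeric base.
PREFIX_PATTERNS = [
  [/\A0x([0-9a-f]+)\z/i, 16],
  [/\A0b([01]+)\z/i, 2],
  [/\A0o?([0-7]+)\z/i, 8]
].freeze

# Returns the parsed value, or nil where the predicate would not hold.
def parse_prefixed_int(literal)
  PREFIX_PATTERNS.each do |pattern, base|
    match = pattern.match(literal)
    next unless match

    value = match[1].to_i(base) # leading zeros do not affect the value
    return value <= INT_MAX ? value : nil
  end
  nil
end

p parse_prefixed_int("0x7fffffff")               # 2147483647
p parse_prefixed_int("0x80000000")               # nil
p parse_prefixed_int("0x00000000000000000007f")  # 127
```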
The DCA run showed a significant slowdown on one project in particular. I'm investigating.
LGTM, but as you say, performance needs to be investigated.
5f343bd to f8ecd0b
And make parse{Binary,Octal,Hex}Int hold only for values in the range 0 to 2^31-1 (incl.)
f8ecd0b to 488c8ef
Motivation: being able to write queries that can tell the difference between the string values '\n' (U+005C, U+006E) and "\n" (U+000A).

I've added a new test in ruby/ql/test/library-tests/ast/escape_sequences that I hope covers all the edge cases, and I think all the other test changes make sense.

I'm not sure if any of the new unescape predicates need to be cached. I'll run DCA and see if anything stands out.
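The distinction in the motivation can be checked directly in Ruby. The snippet below is an illustrative sketch only; code_points is a hypothetical helper that lists the code points of a string.

```ruby
single = '\n' # single quotes: the backslash and the "n" are kept literally
double = "\n" # double quotes: the escape sequence becomes a line feed

# Hypothetical helper: format each character's code point as U+XXXX.
def code_points(str)
  str.each_char.map { |ch| format('U+%04X', ch.ord) }
end

p code_points(single) # ["U+005C", "U+006E"]
p code_points(double) # ["U+000A"]
```

A constant-value predicate that does not interpret escape sequences would report the same two-character text for both literals, which is the ambiguity this change aims to remove.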