Python: Improve computation of regex fragments inside string parts #14317
Looks great, thanks!
@@ -154,6 +154,24 @@ class StringPart extends StringPart_, AstNode {
  override string toString() { result = StringPart_.super.toString() }

  override Location getLocation() { result = StringPart_.super.getLocation() }

  /** Holds if the content of string `StringPart` is surrounded by `prefix` and `quote`. */
  predicate context(string prefix, string quote) {
It seems like callers of this predicate are really only interested in the length of the prefix and the quote; perhaps it makes sense to just expose that?
I did consider that; it might also be easier on the string pool. It just did not feel as generally useful, but perhaps that is a silly concern. (And given that the prefix and the quote are part of the string, it is actually the same information.)
   * `localOffset` will be the offset of this `RegExpTerm` inside `result`.
   */
  StringPart getPart(int localOffset) {
    exists(int index, int prefixLength | index = max(int i | this.getPartOffset(i) < start) |
Should this `<` be a `<=`?
Hm, yes, I think so...good catch!
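To illustrate the boundary case, here is a minimal Python sketch (not the QL from this PR). It assumes each part offset is the position at which that part's content begins in the concatenated regex text; `part_index` and the sample offsets are hypothetical names and values chosen for illustration.

```python
def part_index(part_offsets, start, inclusive=True):
    """Return the largest index i whose offset is <= start
    (or < start when inclusive is False), or None if there is none."""
    candidates = [
        i
        for i, offset in enumerate(part_offsets)
        if (offset <= start if inclusive else offset < start)
    ]
    return max(candidates) if candidates else None


# Hypothetical offsets for three concatenated parts such as "ab" "cd" "ef".
offsets = [0, 2, 4]

# A term starting exactly on a part boundary (offset 2):
with_le = part_index(offsets, 2, inclusive=True)   # picks the part that starts at 2
with_lt = part_index(offsets, 2, inclusive=False)  # picks the preceding part

# A term starting at the very beginning of the string (offset 0):
first_le = part_index(offsets, 0, inclusive=True)
first_lt = part_index(offsets, 0, inclusive=False)  # no candidate at all
```

Under these assumptions, the strict comparison assigns a term that begins exactly at a part boundary to the preceding part, and finds no part for a term at offset 0, which is why `<=` looks like the right choice.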
exists(int re_start, int prefix_len | prefix_len = re.getPrefix().length() |
  re.getLocation().hasLocationInfo(filepath, startline, re_start, endline, _) and
  startcolumn = re_start + start + prefix_len and
  endcolumn = re_start + end + prefix_len - 1
  /* inclusive vs exclusive */
)
or
exists(StringPart part, int localOffset | part = this.getPart(localOffset) |
  filepath = part.getLocation().getFile().getAbsolutePath() and
Have you considered using `part.getLocation().hasLocationInfo` (or perhaps even `part.hasLocationInfo` if it exists) to bind all of these variables in one go?
Yeah, I went this way because it looked clearer, but going via `hasLocationInfo` is actually preferable, also because it abstracts away that we choose the absolute path for the file.
Since we calculate the end column by offset, we must believe that the end line is the same as the start line.
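The single-line assumption can be made concrete with a rough Python sketch. This is not the formula used in the PR; `term_columns` and its parameters are hypothetical, and it assumes that the part sits on one line, that columns are 1-based with an inclusive end, and that each content character occupies exactly one source column.

```python
def term_columns(part_start_column, prefix, quote, local_offset, term_length):
    """Compute (start_column, end_column) of a regex term inside a string part.

    Assumption: the part's content begins right after its prefix and opening
    quote, and nothing in the part spans a line break.
    """
    content_start = part_start_column + len(prefix) + len(quote)
    start_column = content_start + local_offset
    end_column = start_column + term_length - 1  # inclusive end column
    return start_column, end_column


# Hypothetical part r"a+b" starting at column 5: `r` is at column 5,
# the opening quote at column 6, and the content starts at column 7.
columns = term_columns(5, "r", '"', local_offset=0, term_length=2)
```

Only the lengths of the prefix and the quote enter the calculation, which matches the earlier observation that callers mainly need those lengths. If the part spanned several lines, adding an offset to the start column would no longer give a meaningful end column.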
Sorry for the delay. Had to double-check that everything still made sense for `StringPart`s arising from f-strings. (Luckily, it did.)
I think this looks good. 👍
No worries, that was exactly the kind of check I was looking for 👍
No description provided.