-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Rust: Speedup inferMethodCallTypeSelf
#21305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1715a43 to
9ab93d5
Compare
9ab93d5 to
49f24ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Improves performance of Rust method-call type inference by reducing recursion in inferMethodCallTypeSelf and avoiding multi-capture regex patterns that cause fan-out in evaluation.
Changes:
- Refactors
inferMethodCallTypeSelfto account for implicit borrows without repeated recursion and updates call sites accordingly. - Replaces
regexpCaptureusage withindexOf/prefix/suffixindecodeDerefChainBorrowto avoid multi-group capture fan-out. - Optimizes
UnboundListdeconstruction predicates to avoid multiple regex capture groups and adds a helper for string-length vs list-length.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| shared/util/codeql/util/UnboundList.qll | Avoids multi-capture regexpCapture in list deconstruction and adds a helper for raw string length. |
| rust/ql/lib/codeql/rust/internal/typeinference/TypeInference.qll | Refactors self-receiver type inference and replaces regex splitting with index-based splitting for performance. |
Comments suppressed due to low confidence (1)
shared/util/codeql/util/UnboundList.qll:136
- This comment should be capitalized (“Same remark …”) to match the surrounding documentation style.
// same remark as above about not using multiple capture groups
prefix = this.regexpCapture("^(|.+\\.)[0-9]+\\.$", 1) and
| // it is more efficient to not create a capture group for the suffix, since | ||
| // `regexpCapture` will then always join in both groups, only to afterwards filter | ||
| // based on the requested group (the group number is not part of the binding set | ||
| // of `regexpCapture`) | ||
| elem = this.regexpCapture("^([0-9]+)\\..*$", 1) and |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new explanatory comment should start with a capital letter and could be tightened for readability (it currently spans multiple lines and uses a long parenthetical). Consider rephrasing with shorter sentences and consistent terminology (e.g., backticking bindingset).
This issue also appears on line 135 of the same file.
paldepind
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| // it is more efficient to not create a capture group for the suffix, since | ||
| // `regexpCapture` will then always join in both groups, only to afterwards filter | ||
| // based on the requested group (the group number is not part of the binding set | ||
| // of `regexpCapture`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this saying that each call to regexpCapture would produce both capture groups and in the first call the second capture group would be thrown away and vice versa for the second call? Or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Here is a simple example
predicate p1(string a, string b, string c) {
a = "b;c" and
exists(string reg |
reg = "^(.*);(.*)$" and
b = a.regexpCapture(reg, 1) and
c = a.regexpCapture(reg, 2)
)
}which generates the following RA:
[2026-02-10 15:54:01] (0s) Tuple counts for quickquery::p1#QuickEval#c8bd3cdf#query/3@b482101n after 2ms:
1 ~0% {2} r1 = CONSTANT(unique string, unique string)["b;c","^(.*);(.*)$"]
2 ~0% {4} | JOIN WITH PRIMITIVE regexpCapture#bbff ON Lhs.0,Lhs.1
2 ~0% {5} | SCAN OUTPUT In.0, In.1, In.2, In.3 'c', _
{4} | REWRITE WITH Tmp.4 := 2, TEST InOut.2 = Tmp.4 KEEPING 4
1 ~0% {3} | SCAN OUTPUT In.3 'c', _, _
1 ~0% {3} | REWRITE WITH Out.1 := "b;c", Out.2 := "^(.*);(.*)$"
2 ~0% {5} | JOIN WITH PRIMITIVE regexpCapture#bbff ON Lhs.1,Lhs.2
2 ~0% {6} | SCAN OUTPUT In.0 'c', In.1, In.2, In.3, In.4 'b', _
{5} | REWRITE WITH Tmp.5 := 1, TEST InOut.3 = Tmp.5 KEEPING 5
1 ~0% {3} | SCAN OUTPUT _, In.4 'b', In.0 'c'
1 ~0% {3} | REWRITE WITH Out.0 'a' := "b;c"
return r1
The workaround
predicate p4(string a, string b, string c) {
a = "b;c" and
exists(string reg |
reg = "^(.*);.*$" and
b = a.regexpCapture(reg, 1) and
c = a.suffix(b.length() + 1)
)
}with just a single capture group performs better:
[2026-02-10 15:57:35] (0s) Tuple counts for quickquery::p4#QuickEval#b3644a80#query/3@e8e0caoi after 0ms:
1 ~0% {2} r1 = CONSTANT(unique string, unique string)["b;c","^(.*);.*$"]
1 ~0% {4} | JOIN WITH PRIMITIVE regexpCapture#bbff ON Lhs.0,Lhs.1
1 ~0% {5} | SCAN OUTPUT In.0, In.1, In.2, In.3 'b', _
{4} | REWRITE WITH Tmp.4 := 1, TEST InOut.2 = Tmp.4 KEEPING 4
1 ~0% {4} | SCAN OUTPUT In.3 'b', _, _, _
{2} | REWRITE WITH Tmp.1 := "b;c", Tmp.2 := length(InOut.0), Tmp.3 := 1, Tmp.2 := (Tmp.2 + Tmp.3), Out.1 'c' := suffix(Tmp.1,Tmp.2) KEEPING 2
1 ~0% {3} | SCAN OUTPUT _, In.0 'b', In.1 'c'
1 ~0% {3} | REWRITE WITH Out.0 'a' := "b;c"
The first commit simplifies
inferMethodCallTypeSelfby making it less recursive (we only need to account for implicit borrows once).The second commit replaces some multi-group regexp capture calls with something that doesn't use multiple capture groups, which means avoiding fan-out:
Before:
After:
DCA shows a nice speedup, especially on
mist-os.