-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Rust: Improve handling of implicit derefs/borrows in data flow #20891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
a11fac5 to
d996956
Compare
d996956 to
d0260fe
Compare
|
CC @geoffw0 for the changes to models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR improves the handling of implicit dereferencing and borrowing in Rust data flow analysis to align with type inference conclusions. The changes update data flow models to use .Reference where implicit borrows occur and add implicit deref steps for field expression qualifiers.
- Updates summary models to use
Argument[self].Referenceinstead ofArgument[self]for methods that take&selfor&mut self - Adds intermediate data flow nodes representing implicit borrows/derefs
- Changes some test queries from ValueFlow to TaintFlow to better align with the refined analysis
Reviewed changes
Copilot reviewed 40 out of 40 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
rust/ql/test/query-tests/security/CWE-770/UncontrolledAllocationSize.expected |
Updated expected test results showing refined data flow edges with .Reference and intermediate borrow nodes |
rust/ql/test/query-tests/security/CWE-614/InsecureCookie.expected |
Shows removal of direct edges and addition of intermediate nodes for method calls on borrowed values |
rust/ql/test/query-tests/security/CWE-312/CleartextStorageDatabase.expected |
Updates flow edges to include explicit borrow steps before method calls |
rust/ql/test/query-tests/security/CWE-312/CleartextLogging.expected |
Similar refinements with intermediate nodes for borrowed method receivers |
rust/ql/test/query-tests/security/CWE-117/LogInjection.expected |
Model updates showing .Reference for iterator methods |
rust/ql/test/query-tests/security/CWE-089/SqlInjection.expected |
Extensive edge additions for intermediate borrow nodes in SQL query construction |
rust/ql/test/query-tests/security/CWE-022/TaintedPath.expected |
Path canonicalization flow now includes intermediate borrow steps |
rust/ql/test/library-tests/type-inference/type-inference.ql |
Function signature updates to accept additional parameter |
rust/ql/test/library-tests/dataflow/taint/TaintFlowStep.expected |
New edges showing borrowed state tracking |
rust/ql/test/library-tests/dataflow/strings/main.rs |
Comment update noting changed semantics for as_str() |
rust/ql/test/library-tests/dataflow/strings/inline-taint-flow.expected |
Flow edge refinements and model updates |
rust/ql/test/library-tests/dataflow/sources/web_frameworks/InlineFlow.expected |
Addition of intermediate nodes for path parameter extraction |
rust/ql/test/library-tests/dataflow/sources/stdin/InlineFlow.expected |
Extensive model updates for stdin read operations with .Reference |
rust/ql/test/library-tests/dataflow/sources/net/InlineFlow.expected |
Network I/O model updates with intermediate borrow nodes |
rust/ql/test/library-tests/dataflow/sources/file/InlineFlow.expected |
File I/O operations now track intermediate borrows explicitly |
rust/ql/test/library-tests/dataflow/sources/env/InlineFlow.expected |
Environment variable access model update |
rust/ql/test/library-tests/dataflow/pointers/inline-flow.ql |
Test query updated from ValueFlow to TaintFlow |
rust/ql/test/library-tests/dataflow/modeled/inline-flow.ql |
Test query updated from ValueFlow to TaintFlow |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
geoffw0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mode, test and DCA changes LGTM, apart from a couple of small issues / questions below. FWIW I think the fact that missing out .Reference on Argument[self].Reference used to work caused me some confusion. In future, we can get them right in the first place.
| // We allow flow into post-update node for receiver expressions (from the | ||
| // synthetic post receiever node). | ||
| n.(Node::PostUpdateNode).getPreUpdateNode().asExpr() = any(Node::ReceiverNode r).getReceiver() | ||
| n.(Node::PostUpdateNode).getPreUpdateNode().asExpr() = any(Node::DerefBorrowNode r).getNode() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the comment above this change be clarified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this should no longer be needed.
| - ["core::pin::Pin", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["<core::pin::Pin>::new", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["<core::pin::Pin>::new_unchecked", "Argument[0].Reference", "ReturnValue", "value", "manual"] | ||
| - ["<core::pin::Pin>::new_unchecked", "Argument[0]", "ReturnValue.Field[core::pin::Pin::pointer]", "value", "manual"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether we store taint in the Pin itself or in the pointer field is, I think, a design choice. Either way, I'd like to have consistency, so if we're keeping this change can we also change:
- the model above (for
<core::pin::Pin>::new) similarly to this one. - the model one line below (for
<core::pin::Pin>::into_inner) should readArgument[0].Field[core::pin::Pin::pointer]. - the model three lines below (for
<core::pin::Pin>::set) should write intoArgument[self].Reference.Field[core::pin::Pin::pointer].
Also the model at the top of the block (for core::pin::Pin) appears to be a mistake, lets get rid of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll update the models.
|
LGTM. There are a couple of minor test failures to address. |
|
Putting back into draft mode to resolve issues. |
acc72b2 to
cbddcab
Compare
cbddcab to
d75976d
Compare
d75976d to
353070b
Compare
The test failures uncovered bugs in how we handle explicit derefs in data flow; this has now been fixed in the latest commit. I ran a new DCA experiment, and it still looks good. |
|
|
||
| // summary=<test::option::MyOption>::insert;Argument[0];Argument[self].Reference.Field[test::option::MyOption::MySome(0)];value;dfc-generated | ||
| // summary=<test::option::MyOption>::insert;Argument[0];ReturnValue.Reference;value;dfc-generated | ||
| // This summary is currently missing because of access path limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be fixed by #20915.
This PR updates the handling of implicit dereferencing/borrowing in data flow to strictly follow what has been concluded by the type inference library. This revealed inconsistencies in some models (typically
Argument[self]instead ofArgument[self].Reference), which has been fixed.This PR also adds implicit deref steps for qualifiers of field expressions.
Update: This PR also changes how explicit dereferences are treated in data flow: for
*xwe do the equivalent of*Deref::deref(&x)(see https://doc.rust-lang.org/std/ops/trait.Deref.html#deref-coercion), and this also works whenxhas type&or&mut, because of https://doc.rust-lang.org/std/ops/trait.Deref.html#impl-Deref-for-%26T and https://doc.rust-lang.org/std/ops/trait.Deref.html#impl-Deref-for-%26mut+T, which we add new models for.DCA seems good; we loose some results, but I suspect the vast majority to be FPs due to the previous imprecision.