fix out-of-bounds memory access with zero-variable joints #2617
Codecov Report
Coverage diff, main vs. #2617:
  Coverage: 50.60% -> 51.00% (+0.41%)
  Files:    386 -> 387 (+1)
  Lines:    32085 -> 32285 (+200)
  Hits:     16232 -> 16465 (+233)
  Misses:   15853 -> 15820 (-33)
None of the setter changes are necessary. memcpy doesn't access any memory if the number of bytes to write is zero. https://godbolt.org/z/9MrnWP4Kh
The getter changes became necessary only because of your original PR. Originally, forming the address, e.g. position_ + getFirstVariableIndex(), did not access the memory yet (now it does, and fails), and subsequent methods didn't access the memory for fixed joints either.
The RobotState code was highly optimized before, and you are degrading that more and more. I have not seen a single example of a real range violation.
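To make the distinction between forming an address and performing a checked access concrete, here is a minimal sketch. It is not the actual RobotState code: it assumes the positions live in a std::vector<double> and that a fixed joint placed last has a first-variable index equal to the vector's size. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for "position_ + getFirstVariableIndex()":
// only forms an address, never dereferences it.
const double* first_variable_address(const std::vector<double>& positions, std::size_t first_index)
{
  // Well-defined for first_index <= positions.size(); the one-past-the-end
  // pointer may be formed but must not be dereferenced.
  return positions.data() + first_index;
}

// Hypothetical checked variant: at() bounds-checks the index and throws
// std::out_of_range when first_index >= positions.size().
const double* first_variable_checked(const std::vector<double>& positions, std::size_t first_index)
{
  return &positions.at(first_index);
}

// Returns true if the checked variant throws for the given index.
bool checked_access_throws(const std::vector<double>& positions, std::size_t first_index)
{
  try
  {
    first_variable_checked(positions, first_index);
    return false;
  }
  catch (const std::out_of_range&)
  {
    return true;
  }
}
```

With a three-element vector, first_variable_address(p, 3) yields the one-past-the-end pointer, while first_variable_checked(p, 3) throws. That is the situation a zero-variable joint at the end of the state would hit under these assumptions.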
We are trying to modernize an old, decaying code base so it's easier to work with and to be able to make more meaningful contributions in the future. As such, I don't think blaming someone for "degrading" the code in this effort is productive. We won't get it right the first time, but we'll ideally get somewhere better. If you have any concrete suggestions on how we can more effectively bring these abstractions in line with best C++ practices from this decade, we would welcome your input. But keeping things stagnant because "it's how it's always been done" doesn't seem like a great option either, for the reasons @marioprats already mentioned.
I didn't want to come across as offensive. Sorry if I did. While Mario's initial PR was fairly innocent regarding performance, the fixup here introduces many additional guards, which will definitely make the code slower, and not only for initialization. The fundamental problem of this "old-style" code, as you call it, is the use of raw pointers in the API. However, this problem is not (yet) addressed. I think a modern way to address this issue is using ranges. However, this would require many, many API adaptations... Alternatively, in order to more easily catch invalid memory accesses (if they occur), you could revert all the changes and assign fixed joints a In any case, my major point is that moving away from the single memory chunk is a bad idea. The security risks you want to address are not directly related to this optimization, but to the raw pointer API. There are mitigations that keep the memory optimization and harden the API at no extra performance cost, but they require huge refactorings...
Thank you, that's very helpful! I think we are looking at the possibility of big refactorings because of considerations like accounting for dynamics (not just kinematics), and more importantly having a mutable robot model where joints/links can be added/removed after a URDF is loaded. Tools like Pinocchio support this quite well. So while throwing something with dynamic memory like
I think dynamics can be easily handled with the current code base already. Actually, there is already some dynamics code.
I added a basic test to check that
This is wishful thinking: many invalid memory accesses existed in our tests under the old model of not initializing the memory in the old memory pool implementation. Relying on UB to catch logic errors has not been shown to be reliable enough.
I am really curious to see them! So far, all the ones @marioprats pointed out turned out to be edge cases that don't occur in practice (like accessing the joint position of a fixed joint).
I can't follow this reasoning. Can you give some examples to support your claims?
Tested with this change and the original bug I found is resolved.
I think the other comments in this thread should be taken into consideration for future large architecture decisions, but this PR, as is, definitely fixes the immediate bug and we should get it in.
I'm happy with this change. Thank you for the test. The only real way to avoid all bounds checking and make safe interfaces at runtime is to make sizes compile-time constants. If we don't like the performance impacts of bounds-checking-per-access in loops we need to build interfaces for looping with indexes.
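As a rough illustration of the two options mentioned above, the sketch below shows a compile-time-sized accessor and a loop interface that validates its window once instead of on every access. This is only a sketch under assumptions: get_variable and for_each_variable are hypothetical names, not part of this code base.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Compile-time size: std::get<I> rejects an out-of-range index at compile
// time, so no runtime bounds check is needed.
template <std::size_t I, std::size_t N>
double get_variable(const std::array<double, N>& values)
{
  return std::get<I>(values);
}

// Runtime size: validate the [first, first + count) window once, then
// visit each element by index without a per-access check.
template <typename Visitor>
void for_each_variable(std::vector<double>& values, std::size_t first, std::size_t count, Visitor visit)
{
  if (first > values.size() || count > values.size() - first)
    throw std::out_of_range("variable window exceeds state size");
  for (std::size_t i = 0; i < count; ++i)
    visit(i, values[first + i]);
}
```

Note that an empty window at the very end of the vector (count == 0, first == size) passes the check and simply visits nothing, which is the shape a zero-variable joint would have.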
When working on this change: 8d2eb90
There are several reasons why A handful of range adapters (filter is a good example) create range types with internal state that has to be mutated when you advance the range. As a result, range types do not support exterior const. This form of As referenced above, interior const-ness can't be done through language features, and therefore, we have Raw pointers have first-class interior and exterior construction support in a way that these library types never will. Another surprising side effect: if a function takes a const range and you pass in a non-const range, a new pointer-wrapping object has to be created.
And this call site:
Due to implicit conversions, and
No extra copies are created when the user calls it. The "safety" argument for ranges is that they combine the pointer with the size. This only matters if you check that size; they do nothing to help you avoid use-after-free bugs. Size can just as easily be passed as a separate argument into functions that take a pointer, without losing interior const-ness.
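The pointer-plus-size style described here can be sketched as follows. The function names are hypothetical and only meant to show that a double* converts implicitly to const double* at the call site, without any wrapper object being constructed.

```cpp
#include <cassert>
#include <cstddef>

// Read-only access: pointer-to-const plus an explicit element count.
// A double* argument converts to const double* implicitly.
double sum_variables(const double* values, std::size_t count)
{
  double total = 0.0;
  for (std::size_t i = 0; i < count; ++i)
    total += values[i];
  return total;
}

// Mutable access: same shape, without the interior const.
void scale_variables(double* values, std::size_t count, double factor)
{
  for (std::size_t i = 0; i < count; ++i)
    values[i] *= factor;
}
```

The same mutable buffer can be passed to both functions. The size argument only adds safety if the callee or caller actually checks it, which is the caveat raised above.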
Here is a primer on some of the issues with ranges: Generally, though, I'm coming around to believing that types that wrap pointers with a value type are an anti-pattern and a lousy can of worms being sold to us by "modern C++ evangelists."
I consider this a feature and not an issue: You can select which code to execute depending on the constness of your range argument. You are right that conversion from (inner) non-const to const range comes at the cost of a new object creation by default. However, I think you could replace the default behavior with a
Important fix! But in addition to checking for this specific edge case, I think we should generally check for joint model compatibility with the state's robot model. Otherwise, we can never be sure that the variables of the requested joint model match the internal data structure.
Thanks for pointing out this talk. Very interesting. But note: in the talk, he doesn't blame ranges, only views!
My understanding is that For our uses, though, where we need improvements over our API boundaries (function parameters, return values, state in classes), they don't help much. Also, being a C++20 feature, it'll be a while before we can use them in a ROS 2 project. All that being said, I think this is a good change and we should merge this.
Maybe you are right. Essentially a range just defines an iterable sequence via In any case, I still consider #2546 and this PR a degradation of the code base and not an improvement.
I think a large part of the problem with ranges is that they are based on the iterator concept (begin and end). I've started experimenting with this library: https://github.com/tcbrindle/flux. The cursor approach taken there does not suffer from the iterator invalidation problems that the STL ranges library does.
The problem with the cursor concept is that it doesn't scale to non-random access containers. |
This change updates some methods to avoid an out-of-bounds access after the recent changes to use std::vector as the underlying storage in RobotState (#2546).
Before that change, there was nothing stopping us from passing or returning invalid memory addresses. But now that will trigger range_check exceptions.
This change fixes an out-of-bounds access that can happen with fixed joint models with a variable count of 0, by special-handling that case where needed. It also moves the implementation of those functions to the source file, to not clutter the header file further.
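One possible shape of such special handling is sketched below. The source does not say what the real getters return for a zero-variable joint, so returning nullptr is an assumption made for illustration, and JointSketch and joint_positions are hypothetical names rather than the project's API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a joint model.
struct JointSketch
{
  std::size_t first_variable_index;
  std::size_t variable_count;  // 0 for a fixed joint
};

// Hypothetical getter: returns nullptr for zero-variable joints instead of
// indexing the vector, so a fixed joint never triggers a range check.
// Assumption: callers do not dereference the result when the count is 0.
const double* joint_positions(const std::vector<double>& positions, const JointSketch& joint)
{
  if (joint.variable_count == 0)
    return nullptr;
  return &positions.at(joint.first_variable_index);
}
```

In this sketch a fixed joint whose first-variable index equals the vector size is handled without touching the vector, while joints with variables still go through the bounds-checked at().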