-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix error handling when copying C sequence messages #671
Conversation
Do not hold reallocated (i.e. invalidated) data pointer if any new sequence item initialization fails Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
@ros-pull-request-builder retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a couple of comments on how to clean things up here. Let me know what you think.
return false; | ||
} | ||
} | ||
output->data = data; | ||
output->capacity = input->size; | ||
} | ||
output->size = input->size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't directly related to what you are fixing here, but should we move this after the loop below that copies the individual fields? That way if one of them fails, we aren't updating output->size
erroneously (this is probably never a problem in practice, but I still thought I would point it out).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clalancette even more so, we should probably be incrementing it as we copy.
} | ||
allocator.deallocate(data, allocator.state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree with the error handling here. Let's assume for a minute that we started copying the elements here one at a time, and one of them failed, so this method returned false. In that case, the most reasonable thing for a caller of these functions to do is to call the __fini
method on this array. The problem with that sequence, though, is that we've already freed up some of the elements here, and the __fini
method will try to free those elements again, leading to a double-free. What I think we should do here is probably set output->size
to 0 on error, and then have the __fini
method free only up to output->size
; that way we won't double-free things. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, the most reasonable thing for a caller of these functions to do is to call the __fini method on this array.
fini
the input or the output?
What I think we should do here is probably set output->size to 0 on error
I think this loop is only finalizing the things beyond output
's capacity before copy
was called, so if size was set to 0 then all of output needed to be finalized in this function too.
and then have the __fini method free only up to output->size; that way we won't double-free things. What do you think?
It looks like all elements up to capacity
can be initialized at the moment. Say there are two arrays foo
and bar
. foo
has size and capacity 5 and bar
has size and capacity 7. Say foo
is copied into bar
. Because bar
has excess capacity, the whole if(output->capacity < input->size)
block is skipped. foo
's 5 elements get copied into bar
and bar
now has size 5 and capacity 7. If fini
only free's up to output->size
then the last two elements in bar
will be leaked because copy
didn't free them.
Finalizing up to size
seems like it could be made to work, but everything needs to make sure only elements up to size
are initialized. There would need to be another loop in copy to fini output
's elements that go beyond the new size when input->size < output->size
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this loop is only finalizing the things beyond output's capacity before copy was called, so if size was set to 0 then all of output needed to be finalized in this function too.
That's correct.
The problem with that sequence, though, is that we've already freed up some of the elements here, and the __fini method will try to free those elements again, leading to a double-free.
A double-free isn't a possible failure mode, as message finalization functions are typically idempotent. We're not consistent about that across core packages though.
What I think we should do here is probably set output->size to 0 on error, and then have the __fini method free only up to output->size; that way we won't double-free things.
Finalizing up to size seems like it could be made to work, but everything needs to make sure only elements up to size are initialized.
__fini
functions operate under the assumption that elements up to capacity will have been default initialized. __copy
functions simply follow suit. We can revisit that assumption though, to allocate up to capacity and initialize up to size instead.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hidmic was kind enough to work through this with me, and I now understand a lot more about what we are doing here. In short, if the realloc
on line 469 succeeds, we must assign it to output->data
, because that memory may have been moved and now the original pointer is bogus. So that explains the first line of the patch.
The rest falls out of that. __fini
expects that up to ->capacity
is valid memory that has been default initialized. If we failed to initialize one of the elements during the copy, then we roll back the new elements we've copied, leaving the existing ones alone. We also do not update ->capacity
. That fulfills __fini
's requirements (we end up wasting the additional memory that we just allocated, but we don't lose it or leak it).
So with all of that said, I think this refactor is the correct thing. The only thing I'll request is a small comment somewhat explaining the situation, since it is not obvious (at least, it wasn't obvious to me).
I'm still going to approve, since the code itself seems correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hidmic was kind enough to work through this with me
Thank you @clalancette for sparing the time to look at this.
Added comments in 3a09837. PTAL!
} | ||
allocator.deallocate(data, allocator.state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hidmic was kind enough to work through this with me, and I now understand a lot more about what we are doing here. In short, if the realloc
on line 469 succeeds, we must assign it to output->data
, because that memory may have been moved and now the original pointer is bogus. So that explains the first line of the patch.
The rest falls out of that. __fini
expects that up to ->capacity
is valid memory that has been default initialized. If we failed to initialize one of the elements during the copy, then we roll back the new elements we've copied, leaving the existing ones alone. We also do not update ->capacity
. That fulfills __fini
's requirements (we end up wasting the additional memory that we just allocated, but we don't lose it or leak it).
So with all of that said, I think this refactor is the correct thing. The only thing I'll request is a small comment somewhat explaining the situation, since it is not obvious (at least, it wasn't obvious to me).
I'm still going to approve, since the code itself seems correct.
Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
Since 3a09837 is comments only and the Rpr job passed, I'll merge. |
This is an issue I spotted while reviewing #669. We shouldn't hold reallocated (i.e. invalidated) data pointers if any new sequence item initialization fails. It's quite hard to run into this problem (e.g. RAM exhaustion), but it is theoretically possible.
CI up to
rosidl_generator_c
androsidl_runtime_c
: