Fix error handling when copying C sequence messages #671

hidmic · 2022-03-21T18:03:28Z

This is an issue I spotted while reviewing #669. We shouldn't hold reallocated (i.e. invalidated) data pointers if any new sequence item initialization fails. It's quite hard to run into this problem (e.g. RAM exhaustion), but it is theoretically possible.

CI up to rosidl_generator_c and rosidl_runtime_c:

Linux
Linux-aarch64
Windows

Do not hold reallocated (i.e. invalidated) data pointer if any new sequence item initialization fails Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>

clalancette · 2022-03-22T15:33:51Z

@ros-pull-request-builder retest this please

clalancette

I've left a couple of comments on how to clean things up here. Let me know what you think.

clalancette · 2022-03-22T15:41:47Z

rosidl_generator_c/resource/msg__functions.c.em

        return false;
      }
    }
-    output->data = data;
    output->capacity = input->size;
  }
  output->size = input->size;


This isn't directly related to what you are fixing here, but should we move this after the loop below that copies the individual fields? That way if one of them fails, we aren't updating output->size erroneously (this is probably never a problem in practice, but I still thought I would point it out).

@clalancette even more so, we should probably be incrementing it as we copy.

clalancette · 2022-03-22T15:50:18Z

rosidl_generator_c/resource/msg__functions.c.em

        }
-        allocator.deallocate(data, allocator.state);


I'm not sure I agree with the error handling here. Let's assume for a minute that we started copying the elements here one at a time, and one of them failed, so this method returned false. In that case, the most reasonable thing for a caller of these functions to do is to call the __fini method on this array. The problem with that sequence, though, is that we've already freed up some of the elements here, and the __fini method will try to free those elements again, leading to a double-free. What I think we should do here is probably set output->size to 0 on error, and then have the __fini method free only up to output->size; that way we won't double-free things. What do you think?

In that case, the most reasonable thing for a caller of these functions to do is to call the __fini method on this array.

fini the input or the output?

What I think we should do here is probably set output->size to 0 on error

I think this loop is only finalizing the things beyond output's capacity before copy was called, so if size was set to 0 then all of output needed to be finalized in this function too.

and then have the __fini method free only up to output->size; that way we won't double-free things. What do you think?

It looks like all elements up to capacity can be initialized at the moment. Say there are two arrays foo and bar. foo has size and capacity 5 and bar has size and capacity 7. Say foo is copied into bar. Because bar has excess capacity, the whole if(output->capacity < input->size) block is skipped. foo's 5 elements get copied into bar and bar now has size 5 and capacity 7. If fini only free's up to output->size then the last two elements in bar will be leaked because copy didn't free them.

Finalizing up to size seems like it could be made to work, but everything needs to make sure only elements up to size are initialized. There would need to be another loop in copy to fini output's elements that go beyond the new size when input->size < output->size.

I think this loop is only finalizing the things beyond output's capacity before copy was called, so if size was set to 0 then all of output needed to be finalized in this function too.

That's correct.

The problem with that sequence, though, is that we've already freed up some of the elements here, and the __fini method will try to free those elements again, leading to a double-free.

A double-free isn't a possible failure mode, as message finalization functions are typically idempotent. We're not consistent about that across core packages though.

What I think we should do here is probably set output->size to 0 on error, and then have the __fini method free only up to output->size; that way we won't double-free things.

Finalizing up to size seems like it could be made to work, but everything needs to make sure only elements up to size are initialized.

__fini functions operate under the assumption that elements up to capacity will have been default initialized. __copy functions simply follow suit. We can revisit that assumption though, to allocate up to capacity and initialize up to size instead.

Thoughts?

@hidmic was kind enough to work through this with me, and I now understand a lot more about what we are doing here. In short, if the realloc on line 469 succeeds, we must assign it to output->data, because that memory may have been moved and now the original pointer is bogus. So that explains the first line of the patch.

The rest falls out of that. __fini expects that up to ->capacity is valid memory that has been default initialized. If we failed to initialize one of the elements during the copy, then we roll back the new elements we've copied, leaving the existing ones alone. We also do not update ->capacity. That fulfills __fini's requirements (we end up wasting the additional memory that we just allocated, but we don't lose it or leak it).

So with all of that said, I think this refactor is the correct thing. The only thing I'll request is a small comment somewhat explaining the situation, since it is not obvious (at least, it wasn't obvious to me).

I'm still going to approve, since the code itself seems correct.

@hidmic was kind enough to work through this with me

Thank you @clalancette for sparing the time to look at this.

Added comments in 3a09837. PTAL!

clalancette · 2022-04-01T21:24:28Z

rosidl_generator_c/resource/msg__functions.c.em

        }
-        allocator.deallocate(data, allocator.state);


@hidmic was kind enough to work through this with me, and I now understand a lot more about what we are doing here. In short, if the realloc on line 469 succeeds, we must assign it to output->data, because that memory may have been moved and now the original pointer is bogus. So that explains the first line of the patch.

The rest falls out of that. __fini expects that up to ->capacity is valid memory that has been default initialized. If we failed to initialize one of the elements during the copy, then we roll back the new elements we've copied, leaving the existing ones alone. We also do not update ->capacity. That fulfills __fini's requirements (we end up wasting the additional memory that we just allocated, but we don't lose it or leak it).

So with all of that said, I think this refactor is the correct thing. The only thing I'll request is a small comment somewhat explaining the situation, since it is not obvious (at least, it wasn't obvious to me).

I'm still going to approve, since the code itself seems correct.

Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>

hidmic · 2022-04-01T23:43:11Z

Since 3a09837 is comments only and the Rpr job passed, I'll merge.

Fix error handling when copying C sequence messages

c9ecd81

Do not hold reallocated (i.e. invalidated) data pointer if any new sequence item initialization fails Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>

hidmic added the bug Something isn't working label Mar 21, 2022

hidmic requested a review from sloretz March 22, 2022 11:07

clalancette reviewed Mar 22, 2022

View reviewed changes

clalancette approved these changes Apr 1, 2022

View reviewed changes

Comment copy error handling nuances

3a09837

Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>

clalancette approved these changes Apr 1, 2022

View reviewed changes

hidmic merged commit 6486f6b into master Apr 1, 2022

delete-merged-branch bot deleted the hidmic/fix-c-sequence-copies-on-error branch April 1, 2022 23:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix error handling when copying C sequence messages #671

Fix error handling when copying C sequence messages #671

hidmic commented Mar 21, 2022

clalancette commented Mar 22, 2022

clalancette left a comment

clalancette Mar 22, 2022

hidmic Mar 28, 2022

clalancette Mar 22, 2022

sloretz Mar 25, 2022

hidmic Mar 28, 2022

clalancette Apr 1, 2022

hidmic Apr 1, 2022

clalancette Apr 1, 2022

hidmic commented Apr 1, 2022

Fix error handling when copying C sequence messages #671

Fix error handling when copying C sequence messages #671

Conversation

hidmic commented Mar 21, 2022

clalancette commented Mar 22, 2022

clalancette left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hidmic commented Apr 1, 2022