Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix potencial leak / seg fault #726

Merged
merged 6 commits into from
Nov 17, 2023
Merged

Conversation

ahcorde
Copy link
Contributor

@ahcorde ahcorde commented Jul 14, 2021

Signed-off-by: ahcorde ahcorde@gmail.com

This PR is a follow up of this other PR #701

@ahcorde ahcorde requested a review from mjeronimo July 14, 2021 09:14
@ahcorde
Copy link
Contributor Author

ahcorde commented Jul 14, 2021

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@ahcorde ahcorde self-assigned this Jul 14, 2021
@ahcorde ahcorde requested a review from clalancette July 15, 2021 11:23
@ahcorde
Copy link
Contributor Author

ahcorde commented Jul 19, 2021

friendly ping @clalancette

Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some comments inline. One other thing that I noticed in this piece of code is that it isn't clear to me what mutex_ is protecting. Is it just the image_ and new_image_? Is it all of the variables in the class? What can run on different threads here? I don't know the answer to all of that, but I think it is worth investigating to see if we need to fix up locking here.

Comment on lines 425 to 429
if (!bufferptr_) {
bufferptr_ = std::make_shared<std::vector<uint8_t>>(image_data.size_);
} else if (static_cast<size_t>(bufferptr_->size()) != image_data.size_) {
bufferptr_->resize(image_data.size_, 0);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few different things about this code:

  1. I feel like we can simplify this idiom by allocating at least one byte to bufferptr_ in the constructor. Then all of the places we do this, we can change this to:
if (bufferptr_->size() != image_data.size_) {
  bufferptr_->resize(image_data.size_, 0);
}
  1. I don't think we need the static_cast to size_t; both bufferptr_->size() and image_data.size_ are size_t.
  2. Technically, we only need to resize the buffer if it is smaller than what we currently need. That could save some reallocation, at the expense of some wasted memory. I don't feel strongly about this point, though.
  3. I can't quite put my finger on it, but the resize logic looks incorrect. In particular, just prior to this code, we set image_data.size_ to the number of uint16_t elements in the incoming piece of data. But then we allocate that number of uint8_t bytes, which seems like we wouldn't have enough data in the bufferptr. For example, assume that we have 10 uint16_t elements. In that case, we'll allocate 10 bytes to the bufferptr, not 20. It feels like we can overrun this when we are normalizing, but I'm not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these two lines are calculating the uint_8 size of the buffer.

image_data.size_ /= sizeof(uint16_t);
...
image_data.size_ /= sizeof(float);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see the reason to divide here. From looking at the origin of the data, we'd just want to allocate image_data.size_? I believe it's already the number of bytes not the number of elements?

ImageData(image->encoding, image->data.data(), image->data.size()));

https://github.com/ros2/common_interfaces/blob/a3a0dde2ba184b01cdc59a3003728906de3240a9/sensor_msgs/msg/Image.msg#L26

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to do this because we are creating an image to show in RVIZ which is a uint8_t. We need to know the number of elements for each one of the formats.

Take a look to the normalize() function.

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2021-9-16/22372/1

@ahcorde ahcorde requested a review from gbalke October 26, 2021 19:45
@audrow audrow changed the base branch from ros2 to rolling June 28, 2022 14:24
@ahcorde
Copy link
Contributor Author

ahcorde commented Sep 14, 2023

friendly ping @clalancette

ahcorde and others added 5 commits November 16, 2023 13:32
Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: ahcorde <ahcorde@gmail.com>
Signed-off-by: ahcorde <ahcorde@gmail.com>
The main goal here is to stop using a pointer
from a freed std::vector.  Instead, we allocate
memory on the heap and delete it when this piece
of code no longer needs it (Ogre copies it internally).

Along the way, we do a huge cleanup of this code to
make it easier to understand and fix some obvious
problems with it:

1. Improve the locking.
2. Make some methods const.
3. Move implementations in the C++ file.
4. Remove unnecessary functions.
5. Rename a few things to make their function clearer.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>
@clalancette
Copy link
Contributor

clalancette commented Nov 16, 2023

@ahcorde I know this one has been sitting a long time. Sorry about that.

I ended up taking some time to look at this in more detail, and I ended up with a much larger rewrite here. However, I believe that this rewrite takes things in the correct direction.

In particular, what we do here is to only store copied data into the ImageData as needed, and only for as long as needed. That is enabled by the fact that Ogre::Image::load_raw_data() actually copies the data into itself. Thus, we only need to keep that data around long enough for the function to be called, and we can then free it. Further, we recognize that we only sometimes need to do a copy (really only when doing a conversion); otherwise, we can point to the data already stored in the message.

Beyond that, I did a bunch of other cleanup to make this a lot easier to understand. I now think that this is in good shape, but I'd like you to take a look and let me know what you think about it.

@ahcorde
Copy link
Contributor Author

ahcorde commented Nov 17, 2023

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Signed-off-by: Chris Lalancette <clalancette@gmail.com>
@clalancette
Copy link
Contributor

There was a warning on Windows (which showed a real bug). That is now fixed. Here is new CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@ahcorde ahcorde merged commit 9050571 into rolling Nov 17, 2023
2 checks passed
@delete-merged-branch delete-merged-branch bot deleted the ahcorde/improve_16_32_image branch November 17, 2023 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants