-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[C++] Fixing use-after-free and constructor bugs in UnAckedMessageTrackerEnabled #11630
Conversation
@oversearch There seems to be a segfault during the execution of Python tests: https://github.com/apache/pulsar/pull/11630/checks?check_run_id=3295994482 |
Thanks for your contribution. For this PR, do we need to update docs? (The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks) |
Ah yes, sorry about leaving out the docs section. I had assumed leaving it out would imply "no" but I'll make that explicit. I'm also looking into the python test segfault. I realized the python tests weren't building properly in my environment, and I'm working on getting that sorted out.... I might need to upgrade the Python version in my dev environment (currently 3.6). |
…. This was caught by ASAN in my company's build system.
…ckerEnabled This is very similar to a previous fix I submitted in commit 87ebe80. It's the same basic problem, but this class isn't part of the HandlerBase hierarchy, so it needs an independent fix. Essentially, when we create Boost ASIO timer objects from a connection pointer, they maintain a bare reference to the corresponding io_service object inside the connection object. When the destructor runs, we need to destroy the timer *before* the connection object. Keeping the correct order of these member variables is crucial to ensure we don't hit a use-after-free scenario. This was crashing some of our service code in rare circumstances, and is easily caught with Valgrind or ASAN. I also noticed a rather serious bug in one of the UnAckedMessageTrackerEnabled constructors: I believe the intent here was to use the c++11 "delegating constructors" feature, but I think it's written using the Java style, which doesn't work in C++ (see https://stackoverflow.com/questions/13961037/delegate-constructor-c). The semantics of the existing code would just create a new, separate UnAckedMessageTrackerEnabled on the stack in the constructor scope, then immediately destroy it! I corrected the syntax to ensure this works correctly, and fixed up the other constructor to properly use C++ initializer list syntax. Finally, I removed some dangerous c-style casts (which should *never* be used) in favor of C++ static_cast.
871cb51
to
2daa64c
Compare
I wasn't able to reproduce the stack dump in the tests in my local environment. Everything passes except for an obscure error about a bytes/string type mismatch in the |
…ckerEnabled (#11630) * [C++] Fix an off-by-one error in my original change from commit 83f0345. This was caught by ASAN in my company's build system. * [C++] Fixing use-after-free and constructor bugs in UnAckedMessageTrackerEnabled This is very similar to a previous fix I submitted in commit 87ebe80. It's the same basic problem, but this class isn't part of the HandlerBase hierarchy, so it needs an independent fix. Essentially, when we create Boost ASIO timer objects from a connection pointer, they maintain a bare reference to the corresponding io_service object inside the connection object. When the destructor runs, we need to destroy the timer *before* the connection object. Keeping the correct order of these member variables is crucial to ensure we don't hit a use-after-free scenario. This was crashing some of our service code in rare circumstances, and is easily caught with Valgrind or ASAN. I also noticed a rather serious bug in one of the UnAckedMessageTrackerEnabled constructors: I believe the intent here was to use the c++11 "delegating constructors" feature, but I think it's written using the Java style, which doesn't work in C++ (see https://stackoverflow.com/questions/13961037/delegate-constructor-c). The semantics of the existing code would just create a new, separate UnAckedMessageTrackerEnabled on the stack in the constructor scope, then immediately destroy it! I corrected the syntax to ensure this works correctly, and fixed up the other constructor to properly use C++ initializer list syntax. Finally, I removed some dangerous c-style casts (which should *never* be used) in favor of C++ static_cast. * [C++] Applied clang-format to previous change * [C++] Apparently clang-format didn't like this one either.... (cherry picked from commit 0748ecb)
…ckerEnabled (apache#11630) * [C++] Fix an off-by-one error in my original change from commit 83f0345. This was caught by ASAN in my company's build system. * [C++] Fixing use-after-free and constructor bugs in UnAckedMessageTrackerEnabled This is very similar to a previous fix I submitted in commit 87ebe80. It's the same basic problem, but this class isn't part of the HandlerBase hierarchy, so it needs an independent fix. Essentially, when we create Boost ASIO timer objects from a connection pointer, they maintain a bare reference to the corresponding io_service object inside the connection object. When the destructor runs, we need to destroy the timer *before* the connection object. Keeping the correct order of these member variables is crucial to ensure we don't hit a use-after-free scenario. This was crashing some of our service code in rare circumstances, and is easily caught with Valgrind or ASAN. I also noticed a rather serious bug in one of the UnAckedMessageTrackerEnabled constructors: I believe the intent here was to use the c++11 "delegating constructors" feature, but I think it's written using the Java style, which doesn't work in C++ (see https://stackoverflow.com/questions/13961037/delegate-constructor-c). The semantics of the existing code would just create a new, separate UnAckedMessageTrackerEnabled on the stack in the constructor scope, then immediately destroy it! I corrected the syntax to ensure this works correctly, and fixed up the other constructor to properly use C++ initializer list syntax. Finally, I removed some dangerous c-style casts (which should *never* be used) in favor of C++ static_cast. * [C++] Applied clang-format to previous change * [C++] Apparently clang-format didn't like this one either....
Motivation
This is very similar to a previous fix I submitted in commit 87ebe80. It's the same basic problem, but this class isn't part of the HandlerBase hierarchy, so it needs an independent fix. Essentially, when we create Boost ASIO timer objects from a connection pointer, they maintain a bare reference to the corresponding io_service object inside the connection object. When the destructor runs, we need to destroy the timer before the connection object. Keeping the correct order of these member variables is crucial to ensure we don't hit a use-after-free scenario. This was crashing some of our service code in rare circumstances, and is easily caught with Valgrind or ASAN.
While I was at it, I also noticed a rather serious bug in one of the UnAckedMessageTrackerEnabled constructors: I believe the intent here was to use the c++11 "delegating constructors" feature, but I think it's written using the Java style, which doesn't work in C++ (see https://stackoverflow.com/questions/13961037/delegate-constructor-c). The semantics of the existing code would just create a new, separate UnAckedMessageTrackerEnabled on the stack in the constructor scope, then immediately destroy it! I corrected the syntax to ensure this works correctly, and fixed up the other constructor to properly use C++ initializer list syntax. Finally, I removed some dangerous c-style casts (which should never be used) in favor of C++ static_cast.
Modifications
Fixed the variable ordering and constructor syntax, as well as my old off-by-one bug. I checked for other possible instances of this bug, but it appears all of the other deadline timers are declared after the objects containing their io_service.
Docs
No documentation changes are necessary - it's just a bug fix.
Verifying this change
Change is covered by existing tests, or is otherwise difficult to test. All unit tests pass.