Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] Fixing use-after-free and constructor bugs in UnAckedMessageTrackerEnabled #11630

Merged
merged 4 commits into from
Aug 12, 2021

Conversation

oversearch
Copy link
Contributor

@oversearch oversearch commented Aug 10, 2021

Motivation

This is very similar to a previous fix I submitted in commit 87ebe80. It's the same basic problem, but this class isn't part of the HandlerBase hierarchy, so it needs an independent fix. Essentially, when we create Boost ASIO timer objects from a connection pointer, they maintain a bare reference to the corresponding io_service object inside the connection object. When the destructor runs, we need to destroy the timer before the connection object. Keeping the correct order of these member variables is crucial to ensure we don't hit a use-after-free scenario. This was crashing some of our service code in rare circumstances, and is easily caught with Valgrind or ASAN.

While I was at it, I also noticed a rather serious bug in one of the UnAckedMessageTrackerEnabled constructors: I believe the intent here was to use the c++11 "delegating constructors" feature, but I think it's written using the Java style, which doesn't work in C++ (see https://stackoverflow.com/questions/13961037/delegate-constructor-c). The semantics of the existing code would just create a new, separate UnAckedMessageTrackerEnabled on the stack in the constructor scope, then immediately destroy it! I corrected the syntax to ensure this works correctly, and fixed up the other constructor to properly use C++ initializer list syntax. Finally, I removed some dangerous c-style casts (which should never be used) in favor of C++ static_cast.

Modifications

Fixed the variable ordering and constructor syntax, as well as my old off-by-one bug. I checked for other possible instances of this bug, but it appears all of the other deadline timers are declared after the objects containing their io_service.

Docs

No documentation changes are necessary - it's just a bug fix.

Verifying this change

  • Make sure that the change passes the CI checks.

Change is covered by existing tests, or is otherwise difficult to test. All unit tests pass.

@merlimat merlimat marked this pull request as ready for review August 10, 2021 22:56
@merlimat merlimat added component/c++ type/bug The PR fixed a bug or issue reported a bug labels Aug 10, 2021
@merlimat merlimat added this to the 2.9.0 milestone Aug 10, 2021
@codelipenghui codelipenghui reopened this Aug 11, 2021
@merlimat
Copy link
Contributor

@oversearch There seems to be a segfault during the execution of Python tests: https://github.com/apache/pulsar/pull/11630/checks?check_run_id=3295994482

@Anonymitaet
Copy link
Member

Thanks for your contribution. For this PR, do we need to update docs?

(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

@oversearch
Copy link
Contributor Author

Ah yes, sorry about leaving out the docs section. I had assumed leaving it out would imply "no" but I'll make that explicit.

I'm also looking into the python test segfault. I realized the python tests weren't building properly in my environment, and I'm working on getting that sorted out.... I might need to upgrade the Python version in my dev environment (currently 3.6).

…. This was caught by ASAN in my company's build system.
…ckerEnabled

This is very similar to a previous fix I submitted in commit 87ebe80.  It's the same basic problem, but this class isn't part of the HandlerBase hierarchy, so it needs an independent fix.  Essentially, when we create Boost ASIO timer objects from a connection pointer, they maintain a bare reference to the corresponding io_service object inside the connection object.  When the destructor runs, we need to destroy the timer *before* the connection object.  Keeping the correct order of these member variables is crucial to ensure we don't hit a use-after-free scenario.  This was crashing some of our service code in rare circumstances, and is easily caught with Valgrind or ASAN.

I also noticed a rather serious bug in one of the UnAckedMessageTrackerEnabled constructors: I believe the intent here was to use the c++11 "delegating constructors" feature, but I think it's written using the Java style, which doesn't work in C++ (see https://stackoverflow.com/questions/13961037/delegate-constructor-c). The semantics of the existing code would just create a new, separate UnAckedMessageTrackerEnabled on the stack in the constructor scope, then immediately destroy it!  I corrected the syntax to ensure this works correctly, and fixed up the other constructor to properly use C++ initializer list syntax.  Finally, I removed some dangerous c-style casts (which should *never* be used) in favor of C++ static_cast.
@oversearch
Copy link
Contributor Author

I wasn't able to reproduce the stack dump in the tests in my local environment. Everything passes except for an obscure error about a bytes/string type mismatch in the test_encryption Python test that I think is related to the older version of Python I'm running. I rebased on master and pushed again, and it seems to have passed this time. Fingers crossed on the rest of the checks...

@merlimat merlimat added the doc-not-needed Your PR changes do not impact docs label Aug 12, 2021
@merlimat merlimat merged commit 0748ecb into apache:master Aug 12, 2021
@oversearch oversearch deleted the fix/cpp_bugs branch August 13, 2021 07:40
codelipenghui pushed a commit that referenced this pull request Sep 9, 2021
…ckerEnabled (#11630)

* [C++] Fix an off-by-one error in my original change from commit 83f0345.  This was caught by ASAN in my company's build system.

* [C++] Fixing use-after-free and constructor bugs in UnAckedMessageTrackerEnabled

This is very similar to a previous fix I submitted in commit 87ebe80.  It's the same basic problem, but this class isn't part of the HandlerBase hierarchy, so it needs an independent fix.  Essentially, when we create Boost ASIO timer objects from a connection pointer, they maintain a bare reference to the corresponding io_service object inside the connection object.  When the destructor runs, we need to destroy the timer *before* the connection object.  Keeping the correct order of these member variables is crucial to ensure we don't hit a use-after-free scenario.  This was crashing some of our service code in rare circumstances, and is easily caught with Valgrind or ASAN.

I also noticed a rather serious bug in one of the UnAckedMessageTrackerEnabled constructors: I believe the intent here was to use the c++11 "delegating constructors" feature, but I think it's written using the Java style, which doesn't work in C++ (see https://stackoverflow.com/questions/13961037/delegate-constructor-c). The semantics of the existing code would just create a new, separate UnAckedMessageTrackerEnabled on the stack in the constructor scope, then immediately destroy it!  I corrected the syntax to ensure this works correctly, and fixed up the other constructor to properly use C++ initializer list syntax.  Finally, I removed some dangerous c-style casts (which should *never* be used) in favor of C++ static_cast.

* [C++] Applied clang-format to previous change

* [C++] Apparently clang-format didn't like this one either....

(cherry picked from commit 0748ecb)
@codelipenghui codelipenghui added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Sep 9, 2021
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
…ckerEnabled (apache#11630)

* [C++] Fix an off-by-one error in my original change from commit 83f0345.  This was caught by ASAN in my company's build system.

* [C++] Fixing use-after-free and constructor bugs in UnAckedMessageTrackerEnabled

This is very similar to a previous fix I submitted in commit 87ebe80.  It's the same basic problem, but this class isn't part of the HandlerBase hierarchy, so it needs an independent fix.  Essentially, when we create Boost ASIO timer objects from a connection pointer, they maintain a bare reference to the corresponding io_service object inside the connection object.  When the destructor runs, we need to destroy the timer *before* the connection object.  Keeping the correct order of these member variables is crucial to ensure we don't hit a use-after-free scenario.  This was crashing some of our service code in rare circumstances, and is easily caught with Valgrind or ASAN.

I also noticed a rather serious bug in one of the UnAckedMessageTrackerEnabled constructors: I believe the intent here was to use the c++11 "delegating constructors" feature, but I think it's written using the Java style, which doesn't work in C++ (see https://stackoverflow.com/questions/13961037/delegate-constructor-c). The semantics of the existing code would just create a new, separate UnAckedMessageTrackerEnabled on the stack in the constructor scope, then immediately destroy it!  I corrected the syntax to ensure this works correctly, and fixed up the other constructor to properly use C++ initializer list syntax.  Finally, I removed some dangerous c-style casts (which should *never* be used) in favor of C++ static_cast.

* [C++] Applied clang-format to previous change

* [C++] Apparently clang-format didn't like this one either....
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-picked/branch-2.8 Archived: 2.8 is end of life doc-not-needed Your PR changes do not impact docs release/2.8.2 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants