-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BugFix] Fix the bug of SinkIOBuffer user-after-free #40674
Conversation
Signed-off-by: trueeyu <lxhhust350@qq.com>
LOG(WARNING) << "SinkIOBuffer join queue failed: " << _exec_queue_id->value; | ||
} | ||
} | ||
} | ||
|
||
virtual Status prepare(RuntimeState* state, RuntimeProfile* parent_profile) = 0; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
Memory leak due to not deleting _exec_queue_id
after its use.
You can modify the code like this:
@@ -53,7 +53,19 @@ class SinkIOBuffer {
public:
SinkIOBuffer(int32_t num_sinkers) : _num_result_sinkers(num_sinkers), _exec_queue_id(nullptr) {}
virtual ~SinkIOBuffer() {
if (_exec_queue_id != nullptr) {
// If `Operator` prepare failed, there is no chance to stop queue, so will should stop queue here.
// It is safe to call stop multiple times.
if (bthread::execution_queue_stop(*_exec_queue_id) != 0) {
LOG(WARNING) << "SinkIOBuffer stop queue failed: " << _exec_queue_id->value;
}
if (bthread::execution_queue_join(*_exec_queue_id) != 0) {
LOG(WARNING) << "SinkIOBuffer join queue failed: " << _exec_queue_id->value;
}
+ delete _exec_queue_id; // Free the allocated memory for _exec_queue_id.
+ _exec_queue_id = nullptr; // Safely point _exec_queue_id to nullptr after deletion.
}
}
virtual Status prepare(RuntimeState* state, RuntimeProfile* parent_profile) = 0;
This modification ensures that the memory allocated for _exec_queue_id
is properly released when the SinkIOBuffer
destructor is called, thus preventing a memory leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_exec_queue_id is std::unique_ptr, so it's safe here.
[FE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
@Mergifyio backport branch-3.2 |
@Mergifyio backport branch-3.1 |
@Mergifyio backport branch-3.0 |
@Mergifyio backport branch-2.5 |
✅ Backports have been created
|
✅ Backports have been created
|
✅ Backports have been created
|
✅ Backports have been created
|
[BE Incremental Coverage Report]❌ fail : 6 / 8 (75.00%) file detail
|
Signed-off-by: trueeyu <lxhhust350@qq.com> (cherry picked from commit fe74760)
Signed-off-by: trueeyu <lxhhust350@qq.com> (cherry picked from commit fe74760)
Signed-off-by: trueeyu <lxhhust350@qq.com> (cherry picked from commit fe74760) # Conflicts: # be/src/exec/pipeline/sink/export_sink_operator.cpp # be/src/exec/pipeline/sink/file_sink_operator.cpp # be/src/exec/pipeline/sink/mysql_table_sink_operator.cpp # be/test/exec/sink/sink_io_buffer_test.cpp
Signed-off-by: trueeyu <lxhhust350@qq.com> (cherry picked from commit fe74760) # Conflicts: # be/src/exec/pipeline/sink/export_sink_operator.cpp # be/src/exec/pipeline/sink/file_sink_operator.cpp # be/src/exec/pipeline/sink/mysql_table_sink_operator.cpp # be/test/exec/sink/sink_io_buffer_test.cpp
#40708) Signed-off-by: trueeyu <lxhhust350@qq.com> Co-authored-by: trueeyu <lxhhust350@qq.com>
#40707) Co-authored-by: trueeyu <lxhhust350@qq.com>
#40709) Signed-off-by: trueeyu <lxhhust350@qq.com> Co-authored-by: trueeyu <lxhhust350@qq.com>
#40706) Co-authored-by: trueeyu <lxhhust350@qq.com>
Why I'm doing:
This problem has been fixed about 6 times, but it still hasn’t been completely fixed.
Previous PRs that fixed this issue:
Currently branch-main can reproduce this bug by adding sleep:
#40633
The bug in PR3 is that: join io queue in bthread io task so gets stuck.
The bug in PR5 is that the io thread has not exited and _is_finished has already been set to True, causing the
Operator
to be destructed first.The bug in PR6. The reason is that set_finishing puts the request into the queue first, and then ++_num_pending_chunks, causing problems with DCHECK.
What I'm doing:
Repair principles:
Operator
is destructed.How to fix:
SinkIOBuffer
is destructed, you need to wait for the bthread queue to exit.Operator
prepare failed, there is no chance to stop queue, so will should stop queue in destructor ofSinkIOBuffer
. It is safe to call queue::stop multiple times.What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: