-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a timer crash caused by invalid memory management #9656
Conversation
@jay-zhuang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch and thanks @jay-zhuang for the fix! Overall LGTM except a few minor comments/questions.
util/timer.h
Outdated
auto it = map_.find(fn_name); | ||
if (it == map_.end()) { | ||
heap_.push(fn_info.get()); | ||
map_.emplace(std::make_pair(fn_name, std::move(fn_info))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just do map_.emplace(fn_name, std::move(fn_info))
.
const std::string& fn_name, | ||
uint64_t start_after_us, | ||
uint64_t repeat_every_us) { | ||
bool Add(std::function<void()> fn, const std::string& fn_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should update the comments in line 51. Specifically, fn_name
has to be unique, and we disallow adding function with the same name to timer.
auto it = map_.find(fn_name); | ||
if (it == map_.end()) { | ||
heap_.push(fn_info.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use try_emplace()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you mean map_.try_emplace()
, updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I mean something like
auto res = map_.try_emplace(fn_name, std::move(fn_info));
if (!res.second) {
return false;
}
heap_.push(res.first.second.get());
uint64_t repeat_every_us) { | ||
bool Add(std::function<void()> fn, const std::string& fn_name, | ||
uint64_t start_after_us, uint64_t repeat_every_us) { | ||
InstrumentedMutexLock l(&mutex_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we can avoid dynamic memory allocation inside this mutex by calling NowMicros() and setting next run time after mutex acquision.
std::unique_ptr<> fn_info(new FunctionInfo(..., dummy_time,...));
InstrumentedMutexLock l(&mutex_);
now = NowMicros();
fn_info->SetNextRunTime(now + start_after_us);
...
util/timer.h
Outdated
assert(!executing_task_ || | ||
fn_info->next_run_time_us >= heap_.top()->next_run_time_us); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we should return false
here instead of aborting the process if we hit this condition?
@jay-zhuang has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jay-zhuang. LGTM except a minor comment and one of the CIs complaining about unused var.
auto it = map_.find(fn_name); | ||
if (it == map_.end()) { | ||
heap_.push(fn_info.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I mean something like
auto res = map_.try_emplace(fn_name, std::move(fn_info));
if (!res.second) {
return false;
}
heap_.push(res.first.second.get());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to confirm my understanding: Is the problem we remove the same name twice from map_
and the second time fails? If so, does the duplicate entry come from popping the wrong thing from heap_
due to something added with earlier running time than the executing function? Then we re-add an existing thing to heap_
which would cause two map_.erase()
s after it's Cancel()
d?
Anyways the new code to keep heap_
and map_
in sync looks great!
Yes, but it fails before we try to remove the item the second time as it's memory already be released including the string
Yes, the duplicated entry come from popping the wrong item (it suppose to be popping the current running one then update the next_run_time and re-add it back. but in this case, it's popping the new added one as it's timestamp is newer than the current running one, so after that the current running one is duplicated in the queue). |
@jay-zhuang has updated the pull request. You must reimport the pull request before landing. |
Timer crash when multiple DB instances doing heavy DB open and close operations concurrently. Which is caused by adding a timer task with smaller timestamp than the current running task. Fix it by moving the getting new task timestamp part within timer mutex protection. And other fixes: - Disallow adding duplicated function number to timer - Fix a memory leak in timer when current running task is cancelled
@jay-zhuang has updated the pull request. You must reimport the pull request before landing. |
@jay-zhuang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: Timer crash when multiple DB instances doing heavy DB open and close operations concurrently. Which is caused by adding a timer task with smaller timestamp than the current running task. Fix it by moving the getting new task timestamp part within timer mutex protection. And other fixes: - Disallow adding duplicated function name to timer - Fix a minor memory leak in timer when a running task is cancelled Pull Request resolved: #9656 Reviewed By: ajkr Differential Revision: D34626296 Pulled By: jay-zhuang fbshipit-source-id: 6b6d96a5149746bf503546244912a9e41a0c5f6b
@@ -372,6 +372,7 @@ Compaction* UniversalCompactionBuilder::PickCompaction() { | |||
const int kLevel0 = 0; | |||
score_ = vstorage_->CompactionScore(kLevel0); | |||
sorted_runs_ = CalculateSortedRuns(*vstorage_); | |||
fprintf(stdout, "JJJ1\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete test code?
Timer crash when multiple DB instances doing heavy DB open and close
operations concurrently. Which is caused by adding a timer task with
smaller timestamp than the current running task. Fix it by moving the
getting new task timestamp part within timer mutex protection.
And other fixes: