Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hinted handoff: use after free during shutdown #4836

Closed
vladzcloudius opened this issue Aug 12, 2019 · 1 comment
Closed

Hinted handoff: use after free during shutdown #4836

vladzcloudius opened this issue Aug 12, 2019 · 1 comment
Labels
Milestone

Comments

@vladzcloudius
Copy link
Contributor

@vladzcloudius vladzcloudius commented Aug 12, 2019

Installation details
HEAD: 77686ab

Description
ASAN caught a use-after-free during regular shut-down process during unit testing the fix for #4685.
(I had to patch space_watchdog's code in order to simulate the race, see the patch below).
It turns out that the resource manager shutdown is not performed in the correct way. It shuts down managers (which in turn shut down all underlying end_point_managers and deletes all end_point_managers instances) and only after that it shuts down a space_watchdog which may have been blocked waiting for an <some ep_manager object>::file_update_mutex().

The proper shut down procedure should be:

  1. Prevent hints from being generated.
  2. Stop a resource_manager which should be stopping a space_watchdog.
  3. Stop all managers.

Patch used to trigger the race

diff --git a/db/hints/resource_manager.hh b/db/hints/resource_manager.hh
index 453d51b51..864db57d0 100644
--- a/db/hints/resource_manager.hh
+++ b/db/hints/resource_manager.hh
@@ -51,7 +51,7 @@ class manager;
 class space_watchdog {
 private:
     using ep_key_type = gms::inet_address;
-    static const std::chrono::seconds _watchdog_period;
+    static const std::chrono::microseconds _watchdog_period;
 
     enum class state {
         stopping,               // stop() was called
diff --git a/db/hints/resource_manager.cc b/db/hints/resource_manager.cc
index 2ec4d67f0..dd990eaa2 100644
--- a/db/hints/resource_manager.cc
+++ b/db/hints/resource_manager.cc
@@ -60,7 +60,7 @@ future<semaphore_units<semaphore_default_exception_factory>> resource_manager::g
     return get_units(_send_limiter, hint_memory_budget);
 }
 
-const std::chrono::seconds space_watchdog::_watchdog_period = std::chrono::seconds(1);
+const std::chrono::microseconds space_watchdog::_watchdog_period = std::chrono::microseconds(1);
 
 space_watchdog::space_watchdog(shard_managers_set& managers, per_device_limits_map& per_device_limits_map)
     : _shard_managers(managers)
@@ -144,8 +144,10 @@ void space_watchdog::on_timer() {
                     // continue to enumeration - there is no one to change them.
                     auto it = shard_manager.find_ep_manager(de.name);
                     if (it != shard_manager.ep_managers_end()) {
-                        return with_lock(it->second.file_update_mutex(), [this, &shard_manager, dir = std::move(dir), ep_name = std::move(de.name)]() mutable {
-                            return scan_one_ep_dir(dir / ep_name, shard_manager, ep_key_type(ep_name));
+                        return with_lock(it->second.file_update_mutex(), [this, &shard_manager, dir = std::move(dir), ep_name = std::move(de.name)] () mutable {
+                            return sleep_abortable(std::chrono::seconds(5)).then([this, &shard_manager, dir = std::move(dir), ep_name = std::move(ep_name)] () mutable {
+                                return scan_one_ep_dir(dir / ep_name, shard_manager, ep_key_type(ep_name));
+                            });
                         });
                     } else {
                         return scan_one_ep_dir(dir / de.name, shard_manager, ep_key_type(de.name));
vladzcloudius pushed a commit to vladzcloudius/scylla that referenced this issue Aug 12, 2019
space_watchdog blocks on the end_point_manager::file_update_mutex().
Therefore when we shut scylla down we need to first stop the space_watchdog
(which is a part of resource_manager) and only after that shut all instances
of db::hints::manager down. Otherwise there may be a use-after-free event.

However we should also make sure that hints do not break the disk space
contract during the time frame when space_watchdog is down but
hints::managers are still up. In order to ensure that we will disable
hints storing before stopping space_watchdog.

Fixes scylladb#4836

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
@tzach tzach added the bug label Aug 26, 2019
@tzach tzach added this to the 3.2 milestone Aug 26, 2019
@bhalevy

This comment has been minimized.

Copy link
Contributor

@bhalevy bhalevy commented Oct 2, 2019

Refs #5089

avikivity added a commit that referenced this issue Oct 2, 2019
space_watchdog blocks on the end_point_manager::file_update_mutex().
Therefore when we shut scylla down we need to first stop the space_watchdog
(which is a part of resource_manager) and only after that shut all instances
of db::hints::manager down. Otherwise there may be a use-after-free event.

However we should also make sure that hints do not break the disk space
contract during the time frame when space_watchdog is down but
hints::managers are still up. In order to ensure that we will disable
hints storing before stopping space_watchdog.

Fixes #4836

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <20190812201857.5716-2-vladz@scylladb.com>
avikivity added a commit that referenced this issue Oct 3, 2019
…Vlad

"
Fix races that may lead to use-after-free events and file system level exceptions
during shutdown and drain.

The root cause of use-after-free events in question is that space_watchdog blocks on
end_point_hints_manager::file_update_mutex() and we need to make sure this mutex is alive as long as
it's accessed even if the corresponding end_point_hints_manager instance
is destroyed in the context of manager::drain_for().

File system exceptions may occur when space_watchdog attempts to scan a
directory while it's being deleted from the drain_for() context.
In case of such an exception new hints generation is going to be blocked
- including for materialized views, till the next space_watchdog round (in 1s).

Issues that are fixed are #4685 and #4836.

Tested as follows:
 1) Patched the code in order to trigger the race with (a lot) higher
    probability and running slightly modified hinted handoff replace
    dtest with a debug binary for 100 times. Side effect of this
    testing was discovering of #4836.
 2) Using the same patch as above tested that there are no crashes and
    nodes survive stop/start sequences (they were not without this series)
    in the context of all hinted handoff dtests. Ran the whole set of
    tests with dev binary for 10 times.
"

* 'hinted_handoff_race_between_drain_for_and_space_watchdog_no_global_lock-v2' of https://github.com/vladzcloudius/scylla:
  hinted handoff: fix a race on a directory removal between space_watchdog and drain_for()
  hinted handoff: make taking file_update_mutex safe
  db::hints::manager::drain_for(): fix alignment
  db::hints::manager: serialize calls to drain_for()
  db::hints: cosmetics: identation and missing method qualifier
avikivity added a commit that referenced this issue Oct 5, 2019
…Vlad

"
Fix races that may lead to use-after-free events and file system level exceptions
during shutdown and drain.

The root cause of use-after-free events in question is that space_watchdog blocks on
end_point_hints_manager::file_update_mutex() and we need to make sure this mutex is alive as long as
it's accessed even if the corresponding end_point_hints_manager instance
is destroyed in the context of manager::drain_for().

File system exceptions may occur when space_watchdog attempts to scan a
directory while it's being deleted from the drain_for() context.
In case of such an exception new hints generation is going to be blocked
- including for materialized views, till the next space_watchdog round (in 1s).

Issues that are fixed are #4685 and #4836.

Tested as follows:
 1) Patched the code in order to trigger the race with (a lot) higher
    probability and running slightly modified hinted handoff replace
    dtest with a debug binary for 100 times. Side effect of this
    testing was discovering of #4836.
 2) Using the same patch as above tested that there are no crashes and
    nodes survive stop/start sequences (they were not without this series)
    in the context of all hinted handoff dtests. Ran the whole set of
    tests with dev binary for 10 times.
"

Fixes #4685
Fixes #4836

* 'hinted_handoff_race_between_drain_for_and_space_watchdog_no_global_lock-v2' of https://github.com/vladzcloudius/scylla:
  hinted handoff: fix a race on a directory removal between space_watchdog and drain_for()
  hinted handoff: make taking file_update_mutex safe
  db::hints::manager::drain_for(): fix alignment
  db::hints::manager: serialize calls to drain_for()
  db::hints: cosmetics: identation and missing method qualifier

(cherry picked from commit 3cb081e)
avikivity added a commit that referenced this issue Oct 5, 2019
…Vlad

"
Fix races that may lead to use-after-free events and file system level exceptions
during shutdown and drain.

The root cause of use-after-free events in question is that space_watchdog blocks on
end_point_hints_manager::file_update_mutex() and we need to make sure this mutex is alive as long as
it's accessed even if the corresponding end_point_hints_manager instance
is destroyed in the context of manager::drain_for().

File system exceptions may occur when space_watchdog attempts to scan a
directory while it's being deleted from the drain_for() context.
In case of such an exception new hints generation is going to be blocked
- including for materialized views, till the next space_watchdog round (in 1s).

Issues that are fixed are #4685 and #4836.

Tested as follows:
 1) Patched the code in order to trigger the race with (a lot) higher
    probability and running slightly modified hinted handoff replace
    dtest with a debug binary for 100 times. Side effect of this
    testing was discovering of #4836.
 2) Using the same patch as above tested that there are no crashes and
    nodes survive stop/start sequences (they were not without this series)
    in the context of all hinted handoff dtests. Ran the whole set of
    tests with dev binary for 10 times.
"

Fixes #4685
Fixes #4836.

* 'hinted_handoff_race_between_drain_for_and_space_watchdog_no_global_lock-v2' of https://github.com/vladzcloudius/scylla:
  hinted handoff: fix a race on a directory removal between space_watchdog and drain_for()
  hinted handoff: make taking file_update_mutex safe
  db::hints::manager::drain_for(): fix alignment
  db::hints::manager: serialize calls to drain_for()
  db::hints: cosmetics: identation and missing method qualifier

(cherry picked from commit 3cb081e)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.