Attempt to fix assertion failure in component shutdown handler #1473
The code looks alright to me. I have a few minor requests. Let's hope this fixes the actual issue.
On my local machine, this works as well:
--- a/libvast/src/system/node.cpp
+++ b/libvast/src/system/node.cpp
@@ -611,16 +611,7 @@ node(node_actor::stateful_pointer<node_state> self, std::string name, path dir,
for (const auto& label : remaining)
schedule_teardown(label);
// Finally, bring down the filesystem.
- // FIXME: there's a super-annoying bug that makes it impossible to receive a
- // DOWN message from the filesystem during shutdown, but *only* when the
- // filesystem is detached! This might be related to a bug we experienced
- // earlier: https://github.com/actor-framework/actor-framework/issues/1110.
- // Until it gets fixed, we cannot add the filesystem to the set of
- // sequentially terminated actors but instead let it implicitly terminate
- // after the node exits when the filesystem ref count goes to 0. (A
- // shutdown after the node won't be an issue because the filesystem is
- // currently stateless, but this needs to be reconsidered when it changes.)
- // components.push_back(std::move(*filesystem));
+ components.push_back(std::move(*filesystem));
auto shutdown_kill_timeout = shutdown_grace_period / 5;
shutdown<policy::sequential>(self, std::move(components),
shutdown_grace_period, shutdown_kill_timeout);
Let's see if CI is happy. I might add this as a separate commit afterwards.
This also still needs a changelog entry.
This just needs the changelog entry, then it's ready to go imo.
Killed components were never removed from the registry because demonitoring means their DOWN handler doesn't get invoked. We remove the call to demonitor in order to make sure the cleanup code in the DOWN handler gets executed.
This commit introduces utility functions to register/deregister components in the common case to prevent unmatched monitor/demonitor calls.
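The idea behind the two commits above can be illustrated with a small, framework-free sketch. Everything in it is hypothetical: `node_sketch`, `register_component`, `deregister_component`, and `on_down` are stand-ins invented for illustration and are not the CAF or VAST API. It only models the rule that a DOWN handler runs for monitored actors, and that pairing registration with monitoring keeps the registry and the monitor set in sync.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for an actor handle; not a CAF type.
using actor_id = int;

// Hypothetical model of a node that tracks components by label.
struct node_sketch {
  std::map<std::string, actor_id> registry; // label -> component
  std::set<actor_id> monitored;             // actors we expect DOWN from

  // Registers and monitors in one step, so the two cannot drift apart.
  bool register_component(const std::string& label, actor_id id) {
    if (!registry.emplace(label, id).second)
      return false; // label already taken
    monitored.insert(id);
    return true;
  }

  // Deregisters and demonitors in one step.
  bool deregister_component(const std::string& label) {
    auto it = registry.find(label);
    if (it == registry.end())
      return false;
    monitored.erase(it->second);
    registry.erase(it);
    return true;
  }

  // Models DOWN delivery: the handler only runs for monitored actors.
  // Returns true if the handler ran and cleaned up the registry.
  bool on_down(actor_id id) {
    if (monitored.erase(id) == 0)
      return false; // demonitored: no DOWN message, no cleanup
    for (auto it = registry.begin(); it != registry.end(); ++it) {
      if (it->second == id) {
        registry.erase(it);
        break;
      }
    }
    return true;
  }

  // The invariant: every component in the registry is monitored.
  bool invariant_holds() const {
    for (const auto& entry : registry)
      if (monitored.count(entry.second) == 0)
        return false;
    return true;
  }
};
```

In this model, demonitoring a component by hand (erasing it from `monitored` without deregistering it) leaves a stale registry entry, because `on_down` never performs its cleanup. That mirrors the bug described in the first commit message.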
📔 Description
This PR attempts to fix an assertion failure in the node's DOWN handler. The invariant should be that components in the registry are always monitored.
📝 Checklist
🎯 Review Instructions
Commit-by-commit.