If the discovery service crashes we don't get a signal in the gateway #1202

@gbregman

Description

When I added code to the discovery service that sends itself a SIGTERM after 30 seconds, I saw that no SIGCHLD was sent to the gateway. So, even though the discovery service exited, the gateway kept running.

The problem seems to be that the discovery process doesn't exit when an exception is raised. The process stays stuck somewhere, so the parent never gets a signal.
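To illustrate the parent side of this, here is a minimal standalone sketch (POSIX only, not the gateway code) showing that a parent sees SIGCHLD only once the child process has really terminated. The helper name `sigchld_seen` and the child snippets are hypothetical, and the handler must be installed from the main thread.

```python
import signal
import subprocess
import sys
import time


def sigchld_seen(child_code, wait_seconds=5.0):
    """Spawn a Python child and report whether SIGCHLD arrived in time."""
    seen = []
    # Record every SIGCHLD delivered to this (parent) process.
    previous = signal.signal(signal.SIGCHLD, lambda signum, frame: seen.append(signum))
    proc = subprocess.Popen([sys.executable, "-c", child_code])
    try:
        deadline = time.monotonic() + wait_seconds
        while not seen and time.monotonic() < deadline:
            time.sleep(0.01)
        return bool(seen)
    finally:
        # Restore the old handler first so the cleanup kill is not recorded.
        signal.signal(signal.SIGCHLD, previous)
        proc.kill()
        proc.wait()


if __name__ == "__main__":
    print("child exits:     ", sigchld_seen("pass"))
    print("child stays alive:", sigchld_seen("import time; time.sleep(60)", 0.5))
```

A child that is still alive, for whatever reason, produces no SIGCHLD, which would match what the gateway observes here.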

A related issue is that right now the discovery process doesn't die when we send it a SIGTERM. It catches the signal and raises a SystemExit exception. But, as described above, the process doesn't exit on an exception. This can be easily fixed by:

diff --git a/control/server.py b/control/server.py
index 4f7e439..ec51896 100644
--- a/control/server.py
+++ b/control/server.py
@@ -445,6 +445,7 @@ class GatewayServer:
             self.omap_state = None
             self.name = None
             signal.signal(signal.SIGCHLD, signal.SIG_DFL)
+            signal.signal(signal.SIGTERM, signal.SIG_DFL)
             if self.server:
                 self.server.stop(None)
                 self.server = None
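The effect of restoring the default SIGTERM disposition can be sketched outside the gateway as follows (POSIX only). This is an illustration under assumptions, not the actual discovery code: the child script, the helper `sigterm_result`, and in particular the non-daemon thread used to keep the process alive after SystemExit are hypothetical stand-ins for whatever keeps the real process stuck.

```python
import signal
import subprocess
import sys

# Hypothetical child: its SIGTERM handler raises SystemExit, but a non-daemon
# thread (an assumed cause, not confirmed by the issue) keeps the process alive.
CHILD_TEMPLATE = """
import signal, threading, time

def on_sigterm(signum, frame):
    raise SystemExit(0)

signal.signal(signal.SIGTERM, on_sigterm)
{extra}
threading.Thread(target=time.sleep, args=(3600,)).start()
print("ready", flush=True)
time.sleep(3600)
"""


def sigterm_result(restore_default, timeout=1.0):
    """Send SIGTERM to the child; return its exit status, or None if it survives."""
    extra = "signal.signal(signal.SIGTERM, signal.SIG_DFL)" if restore_default else ""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_TEMPLATE.format(extra=extra)],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        proc.stdout.readline()  # wait until the child has set up its handlers
        proc.send_signal(signal.SIGTERM)
        try:
            return proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            return None  # child is still running after SIGTERM
    finally:
        proc.kill()  # no-op if the child has already exited
        proc.wait()
        proc.stdout.close()


if __name__ == "__main__":
    print("custom handler: ", sigterm_result(restore_default=False))
    print("SIG_DFL restored:", sigterm_result(restore_default=True))
```

With the custom handler the child outlives the SIGTERM, while with `SIG_DFL` restored the kernel terminates it and the parent can reap it (and would get a SIGCHLD).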

    Status

    ✅ Done
