Managing system restarts from within the components #20

Closed
vspinu opened this issue May 18, 2017 · 7 comments

@vspinu

vspinu commented May 18, 2017

Hi,

What is the idiomatic way to manage faults and restarts from within the system? Let's say I have a web socket and a couple of components that depend on that socket:

(def config
  {:feed/db {:name :blabla
             :foo (atom nil)
             :bar (atom nil)}
   :feed/ws {:url "wss://ws-feed.example.com"}
   :periodic/ping {:feed (ig/ref :feed/ws)
                   :period 1000
                   :db (ig/ref :feed/db)}
   :periodic/balances {:feed (ig/ref :feed/ws)
                       :period 60000
                       :db (ig/ref :feed/db)}})

When the :feed/ws web socket breaks for external reasons, I would like to automatically restart it and every component that depends on it. Thanks.

@vspinu
Author

vspinu commented May 18, 2017

The above use case is not limited to the components restarting themselves. For example, a watchdog component that checks the system health should be able to restart parts of the system as needed. Thus, such components must be able to access the system var somehow. How would you go about implementing this?

@weavejester
Owner

There are two broad solutions to this.

One solution is that the component itself can take care of restarting connections. A common example of this is a SQL connection pool. If a connection is broken, the pool will give the user a new connection. All I/O connections should be wrapped in boundaries anyway, both to allow easier testing and to loosen the coupling between the connection and the rest of your code.

Another solution is to introduce a watchdog component and pass in the components you want to watch as references. For example:

{:example/server {:port 8080}
 :example/watchdog #{#ig/ref :example/server}}

Perhaps the server adheres to a protocol that allows it to be restarted.

Alternatively, and usually preferably, this can be done at a system level. If something goes wrong and we lose a connection or the server goes down, we can just log the problem then exit the application with a failure code. If we're using something like systemd to manage our application, it will be restarted automatically.
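The system-level approach can be sketched as a callback that logs and then exits with a non-zero code. This is only an illustration: `fail-fast-handler` and its `exit-fn` parameter are hypothetical names, not part of Integrant, and the exit function is injectable purely so the handler can be exercised without killing the JVM.

```clojure
;; Hypothetical helper: returns a zero-argument callback suitable for an
;; "on close" or "on error" hook. It logs to stderr and then exits with
;; status 1 so that an external supervisor can restart the process.
(defn fail-fast-handler
  ([component-name]
   (fail-fast-handler component-name (fn [code] (System/exit (int code)))))
  ([component-name exit-fn]
   (fn []
     (binding [*out* *err*]
       (println "Fatal:" component-name "connection lost; exiting"))
     (exit-fn 1))))
```

With systemd, the unit would also need something like `Restart=on-failure` for the non-zero exit to trigger a restart.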

@vspinu
Author

vspinu commented May 18, 2017

> All I/O connections should be wrapped in boundaries anyway,

I guess this is what I was missing. I was exposing too much of the component to its children.

Are there reasonably complete examples of systems built on Integrant somewhere, something like those in system/examples? I have used neither component nor mount, so I am struggling a bit with the "grand design" side of things.

@vspinu
Author

vspinu commented May 18, 2017

> {:example/server {:port 8080}
>  :example/watchdog #{#ig/ref :example/server}}

But for this to work, I need to design :example/server as a mutable object so that I can restart it in place. Is this what you meant?

@weavejester
Owner

Ah, sorry, I should have explained further. Boundaries are a Duct concept, and as I've been writing a lot of Duct recently, and the new Duct alpha makes heavy use of Integrant, I forgot I was replying to an issue on the Integrant repository, and not the Duct repository.

So let me start again 😃 .

I've found it good practice to avoid tightly coupling my functional code with the code that handles I/O. Some languages, like Haskell, enforce this distinction; in Clojure we have to have a little more self-discipline.

A websocket is a little complex for an example, because we need to handle channel closing, errors, and so forth. Instead, consider a SQL connection pool, which already does all those things for us. We could interact directly with the connection:

(defn get-user [spec email]
  (jdbc/query spec ["SELECT * FROM users WHERE email = ?" email]))

(defmethod ig/init-key :database/sql [_ options]
  {:datasource (db/connect-pool options)})

But I've found it's useful to add a layer in between to loosen the coupling:

(defprotocol Users
  (get-user [db email]))

(defrecord DatabaseBoundary [spec]
  Users
  (get-user [_ email]
    (jdbc/query spec ["SELECT * FROM users WHERE email = ?" email])))

(defmethod ig/init-key :database/sql [_ options]
  (->DatabaseBoundary {:datasource (db/connect-pool options)}))

In this example, the DatabaseBoundary record does very little, but it still provides us with a way of mocking out the database when required. In a more complex example, we can manage connections and reconnections, and use the protocol to abstract that away.
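As a rough sketch of the mocking point, an in-memory record can implement the same `Users` protocol. `MockUsers`, `user-name`, and the sample data are hypothetical names made up for this illustration; only the protocol comes from the comment above.

```clojure
;; Same protocol as in the boundary example above.
(defprotocol Users
  (get-user [db email]))

;; Hypothetical in-memory stand-in for DatabaseBoundary. `users` is a map
;; of email -> row. Like a query, get-user returns a sequence of rows.
(defrecord MockUsers [users]
  Users
  (get-user [_ email]
    (if-let [user (get users email)]
      [user]
      [])))

;; Hypothetical application code: it depends only on the protocol,
;; so it cannot tell the mock from the real boundary.
(defn user-name [db email]
  (:name (first (get-user db email))))

(def mock-db
  (->MockUsers {"alice@example.com" {:email "alice@example.com"
                                     :name  "Alice"}}))

(user-name mock-db "alice@example.com") ;; => "Alice"
```

Because `user-name` only calls the protocol method, a test can pass `mock-db` where the running system would pass the real boundary.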

@weavejester
Owner

> But in order for this to work I need to design the :example/server as mutable object such that I could restart it in-place. Is this what you meant?

The :example/watchdog key would need some internal mutation, but not necessarily the :example/server key, so long as no other keys depended on the server.

However, now that I've thought about it a little further, I don't think I'd recommend a :example/watchdog approach. In general, I think the safest option is to stop the system when something unexpected happens. Checking the system's health shouldn't be necessary if we kill it off the moment it looks sick.

For connections and so forth, we can give the component itself a connection pool, or some way of restarting the connection when it's dropped. This is a common problem, so there may be libraries out there to simplify this.
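One way to give a component "some way of restarting the connection" is to keep the live connection in an atom behind a small protocol. The sketch below is an assumption-laden illustration, not an Integrant feature: `Reconnectable`, `ReconnectingFeed`, `reconnecting-feed`, and `connect-fn` are all hypothetical names, and `connect-fn` stands for whatever zero-argument function opens a real connection.

```clojure
;; Hypothetical boundary that owns its connection and can replace it.
(defprotocol Reconnectable
  (connection [this] "Return the current connection.")
  (reconnect! [this] "Open a new connection and make it the current one."))

;; `conn` is an atom holding the current connection; `connect-fn` is a
;; zero-argument function that opens a fresh one.
(defrecord ReconnectingFeed [connect-fn conn]
  Reconnectable
  (connection [_] @conn)
  (reconnect! [_] (reset! conn (connect-fn))))

(defn reconnecting-feed
  "Opens an initial connection with connect-fn and wraps it."
  [connect-fn]
  (->ReconnectingFeed connect-fn (atom (connect-fn))))

;; Usage with a fake connection that just counts how often it was opened.
(def attempts (atom 0))
(def feed (reconnecting-feed (fn [] {:id (swap! attempts inc)})))

(connection feed) ;; => {:id 1}
(reconnect! feed)
(connection feed) ;; => {:id 2}
```

Dependent keys would hold the record and call `connection` each time they need the socket, so they never keep a stale reference and do not have to be restarted themselves. A real version would also close the old connection and call `reconnect!` from the library's close or error callback.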

@vspinu
Author

vspinu commented May 18, 2017

> Boundaries are a Duct concept,

Yes. I read your note on boundaries here and will try to follow the advice from now on.

> Checking the system's health shouldn't be necessary if we kill it off the moment it looks sick.

I have multiple websockets open (listening for transactions) and restarting the full application on every websocket disconnect is really not an option.

> This is a common problem, so there may be libraries out there to simplify this.

I am using manifold, which provides an on-close callback but, AFAICS, has no provision for reconnection. So for now I will simply keep the connection in an atom or a mutable deftype and reset it when needed.

Thank you for all the input!
