This repository has been archived by the owner on Nov 20, 2020. It is now read-only.

Fault tolerance hierarchies #142

Closed
eulerfx opened this issue May 3, 2017 · 0 comments

eulerfx commented May 3, 2017

A Kafunk consumer or producer consists of several independently running processes that are nonetheless related. Take the consumer as an example: it consists of a heartbeat process and a fetch process. An error code returned to the heartbeat process should cause the fetch process to stop and restart. An error code returned to the fetch process indicating a topology change should cause the fetch process to restart only after fetching metadata. These relationships also exist at lower levels. For example, in the TCP channel module, the TCP receive process may fail because of a TCP connectivity error or a closed connection. This should cause the TCP channel to attempt reconnection, eventually escalating the error to the Kafka cluster connection.

Async.StartChild starts a process from within an existing process. The processes are linked via a shared CancellationToken; however, failure of the child process does not cause the parent process to fail. Likewise, failure of the parent process does not propagate to the child. A proposed extension to this is here.
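To make the child-to-parent half of this concrete, here is a minimal sketch of how I understand Async.StartChild to behave. The function names are made up for illustration. The child's exception is captured in its result and only resurfaces if the parent awaits the handle that StartChild returns.

```fsharp
/// Parent starts a failing child but never awaits the child's result.
let runParentIgnoringChild () : string =
  let child : Async<unit> = async { failwith "child failed" }
  async {
    // StartChild returns a handle (Async<unit>) that is deliberately discarded here.
    let! _handle = Async.StartChild child
    do! Async.Sleep 100
    return "parent completed" }
  |> Async.RunSynchronously

/// Parent awaits the child's result, so the child's exception is rethrown in the parent.
let runParentAwaitingChild () : string =
  let child : Async<unit> = async { failwith "child failed" }
  async {
    let! handle = Async.StartChild child
    try
      do! handle
      return "parent completed"
    with ex ->
      return sprintf "parent observed: %s" ex.Message }
  |> Async.RunSynchronously
```

In the first function the failure is silently dropped, which is the gap a linked hierarchy would close.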

The proposal consists of types:

/// A node represents a site where computations can occur.
/// It can be linked to other nodes, such that errors on one cause errors in the other.
type Node ~= IVar<unit>

val node : unit -> Node
val startChild : Node -> Async<unit> -> Node
val tryAsync : Node -> Async<'a> -> Async<'a>

/// A process is a node-dependent computation.
type Proc<'a> ~= Node -> Async<'a>

/// Example
let go (a:Async<'a>) = proc {
  // evaluates an async computation in the context
  // of the ambient node
  let! a = a

  // starts a linked async computation
  let! t = Proc.startChildAsTask (async { printfn "hello" })
  
  let! node = Proc.node
  do! Proc.fail (exn("oh no"))

  return 1 }
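
The signatures above could be realized roughly as follows. This is only a sketch under assumptions, not the implementation from the linked proposal: TaskCompletionSource<unit> stands in for IVar<unit>, and the helpers `fail` and `onFailure` are hypothetical names introduced here.

```fsharp
open System.Threading.Tasks

/// Stand-in for IVar<unit> (an assumption): a write-once cell that is faulted when the node fails.
type Node = TaskCompletionSource<unit>

let node () : Node = TaskCompletionSource<unit>()

/// Hypothetical helper: marks the node as failed; repeated calls are no-ops.
let fail (n: Node) (e: exn) : unit =
  n.TrySetException(e) |> ignore

/// Hypothetical helper: invokes the handler if the node fails.
/// Depending on the FSharp.Core version, e may arrive wrapped in an AggregateException.
let onFailure (n: Node) (handler: exn -> unit) : unit =
  Async.Start (async {
    try
      do! Async.AwaitTask n.Task
    with e ->
      handler e })

/// Starts work on a fresh child node linked to the parent:
/// a failure on either node fails the other.
let startChild (parent: Node) (work: Async<unit>) : Node =
  let child = node ()
  onFailure parent (fail child)
  onFailure child (fail parent)
  Async.Start (async {
    try
      do! work
    with e ->
      fail child e })
  child

/// Runs work in the context of a node: an exception fails the node and is then rethrown.
let tryAsync (n: Node) (work: Async<'a>) : Async<'a> = async {
  try
    return! work
  with e ->
    fail n e
    return raise e }
```

With this, `startChild parent (async { failwith "boom" })` would fault `parent.Task` shortly after the child fails, and a supervisor holding the parent node could react by restarting it.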

Another pattern is that of NodeState<'s> and a node-state-dependent operation, currently expressed as Resource. A node state may be a Socket instance, and a dependent operation may be a read from the socket: Socket -> ArraySegment<byte> -> Async<int>. The result of the operation may indicate that the node must be restarted or failed.
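
As a rough illustration of an operation whose result decides whether the node restarts or fails, consider the sketch below. The `Outcome` type and `runWithState` are hypothetical names and do not reflect Kafunk's actual Resource API. The state is recreated on every restart, up to a retry budget.

```fsharp
/// Hypothetical outcome of running an operation against the current node state.
type Outcome<'a> =
  | Completed of 'a
  | Restart of exn    // the state is stale: recreate it and retry
  | Escalate of exn   // the node must fail: propagate the error

/// Creates the state, runs the operation against it, and recreates the state
/// when the operation asks for a restart, at most `retries` times.
let rec runWithState (create: unit -> Async<'s>) (op: 's -> Async<Outcome<'a>>) (retries: int) : Async<'a> = async {
  let! state = create ()
  let! outcome = op state
  match outcome with
  | Completed a -> return a
  | Restart _ when retries > 0 -> return! runWithState create op (retries - 1)
  | Restart e | Escalate e -> return raise e }
```

For a socket, `create` would open the connection and `op` would wrap the read, mapping a closed connection to `Restart` and an unrecoverable error to `Escalate`.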

We borrow design principles from the Erlang supervision hierarchy.

@eulerfx eulerfx closed this as completed Mar 19, 2018