A library that implements the sagas pattern for Erlang
mrallen1 Merge pull request #1 from mrallen1/rename
Rename the abstractions
Latest commit cf5604f May 30, 2017

Gisla

This is a library for Erlang that implements the Saga pattern for error recovery/cleanup in distributed transactions. The saga pattern describes two call flows, a forward flow that represents progress, and an opposite rollback flow which represents recovery and cleanup activities.

Concept

The saga pattern is a useful way to recover from failures in long-running distributed transactions.

For example:

  • you want to set up some cloud infrastructure which is dependent on other cloud infrastructure, and if that other cloud infrastructure fails, you want to clean up the resource allocations already performed.

  • you have a microservice architecture and you have resources which need to be cleaned up if a downstream service fails

Using this library, you can create a "transaction" - a matched set of operations organized into "steps" which will be executed one after the other. The new state from the previous operation will be fed into subsequent function calls.

Each step pairs a "forward" operation, which represents progress, with a "rollback" operation, which undoes it. When a "forward" operation fails, gisla uses the accumulated state to execute the rollback operations attached to already completed steps, in reverse order.

Pipeline | F1 -> F2 -> F3 (boom) -> F4 ...
 Example | R1 <- R2 <- R3 <-+
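
The flow in the diagram above can be sketched in a few lines of plain Erlang. This is only an illustration of the pattern, not gisla's implementation: the module name saga_sketch and its run/2 function are made up for this example, and the sketch unwinds only the steps that completed before the failure.

```erlang
-module(saga_sketch).
-export([run/2]).

%% Steps is a list of {Forward, Rollback} pairs of fun/1 closures.
%% Returns {ok, State} if every forward fun succeeds, otherwise
%% {rollback, State} after unwinding the completed steps.
%% (Hypothetical helper, not part of gisla.)
run(Steps, State) ->
    forward(Steps, State, []).

forward([], State, _Done) ->
    {ok, State};
forward([{F, R} | Rest], State, Done) ->
    try F(State) of
        NewState -> forward(Rest, NewState, [R | Done])
    catch
        _Class:_Reason -> {rollback, unwind(Done, State)}
    end.

%% Rollbacks were pushed onto the head of the list as steps completed,
%% so the list is already in reverse order of completion.
unwind(Rollbacks, State) ->
    lists:foldl(fun(R, S) -> R(S) end, State, Rollbacks).
```

Each rollback receives the state as it stood when the failure occurred, which is why forward operations should record enough information in the state for their rollbacks to find what they need to undo.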

Use

Forward and rollback operations

First, you need to write the forward and rollback closures for each step. Each closure should take at least one parameter, which is the state of the transaction. State can be any arbitrary Erlang term but proplists are the recommended format.

These closures may either be funs, written as fun(State) -> ok end or fun do_something/1, or they can be MFA tuples of the form {Module, Function, Arguments = []}. MFA tuples will automatically get the transaction state appended as the last parameter of the argument list.
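
As a rough illustration of the two closure forms, the sketch below shows one way a fun or an MFA tuple could be invoked with the state. The module op_call_sketch and its functions are hypothetical and are not taken from gisla's source; they only show what "state as the last parameter" means in practice.

```erlang
-module(op_call_sketch).
-export([call/2, add/2]).

%% Invoke an operation given either as a fun/1 or as an {M, F, A} tuple.
%% (Hypothetical helper, not part of gisla.)
call(Fun, State) when is_function(Fun, 1) ->
    Fun(State);
call({M, F, A}, State) when is_atom(M), is_atom(F), is_list(A) ->
    %% State goes last: {M, F, [X]} becomes M:F(X, State).
    apply(M, F, A ++ [State]).

%% Example target for the MFA form; it must be exported to be
%% callable through apply/3.
add(N, State) ->
    [{added, N} | State].
```

With this sketch, the tuple {op_call_sketch, add, [5]} results in a call to add(5, State), so a function used in MFA form needs an arity one greater than the length of its argument list.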

The transaction state will also get the pid of the gisla process so that your operation may optionally return "checkpoint" state changes - these would be incremental state mutations during the course of a step which you may want to use during any unwinding operations that may come later. It's very important that any checkpoint states include all current state too - do not just checkpoint the mutations. Gisla does not merge checkpoint changes - you are responsible for doing that.
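
Because gisla does not merge checkpoint changes, an operation that only has a handful of changed keys needs to fold them into the full state before checkpointing. A minimal sketch of such a merge for proplist state follows; the module checkpoint_sketch and its merge/2 function are hypothetical helpers, not part of gisla.

```erlang
-module(checkpoint_sketch).
-export([merge/2]).

%% Fold the changed {Key, Value} pairs into the full current state,
%% replacing existing keys and appending new ones.
%% (Hypothetical helper, not part of gisla.)
merge(Changes, State) ->
    lists:foldl(
      fun({Key, _Value} = Pair, Acc) ->
              lists:keystore(Key, 1, Acc, Pair)
      end,
      State,
      Changes).
```

The merged result, rather than the list of changes alone, is what would then be passed to gisla:checkpoint/1.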

When a step is finished, the operation must return its final (possibly mutated) state. This will automatically be reported back to gisla as the "step-complete" state, which would then be passed into the next stage of the transaction.

An example might look something like this:

example_forward(State) ->
    % definitely won't fail!
    Results0 = {terah_id, Id} = terah:assign_id(),
    NewState0 = [ Results0 | State ],

    %% The pid of the gisla process is injected automatically
    %% to the pipeline state for checkpointing purposes.
    gisla:checkpoint(NewState0),

    % might fail - TODO: fix real soon
    % but we checkpointed our new ID assignment
    true = unstable_network:activate_terah_id(Id),
    NewState1 = [ {terah_id_active, true} | NewState0 ],
    gisla:checkpoint(NewState1),

    % final operation, this updates an ETS table,
    % probably no failure.
    {terah_ets_tbl, TableName} = lists:keyfind(terah_ets_tbl, 1, State),
    true = terah:update_ets(TableName, Id),
    NewState2 = [ {terah_ets_updated, true} | NewState1 ],
    NewState2.

The rollback operation might be something like:

example_rollback(State) ->
    %% gisla pid is in our state (if we want it)
    %% Look up Id first; it is needed by every call below.
    {terah_id, Id} = lists:keyfind(terah_id, 1, State),
    {terah_ets_tbl, TableName} = lists:keyfind(terah_ets_tbl, 1, State),
    true = terah:remove_ets(TableName, Id),

    true = unstable_network:deactivate_terah_id(Id),
    true = terah:make_id_failed(Id),
    [{ terah_id_rollback, Id } | State ].

In this example, we don't send any checkpoints during rollback, just the final state update at the end of the function.

Creating operations, steps and a transaction

Once the closures have been written, you are ready to create steps for your transaction.

There are three abstractions in this library, from most specific to most general:

Operations

Operation records wrap the closures which do the work of each operation. Timeout information is also stored here - the default is 5000 milliseconds.

Operation records have the following extra fields to provide additional information on execution results:

  • state: can be either ready meaning ready to run, or complete meaning the function was executed.

  • result: success or failed, depending on the outcome of execution

  • reason: contains the exit reason of the process that ran the operation, whether it succeeded or failed.

They are created using the new_operation/0,1,2 functions. There is also an update_operation_timeout/2 function by which you may adjust a timeout value.
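
To make the fields above more concrete, here is a sketch of how a record with a timeout and outcome fields could be filled in when a closure runs. Everything in it is an assumption made for illustration: the record name op, its layout, and the functions in op_record_sketch are hypothetical and do not reflect gisla's actual record definition or execution code.

```erlang
-module(op_record_sketch).
-export([new/1, new/2, run/2]).

%% Hypothetical record; field names follow the README text,
%% not gisla's actual header file.
-record(op, {f, timeout = 5000, state = ready, result, reason}).

new(Fun) -> #op{f = Fun}.
new(Fun, Timeout) -> #op{f = Fun, timeout = Timeout}.

%% Run the closure in a monitored process and record the outcome.
%% Returns {UpdatedOp, State}; the state is unchanged on failure.
run(#op{f = Fun, timeout = T} = Op, State) ->
    {Pid, Ref} = spawn_monitor(fun() -> exit({done, Fun(State)}) end),
    receive
        {'DOWN', Ref, process, Pid, {done, NewState}} ->
            {Op#op{state = complete, result = success, reason = normal},
             NewState};
        {'DOWN', Ref, process, Pid, Reason} ->
            {Op#op{state = complete, result = failed, reason = Reason},
             State}
    after T ->
        %% The closure took too long: stop it and drop any late 'DOWN'.
        exit(Pid, kill),
        erlang:demonitor(Ref, [flush]),
        {Op#op{state = complete, result = failed, reason = timeout}, State}
    end.
```

Running the closure in its own monitored process is what lets a timeout be enforced without the caller blocking indefinitely on a stuck operation.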

Steps

Steps are containers that have a name (which may be an atom, a binary string or a string), a single forward operation, and a matched rollback operation.

They are created using the new_step/0 or new_step/3 functions. As a bit of syntactic sugar, you may call new_step/3 with #operation records, naked functions, or MFA tuples.

Transactions

A transaction is a container for a name (which again may be an atom, a binary string or a string), and an ordered list of steps which will be executed left to right.

Transactions can be built using new_transaction/0 together with add_step/2 and delete_step/2, or in a single call using new_transaction/2. There is also a describe_transaction/1 function which outputs a simple list of the transaction name plus all step names in order of execution.

Executing a transaction

Once a transaction is constructed and the steps are organized, you are ready to execute it.

You can do that using execute/2. The State parameter should be in the form of a proplist.

When a transaction has been executed, it returns a tuple of {'ok'|'rollback', FinalT, FinalState} where FinalT is the original transaction with updated execution information in the operation records and FinalState is the accumulated state mutations across all steps.

State = [{foo, 1}, {bar, 2}, {baz, 3}],
Step1 = gisla:new_step(<<"step 1">>, fun frobulate/1, fun defrobulate/1),
Step2 = gisla:new_step(<<"step 2">>, fun activate_frob/1, fun deactivate_frob/1),
Transaction = gisla:new_transaction(<<"example">>, [ Step1, Step2 ]),
{Outcome, FinalT, FinalState} = gisla:execute(Transaction, State),

case Outcome of
  ok -> ok;
  rollback ->
      io:format("Transaction failed. Execution details: ~p, Final state: ~p~n", 
        [FinalT, FinalState]),
      error(transaction_rolled_back)
end.

Errors / timeouts during rollback

If a crash or timeout occurs during rollback, gisla will itself crash.

Build

gisla is built using rebar3. It has a dependency on the hut logging abstraction library, in the hope that this makes it easier to use from both Erlang and Elixir. By default hut uses the built-in Erlang error_logger facility to log messages. Hut also supports a number of other logging options, including Elixir's built-in logging library and lager.
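
For a project that wants to pull gisla in as a rebar3 dependency, the rebar.config entry might look something like the fragment below. The repository URL and branch name are assumptions based on the author name shown for this repository, so check them against the actual project location before use.

```erlang
%% rebar.config fragment (sketch).
%% The URL and branch below are assumed, not confirmed by this README.
{deps, [
    {gisla, {git, "https://github.com/mrallen1/gisla.git", {branch, "master"}}}
]}.
```

rebar3 fetches the dependencies that gisla itself declares, such as hut, so they should not need to be listed separately.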

About the name

It was inspired by the Icelandic saga Gisla.