Elixir implementation of circuit breaker pattern, wrapped around Erlang's fuse library

DoggettCK/breaker_box

BreakerBox

Description

BreakerBox is an implementation of the circuit breaker pattern, wrapping the Fuse Erlang library with a supervised server for ease of breaker configuration and management.

Examples

Breaker configuration

breaker_config =
 %BreakerBox.BreakerConfiguration{}
 |> BreakerBox.BreakerConfiguration.trip_on_failure_number(5)
 |> BreakerBox.BreakerConfiguration.within_minutes(1)
 |> BreakerBox.BreakerConfiguration.reset_after_minutes(1)

BreakerBox is intended to be user-friendly for configuration, wrapping Fuse's options in a way that's easier to understand.

For example, Fuse's configuration sets the number of errors tolerated within a given time window. In testing, developers found this confusing: they expected the breaker to trip on the Nth error, only to find that it actually tripped on error N+1. To match that expectation, BreakerBox tells Fuse behind the scenes to tolerate N-1 errors, so the breaker trips on the Nth.
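
As a rough illustration of that off-by-one translation, here is a minimal sketch. The module and function names are hypothetical and not part of BreakerBox; the tuple shape follows the `{{:standard, tolerated, window}, {:reset, time}}` options format used by Fuse, and how BreakerBox builds it internally is an assumption.

```elixir
defmodule ToleranceSketch do
  # Hypothetical helper, not part of BreakerBox: translates "trip on the Nth
  # failure" into a Fuse-style options tuple, where the first number is how
  # many failures are tolerated before the fuse blows.
  def to_fuse_opts(trip_on_failure_number, failure_window_ms, reset_window_ms)
      when trip_on_failure_number >= 1 do
    tolerated = trip_on_failure_number - 1
    {{:standard, tolerated, failure_window_ms}, {:reset, reset_window_ms}}
  end
end

# Tripping on the 5th failure means telling Fuse to tolerate 4.
ToleranceSketch.to_fuse_opts(5, 60_000, 60_000)
# => {{:standard, 4, 60000}, {:reset, 60000}}
```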

Both the within_* and reset_after_* functions have variants accepting minutes, seconds, or milliseconds.

A default %BreakerBox.BreakerConfiguration{} will trip on the 5th failure within 1 second, automatically resetting to untripped after 5 seconds.
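
To make the unit handling concrete, the sketch below shows one way such variants could normalize everything to milliseconds. It uses a hypothetical stand-in struct rather than BreakerBox itself; the field names and defaults mirror the ones shown in this README, while the function names other than within_minutes and reset_after_minutes are assumptions.

```elixir
defmodule WindowUnitsSketch do
  # Hypothetical stand-in for the configuration struct. Defaults mirror the
  # documented ones: 5th failure within 1 second, reset after 5 seconds.
  defstruct max_failures: 5, failure_window: 1_000, reset_window: 5_000

  # Each variant converts its argument to milliseconds before storing it.
  def within_minutes(config, n), do: %{config | failure_window: :timer.minutes(n)}
  def within_seconds(config, n), do: %{config | failure_window: :timer.seconds(n)}
  def within_milliseconds(config, n), do: %{config | failure_window: n}

  def reset_after_minutes(config, n), do: %{config | reset_window: :timer.minutes(n)}
  def reset_after_seconds(config, n), do: %{config | reset_window: :timer.seconds(n)}
end

config =
  %WindowUnitsSketch{}
  |> WindowUnitsSketch.within_minutes(1)
  |> WindowUnitsSketch.reset_after_seconds(30)

config.failure_window
# => 60000
config.reset_window
# => 30000
```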

Registering a breaker manually

BreakerBox.register("BreakerName", breaker_config)

Breakers must be registered with a unique name and configuration. Re-registering a breaker with the same name will overwrite the existing breaker.

Names can be strings, atoms, or, for ease of use in automatic registration, module names.

Registering a breaker automatically

BreakerBox is designed to be used with Elixir's supervision system, so we've provided a way to automatically register breakers at application startup, provided they implement the BreakerBox.BreakerConfiguration behaviour.

# breaker_one.ex
defmodule BreakerOne do
  @behaviour BreakerBox.BreakerConfiguration

  @impl true
  def registration do
    # Trip on the 3rd error within one minute, resetting after a minute
    breaker_config =
      %BreakerBox.BreakerConfiguration{}
      |> BreakerBox.BreakerConfiguration.trip_on_failure_number(3)
      |> BreakerBox.BreakerConfiguration.within_minutes(1)
      |> BreakerBox.BreakerConfiguration.reset_after_minutes(1)

    {__MODULE__, breaker_config}
  end
end

# application.ex
defmodule YourApplication do
  use Application

  @circuit_breaker_modules [
    BreakerOne
  ]

  def start(_type, _args) do
    import Supervisor.Spec

    children = [
      supervisor(Repo, []),
      worker(BreakerBox, [@circuit_breaker_modules])
    ]

    opts = [strategy: :one_for_one, name: Supervisor]
    Supervisor.start_link(children, opts)
  end
end

This will register the breaker using the module's own name as the breaker name, though as mentioned earlier, you can use whatever you want. BreakerBox uses the Behave package to ensure that whatever modules you pass in for automatic registration implement the BreakerBox.BreakerConfiguration behaviour, warning you via Logger messages at startup if anything is misconfigured.
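
For a feel of what such a startup check involves, here is a minimal sketch that separates modules exporting registration/0 from those that do not. It does not use the Behave package and is not BreakerBox's actual implementation; every module name below is hypothetical.

```elixir
defmodule RegistrationBehaviourSketch do
  # Hypothetical behaviour mirroring the single callback used in this README.
  @callback registration() :: {term(), term()}
end

defmodule GoodBreaker do
  @behaviour RegistrationBehaviourSketch

  @impl true
  def registration, do: {__MODULE__, %{max_failures: 3}}
end

defmodule NotABreaker do
  def hello, do: :world
end

defmodule StartupCheckSketch do
  # Splits modules into those that export registration/0 and those that don't.
  def partition(modules) do
    Enum.split_with(modules, fn module ->
      Code.ensure_loaded?(module) and function_exported?(module, :registration, 0)
    end)
  end
end

{valid, invalid} = StartupCheckSketch.partition([GoodBreaker, NotABreaker])
# valid   => [GoodBreaker]
# invalid => [NotABreaker]
```

The invalid list is what a startup routine could report through Logger before registering the valid modules.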

View registered breakers

iex> BreakerBox.registered
%{
  BreakerOne => %BreakerBox.BreakerConfiguration{
    failure_window: 60000,
    max_failures: 3,
    reset_window: 60000
  },
  BreakerTwo => %BreakerBox.BreakerConfiguration{
    failure_window: 60000,
    max_failures: 5,
    reset_window: 30000
  }
}

View breaker status(es)

# View status of all breakers
iex> BreakerBox.status()
%{BreakerOne => {:ok, BreakerOne}, BreakerTwo => {:ok, BreakerTwo}}

# View status of a particular breaker
iex> BreakerBox.status(BreakerOne)
{:ok, BreakerOne}

Status of a breaker will be returned as one of:

  • {:ok, breaker_name}
  • {:error, {:breaker_tripped, breaker_name}}
  • {:error, {:breaker_not_found, breaker_name}}
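
Since the three shapes are plain tuples, callers can pattern match on them directly. The helper below is a hypothetical sketch of that, with made-up return atoms; it takes an already-fetched status so it runs without BreakerBox.

```elixir
defmodule StatusSketch do
  # Hypothetical helper: turns a status result into a decision for the caller.
  def decide({:ok, _name}), do: :call_service
  def decide({:error, {:breaker_tripped, _name}}), do: :skip_call
  def decide({:error, {:breaker_not_found, name}}), do: {:misconfigured, name}
end

StatusSketch.decide({:ok, "BreakerName"})
# => :call_service
StatusSketch.decide({:error, {:breaker_tripped, "BreakerName"}})
# => :skip_call
StatusSketch.decide({:error, {:breaker_not_found, "Typo"}})
# => {:misconfigured, "Typo"}
```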

Tell a breaker there has been an error

Now that you have your breakers set up, how do you let them know there's a problem?

BreakerBox.increment_error(breaker_name)

Unless your breaker has been set up to be super-sensitive, one error probably won't trip it.

iex> BreakerBox.increment_error(BreakerOne)
:ok
iex> BreakerBox.status(BreakerOne)
{:ok, BreakerOne}
iex> 1..10 |> Enum.each(fn _ -> BreakerBox.increment_error(BreakerOne) end)
:ok
iex> BreakerBox.status(BreakerOne)
{:error, {:breaker_tripped, BreakerOne}}

# Wait 60 seconds or call BreakerBox.reset(BreakerOne)
iex> BreakerBox.status(BreakerOne)
{:ok, BreakerOne}
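
The session above can be modelled with a simplified, purely functional sketch of "trip on the Nth failure within a window". This is not how Fuse works internally: timestamps are passed in explicitly, there is no reset timer, and the module name is hypothetical.

```elixir
defmodule WindowSketch do
  # Simplified model of a failure window, for illustration only.
  defstruct max_failures: 3, failure_window: 60_000, failures: []

  # Records a failure at `now_ms`, dropping failures older than the window.
  def increment_error(%__MODULE__{} = state, now_ms) do
    recent =
      Enum.filter(state.failures, fn at -> now_ms - at < state.failure_window end)

    %{state | failures: [now_ms | recent]}
  end

  # Tripped once the number of recent failures reaches max_failures.
  def status(%__MODULE__{failures: failures, max_failures: max}) do
    if length(failures) >= max, do: :tripped, else: :ok
  end
end

state =
  %WindowSketch{}
  |> WindowSketch.increment_error(0)
  |> WindowSketch.increment_error(1_000)

WindowSketch.status(state)
# => :ok

state |> WindowSketch.increment_error(2_000) |> WindowSketch.status()
# => :tripped
```

Failures that fall outside the window no longer count, so two early errors followed by one more than a minute later leave this model untripped.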

Manually enabling/disabling/resetting a breaker

By default, breakers that have been tripped will reset to untripped after the reset_window specified in your configuration. If you want to reset it sooner, for example in a test scenario, you can call BreakerBox.reset(breaker_name) to set it back to untripped.

What if you know a particular external service is going to be down for a while, and want to disable all traffic to it?

iex> BreakerBox.disable(BreakerOne)
:ok
iex> BreakerBox.status()
%{
  BreakerOne => {:error, {:breaker_tripped, BreakerOne}},
  BreakerTwo => {:ok, BreakerTwo}
}

# Wait as long as you want, it won't automatically reset
iex> BreakerBox.status(BreakerOne)
{:error, {:breaker_tripped, BreakerOne}}

Re-enabling it when you know or suspect the service is available again is just as simple.

iex> BreakerBox.enable(BreakerOne)
:ok
iex> BreakerBox.status()
%{
  BreakerOne => {:ok, BreakerOne},
  BreakerTwo => {:ok, BreakerTwo}
}

More than one breaker box

If you need more than one set of circuit breakers with no overlap between them (for example, to run tests in parallel that might otherwise interfere with each other), you can specify a process_name when calling BreakerBox functions, as of version 0.4.0. If omitted, it defaults to the module name BreakerBox.

iex> BreakerBox.start_link([])
{:ok, #PID<0.233.0>}
iex> BreakerBox.start_link([], :OtherPanel)
{:ok, #PID<0.236.0>}
iex> BreakerBox.register("Breaker1", %BreakerBox.BreakerConfiguration{})
:ok
iex> BreakerBox.register("OtherPanelBreaker", %BreakerBox.BreakerConfiguration{}, :OtherPanel)
:ok
iex> BreakerBox.registered
%{
  "Breaker1" => %BreakerBox.BreakerConfiguration{
    failure_window: 1000,
    max_failures: 5,
    reset_window: 5000
  }
}
iex> BreakerBox.registered(:OtherPanel)
%{
  "OtherPanelBreaker" => %BreakerBox.BreakerConfiguration{
    failure_window: 1000,
    max_failures: 5,
    reset_window: 5000
  }
}
iex> BreakerBox.status("Breaker1")
{:ok, "Breaker1"}
iex> BreakerBox.status("Breaker1", :OtherPanel)
{:error, {:breaker_not_found, "Breaker1"}}
iex> BreakerBox.status("OtherPanelBreaker")
{:error, {:breaker_not_found, "OtherPanelBreaker"}}
iex> BreakerBox.status("OtherPanelBreaker", :OtherPanel)
{:ok, "OtherPanelBreaker"}

Behind the scenes, Module.concat/2 is used to build a unique name for each breaker in the underlying Fuse library. Otherwise, the same breaker name registered in two different breaker boxes would overwrite each other.
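
To see why Module.concat/2 avoids collisions, the sketch below combines a process name with a breaker name. The argument order is an assumption for illustration; the README does not state exactly how BreakerBox calls it.

```elixir
# Same breaker name under two different panels gives two distinct names.
default_panel = Module.concat(BreakerBox, "Breaker1")
# => BreakerBox.Breaker1

other_panel = Module.concat(:OtherPanel, "Breaker1")
# => OtherPanel.Breaker1

default_panel == other_panel
# => false
```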

Tying it all together

In this example, we're going to POST a request to an external service at url. If we get a valid HTTPoison response back in an {:ok, response} tuple, we'll return the response body to the caller, no matter what it was, but if it wasn't a 200 OK, we'll tell the breaker there was an error. You may not want to be this strict if you're using a GET request with a 301 Moved Permanently response, but for my usual use case, a non-200 means something bad's happening.

If we specifically get an HTTPoison.Error struct back, usually in cases of a timeout or non-existent domain, we increment the error there, too. If the status check reports that the breaker has already been tripped, we don't increment it again, but instead just pass back the error to be handled in the controller or fallback controller, where we'll typically create a 503 Service Unavailable response to tell consumers of our API to try again later. Lastly, any other unexpected errors increment the error count and are returned as-is.

We just want to ensure specifically that we're not incrementing again when the breaker is already tripped, as we haven't actually made the call to the external service.

{breaker_name, _} = BreakerOne.registration()

with {:ok, ^breaker_name} <- BreakerBox.status(breaker_name),
     {:ok, response} <- HTTPoison.post(url, body, headers, options) do
  if response.status_code != 200 do
    BreakerBox.increment_error(breaker_name)
  end

  {:ok, response.body}
else
  {:error, %HTTPoison.Error{}} = error ->
    BreakerBox.increment_error(breaker_name)
    error

  {:error, {:breaker_tripped, ^breaker_name}} = error ->
    error

  other ->
    BreakerBox.increment_error(breaker_name)
    other
end
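
If several call sites follow this pattern, it could be factored into a small wrapper. The sketch below is hypothetical and not part of BreakerBox: the status and increment functions are injected as arguments (standing in for BreakerBox.status/1 and BreakerBox.increment_error/1) so the example runs on its own, and the HTTP status check is left to the request function.

```elixir
defmodule GuardedCallSketch do
  # Hypothetical wrapper. `request_fun` performs the external call and
  # returns {:ok, result} or {:error, reason}.
  def call(breaker_name, status_fun, increment_fun, request_fun) do
    with {:ok, ^breaker_name} <- status_fun.(breaker_name),
         {:ok, result} <- request_fun.() do
      {:ok, result}
    else
      # Breaker already tripped: no call was made, so don't count an error.
      {:error, {:breaker_tripped, ^breaker_name}} = error ->
        error

      # Anything else counts against the breaker.
      other ->
        increment_fun.(breaker_name)
        other
    end
  end
end

# Stand-ins for the real BreakerBox functions.
always_ok = fn name -> {:ok, name} end
log_error = fn name -> IO.puts("error counted for #{inspect(name)}") end

GuardedCallSketch.call("Demo", always_ok, log_error, fn -> {:error, :timeout} end)
# prints: error counted for "Demo"
# => {:error, :timeout}
```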

Installation

BreakerBox can be installed by adding breaker_box to your list of dependencies in mix.exs:

def deps do
  [
    {:breaker_box, "~> 0.5.0"}
  ]
end

Documentation

Documentation can be found at https://hexdocs.pm/breaker_box.
