Merge pull request #47 from alphagov/healthcheck-classes
Health check convenience classes
thomasleese committed Jul 17, 2018
2 parents 036c26f + 9d9bdc2 commit 3fbd11f
Showing 19 changed files with 578 additions and 50 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,8 @@
# Unreleased

* Add various convenience health check classes which make it easier to add
custom checks into apps without writing lots of code.

# 1.6.0

* Make health checks classes rather than instances, allowing internal data to
50 changes: 3 additions & 47 deletions README.md
@@ -105,54 +105,10 @@ GovukStatsd.gauge "bork", 100
GovukStatsd.time("account.activate") { @account.activate! }
```

## Healthchecks
## Health Checks

Set up a route in your rack-compatible Ruby application, and pick the built-in
or custom checks you wish to perform.

Custom checks must be a class which implements
[this interface](spec/lib/govuk_healthcheck/shared_interface.rb):

```ruby
class CustomCheck
def name
:custom_check
end

def status
ThingChecker.everything_okay? ? OK : CRITICAL
end

# Optional
def message
"This is an optional custom message"
end

# Optional
def details
{
extra: "This is an optional details hash",
}
end
end
```

For Rails apps:
```ruby
get "/healthcheck", to: GovukHealthcheck.rack_response(
GovukHealthcheck::SidekiqRedis,
GovukHealthcheck::ActiveRecord,
CustomCheck,
)
```

This will check:
- Redis connectivity (via Sidekiq)
- Database connectivity (via ActiveRecord)
- Your custom healthcheck

Each check class gets instanced each time the health check end point is called.
This allows you to cache any complex queries speeding up performance.
This Gem provides a common "health check" framework for apps. See [the health
check docs](docs/healthchecks.md) for more information on how to use it.

## Rails logging

156 changes: 156 additions & 0 deletions docs/healthchecks.md
@@ -0,0 +1,156 @@
# Health Checks

## Check interface

A check is expected to be a class with the following methods:

```ruby
class CustomCheck
def name
:the_name_of_the_check
end

def status
if critical_condition?
:critical
elsif warning_condition?
:warning
else
:ok
end
end

# Optional
def message
"This is an optional custom message"
end

# Optional
def details
{
extra: "This is an optional details hash",
}
end

# Optional
def enabled?
true # false if the check is not relevant at this time
end
end
```
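
As a rough illustration of the optional `enabled?` method, the sketch below mirrors the `build_component_status` change to `checkup.rb` in this commit, minus the `details` merge. The `component_status` helper and `OfficeHoursCheck` class are hypothetical names used only for this example.

```ruby
# Simplified stand-in for build_component_status in checkup.rb; the real
# method also merges in the check's details hash, which is omitted here.
def component_status(check)
  if check.respond_to?(:enabled?) && !check.enabled?
    { status: :ok, message: "currently disabled" }
  else
    result = { status: check.status }
    result[:message] = check.message if check.respond_to?(:message)
    result
  end
end

# Hypothetical check that is only relevant between 09:00 and 17:00.
class OfficeHoursCheck
  def initialize(hour)
    @hour = hour
  end

  def name
    :office_hours
  end

  def status
    :critical
  end

  def enabled?
    (9...17).cover?(@hour)
  end
end

DISABLED_RESULT = component_status(OfficeHoursCheck.new(3))
ENABLED_RESULT = component_status(OfficeHoursCheck.new(10))
```

Under these assumptions a disabled check is reported as `:ok` with the message "currently disabled", and its own `status` method is never called.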

These methods may cache their results for performance reasons. If a user wants
to ensure they have the latest value, they should create a new instance of the
check first.
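
A minimal sketch of the caching described above, assuming plain instance-variable memoisation. `CachedCheck`, `expensive_lookup` and the `lookups` counter are hypothetical and exist only to make the behaviour visible.

```ruby
# Hypothetical check; expensive_lookup stands in for a slow query.
class CachedCheck
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def name
    :cached_check
  end

  def status
    # Memoised: computed once per instance, then reused.
    @status ||= expensive_lookup
  end

  def message
    "status is #{status}"
  end

  private

  def expensive_lookup
    @lookups += 1
    :ok
  end
end

check = CachedCheck.new
check.status
check.message
check.lookups # 1: the second call to status reused the cached value
```

Calling `CachedCheck.new` again gives an instance with an empty cache, which is why a fresh instance is needed to see the latest value.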

## Including checks in your app

Set up a route in your rack-compatible Ruby application, and pick the built-in
or custom checks you wish to perform.

For Rails apps:

```ruby
get "/healthcheck", to: GovukHealthcheck.rack_response(
GovukHealthcheck::SidekiqRedis,
GovukHealthcheck::ActiveRecord,
CustomCheck,
)
```

## Built-in Checks

By convention, a class name ends with `Check` if the class must be subclassed
to work; a concrete class which works on its own doesn't need that suffix. You
should aim to follow this convention in your own apps, ideally putting custom
health checks into a `Healthcheck` module.

### `SidekiqRedis`

This checks that the app has a connection to Redis via Sidekiq.

### `ActiveRecord`

This checks that the app has a connection to the database via ActiveRecord.

### `ThresholdCheck`

This class is the basis for a check which compares a value with a warning or a
critical threshold.

```ruby
class MyThresholdCheck < GovukHealthcheck::ThresholdCheck
def name
:my_threshold_check
end

def value
# get the value to be checked
end

def total
# (optional) get the total value to be included in the details as extra
# information
end

def warning_threshold
# if the value is above this threshold, its status is warning
end

def critical_threshold
# if the value is above this threshold, its status is critical
end
end
```
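
For a concrete picture, here is a sketch of a filled-in subclass. The source of `GovukHealthcheck::ThresholdCheck` is not shown in this document, so the example defines its own stand-in base class; the `>=` comparison is an assumption borrowed from `SidekiqQueueCheck` in this commit, and `PendingUploadsCheck` with its thresholds is hypothetical.

```ruby
# Stand-in for GovukHealthcheck::ThresholdCheck (assumed behaviour, not the
# gem's actual source).
class StandInThresholdCheck
  def status
    if value >= critical_threshold
      :critical
    elsif value >= warning_threshold
      :warning
    else
      :ok
    end
  end
end

# Hypothetical check on the number of uploads waiting to be processed.
class PendingUploadsCheck < StandInThresholdCheck
  def initialize(pending)
    @pending = pending
  end

  def name
    :pending_uploads
  end

  def value
    @pending
  end

  def warning_threshold
    50
  end

  def critical_threshold
    200
  end
end

PendingUploadsCheck.new(10).status  # :ok
PendingUploadsCheck.new(75).status  # :warning
PendingUploadsCheck.new(500).status # :critical
```

In a real app `value` would read from the database or another service rather than from a constructor argument.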

### `SidekiqQueueLatencyCheck`

This class is the basis for a check which compares the Sidekiq queue latencies
with warning or critical thresholds.

```ruby
class MySidekiqQueueLatencyCheck < GovukHealthcheck::SidekiqQueueLatencyCheck
def warning_threshold(queue:)
# the warning threshold for a particular queue
end

def critical_threshold(queue:)
# the critical threshold for a particular queue
end
end
```
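
Because the threshold methods receive the queue name, thresholds can differ per queue. The sketch below shows one possible way to do that with lookup tables and a fallback. To run without Sidekiq or the gem it uses a stand-in base class whose `status` is copied from `sidekiq_queue_check.rb` in this commit, and it supplies fixed latencies through `queues`; the queue names and numbers are hypothetical.

```ruby
# Stand-in base class; status is copied from sidekiq_queue_check.rb.
class StandInQueueCheck
  def status
    queues.each do |name, value|
      if value >= critical_threshold(queue: name)
        return :critical
      elsif value >= warning_threshold(queue: name)
        return :warning
      end
    end

    :ok
  end
end

# Hypothetical latency check with per-queue thresholds, in seconds.
class LatencyCheck < StandInQueueCheck
  WARNING = { "default" => 30, "mailers" => 300 }.freeze
  CRITICAL = { "default" => 60, "mailers" => 900 }.freeze

  def initialize(latencies)
    @latencies = latencies
  end

  # Fixed data for the sketch; the gem's subclass reads this from Sidekiq.
  def queues
    @latencies
  end

  def warning_threshold(queue:)
    WARNING.fetch(queue, 120) # 120 is an arbitrary fallback for other queues
  end

  def critical_threshold(queue:)
    CRITICAL.fetch(queue, 600) # 600 is an arbitrary fallback for other queues
  end
end

LatencyCheck.new({ "default" => 45, "mailers" => 45 }).status # :warning
```

A subclass of the real `SidekiqQueueLatencyCheck` would only need the two threshold methods, since `queues` is already provided.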

### `SidekiqQueueSizeCheck`

This class is the basis for a check which compares the Sidekiq queue sizes
with warning or critical thresholds.

```ruby
class MySidekiqQueueSizeCheck < GovukHealthcheck::SidekiqQueueSizeCheck
def warning_threshold(queue:)
# the warning threshold for a particular queue
end

def critical_threshold(queue:)
# the critical threshold for a particular queue
end
end
```


### `SidekiqRetrySizeCheck`

Similar to `SidekiqQueueSizeCheck`, this class is the basis for a check which
compares the Sidekiq retry set size with a warning and critical threshold.

```ruby
class MySidekiqRetrySizeCheck < GovukHealthcheck::SidekiqRetrySizeCheck
def warning_threshold
# the warning threshold for the retry set
end

def critical_threshold
# the critical threshold for the retry set
end
end
```
1 change: 1 addition & 0 deletions govuk_app_config.gemspec
@@ -29,6 +29,7 @@ Gem::Specification.new do |spec|
spec.add_development_dependency "bundler", "~> 1.15"
spec.add_development_dependency "rake", "~> 10.0"
spec.add_development_dependency "rspec", "~> 3.6.0"
spec.add_development_dependency "rspec-its", "~> 1.2.0"
spec.add_development_dependency "climate_control"
spec.add_development_dependency "webmock"
spec.add_development_dependency "pry"
5 changes: 5 additions & 0 deletions lib/govuk_app_config/govuk_healthcheck.rb
@@ -1,6 +1,11 @@
require "govuk_app_config/govuk_healthcheck/checkup"
require "govuk_app_config/govuk_healthcheck/active_record"
require "govuk_app_config/govuk_healthcheck/sidekiq_redis"
require "govuk_app_config/govuk_healthcheck/threshold_check"
require "govuk_app_config/govuk_healthcheck/sidekiq_queue_check"
require "govuk_app_config/govuk_healthcheck/sidekiq_queue_latency_check"
require "govuk_app_config/govuk_healthcheck/sidekiq_queue_size_check"
require "govuk_app_config/govuk_healthcheck/sidekiq_retry_size_check"
require "json"

module GovukHealthcheck
10 changes: 7 additions & 3 deletions lib/govuk_app_config/govuk_healthcheck/checkup.rb
@@ -43,9 +43,13 @@ def status?(status)
end

def build_component_status(check)
component_status = details(check).merge(status: check.status)
component_status[:message] = check.message if check.respond_to?(:message)
component_status
if check.respond_to?(:enabled?) && !check.enabled?
{ status: :ok, message: "currently disabled" }
else
component_status = details(check).merge(status: check.status)
component_status[:message] = check.message if check.respond_to?(:message)
component_status
end
end

def details(check)
62 changes: 62 additions & 0 deletions lib/govuk_app_config/govuk_healthcheck/sidekiq_queue_check.rb
@@ -0,0 +1,62 @@
module GovukHealthcheck
class SidekiqQueueCheck
def status
queues.each do |name, value|
if value >= critical_threshold(queue: name)
return :critical
elsif value >= warning_threshold(queue: name)
return :warning
end
end

:ok
end

def message
messages = queues.map do |name, value|
critical = critical_threshold(queue: name)
warning = warning_threshold(queue: name)

if value >= critical
"#{name} (#{value}) is above the critical threshold (#{critical})"
elsif value >= warning
"#{name} (#{value}) is above the warning threshold (#{warning})"
end
end

messages = messages.compact

if messages.empty?
"all queues are below the critical and warning thresholds"
else
messages.join("\n")
end
end

def details
{
queues: queues.each_with_object({}) do |(name, value), hash|
hash[name] = {
value: value,
thresholds: {
critical: critical_threshold(queue: name),
warning: warning_threshold(queue: name),
},
}
end,
}
end

def queues
raise "This method must be overridden to be a hash of queue names and data."
end

def critical_threshold(queue:)
raise "This method must be overridden to be the critical threshold."
end

def warning_threshold(queue:)
raise "This method must be overridden to be the warning threshold."
end
end
end
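
One consequence of the early `return` in `status` is that the result depends on the order in which queues are visited: a queue at warning level that comes before a critical one produces `:warning`. The sketch below reproduces this with fixed data. `FixedQueues`, its thresholds and the `worst_status` alternative are hypothetical and not part of this commit.

```ruby
class FixedQueues
  attr_reader :queues

  def initialize(queues)
    @queues = queues
  end

  def critical_threshold(queue:)
    10
  end

  def warning_threshold(queue:)
    5
  end

  # Copied from SidekiqQueueCheck#status above.
  def status
    queues.each do |name, value|
      if value >= critical_threshold(queue: name)
        return :critical
      elsif value >= warning_threshold(queue: name)
        return :warning
      end
    end

    :ok
  end

  # Hypothetical alternative: report the most severe status of any queue.
  def worst_status
    statuses = queues.map do |name, value|
      if value >= critical_threshold(queue: name)
        :critical
      elsif value >= warning_threshold(queue: name)
        :warning
      else
        :ok
      end
    end

    %i[critical warning ok].find { |s| statuses.include?(s) } || :ok
  end
end

check = FixedQueues.new({ "default" => 6, "mailers" => 100 })
check.status       # :warning, because "default" is visited first
check.worst_status # :critical
```
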
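13 changes: 13 additions & 0 deletions lib/govuk_app_config/govuk_healthcheck/sidekiq_queue_latency_check.rb
@@ -0,0 +1,13 @@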
module GovukHealthcheck
class SidekiqQueueLatencyCheck < SidekiqQueueCheck
def name
:sidekiq_queue_latency
end

def queues
@queues ||= Sidekiq::Stats.new.queues.keys.each_with_object({}) do |name, hash|
hash[name] = Sidekiq::Queue.new(name).latency
end
end
end
end
11 changes: 11 additions & 0 deletions lib/govuk_app_config/govuk_healthcheck/sidekiq_queue_size_check.rb
@@ -0,0 +1,11 @@
module GovukHealthcheck
class SidekiqQueueSizeCheck < SidekiqQueueCheck
def name
:sidekiq_queue_size
end

def queues
@queues ||= Sidekiq::Stats.new.queues
end
end
end
11 changes: 11 additions & 0 deletions lib/govuk_app_config/govuk_healthcheck/sidekiq_retry_size_check.rb
@@ -0,0 +1,11 @@
module GovukHealthcheck
class SidekiqRetrySizeCheck < ThresholdCheck
def name
:sidekiq_retry_size
end

def value
Sidekiq::Stats.new.retry_size
end
end
end
