A Symfony implementation of the FT's healthcheck standard.

Financial-Times/php-health-check

Health checks

Health checks are exposed under the __health endpoint. Each health check continually tests one part of the site, and together they indicate the site's overall health.

Health checks in this package conform to the FT Health Check Standard.

Requires PHP 7.1 or later.

Installation

Add this to your app/config/routing.yml config file:

FT\HealthCheckBundle:
    resource: "@HealthCheckBundle/Resources/config/routing.yml"
    prefix:   /

Creating a health check

To create a health check, copy this code to src/[Your Bundle]/HealthCheck/PlaceholderHealthCheck.php (the directory name matches the namespace declared in the file):

<?php

namespace YourBundle\HealthCheck;

use FT\HealthCheckBundle\HealthCheck\HealthCheck;
use FT\HealthCheckBundle\Interfaces\HealthCheckHandlerInterface;

class PlaceholderHealthCheck implements HealthCheckHandlerInterface
{
    // Must only consist of lowercase alphanumeric characters and hyphens.
    const HEALTH_CHECK_ID = 'placeholder-health-check';

    public function __construct()
    {
        // Inject services as needed to run the health check.
    }

    /**
     * @inheritDoc
     */
    public function runHealthCheck(): HealthCheck
    {
        //Run your health check and gather results
        $ok = true;

        $healthCheck = new HealthCheck();

        // See FT\HealthCheckBundle\HealthCheck\HealthCheck for more details on what each of these methods does.

        return $healthCheck
            /* [REQUIRED] An identifier for this check, unique for the given system code. Must only consist of lowercase alphanumeric characters and hyphens. */
            ->withId(self::HEALTH_CHECK_ID)

            /* [REQUIRED] Human-readable name for the health check (must be unique). */
            ->withName('The health check name')

            /* [REQUIRED] Whether the check is currently passing. */
            ->withOk($ok)

            /* [REQUIRED] An expression of the scale of impact that the problem will cause. Must be an integer set to one of the following values:
              - 1 (high): Critical issue with serious impact on the editorial team or user (e.g. the database is down).
              - 2 (medium): Serious issue that can be tolerated for a short time. This can involve lowered redundancy or minimal user impact.
              - 3 (low): Minor fault. No end user impact and no major risk caused by this alert.
            */
            ->withSeverity(3)

            /* [REQUIRED] A set of steps that may be carried out by a support engineer to further diagnose and potentially resolve the issue. */
            ->withPanicGuide("Oh no something went wrong... Here is how to possibly fix it!")

            /* [REQUIRED] Technical summary which may include information about the test being done, the potential problem from a technical perspective, and the systems that are involved, giving context to the issue.

            This is the most freeform alert property, and can be used by the alert author to pass on any relevant technical information to help the reader understand the way the application is set up.
            */
            ->withTechnicalSummary("A call to the __placeholder endpoint gave back a 404 error.")

            /* [REQUIRED] Description of the effect of the problem on the business operations of the FT, which features are affected, and how it might affect our internal users or external customers.

            It should consist of a few sentences, not more than a short paragraph, and should make sense to anyone in IT and the relevant business unit.

            There can be cases where this is simply “None”, for example when a redundant system fails.

            The business impact must be understandable to a non-technical reader. */
            ->withBusinessImpact('Users will not be able to see the component and the editorial team cannot edit the component.')

            /* Console output, exception message or debug data generated by the test. */
            ->withCheckOutput('Pinging /__placeholder... Response gave non 200 status code! (404)');
    }

    /**
     * @inheritDoc
     */
    public function getHealthCheckId(): string
    {
        //Set this to an identifier unique to the health check
        return self::HEALTH_CHECK_ID;
    }

    /**
     * @inheritDoc
     */
    public function getHealthCheckInterval(): int
    {
        // How often this health check should run, in seconds. A value of 0 runs the health check every time a request to the __health endpoint is made.
        return 10;
    }
}
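The comments in the class above put two constraints on the values passed to withId and withSeverity: the ID may only contain lowercase alphanumeric characters and hyphens, and the severity must be 1, 2 or 3. A minimal sketch of checking both up front is shown below. The function names isValidHealthCheckId and isValidSeverity are hypothetical helpers written for this example; they are not part of the bundle, and whether the bundle performs any such validation itself is not stated here.

```php
<?php

/**
 * Hypothetical helper (not part of the bundle): returns true when $id
 * consists only of lowercase alphanumeric characters and hyphens.
 */
function isValidHealthCheckId(string $id): bool
{
    // \A and \z anchor the match to the whole string; + rejects an empty ID.
    return preg_match('/\A[a-z0-9-]+\z/', $id) === 1;
}

/**
 * Hypothetical helper (not part of the bundle): returns true when
 * $severity is 1 (high), 2 (medium) or 3 (low).
 */
function isValidSeverity(int $severity): bool
{
    return in_array($severity, [1, 2, 3], true);
}
```

For example, isValidHealthCheckId('placeholder-check') returns true, while an ID containing uppercase letters, such as 'PlaceholderCheck', returns false.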

Then add a service definition for the health check to src/[Your Bundle]/Resources/config/services.yml. Change the priority to move a health check higher or lower in the __health endpoint output: the larger the priority, the higher the check appears in the list of health checks.

services:
    # Health Checks
    app.placeholder.health_check:
        class: YourBundle\HealthCheck\PlaceholderHealthCheck
        tags: [{ name: health_check, priority: 20 }]

Config options

There are a few config options that need to be set up for this bundle to function correctly.

parameters:
    # The system code for your product (an internal unique identifier for your product)
    health_check.system_code: ''

    # The human-readable name of your product
    health_check.name: ''

    # The description of your product
    health_check.description: ''

    # The service ID of an already initialized PSR-6 compatible cache pool. This option needs to be set for health check caching to work. If it is not set, all health checks are run every time the __health endpoint is called. (For eZ Publish/Platform use 'ezpublish.cache_pool')
    health_check.cache_pool: ''

    # Forces the __health endpoint to run before anything else. This is useful when event listeners that run before requests rely on external services that are covered by other health checks (for instance, an auth service that runs before every request and could fail if the session service was down).
    health_check.run_first: false

    # Sets the priority of the event listener added by 'health_check.run_first'. This can be used to stop the listener conflicting with other high-priority listeners that need to run at the start of each request.
    health_check.run_first.priority: 255
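The interaction between health_check.cache_pool and the interval returned by getHealthCheckInterval can be summarised as a small decision function. This is a sketch of the documented behaviour only, under the assumption that a cached result is reused until the interval has elapsed; how the bundle actually stores and expires results in the PSR-6 pool is not shown here, and shouldRunCheck and its parameters are hypothetical names.

```php
<?php

/**
 * Decides whether a health check should be executed for this request
 * (illustrative model, not the bundle's code).
 *
 * @param bool     $cachePoolConfigured whether health_check.cache_pool is set
 * @param int      $intervalSeconds     value returned by getHealthCheckInterval()
 * @param int|null $lastRunAt           Unix timestamp of the cached result, or null if none
 * @param int      $now                 current Unix timestamp
 */
function shouldRunCheck(bool $cachePoolConfigured, int $intervalSeconds, ?int $lastRunAt, int $now): bool
{
    // Without a cache pool, or with an interval of 0, every request runs the check.
    if (!$cachePoolConfigured || $intervalSeconds === 0) {
        return true;
    }

    // Nothing cached yet, so the check has to run.
    if ($lastRunAt === null) {
        return true;
    }

    // Assumption: the cached result is reused until the interval has elapsed.
    return ($now - $lastRunAt) >= $intervalSeconds;
}
```

Under this model, a check with an interval of 10 seconds that last ran 4 seconds ago is served from the cache, while the same check runs again once 10 seconds have passed.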

Configurable health checks

Configurable health checks can be used in place of regular health checks. They can be modified through parameter values, which are loaded during a cache clear. To make a health check configurable, define it as you normally would but use the health_check.configurable tag instead.

services:
    # Health Checks
    app.placeholder.health_check:
        class: YourBundle\HealthCheck\PlaceholderHealthCheck
        tags: [{ name: health_check.configurable, priority: 20 }]

Configurable health checks can have various parts of themselves overridden by using parameters. Parameters are given in the form ${serviceId}.${config}. For instance:

parameters:
    # Override the name of the healthcheck
    app.placeholder.health_check.name: "A new name"
    # Override the healthcheck severity 
    app.placeholder.health_check.severity: 2

This would configure app.placeholder.health_check to have a new name and severity.

| Config attribute | Description | Default (if applicable) |
| --- | --- | --- |
| priority | The priority of the health check (the order in which they run). | N/A |
| run | Whether the health check should be run. If false is given, the service definition for the health check is removed and the health check is not run. | true |
| name | Overrides the health check name (equivalent to calling withName on the health check). | N/A |
| severity | Overrides the health check severity (equivalent to calling withSeverity on the health check). | N/A |
| business_impact | Overrides the health check business impact entry (equivalent to calling withBusinessImpact on the health check). | N/A |
| panic_guide | Overrides the health check panic guide (equivalent to calling withPanicGuide on the health check). | N/A |
| technical_summary | Overrides the health check technical summary (equivalent to calling withTechnicalSummary on the health check). | N/A |
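To make the ${serviceId}.${config} naming scheme concrete, the sketch below collects the overrides that apply to one service from a flat list of parameters. It only illustrates the naming convention and the attribute names listed above; it is not how the bundle reads its configuration, and collectOverrides and CONFIG_ATTRIBUTES are hypothetical names.

```php
<?php

// Attribute names taken from the list of config attributes above.
const CONFIG_ATTRIBUTES = [
    'priority',
    'run',
    'name',
    'severity',
    'business_impact',
    'panic_guide',
    'technical_summary',
];

/**
 * Collects the overrides defined for one service (illustrative only).
 *
 * @param string $serviceId  e.g. 'app.placeholder.health_check'
 * @param array  $parameters flat map of parameter name => value
 * @return array map of config attribute => overriding value
 */
function collectOverrides(string $serviceId, array $parameters): array
{
    $overrides = [];

    foreach (CONFIG_ATTRIBUTES as $attribute) {
        // Parameter names follow the form ${serviceId}.${config}.
        $parameterName = $serviceId . '.' . $attribute;

        if (array_key_exists($parameterName, $parameters)) {
            $overrides[$attribute] = $parameters[$parameterName];
        }
    }

    return $overrides;
}

$overrides = collectOverrides('app.placeholder.health_check', [
    'app.placeholder.health_check.name' => 'A new name',
    'app.placeholder.health_check.severity' => 2,
    'health_check.name' => 'My product', // bundle-level option, not an override
]);
```

Here $overrides contains only the name and severity entries; the bundle-level health_check.name parameter is ignored because it does not start with the service ID.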