A CLI application for monitoring HTTP-based online services.

watchfor detects undesired responses from monitored services. Failures are currently reported by email (more options are planned).


  • checks responses of the HTTP/HTTPS services
  • YAML configuration
  • notifications via emails
  • debugging tool for configuration tests
  • HTTP headers validation
  • HTTP status validation
  • parsing HTML/XML
  • images validation
  • robots.txt validation
  • notifies once when a service goes down
  • TODO: notifies when a service comes back up

Installation, setup and usage

1. Install watchfor in your local Python virtual environment:

ℹ️ ./watchforapp is the path of a directory that does not exist yet; the application will be installed there.

python3.8 -m venv ./watchforapp
./watchforapp/bin/pip install -e 'git+'

☑️ This is what it should look like: [screenshot: installation]

2. Create a simple configuration

ℹ️ A configuration directory is the place where the definitions of monitored services are kept. It should live somewhere other than the installation of the application (./watchforapp).

👍 It is a good idea to keep this directory under version control, for example in a git repository.

mkdir ~/my_services
(cd ~/my_services; git init .)  # optional, or just get this directory from an existing repo

Create a config file for an online service, ~/my_services/, and fill it with the following content (YAML format):

schema: 1
method: GET
protocol: https
timeout: 10.0
  accept-encoding: gzip, deflate, br
  accept-language: pl,en-US;q=0.9,en;q=0.8
  accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
  user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36
checks:
  - request: /
    response:
      - ValidResponse

The above configuration tests a site: it requests the main page / and expects a valid HTTP response. See tests/data1 for more complex examples. Documentation of all options will be available soon.

3. Test your configuration

./watchforapp/bin/watchfor debug -d ~/my_services/

☑️ This is what it should look like: [screenshot: watchfor debug]

4. Alarms

Alarms, the notifications of failures, are defined by an _alarms.yml file in the configuration directory (YAML format). The basic configuration contains a list of actions to be taken when a check fails:

schema: 1
    # for cases with danger=0 (or more)
    - danger: 0
      # sound an alarm after 3 fails in a row
      fails: 3
      # mark a service as recovered after 2 raises in a row
      raises: 2
      # what should happen when this alarm is raised: just send an email.
         - ""

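The fails and raises counters mitigate single, non-persistent failures: an alarm is sounded only after the configured number of fails in a row, and a service is marked as recovered only after the configured number of raises in a row.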
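
The counter behaviour can be sketched in Python as below. This is only an illustration, not watchfor's actual implementation: the AlarmState class and its field names are hypothetical, and it assumes that a "raise" means one successful check.

```python
from dataclasses import dataclass


@dataclass
class AlarmState:
    """Hypothetical per-service state; not watchfor's actual data model."""

    fails_needed: int = 3   # "fails" from _alarms.yml
    raises_needed: int = 2  # "raises" from _alarms.yml
    fail_streak: int = 0
    ok_streak: int = 0
    alarmed: bool = False

    def update(self, check_ok: bool) -> str:
        """Record one check result and return 'alarm', 'recovered' or 'none'."""
        if check_ok:
            self.ok_streak += 1
            self.fail_streak = 0
            if self.alarmed and self.ok_streak >= self.raises_needed:
                self.alarmed = False
                return "recovered"
        else:
            self.fail_streak += 1
            self.ok_streak = 0
            if not self.alarmed and self.fail_streak >= self.fails_needed:
                self.alarmed = True
                return "alarm"
        return "none"


state = AlarmState()
# One isolated failure, then three failures in a row, then two successes.
results = [False, True, False, False, False, True, True]
events = [state.update(ok) for ok in results]
print(events)
```

With these inputs the isolated first failure is ignored, the third consecutive failure produces the alarm, and the second consecutive success produces the recovery.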

⚡ TODO: document danger levels.

⚡ TODO: document different types of actions.

5. Set up the MTA (a mail gateway)

The MTA configuration is defined by a _mta.yml file in the configuration directory. The content of this file looks as follows (YAML format):

host: ""
port: 587
user: ""
password: "xxx-xxx"
ssl: false
tls: true
from: ""

⚡ TODO: configuration for a local sendmail.

To test the _mta.yml configuration, change the config of an online service ~/my_services/ so that it fails, and run the debug command with the -e <email> parameter:

./watchforapp/bin/watchfor debug -d ~/my_services/ -e ""

☑️ This is what it should look like: [screenshot: watchfor debug]

☑️ The notification email looks like this: [screenshot: watchfor debug]

ℹ️ The debug command with the -e parameter always sends a notification on any failure. The check command uses a results manager to track failures, which prevents repeated notifications about the same failure.

6. Run PRODUCTION checks

⚡ TODO: configuration for crontab

The following command reads all configurations (*.yml files) from the directory ~/my_services/, runs all checks and stores the results in the file /tmp/watchfor-my_services.pickle (Python pickle format). In case of a failure, an email is sent according to the setup in _mta.yml.

./watchforapp/bin/watchfor check -d ~/my_services/ -s /tmp/watchfor-my_services.pickle

👍 The results file (-s <file> parameter) keeps the latest statuses to prevent notification spam, that is, too many emails during a longer service downtime or failure. For a long-lasting failure, only one email is issued per day.
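
A minimal Python sketch of such throttling with a pickle file follows. It is not watchfor's code: the should_notify function and the stored dictionary layout are hypothetical, and "one email per day" is interpreted here as a 24-hour window.

```python
import pickle
import tempfile
from datetime import datetime, timedelta
from pathlib import Path
from typing import Dict


def should_notify(store: Path, service: str, now: datetime) -> bool:
    """Return True if a failure email for `service` should be sent at `now`."""
    last_sent: Dict[str, datetime] = {}
    if store.exists():
        with store.open("rb") as fh:
            last_sent = pickle.load(fh)
    previous = last_sent.get(service)
    if previous is not None and now - previous < timedelta(days=1):
        return False  # already notified within the last 24 hours
    last_sent[service] = now
    with store.open("wb") as fh:
        pickle.dump(last_sent, fh)
    return True


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "results.pickle"  # hypothetical results file
    start = datetime(2020, 7, 1, 12, 0)
    decisions = [
        should_notify(store, "example-service", start),
        should_notify(store, "example-service", start + timedelta(hours=1)),
        should_notify(store, "example-service", start + timedelta(days=1)),
    ]

print(decisions)
```

The second call is suppressed because it falls within a day of the first notification; the third one, a full day later, is allowed again.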

The following command also reads all configurations (*.yml files) from the directory ~/my_services/, runs all checks and stores the results in the file /tmp/watchfor-my_services.pickle (Python pickle format). In case of a failure, an email is sent according to the setup in _mta.yml. Additionally, all results are written to /tmp/watchfor-my_services.html.

./watchforapp/bin/watchfor check -d ~/my_services/ -s /tmp/watchfor-my_services.pickle -o /tmp/watchfor-my_services.html

Configuration schema

Each YAML file in your data directory contains the configuration for a single web service to check.

File names:

  • must end with .yml
  • starting with _ are skipped (the prefix is reserved for special cases like _mta.yml).
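
The file name rules above can be illustrated with a short Python sketch. The service_config_files helper and the sample file names are hypothetical; this is not how watchfor necessarily discovers its files.

```python
import tempfile
from pathlib import Path
from typing import List


def service_config_files(directory: Path) -> List[Path]:
    """Return *.yml files in `directory`, skipping names that start with '_'."""
    return sorted(
        path
        for path in directory.glob("*.yml")
        if path.is_file() and not path.name.startswith("_")
    )


with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    # Sample files: one service config, two special files, one unrelated file.
    for name in ("shop.yml", "_mta.yml", "_alarms.yml", "notes.txt"):
        (directory / name).write_text("schema: 1\n")
    names = [path.name for path in service_config_files(directory)]

print(names)
```

Only shop.yml is treated as a service configuration; the files starting with _ and the non-YAML file are ignored.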

# Schema versions are used for compatibility checking; the currently supported schema version is 1.
schema: 1

# The base hostname (domain) of the service.

# Default HTTP method of calls. It can be overridden by each call.
method: GET

# Default HTTP protocol. Available options are: http, https.
protocol: https

# Default timeout period. So far it cannot be overridden.
timeout: 10.0

# Default headers sent with HTTP requests. They can be overridden by each request.
  accept-encoding: gzip, deflate, br
  accept-language: pl,en-US;q=0.9,en;q=0.8
  accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
  user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36

# The "checks" list contains objects, each defining a request and its expected response.
# Each nested "checks" list works the same way.
checks:
  # The "request" key is a string with the path fragment of the URL to be called.
  - request: /

    # The "response" key is a list of validations to be performed on the HTTP response.
    # For details see: *List of available response validations* below.
    response:
      - ValidResponse

List of available response validations:

Each response validation is:

  • a string with the name of the validator or
  • an object with key validator (name of the validator) and custom parameters.
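
Both forms can be reduced to a common shape before validation. The following Python sketch shows one way to do that; normalize_validation is a hypothetical helper and not part of watchfor's API.

```python
from typing import Any, Dict, Tuple, Union

ValidationEntry = Union[str, Dict[str, Any]]


def normalize_validation(entry: ValidationEntry) -> Tuple[str, Dict[str, Any]]:
    """Return (validator name, parameters) for either form of an entry."""
    if isinstance(entry, str):
        return entry, {}
    params = dict(entry)  # copy so the caller's config is not modified
    name = params.pop("validator")
    return name, params


short_form = normalize_validation("ValidResponse")
long_form = normalize_validation({"validator": "ValidImage", "min_size": "100x100"})
print(short_form)
print(long_form)
```

The string form yields a validator name with no parameters, while the object form yields the name plus every other key as a custom parameter.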

ValidResponse - checks the status of the response. The default statuses are [200, 201]. To expect other statuses, use:

  - request: /missing-page-or-forbidden
     - validator: ValidResponse
         - 400
         - 403

HasHeaders - checks the headers of the response. The validator requires a custom parameter headers with an object of expected headers:

  - request: /missing-page-or-forbidden
    response:
      - validator: HasHeaders
        headers:
          content-type: text/html; charset=utf-8
          content-encoding: gzip

👍 Header keys (names) are case-insensitive.

👍 For content-type, all values are parsed (normalised) with python-mimeparse.
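
Case-insensitive matching of header names can be sketched in Python as follows. The missing_headers function is hypothetical, and the sketch compares values literally; it does not reproduce the python-mimeparse normalisation of content-type.

```python
from typing import Dict, List


def missing_headers(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return expected header names that are absent or differ in `actual`."""
    lowered = {name.lower(): value.strip() for name, value in actual.items()}
    return [
        name
        for name, value in expected.items()
        if lowered.get(name.lower()) != value.strip()
    ]


response_headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Content-Encoding": "gzip",
}
expected = {
    "content-type": "text/html; charset=utf-8",
    "content-encoding": "br",
}
print(missing_headers(expected, response_headers))
```

The content-type header matches despite the different capitalisation of its name, so only content-encoding is reported.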

ValidImage - expects the response to be a valid image.

A custom parameter checks the minimal size (in pixels) of the image (useful for Open Graph images):

  - request: /logo.png
    response:
      - validator: ValidImage
        min_size: 100x100
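
How a min_size value could be interpreted is sketched below in Python. Both helpers are hypothetical, and the sketch assumes that each dimension must reach its minimum; the image size is given as a (width, height) tuple, the same shape Pillow reports for an opened image.

```python
from typing import Tuple


def parse_min_size(value: str) -> Tuple[int, int]:
    """Parse a 'WIDTHxHEIGHT' string such as '100x100' into integers."""
    width, height = value.lower().split("x")
    return int(width), int(height)


def is_large_enough(image_size: Tuple[int, int], min_size: str) -> bool:
    """Check that both image dimensions reach the configured minimum."""
    min_width, min_height = parse_min_size(min_size)
    return image_size[0] >= min_width and image_size[1] >= min_height


print(is_large_enough((1200, 630), "100x100"))
print(is_large_enough((64, 64), "100x100"))
```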

If the service serves the WebP format based on the accept header (see WebP via Accept Content Negotiation), it can be checked as follows:

  - request:
      src: /image.webp.jpeg
        accept: text/html,image/webp,image/apng
      - validator: HasHeaders
          content-type: image/webp
      - validator: ValidImage
        format: WEBP

  - request:
      src: /image.webp.jpeg
        accept: text/html,image/apng
      - validator: HasHeaders
          content-type: image/jpeg
      - validator: ValidImage
        format: JPEG

Images are processed by the Pillow library. See the Pillow documentation for the complete list of formats.

ValidFavicon - checks only content-type response headers.

ValidContent - TODO. Options are min_length and max_length.

ValidText - checks only content-type response headers.

ValidRobotsTxt - TODO (unimplemented)

ValidXML - checks only content-type response headers. Useful for sitemaps.

UnGzip - expects the response content to be compressed with gzip and decompresses it for the following validations. Useful for /sitemap.xml.gz.
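
The idea behind such a step can be sketched with Python's standard gzip module. The ungzip function is a hypothetical stand-in and not the actual UnGzip implementation.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def ungzip(body: bytes) -> bytes:
    """Decompress a gzip body, raising ValueError if it is not gzip data."""
    if not body.startswith(GZIP_MAGIC):
        raise ValueError("response body is not gzip-compressed")
    return gzip.decompress(body)


sitemap = b"<urlset><url><loc>/content-page-0001.html</loc></url></urlset>"
compressed = gzip.compress(sitemap)
print(ungzip(compressed) == sitemap)
```

Validations that follow would then work on the decompressed bytes, for example parsing them as XML.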

ParseHTML - reads the content of the response and parses it as an HTML document.

The following example reads <meta property="og:image" content="SOME-URL-TO-IMAGE" /> from the response and checks the referenced image:

  - request: /
      - ValidResponse
      - reader: ParseHTML
          selector: html head meta[property="og:image"]
          action: ReadProperty
          property: content
            - request:
              - ValidResponse
              - validator: ValidImage
                min_size: 100x100
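
What the selector and the ReadProperty action extract can be illustrated with Python's standard html.parser module. This is a simplified sketch and not the parsing library used by watchfor: the OgImageParser class is hypothetical and, unlike the selector, it does not check that the tag sits inside html head.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class OgImageParser(HTMLParser):
    """Collect the content of every <meta property="og:image"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.images: List[str] = []

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("property") == "og:image":
            content = attributes.get("content")
            if content:
                self.images.append(content)


page = '<html><head><meta property="og:image" content="/img/cover.png" /></head></html>'
parser = OgImageParser()
parser.feed(page)
print(parser.images)
```

Each collected value would then be used as the URL of the nested request.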

ParseXML - reads the content of the response and parses it as an XML document. Useful for sitemaps.

The following example reads the content of /sitemap.xml (which contains a sitemapindex) and follows the first sitemap to reach a real page (this depends on the site configuration):

  - request: /sitemap.xml
      - ValidResponse
      - ValidXML
      - reader: ParseXML
          # the content is <sitemapindex><sitemap><loc>/sitemap-page-01.xml.gz</loc></sitemap></sitemapindex>
          selector: sitemapindex sitemap:first-of-type loc
          action: ReadContent
            # request would call: /sitemap-page-01.xml.gz
            - request:
                - ValidResponse
                - UnGzip
                - reader: ParseXML
                    # the content is <urlset><url><loc>/content-page-0001.html</loc></url></urlset>
                    selector: urlset url:first-of-type loc
                    action: ReadContent
                      # request would call: /content-page-0001.html
                      - request:
                          - ValidResponse
                          - validator: HasHeaders
                              content-type: text/html; charset=utf-8
                          - reader: ParseHTML
                              # just read the page and check og:image
                              selector: html head meta[property="og:image"]
                              action: ReadProperty
                              property: content
                                - request:
                                    - ValidResponse
                                    - validator: ValidImage
                                      min_size: 100x100

💣 Please note the selector urlset url:first-of-type loc: it takes only the first occurrence of <url></url> in <urlset></urlset>. Without :first-of-type, all <url></url> elements would be processed, and it may take some time to visit all pages in the sitemap index.
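
The difference can be illustrated with Python's standard xml.etree.ElementTree module. This is a rough analogy only: ElementTree path expressions stand in for the CSS selectors, and the sample sitemap is simplified (real sitemaps usually declare an XML namespace).

```python
import xml.etree.ElementTree as ET

SITEMAP = """
<urlset>
  <url><loc>/content-page-0001.html</loc></url>
  <url><loc>/content-page-0002.html</loc></url>
  <url><loc>/content-page-0003.html</loc></url>
</urlset>
"""

root = ET.fromstring(SITEMAP)

# Roughly "urlset url loc": every <url> entry would be followed.
all_pages = [loc.text for loc in root.findall("./url/loc")]

# Roughly "urlset url:first-of-type loc": only the first <url> entry.
first_page = [loc.text for loc in root.findall("./url[1]/loc")]

print(len(all_pages), first_page)
```

With a large sitemap the first variant triggers one nested request per page, while the second triggers a single one.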

👍 For the available selector options, see the parsing library and the general specification.

Installation for development

Clone the watchfor repo from GitHub and create a local Python virtual environment:

git clone ''
cd watchfor
python3.8 -m venv ./venv
./venv/bin/pip install -r requirements.txt
./venv/bin/pip install -r requirements-dev.txt

Run debug for the sample configuration:

./venv/bin/python -m watchfor debug -d ./tests/data1/

Run unit tests (pytest) only once:

./venv/bin/pytest watchfor/tests

Run unit tests (pytest + pytest-watch) after each code change (auto reload):

./venv/bin/ptw watchfor -- watchfor/tests

Run debug after each code change (auto reload with watchdog):

./venv/bin/watchmedo auto-restart --ignore-directories --recursive -d . -p '*.py;*.mako;*.yml' -- ./venv/bin/python -m watchfor debug -d ./tests/data1/


