
Tool to validate the Prometheus rules metadata and expression properties to match requirements and constraints of the particular Prometheus cluster setup.

FUSAKLA/promruval

Promtool allows users to verify the syntactic correctness of rules and to test PromQL expressions. Promruval aims to validate the rules' metadata and expression properties against the requirements and constraints of a particular Prometheus cluster setup. Users define their validation rules in a simple YAML configuration and pass them to promruval, which validates the specified Prometheus rule files the same way promtool does. It is typically used in a CI pipeline. You can read a blog post about the motivation and usage here or watch a lightning talk about it from PromCon.

Example use-cases

  • Make sure the playbook linked by an alert is a valid URL and really exists.
  • Ensure the range selectors in the expr are not shorter than three times your Prometheus scrape interval.
  • Avoid querying more data than the retention of the Prometheus in use by checking that the expr does not use data older than a specified limit.
  • Make sure the expr does not use any of the specified labels. Useful with Thanos, to forbid external labels when alerting on Prometheus and so avoid confusion.
  • Ensure alerts have the required labels expected by routing in Alertmanager, possibly with allowed values.
  • Make sure alerts have the expected annotations for rendering the alert template.
  • Forbid usage of labels or annotations that have been deprecated.
  • and many more...
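
To make the range-selector use case above more concrete, here is a minimal Python sketch of the idea. It is not how promruval implements the check: promruval works on the parsed PromQL expression, while this sketch uses a regular expression, handles only single-unit durations such as 30s or 5m, and can be fooled by brackets inside label values. The function name too_short_ranges and the default factor of 3 are assumptions made for illustration.

```python
import re

# Seconds per PromQL duration unit (single-unit durations only).
UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}

# Matches range selectors such as [5m] and subquery ranges such as [1h:5m].
RANGE_RE = re.compile(r"\[(\d+)(ms|s|m|h|d|w|y)(?::[^\]]*)?\]")


def too_short_ranges(expr: str, scrape_interval_s: float, factor: int = 3) -> list[str]:
    """Return the ranges in expr that are shorter than factor * scrape interval."""
    minimum = factor * scrape_interval_s
    found = []
    for value, unit in RANGE_RE.findall(expr):
        if int(value) * UNIT_SECONDS[unit] < minimum:
            found.append(f"{value}{unit}")
    return found
```

With a 30s scrape interval, a [1m] range is flagged because it is shorter than 90 seconds, while a [5m] range passes.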

As a good starting point, you can use docs/default_validation.yaml, which contains basic validations that are useful for most users.

Validations are quite flexible, so you can use them as you see fit.

👉 Full list of supported validations can be found here.

In case you would like to add some, please create a feature request!

Installation

Use the prebuilt binaries or the Docker image, or build it yourself:

go install github.com/fusakla/promruval/v3@latest

or

make build

Supported platforms

Promruval is tested only on linux amd64. It should work on other platforms as well, but that is not tested. Each release contains binaries for linux, darwin and windows on multiple architectures (amd64, arm64), so please use them with caution and report any issues.

Usage

$ ./promruval --help-long
usage: promruval [<flags>] <command> [<args> ...]

Prometheus rules validation tool.

Flags:
      --[no-]help   Show context-sensitive help (also try --help-long and --help-man).
  -c, --config-file=CONFIG-FILE ...
                    Path to validation config file. Can be passed multiple times, only validationRules will be reflected from the additional configs.
      --[no-]debug  Enable debug logging.

Commands:
help [<command>...]
    Show help.


version
    Print version and build information.


validate [<flags>] <path>...
    Validate Prometheus rule files in YAML or jsonnet format using validation rules from config file(s).

    -d, --disable-rule=DISABLE-RULE ...
                                   Allows to disable any validation rules by it's name. Can be passed multiple times.
    -e, --enable-rule=ENABLE-RULE ...
                                   Only enable these validation rules. Can be passed multiple times.
    -o, --output=[text,json,yaml]  Format of the output.
        --[no-]color               Use color output.
        --[no-]support-loki        Support Loki rules format.
        --[no-]support-mimir       Support Mimir rules format.
        --[no-]support-thanos      Support Thanos rules format.

validation-docs [<flags>]
    Print human readable form of the validation rules from config file.

    -o, --output=[text,markdown,html]
      Format of the output.

Jsonnet support

Promruval supports the default YAML format (.yaml or .yml) of rule files and also rules written in Jsonnet (.jsonnet). Jsonnet files are rendered using the go-jsonnet library and then validated as usual, so you don't have to evaluate them yourself before running the validation.

Configuration composition

The --config-file flag can be passed multiple times. Promruval appends the validation rules from the additional configs and overrides the other settings, so the config passed last wins. This lets you compose configurations, for example when you have validations specific to a subset of rules.

Example:

rules/
  validations.yaml # Generic validations that apply to all rules
  prometheus/
     validations.yaml # Specific validations for Prometheus rules (different Prometheus URL, shorter data retention, no external labels etc)
     rules.yaml
  thanos/
     validations.yaml # Specific validations for Thanos (different URL, longer retention etc)
     rules.yaml

And Promruval would be run as

promruval validate --config-file ./rules/validations.yaml --config-file ./rules/prometheus/validations.yaml ./rules/prometheus/*.yaml
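
The composition described above can be pictured with a small Python sketch. This is an illustration under assumptions, not promruval's code: it treats each config as an already parsed dictionary, appends validationRules in the order the files were passed, and replaces every other top-level key wholesale. Whether promruval merges nested settings such as prometheus key by key is not stated here, so check the behaviour of your version. The function name compose_configs and the example URLs are hypothetical.

```python
def compose_configs(configs: list[dict]) -> dict:
    """Merge parsed config dicts in the order they were passed on the command line."""
    merged: dict = {"validationRules": []}
    for config in configs:
        for key, value in config.items():
            if key == "validationRules":
                # Validation rules accumulate across all config files.
                merged["validationRules"].extend(value or [])
            else:
                # ASSUMPTION: any other top-level setting is replaced by the later file.
                merged[key] = value
    return merged


# Hypothetical contents of the generic and the Prometheus-specific config.
generic = {
    "prometheus": {"url": "https://foo.bar/"},
    "validationRules": [{"name": "generic-checks"}],
}
specific = {
    "prometheus": {"url": "https://prometheus.example/"},
    "validationRules": [{"name": "prometheus-checks"}],
}
composed = compose_configs([generic, specific])
```

In this sketch, composed contains both validation rules, while the prometheus settings come from the config passed last.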

Configuration

Promruval uses a yaml configuration file to define the validation rules.

Basic structure is:

# OPTIONAL Overrides the annotation used for disabling rules.
customExcludeAnnotation: my_disable_annotation

prometheus:
  # URL of the running prometheus instance to be used
  url: https://foo.bar/
  # OPTIONAL Skip TLS verification
  insecureSkipTlsVerify: false
  # OPTIONAL Relative path to a file containing a bearer token to be used for authentication (the bearer token can also be set using the PROMETHEUS_BEARER_TOKEN env variable, which has higher priority)
  # NOTE: The value will have whitespace trimmed from the beginning and end.
  bearerTokenFile: bearer_token.txt
  # OPTIONAL Timeout for any request on the Prometheus instance
  timeout: 30s
  # OPTIONAL name of the file to save cache of the Prometheus calls for speedup
  cacheFile: .promruval_cache.json
  # OPTIONAL maximum age how old the cache can be to be used
  maxCacheAge: 1h
  # OPTIONAL offset(delay) of the query evaluation time (useful for consistency if using remote write for example).
  queryOffset: 1m
  # OPTIONAL how long into the past to look in queries supporting time range (just metadata queries for now).
  queryLookback: 20m
  # OPTIONAL HTTP headers to be added to the request
  httpHeaders:
    foo: bar

validationRules:
  # Name of the validation rule.
  - name: example-validation
    # What Prometheus rules to validate, possible values are: 'Group', 'Alert', 'Recording rule', 'All rules'.
    scope: All rules
    # List of validations to be used.
    validations:
      # Name of the validation type. See the /docs/validations.md.
      - type: hasLabels
        # Additional details that will be appended to the default error message. Useful to customize the error message.
        additionalDetails: "We do this because ..."
        # Parameters of the validation. See the /docs/validations.md for details on params of each validation.
        params:
          labels: [ "severity" ]
        # OPTIONAL If you want to load the parameters from a separate file, you can use this option.
        # Its value must be a relative path to the file from the location of the config file.
        # The content of the file must be in the exact form as the expected params would be.
        # The option is mutually exclusive with the `params` option.
        # paramsFromFile: ./params.yaml
      ...

For a complete list of supported validations see the docs/validations.md.

For an example configuration, see examples/validation.yaml.

How to run it

If you downloaded the prebuilt binary or built it on your own:

promruval validate --config-file=examples/validation.yaml examples/rules.yaml

Or using the Docker image:

docker run -it -v $PWD:/rules fusakla/promruval validate --config-file=/rules/examples/validation.yaml /rules/examples/rules.yaml

Validation using live Prometheus instance

Even though these validations are useful, they can be flaky and dangerous for the Prometheus instance. If you have a large number of rules and run the check often, the number of queries can be huge, or the instance might be down, which makes your validation flaky.

Therefore, it's recommended to treat this check as a warning and not fail the pipeline when it does not succeed. Also consider running it periodically (for example once per day) instead of on every commit in CI.
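
One way to treat the check as a warning is to wrap the command so that a failure is reported but never fails the job. The Python sketch below is only an illustration: the helper name run_as_warning is hypothetical, and it assumes the validation command exits with a non-zero code when a validation fails. Most CI systems also offer a built-in "allow failure" option that achieves the same result.

```python
import subprocess
import sys


def run_as_warning(command: list[str]) -> int:
    """Run a validation command; report a failure as a warning and always return 0."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # ASSUMPTION: a non-zero exit code means the validation did not succeed.
        print(f"WARNING: validation failed (exit code {result.returncode})", file=sys.stderr)
        print(result.stdout + result.stderr, file=sys.stderr)
    return 0


# Example call, using the flags shown in this README:
# sys.exit(run_as_warning(
#     ["promruval", "validate", "--config-file", "examples/validation.yaml", "examples/rules.yaml"]
# ))
```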

Disabling validations

There are four ways to disable certain validations:

  • Using cmd line flag
  • Using YAML comments
  • Using PromQL expression comments
  • Using alert annotation

The last two are useful if you use, for example, jsonnet to generate the rules. Then you can't use the YAML comments, but you can set the comments in the expression or use alert annotations. Unfortunately, those have a limited scope of usage (recording rules cannot have annotations, and neither method can disable validations on the group or file level).

Using cmd line flag

If you want to temporarily disable any of the validation rules for all the tested files, you can use the --disable-rule flag with value corresponding to the name of the validation rule you want to disable. Can be passed multiple times.

Example:

# Promruval validation configuration
validationRules:
  - name: check-irate
    scope: Alert
    validations:
      - type: expressionDoesNotUseIrate

promruval validate --config-file examples/validation.yaml --disable-rule check-irate examples/rules.yaml

Using YAML comments

You can use comments in YAML to disable certain validations. This can be done on the file, group or rule level. The comment should be in the format # ignore_validations: validationName1, validationName2, ... where each validationName is the name of a validation as defined in docs/validations.md.

The ignore_validations prefix can be changed using the customDisableComment config option.

Example:

# Disable for the whole file
# ignore_validations: expressionDoesNotUseIrate
groups:
  # Disable only for the following rule group
  # ignore_validations: expressionDoesNotUseIrate
  - name: group1
    partial_response_strategy: abort
    interval: 1m
    limit: 10
    rules:
      # Disable only for the following rule
      # ignore_validations: expressionDoesNotUseIrate
      - record: recorded_metrics
        expr: 1
        labels:
          foo: bar

Using PromQL expression comments

In the same way as with YAML comments, you can use comments in the PromQL expression to disable certain validations. The comment uses the same format # ignore_validations: validationName1, validationName2, ... where each validationName is the name of a validation as defined in docs/validations.md. The comment can appear multiple times and anywhere in the expression.

The ignore_validations prefix can be changed using the customDisableComment config option.

Example:

groups:
  - name: test-group
    rules:
      - alert: test-alert
        expr: |
          # ignore_validations: expressionDoesNotUseIrate
          irate(http_requests_total[5m]) # ignore_validations: expressionDoesNotUseIrate
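
The comment format used in YAML and PromQL comments can be illustrated with a short Python sketch. It is not promruval's parser: the function name parse_disabled_validations is hypothetical, and the sketch naively treats the first # on a line as the start of the comment, so it would misread a # inside a quoted string.

```python
def parse_disabled_validations(line: str, prefix: str = "ignore_validations") -> list[str]:
    """Return the validation names listed in a '# <prefix>: a, b' comment, or []."""
    _, hash_sign, comment = line.partition("#")
    if not hash_sign:
        return []  # the line has no comment at all
    key, colon, names = comment.strip().partition(":")
    if not colon or key.strip() != prefix:
        return []  # a comment, but not a disable comment
    return [name.strip() for name in names.split(",") if name.strip()]
```

The prefix parameter mirrors the idea of a configurable comment prefix; the names returned are the validations to skip for the file, group, rule or expression the comment belongs to.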

Using alert annotation

If you can't (or don't want to) use comments to disable validations, you can use the special annotation disabled_validation_rules. It holds a comma-separated list of validation rule names to be skipped for the particular alert. Since annotations are only available for alerts, this method can be used only for alerts!

The disabled_validation_rules annotation name can be changed using the customExcludeAnnotation config option.

Example:

# Promruval validation configuration
validationRules:
  - name: check-irate
    scope: Alert
    validations:
      - type: expressionDoesNotUseIrate

# Prometheus rule file
groups:
  - name: test-group
    rules:
      - alert: test-alert
        expr: 1
        annotations:
          disabled_validation_rules: check-irate # Will disable the check-irate validation rule check for this alert

Other monitoring solutions support

Thanos

If you want to validate Thanos rules, use the promruval validate --support-thanos flag, otherwise you might get errors on unknown fields such as partial_response_strategy.

You can validate it using the hasValidPartialResponseStrategy validation.

Mimir

If you want to validate Mimir rules, use the promruval validate --support-mimir flag, otherwise you might get errors on unknown fields such as source_tenants.

The source_tenants can be validated using the hasSourceTenantsForMetrics or hasAllowedSourceTenants validations for example.

Loki

If you want to validate Loki rules, use the promruval validate --support-loki flag, otherwise you might get errors on unknown fields such as namespace or remote_write.

Since Loki has an almost identical rule config to Prometheus, you can use the same validations for Loki rules. Loki also has special validations for its expressions, since it uses a different query language, LogQL. To see the LogQL-specific validations, see here.

Human readable validation description

If you want a more human-readable validation summary (for documentation or for generating readable git pages), you can use the validation-docs command, see the usage. It prints a more readable form of the validation rules than the configuration file itself and supports multiple output formats: text, Markdown and HTML. See the examples of the Markdown and HTML output.

promruval validation-docs --config-file examples/validation.yaml --output=html