kabisa/terraform-datadog-service-check-monitor

terraform-datadog-service-check-monitor

This is the base module we use for service checks in Datadog. A good example of its use can be found here

Getting Started

Pre-commit:

  • Install pre-commit, e.g. brew install pre-commit.
  • Run pre-commit install in this repo. (You need to run pre-commit install every time you clone a repo that has pre-commit enabled.)
  • That's it! From now on, every time you commit a code change (.tf file), the hooks listed under hooks: in .pre-commit-config.yaml will execute.

Requirements

| Name | Version |
|------|---------|
| datadog | ~> 3.4 |
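
The version constraint above can be expressed in the calling configuration roughly as follows. This is a minimal sketch rather than the module's own configuration: the empty provider block assumes credentials are supplied through the provider's DD_API_KEY and DD_APP_KEY environment variables.

```hcl
terraform {
  required_providers {
    datadog = {
      source  = "DataDog/datadog"
      # Matches the requirement listed in the table above.
      version = "~> 3.4"
    }
  }
}

# Assumption: the API and application keys come from the
# DD_API_KEY and DD_APP_KEY environment variables.
provider "datadog" {}
```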

Providers

| Name | Version |
|------|---------|
| datadog | 3.4.0 |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| datadog_monitor.monitor | resource |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| additional_tags | Additional tags to set on the monitor. Good tagging can be hard, but it is very useful for making cross sections of the environment. Datadog has a few default tags; https://docs.datadoghq.com/getting_started/tagging/ is a good place to start reading about tags. | list(string) | [] | no |
| alert_message | Message to be sent when the alert threshold is hit | string | n/a | yes |
| alerting_enabled | If set to false, no alerts will be sent based on this monitor | bool | true | no |
| auto_resolve_time_h | Number of hours after which a triggered monitor that receives no data is automatically resolved. | number | null | no |
| by_tags | List of tags for the "by" part of the query. This should only include the keys of key:value type tags. | list(string) | [] | no |
| critical_threshold | n/a | number | null | no |
| custom_message | This field gives the option to add custom text. Both 'note' and 'docs' are prefixed in the template with 'note:' and 'docs:' respectively; 'custom_message' allows free-format text. | string | "" | no |
| docs | Field in the alert message that can be used to document why the alert was sent or what to do. It's best to include links to authoritative resources about what's being monitored. Try to capture why the alert was sent and what the engineer should do about it. | string | "" | no |
| enabled | If set to false, the monitor resource will not be created | bool | true | no |
| env | The environment, or stage of deployment, that this monitor is checking. Good values are prd, acc, tst, dev... | string | n/a | yes |
| exclude_tags | List of tags for the "exclude" part of the query. Can be either key:value tags or boolean tags. | list(string) | [] | no |
| include_tags | List of tags for the "over" part of the query. Can be either key:value tags or boolean tags. | list(string) | [] | no |
| locked | Makes sure only the creator or admin can modify the monitor | bool | true | no |
| metric_name | Name of the status metric being monitored. If this check is not ok a number of times, as defined by the threshold, an alert is raised. | string | n/a | yes |
| name | Name that the monitor should get. It is automatically prefixed with the service name; name_suffix and name_prefix also affect the final name. It's best to set this property to a value that describes the concern you're trying to cover with the monitor, e.g. Connection Available | string | n/a | yes |
| name_prefix | Can be used to add a prefix to the monitor name | string | "" | no |
| name_suffix | Can be used to add a suffix to the monitor name | string | "" | no |
| no_data_message | Message to be sent when the monitor is no longer receiving data | string | "" | no |
| no_data_timeframe | n/a | number | null | no |
| note | Field in the alert message that can be used to bring something to the attention of the engineer handling the alert | string | "" | no |
| notification_channel | Channel to which Datadog sends alerts; overridden by alerting_enabled if that is set to false | string | "" | no |
| notify_no_data | Do you want an alert when monitoring stops sending data? | bool | false | no |
| ok_threshold | n/a | number | null | no |
| priority | Number from 1 (high) to 5 (low). | number | n/a | yes |
| recovery_message | Recovery message to be sent when the alert threshold is no longer hit | string | "" | no |
| require_full_window | n/a | bool | true | no |
| service | Service name of what you're monitoring. This also sets the service: tag on the monitor | string | n/a | yes |
| service_display_name | n/a | string | null | no |
| track_as_cluster_level_status | Checks the status of a cluster instead of individual hosts; warning and critical thresholds are then expressed as percentages | bool | false | no |
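
As an illustration of how these inputs fit together, a call to the module might look like the sketch below. Everything specific in it is an assumption for the sake of the example: the Git source URL is derived from the repository name and is not pinned to a ref, and the service name, check name (postgres.can_connect), tags, threshold, and notification handle are hypothetical values.

```hcl
module "connection_available" {
  # Assumed source: Git URL derived from the repository name.
  # Pin a specific ref or version in real use.
  source = "git::https://github.com/kabisa/terraform-datadog-service-check-monitor.git"

  # Required inputs
  name          = "Connection Available"
  service       = "my-database" # hypothetical service name
  env           = "prd"
  metric_name   = "postgres.can_connect" # illustrative service check
  alert_message = "The database is not accepting connections."
  priority      = 2

  # Optional inputs (all values are illustrative)
  critical_threshold   = 3
  include_tags         = ["env:prd"]
  by_tags              = ["host"]
  additional_tags      = ["team:platform"]
  notification_channel = "@slack-my-alerts" # hypothetical handle
  docs                 = "Check that the database host is reachable and the service is running."
}
```

Only the six inputs marked as required in the table have to be set; the others fall back to their defaults.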

Outputs

| Name | Description |
|------|-------------|
| alert_id | n/a |
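
The alert_id output can be referenced from the calling configuration like any other module output. The sketch below is self-contained and makes several assumptions: the Git source URL is derived from the repository name, the input values are hypothetical, and because the output has no description, its meaning (presumably the identifier of the created monitor) is a guess.

```hcl
module "connection_available" {
  # Assumed source: Git URL derived from the repository name.
  source = "git::https://github.com/kabisa/terraform-datadog-service-check-monitor.git"

  # Required inputs only; all values are hypothetical.
  name          = "Connection Available"
  service       = "my-database"
  env           = "prd"
  metric_name   = "postgres.can_connect"
  alert_message = "The database is not accepting connections."
  priority      = 2
}

# Re-export the module output from the root module.
output "connection_monitor_id" {
  description = "Value of the module's alert_id output (assumed to identify the created monitor)"
  value       = module.connection_available.alert_id
}
```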