Agent 6 docker image
This is how the official agent 6 image available here is built.
How to run it
Head over to datadoghq.com to get the official installation guide.
For a simple docker run, you can quickly get started with:
docker run -d -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v /proc/:/host/proc/:ro \
  -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
  -e DD_API_KEY=<YOUR_API_KEY> \
  datadog/agent:latest
The agent is highly customizable. Here are the most commonly used environment variables:
DD_API_KEY: your API key (required)
DD_SITE: Destination site for your metrics, traces, and logs. Valid options are datadoghq.com for the Datadog US site, and datadoghq.eu for the Datadog EU site.
DD_HOSTNAME: hostname to use for metrics (if autodetection fails)
DD_DD_URL: optional setting to override the URL for metric submission
DD_TAGS: host tags, separated by spaces. For example:
DD_CHECK_RUNNERS: the agent runs all checks concurrently by default (default value = 4 runners). To run the checks sequentially, set the value to 1. If you need to run a high number of checks (or slow checks), the collector-queue component might fall behind and fail the healthcheck. You can increase the number of runners to run more checks in parallel.
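As a rough illustration of how these variables fit together, the sketch below writes a few of them to an env file that `docker run --env-file` can read. The file name `datadog.env` and the site, tag and runner values are placeholders chosen for this example, not recommendations.

```shell
# Hypothetical values: adjust the site, tags and runner count to your setup.
cat > datadog.env <<'EOF'
DD_API_KEY=<YOUR_API_KEY>
DD_SITE=datadoghq.eu
DD_TAGS=env:staging team:web
DD_CHECK_RUNNERS=8
EOF

# Show what would be passed to the container.
cat datadog.env

# Then start the agent with the file, for example:
#   docker run -d --env-file datadog.env \
#     -v /var/run/docker.sock:/var/run/docker.sock:ro \
#     -v /proc/:/host/proc/:ro \
#     -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
#     datadog/agent:latest
```

Values in an env file are read literally up to the end of the line, so the space-separated tag list needs no quoting there.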
Starting with Agent v6.4.0, the agent proxy settings can be overridden with the following environment variables:
DD_PROXY_HTTP: an http URL to use as a proxy for
DD_PROXY_HTTPS: an http URL to use as a proxy for
DD_PROXY_NO_PROXY: a space-separated list of URLs for which no proxy should be used.
Note: at the moment, the trace agent only supports the above proxy environment variables starting from version 6.5.0
For more information: https://docs.datadoghq.com/agent/proxy/#agent-v6
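A minimal sketch of the proxy variables, assuming a proxy reachable at `proxy.example.com:3128` and two internal addresses that should bypass it. All three values are made up for the example.

```shell
# Hypothetical proxy address; replace with your own.
export DD_PROXY_HTTP="http://proxy.example.com:3128"
export DD_PROXY_HTTPS="http://proxy.example.com:3128"
# Space-separated list of URLs that should not go through the proxy.
export DD_PROXY_NO_PROXY="http://internal.example.com http://10.0.0.5"

# Print the settings for review.
printf '%s\n' "$DD_PROXY_HTTP" "$DD_PROXY_HTTPS" "$DD_PROXY_NO_PROXY"

# Pass them through to the container by name, for example:
#   docker run -d -e DD_PROXY_HTTP -e DD_PROXY_HTTPS -e DD_PROXY_NO_PROXY ... datadog/agent:latest
```

Passing `-e NAME` without a value makes `docker run` forward the value from the calling shell, which keeps the space-separated list intact.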
Optional collection agents
These features are disabled by default for security or performance reasons. You need to enable them explicitly:
DD_APM_ENABLED: run the trace-agent along with the infrastructure agent, allowing the container to accept traces on 8126/tcp
DD_LOGS_ENABLED: run the log-agent along with the infrastructure agent. See below for details
DD_PROCESS_AGENT_ENABLED: enable live process collection in the process-agent. The Live Container View is already enabled by default if the Docker socket is available
Dogstatsd (custom metrics)
Send custom metrics via the statsd protocol:
DD_DOGSTATSD_NON_LOCAL_TRAFFIC: listen to dogstatsd packets from other containers, required to send custom metrics
DD_HISTOGRAM_PERCENTILES: histogram percentiles to compute, separated by spaces. The default is "0.95"
DD_HISTOGRAM_AGGREGATES: histogram aggregates to compute, separated by spaces. The default is "max median avg count"
DD_DOGSTATSD_SOCKET: path to the unix socket to listen to. Must be in a
DD_DOGSTATSD_ORIGIN_DETECTION: enable container detection and tagging for unix socket metrics. Running in host PID mode (e.g. with --pid=host) is required.
DD_DOGSTATSD_TAGS: Additional tags to append to all metrics, events and service checks received by this dogstatsd server, for example:
DD_DOCKER_LABELS_AS_TAGS: extract docker container labels
DD_DOCKER_ENV_AS_TAGS: extract docker container environment variables
DD_KUBERNETES_POD_LABELS_AS_TAGS: extract pod labels
DD_KUBERNETES_POD_ANNOTATIONS_AS_TAGS: extract pod annotations
You can either define them in your custom datadog.yaml, or set them as JSON maps in these envvars. The map key is the source (label/envvar) name, and the map value is the Datadog tag name.
You can use shell patterns in label names to define simple rules for mapping labels to Datadog tag names using the same simple template system used by Autodiscovery. This is only supported by
To add all pod labels as tags to your metrics, with tag names prefixed by kube_, you can use the following:
To add only the pod labels that start with app as tags to your metrics, you can use the following:
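The JSON maps described in this section could look like the sketch below. The label and tag names in the explicit map are illustrative, and `%%label%%` is assumed to be the template variable that expands to the matched label name.

```shell
# Explicit mapping: pod label name -> Datadog tag name (illustrative names).
explicit_map='{"app":"kube_app","release":"helm_release"}'

# Pattern mapping: every pod label becomes a tag prefixed with kube_.
all_labels_map='{"*":"kube_%%label%%"}'

# Pattern mapping restricted to labels starting with "app".
app_labels_map='{"app*":"kube_%%label%%"}'

# Pick one map and expose it to the agent; single quotes above keep the JSON intact.
export DD_KUBERNETES_POD_LABELS_AS_TAGS="$all_labels_map"
printf '%s\n' "$DD_KUBERNETES_POD_LABELS_AS_TAGS"
```

Keeping each map in its own shell variable makes it easy to switch between the explicit and pattern forms without retyping the JSON.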
Using secret files (BETA)
Integration credentials can be stored in Docker / Kubernetes secrets and used in Autodiscovery templates. See the setup instructions for the helper script and the agent documentation for more information.
You can exclude containers from metrics collection and Autodiscovery if they are not useful to you. We already exclude Kubernetes and OpenShift pause containers by default. See the datadog.yaml.example file for more documentation and examples.
DD_AC_INCLUDE: whitelist of containers to always include
DD_AC_EXCLUDE: blacklist of containers to exclude
The format for these options is a space-separated list of strings. For example, if you only want to monitor two images and exclude the rest, specify:
DD_AC_EXCLUDE = "image:.*"
DD_AC_INCLUDE = "image:cp-kafka image:k8szk"
Please note that the .stopped.total metrics are not affected by these settings and always count all containers. This does not affect your per-container billing.
Additional Autodiscovery sources
You can add extra listeners and config providers via the DD_EXTRA_CONFIG_PROVIDERS environment variables. They will be added on top of the ones defined in the config_providers section of the datadog.yaml configuration file.
Datadog Cluster Agent
The DCA is a beta feature. If you are facing any issues, please reach out to our support team.
Starting with Agent v6.3.2, you can use the Datadog Cluster Agent.
Cluster level features are now handled by the cluster agent, and you will find a [DCA] notation next to the affected features. Please refer to the user documentation below as well as the technical documentation here for further details on the instrumentation.
Please refer to the dedicated section about the Kubernetes integration for more details.
DD_KUBERNETES_COLLECT_METADATA_TAGS: configures the agent to collect Kubernetes metadata (service names) as tags.
DD_KUBERNETES_METADATA_TAG_UPDATE_FREQ: set the collection frequency in seconds for the Kubernetes metadata (service names) from the API Server (or the Datadog Cluster Agent if enabled).
DD_COLLECT_KUBERNETES_EVENTS[DCA]: configures the cluster agent to collect Kubernetes events. See Event collection for more details.
DD_LEADER_ELECTION[DCA]: activates the leader election. It is activated automatically if DD_COLLECT_KUBERNETES_EVENTS is set to true. The expected value is a bool: true/false.
DD_LEADER_LEASE_DURATION[DCA]: only used if the leader election is activated. See the details here. The expected value is a number of seconds.
DD_KUBE_RESOURCES_NAMESPACE[DCA]: configures the namespace where the Cluster Agent creates the configmaps required for the Leader Election, the Event Collection (optional) and the Horizontal Pod Autoscaling.
DD_JMX_CUSTOM_JARS: space-separated list of custom jars to load in jmxfetch (only for the
DD_ENABLE_GOHAI: enable or disable the system information collector gohai (enabled by default if not set)
DD_EXPVAR_PORT: change the port for fetching expvar public variables from the agent. (defaults to 5000, you may then also have to change the agent_stat.yaml)
Some options are not yet available as environment variable bindings. To customize these, the agent supports mounting a custom /etc/datadog-agent/datadog.yaml configuration file (based on the docker or kubernetes base configurations) for these options, and using environment variables for the rest.
To run custom checks and configurations without building your own image, you can mount additional files in these folders:
/checks.d/: custom checks in this folder will be copied over and used, if a corresponding configuration is found
/conf.d/: check configurations and Autodiscovery templates in this folder will be copied over to the agent's configuration folder. You can mount a host folder, kubernetes configmaps, or other volumes. Note: Autodiscovery templates are now stored directly in the main conf.d folder, not in an
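A small sketch of mounting the two folders from the host. The host paths under the current directory are an assumption made for this example; any host folder, configmap or volume works as long as it ends up at `/checks.d` and `/conf.d` in the container.

```shell
# Hypothetical host folders holding custom checks and their configurations.
CHECKS_DIR="$PWD/checks.d"
CONF_DIR="$PWD/conf.d"
mkdir -p "$CHECKS_DIR" "$CONF_DIR"

# Mount flags to add to the docker run command shown earlier.
MOUNTS="-v $CHECKS_DIR:/checks.d -v $CONF_DIR:/conf.d"

# Dry run: print the command instead of starting a container.
echo "docker run -d $MOUNTS -e DD_API_KEY=<YOUR_API_KEY> datadog/agent:latest"
```

Printing the command first is a cheap way to confirm the paths before a container is started.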
For more information about the container's lifecycle, see SUPERVISION.md.
Without the DCA
This sub-section is only valid for the agent versions < 6.3.2 or when not using the Datadog Cluster Agent.
To deploy the Agent in your Kubernetes cluster, you can use the manifest in manifests. Firstly, make sure you have the correct RBAC in place. You can use the files in manifests/rbac that contain the minimal requirements to run the Kubernetes Cluster level checks and perform the leader election.
kubectl create -f manifests/rbac
Please note that with the above RBAC, every agent will have access to the API Server, to list the pods, services ... These accesses vanish when using the Datadog Cluster Agent. Indeed, the agents will only have access to the local kubelet and only the Cluster Agent will be able to access cluster level insight (nodes, services...).
Once the RBAC is in place, you can then create the agents with:
kubectl create -f manifests/agent.yaml
The manifest for the agent has the KUBERNETES environment variable enabled, which enables the collection of local kubernetes metrics via the kubelet's API. For the event collection and the API server check, please read below.
If you want the event collection to be resilient, you can create a ConfigMap datadogtoken that agents will use to save and share a state reflecting which events were pulled last.
To create such a ConfigMap, you can use the following command:
kubectl create -f manifests/datadog_configmap.yaml
See details in Event Collection.
With the DCA
This sub-section is only valid for agent versions >= 6.3.2 and when using the Datadog Cluster Agent.
Event collection is handled by the cluster agent and the RBAC for the agent is slimmed down to the kubelet's API access. There is now a dedicated ClusterRole for the agent, which should be as follows:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: datadog-agent
rules:
- apiGroups:  # Kubelet connectivity
  - ""
  resources:
  - nodes/metrics
  - nodes/spec
  - nodes/proxy
  verbs:
  - get
It goes along with the ClusterRoleBinding and the Service Account dedicated to the datadog-agents.
Event Collection [DCA]
1/ Set the collect_kubernetes_events variable to true in the datadog.yaml file. You can use the environment variable DD_COLLECT_KUBERNETES_EVENTS for this.
2/ Give the agents proper RBACs to activate this feature. See the RBAC section.
3/ A ConfigMap can be used to store the event.tokenKey and the event.tokenTimestamp. It has to be deployed in the default namespace and be named
kubectl create configmap datadogtoken --from-literal="event.tokenKey"="0"
You can also use the example in [manifests/datadog_configmap.yaml](https://github.com/DataDog/datadog-agent/blob/master/Dockerfiles/manifests/datadog_configmap.yaml).
Note: When the ConfigMap is used, if the agent in charge of collecting the events (via the leader election) dies, the next elected leader will use the ConfigMap to identify the last events pulled. This avoids duplicating collected events and puts less stress on the API Server.
Leader Election [DCA]
This feature relies on Endpoints. You can enable it by setting the DD_LEADER_ELECTION environment variable to true. Nevertheless, the Datadog Agents need to have a set of actions allowed prior to their deployment.
See the RBAC section for more details and keep in mind that these RBAC entities will need to be created before the option is set.
Agents coordinate by performing a leader election among members of the Datadog DaemonSet through kubernetes to ensure only one leader agent instance is gathering events at a given time.
This functionality is disabled by default. Enabling the event collection activates it (see Event collection) to avoid collecting duplicate events and putting extra stress on the API server.
The leaderLeaseDuration is the duration for which a leader stays elected. It should be > 30 seconds and is 60 seconds by default. The longer it is, the less frequently your agents hit the apiserver with requests, but it also means that if the leader dies (and under certain conditions), events can be missed until the lease expires and a new leader takes over.
It can be configured with the environment variable
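For instance, assuming the variable in question is the DD_LEADER_LEASE_DURATION entry from the variable list earlier in this document (a number of seconds), a minimal sketch that enables the election and sanity-checks the lease could be:

```shell
# Enable leader election explicitly and keep the documented default of 60 seconds.
export DD_LEADER_ELECTION=true
export DD_LEADER_LEASE_DURATION=60

# The lease should be longer than 30 seconds; warn if it is not.
if [ "$DD_LEADER_LEASE_DURATION" -le 30 ]; then
  echo "warning: DD_LEADER_LEASE_DURATION should be > 30 seconds" >&2
fi

echo "leader election: $DD_LEADER_ELECTION, lease: ${DD_LEADER_LEASE_DURATION}s"
```

A longer lease reduces API server traffic at the cost of a longer gap in event collection if the leader dies.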
If you are using the DCA, find all the RBAC for the agent as well as the Cluster agent here
You will need to allow the agent to perform a few actions:
datadogtoken to update and query the most up to date version token corresponding to the latest event stored in ETCD.
Events to pull the events from the API Server, format and submit them.
Endpoint. The Endpoint used by the agent for the Leader election feature is named
componentstatuses resource, in order to submit service checks for the Control Plane's components status.
You can find the templates in manifests/rbac here. This will create the Service Account in the default namespace, a Cluster Role with the above rights and the Cluster Role Binding.
Node label collection
The agent can collect node labels from the APIserver and report them as host tags. This feature is disabled by default, as it is usually redundant with cloud provider host tags. If you need it, you can provide a node label -> host tag mapping in the DD_KUBERNETES_NODE_LABELS_AS_TAGS environment variable. The format is the inline JSON described in the tagging section.
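As an illustration of that inline JSON, the sketch below maps two node labels to host tag names. Both the label names and the tag names are hypothetical.

```shell
# Hypothetical mapping: node label name -> host tag name.
export DD_KUBERNETES_NODE_LABELS_AS_TAGS='{"kubernetes.io/hostname":"nodename","region":"node_region"}'

# Print the map that the agent container would receive.
printf '%s\n' "$DD_KUBERNETES_NODE_LABELS_AS_TAGS"
```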
Kubernetes node name as aliases
By default, the agent uses the kubernetes node name as an alias that can be used to forward metrics and events. This makes it possible to submit events and metrics from remote hosts.
However, if you have several clusters where some nodes could have similar node names, host alias collisions could occur. To prevent those, the agent supports the use of a cluster-unique identifier (such as the actual cluster name) through the environment variable DD_CLUSTER_NAME. That identifier is added to the node name as a host alias, which avoids collisions altogether.
Legacy Kubernetes Versions
Our default configuration targets Kubernetes 1.7.6 and later, as we rely on features and endpoints introduced in this version. More installation steps are required for older versions:
- RBAC objects (ClusterRoleBindings) are available since Kubernetes 1.6 and OpenShift 1.3, but are available under different apiVersions:
  - rbac.authorization.k8s.io/v1 in Kubernetes 1.8+ (and OpenShift 3.9+), the default apiVersion we target
  - rbac.authorization.k8s.io/v1beta1 in Kubernetes 1.5 to 1.7 (and OpenShift 3.7)
  - v1 in OpenShift 1.3 to 3.6
You can apply our yaml manifests with the following:
sed "s%authorization.k8s.io/v1%authorization.k8s.io/v1beta1%" clusterrole.yaml | kubectl apply -f -
sed "s%authorization.k8s.io/v1%authorization.k8s.io/v1beta1%" clusterrolebinding.yaml | kubectl apply -f -
or for OpenShift 1.3 to 3.6:
sed "s%rbac.authorization.k8s.io/v1%v1%" clusterrole.yaml | oc apply -f -
sed "s%rbac.authorization.k8s.io/v1%v1%" clusterrolebinding.yaml | oc apply -f -
- The kubelet check retrieves metrics from the Kubernetes 1.7.6+ (OpenShift 3.7.0+) prometheus endpoint. You need to enable cAdvisor port mode for older versions.
Our default daemonset makes use of the downward API to pass the kubelet's IP to the agent. This only works on versions 1.7 and up. For older versions, here are other ways to enable kubelet connectivity:
- On versions 1.6, use fieldPath: spec.nodeName and make sure your node name is resolvable and reachable from the pod
- If DD_KUBERNETES_KUBELET_HOST is unset, the agent will retrieve the node hostname from docker and try to connect there. See docker info | grep "Name:" and make sure the name is resolvable and reachable
- If the IP of the docker default gateway is constant across your cluster, you can directly pass that IP in the DD_KUBERNETES_KUBELET_HOST envvar. You can retrieve the IP with the ip addr show | grep docker0 command.
Our default configuration relies on bearer token authentication to the APIserver and kubelet. On 1.3, the kubelet does not support bearer token auth, so you will need to set up client certificates for the datadog-agent serviceaccount and pass them to the agent.
The Datadog Agent can collect logs from containers starting with version 6. Two installations are possible:
- on the host: where the agent is external to the Docker environment
- or by deploying its containerized version in the Docker environment
To run a Docker container which embeds the Datadog Agent to monitor your host use the following command:
docker run -d --name datadog-agent \
  -e DD_API_KEY=<YOUR_API_KEY> \
  -e DD_LOGS_ENABLED=true \
  -e DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true \
  -e DD_AC_EXCLUDE="name:datadog-agent" \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v /proc/:/host/proc/:ro \
  -v /opt/datadog-agent/run:/opt/datadog-agent/run:rw \
  -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
  datadog/agent:latest
The parameters related to log collection are the following:
-e DD_LOGS_ENABLED=true: this parameter enables log collection when set to true. The Agent looks for log instructions in configuration files.
-e DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true: this parameter adds a log configuration that enables log collection for all containers (see
-v /opt/datadog-agent/run:/opt/datadog-agent/run:rw: to make sure you do not lose any logs from containers during restarts or network issues, the last line collected for each container is stored on the host in this directory.
-e DD_AC_EXCLUDE="name:datadog-agent": to prevent the Datadog Agent from collecting and sending its own logs. Remove this parameter if you want to collect the Datadog Agent logs.
Important notes: Integration Pipelines and Processors will not be installed automatically, as the source and service are set to the docker generic value.
The source and service values can be overridden thanks to Autodiscovery as described below; it automatically installs integration Pipelines that parse your logs and extract all the relevant information from them.
Activate Log Integrations
The second step is to use Autodiscovery to customize the service value. This allows Datadog to identify the log source for each container.
Since version 6.2 of the Datadog Agent, you can configure log collection directly in the container labels. Pod annotations are also supported in Kubernetes environments, see the [Kubernetes Autodiscovery documentation](https://docs.datadoghq.com/agent/autodiscovery/#template-source-kubernetes-pod-annotations).
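A hedged sketch of such a container label, assuming the Autodiscovery label name `com.datadoghq.ad.logs` and a JSON list as its value. The `nginx` source, the `webapp` service and the `nginx:latest` image are example values only.

```shell
# Log configuration as a JSON list (source and service values are examples).
LOGS_CONFIG='[{"source": "nginx", "service": "webapp"}]'

# Label to set on the monitored container, not on the agent container.
LABEL="com.datadoghq.ad.logs=${LOGS_CONFIG}"
echo "$LABEL"

# Start the monitored container with that label, for example:
#   docker run -d --label "$LABEL" nginx:latest
```

Quoting `"$LABEL"` on the command line keeps the spaces and double quotes of the JSON value in a single argument.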
How to build this image
On debian-based systems
You can build your own debian package using
Then you can call inv agent.image-build, which takes the debian package generated above and uses it to build the image.
On other systems
To build the image you'll need the agent debian package that can be found on this APT listing here.
You'll need to download one of the datadog-agent*_amd64.deb packages into this directory. It will then be used by the Dockerfile and installed within the image.
You can then build the image using
docker build -t datadog/agent:master .
To build the jmx variant, add --build-arg WITH_JMX=true to the build command.