> ⚠️ **REPO MIGRATION NOTICE** ⚠️
>
> Development of the Wavefront Collector for Kubernetes has moved to https://github.com/wavefrontHQ/observability-for-kubernetes/. This repo is no longer maintained.

# Wavefront Collector for Kubernetes
Wavefront is a high-performance streaming analytics platform for monitoring and optimizing your environment and applications.
The Wavefront Collector for Kubernetes is an agent that runs as a DaemonSet on each node within a Kubernetes cluster. It collects metrics and events about the cluster and sends them to the Wavefront SaaS service.
## Features

- Collects real-time data from all layers of a Kubernetes environment
- Multiple sources of metrics providing comprehensive insight:
  - Kubernetes (kubelet) source: for core Kubernetes metrics
  - Prometheus source: for scraping Prometheus metric endpoints (API server, etcd, NGINX, etc.)
  - Kubernetes state source: for resource state metrics
  - Telegraf source: for host- and application-level metrics
  - Systemd source: for host-level systemd metrics
- Auto discovery of pods and services based on annotation and configuration (see the annotation sketch below)
- DaemonSet mode for high scalability, with leader election for monitoring cluster-level resources
- Rich filtering support
- Auto reload of configuration changes
- Internal metrics for tracking collector health and configuration
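
As a minimal sketch of annotation-based discovery, a pod can opt in to Prometheus scraping via the common `prometheus.io` annotation convention (the pod name, image, and port below are illustrative, not taken from this repo):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                     # illustrative name
  annotations:
    prometheus.io/scrape: "true"   # opt this pod in to auto discovery
    prometheus.io/port: "9102"     # port exposing the /metrics endpoint
spec:
  containers:
  - name: my-app
    image: my-app:latest           # illustrative image
    ports:
    - containerPort: 9102
```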
## Installation

Refer to the installation instructions.
## Configuration

The installation instructions use a default configuration suitable for most use cases. Refer to the documentation for details on all the configuration options.
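
As a rough illustration of the configuration file's shape, a minimal sketch might look like the following (this is not a complete reference; the cluster name and proxy address are placeholders):

```yaml
# collector.yaml -- minimal sketch of a collector configuration
clusterName: my-k8s-cluster        # placeholder cluster name
enableDiscovery: true              # enable annotation-based auto discovery

sinks:
  # send metrics to a Wavefront proxy reachable inside the cluster
  - proxyAddress: wavefront-proxy.default.svc.cluster.local:2878

sources:
  kubernetes_source:               # core kubelet metrics
    url: https://kubernetes.default.svc
    useServiceAccount: true
    prefix: kubernetes.

  internal_stats_source:           # the collector's own health metrics
    prefix: kubernetes.
```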
## Building

Build using `make` and the provided Makefile. Commonly used `make` targets include:

- `fmt` to `go fmt` all your code
- `tests` to run all the unit tests
- `build` that creates a local executable
- `container` that uses a docker container to build for consistency and reproducibility
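
For instance, a typical local workflow might look like this (assuming Go and Docker are installed):

```sh
make fmt        # gofmt the source tree
make tests      # run the unit tests
make build      # produce a local executable
make container  # reproducible build inside a docker container
```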
## Troubleshooting Dropped Metrics

Formerly, the following error appeared in the proxy logs when a metric had too many tags: `Too many point tags`. However, logic has been added to the collector to automatically drop tags in priority order, so that metrics make it through to the proxy without triggering this error.
This is the order of the logic used to drop tags:

1. Explicitly excluded tags (from the `tagExclude` config). Refer here for an example scenario.
2. Tags that are empty or are interpreted to be empty (`"tag.key": ""`, `"tag.key": "-"`, or `"tag.key": "/"`).
3. Tags that are explicitly excluded (`"namespace_id": "..."`, `"host_id": "..."`, `"pod_id": "..."`, or `"hostname": "..."`).
4. Tags whose values are duplicated, where the shorter key is kept (`"tag.key": "same value"` is kept instead of `"tag.super.long.key": "same value"`).
5. Tags whose key matches `alpha.*` or `beta.*`, after keys have been sorted (e.g. `"alpha.eksctl.io/nodegroup-name": "arm-group"` or `"beta.kubernetes.io/arch": "amd64"`).
6. (As a last resort) tags whose key matches IaaS-specific tags, after keys have been sorted (`"kubernetes.azure.com/agentpool": "agentpool"`, `"topology.gke.io/zone": "us-central1-c"`, `"eksctl.io/nodegroup-name": "arm-group"`, etc.).
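
The following Go sketch illustrates the general shape of this priority-order approach. It is a simplified illustration covering only a subset of the rules above, not the collector's actual implementation; the function names, predicates, and tag limit are hypothetical:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dropPredicates are ordered from most to least preferred to drop,
// loosely mirroring the priority list above (illustrative only).
var dropPredicates = []func(key, value string) bool{
	// empty or interpreted-as-empty values
	func(k, v string) bool { return v == "" || v == "-" || v == "/" },
	// explicitly excluded ID tags
	func(k, v string) bool {
		switch k {
		case "namespace_id", "host_id", "pod_id", "hostname":
			return true
		}
		return false
	},
	// alpha.* / beta.* keys
	func(k, v string) bool {
		return strings.HasPrefix(k, "alpha.") || strings.HasPrefix(k, "beta.")
	},
}

// dropTagsInPriorityOrder removes tags predicate by predicate, visiting
// keys in sorted order, until the tag count is at or below maxTags.
func dropTagsInPriorityOrder(tags map[string]string, maxTags int) {
	for _, drop := range dropPredicates {
		if len(tags) <= maxTags {
			return
		}
		keys := make([]string, 0, len(tags))
		for k := range tags {
			keys = append(keys, k)
		}
		sort.Strings(keys)
		for _, k := range keys {
			if len(tags) <= maxTags {
				return
			}
			if drop(k, tags[k]) {
				delete(tags, k)
			}
		}
	}
}

func main() {
	tags := map[string]string{
		"pod_id":                  "123",
		"beta.kubernetes.io/arch": "amd64",
		"app":                     "my-app",
		"empty":                   "-",
	}
	dropTagsInPriorityOrder(tags, 2) // hypothetical tag limit
	fmt.Println(tags)                // keeps "app" and "beta.kubernetes.io/arch"
}
```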
## Contributing

Public contributions are always welcome. Please feel free to report issues or submit pull requests.