Add Agent standalone k8s manifest #23679

ChrsMark · 2021-01-26T10:37:10Z

What does this PR do?

This PR adds a k8s manifest for running Elastic Agent in standalone mode with the k8s integration enabled by default. The manifest deploys Agent as DaemonSet Pods on all k8s nodes and as a single Deployment Pod on the cluster. The DaemonSet Pods collect metrics from each node's kubelet API and from kube-proxy, and try to autodiscover the k8s Scheduler Pod and the k8s Controller Manager Pod (which are deployed on the master node(s)) so that they can start collecting from them dynamically using the respective metricsets. The Deployment Pod collects cluster-wide metrics from the kube_state_metrics service running on the cluster.
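
As a rough illustration of the layout described above, the two workloads could be sketched as follows. This is not the manifest from this PR: the resource names, labels, namespace, and image tag are assumptions made for the example, and the service account, RBAC rules, and Agent config mount that a working manifest needs are left out.

```yaml
# Hypothetical skeleton only; names, labels, namespace and image tag are assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: elastic-agent              # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: elastic-agent
  template:
    metadata:
      labels:
        app: elastic-agent
    spec:
      tolerations:
        # Allow scheduling on master nodes so scheduler and
        # controller-manager Pods running there can be reached.
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
        - name: elastic-agent
          image: docker.elastic.co/beats/elastic-agent:8.0.0   # assumed tag
          env:
            # Expose the node name through the Kubernetes downward API.
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: elastic-agent-cluster      # hypothetical name
  namespace: kube-system
spec:
  replicas: 1                      # one Pod is enough for cluster-wide metrics
  selector:
    matchLabels:
      app: elastic-agent-cluster
  template:
    metadata:
      labels:
        app: elastic-agent-cluster
    spec:
      containers:
        - name: elastic-agent
          image: docker.elastic.co/beats/elastic-agent:8.0.0   # assumed tag
```

Splitting the work this way keeps node-local collection on every node while avoiding duplicate cluster-wide documents, since only one Pod queries kube_state_metrics.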

@blakerouse @masci @ph @ruflin I would love your feedback here.

Disclaimer: The manifest works if we disable the dynamic inputs part. Full information about the open issues is at the bottom of this description; see also #23685.

How to test this PR locally

1. Run a kind cluster locally using the following:

# three node (two workers) cluster config
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker

kind create cluster --config kind-mutly.yaml
2. Uncomment the scheduler and controllermanager config section and deploy Agent: kubectl apply -f elastic-agent-standalone-kubernetes.yml
3. Verify that all data streams ship data:

Install the Kubernetes integration from the Fleet UI and verify that the dashboards, as well as the Metrics UI, work properly.

Related issues

Open Issues

Dynamic inputs / Former Autodiscover
The dynamic inputs setup that automatically discovers the scheduler and controllermanager Pods does not fully work right now, and we get the following error:

2021-01-25T15:36:29.224Z	DEBUG	application/periodic.go:40	Failed to read configuration, error: could not emit configuration: could not create the AST from the configuration: missing field accessing 'inputs' (source:'/etc/agent.yml')

Converting the ${NODE_NAME} placeholders to ${env.NODE_NAME} does not fix the problem. Even if we remove all other data stream configs and leave only the dynamic one shown here, we still get the error:

- id: >-
    kubernetes/metrics-kubernetes.controllermanager-3d50c483-2327-40e7-b3e5-d877d4763fe1
  data_stream:
    dataset: kubernetes.controllermanager
    type: metrics
  metricsets:
    - controllermanager
  hosts:
    - '${kubernetes.pod.ip}:10252'
  period: 10s
  condition: ${kubernetes.pod.labels.component} == 'kube-controller-manager'
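
For comparison, the scheduler counterpart of this stream would presumably follow the same pattern. The sketch below is an assumption rather than a copy from the manifest: it omits the `id`, and it assumes the scheduler Pod carries the `component: kube-scheduler` label and serves metrics on port 10251, as kubeadm-style clusters of this period typically do.

```yaml
# Assumed scheduler analogue of the controllermanager stream above.
- data_stream:
    dataset: kubernetes.scheduler
    type: metrics
  metricsets:
    - scheduler
  hosts:
    # Resolved per discovered Pod by the kubernetes provider.
    - '${kubernetes.pod.ip}:10251'
  period: 10s
  # Only instantiate this stream for Pods labelled as the scheduler.
  condition: ${kubernetes.pod.labels.component} == 'kube-scheduler'
```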

In addition, if we remove the dynamic inputs part and keep only ${env.NODE_NAME}, we still get the same error.

Given this, there might be a bug in Agent that prevents us from combining these two configuration approaches.
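
To make the second approach concrete, here is a minimal sketch of a stream that relies only on the environment variable. It assumes the Pod spec exposes `NODE_NAME` (for example through the downward API field `spec.nodeName`) and that the kubelet listens on its default secure port 10250. The dataset choice and TLS settings are illustrative, not taken from this PR.

```yaml
# Sketch: kubelet stream addressed through the env provider (illustrative values).
- data_stream:
    dataset: kubernetes.node
    type: metrics
  metricsets:
    - node
  hosts:
    # ${env.NODE_NAME} is read from the container environment.
    - 'https://${env.NODE_NAME}:10250'
  period: 10s
  # Authenticate to the kubelet with the Pod's service account token.
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  ssl.verification_mode: none
```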

Package setup requires manual interaction with the UI
After deploying the manifests, the package is not automatically installed; the user has to install it manually from the Fleet UI. This is already known, but I'm putting it here for reference.

Signed-off-by: chrismark <chrismarkou92@gmail.com>

elasticmachine · 2021-01-26T10:38:12Z

Pinging @elastic/integrations (Team:Integrations)

elasticmachine · 2021-01-26T11:34:06Z

💚 Build Succeeded

Build stats

Build Cause: Pull request #23679 updated
Start Time: 2021-02-16T11:20:28.199+0000
Duration: 52 min 58 sec
Commit: 9955c74

❕ Flaky test report

No test was executed to be analysed.

blakerouse · 2021-01-26T12:12:51Z

@ChrsMark Let's file an issue for the dynamic inputs piece. You are using it correctly here, so it should work; we need to track down and fix why it is not working on the Agent side.

There is also no way to disable dynamic inputs in Agent: ${NODE_NAME} would still be treated as a variable by dynamic inputs and would fail to resolve. It should still be ${env.NODE_NAME}.

…ent_manifests

Signed-off-by: chrismark <chrismarkou92@gmail.com>

ChrsMark · 2021-02-16T11:18:15Z

@ruflin @blakerouse Heads-up on this: after pulling the latest changes from #23886 (thanks @blakerouse!) it finally works and collects metrics from all k8s data streams. This also proves that dynamic inputs in combination with the kubernetes provider can be used to autodiscover the scheduler and controller-manager Pods and start collecting from them on the master node (https://github.com/elastic/beats/pull/23679/files#diff-7896a70414721b8d0b3d8b90808b92c750d40c56bdf2ad01bf629c9499cde64eR112).

@blakerouse can you share your thoughts here, please? After this one is in, we can add parts for system metrics and container logs (better to split them into different PRs).

blakerouse

Looks great! Glad to see this is working.

blakerouse · 2021-02-16T12:28:42Z

@ChrsMark I think that with the new hostfs work you did on the inputs, gathering system metrics from the nodes should be possible.

ChrsMark · 2021-02-16T12:49:05Z

@ChrsMark I think that with the new hostfs work you did on the inputs, gathering system metrics from the nodes should be possible.

Yeap, this will be the next one coming.

ChrsMark · 2021-02-16T12:53:12Z

Merging this one and let's iterate on it with follow-up PRs to add more functionality.

jsoriano · 2021-02-16T12:23:34Z