Unable to detect the kubelet URL automatically after k8s upgrade to 1.16.x #5942

Closed
bramvdklinkenberg opened this issue Jul 10, 2020 · 3 comments

@bramvdklinkenberg

Describe what happened:
I upgraded my AKS cluster from 1.15.10 to 1.16.10 and I don't see any logs in DD anymore. The DD agents are giving the following error:

2020-07-10 07:34:58 UTC | CORE | ERROR | (pkg/collector/runner/runner.go:292 in work) | Error running check kubelet: [{"message": "Unable to detect the kubelet URL automatically.", "traceback": "Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 822, in run\n    self.check(instance)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/kubelet/kubelet.py\", line 297, in check\n    raise CheckException(\"Unable to detect the kubelet URL automatically.\")\ndatadog_checks.base.errors.CheckException: Unable to detect the kubelet URL automatically.\n"}]
2020-07-10 07:35:04 UTC | CORE | ERROR | (pkg/autodiscovery/config_poller.go:123 in collect) | Unable to collect configurations from provider kubernetes: temporary failure in kubeutil, will retry later: try delay not elapsed yet
2020-07-10 07:35:13 UTC | CORE | ERROR | (pkg/collector/python/kubeutil.go:38 in getConnections) | connection to kubelet failed: temporary failure in kubeutil, will retry later: try delay not elapsed yet
2020-07-10 07:35:13 UTC | CORE | ERROR | (pkg/collector/runner/runner.go:292 in work) | Error running check kubelet: [{"message": "Unable to detect the kubelet URL automatically.", "traceback": "Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 822, in run\n    self.check(instance)\n  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/kubelet/kubelet.py\", line 297, in check\n    raise CheckException(\"Unable to detect the kubelet URL automatically.\")\ndatadog_checks.base.errors.CheckException: Unable to detect the kubelet URL automatically.\n"}]
2020-07-10 07:35:14 UTC | CORE | ERROR | (pkg/autodiscovery/config_poller.go:123 in collect) | Unable to collect configurations from provider kubernetes: temporary failure in kubeutil, will retry later: try delay not elapsed yet
2020-07-10 07:35:14 UTC | CORE | INFO | (pkg/autodiscovery/autoconfig.go:357 in initListenerCandidates) | kubelet listener cannot start, will retry: temporary failure in kubeutil, will retry later: try delay not elapsed yet

Describe what you expected:
To see my logs in DD

Steps to reproduce the issue:
Upgrade cluster from 1.15.10 to 1.16.10

Additional environment details (Operating System, Cloud provider, etc):
I also upgraded kube-state-metrics from 1.8.0 to 1.9.7. The kube-state-metrics logs look fine:

kubectl -n kube-system logs -f kube-state-metrics-6d7449fc78-hsjzw
I0710 07:13:01.539587       1 main.go:86] Using default collectors
I0710 07:13:01.539645       1 main.go:98] Using all namespace
I0710 07:13:01.539659       1 main.go:139] metric white-blacklisting: blacklisting the following items:
W0710 07:13:01.539680       1 client_config.go:543] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0710 07:13:01.541039       1 main.go:184] Testing communication with server
I0710 07:13:01.599744       1 main.go:189] Running with Kubernetes cluster version: v1.16. git version: v1.16.10. git tree state: clean. commit: 89d8075525967c7a619641fabcb267358d28bf08. platform: linux/amd64
I0710 07:13:01.599769       1 main.go:191] Communication with server successful
I0710 07:13:01.599883       1 main.go:225] Starting metrics server: 0.0.0.0:8080
I0710 07:13:01.599923       1 main.go:200] Starting kube-state-metrics self metrics server: 0.0.0.0:8081
I0710 07:13:01.600026       1 metrics_handler.go:96] Autosharding disabled
I0710 07:13:01.600990       1 builder.go:146] Active collectors: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments
@bramvdklinkenberg
Author

kubelet (4.1.0)
    ---------------
      Instance ID: kubelet:d884b5186b651429 [ERROR]
      Configuration Source: file:/etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
      Total Runs: 120
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2020-07-10 09:38:35.000000 UTC
      Last Successful Execution Date : Never
      Error: Unable to detect the kubelet URL automatically.
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py", line 822, in run
          self.check(instance)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/kubelet/kubelet.py", line 297, in check
          raise CheckException("Unable to detect the kubelet URL automatically.")
      datadog_checks.base.errors.CheckException: Unable to detect the kubelet URL automatically.

@bramvdklinkenberg
Author

Apparently the location of the kubelet CA cert changed in AKS 1.16.x and higher.
I added the following volume and volumeMount:

volumes:
  - name: k8s-certs
    hostPath:
      path: /etc/kubernetes/certs
      type: ''
volumeMounts:
  - name: k8s-certs
    readOnly: true
    mountPath: /etc/kubernetes/certs

And this extra env variable:

env:
  - name: DD_KUBELET_CLIENT_CA
    value: /etc/kubernetes/certs/kubeletserver.crt
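
For the fix to work, the path in DD_KUBELET_CLIENT_CA has to sit under a mountPath whose volume is actually declared on the pod. A minimal sketch of that consistency check, assuming the container and volume specs are available as plain Python dicts (the function name `ca_path_is_mounted` is hypothetical and not part of the Datadog agent or any Kubernetes client):

```python
from pathlib import PurePosixPath


def ca_path_is_mounted(container, volumes, env_name="DD_KUBELET_CLIENT_CA"):
    """Return True if the CA path in env_name lies under a mount backed by a declared volume."""
    # Look up the value of the env var on the container spec.
    ca_path = next(
        (e.get("value") for e in container.get("env", []) if e.get("name") == env_name),
        None,
    )
    if not ca_path:
        return False

    declared = {v["name"] for v in volumes}
    ca = PurePosixPath(ca_path)
    for mount in container.get("volumeMounts", []):
        mount_path = PurePosixPath(mount["mountPath"])
        # The file must live at or below the mount point, and the mount
        # must reference a volume that is declared on the pod.
        if mount["name"] in declared and (ca == mount_path or mount_path in ca.parents):
            return True
    return False


# Values copied from the comment above.
container = {
    "env": [
        {"name": "DD_KUBELET_CLIENT_CA", "value": "/etc/kubernetes/certs/kubeletserver.crt"}
    ],
    "volumeMounts": [
        {"name": "k8s-certs", "readOnly": True, "mountPath": "/etc/kubernetes/certs"}
    ],
}
volumes = [{"name": "k8s-certs", "hostPath": {"path": "/etc/kubernetes/certs", "type": ""}}]

print(ca_path_is_mounted(container, volumes))
```

This only checks that the three pieces of configuration agree with each other; it does not confirm that the file exists on the node.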

@picardsrcd

Apparently the location of the kubelet ca cert changed in AKS 1.16.x and higher.
Added the following volume and volumemount

Better formatted version:

agents:
  containers:
    agent:
      env:
      - name: DD_KUBELET_CLIENT_CA
        value: /etc/kubernetes/certs/kubeletserver.crt
  volumes:
    - name: k8s-certs
      hostPath:
        path: /etc/kubernetes/certs
        type: ''
  volumeMounts:
    - name: k8s-certs
      readOnly: true
      mountPath: /etc/kubernetes/certs
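
If the check still fails after applying this, it may help to confirm from inside the agent container that the mounted file is readable as a CA bundle. A rough sketch using only the Python standard library; the function name `ca_file_loads` is hypothetical, the default path is the one from this thread, and it does not test the connection to the kubelet itself:

```python
import ssl


def ca_file_loads(path="/etc/kubernetes/certs/kubeletserver.crt"):
    """Return True if the file at path can be loaded as a CA bundle."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        # Raises FileNotFoundError if the mount is missing and
        # ssl.SSLError if the file holds no usable certificate;
        # both are subclasses of OSError.
        ctx.load_verify_locations(cafile=path)
    except OSError as exc:
        print(f"cannot load {path}: {exc}")
        return False
    return True


if __name__ == "__main__":
    print(ca_file_loads())
```

A False result with a "No such file" message would suggest the hostPath volume is not mounted where the env variable points.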
