cri-dockerd results in failing metric-server #2716

Closed
travisghansen opened this issue Oct 16, 2021 · 2 comments

@travisghansen

RKE version: v1.3.1

Docker version: (docker version,docker info preferred)

Containers: 177
 Running: 175
 Paused: 0
 Stopped: 2
Images: 74
Server Version: 18.09.9
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: ea765aba0d05254012b0b9e595e995c09186427f
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 5.12.6-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 62.68GiB
Name: node01
ID: OXET:Q53D:RBEA:EMJ3:OB5S:IR7P:IDF6:5Q47:X6CH:PBPV:MNC3:OWBV
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
 controller.lan:5000
 127.0.0.0/8
Registry Mirrors:
 http://controller.lan:5000/
Live Restore Enabled: false
Product License: Community Engine

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

uname -r
5.12.6-1.el7.elrepo.x86_64

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) Bare metal

cluster.yml file:

...
enable_cri_dockerd: true/false
...

Steps to Reproduce:

  • add enable_cri_dockerd: true to cluster.yml

Results:

The metrics-server pod(s) never become healthy and spew timeouts:

E1016 16:28:42.609094       1 scraper.go:139] "Failed to scrape node" err="Get \"https://172.29.2.11:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node01"
E1016 16:28:42.609141       1 scraper.go:139] "Failed to scrape node" err="Get \"https://172.29.2.13:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node03"
E1016 16:28:42.609308       1 scraper.go:139] "Failed to scrape node" err="Get \"https://172.29.2.12:10250/stats/summary?only_cpu_and_memory=true\": context deadline exceeded" node="node02"
I1016 16:28:52.137340       1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
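
To see at a glance which nodes are timing out, the scraper lines above can be grouped by node name. This is only a rough sketch: the regular expression is an assumption derived from the three sample lines shown here, not from any documented metrics-server log format, and `failed_nodes` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the sample log lines above (an assumption, not a
# documented format): capture the err="..." payload and the node name.
SCRAPE_FAILURE = re.compile(
    r'"Failed to scrape node" err="(?P<err>.*)" node="(?P<node>[^"]+)"'
)


def failed_nodes(log_text: str) -> Counter:
    """Count scrape failures caused by an exceeded deadline, per node."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SCRAPE_FAILURE.search(line)
        # Only count timeouts; other scrape errors are ignored here.
        if match and "context deadline exceeded" in match.group("err"):
            counts[match.group("node")] += 1
    return counts
```

Fed the four log lines above, this should report one timed-out scrape each for node01, node02 and node03, and skip the "Failed probe" line.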

Additional Info

# with cri-dockerd enabled
time curl -v -k --key /etc/kubernetes/ssl/kube-apiserver-key.pem --cert /etc/kubernetes/ssl/kube-apiserver.pem 'https://localhost:10250/stats/summary?only_cpu_and_memory=true'
...
real	3m13.619s
user	0m0.048s
sys	0m0.067s

# with cri-dockerd disabled
time curl -v -k --key /etc/kubernetes/ssl/kube-apiserver-key.pem --cert /etc/kubernetes/ssl/kube-apiserver.pem 'https://localhost:10250/stats/summary?only_cpu_and_memory=true'
...
real	0m0.117s
user	0m0.063s
sys	0m0.047s

# with cri-dockerd enabled
time crictl stats
...
real	6m26.254s
user	0m0.029s
sys	0m0.029s

time crictl stats --id f8415e3b6cdf8
...
real	0m4.455s
user	0m0.016s
sys	0m0.012s

# docker stats
time docker stats --no-stream
...
real	0m2.714s
user	0m0.165s
sys	0m0.080s
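
To put the timings above side by side, here is a small sketch that converts the `real` values to seconds and derives two rough figures: the slowdown of `/stats/summary` with cri-dockerd enabled, and the average time per container for `crictl stats`. The container count of 175 is an assumption taken from the "Running" line of `docker info`; the actual number of containers reported by `crictl stats` was not recorded. `parse_real` is a hypothetical helper.

```python
import re


def parse_real(value: str) -> float:
    """Convert a bash `time` duration such as '3m13.619s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# `real` values copied from the measurements above.
summary_enabled = parse_real("3m13.619s")   # /stats/summary, cri-dockerd enabled
summary_disabled = parse_real("0m0.117s")   # /stats/summary, cri-dockerd disabled
crictl_all = parse_real("6m26.254s")        # crictl stats (all containers)
crictl_one = parse_real("0m4.455s")         # crictl stats --id (one container)

# Assumption: crictl lists roughly the 175 running containers that
# `docker info` reports; the exact count is not stated in this issue.
running_containers = 175

slowdown = summary_enabled / summary_disabled
per_container = crictl_all / running_containers

print(f"/stats/summary slowdown: about {slowdown:.0f}x")
print(f"crictl stats: about {per_container:.2f}s per container")
```

If the per-container average lands in the same range as the single-container call, that would be consistent with stats being gathered one container at a time, though nothing here confirms that is what cri-dockerd actually does.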
@travisghansen
Author

This is probably more of an issue with cri-dockerd than with rke per se, but I'm filing it here since rke ships an internally built cri-dockerd binary.

@superseb
Contributor

Merging this into #2709
