
kube-prometheus & kubernetes 1.5.2 - prometheus-k8s-0 node - docker stops responding #239

Closed
dmcnaught opened this issue Mar 27, 2017 · 20 comments

Comments

@dmcnaught
Contributor

What did you do?
hack/cluster-monitoring/deploy
What did you expect to see?
Stable cluster
What did you see instead? Under which circumstances?
Docker stops responding on the node and a Docker restart is needed to recover. A node just went down again after three days...
Environment
AWS, kops 1.5.1, Kubernetes 1.5.2

  • Kubernetes version information:
    --- kubernetes/kops ‹master› » ku version
    Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:57:25Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
    Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:52:34Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes cluster kind:

    kops 1.5.2

  • Manifests:

https://github.com/coreos/kube-prometheus.git : 333bd23
I plan to test release 0.7.0 soon...

  • Prometheus Operator Logs:
    The node looks like this before a Docker restart (/etc/init.d/docker restart) brings it back; the restart workaround is sketched after the listing:
--- kubernetes/kops ‹master› » ku get po --all-namespaces -o wide | grep ip-10-101-118-222.ec2.internal
athena-graphql          athena-graphql-cmd-290124063-x2c19                      1/1       Unknown    1          9d        100.96.33.50     ip-10-101-118-222.ec2.internal
deis                    deis-controller-2434209242-ztkcs                        1/1       Unknown    3          9d        100.96.33.47     ip-10-101-118-222.ec2.internal
deis                    deis-logger-fluentd-19csf                               1/1       NodeLost   1          9d        100.96.33.44     ip-10-101-118-222.ec2.internal
deis                    deis-logger-redis-304849759-9z5g4                       1/1       Unknown    1          5d        100.96.33.42     ip-10-101-118-222.ec2.internal
deis                    deis-monitor-telegraf-xf6mm                             1/1       NodeLost   1          9d        100.96.33.36     ip-10-101-118-222.ec2.internal
deis                    deis-router-3101872284-nmwgf                            1/1       Unknown    1          9d        100.96.33.43     ip-10-101-118-222.ec2.internal
deis                    deis-workflow-manager-2528409207-7pttp                  1/1       Unknown    1          5d        100.96.33.34     ip-10-101-118-222.ec2.internal
hades-graphql           hades-graphql-cmd-459006866-r3pbl                       1/1       Unknown    1          3d        100.96.33.48     ip-10-101-118-222.ec2.internal
kube-system             kube-proxy-ip-10-101-118-222.ec2.internal               1/1       Unknown    1          9d        10.101.118.222   ip-10-101-118-222.ec2.internal
monitoring              grafana-1046448512-l8cgh                                2/2       Unknown    2          9d        100.96.33.40     ip-10-101-118-222.ec2.internal
monitoring              kube-state-metrics-4090613309-mnbrj                     1/1       Unknown    1          9d        100.96.33.32     ip-10-101-118-222.ec2.internal
monitoring              node-exporter-sz8r4                                     1/1       NodeLost   1          9d        10.101.118.222   ip-10-101-118-222.ec2.internal
monitoring              prometheus-k8s-0                                        2/2       Unknown    2          5d        100.96.33.51     ip-10-101-118-222.ec2.internal
monitoring              prometheus-operator-3658205960-2zpfp                    1/1       Unknown    1          9d        100.96.33.49     ip-10-101-118-222.ec2.internal
programs-service        programs-service-cmd-1240201140-mjd53                   1/1       Unknown    0          2d        100.96.33.52     ip-10-101-118-222.ec2.internal
speech-to-text-nodejs   speech-to-text-nodejs-cmd-2508035524-zk217              1/1       Unknown    1          9d        100.96.33.45     ip-10-101-118-222.ec2.internal
splunkspout             k8ssplunkspout-nonprod-c0d1w                            1/1       NodeLost   1          9d        100.96.33.35     ip-10-101-118-222.ec2.internal
styleguide              styleguide-cmd-3772371803-2tdlw                         1/1       Unknown    1          9d        100.96.33.38     ip-10-101-118-222.ec2.internal
styleguide              styleguide-cmd-3772371803-hsg0w                         1/1       Unknown    1          9d        100.96.33.37     ip-10-101-118-222.ec2.internal
styleguide-staging      styleguide-staging-cmd-83554885-cb2pq                   1/1       Unknown    1          9d        100.96.33.39     ip-10-101-118-222.ec2.internal
wellbot                 wellbot-web-2518992024-1bvml                            1/1       Unknown    1          9d        100.96.33.46     ip-10-101-118-222.ec2.internal
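For reference, the workaround itself is just a restart of the Docker daemon on the affected node, roughly (a rough sketch; the node is the one from the listing above):

    # on the affected node (via SSH):
    sudo /etc/init.d/docker restart

    # then watch whether the pods listed above recover from Unknown/NodeLost:
    kubectl get po --all-namespaces -o wide | grep ip-10-101-118-222.ec2.internal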

I think these tickets are related: kubernetes/kubernetes#42164
kubernetes/kubernetes#39028

@brancz
Contributor

brancz commented Mar 28, 2017

Interesting. If a Docker restart solves this, do you think it makes sense for us to track it here? It's certainly good to know for future reference though, so thanks for reporting! 🙂

@dmcnaught
Contributor Author

It seems like Prometheus can really accelerate the issue. We have one cluster that hit the problem twice in four days, and it was always the node running Prometheus.
I wouldn't say a Docker restart solves the issue, since the real problem is that nodes go out of service but their pods don't get rescheduled. A Docker restart is a workaround, but problems remain.
I'm not sure who is working on this.

@brancz
Contributor

brancz commented Mar 31, 2017

Hmm, odd. I can't imagine how Prometheus could play into this, except maybe by causing high memory pressure on the host. Maybe you can try setting the kubelet flags that reserve a bit more system memory, so the kubelet doesn't overcommit the node.
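Something like this on the kubelet (a rough sketch; the values are placeholders and how you wire the flags in depends on your kops setup):

    # reserve memory for system daemons and for Kubernetes components, so pods
    # can't consume everything on the node:
    kubelet \
      --system-reserved=memory=512Mi \
      --kube-reserved=memory=512Mi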

@dmcnaught
Contributor Author

I see it on nodes without Prometheus as well; it just seems to happen more often when Prometheus is on the node...
This comment relates it to Prometheus, which is interesting: kubernetes/kubernetes#39028 (comment)

@klausenbusk

I just hit the same issue after deploying the operator and kube-prometheus: I got ContainerGCFailed and context deadline exceeded (when trying to exec into a container on the node). A restart of the node solved the issue, but this isn't good...
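For anyone trying to spot the same symptoms, this is roughly what I looked at (the node name is a placeholder):

    # ContainerGCFailed shows up in the node's events:
    kubectl describe node <node-name>

    # and the Docker daemon on the node stops answering:
    docker ps                                   # hangs / times out
    journalctl -u docker --since "1 hour ago"   # on systemd hosts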

@brancz
Contributor

brancz commented Apr 3, 2017

Definitely, @klausenbusk! However, I don't think this is an issue with Prometheus itself so much as a Docker and/or Kubernetes issue.

Are either of you mounting storage? It seems some people have trouble with that. Maybe you can set up an extra Prometheus that monitors only your node resources; that way we could see whether node resource usage correlates with this issue.

My suspicion is as I stated above that this happens due to memory or disk pressure.

@klausenbusk

My suspicion is as I stated above that this happens due to memory or disk pressure.

You are probably correct.
I searched a bit in the logs and found:

sshd: page allocation stalls for 51708ms, order:0, mode:0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD)
[...]
EXT4-fs warning (device sdb): ext4_end_bio:314: I/O error -5 writing to inode 16 (offset 0 size 0 starting block 36442)

@klausenbusk

Kernel log: https://gist.github.com/klausenbusk/21e05251e3b170c48bc66e1c6a081b64. Hmm, I need to leave now...

@brancz
Contributor

brancz commented Apr 6, 2017

I'm guessing this is related to Prometheus time-series churn: pods keep appearing and disappearing, and every time series in Prometheus 1.x is one file on disk. This is going to change in Prometheus 2.0. Until then, one thing to do is to increase the number of inodes on the hosts that Prometheus runs on, or to choose a lower retention so that stale time series get garbage-collected faster and fewer files end up on disk. Let us know if that helps.
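Concretely, something along these lines (a sketch; the data path is a placeholder and the resource/namespace names assume a default kube-prometheus install):

    # check whether the volume backing Prometheus is running out of inodes
    # (run on the node hosting prometheus-k8s-0):
    df -i /path/to/prometheus-data

    # lower the retention via the Prometheus resource managed by the operator:
    kubectl -n monitoring edit prometheus k8s   # set spec.retention, e.g. "24h"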

@klausenbusk

I'm guessing this is related to Prometheus time-series churn: pods keep appearing and disappearing, and every time series in Prometheus 1.x is one file on disk.

Hmm, that sounds weird. The cluster isn't very big (3 masters + 4 workers), the pod count is also low (roughly 20-30; I'm not at my work computer), and I don't add or remove pods very often.
Could a pod in a crash loop cause this?

Until then, one thing to do is to increase the number of inodes on the hosts

That would require reformatting the filesystem, which isn't easily done on a cloud provider (DigitalOcean).

or to choose a lower retention so that stale time series get garbage-collected faster and fewer files end up on disk.

Prometheus had only been running for a few hours in a 7-node cluster (3 masters + 4 workers). If I set the retention that low, Prometheus would be more or less useless.

@klausenbusk

I'm wondering whether this theory by @guoshimin could explain, or even be the cause of, this issue.

kubernetes/kubernetes#39028 (comment)

We are also hitting this issue after upgrading to v1.5.4 from v1.4.7. We noticed that in v1.5.4, kubelet adds a :Z to each bind mount spec if SELinux is enabled on the system. We patched kubelet to disable this behavior and the problem went away.

Our running theory is that this causes libcontainer to try to recursively label all the files in the prometheus volume, which has a lot of files, and the container create request times out.

@klausenbusk

Also see: moby/moby#32007. I just experienced this again today but haven't had time yet to look at the logs. I'll probably end up disabling SELinux for Docker (it sounds like a plausible explanation).
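Disabling SELinux support for the Docker daemon only (rather than host-wide) would look roughly like this, assuming the distro enables it via the daemon flag:

    # either start the daemon without SELinux support ...
    dockerd --selinux-enabled=false

    # ... or set it in /etc/docker/daemon.json and restart the daemon:
    #   { "selinux-enabled": false }
    sudo systemctl restart docker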

@brancz
Contributor

brancz commented Apr 25, 2017

That does sound plausible, but unfortunately I don't see this going away until we fully support Prometheus 2.0, which is in its first alpha release right now. Our plan is to start supporting it once it goes into beta.

@brancz
Contributor

brancz commented Jun 12, 2017

Have you been able to test the Prometheus 2.0 alpha releases in this regard? The issue of SELinux recursively labeling all files, mentioned in moby/moby#32892, should no longer be a problem with Prometheus 2.0, since the number of files Prometheus creates is drastically smaller. I realize that until Prometheus 2.0 hits a stable release this is not a viable solution yet, but it would be great to know whether it solves the problem.

The Prometheus 2.0 pre-releases have experimental support in the Prometheus Operator, so it's possible to try them out. We would highly appreciate feedback!
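Switching an existing Prometheus object over is just a version bump on the resource (a sketch; pick an actual 2.0 pre-release tag, and the names assume a default kube-prometheus install):

    kubectl -n monitoring edit prometheus k8s
    # then set spec.version to the desired v2.0.0 pre-release tag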

/cc @mikebryant @klausenbusk @dmcnaught

@brancz
Contributor

brancz commented Jul 25, 2017

Has anyone been able to try out Prometheus 2.0-beta.0 regarding this issue? Prometheus 2.0 is nearing GA, but we'd appreciate as much testing as possible, and if it solves this issue, then even better.

@qrpike

qrpike commented Sep 20, 2017

I also seem to be having this issue (CoreOS 1465.7.0, Kubernetes 1.6.1, Prometheus 1.0.1). However, I'm mounting the Prometheus data directory on the host disk.

@klausenbusk Did disabling selinux fix the issue for you?

@klausenbusk

@klausenbusk Did disabling selinux fix the issue for you?

I haven't experienced this for some time, and the cluster has changed a bit since (k8s 1.5 -> self-hosted 1.7).

@brancz
Contributor

brancz commented Nov 22, 2017

Prometheus 2.0 stable has been released, and the Prometheus Operator fully supports Prometheus 2.0, so I will close this issue. Feel free to open new issues regarding Prometheus 2.0. The issue described in this post is fundamentally not solvable with Prometheus 1.x, therefore we recommend switching to 2.0.

@brancz brancz closed this as completed Nov 22, 2017
@stuart-warren
Contributor

I don't believe that there has been a release of prometheus-operator that supports prometheus-2.0 stable since #735

@brancz
Contributor

brancz commented Nov 22, 2017

Prometheus 2.0 support has been in the Prometheus Operator for multiple releases now 🙂
