context deadline exceeded for all exporters #2311

Closed · apsega opened this issue Dec 30, 2016 · 3 comments


apsega commented Dec 30, 2016

Hey,

I've noticed that recently, seemingly at random, I'm getting context deadline exceeded errors from all exporters in Prometheus:

[Screenshot (Dec 30, 2016, 4:12 PM): Prometheus targets page showing context deadline exceeded errors]

For the record, I see this not only from DigitalOcean's ceph-exporter; at the same time the blackbox exporter shows a similar error:

Get http://blackbox-exporter:9115/probe?module=http_2xx&target=http%3A%2F%2F[removed_ip]: dial tcp: lookup blackbox-exporter on [removed_ip]:53: dial udp [removed_ip]:53: i/o timeout

Prometheus (1.4.1) runs under Docker on Linux (kernel 3.10.0-327.18.2.el7.x86_64, x86_64).

# docker -v
Docker version 1.11.2, build b9f10c9
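
The dial udp [removed_ip]:53: i/o timeout part of the blackbox error suggests the scrape is failing at DNS resolution, before any exporter is reached. Since everything runs on a Docker network, one quick sanity check is to query Docker's embedded DNS resolver (it listens on 127.0.0.11 inside user-defined networks) directly from inside the Prometheus container. A sketch, assuming the container is named prometheus and its image ships nslookup:

# docker exec prometheus nslookup blackbox-exporter 127.0.0.11

If this times out while the exporter container is up, the resolver itself is the suspect rather than the exporters.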
Interestingly, the Prometheus logs show nothing unusual; below are the last entries:

time="2016-12-30T14:23:00Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:549"
time="2016-12-30T14:23:03Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.029525873s." source="persistence.go:573"
time="2016-12-30T14:28:03Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:549"
time="2016-12-30T14:28:06Z" level=info msg="Done checkpointing in-memory metrics and chunks in 3.001948943s." source="persistence.go:573"

However, after this I found these errors in the Docker daemon logs:

Dec 30 15:30:28 docker: time="2016-12-30T15:30:28.260836203+01:00" level=error msg="More than 100 concurrent queries from 172.18.0.7:45773"
Dec 30 15:30:39 docker: time="2016-12-30T15:30:39.261085324+01:00" level=error msg="More than 100 concurrent queries from 172.18.0.7:45773"

Disk I/O is under 5% utilization, and CPU and RAM usage are low. Any ideas what would be worth checking?
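
If the embedded resolver turns out to be the bottleneck, one possible stopgap (a sketch, not a fix; the hostname and IP below are placeholders) is to bypass DNS for a scrape target with a static hosts entry, since /etc/hosts is consulted before the resolver:

# docker run -d --name prometheus --add-host blackbox-exporter:172.18.0.7 prom/prometheus

This only helps if the target's address is stable, e.g. pinned via a static IP on the Docker network.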


apsega commented Dec 30, 2016

Actually, this looks more like a Docker problem:

# cat /var/log/messages | grep "More than 100 concurrent queries" | grep "Dec 30" | wc -l
755

apsega commented Jan 2, 2017

It seems the issue wasn't Prometheus's fault after all. I tried running the container with the METRICS_RESOLUTION=5s flag, and it didn't help.

It looks like it was a Docker 1.11.2 bug with concurrent DNS queries (moby/moby#22185). Upgrading Docker to the newest version (1.12.5) helped: I no longer see context deadline exceeded errors in Prometheus or "More than 100 concurrent queries from <...>" errors in Docker.
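
For anyone hitting the same thing: after the upgrade, the earlier checks can be repeated to confirm the fix (the error count in /var/log/messages should stop growing):

# docker -v
# grep -c "More than 100 concurrent queries" /var/log/messages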

apsega closed this Jan 2, 2017


lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 24, 2019
