"Context deadline exceeded" for responsive target #2459
Comments
Hm, @alexellis mentioned that he had similar problems with Prometheus on Docker - he couldn't even scrape Prometheus itself. @AlexRudd I used your exact steps to reproduce your scenario and it works for me. I wonder if Docker broke something networking-related in recent versions. I'm still on Docker 1.12.1. This might be one of these cases where you want to pull out Wireshark or tcpdump and look at what's really happening on the network.
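For anyone who wants to capture such a trace, a minimal sketch of that tcpdump step (the bridge interface, container names and port are assumptions, not necessarily this exact setup):

```sh
# Capture the scrape traffic on the default Docker bridge while the failing
# scrape happens (interface "docker0" and port 9100 are assumptions).
sudo tcpdump -i docker0 -nn 'tcp port 9100' -w scrape.pcap

# In another shell, trigger one scrape by hand from inside the Prometheus
# container and time it (container/target names are placeholders):
docker exec prometheus sh -c 'time wget -qO- http://node_exporter:9100/metrics > /dev/null'

# Inspect the capture here, or open scrape.pcap in Wireshark:
tcpdump -nn -r scrape.pcap | head -n 20
```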
alexellis commented Mar 1, 2017
I was seeing the same error with the latest docker images for Prometheus. I thought it was because I was running it as a
It does work for me with exactly the same Prometheus version mentioned here btw., so I suspect it's Docker-related. But since I can't reproduce, it would be cool if someone who sees the problem can dig into actual packet traces to see what's going on.
Ah, I didn't think to check my docker version! I just tried my reproduction steps at home and it worked fine:
I'll check what version I'm on at work tomorrow and do some digging through Docker's issues. I was also having issues scraping Prometheus itself (localhost:9090), but that was giving a different error and I thought I'd focus on one thing at a time.
So at work I'm running:
I upgraded to the latest Docker (now Docker CE) and still get the same "context deadline exceeded" error in Prometheus:
Then I'll have to refer to "But since I can't reproduce, it would be cool if someone who sees the problem can dig into actual packet traces to see what's going on." :) It seems it's either a Docker problem, or some weird interaction between how Go resolves things and the way Docker does DNS.
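To see what Docker's embedded DNS actually hands the container for the linked name, a quick check from inside the Prometheus container (a sketch; the container and target names are assumptions, and it relies on the official image being busybox-based so nslookup and wget are available):

```sh
# What resolver is the container using, and what does the linked name resolve to?
docker exec prometheus cat /etc/resolv.conf
docker exec prometheus nslookup node_exporter

# Compare with what a plain HTTP fetch of the metrics endpoint returns:
docker exec prometheus sh -c 'wget -qO- http://node_exporter:9100/metrics | head -n 5'
```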
Any chance you might be hitting moby/moby#19866? Seems a lot has changed regarding the default bridge network and --link.
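If --link turns out to be the problem, a user-defined bridge network is the usual replacement and gives containers name-based DNS without the legacy linking path. A minimal sketch, not the reporter's actual setup (network name, image tags and the config path are assumptions):

```sh
# Put both containers on a user-defined bridge network instead of using --link.
docker network create monitoring
docker run -d --name node_exporter --network monitoring prom/node-exporter
docker run -d --name prometheus --network monitoring -p 9090:9090 \
  -v "$PWD/prometheus.yml:/etc/prometheus/prometheus.yml" prom/prometheus
```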
I get the same error message (context deadline exceeded) on Travis CI with Docker 1.12.3. I don't use Prometheus, but run integration tests from a Java client against the Docker daemon. The error appears when creating a service and directly afterwards requesting the current tasks. Locally, on Docker for Mac 17.03.0-ce-mac2, I cannot reproduce the error. In other words: I would confirm @juliusv's assumption that this isn't a Prometheus issue - but maybe the Prometheus setup helps to reproduce the issue? Edit: I fixed my issue by adding a 1s client-side delay between service creation and the task list request.
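For reference, a rough docker-CLI equivalent of that work-around (the actual fix was in a Java client, so this is only an illustration; the service and image names are placeholders, and it assumes the daemon is already in swarm mode):

```sh
# Give the newly created service a moment to settle before listing its tasks.
docker service create --name web --replicas 1 nginx:alpine
sleep 1
docker service ps web
```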
Yeah - I think it's safe to close this issue for now, as it's likely a Docker issue. I'd rather ask the Docker folks about it.
juliusv closed this Mar 12, 2017
meowtochondria commented Mar 27, 2017
Happening on CentOS 7.3 in AWS without using Docker. This is a very basic setup, as the OP reported, with nothing fancy going on, except that I am running everything natively. Would be happy to provide any details that might help resolution.
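For a bare-VM setup like this, a first check is whether the target even answers within the scrape timeout, and whether a local firewall or security group is in the way. A sketch with placeholder host, port and timeout values:

```sh
# Check that the exporter answers within the same bound Prometheus would use
# (host, port and the 10s value are placeholders for your setup).
time curl --max-time 10 -sS http://TARGET_HOST:9100/metrics > /dev/null

# On an EC2 instance, also rule out local firewall rules (security groups are
# checked separately in the AWS console).
sudo iptables -L -n | head -n 20
```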
prologic commented Jul 18, 2017
Ran into a similar problem on Docker 17.03-17.05 CE. These versions of Docker have a bug such that connecting the service/container to the host bridge network is harder than it should be. Work-around:
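The work-around itself doesn't appear in the comment above. One commonly used option for this class of bridge-network problem (an assumption, not necessarily what prologic did) is to run Prometheus on the host network so scrapes bypass the container bridge entirely:

```sh
# Host networking avoids the container bridge, at the cost of network
# isolation; the config path is the image default and may need adjusting.
docker run -d --name prometheus --network host \
  -v "$PWD/prometheus.yml:/etc/prometheus/prometheus.yml" \
  prom/prometheus
```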
joshpollara commented Sep 15, 2017
Ran into the same issue. Amazon Linux 2017.03, no Docker. Basic setup.
MarilynZ commented Oct 25, 2017
@juliusv
Prometheus version:
Prometheus configuration file:
external_labels:
rule_files:
alerting:
scrape_configs:
benitogf commented Nov 14, 2017
@juliusv found this issue on Ubuntu 16.04:
Scrape interval and timeout are set to 30s but no change.
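For comparison, this is roughly how those two settings look in a minimal config; the job name and target address are placeholders, not benitogf's actual values:

```sh
# Example prometheus.yml with interval and timeout both at 30s.
cat > prometheus.yml <<'EOF'
global:
  scrape_interval: 30s
  scrape_timeout: 30s   # must not exceed scrape_interval
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['node_exporter:9100']   # placeholder target
EOF
```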
roberteventival commented Jan 31, 2018
Hi, I've had the same problem, but noticed my Docker environment is very slow and swapping quite a lot. After adding more RAM and restarting Docker, the problem is gone for the moment.
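A few quick checks for that kind of memory pressure (the container name "prometheus" is an assumption):

```sh
free -m                               # host memory and swap usage
vmstat 1 5                            # si/so columns show active swapping
docker stats --no-stream prometheus   # container memory usage vs. its limit
```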
dcmspe commented Sep 27, 2018
I am experiencing the same issue.
My Docker has a 15GB memory limit; could adding more memory solve the issue?
weskinner commented Dec 18, 2018
The fix for me was editing the NetworkPolicy associated with the Pod Prometheus was trying to scrape.
rcsavage commented Feb 27, 2019
@weskinner what exactly did you edit in the NetworkPolicy associated with the Pod? I am seeing the same issue right now, so any advice would help.
weskinner commented Feb 27, 2019
@rcsavage sorry, I don't remember the specifics. I just know the NetworkPolicy that applied to that pod was preventing Prometheus from accessing it from the monitoring namespace. Adjusting the labels on the monitoring namespace and/or the selectors in the NetworkPolicy should fix it if that is your problem as well.
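A minimal sketch of what such a policy can look like, assuming the Prometheus pods run in a namespace labelled name=monitoring and the scraped pods carry an app=myapp label (every name, label and the port are placeholders, not the commenters' actual manifests):

```sh
# Allow ingress to the scraped pods from the monitoring namespace.
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
spec:
  podSelector:
    matchLabels:
      app: myapp
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: monitoring
      ports:
        - port: 9100
EOF
```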

AlexRudd commented Mar 1, 2017
What did you do?
Tried to run a Prometheus server in a Docker container and scrape metrics from a node_exporter instance running in a separate but linked container (see environment section).
Used a basic config which scrapes the linked container, via its container-name DNS address, every 10 seconds.
Noticed that this very basic target was showing as DOWN in the Prometheus dashboard targets view, with the error: "context deadline exceeded"
I attached to the running Prometheus container and timed a manual scrape of the target using time and wget; this worked as expected (see the sketch at the end of this report).
Created a separate target group with scrape_interval and scrape_timeout both increased to 30s, but still targeting the same node_exporter instance. Scrapes for this group are successful but appear to take a long time (Last Scrape reported as > 36s ago, which I assume means the scrape took upwards of 6 seconds from Prometheus' perspective).
What did you expect to see?
Both target groups reporting as UP and scrapes completing quickly.
What did you see instead? Under which circumstances?
DOWN "context deadline exceeded" for the 10s scrape group
Environment
Recreate my exact environment:
Nothing which seems to be relevant in the logs.
Hopefully I'm not doing anything obviously wrong!
Thanks
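The exact recreation commands and config didn't survive the copy above; a minimal approximation of the described setup, plus the manual timed scrape mentioned under "What did you do?" (image tags, container names and the config path are assumptions):

```sh
# Roughly recreate the reported setup: node_exporter plus a Prometheus
# container linked to it with the legacy --link flag, as in the report.
docker run -d --name node_exporter prom/node-exporter
docker run -d --name prometheus --link node_exporter -p 9090:9090 \
  -v "$PWD/prometheus.yml:/etc/prometheus/prometheus.yml" prom/prometheus

# The manual timed scrape from inside the Prometheus container; it should
# finish well under the 10s scrape_timeout if the target is responsive.
docker exec prometheus sh -c 'time wget -qO- http://node_exporter:9100/metrics > /dev/null'
```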