
Target "prometheus" in docker image is always DOWN #2260

Closed
goofansu opened this Issue Dec 7, 2016 · 21 comments

goofansu commented Dec 7, 2016

What did you do?

docker run -p 9090:9090 prom/prometheus:latest

What did you expect to see?

The state of target "prometheus" should be "UP".

What did you see instead? Under which circumstances?

The state of target "prometheus" is "DOWN".

Environment

  • System information:

    Darwin 15.6.0 x86_64

  • Prometheus version:

      prometheus, version 1.4.1 (branch: master, revision: 2a89e8733f240d3cd57a6520b52c36ac4744ce12)
        build user:       root@e685d23d8809
        build date:       20161128-09:59:22
        go version:       go1.7.3
    
  • Logs:

time="2016-12-07T08:38:18Z" level=info msg="Starting prometheus (version=1.4.1, branch=master, revision=2a89e8733f240d3cd57a6520b52c36ac4744ce12)" source="main.go:77"
time="2016-12-07T08:38:18Z" level=info msg="Build context (go=go1.7.3, user=root@e685d23d8809, date=20161128-09:59:22)" source="main.go:78"
time="2016-12-07T08:38:18Z" level=info msg="Loading configuration file /etc/prometheus/prometheus.yml" source="main.go:250"
time="2016-12-07T08:38:19Z" level=info msg="Loading series map and head chunks..." source="storage.go:354"
time="2016-12-07T08:38:19Z" level=info msg="0 series loaded." source="storage.go:359"
time="2016-12-07T08:38:19Z" level=info msg="Starting target manager..." source="targetmanager.go:63"
time="2016-12-07T08:38:19Z" level=info msg="Listening on :9090" source="web.go:248"
time="2016-12-07T08:39:19Z" level=info msg="Completed maintenance sweep through 3 in-memory fingerprints in 30.00262488s." source="storage.go:1180"
time="2016-12-07T08:39:59Z" level=info msg="Completed maintenance sweep through 3 in-memory fingerprints in 30.004019298s." source="storage.go:1180"

FYI, the endpoint looks like http://d101b017c30f:9090/metrics, where d101b017c30f is the container ID.

brancz (Member) commented Dec 7, 2016

Posting your config would be helpful, but I'm guessing the target you actually want is localhost:9090; I don't think the container ID resolves to anything from within the container.

goofansu (Author) commented Dec 7, 2016

I'm just using the default config in Docker; I copied it from http://localhost:9090/config:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

sdurrheimer (Member) commented Dec 7, 2016

I've just tested running docker run -p 9090:9090 prom/prometheus:latest on my machine, and it works fine.

goofansu (Author) commented Dec 7, 2016

@sdurrheimer Could you take a look at the endpoint of the "prometheus" target?

sdurrheimer (Member) commented Dec 7, 2016

Yes, the prometheus endpoint is UP.

goofansu (Author) commented Dec 7, 2016

@sdurrheimer

[screenshot of the Targets page in Google Chrome, 2016-12-07]

This is my target; notice the link at the bottom.

sdurrheimer (Member) commented Dec 7, 2016

The link looks okay to me, given that the default hostname of a Docker container is the container's ID.
Of course, this will not magically resolve from outside the container.

What is strange is that Prometheus can't scrape itself on your side.
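
For reference, a quick way to confirm that a container's hostname really is its ID (a sketch, run on the host; <container-id> is a placeholder):

docker inspect --format '{{ .Config.Hostname }}' <container-id>
# prints something like d101b017c30f, matching the target link above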

goofansu (Author) commented Dec 7, 2016

@sdurrheimer Yes, it is strange. I don't know why.

goofansu (Author) commented Dec 8, 2016

I tested on Ubuntu, and it works fine there.

goofansu (Author) commented Dec 8, 2016

Closing this, as I can't find the cause.

goofansu closed this Dec 8, 2016

goofansu (Author) commented Dec 9, 2016

@sdurrheimer Did you try on macOS? I tested on three macOS machines and all of them show "DOWN". Two are on the same network, one is on a different network.

goofansu reopened this Dec 9, 2016

brancz (Member) commented Dec 9, 2016

What are you using as a host for the container, @goofansu?

goofansu (Author) commented Dec 9, 2016

@brancz I'm using the Docker for Mac app.

Version 1.12.3 (13776)
Channel: Stable
583d1b8ffe

brancz (Member) commented Dec 9, 2016

Interesting. I was able to reproduce it with the Docker for Mac client. It can't resolve localhost for some reason. But this doesn't seem to be a Prometheus issue per se.

I exec'd into the container:

docker exec -it <container-id> /bin/sh

Replaced localhost with 127.0.0.1 in the target:

sed -i s/localhost/127\.0\.0\.1/g /etc/prometheus/prometheus.yml

And made Prometheus reload the config:

kill -SIGHUP 1

... and it worked. So while we can improve the default config here, there is something else odd, but I have the feeling it has to do with the Docker for Mac client.
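
For anyone who prefers not to edit the file inside the running container, roughly the same workaround can be done from the host by mounting a patched copy of the default config (a sketch; the file names are illustrative, the in-container path is the one shown in the startup log, and <container-id> is a placeholder):

# copy the default config out of a running container
docker cp <container-id>:/etc/prometheus/prometheus.yml ./prometheus.yml
# swap localhost for the literal loopback address
sed -i.bak 's/localhost:9090/127.0.0.1:9090/' ./prometheus.yml
# start a fresh container with the patched config mounted over the default
docker run -p 9090:9090 \
  -v "$PWD/prometheus.yml:/etc/prometheus/prometheus.yml" \
  prom/prometheus:latest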

goofansu (Author) commented Dec 9, 2016

@brancz Ok, I'll use 127.0.0.1 as a workaround. I think this should be documented.

goofansu closed this Dec 9, 2016

sdurrheimer (Member) commented Dec 9, 2016

It looks more like a Docker for Mac thing; I know they do some hacky/magic things when resolving names.

simonvanderveldt commented Dec 21, 2016

@sdurrheimer Do you know why Prometheus doesn't actually use localhost, but the container's ID/hostname instead? Is there some lookup being done for this?

Also, does Prometheus do anything custom when determining the hostname?

Within the container we have this in /etc/hosts:

/prometheus # cat /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.17.0.27	bc0b4b7443e6

From within this container, pinging either localhost or bc0b4b7443e6 works fine, but Prometheus still can't reach itself. So it seems like Prometheus isn't using what's defined in /etc/hosts.
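
A rough way to check what name resolution looks like from the container's point of view (a sketch; it assumes the busybox nslookup and wget applets are available in the prom/prometheus image, and <container-id> is a placeholder):

docker exec <container-id> nslookup localhost                        # what the resolver returns for localhost
docker exec <container-id> wget -qO- http://localhost:9090/metrics   # manual scrape by name
docker exec <container-id> wget -qO- http://127.0.0.1:9090/metrics   # manual scrape by literal IP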

Robpol86 commented Dec 27, 2016

I ran into this too (using 127.0.0.1 fixed it for me as well).

I'm testing out Prometheus on a Fedora Server 25 VM with no additional/third-party repos (no official Docker YUM repos):

$ rpm -qa |grep -i docker
python-dockerpty-0.4.1-3.fc25.noarch
docker-compose-1.8.1-1.fc25.noarch
python2-docker-pycreds-0.2.1-2.fc25.noarch
python-docker-py-1.10.6-1.fc25.noarch
docker-1.12.5-1.git6009905.fc25.x86_64
docker-common-1.12.5-1.git6009905.fc25.x86_64
$ docker --version
Docker version 1.12.5, build 6009905/1.12.5

Robpol86 commented Dec 27, 2016

So I did some digging and may have found the culprit (maybe it's a bug?).

127.0.0.1 did indeed work; however, "localhost" still didn't. Neither did "prometheus", my container_name, nor "cadviser", another container.

While refreshing http://localhost:9090/targets, I noticed the error message changed from "context deadline exceeded" to something with "hangup 198.105.244.117" (sorry, I didn't copy the exact error message). Anyway, it was very weird to see a public IP in the error when I was pointing at localhost.

It turns out that IP is a Suddenlink DNS server of sorts (I'm at my parents' house for the holidays). Once I removed that DNS server, stuck with 8.8.8.8 on my Mac (I'm running Fedora Server 25 via VirtualBox), and rebooted my VM, the "localhost" and "cadviser" addresses in prometheus.yml suddenly started working. Everything is "UP" in http://localhost:9090/targets!

Basically, this is probably a weird DNS issue with either Docker, the Docker that ships with Fedora, Fedora itself, or Prometheus. I'm not really sure, though.

Edit: Now that I think about it, it may be Suddenlink resolving localhost and everything else in order to inject their advertising or data-cap messages. Probably not a bug, just an unfortunate side effect of this annoying practice.
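
One way to test that theory (a sketch, run from the host): query the suspect resolver directly and see whether it answers with a public IP for names it has no business resolving.

nslookup doesnotexist.example.invalid 198.105.244.117   # a wildcard/ad-injecting resolver will still return an address
nslookup localhost 198.105.244.117                      # compare with what it returns for localhost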

fer-marino commented Feb 13, 2017

I had the same issue. In the end, it was an incorrect DNS configuration on the host machine. Nothing to do with Prometheus or Docker.

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 24, 2019
