Zookeeper discovery has an exception #4668

Closed
wenfei3 opened this Issue Sep 28, 2018 · 6 comments

wenfei3 commented Sep 28, 2018

Bug Report

What did you do?
Added a serverset_sd_configs job to prometheus.yml and reloaded Prometheus.

What did you expect to see?
I expected Prometheus to discover the services registered in the ZooKeeper cluster.

What did you see instead? Under which circumstances?
When I reloaded prometheus.yml, Prometheus stopped working and I found these errors in the logs.

  • System information:
    Linux 2.6.32-573.22.1.el6.x86_64 x86_64

  • Prometheus version:
    prometheus, version 2.3.2 (branch: HEAD, revision: 71af5e2)
    build user: root@5258e0bd9cc1
    build date: 20180712-14:02:52
    go version: go1.10.3

  • Zookeeper version:
    3.4.12

  • Prometheus configuration file:

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  external_labels:
      monitor: 'fishtrip-monitor'

rule_files:
  - "prometheus.rules.yml"

alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - "10.x.x.x:19093"
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:19090']
  - job_name: 'node_exporter'
  ...
  ...
  - job_name: 'zookeeper_discovery'
    serverset_sd_configs:
    - servers:
      - 'a3:2181'
      - 'a5:2181'
      - 'a6:2181'
      paths:
      - '/services'
  • Logs:
panic: runtime error: invalid memory address or nil pointer dereference
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x1511546]

goroutine 58 [running]:
github.com/prometheus/prometheus/discovery/zookeeper.(*Discovery).Run.func1(0x0)
	/go/src/github.com/prometheus/prometheus/discovery/zookeeper/zookeeper.go:165 +0x26
panic(0x19972e0, 0x2ae2be0)
	/usr/local/go/src/runtime/panic.go:502 +0x229
github.com/prometheus/prometheus/discovery/zookeeper.(*Discovery).Run(0x0, 0x1deec40, 0xc4208de2c0, 0xc420b38120)
	/go/src/github.com/prometheus/prometheus/discovery/zookeeper/zookeeper.go:178 +0x8d
created by github.com/prometheus/prometheus/discovery.(*Manager).startProvider
	/go/src/github.com/prometheus/prometheus/discovery/manager.go:133 +0x116

Author

wenfei3 commented Sep 28, 2018

I think I found my problem.

Member

simonpasquier commented Sep 28, 2018

Reopening as there's indeed a problem with the Zookeeper SD when it can't connect to the servers.
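
The stack trace in the report shows `(*Discovery).Run` being called with `0x0` as its first argument, which suggests the method ran on a nil receiver after the connection attempt failed. The following is a minimal sketch of that failure mode and of one possible guard, not the actual Prometheus code: the names `discovery`, `newDiscovery`, `run`, `startProvider`, and `runPanics` are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// discovery is a hypothetical stand-in for a service-discovery provider.
type discovery struct {
	updates chan string
}

// newDiscovery mimics a constructor that returns a nil pointer and an
// error when it cannot connect to its servers.
func newDiscovery(connectOK bool) (*discovery, error) {
	if !connectOK {
		return nil, errors.New("cannot connect to servers")
	}
	return &discovery{updates: make(chan string, 1)}, nil
}

// run reads a field of the receiver, so calling it on a nil *discovery
// triggers a nil pointer dereference panic.
func (d *discovery) run() string {
	d.updates <- "target"
	return <-d.updates
}

// startProvider checks the constructor error first and never calls run
// on a provider that failed to initialize.
func startProvider(connectOK bool) (string, error) {
	d, err := newDiscovery(connectOK)
	if err != nil {
		return "", fmt.Errorf("skipping provider: %w", err)
	}
	return d.run(), nil
}

// runPanics reports whether calling run on d panics.
func runPanics(d *discovery) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	d.run()
	return false
}

func main() {
	// Calling run on a nil receiver panics; the guard in startProvider
	// turns the same failure into an ordinary error.
	fmt.Println("nil receiver panics:", runPanics(nil))
	_, err := startProvider(false)
	fmt.Println("guarded start:", err)
}
```

Whether the real fix checks the error at construction time or inside the method is not stated in this thread; the sketch only illustrates why an unchecked failed connection can surface as a SIGSEGV rather than as a log message.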

@simonpasquier simonpasquier reopened this Sep 28, 2018

Author

wenfei3 commented Sep 28, 2018

@simonpasquier
Thanks for your reply! There is another problem. My Prometheus connected to the ZK servers successfully, but it couldn't get any service's address or port.

Here is my Prometheus log showing the successful ZK connection:

level=info ts=2018-09-28T10:18:44.05267337Z caller=main.go:222 msg="Starting Prometheus" version="(version=2.3.2, branch=HEAD, revision=71af5e29e815795e9dd14742ee7725682fa14b7b)"
level=info ts=2018-09-28T10:18:44.052764946Z caller=main.go:223 build_context="(go=go1.10.3, user=root@5258e0bd9cc1, date=20180712-14:02:52)"
level=info ts=2018-09-28T10:18:44.052793254Z caller=main.go:224 host_details="(Linux 2.6.32-573.22.1.el6.x86_64 #1 SMP Wed Mar 23 03:35:39 UTC 2016 x86_64 op1 (none))"
level=info ts=2018-09-28T10:18:44.052817947Z caller=main.go:225 fd_limits="(soft=65535, hard=65535)"
level=info ts=2018-09-28T10:18:44.053894088Z caller=web.go:415 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-09-28T10:18:44.053873265Z caller=main.go:533 msg="Starting TSDB ..."
level=info ts=2018-09-28T10:18:45.402911724Z caller=main.go:543 msg="TSDB started"
level=info ts=2018-09-28T10:18:45.403104292Z caller=main.go:603 msg="Loading configuration file" filename=prometheus.yml
level=info ts=2018-09-28T10:18:45.405783471Z caller=main.go:629 msg="Completed loading of configuration file" filename=prometheus.yml
level=info ts=2018-09-28T10:18:45.4058165Z caller=main.go:502 msg="Server is ready to receive web requests."
level=info ts=2018-09-28T10:18:45.40758143Z caller=treecache.go:60 component="discovery manager scrape" discovery=zookeeper msg="Connected to 10.25.197.20:2181"
level=info ts=2018-09-28T10:18:45.415419815Z caller=treecache.go:60 component="discovery manager scrape" discovery=zookeeper msg="Authenticated: id=73893403303280656, timeout=10000"
level=info ts=2018-09-28T10:18:45.415490736Z caller=treecache.go:60 component="discovery manager scrape" discovery=zookeeper msg="Re-submitting `0` credentials after reconnect"

And here is the service in ZK:

[zk: localhost:2181(CONNECTED) 12] get /services/zeus/c34e20d3-f278-430c-8f04-db40b145331c
{"name":"zeus","id":"c34e20d3-f278-430c-8f04-db40b145331c","address":"10.x.x.x","port":xxxx,"sslPort":null,"payload":{"@class":"org.springframework.cloud.zookeeper.discovery.ZookeeperInstance","id":"zeus:xxxx","name":"zeus","metadata":{"instance_status":"UP"}},"registrationTimeUTC":1538103418690,"serviceType":"DYNAMIC","uriSpec":{"parts":[{"value":"scheme","variable":true},{"value":"://","variable":false},{"value":"address","variable":true},{"value":":","variable":false},{"value":"port","variable":true}]}}
cZxid = 0x10000006d
ctime = Fri Sep 28 10:56:58 CST 2018
mZxid = 0x10000006d
mtime = Fri Sep 28 10:56:58 CST 2018
pZxid = 0x10000006d
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x10685a8a7230007
dataLength = 515
numChildren = 0

And here is my Prometheus UI:

[image: screenshot of the Prometheus UI]

I have no idea why the host is empty and the port is 0.
Could anyone give me a hand? Thanks very much!

Member

simonpasquier commented Sep 28, 2018

Unfortunately, I'm not a ZooKeeper SD expert. Which system provisions the serversets in your ZooKeeper cluster?

Author

wenfei3 commented Sep 30, 2018

Thanks for your reply!
I think I know where my problem is. I use Spring Cloud for service discovery, and Prometheus doesn't support it. It only supports Finagle.
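
A rough sketch of why the format mismatch would show up as an empty host and port 0: if a decoder expects the Finagle serverset layout, where host and port sit under a `serviceEndpoint` object, then a Spring Cloud node that stores them as top-level `address` and `port` fields decodes to zero values without any error. The struct below is an assumption modelled on that layout, not a copy of the Prometheus source, and the address `10.0.0.1` and port `8080` are placeholder values standing in for the redacted ones above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// endpoint holds a host/port pair as it appears in a serverset member.
type endpoint struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// serversetMember is an assumed, simplified model of a Finagle-style
// serverset entry: the address lives under "serviceEndpoint".
type serversetMember struct {
	ServiceEndpoint endpoint `json:"serviceEndpoint"`
	Status          string   `json:"status"`
}

// Placeholder node contents. springNode follows the shape of the
// Spring Cloud entry shown above; finagleNode follows the serverset shape.
const springNode = `{"name":"zeus","address":"10.0.0.1","port":8080}`
const finagleNode = `{"serviceEndpoint":{"host":"10.0.0.1","port":8080},"status":"ALIVE"}`

// parseMember decodes a ZooKeeper node payload as a serverset member.
// Unknown JSON fields are silently ignored by encoding/json.
func parseMember(data []byte) (serversetMember, error) {
	var m serversetMember
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	for _, node := range []string{springNode, finagleNode} {
		m, err := parseMember([]byte(node))
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("host=%q port=%d\n", m.ServiceEndpoint.Host, m.ServiceEndpoint.Port)
	}
}
```

Because the Spring Cloud payload is valid JSON, decoding succeeds and the missing `serviceEndpoint` simply leaves the host empty and the port at 0, which is consistent with what the UI displayed.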

Member

simonpasquier commented Oct 31, 2018

Closed by #4669
