Zookeeper connection leak in Discovery #5093
Comments
I replaced the code below in func (d *Discovery) Run():
with the code:
With this change we no longer saw the leaks described in the issue. The new code consumes any event pending on the d.updates channel; when no event is pending, it takes the default case and breaks out of the for loop. After that, d.conn.Close() is invoked and the deferred function exits.
@ioanvapi Do you want to open a PR for this with the changes you described above?
I have a fork and I can open a pull request if my solution seems to be OK.
ioanvapi added a commit to ioanvapi/prometheus that referenced this issue on Jan 14, 2019
ioanvapi commented Jan 14, 2019 (edited)
What did you do?
Used Prometheus with ZooKeeper for service discovery, then reloaded Prometheus (SIGHUP) several times.
What did you expect to see?
From a pprof perspective, Prometheus should keep the same number of goroutines and a similar number of process file descriptors for the ZooKeeper connections.
What did you see instead? Under which circumstances?
After several Prometheus service reloads ("killall -SIGHUP prometheus"), we observed that the number of file descriptors for ZooKeeper connections had increased. The number of goroutines related to Discovery.Run() shown in /debug/pprof/goroutine had also increased.
We created a cron job that reloads the Prometheus service every 20 minutes and observed the behaviour described above.
Environment
System information:
Linux 3.10.0-862.14.4.el7.x86_64 x86_64
Prometheus version:
prometheus, version 2.6.0 (branch: HEAD, revision: dbd1d58)
build user: root@bf5760470f13
build date: 20181217-15:14:46
go version: go1.11.3
Code in prometheus/discovery/zookeeper/zookeeper.go:
In the above code, d.conn.Close() is invoked only after the preceding for range loop over the d.updates channel terminates. Since the d.updates channel is never closed anywhere in the code, the following happens: