Prom using up all available socket connections after a few days #4659
Comments
It might be connected to my other issue, #4646, where some domains are not scraped even though the data is valid. Some error might not be freeing up the connections. (Just throwing an idea out there...)
simonpasquier added the component/scraping label Sep 26, 2018
This is probably a Windows-specific issue, as it hasn't been reported for other systems. How many targets do you scrape?
About 20. Some of the targets are still enabled even though they're down; about 5 are up. Should I try running with only the targets that are up and see if there's a difference?
Having all targets in the UP state does not help. I will try running Prometheus from Docker and will update here.
Hmm, this is concerning. Could you see if Prometheus is holding files or network sockets? I am not familiar enough with Windows to know how to do it, but in Linux, I'd use Further, if it is indeed files, using
I have been running Prometheus inside Docker for a few days and it seems better. I can't use 2.4.2 because I receive this error: "INVALID is not a valid start token". I will try to run the Windows version and use the tips found here to check whether Prometheus is holding the connections open: https://stackoverflow.com/questions/8902136/how-to-find-a-list-of-sockets-held-by-a-process-in-windows
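One way to run the check described above is to capture the output of `netstat -ano` on Windows and count the TCP rows per connection state for the Prometheus process. The following is only a minimal sketch: the function name `count_tcp_states`, the PID `4321`, and the sample rows are made up for illustration and are not taken from this issue.

```python
from collections import Counter
from typing import Optional


def count_tcp_states(netstat_output: str, pid: Optional[int] = None) -> Counter:
    """Count TCP rows per state in `netstat -ano` text, optionally for one PID."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # A TCP row has five columns: Proto, Local Address, Foreign Address, State, PID.
        # The header row and UDP rows split into a different number of fields.
        if len(fields) != 5 or fields[0] != "TCP":
            continue
        if pid is not None and fields[4] != str(pid):
            continue
        states[fields[3]] += 1
    return states


# Hypothetical sample output; real data would come from running `netstat -ano`.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    127.0.0.1:49712        127.0.0.1:3000         ESTABLISHED     4321
  TCP    127.0.0.1:49713        127.0.0.1:3000         TIME_WAIT       0
  TCP    127.0.0.1:49714        127.0.0.1:3000         ESTABLISHED     4321
  TCP    0.0.0.0:9090           0.0.0.0:0              LISTENING       4321
  UDP    0.0.0.0:5353           *:*                                    1880
"""

if __name__ == "__main__":
    print(count_tcp_states(SAMPLE, pid=4321))
    print(count_tcp_states(SAMPLE))
```

On a real machine the text could be obtained with `subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout`. Comparing the counts taken a few hours apart would show whether the number of sockets attributed to the process keeps growing. Counting without a PID filter is also worth doing, since closed connections may be listed under a different PID than the process that opened them.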
Can you paste the metrics output from the target that generates this error?
@simonpasquier: I can't really share the whole output due to privacy reasons, here's a piece of it. Would it be possible to add more detail to the parsing error?
You can check the metrics with
simonpasquier added the kind/more-info-needed label Oct 4, 2018
@simonpasquier Seems the issue is gone in v2.4.2! I will report back if it reoccurs. Thanks a lot for the help!
augmenter commented Sep 26, 2018
The Windows version seems to use up all available socket connections after running for a long time.
It runs fine for a few days, then I am hit with log messages: dial tcp 127.0.0.1:3000: bind: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full. When this happens, I am unable to open any web pages in a browser or make any new connections from my machine. Scraping also stops.
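The bind error quoted above is the text of Winsock error WSAENOBUFS (10055), which is commonly associated with the machine running out of ephemeral ports or socket buffers. A small probe like the one below can confirm whether new local sockets can still be bound at the moment the symptoms appear. This is a sketch under assumptions, not a diagnosis of the cause: the helper name `probe_ephemeral_bind` is hypothetical.

```python
import socket
from typing import Tuple, Union


def probe_ephemeral_bind(host: str = "127.0.0.1") -> Tuple[bool, Union[int, str]]:
    """Try to bind a TCP socket to an OS-chosen port; report the port or the error."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the operating system to pick any free ephemeral port.
        sock.bind((host, 0))
        return True, sock.getsockname()[1]
    except OSError as exc:
        # On an exhausted machine the bind is expected to fail here.
        return False, str(exc)
    finally:
        sock.close()


if __name__ == "__main__":
    ok, detail = probe_ephemeral_bind()
    print("bind succeeded on port" if ok else "bind failed:", detail)
```

Running the probe periodically and logging the result alongside the Prometheus logs would show whether the failure is system-wide at the time scraping stops.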
Proposal
I'm running a test environment on my Windows machine as a proof of concept; later I'm planning to deploy it to a Linux environment.
Bug Report
Run prometheus for a few days.
What did you expect to see?
Expected it to keep working.
What did you see instead? Under which circumstances?
Scraping stopped
Environment
Windows 10
n/a
Prometheus version:
prometheus, version 2.3.2 (branch: HEAD, revision: 71af5e2 5682fa14b7b)
build user: root@5258e0bd9cc1
build date: 20180712-14:13:08
go version: go1.10.3
Logs: