Checking VictoriaMetrics health endpoint returns socket timeout #507

Closed
matejzero opened this issue Dec 27, 2019 · 4 comments · Fixed by #515

@matejzero

Hello,

I'm trying to monitor the VictoriaMetrics health endpoint, but check_http returns CRITICAL - Socket timeout.

Output of the command:

./check_http -H localhost -u /health -p 8428 --verbose
GET /health HTTP/1.1
User-Agent: check_http/v2.3.1 (nagios-plugins 2.3.1)
Connection: close
Host: localhost:8428
Accept: */*


CRITICAL - Socket timeout

Every now and then, I do get a successful OK result:

/usr/lib64/nagios/plugins/check_http -H localhost -u /health -p 8428 --verbose
GET /health HTTP/1.1
User-Agent: check_http/v2.1.4 (nagios-plugins 2.1.4)
Connection: close
Host: localhost:8428
Accept: */*


http://localhost:8428/health is 122 characters
STATUS: HTTP/1.1 200 OK
**** HEADER ****
Content-Type: text/plain
Date: Fri, 27 Dec 2019 14:37:08 GMT
Content-Length: 2
Connection: close

**** CONTENT ****
OK
HTTP OK: HTTP/1.1 200 OK - 122 bytes in 0.002 second response time |time=0.001788s;;;0.000000 size=122B;;;0

curl returns a successful result every time (note that there is no newline at the end of the OK string):

curl -v http://localhost:8428/health
* About to connect() to localhost port 8428 (#0)
*   Trying ::1...
* Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8428 (#0)
> GET /health HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8428
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Date: Fri, 27 Dec 2019 14:40:05 GMT
< Content-Length: 2
<
* Connection #0 to host localhost left intact
OK

I'm not sure whether this is a bug in check_http or on the VictoriaMetrics side, since curl returns OK every time. If I run check_http with the -N flag, the check passes.
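
A minimal sketch of one way this symptom can arise, assuming (as the commit message later in this thread indicates) that check_http keeps reading until read() returns 0 or errors. If a server sends the complete response but leaves the connection open, a client that reads until EOF blocks until its timeout fires, even though the 2-byte body has already arrived. The Python below is illustrative only and is not check_http's code; `RESPONSE`, `read_until_eof`, and `demo` are hypothetical names, and a local socket pair stands in for the server.

```python
import socket

# Hypothetical response modelled on the curl output above:
# Content-Length is 2 and the body is "OK" with no trailing newline.
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 2\r\n"
    b"\r\n"
    b"OK"
)


def read_until_eof(sock, bufsize=4096):
    """Read until the peer closes the connection (recv returns b"")."""
    data = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            return data
        data += chunk


def demo(timeout=0.2):
    """Return "timeout" or "eof" depending on how the read loop ends."""
    # Assumption: a connected socket pair stands in for server and client.
    server, client = socket.socketpair()
    try:
        client.settimeout(timeout)
        server.sendall(RESPONSE)  # whole response sent, socket left open
        try:
            read_until_eof(client)
            return "eof"
        except socket.timeout:
            return "timeout"
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    print(demo())
```

Under these assumptions the loop ends with a timeout rather than an EOF. A client that stops after Content-Length bytes would not wait for the connection to close.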

@sawolf
Member

sawolf commented Dec 27, 2019

Hi @matejzero, thanks for reporting this. I'm not sure when I'll be able to debug/investigate this, but at least I was able to reproduce this against their container. You said it works occasionally, so I suspect this is a bug in the nagios-plugins project.

@sawolf sawolf added the Bug label Dec 27, 2019
@matejzero
Author

Sure, no problem. It works for me with the -N flag for now, so I'm fine. I don't need the string check at the moment, since the status code is enough in my case.

sawolf added a commit to sawolf/nagios-plugins that referenced this issue Jan 31, 2020
… the amount of content received

Prior to this commit, check_http relied on read() returning 0 or erroring in order to
exit the loop. This commit keeps the same behavior, but once headers have been received,
it also reads the Content-Length and compares it to the number of bytes received after
the start of the body.
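
The behavior described in the commit message can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the actual C code from the pull request: `read_response` and its `recv` parameter are hypothetical names, and the header parsing handles only a plain `Content-Length` header (no chunked transfer encoding).

```python
import re

# Matches a Content-Length header line, case-insensitively.
_CONTENT_LENGTH = re.compile(
    rb"^content-length:[ \t]*(\d+)[ \t]*\r?$",
    re.IGNORECASE | re.MULTILINE,
)


def read_response(recv, bufsize=4096):
    """Read an HTTP response via recv(bufsize) -> bytes.

    Stops when recv returns b"" (peer closed the connection) or, once the
    headers are complete, when the number of body bytes received reaches
    the advertised Content-Length.
    """
    buf = b""
    content_length = None  # unknown until the headers have been parsed
    body_start = None      # offset of the first body byte in buf

    while True:
        chunk = recv(bufsize)
        if not chunk:
            break  # EOF: the original exit condition still applies
        buf += chunk

        if body_start is None:
            end = buf.find(b"\r\n\r\n")
            if end == -1:
                continue  # headers not complete yet, keep reading
            body_start = end + 4
            match = _CONTENT_LENGTH.search(buf[:end])
            if match:
                content_length = int(match.group(1))

        # Additional exit condition: the whole body has arrived.
        if content_length is not None and len(buf) - body_start >= content_length:
            break

    return buf
```

With a real socket, `recv` could be the socket's own `recv` method. When no Content-Length header is present, the sketch falls back to waiting for EOF, which matches the "keeps the same behavior" wording above.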
@sawolf
Member

sawolf commented Jan 31, 2020

The pull request above seems to fix the issue in my environment. Would you be willing to compile and verify that it works for you as well?

@matejzero
Author

I can confirm this PR fixes the bug.

# ./check_http -H localhost -u /health -p 8428
HTTP OK: HTTP/1.1 200 OK - 122 bytes in 0.002 second response time |time=0.001807s;;;0.000000 size=122B;;;0
