Support monitoring TCP services #18

dchidell · 2020-10-04T13:29:50Z

Not all services are web based - but they can be checked fairly simply (i.e. see if they're at least running) by opening a TCP socket. If the socket opens successfully, the service is considered online.

This is crude - but at least opens the door to monitoring applications which are not based on HTTP.

This is pretty similar to this request: #5 but should be a lot more generic.

TwiN · 2020-10-04T23:44:48Z

I'm all for it, but I'm a little perplexed by the following statement:

"If the socket opens successfully, the service is considered online."

If the socket opens successfully, it just means that there's something listening on that port at the host specified; it doesn't necessarily mean that the service is healthy.

The code for this is quite simple, in fact:

conn, err := net.DialTimeout("tcp", "127.0.0.1:6379", 10 * time.Second)
if err != nil {
	// couldn't open connection :(
	panic(err)
}
fmt.Println("Managed to open connection successfully :)")
conn.Close()

But this is the least of our concerns, as the biggest concern lies with the conditions themselves. To be perfectly frank, the idea of supporting non-HTTP requests hadn't even crossed my mind when I started Gatus. While the fault lies entirely with me on this one, it doesn't change the fact that we need to ask ourselves how we can support both the new pattern you're suggesting and HTTP.

Let's take a look at the current core configuration options:

services:
  - name: twinnation  
    url: "https://twinnation.org/health"
    interval: 30s     
    conditions:
      - "[STATUS] == 200"         
      - "[BODY].status == UP"     
      - "[RESPONSE_TIME] < 300"

We can already see a few problems here, namely the conditions and the url:

A few of the conditions use placeholders that are specific to HTTP requests, such as the [BODY], which uses JSONPath, and [STATUS], which is the HTTP status.
The URL is, well, a URL, but TCP doesn't really care about URLs; it only cares about the host name and the port.

Suggested solution

Coupled with forcing the prefix tcp:// for TCP services and http:// or https:// for HTTP services,
ignoring unnecessary placeholders and configuration options would definitely make the work easy.

I will introduce a new placeholder, [CONNECTED], which will return true only if a connection could be established.
It will work for both TCP and HTTP - probably UDP in the future too.

Here's what the configuration for a TCP service would look like:

services:
  - name: redis
    url: "tcp://127.0.0.1:6379"
    interval: 1m
    conditions:
      - "[CONNECTED] == true"

Of course, quite a few fields will be useless for TCP services, such as services[].body, services[].insecure,
services[].headers, services[].method and services[].graphql. Likewise, the [STATUS] and [BODY] placeholders will be useless.

I originally wanted to create a new top-level field tcpServices[] instead of services[] just for TCP services, but then I realized that later on, if we want to add support for native MySQL queries, UDP, native Redis commands, etc., it would require adding one new top-level field for each of them.

Instead, leveraging the prefix of the services[].url should be enough even for future use cases, e.g.:

tcp://127.0.0.1:6379
redis://127.0.0.1:6379
postgres://{user}:{password}@{hostname}:{port}/{database-name}
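
As a sketch of how the prefix could drive everything, the standard net/url package already splits such URLs into scheme, host and port. The target type and parseTarget function below are hypothetical, and the postgres credentials and host are made-up values standing in for the {placeholders} above:

```go
package main

import (
	"fmt"
	"net/url"
)

// target describes where and how to connect. Hypothetical type for illustration.
type target struct {
	Protocol string // taken from the URL scheme, e.g. "tcp" or "redis"
	Host     string
	Port     string
}

// parseTarget extracts the protocol, host and port from a service URL.
func parseTarget(rawURL string) (target, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return target{}, err
	}
	if u.Scheme == "" || u.Hostname() == "" {
		return target{}, fmt.Errorf("%q is missing a scheme or host", rawURL)
	}
	return target{Protocol: u.Scheme, Host: u.Hostname(), Port: u.Port()}, nil
}

func main() {
	rawURLs := []string{
		"tcp://127.0.0.1:6379",
		"redis://127.0.0.1:6379",
		"postgres://user:secret@db.example.internal:5432/app", // made-up values
	}
	for _, rawURL := range rawURLs {
		parsed, err := parseTarget(rawURL)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		fmt.Printf("%s -> %s:%s\n", parsed.Protocol, parsed.Host, parsed.Port)
	}
}
```

A switch on the Protocol field could then pick the right kind of check without adding a new top-level configuration field per protocol.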

In any case, I've already started the work necessary and it'll be available shortly.

TwiN · 2020-10-05T00:10:28Z

@dchidell I've pushed this into latest. Would you mind giving it a try locally and telling me what you think before I release this in v1.2.0?

dchidell · 2020-10-05T12:06:41Z

Tested it using the latest docker tag, and looks good to me! Thanks for getting that in, SUPER fast as well.

To your above point regarding if the service is listening / healthy - there are almost limitless ways to test if a TCP service is healthy / functional, but while checking the TCP port is crude, it's often effective (i.e. the service hasn't crashed, and is at least configured to listen on the right port). It's a good indicator if something is likely to work or not without delving into the complexity of different services.

I don't think this will work for UDP-based services, since there's no connection to 'establish' as you get with TCP's three-way handshake. I think you'd end up having to build a mechanism that sends a particular byte string and then expects some form of response depending on the service type. For TCP, however, this works beautifully :)

TwiN · 2020-10-05T14:23:59Z

For UDP, I was hoping to just send something and accept any answer as a valid answer, or perhaps support services[].body as what to write. Anyhow, one problem at a time 🤣

Thanks for your input! I'll release this in v1.2.0, probably today

TwiN · 2020-10-06T12:20:39Z

Released in v1.2.0

TwiN changed the title ~~Feature Request - Backend monitoring of TCP sockets~~ Support monitoring TCP services Oct 4, 2020

TwiN added the feature label Oct 4, 2020

TwiN closed this as completed in 3ecfe4d Oct 4, 2020
