Skip to content

HTTP/HTTPS check accepts non-2xx responses, leaks response body, and has no timeout #7

@dolph

Description

@dolph

Summary

The HTTP check in https.go has three independent defects that, combined, make it unreliable as a connectivity validator:

  1. Status code is never checked. The README states "An HTTP 2xx response is expected", but the code returns success on any non-error response — including 4xx and 5xx.
  2. Response body is never closed. This leaks file descriptors and connections; in monitor mode it accumulates over time.
  3. No timeout. http.Get uses the default client, which has no timeout. A slow or hung server will block the check indefinitely. In monitor mode that destination will simply stop being checked.

There is also a subtler design issue: the per-IP Dial loop validates connectivity to each address returned by DNS, but http.Get(dest.URL) re-resolves DNS internally via the OS resolver, so the actual HTTP request may end up at a different IP than what Lookup/Dial verified.

Code

https.go:

func HTTPS(dest *Destination) bool {
	dest.Increment("connectivity.http", []string{})
	_, err := http.Get(dest.URL)               // no timeout; response discarded; body never closed
	if err != nil {
		dest.Increment("connectivity.http.error", []string{})
		LogDestinationError(dest, "Failed HTTP GET", err)
		return false
	}
	dest.Increment("connectivity.http.success", []string{})
	return true
}

Impact

  • A destination returning HTTP 500 is reported as "Connected", so the tool returns 0 from connectivity check and gates do not catch the failure.
  • Long-running monitor accumulates leaked connections and may hang on slow upstreams forever.

Suggested fix

func HTTPS(dest *Destination) bool {
	dest.Increment("connectivity.http", []string{})
	client := &http.Client{Timeout: 15 * time.Second}
	resp, err := client.Get(dest.URL)
	if err != nil {
		dest.Increment("connectivity.http.error", []string{})
		LogDestinationError(dest, "Failed HTTP GET", err)
		return false
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)              // drain so the connection can be reused
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		dest.Increment("connectivity.http.error", []string{})
		LogDestinationError(dest, fmt.Sprintf("Non-2xx status: %d", resp.StatusCode), nil)
		return false
	}
	dest.Increment("connectivity.http.success", []string{})
	return true
}

Consider also making the timeout configurable and emitting an http.timeout metric tag to distinguish hangs from connect failures.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions