Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trickster's health endpoint infinitely appends to the "Via" header on upstream requests, ultimately causing an HTTP 431 error #531

Closed
tminor opened this issue Feb 3, 2021 · 2 comments
Assignees
Labels
1.1 release Feature/Fix will be available in 1.1 Release bug Something isn't working healthchecking affects health checking

Comments

@tminor
Copy link
Contributor

tminor commented Feb 3, 2021

Hi!

I recently set up Trickster behind a load balancer that checks Trickster's /trickster/health/default endpoint every second or so. After a Trickster node has been up for around 5 days, the load balancer marks the node as down. My first instinct was to check via curl, which returned HTTP 431. Restarting Trickster solved the problem, but it ultimately resurfaced. I had a look at a Grafana dashboard for node_exporter and found this:

trickster

The curious looking pattern is from the loopback/localhost interface and the sudden drops correspond with Trickster restarting; Trickster's default backend is Promxy running on the same host (via localhost). After seeing the above dashboard panel, I used tcpdump to examine the traffic between Trickster and Promxy. This revealed giant Via headers in the requests from Trickster to Promxy. I restarted Trickster and noted that the value for Via was appended to with every request to /trickster/health/default. Thinking this was an error on my part, I configured Trickster on my laptop and pointed it directly at a Prometheus server with the simplest possible configuration:

[caches.default]
cache_type = "memory"

[frontend]
listen_port = 8480

[logging]
log_level = "info"

[backends.default]
provider = "prometheus"
origin_url = "http://prometheus:9090" # Changed to a real value, of course.

During the next phase of troubleshooting, I used dlv to pinpoint the source of the problem. I'm very far from proficient with Go, but I believe this line is the culprit. Based on what I've read, the right side of the assignment is passing a pointer. Subsequent operations (such as in SetVia()) cause unintended side effects. Please correct me if I'm wrong!

I was able to ameliorate the problem with this patch:

diff --git a/pkg/backends/prometheus/handler_health.go b/pkg/backends/prometheus/handler_health.go
index 59b52a2..a1eaebf 100644
--- a/pkg/backends/prometheus/handler_health.go
+++ b/pkg/backends/prometheus/handler_health.go
@@ -19,6 +19,7 @@ package prometheus
 import (
 	"context"
 	"net/http"
+	"strings"
 
 	tctx "github.com/tricksterproxy/trickster/pkg/proxy/context"
 	"github.com/tricksterproxy/trickster/pkg/proxy/engines"
@@ -45,7 +46,9 @@ func (c *Client) HealthHandler(w http.ResponseWriter, r *http.Request) {
 	rsc := request.GetResources(r)
 	req = req.WithContext(tctx.WithHealthCheckFlag(tctx.WithResources(context.Background(), rsc), true))
 
-	req.Header = c.healthHeaders
+	for k, v := range c.healthHeaders {
+		req.Header.Set(k, strings.Join(v, ""))
+	}
 	engines.DoProxy(w, req, true)
 }
 

I'd submit a PR but I'm not sure if my "solution" is acceptable (strings.Join() seems smelly, for example). I'm also open to the suggestion that I'm doing something wrong :)

@tminor
Copy link
Contributor Author

tminor commented Feb 3, 2021

It seems this problem is handled differently in InfluxDB's handler_health.go. Perhaps that's the better solution?

tminor added a commit to tminor/trickster that referenced this issue Feb 3, 2021
@jranson jranson self-assigned this Feb 4, 2021
@jranson jranson added 1.1 release Feature/Fix will be available in 1.1 Release bug Something isn't working healthchecking affects health checking labels Feb 4, 2021
@jranson
Copy link
Member

jranson commented Feb 4, 2021

Thanks again for identifying and patching this. Release 1.1.5 is now available here and on DockerHub.

@jranson jranson closed this as completed Feb 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.1 release Feature/Fix will be available in 1.1 Release bug Something isn't working healthchecking affects health checking
Projects
None yet
Development

No branches or pull requests

2 participants