Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backend/http: allow to force treating all requests as trusted for tracing purposes #19

Merged
merged 2 commits into from
Aug 23, 2023

Conversation

roobre
Copy link
Member

@roobre roobre commented Jul 27, 2023

Trusting client-provided parent trace IDs can be a security risk, as it can open the door to DoS attacks by allowing a client to send a large amount of requests with the sample flag set to true, causing the backend to generate a large amount of tracing data. This can affect the performance of the tracing backend, and sometimes also billing.

However, this could be useful for testing, so this PR introduces the QUICKPIZZA_TRACING_INSECURE env var which, when set to a truthy value, will always trust trace parent IDs from external requests.

@roobre roobre changed the base branch from custom-client to main July 28, 2023 16:36
@roobre roobre requested a review from dgzlopes July 28, 2023 16:55
@dgzlopes
Copy link
Member

dgzlopes commented Aug 7, 2023

I don't fully understand this. If the client doesn't send the traceID, we still create a new trace per request, don't we? 🤔

@roobre
Copy link
Member Author

roobre commented Aug 18, 2023

@dgzlopes The logic for traces and their relationship is a bit convoluted, I'll try to summarize the current behavior here:

  1. For incoming requests that do not have any trace-related header, we create a brand new span. This span will become the root span for the trace, and it will be an HTTP Server span for the server who received the request.
  2. If incoming requests do have a traceparent header, then we check the following:
    a. If a trusted, secret header is also found, we assume this requests comes from one of our other (trusted) services. In this case, we take the traceparent header and we make the HTTP Server span a child of it. The traceparent header is also trusted for more things, e.g. to flag whether a request should be sampled
    b. If there is a traceparent header, but no secret "trust me bro" header, the code will not make the HTTP Server span a child of the incoming trace, as it is not considered trusted. Instead, the code will make a weaker relationship (a link). The should-sample flag of the traceparent header is ignored

This is the default behavior encouraged by opentelemetry, as blindly trusting traceparent headers can have security implications: A client could force your code to sample every request it sends, DoSing your telemetry backend.

This PR introduces an env var for blindly trusting all requests: The traceparent header will be honored from external, untrusted requests, and generated spans will use the incoming, untrusted trace as a parent. This is useful for debugging purposes.

LMK if this helped made the thing more clear :)

@Blinkuu
Copy link

Blinkuu commented Aug 21, 2023

@roobre I would eliminate WithPublicEndpoint() altogether or at least not make it the default (e.g., put it behind a feature flag). This is a demo application; to me, it only makes the codebase more complex (and harder to demo with k6 x Tempo).

A client could force your code to sample every request it sends, DoSing your telemetry backend.

The sampling flag is one thing, but you can also configure sampling at the following levels:

  • Application level (SDK)
  • Agent
  • Tracing backend

Having link association configured does not help avoid the denial of wallet attack.

@roobre
Copy link
Member Author

roobre commented Aug 21, 2023

The sampling flag is one thing, but you can also configure sampling at the following levels: [...]

Correct me if I'm wrong, but isn't the typical tracing setup to honor the tracing flag over the SDK configuration? Otherwise you would get incomplete traces, right?

Having link association configured does not help avoid the denial of wallet attack.

My current understanding is that it does help, as it prevents clients from overriding your sampling choices. I haven't read the full spec, so I might be wrong here.

This is a demo application; to me, it only makes the codebase more complex

This is a fair point, but I do not fully agree. I think that as a demo application, it should do things the way they are supposed to be done, or be very obvious in doing them wrong (so everybody can say "ah, that's because it's a demo"). Trusting all traces by default is, in my view, extremely non-obvious for people without prior experience manually instrumenting things. I'm worried about people looking at the code for examples and finding concealed bad practices that they could then get carried over to real projects.

As a final point, I think real applications or deployments could be configured with the equivalent of WithPublicEndpoint(), so I think it would be nice for the demo application to mimic whenever possible real application behavior.

and harder to demo with k6 x Tempo

This is important for me, I would not like to make it hard to use for demos. I'm hopeful that we can figure out a way to make the application easily demoable that without making the code deceptively dangerous to learn from.

Some alternatives to getting rid of this behavior are:

  • Keep this PR, but set QUICKPIZZA_TRACING_INSECURE=1 in the default deployment YAML. This way, the code still tells the full story, but demos work out of the box.
  • Work a bit more on the isPublic logic, for example by defining a header like X-Tracing-Trusted-Request that k6 (client) can send and would cause the application to consider the request trusted. I suspect this would be more similar to how things are done in the real world: Letting your gateway know somehow that a client is trusted to create root spans.

@Blinkuu LMK what you think about this. I hope we can find a solution that makes the application easily demoable without making the code deceive readers into bad practices.

pkg/http/http.go Outdated Show resolved Hide resolved
pkg/http/http.go Outdated Show resolved Hide resolved
@roobre roobre requested a review from Blinkuu August 22, 2023 15:58
@roobre
Copy link
Member Author

roobre commented Aug 23, 2023

@dgzlopes Any other thoughts on the PR? LMK if you have any remaining questions 😃

Copy link
Member

@dgzlopes dgzlopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If @Blinkuu is happy, I'm happy. These days, he knows the most about tracing 😄

But yeah, I gave it another look, and generally, LGTM.

@roobre
Copy link
Member Author

roobre commented Aug 23, 2023

Awesome! I will raise a follow-up PR setting this var to 1 by default, so demo deployments work out of the box for k6 x tempo.

@roobre roobre merged commit d666949 into main Aug 23, 2023
@dgzlopes dgzlopes deleted the trust-traces branch August 23, 2023 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants