Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support RBAC on remote calls from the querier to the serverless backend #2477

Closed
modulitos opened this issue May 15, 2023 · 6 comments · Fixed by #2593
Closed

Support RBAC on remote calls from the querier to the serverless backend #2477

modulitos opened this issue May 15, 2023 · 6 comments · Fixed by #2593

Comments

@modulitos
Copy link
Contributor

Is your feature request related to a problem? Please describe.

As an admin of a self-hosted Tempo installation, I want to use my cloud provider's HTTP api when sending requests from the querier to the serverless backend (aka external_endpoint), so that I can control access via my cloud provider's RBAC policies.

The current external_endpoints configuration (used for hosting a Tempo serverless backend) doesn't allow for sending auth info. We can see this in the code where the querier sends an http request to the backend, in the Querier.searchExternalEndpoint function:

req, err := http.NewRequest(http.MethodGet, externalEndpoint, nil)

Describe alternatives you've considered

The current workaround seems to rely on firewalls/VPCs/etc to allow the queriers to send requests to the serverless backend, while blocking all other traffic. This solution isn't ideal because it doesn't leverage the existing ecosystem of RBAC, service accounts, etc from a cloud provider. These firewalls can also be difficult to configure correctly, potentially exposing security vulnerabilities.

Describe the solution you'd like

I would like a solution that leverages the RBAC policies of the cloud providers when sending requests to the serverless backend. Queriers are already configured with a service account or role from the cloud provider - after all, this is how they are accessing the storage bucket.

Here's a quick summary for how we might implement this feature:
Instead of using http.NewRequest(http.MethodGet, ...) in Querier.searchExternalEndpoint (linked above), I think we can delegate this call to the appropriate AWS/GCP/Azure client library.

We're already doing this when accessing the storage buckets:

switch cfg.Backend {
case "local":
err = fmt.Errorf("local backend not supported for serverless functions")
case "gcs":
r, _, _, err = gcs.NewNoConfirm(cfg.GCS)
case "s3":
r, _, _, err = s3.NewNoConfirm(cfg.S3)
case "azure":
r, _, _, err = azure.NewNoConfirm(cfg.Azure)
default:
err = fmt.Errorf("unknown backend %s", cfg.Backend)
}

I think we leverage a similar pattern when selecting the cloud provider when making an HTTP service request in Querier.searchExternalEndpoint.

As for the HTTP request itself, I believe AWS/GCP/Azure all have APIs for sending HTTP requests to their services. Here's an example where we are sending an HTTP request using Google's NewTransport api:

transport, err := google_http.NewTransport(ctx, customTransport, transportOptions...)

If that sounds reasonable, then I'm happy to chat more about the details for implementing this, including what the configuration might look like, how it'll get rolled out, etc.

Additional context

Let me know if I missed something or if anything is unclear - you can find me on the Grafana slack room if chatting is easier :)

I understand that this feature will add some complexity, so maybe this isn't a priority for Tempo's roadmap. But I'm curious what the maintainers think, and I'm interested in working on it.

@joe-elliott
Copy link
Member

We would be glad to take this PR. I would honestly prefer using RBAC myself, but we chose the path we did for speed. I think we'll need an interface that is backend agnostic and then we can write implementations for AWS Lambda, Google Cloud Run, etc. The interface should be as simple as possible so that it's easy to add implementations. The best way to get started would likely be to agree on this interface.

This work also should not involve the current backend storage interfaces but should likely be brand new code. WDYT?

@modulitos
Copy link
Contributor Author

Sounds good! I'll start with a proposal for the interface, and post here once that's ready.
My availability might be a little sporadic, but I'm excited to work on this 😀

@joe-elliott
Copy link
Member

Appreciate it Swart :)

@modulitos
Copy link
Contributor Author

What do we think about making a new "external client" struct to encapsulate the concerns around sending remote requests to an external endpoint? Then we can inject our RBAC behavior into that struct.

The interface for that struct would look like this:

type externalClient interface {
	Search(ctx context.Context, maxBytes int, searchReq *tempopb.SearchBlockRequest) (*tempopb.SearchResponse, error)
}

where that Search function basically handles the logic that's currently in the searchExternalEndpoint function of the Querier struct.

This new external client struct would track the backend (eg: cloud-run or lambda), as well as the http.Client and its HedgeRequests* customizations. Then we can replace the searchClient *http.Client field on the Querier struct with an instance of this new struct.

I spiked an implementation of it here:
b47f7ca

I did a red/green smoke test using a CloudRun backend, and it's behaving as expected. I'm happy to push up a PR with docs and test updates, but I want to make sure we're aligned on the interface and design as suggested above. Looking forward to feedback! I'm available on the Slack too if that's easier.

@joe-elliott
Copy link
Member

What do we think about making a new "external client" struct to encapsulate the concerns around sending remote requests to an external endpoint? Then we can inject our RBAC behavior into that struct.

Yup, 100% agree with this approach. Another nice improvement is that the querier -> external endpoint relationship will be easier to test with this interface.

That diff is looking good! Feel free to submit a PR and I'll give it a line by line review. Thanks for following up on this.

@modulitos
Copy link
Contributor Author

Great! I submitted the PR linked above ^

I might be a little slow to respond/followup, but I will eventually get to it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants