[RFC] proxy endpoint for kubernetes backend API #12231

Closed

jamieklassen opened this issue Jun 22, 2022 · 18 comments

Labels
area:kubernetes Related to the Kubernetes Project Area - not deploying Backstage with k8s. rfc Request For Comment(s)

Comments

@jamieklassen
Member

Status: Open for comments

Need

VMware is building Tanzu Application Platform, a platform for building applications on Kubernetes. Most of the platform consists of CRDs and controllers for exposing high-level abstractions which make developers more productive. The Tanzu Application Platform GUI is a Backstage app, and most of the UI experiences it provides take the form of CRD UIs (similar to those discussed in #2857 and #5511) which consume the kubernetes-backend plugin for their data. However, Tanzu Application Platform GUI increasingly needs to support:

  • non-Entity-based workflows, e.g. for "operator" personas more concerned with clusters (or groups/fleets of clusters) than services
  • "inner loop" workflows where developers want to have a GUI to show them k8s objects they are actively iterating on without needing to register a (publicly-cataloged) Entity to track them.
  • performing write operations against clusters on-demand, such as creating and deleting applications or supply chains.
  • user experiences which rely on non-resource-based APIs provided by kubernetes -- showing logs from pods being the premier example.

Proposal

Add a /proxy endpoint to the kubernetes-backend plugin's API. This endpoint would allow Backstage plugins to make arbitrary Kubernetes calls (including listing objects without reference to an Entity/label selector, creating/deleting objects, and non-resource-based APIs like logs) to any cluster (or multiple/all clusters), using the same (or similar) auth translation logic as the existing endpoints.

The OpenShift console has a k8s proxy mounted at /api/kubernetes, potentially routing to multiple clusters, which does a similar kind of auth translation. I'm suggesting that Backstage have something similar (but following its own idiom, and not written in Go).

Having discussed it briefly with @mclarke, it sounds like, instead of supporting any k8s endpoint as a path nested under /api/kubernetes/proxy (similar to what OpenShift does), it might be necessary to encode the URL path of the kubernetes endpoint in question. Personally I'd prefer the "sub-path" setup suggested by the OpenShift example, but maybe some maintainers could weigh in here?

Some considerations:

  • Cluster scope: the endpoint should accept a header like X-Kubernetes-Cluster (inspired by kcp; if we want to bikeshed, OpenShift's console uses X-Cluster) to scope requests to a single cluster (or all clusters with a wildcard value? There are probably API actions where this might not make sense)
  • Auth: in a partial solution we built internally for logs, we had the client serialize a KubernetesRequestAuth as JSON in a different header (X-Kubernetes-Auth comes to mind? Maybe something to identify it as Backstage-specific?). A similar approach seems suitable here.

It would be really neat if kubernetes-based UI plugins could simply make use of a library like @kubernetes/client-node and have this API transparently "just work", but this might be a bit ambitious.
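
For what it's worth, a very rough sketch of what that could look like from a plugin's perspective, assuming a sub-path-style mount under /api/kubernetes/proxy and that the proxy accepts whatever bearer token the client supplies -- neither of which is settled by this RFC. backendBaseUrl and backstageToken are placeholders:

import { KubeConfig, CoreV1Api } from '@kubernetes/client-node';

// Placeholders -- how the base URL and token are obtained is out of scope for this sketch.
const backendBaseUrl = 'http://localhost:7007';
const backstageToken = 'SOME-TOKEN';

// Point the standard client at the Backstage proxy instead of a real apiserver.
const kc = new KubeConfig();
kc.loadFromClusterAndUser(
  {
    name: 'backstage-proxy',
    server: `${backendBaseUrl}/api/kubernetes/proxy`,
    skipTLSVerify: true,
  },
  { name: 'backstage-user', token: backstageToken },
);

// Exact method signatures vary across @kubernetes/client-node versions.
const core = kc.makeApiClient(CoreV1Api);
const pods = await core.listNamespacedPod('default');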

Example

Suppose I've got an app-config like

kubernetes:
  clusterLocatorMethods:
    - type: 'config'
      clusters:
        - name: gke-cluster
          url: https://IP-ADDRESS
          authProvider: google
          skipTLSVerify: true
          skipMetricsLookup: true

where IP-ADDRESS is the IP address of a real GKE cluster. Then I'd like to be able to make a request to the Backstage server like

GET /api/kubernetes/proxy?path=%2Fv1%2Fnamespaces%2Fdefault%2Fpods%2Fmypod-abc123%2Flog%3Fcontainer%3Dworkload HTTP/1.1
X-Kubernetes-Cluster: gke-cluster
X-Kubernetes-Auth: {"google":"GOOGLE-ACCESS-TOKEN"}

where that path URL parameter is /v1/namespaces/default/pods/mypod-abc123/log?container=workload (the usual path for a kubectl -n default logs mypod-abc123 -c workload command) URL-encoded, and GOOGLE-ACCESS-TOKEN is a valid access token that has been negotiated with accounts.google.com out of band. Then the server should respond with the logs of the workload container in the mypod-abc123 pod in the default namespace of the gke-cluster cluster.
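
For illustration, a minimal sketch of how a frontend plugin could issue that request with fetch, using the ?path= parameter and header names proposed above (backendBaseUrl and googleAccessToken are placeholders):

// Placeholders for the Backstage backend base URL and an access token obtained out of band.
const backendBaseUrl = 'http://localhost:7007';
const googleAccessToken = 'GOOGLE-ACCESS-TOKEN';

// URL-encode the Kubernetes API path, including its own query string.
const path = encodeURIComponent(
  '/v1/namespaces/default/pods/mypod-abc123/log?container=workload',
);

const response = await fetch(`${backendBaseUrl}/api/kubernetes/proxy?path=${path}`, {
  headers: {
    'X-Kubernetes-Cluster': 'gke-cluster',
    // A KubernetesRequestAuth object serialized as JSON, per the Auth consideration above.
    'X-Kubernetes-Auth': JSON.stringify({ google: googleAccessToken }),
  },
});
const logs = await response.text();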

Alternatives

We could try adding new features (pod logs, object creation, queries against specific cluster/namespace/type instead of entity-based queries) one-by-one to the existing API structure, each with its own RFC/discussion/consensus-building -- but this felt like it made iterating on each kubernetes-related feature too expensive. In theory, as more features are developed using this "unsafe" or "plumbing" API, their implementations could be used as helper methods in the kubernetes frontend plugin, or they could be references for new, safer/idiomatic API endpoints within the kubernetes-backend.

Risks

  • Load? kubernetes-backend winds up being a middle man for many kubernetes API calls, some of which (like logs) have unpalatably large response payloads
  • Maintenance burden as the kubernetes API changes over time? Not sure if this is a real concern.
  • Security in general; this proposal would render Backstage a full-fledged deputy/control plane for multiple kubernetes clusters.
@jamieklassen jamieklassen added the rfc Request For Comment(s) label Jun 22, 2022
@benjdlambert benjdlambert added the area:kubernetes Related to the Kubernetes Project Area - not deploying Backstage with k8s. label Jun 23, 2022
@benjdlambert
Member

@mclarke47 maybe something you want to have a read through also 🙏

@mclarke47
Collaborator

I really like this idea.

Do we foresee the need to add the entity ref to the query string when running a command against multiple clusters, and then allow the service locator to decide which clusters to route it to based on the entity?

Then the endpoint could make requests against one cluster or multiple clusters; however, this might not be necessary.

@liamrathke
Contributor

Hi everyone, I'm an intern at VMware working on an implementation of this RFC under @jamieklassen - my goal for the summer is to submit a PR with this functionality.

I have a mostly functional proxy that can make a request against a single cluster, and would like to add multi-cluster requests next, so I'd like to start a discussion about what we'd want those to look like from an end-user's perspective.

It seems like there are three possible cluster combinations we would want to account for:

  • Single cluster
  • Multiple clusters
  • All clusters (wildcard)

For each scenario, we will need to ensure that the proper auth credentials are available. Since the Kubernetes REST API wants an Authorization: Bearer <TOKEN> header, we will need bearer tokens for every cluster that is being queried. In other words, we will need to somehow match each cluster name with a valid auth token to make a proper request.


Given these requirements, I think it might make sense to move the two proposed headers (for cluster names and tokens) into one header that would contain a map with cluster names as keys and tokens as values. Here's what this could look like across all scenarios:

Single cluster:

X-Kubernetes-Clusters:

{
  "my-cluster": "my-token"
}

or, if the proxy should default to the existing token for the cluster (from app-config.yaml),

{
  "my-cluster": ""
}

Multiple clusters:

X-Kubernetes-Clusters:

{
  "my-cluster-1": "my-token-1",
  "my-cluster-2": "my-token-2",
  "my-cluster-3": ""
}

All clusters (wildcard):

X-Kubernetes-Clusters: "{}"


Thoughts?

@mclarke47
Collaborator

mclarke47 commented Jul 6, 2022

Hi @liamrathke, it seems like this is for client-side auth like OIDC and Google auth, which are sent in the body of the other backend requests.

I'm not a huge fan of very complicated headers; putting necessary parameters in headers can make the API harder to use and understand. However, I'm not sure the alternatives are much better.

auth in the query string

One alternative is to include the auth details in the query string and use some convention to pass them, for example:

base64-encoding (URL-safe!) the object

{
  "my-cluster-1": "my-token-1",
  "my-cluster-2": "my-token-2",
  "my-cluster-3": ""
}

to ewogICJteS1jbHVzdGVyLTEiOiAibXktdG9rZW4tMSIsCiAgIm15LWNsdXN0ZXItMiI6ICJteS10b2tlbi0yIiwKICAibXktY2x1c3Rlci0zIjogIiIKfQ

and putting it in the query string:

backend:8080/kubernetes/proxy?path=proxy/path/stuff&authClusterMap=ewogICJteS1jbHVzdGVyLTEiOiAibXktdG9rZW4tMSIsCiAgIm15LWNsdXN0ZXItMiI6ICJteS10b2tlbi0yIiwKICAibXktY2x1c3Rlci0zIjogIiIKfQ
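
For reference, a sketch of how a client might produce that URL-safe value in Node (authClusterMap is the query parameter name suggested above):

// The cluster -> token map from above; an empty string means "use the server-side default".
const authClusterMap = {
  'my-cluster-1': 'my-token-1',
  'my-cluster-2': 'my-token-2',
  'my-cluster-3': '',
};

// Node 15.7+ supports the URL-safe 'base64url' encoding directly;
// on older versions, replace '+' and '/' and strip '=' padding by hand.
const encoded = Buffer.from(JSON.stringify(authClusterMap)).toString('base64url');

const url = `backend:8080/kubernetes/proxy?path=proxy/path/stuff&authClusterMap=${encoded}`;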

POST

The other option is switching the endpoint to a POST and passing the auth in the body like the other endpoints. The advantage here is you can reuse the existing types.

For server-side auth (like a Google service account) we don't need to pass auth at all, because the request will get decorated with auth in the backend.

Let me know what you think!

@Rugvip
Member

Rugvip commented Jul 7, 2022

Gonna chime in and strongly discourage placing credentials in the query string. It's very common that they will be exposed through logs or proxies that way. Keep credentials in the body or headers

@mclarke47
Collaborator

@Rugvip sounds good. @liamrathke let's do either an encoded header or POST. I'll leave it up to you to decide

@liamrathke
Contributor

@Rugvip @mclarke47 thanks for the feedback, I'm inclined to go with an encoded header so that we can have parity with the Kubernetes HTTP API as far as request methods are concerned - we won't need to decide whether an incoming POST request should be sent to Kubernetes as a GET or a POST

@liamrathke
Contributor

Update - I'm working on my multi-cluster request implementation and realized that there's still some ambiguity about what the response from a multi-cluster request should look like. There are two main approaches I thought of which we could take:


Option 1

Aggregate the items from each response into one large items object, so that the full response for a multi-cluster request to cluster-1 and cluster-2 would look something like this:

{
  "kind": "PodList",
  "apiVersion": "v1",
  "metadata": {
    "resourceVersion":"10245",
    ...
  },
  "items": [CLUSTER-1-ITEMS, CLUSTER-2-ITEMS]
}

This would be useful if we expect the multi-cluster proxy feature to be mostly used for aggregating data from multiple clusters. If users need to trace an item in the aggregated items list back to its source, maybe we could update each item to include its original cluster name in its metadata field. I was initially planning on going with this approach, because (correct me if I'm wrong) I see multi-cluster requests primarily as a convenience feature. If users want to make one request instead of multiple, they probably don't want to have to combine the data from multiple responses themselves.

Pros:

  • Potentially easier format from a user's perspective
  • Same format as single-cluster request

Cons:

  • What do we do if one of the clusters returns an error or items in a different format?
  • If we add metadata, we would be modifying the Kubernetes response data, which might impact a potential future integration with the JavaScript Kubernetes client (?)

Option 2

Map each cluster name to its response - for a multi-cluster request to cluster-1 and cluster-2, it might look something like this:

{
  "cluster-1": {
    "kind": "PodList",
    "apiVersion": "v1",
    "metadata": {
      "resourceVersion":"10245",
      ...
    },
    "items": [CLUSTER-1-ITEMS]
  },
  "cluster-2": {
    "kind": "PodList",
    "apiVersion": "v1",
    "metadata": {
      "resourceVersion":"10245",
      ...
    },
    "items": [CLUSTER-2-ITEMS]
  }
}

Pros:

  • Raw response from Kubernetes provided for each cluster, no ambiguity about what the original response looked like

Cons:

  • No real difference from just making multiple single-cluster requests; is this feature even needed?
  • What do we do about single cluster requests? Do we still map the response to its cluster name? Would this impact an integration with the Kubernetes client as well?

Thoughts on either approach? Are there any specific scenarios we should be optimizing the multi-cluster request feature for? CC @jamieklassen @mklanjsek

@mklanjsek

@liamrathke the first option feels like a better approach to me, mostly because multi and single cluster responses have the same format, as long as filtering response entries by cluster is easy.

@mclarke47
Collaborator

@liamrathke I think the existing endpoints use option 2 because of the possibility that some clusters can return an error and some don't. It might be good to standardise our multi-cluster responses with a similar format, which would make me favour 2

@liamrathke
Contributor

@mclarke47 makes sense, I ended up going with that approach (Option 2). Currently working on unit tests, will open a PR once they are ready.

@chinigo

chinigo commented Jul 22, 2022

I'd like to make an argument for the first response format as well.

Sticking to the format that the K8s API provides will allow any standard Kubernetes client to use this proxy. I agree with @jamieklassen when he says "this might be a bit ambitious," but I think it's within the realm of possibility and worth the attempt.

Library & client compatibility

From the perspective of a plugin author taking advantage of this proxy, say some hot new query parameter gets added to the Kubernetes API. It would be great if I could bump the @kubernetes/client-node version in my code, point it at the proxy endpoint, and have it work transparently.

And from the perspective of the maintainers of this proxy, it would be great to not be in a position where changes to the K8s API require changes to the proxy.

I think this is possible if we take a light touch and limit the proxy to be purely additive to the Kubernetes API — Embrace and extend.

Multi-cluster results

Limiting results to specific clusters

Adding an X-Kubernetes-Cluster header is purely additive, straightforward to implement (e.g. using the Node client's applyToRequest), and valuable because it allows client code to limit the scope of its requests.

I don't see any problem adopting this as suggested, although we might want to be a bit more specific about how its content is structured. Are we talking, like, a comma-delimited list of the cluster names from app-config.yaml? Wildcards for pattern matching (*-dev might be nice)?

Associating results to their clusters

Instead of dedicating a top-level section for each cluster, we could indicate the source cluster for each item in the response by appending a namespaced annotation:

{
  "apiVersion": "v1",
  "items": [
    {
      "apiVersion": "v1",
      "kind": "Pod",
      "metadata": {
        "annotations": {
          "backstage.io/kubernetes-cluster-name": "cluster-a"
        }
      }
    },
    {
      "apiVersion": "v1",
      "kind": "Pod",
      "metadata": {
        "annotations": {
          "backstage.io/kubernetes-cluster-name": "cluster-b"
        }
      }
    }
  ]
}

Reporting errors

As @mclarke47 points out, multi-cluster queries can result in partially-successful responses, and we'll want to report the failures somehow. One additive way we could report them is by appending an errors stanza to the proxy's response:

{
  "apiVersion": "v1",
  "items": [{
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
      "annotations": {
        "backstage.io/kubernetes-cluster-name": "cluster-a"
      }
    }
  }],
  "errors": {
    "cluster-a": null, // Successful response from this cluster
    "cluster-b": {
      "code": "500", // sic, don't want JS to interpret this as a float
      "message": "Server error" // Message copied directly from server response
    },
    "cluster-c": {
      "code": "403",
      "message": "Forbidden"
    }
  }
}

Maybe it's even worth naming this section _errors to indicate its subversiveness. Standard clients would ignore this additional stanza, and we could maybe ship some TS typedefs for clients who cared about it.

It's tempting to go with a custom 2xx status code like 210 Partial Success in the mixed-result scenario, but a flat 200 has several properties to recommend it:

  • Guarantees standard clients would treat the successful part of the response correctly (although they would be oblivious to the failure part). A 2xx would likely succeed, but who knows what clients out in the wild are coded to expect.
  • The proxy can start streaming its response as soon as any of its fanned-out requests return a successful result. (Otherwise you'd have to wait until all requests succeeded or at least one failed to know if you should send a 200 or a 210.)

Performance

One drawback of the existing plugin-kubernetes-backend is that each inbound request results in N*M fanned-out requests against the Kubernetes APIs, where N is the number of registered custom resources and M is the number of managed clusters.

The backend waits until every one of these requests completes or a 30-second timeout has elapsed before it starts to send the concatenated response. For large numbers of managed clusters — especially in a dev-environment scenario where infrastructure can be volatile — it becomes more and more likely that at least one of those requests hangs. The effect on the end user in these cases is that every request takes a full 30 seconds before anything can be displayed in the UI.

The proxy approach mitigates this problem to a large extent. Clients gain control over both N (by querying only for relevant resources) and M (in cases where the client knows in advance which cluster(s) contain the resource(s) of interest).

The problem of a slow cluster remains, but I think we can dramatically improve on the latency of the response if we're careful about its format. Two properties are desirable here, both of which are satisfied by the format proposed above:

Items from different clusters are interleaved

  • This property allows the proxy to open multiple fanned-out connections in parallel, and start streaming the successful responses as soon as they start coming in.
  • It also has the benefit of reducing the proxy's peak memory usage — you only need to store one partially-complete item per cluster at a time before you flush the write buffer.

Timeout errors come at the very end of the response

  • Timeout errors will likely occur after all the successful responses have been completed. If we put those errors at the end of the response, we'll be able to start streaming the successful responses without having to wait for the full timeout.
  • This does have the downside of not being able to display error messages until all the successes have completed, but I think this tradeoff is in the right direction.

To be clear, I'm not suggesting that we immediately make the proxy smart enough to interleave a streamed response like this — we should probably start with a naive JSON.parse(response.body) instead of a SAX-like event-based parser — but it'd be nice if our response format allowed the possibility.
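
To make that concrete, here is a naive, non-streaming sketch of the fan-out described above -- the ClusterDetails shape is hypothetical, and the annotation/errors format follows the one sketched earlier in this comment:

// Hypothetical shape for the clusters the proxy fans out to.
interface ClusterDetails {
  name: string;
  baseUrl: string;
  token: string;
}

// Fetch every cluster in parallel, JSON.parse each body, tag items with their
// source cluster via the backstage.io/kubernetes-cluster-name annotation, and
// collect per-cluster errors into the trailing "errors" stanza.
async function proxyList(clusters: ClusterDetails[], path: string) {
  const items: any[] = [];
  const errors: Record<string, { code: string; message: string } | null> = {};

  await Promise.all(
    clusters.map(async cluster => {
      try {
        const res = await fetch(`${cluster.baseUrl}${path}`, {
          headers: { Authorization: `Bearer ${cluster.token}` },
        });
        if (!res.ok) {
          errors[cluster.name] = { code: String(res.status), message: res.statusText };
          return;
        }
        const body: any = JSON.parse(await res.text());
        for (const item of body.items ?? []) {
          item.metadata = {
            ...item.metadata,
            annotations: {
              ...item.metadata?.annotations,
              'backstage.io/kubernetes-cluster-name': cluster.name,
            },
          };
          items.push(item);
        }
        errors[cluster.name] = null;
      } catch (e) {
        errors[cluster.name] = { code: '500', message: String(e) };
      }
    }),
  );

  return { apiVersion: 'v1', items, errors };
}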

Authentication

The current plugin-kubernetes-backend is limited to read-only operations, which limits the scope for abuse. Opening up a more-or-less transparent proxy expands that surface area dramatically.

My main concern would be a privilege escalation attack where a Backstage user is 1) authorized to use a plugin that relies on this proxy but 2) is not authorized to make modifications to Kubernetes clusters.

In particular, I worry about abuse of a service account. A single Backstage service account per cluster will necessarily be granted the union of all the privileges necessary to drive all of a plugin's use cases. Plugins can lock down parts of their UX based on the logged-in user, but system operators cannot enforce the same constraints at the Kubernetes API layer.

Perhaps the use cases this proxy enables are distinct enough from what plugin-kubernetes-backend supports that we shouldn't use the same configured credentials to drive it?

My first instinct is to defer this problem to the operator. The proxy would expect calls to supply an opaque Authorization header, and it will blindly send them along in its fanned-out requests. Operators who want to support multi-cluster operations can use their platforms to respect that shared token at their discretion (using IAM or Pinniped or whatever).

This eliminates the possibility of server-wide authentication, but maybe that's a good thing because it precludes the service account escalation attack vector

Another benefit of this opaque-token approach is that it's supported by standard clients. Adding our own semantics to the contents of the Authorization header eliminates that possibility.

Ok, that's it! Let me know what you think. Glad we're tackling this proposal, it opens up a bunch of new use cases for us.

Cheers.

@liamrathke
Contributor

Hi everyone - just wanted to give a quick update on where it looks like things are headed, based on some internal VMware discussions about the RFC. I'm a big fan of @chinigo's suggestion, but aggregating the Kubernetes API responses into a single Kubernetes response object is only really possible if the response is structured as a list already. In other words, if the response from Kubernetes is designed to return a single object only (e.g. when creating an object), we would need to fundamentally modify the Kubernetes response to accommodate multiple items.

At that point, we would essentially be writing a wrapper for part of the Kubernetes API, which somewhat defeats the purpose of the proxy. For this reason, I'm inclined to stick with Option 2, which has already been implemented.

Adding the ability to create objects is one of the primary reasons we're looking to add the proxy in the first place, so a format that properly handles this use case is critical. I'm not sure there's a simple implementation that balances a) developer convenience when accessing data from multiple clusters with b) parity with standard Kubernetes cluster responses.

@liamrathke
Contributor

Another RFC update - I have removed the "fallback" option for making proxy requests without credentials, after discussing internally with the VMware Backstage team. Allowing requests without credentials could pose a security issue, since anyone with knowledge of the Backstage IP could perform arbitrary requests against the Kubernetes cluster without authenticating first.

@jamieklassen
Member Author

jamieklassen commented Oct 25, 2022

Let's just do single-cluster

After long deliberation and a brief discussion with @luchillo17, I'd like to narrow the scope of this RFC even further -- in the first pass, let's make the /proxy handler only target a single cluster at a time, with the following simplified behaviour (a rough sketch follows the list):

  1. Be instantiated with a singleton KubernetesClustersSupplier (i.e. cluster locator), similar to the /clusters handler.
  2. When receiving a request, inspect the contents of the X-Kubernetes-Cluster header and use the injected KubernetesClustersSupplier to find a cluster with the matching name. In practice, I expect this to mean using a CombinedClustersSupplier to load all clusters into memory and do a linear search through them, but an obvious improvement would be to add a getClusterByName method to the KubernetesClustersSupplier interface so that each implementation can provide an optimized solution (like a DB lookup in the case of the CatalogClusterLocator). At the moment there is no guarantee within the kubernetes subsystem that clusters have unique names (let's file an issue about this), so if this procedure results in multiple clusters, fail the response. Not sure what response code makes sense but we can bikeshed about that later.
  3. Modify the request so that it points to the url of the correct cluster, uses the right TLS configuration as defined by the cluster's details, and strips the /api/kubernetes/proxy prefix from the request
  4. Perform the modified request and return the response to the caller verbatim.
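
As mentioned above, here is a rough sketch of what such a handler might look like (Express-style; the helper types are hypothetical stand-ins for the kubernetes-backend's real interfaces, and the actual implementation may well differ):

import { Router, Request, Response } from 'express';

// Hypothetical stand-ins for the kubernetes-backend's cluster details and supplier.
interface ClusterDetails {
  name: string;
  url: string;
  skipTLSVerify?: boolean;
}
interface KubernetesClustersSupplier {
  getClusters(): Promise<ClusterDetails[]>;
}

export function createProxyRouter(clusterSupplier: KubernetesClustersSupplier): Router {
  const router = Router();

  router.use('/proxy', async (req: Request, res: Response) => {
    // Step 2: resolve the target cluster from the X-Kubernetes-Cluster header.
    const clusterName = req.header('X-Kubernetes-Cluster');
    const clusters = await clusterSupplier.getClusters();
    const matches = clusters.filter(c => c.name === clusterName);
    if (matches.length !== 1) {
      // Cluster names are not guaranteed to be unique today, so fail on 0 or >1 matches.
      res.status(matches.length === 0 ? 404 : 500).end();
      return;
    }
    const cluster = matches[0];

    // Step 3: re-point the request at the cluster's url. Because this router is
    // mounted under /api/kubernetes, req.url here no longer carries the
    // /api/kubernetes/proxy prefix. (TLS handling per skipTLSVerify is elided.)
    const targetUrl = `${cluster.url}${req.url}`;

    // Step 4: perform the request and return the response verbatim (Node 18+ global fetch).
    // Auth handling is simplified here to forwarding the caller's Authorization header.
    const upstream = await fetch(targetUrl, {
      method: req.method,
      headers: { Authorization: req.header('Authorization') ?? '' },
    });
    res.status(upstream.status).send(await upstream.text());
  });

  return router;
}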

It feels right to use this highly-simplified implementation for now because it was so hard to resolve the problems with a multi-cluster proxy. These included:

  • encoding auth information for multiple clusters at once is hard -- we talked about JSON-and-base64-encoded stuff in request bodies or headers, supplying headers multiple times with semicolon-delimited key-value pairs, but it all felt really awkward, kludgey and confusing.
  • aggregating responses from multiple clusters at once is hard for a few reasons:
    • if you define a standardized JSON response envelope, then you kinda prevent yourself from supporting non-JSON content types (at least not without more awkwardness in a lot of cases) -- which can even prevent the pod logs use case from working!
    • if you try to cleverly interleave the responses so that they act the same as if they came from a single cluster, it's hard to get right AND it doesn't even make sense for non-List-type responses
  • API verbs like patch don't really feel sensible to multiplex like this.

In the initial post of this RFC, I introduced a number of problems that I wanted to solve, and I now feel that I introduced too many. The proxy design I'm suggesting here does help with:

  • "inner loop" workflows where develoeprs want to see k8s objects without having to register an Entity to track them -- provided these workflows interact with a single cluster at a time.
  • performing write operations against clusters on-demand
  • user experiences which rely on non-resource-based APIs provided by kubernetes

The needs that this design does not address, which I think can either be postponed to a new feature or even supported by the existing endpoints:

  • non-Entity-based workflows for "operator" personas more concerned with clusters (or groups/fleets of clusters) than services

@luchillo17 is going to take over #13026 and move it in this new direction starting now, and @mclarke47 @freben I don't imagine you'll have any issues with this, but please share your thoughts if you have them!

@jamieklassen
Member Author

To put a finer point on part of the above:

  1. Modify the request so that it points to the url of the correct cluster, uses the right TLS configuration as defined by the cluster's details, and strips the /api/kubernetes/proxy prefix from the request

means that if a cluster has a url: field with a scheme of https:// and skipTLSVerify: true, then the proxy should not check its certificate before issuing requests.
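
In Node terms, a small sketch of what that could mean for the outbound request (the cluster object is a placeholder for the resolved cluster details; https.Agent's rejectUnauthorized option is standard Node):

import https from 'https';

// Placeholder for the cluster details resolved from app-config (see the example above).
const cluster = { url: 'https://IP-ADDRESS', skipTLSVerify: true };

// Skip certificate verification only when skipTLSVerify is set for this cluster.
const agent = new https.Agent({ rejectUnauthorized: !cluster.skipTLSVerify });

https.get(`${cluster.url}/api/v1/namespaces`, { agent }, res => {
  res.pipe(process.stdout);
});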

@freben
Member

freben commented Nov 10, 2022

I just wanted to say that I've read through this and it sounds sane. Hopefully it'll be good enough for most use cases to simply issue several concurrent requests from the client side instead of multiplexing, especially if the server side has some internal debouncing of heavy requests like cluster lists etc.

@jamieklassen
Member Author

We've got a first version of the proxy now! Plenty more ideas about how to improve and document it, but I think those can take the form of their own issues.
