RFC: Allowing Delegates to still use SPIRE Agent attestation #5019

Open
bleggett opened this issue Mar 26, 2024 · 10 comments
Labels
priority/backlog: Issue is approved and in the backlog
unscoped: The issue needs more design or understanding in order for the work to progress

Comments

@bleggett
Contributor

bleggett commented Mar 26, 2024

  • Subsystem: agent, [all-plugins]

In Istio, with the ambient mode we've added, we now have a node proxy written in Rust that will proxy traffic for workloads on that node.

This node proxy is obviously running in a different cgroup than the actual workloads it is privileged to act On-Behalf-Of (OBO).

The DelegateIdentity API is the best fit for this - the node proxy can attest the workloads it is acting OBO, the SPIRE Agent can attest the node proxy itself, and the node proxy can shunt the attested workload properties over to the SPIRE agent via the admin socket gRPC API. If that set of workload properties matches a workload registration, we get a cert back.
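To make the "set of workload properties matches a workload registration" step concrete, here is a minimal sketch of selector matching. It assumes the usual SPIRE rule that an entry matches only when every one of its selectors is present in the attested set. The `Selector` and `RegistrationEntry` types and the `matching_entries` function are hypothetical names for illustration, not SPIRE's actual types or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selector:
    type: str   # e.g. "k8s" (illustrative)
    value: str  # e.g. "ns:default" (illustrative)


@dataclass(frozen=True)
class RegistrationEntry:
    spiffe_id: str
    selectors: frozenset  # frozenset of Selector


def matching_entries(entries, presented):
    """Return entries whose selectors are all contained in the presented set.

    `presented` stands in for the selectors the delegate sends over the
    admin socket after attesting the workload itself.
    """
    presented = frozenset(presented)
    # An entry with no selectors is treated as matching nothing.
    return [e for e in entries if e.selectors and e.selectors <= presented]
```

With this rule, a delegate that presents extra selectors still matches, but one that is missing any selector an entry requires gets no cert for that entry.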

(I am excluding node attestation entirely from this issue, and assuming that is 100% the province of the SPIRE Agent - I am only talking about workload attestation)

That's all good.

The problem there is that as a delegate, we are now responsible for 100% of the workload attestation - the SPIRE Agent no longer is, and will in fact be unable to attest the workload the authorized delegate is acting OBO.

This means that our Rust authorized delegate either has to reimplement all the attestor plugins the SPIRE Agent supports (k8s, sigstore, etc etc) in a native library as a kind of parallel implementation to SPIRE Agent's Go plugin framework, or just reimplement a subset of them (also as a library/parallel impl) - we effectively lose the ability to lean on SPIRE Agent's pluggable workload attestation framework once we begin using the DelegateIdentity API.

This makes some amount of sense from a security perspective, otherwise you end up with a circular (somewhat pointless) trust model:

  1. Why attest the workload behind the delegated authority at all?
  2. Because you don't trust the delegated authority?
  3. But who would tell you what workload to attest?
  4. The delegated authority.

But this gets us into a spot where we cannot use the SPIRE Agent's built-in workload attestation at all - we would actually like to use SPIRE Agent's attestation framework to validate the workload behind the delegate before giving the delegate the cert. That way, we don't end up having to reimplement SPIRE Agent's entire workload attestation framework.

We would be interested in adding a solution to this ourselves in SPIRE, but want to ask:

  • Does this problem make sense?
  • Is this something you all would be amenable to design proposals around?
  • Is there any objection to enhancing the DelegateIdentity API in some way so that callers/delegates can leverage SPIRE Agent's existing attestation model, without having to implement their own?
  • Could this be as simple as saying "if you are a trusted delegate using our Admin API, we effectively will allow you to trigger SPIRE Agent to perform attestation of the workload as a prerequisite to issuing the trusted delegate that cert?"
  • If it is that simple, the problem becomes that you as the delegate need all the selectors to use the DelegateIdentity API to begin with (somewhat related to/descended from Potentially revisit Selection in Delegated Identity API #4408)
  • Could we just change the DelegateIdentity API to accept a raw cgroup path from the trusted delegate, which would cause the SPIRE Agent to attest that cgroup, and return the trusted delegate the appropriate cert, as if something inside that cgroup had directly requested a cert?
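For context on what a "raw cgroup path" would be: on Linux, each line of `/proc/<pid>/cgroup` has the form `hierarchy-ID:controller-list:cgroup-path`. The sketch below shows how a delegate might pull those paths out of the file contents before handing one to the agent. The function name and the sample paths in the usage note are illustrative assumptions, and nothing here reflects an existing DelegateIdentity API field.

```python
def cgroup_paths(proc_cgroup_text):
    """Parse the text of /proc/<pid>/cgroup into (hierarchy_id, controllers, path) tuples.

    Lines that do not have the three colon-separated fields are skipped.
    """
    entries = []
    for line in proc_cgroup_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the path itself may contain colons.
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        hierarchy_id, controllers, path = parts
        entries.append((hierarchy_id, controllers, path))
    return entries
```

On a cgroup v2 host the file typically has a single line with hierarchy ID `0` and an empty controller list, so a call such as `cgroup_paths("0::/kubepods.slice/pod123/abc\n")` yields one tuple whose third element is the path.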

tl;dr the basic problem is that the SPIRE DelegateIdentity API requires trusted delegates to effectively re-implement their own workload attestation stack, which roughly parallels the one SPIRE Agent already ships. We would rather use the SPIRE Agent workload attestation stack directly via gRPC than build our own or write our own workload attestation libraries.

@youngnick

Cilium's use of the DelegatedIdentity API has basically the same concerns, so I'm also interested here.

@bleggett
Contributor Author

bleggett commented Apr 2, 2024

I am going to take a stab at modifying the plugins to support this shortly, to see how much churn it might introduce.

@kfox1111
Contributor

Was thinking about this some and discussing it on slack...

What if we modified the regular unix socket workload API to accept an optional pidfd? If a pidfd is passed, and the client attached to the workload API socket is in the delegation list, use the pid associated with the pidfd for validation rather than the pid of the delegate. Then I think all the rest of the logic would work out?
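The decision described above can be sketched as a small function. This is only an illustration of the rule as stated in the comment: `pid_to_attest`, its parameters, and the idea of identifying delegates by SPIFFE ID are assumptions, and `delegated_pid` stands in for whatever pid the agent would resolve from the passed pidfd (that resolution is not shown).

```python
def pid_to_attest(caller_pid, caller_id, authorized_delegates, delegated_pid=None):
    """Choose which pid the agent should run workload attestation against.

    caller_pid           -- pid of the process connected to the socket
    caller_id            -- identity of that caller (assumed here to be a SPIFFE ID)
    authorized_delegates -- collection of identities allowed to act on behalf of others
    delegated_pid        -- pid resolved from an optional pidfd, or None if none was passed
    """
    if delegated_pid is None:
        # No pidfd supplied: behave like the regular workload API.
        return caller_pid
    if caller_id not in authorized_delegates:
        # Only callers in the delegation list may redirect attestation.
        raise PermissionError("caller is not an authorized delegate")
    return delegated_pid
```

Everything after this choice (running the attestor plugins against the selected pid and matching selectors) would stay as it is today, which is the appeal of the idea.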

@rturner3
Collaborator

The maintainers have been discussing this proposal and will provide some feedback after we've had a chance to think about some of the security implications.

@amartinezfayo
Member

Thank you @bleggett for opening this issue.
While we were discussing this today in our maintainers sync, we thought that it would be a good idea if you can join one instance of the SIG-SPIRE meeting to have a discussion about this. We have a lot of questions and it may be better to go through them in a meeting, where you could present your proposal and we can provide feedback. Would that work for you?

@bleggett
Contributor Author

bleggett commented Apr 18, 2024

> Thank you @bleggett for opening this issue. While we were discussing this today in our maintainers sync, we thought that it would be a good idea if you can join one instance of the SIG-SPIRE meeting to have a discussion about this. We have a lot of questions and it may be better to go through them in a meeting, where you could present your proposal and we can provide feedback. Would that work for you?

Yep, that makes sense, and I'm getting freed up a bit in a few days and this is next in my queue, so it's good timing.

I will shoot for the one next week and try to have a doc ready.

@evan2645
Member

Thank you @bleggett, that would be awesome. Please drop a note in the SPIRE slack channel if/when you're ready to present, and ping @dfeldman so we can be sure the right folks are on the call and you're added to the agenda 🙏

@bleggett
Contributor Author

https://docs.google.com/document/d/1A1oQHuR6z3bvQtXN17r2EwBr5lazGGPbUPkxoURAAh4/edit

doc (mostly a codification of above) that I intend to run thru there

@bleggett
Contributor Author

SIG-SPIRE discussion outcomes here: https://docs.google.com/document/d/1A1oQHuR6z3bvQtXN17r2EwBr5lazGGPbUPkxoURAAh4/edit#heading=h.to9i1s83kgpn

I think we are all leaning towards pid as the most portable option, provided we can be explicit about the responsibilities of the delegate around validating pid consistency (SPIRE Agent should already take care of this within its boundary).
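The thread does not spell out what "validating pid consistency" involves. One common Linux technique, shown here purely as an assumption, is to guard against pid reuse by comparing the process start time (field 22 of `/proc/<pid>/stat`) read before and after the pid is handed to the agent. Both function names are hypothetical.

```python
def parse_start_time(stat_text):
    """Extract starttime (field 22) from the contents of /proc/<pid>/stat.

    The comm field (field 2) is wrapped in parentheses and may itself contain
    spaces or ')', so parsing starts after the last ')'.
    """
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 22 sits at index 22 - 3 = 19.
    return int(rest[19])


def same_process(stat_before, stat_after):
    """True if two /proc/<pid>/stat snapshots report the same start time.

    A different start time for the same pid number means the pid was reused
    by a different process between the two reads.
    """
    return parse_start_time(stat_before) == parse_start_time(stat_after)
```

A delegate could take one snapshot when it first observes the workload and another after the agent returns, discarding the result if `same_process` is false. A pidfd avoids this race by construction, which is part of the trade-off against portability mentioned above.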

evan2645 added the priority/backlog and unscoped labels and removed the triage/in-progress label on Apr 25, 2024
bleggett added a commit to bleggett/spire-api-sdk that referenced this issue May 3, 2024
Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>
bleggett added a commit to bleggett/spire-api-sdk that referenced this issue May 3, 2024
Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>
@bleggett
Contributor Author

bleggett commented May 3, 2024

I have started hacking at this and since the API changes are probably the most controversial bit, I threw up a WIP PR for those: spiffe/spire-api-sdk#58

looking for feedback/opinions there if people have any.

bleggett added a commit to bleggett/spire-api-sdk that referenced this issue Jul 1, 2024
Signed-off-by: Benjamin Leggett <benjamin.leggett@solo.io>

7 participants