Skip to content

Conversation

@micahhausler
Copy link
Member

@micahhausler micahhausler commented Jan 20, 2019

This is my first KEP, any input on structure or details is appreciated.

Initial implementation PR is kubernetes/kubernetes#73110

/sig auth

/cc @mikedanese @liggitt @tallclair

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/pm size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 20, 2019
@mikedanese mikedanese self-assigned this Jan 20, 2019
Copy link
Member

@justaugustus justaugustus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove any references to NEXT_KEP_NUMBER and rename the KEP to just be the draft date and KEP title.
KEP numbers will be obsolete once #703 merges.


message SigningKeyReply {
// The PEM encoded bytes containing a private key
bytes private_key = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point to docs of an AWS API that would back this? The GCP APIs that offer managed key rotation do not offer key download.

https://cloud.google.com/kms/docs/reference/rest/v1/projects.locations.keyRings.cryptoKeys.cryptoKeyVersions/asymmetricSign
https://cloud.google.com/iam/reference/rest/v1/projects.serviceAccounts/signJwt

A TPM can treat a key as sealed data or an external key, but ideally the key is fully owned and non-exportable from the TPM (which can be verified by a key certification and attestation). cc @awly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't intended for an AWS API, but I've since refactored it to basically be a grpc wrapper for jose.OpaqueSigner


message VerifyingKeysReply {
// A list of public signing keys
repeated PublicKey public_keys = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to differentiate the active key?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've since broken this out into 2 calls, GetPublicKey() and ListPublicKeys()


### New API

I'm proposing creating a new versioned grpc API under `k8s.io/kubernetes/staging/src/k8s.io/apiserver/pkg/util/flag/keyreader`. This will be similar to how the KMS envelope encryption has an API at `k8s.io/kubernetes/staging/src/k8s.io/apiserver/pkg/storage/value/encrypt/envelope/v1beta1/service.proto`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until this needs to be generalized, somewhere in k8s.io/kubernetes/serviceaccount is probably preferable. k8s.io/apiserver/pkg/util/flag is for mapping generic apiserver configuration to the command line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update to be under k8s.io/kubernetes/serviceaccount


- Reading TLS serving certificates and key from a socket or reloading of the API server with new cert and key
- Reading any other certificates from a file
- Supporting out-of-process JWT signing and verifying
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very desirable in my opinion. A plugin that supports out of process signing can support downloaded keys (where the plugin driver does the signing) but not the other way around.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move this to be a goal. I had originally though of keeping this scoped down where out of process operations could be in a later KEP, but then it wasn't that hard to add support for it.

@micahhausler micahhausler force-pushed the service-account-key-pipereader branch from 6a5b5a8 to 941e4ff Compare January 21, 2019 02:57
@micahhausler micahhausler changed the title Added service account pipereader KEP Added service account external signer KEP Jan 21, 2019
@micahhausler micahhausler force-pushed the service-account-key-pipereader branch 2 times, most recently from 6718cef to f2a8c8a Compare April 17, 2019 16:33

### Implementation Details/Notes/Constraints

The API server flag `--service-account-key-file` can be specified multiple times. If all files are regular files, the existing behavior will be unchanged. If both a unix socket and files are provided, the APIserver will exit with an error. If the file is a unix socket, all verifying keys for projected tokens will be read over the socket.
Copy link
Member

@mikedanese mikedanese May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be specified multiple times

This was probably a mistake in retrospect...

Copy link
Member

@liggitt liggitt May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was probably a mistake in retrospect...

how so? it supports rotation methods where the old and new key are in different files

Copy link
Member

@mikedanese mikedanese May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What value does the flexibility provide vs requiring a single file with multiple pem blocks? The downside now is the ambiguity. I would be curious to see how many clusters are taking advantage of this.


service KeyService {
// Sign an incoming payload
rpc SignPayload(SignPayloadRequest) returns (SignPayloadResponse) {}
Copy link
Member

@mikedanese mikedanese May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What validation does the API server do on the response? Can the signer add additional claims to the token? Does it have to be a JWT? I'd prefer some guarantees here for the sake of portability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the signer could, as the API server is just sending bytes over, but this is just modeled after OpaqueSigner.SignPayload()

@mikedanese
Copy link
Member

I'm LGTM on this for alpha

// payload is the content to be signed
bytes payload = 1;
// algorithm specifies which algorithm to sign with
string algorithm = 2;
Copy link
Member

@mikedanese mikedanese May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is useful. The apiserver reads this from the ListPublicKeys response than passes it back to the signer? What happens if it gets it wrong e.g. if a active key is rotated and the new key has a different algorithm?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we are somewhat constrained by the OpaqueSigner interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the algorithm comment, it should be a list. A list would allow to specify algorithms and you are able to migrate, in case one algorithm is broken.

Copy link
Member

@enj enj Jun 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are limited in what we can do here because signer.Public().Algorithm is the alg value for SignPayload and JSONWebKey.Algorithm. I am trying to think if there is some clever way we can allow moving to a new algorithm for new signed payloads while verifying previously signed data with the old algorithm.

We need something like an OpaqueVerifier...

// OpaqueSigner is an interface that supports signing payloads with opaque
// private key(s). Private key operations preformed by implementors may, for
// example, occur in a hardware module. An OpaqueSigner may rotate signing keys
// transparently to the user of this interface.
type OpaqueSigner interface {
	// Public returns the public key of the current signing key.
	Public() *JSONWebKey
	// Algs returns a list of supported signing algorithms.
	Algs() []SignatureAlgorithm
	// SignPayload signs a payload with the current signing key using the given
	// algorithm.
	SignPayload(payload []byte, alg SignatureAlgorithm) ([]byte, error)
}

// SignatureAlgorithm represents a signature (or MAC) algorithm.
type SignatureAlgorithm string

// JSONWebKey represents a public or private key in JWK format.
type JSONWebKey struct {
	Key          interface{}
	Certificates []*x509.Certificate
	KeyID        string
	Algorithm    string
	Use          string
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah I spoke too soon:

// OpaqueVerifier is an interface that supports verifying payloads with opaque
// public key(s). An OpaqueSigner may rotate signing keys transparently to the
// user of this interface.
type OpaqueVerifier interface {
	VerifyPayload(payload []byte, signature []byte, alg SignatureAlgorithm) error
}

I believe we can have signer.Public().Key return an OpaqueVerifier that allows us to have algorithm agility.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

@enj
Copy link
Member

enj commented Jun 20, 2019

(Re-posting what I said on the ML thread)

Have we considered a simple starting point of just reloading the data from disk (that may be in the non-goals section though I do not remember why)?

I do think having external signers is valuable but will require investment in terms of some GRPC proxy bridge to an external signing service's API. Writing an updated file to disk is much easier and far less opinionated on how the rotation is done.


So I have been thinking of this KEP and the TPM talk from last KC [1]. Part of me is wondering why we do not just use PKCS11 via [2] (with some minimal bridge code to implement OpaqueSigner) and have the kube API server delegate this work to a HSM?

The advantage of a custom protobuf API is that it gives folks the flexibility to implement it in whatever way they desire. The negative is that everyone has to go write this glue code because this API is something that we made up and does not line up with anything that exists today.

My guess is that using a HSM as the external signer is likely to be the most desired use case (I welcome evidence for or against this claim). PKCS11 happens to be the industry standard for HSMs.

[1] https://www.youtube.com/watch?v=_kxmkI8Kc8Y
[2] https://github.com/ThalesIgnite/crypto11

participating-sigs: []
reviewers:
- "@mikedanese"
- "@liggit"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a t in @liggitt

@mikedanese
Copy link
Member

pkcs#11 requires cgo and dynamically loading of shared libraries which poses challenges. The extension point would be significantly different than what we tend to do in this project (gRPC over UDS) for similar extension points. The KMS plugin is pretty close to this and we made the choice to go with gRPC there. The interface of pkcs#11 is also massive which adds to the complexity of the implementation, but it doesn't entirely cover the use case specific API that's proposed here. Out the door, it's not obvious how we would derive keyid to bridge a key provider that might use names, incrementing integers, hashes, etc... for keyids. We would also would not be able to add JWT specific features to this API such as kubernetes/kubernetes#61795 if we decide that we want to later on.

@smarterclayton
Copy link
Contributor

Our RPC extension points are complex to write, poorly adopted by anyone except major vendor players, and in general don't align with our Kube patterns that have been successful.

I am not in love with the KMS plugin although I understand why it exists and some of the tradeoffs. It would be helpful to have this include an explicit list of features that cannot be implemented via file on disk so that we can argue about them. I love that people want to go build complex plugins. I die a little bit on the inside when we make all solutions require complex plugins.

@mikedanese
Copy link
Member

mikedanese commented Sep 18, 2019

If you can't do it with a file on disk, don't use something other than gRPC.

@smarterclayton, sig-auth meeting, 2019-09-18

@enj
Copy link
Member

enj commented Sep 18, 2019

@smarterclayton: "If you can't do it with a file on disk, don't use something other than gRPC."

From zoom chat:

Use controller pattern, if you can’t use a file on disk, if you can’t and have lots of potential implementors use web hooks, if you can’t use gRPC
S/can’t/don’t/

@mikedanese
Copy link
Member

Are there any alternatives to GRPC other than pkcs#11 being proposed? I've mentioned the problems with pkcs#11 above.

I would really like to land this feature as alpha in the next cycle.

@mikedanese
Copy link
Member

@smarterclayton any final thoughts?

@mikedanese
Copy link
Member

lgtm
/approve
/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 13, 2019
@enj
Copy link
Member

enj commented Dec 13, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 13, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enj, micahhausler, mikedanese

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


## Summary

The Kubernetes API server has always read service account keys from disk as the process starts, and kept them in memory for the duraiton of the server's lifetime. As the API server can now verify and issue projected volume tokens, it would be advantageous to support external signing and verifying of token data over an API, as well as reading public keys from an API.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo

@smarterclayton
Copy link
Contributor

No concerns from me

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 13, 2019
@mikedanese mikedanese force-pushed the service-account-key-pipereader branch from b1d8549 to 8d8a01c Compare December 13, 2019 22:20
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 13, 2019
@k8s-ci-robot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@mikedanese mikedanese added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 13, 2019
@k8s-ci-robot k8s-ci-robot merged commit 5df0ee9 into kubernetes:master Dec 13, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Dec 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants