Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: simple ClusterInfo object and discovery methods/APIs #30707

Closed
wants to merge 1 commit into from

Conversation

jbeda
Copy link
Contributor

@jbeda jbeda commented Aug 16, 2016

This is a design proposal (forked from https://docs.google.com/document/d/1GVMLTBrEH5kXGxo0fWTqiRxG4dSxQQzyUGLA7PxhzRY/edit#) for how clients can easily figure out how to talk to a cluster and keep that information up to date.

Related issues: #30395, #30360, #30361, #5754


This change is Reviewable

@k8s-github-robot k8s-github-robot added kind/design Categorizes issue or PR as related to design. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Aug 16, 2016

Any client of the cluster will want to have this information. As the configuration of the cluster changes we need the client to keep this information up to date. It is assumed that the information here won’t drift so fast that clients won’t be able to find *some* way to connect.

In exceptional circumstances, it is possible that this information may be out of date and a client would be unable to connect to a cluster. If a user has kubectl set up and working well pointing to raw IP addresses and doesn’t run it for a long time it is possible that both (a) the set of servers will have migrated so that no IP is good or (b) the root certificates will have rotated so that the user can no longer trust the endpoint.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "If a user has kubectl ... " sentence is both run-on and hard to parse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will clarify in next push.

@brendandburns brendandburns added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Aug 17, 2016
@lukemarsden
Copy link
Contributor

lukemarsden commented Aug 17, 2016

@jbeda

Thanks for this. I have a couple of high level review comments:

  1. It seems to me that the ClusterInfo section is somewhat orthogonal to the Methods section. The ClusterInfo section seems to overlap with design and implement kubeconfig v2 api version #30395. The ClusterInfo proposal also seems to overlap a lot with what kubeconfig can already express. Perhaps it makes sense to split out the ClusterInfo section as a separate proposal.
  2. In terms of what we can land in a kubeadm alpha release that works with 1.4, we'll need to use a mechanism for ClusterInfo that kubelet already supports, and AIUI that is the existing kubeconfig file format.

What I'll do if it's OK with you is update the proposal in #30360 to refer to the JWS section of your proposal. Let me know if you think it makes sense to split the proposals out into a ClusterInfo proposal which might be unified with #30395 versus a separate JWS proposal (could also include the other discovery methods) which can be made a sub-proposal of #30360.

@lukemarsden
Copy link
Contributor

One other thing @jbeda, it would be good if the title of this proposal could be updated to reflect that it also includes a set of discovery proposals. Otherwise it looks at first glance like this is just an alternative to #30395, and it's actually much more than that.

@deads2k
Copy link
Contributor

deads2k commented Aug 17, 2016

@detiber fyi. This affects adding nodes to a cluster.

A81E5d4DwI.0ok9tB1QhB
```

The first part of the token is the `token-id`. The second part is the `token-secret`.
Copy link
Contributor

@lukemarsden lukemarsden Aug 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the token-id come from? In #30360 we talk about being able to have a config management system generate and pass in a random string to kubeadm init as an optional arg.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just another random string that should be able to be passed in. The idea is that one half is public and the other half is private and never sent over the wire (at least with this API).

We could just use a single token but then we'd have no way for the client to say which token they want to have the JWS signed with. It also means we can scale if we have a lot of tokens in flight at some point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the token IDs when it comes time to verify the JWS, this way we're not transmitting the secret we'll use to do that with the request itself, so that much less likely to get compromised data.

kubeadm will hopefully just take the ID.TOKEN form directly copied from the master init command output, user shouldn't have to worry about splitting it up into ID or token, it's just a token to them.

@jbeda jbeda changed the title A proposal for a simple ClusterInfo object and API A proposal for a simple ClusterInfo object and discovery methods Aug 17, 2016
@jbeda jbeda changed the title A proposal for a simple ClusterInfo object and discovery methods Proposal: simple ClusterInfo object and discovery methods/APIs Aug 17, 2016
@jbeda
Copy link
Contributor Author

jbeda commented Aug 17, 2016

@lukemarsden Fixed title. And I agree there is a lot of overlap with #30395. I'd love to rationalize. And obviously in the meantime we'll have to write out old kubeconfig as that is what the kubelet knows.

We really need to close on the design #30395 to really close on this.


Implementation note: We'll probably want to jitter this to avoid a stampede.

#### Future Method: DNSSEC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this future method mentioned and gossip is not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add gossip but I'm not sure that it improves the user experience. For both you specify a token and a starting address. The JWS solution seems simpler. Happy to add a note if you think it is important.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can leave both out.

Assuming the existence of DNSSEC infra (which is a nightmare to setup up but if you got it already, might as well use it :) ), SRV/TLSA basically solves discovery altogether. No need to go through this process, just start talking to the apiservers at a well known domain name. What benefit does incorporating this process add?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good -- I'll leave it out.

@mikedanese
Copy link
Member

@kubernetes/sig-cluster-lifecycle

@mikedanese
Copy link
Member

ref #28422


## Overview

It is surprisingly hard to figure out how to talk to a Kubernetes cluster. Not only do client need to know where to look on the network, they also need to identify the set of root certificates to trust when talking to that endpoint.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do clients or does client

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in latest push

@bgrant0607
Copy link
Member

Please discuss the DoS issue with @kubernetes/sig-api-machinery before adding a new apiserver mechanism for that.

@jbeda
Copy link
Contributor Author

jbeda commented Nov 1, 2016

Reached out to @kubernetes/sig-api-machinery on slack. Summary of options here.

  1. Have kube-public namespace be open to unauthenticated users
    1. Don't sweat the DoS issues
    2. Implement rate limiting for unauthenticated users.
  2. Have the kube-public namespace work only available to authenticated users
    1. Implement something like a JWT+nonce auth scheme. We can't simply send the token to the server during bootstrap as we don't trust the server yet.

      This is further complicated by HA. Generally the server will keep track of nonces. That means we either need shared and/or replicated storage for nonces between API servers or we need to forward this type of auth request on to a "leader" (and the API server doesn't leader elect yet). Furthermore if we just store the nonces in memory we need to worry about someone forcing a reboot or leader election to "clear" the nonce cache.

      Wrt nonces -- we can follow the pattern from Let's Encrypt and the ACME spec/boulder impl where a possible nonce is served with every request that can be used. See the ACME spec and the Boulder impl.

I suggest we don't worry about this for now. If we really are concerned about DoS attacks we can do a rate limit.

@jbeda
Copy link
Contributor Author

jbeda commented Nov 2, 2016

@mikedanese @roberthbailey

I think we've worked through all of the objections. Here is a link to the discussion with the API machinery folks: http://kubernetes.slackarchive.io/sig-api-machinery/-/1477579324.000588/1477961700.0007/1477961700000700/

Ready to merge this?

@luxas
Copy link
Member

luxas commented Nov 2, 2016

+1 for that

Are we aiming to get the controller written in time for v1.5 (disabled by default)?
We'd better be quick then :)

Or are we going to defer that to v1.6?
It's probably too little time before the freeze, but who knows, it may be possible if we're collaborating quickly at the meetup on Monday...

@luxas
Copy link
Member

luxas commented Nov 12, 2016

I think we can merge this now, based on the general approval of this approach during the sig in-person meeting and the dev summit.

=> @roberthbailey @mikedanese Please give your lgtms

@luxas
Copy link
Member

luxas commented Nov 23, 2016

ping @mikedanese @roberthbailey @jbeda Anything more to do on this doc or can it be lgtm'd?

@mikedanese
Copy link
Member

I'll give this a once over today and see if this can't be lgtmd


While we could define a new format for communicating the set of information needed here, we'll start by using the standard [`kubeconfig`](http://kubernetes.io/docs/user-guide/kubeconfig-file/) file format.

It is expected that the `kubeconfig` file will have a single unnamed `Cluster` entry. Other information (especially authentication secrets) must be omitted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this possible? Clusters maps to an internal map. Should I expect "" to be the struct for the bootstrap cluster in that map?

https://github.com/kubernetes/kubernetes/blob/master/pkg/client/unversioned/clientcmd/api/types.go#L43

@mikedanese
Copy link
Member

Proposal looks fine. I dislike the command line flags, but I don't think that's too important to get right in a proposal if you are willing to be flexible.

@mikedanese mikedanese added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. labels Nov 28, 2016
@k8s-github-robot
Copy link

Adding label:do-not-merge because PR changes docs prohibited to auto merge
See http://kubernetes.io/editdocs/ for information about editing docs

@k8s-github-robot k8s-github-robot added kind/design Categorizes issue or PR as related to design. kind/old-docs do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. labels Dec 1, 2016
@jbeda
Copy link
Contributor Author

jbeda commented Dec 5, 2016

@mikedanese I'm totally flexible on the command line flags. How about we get this in and then bike shed on that in a separate PR/Issue?

Also, I have no idea what the merge-bot is talking about. How are changes to design docs supposed to work? The link that the merge robot provided doesn't make any sense in this context. A lot of searching takes me to kubernetes-retired/contrib#2101.

I'll take it as an item to move it over to the community repo and refer to this PR.

@mikedanese mikedanese added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. labels Dec 5, 2016
@mikedanese
Copy link
Member

LGTM

@k8s-github-robot
Copy link

Adding label:do-not-merge because PR changes docs prohibited to auto merge
See http://kubernetes.io/editdocs/ for information about editing docs

@k8s-github-robot k8s-github-robot added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Dec 5, 2016
@jbeda
Copy link
Contributor Author

jbeda commented Dec 17, 2016

Moved to community repo in kubernetes/community#189.

@jbeda jbeda closed this Dec 17, 2016
k8s-github-robot pushed a commit that referenced this pull request Dec 20, 2016
Automatic merge from submit-queue

Implement kubeadm bootstrap token management

Creates bootstrap tokens as secrets per the specification in #30707 

_WARNING_: These are not currently hooked up to the discovery service or the token it creates.

Still TODO:
- [x] delete tokens
- [x] merge with #35144 and adopt it's testing approach
- [x] determine if we want wholesale json output & templating like kubectl (we do not have an API object with the data we want here) may require a bit of plumbing.
- [x] allow specifying a token duration on the CLI
- [x] allow configuring the default token duration
- [x] hook up the initial token created during init

Sample output:

```
(root@centos1 ~) $ kubeadm token create
Running pre-flight checks
<cmd/token> Token secret created: f6dc69.c43e491752c4a0fd
(root@centos1 ~) $ kubeadm token create
Running pre-flight checks
<cmd/token> Token secret created: 8fad2f.e7b78c8a5f7c7b9a
(root@centos1 ~) $ kubeadm token list  
Running pre-flight checks
ID        TOKEN                     EXPIRATION
44d805    44d805.a4e78b6cf6435e33   23h
4f65bb    4f65bb.d006a3c7a0e428c9   23h
6a086e    6a086e.2ff99f0823236b5b   23h
8fad2f    8fad2f.e7b78c8a5f7c7b9a   23h
f6dc69    f6dc69.c43e491752c4a0fd   23h
f81653    f81653.9ab82a2926c7e985   23h
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/security do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. kind/design Categorizes issue or PR as related to design. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.