Do not break CoreOS when deprecating kubernetes-ro #9075

Closed
erictune opened this issue Jun 1, 2015 · 30 comments

@erictune
Member

erictune commented Jun 1, 2015

CoreOS does not have an equivalent to "kube-addons.sh", so no system secrets are created when the cluster is created. As a result, when some of our CoreOS users run things like the Hazelcast example or the elasticsearch example, they currently rely on kubernetes-ro to reach the apiserver. When kubernetes-ro support is removed (#8155), these examples will still work on GCE, because they can use the kubeconfig in their associated secret. On CoreOS, however, it seems they will break.

Possible fix alternatives:

  1. Change #8155 (Remove ro service) so that it keeps support for the --readonly_port feature but defaults it to off for GCE.
  2. Change all known examples to include instructions on how to generate a secret when setting up the example.
  3. Extend CoreOS cluster setup to include the kube-addons.sh secret-creation functionality.
  4. @lavalamp has a plan that uses a proxy in a sidecar.

Option 1 is nice because it allows us to defer fixing the problem until after 1.0 freeze.

@erictune
Member Author

erictune commented Jun 1, 2015

@pires

@erictune
Member Author

erictune commented Jun 1, 2015

My preferred solution is #9076.

@erictune
Member Author

erictune commented Jun 1, 2015

Under this solution, we would update the docs for hazelcast, elasticsearch, etc, to show how to create a service account for the pods, and tweak the code in the clients to look for the token in the right place.

@pires
Contributor

pires commented Jun 1, 2015

@erictune I'm OK with the current elasticsearch example approach, where we create a service account. The thing is, we need a token that is supposedly available in the output of kubectl config view, but in my/our case this is empty. So no token, no service account, no fun.

I'm open to any of the proposed solutions but favor #9076 as well, as it feels more secure than just leaving a read-only port accessible.

@erictune erictune added priority/backlog Higher priority than priority/awaiting-more-evidence. team/cluster labels Jun 1, 2015
@erictune erictune added this to the v1.0-candidate milestone Jun 1, 2015
@lavalamp
Member

lavalamp commented Jun 2, 2015

Pods get a default service account, so you don't have to do anything special to get that.

You do have to either start using the auth info from the service account (preferred solution), or if you have a legacy container that you can't modify, run kubectl in a side-car container and use its proxy feature to get a local proxy of apiserver that doesn't need auth for your legacy application. #8155 adds support for this, and I'll soon have an example of how easy it is.
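
For readers who want a picture of the side-car option before that example lands, here is a minimal sketch of what a legacy client could look like when it talks to a local `kubectl proxy` instead of the apiserver directly. The base URL assumes the proxy's default listen address, and the `/api/v1/...` path, the function names, and the `default` namespace are illustrative assumptions rather than anything taken from #8155.

```python
import json
import urllib.request

# Assumption: `kubectl proxy` runs in a side-car container of the same pod and
# listens on its default address, so the application reaches it over localhost.
PROXY_BASE_URL = "http://127.0.0.1:8001"

# Ignore any http_proxy environment settings; the target is always local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def pods_url(base_url, namespace="default"):
    """Build the pod-list URL. The /api/v1 path is an assumption for illustration."""
    return f"{base_url.rstrip('/')}/api/v1/namespaces/{namespace}/pods"


def list_pod_names(base_url=PROXY_BASE_URL, namespace="default", timeout=5.0):
    """Return pod names fetched through the proxy.

    The application sends no token and does no TLS setup; in this design the
    side-car proxy is the component that authenticates to the apiserver.
    """
    with _opener.open(pods_url(base_url, namespace), timeout=timeout) as resp:
        body = json.load(resp)
    return [item["metadata"]["name"] for item in body.get("items", [])]
```

The point of the design is that the legacy container keeps speaking plain HTTP, much as it did against kubernetes-ro, while the credentials stay in the side-car.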

@pires
Contributor

pires commented Jun 2, 2015

> Pods get a default service account, so you don't have to do anything special to get that.

@brendandburns said the exact same thing, but this is what happens when I try to run the hazelcast example with @brendandburns's patch on 0.18.0.

2015-06-01 11:56:34.641  INFO 12 --- [           main] c.g.p.h.HazelcastDiscoveryController     : Asking k8s registry at https://kubernetes.default.cluster.local..
2015-06-01 11:56:34.688  WARN 12 --- [           main] c.g.p.h.HazelcastDiscoveryController     : Request to Kubernetes API failed

java.nio.file.NoSuchFileException: /var/run/secrets/kubernetes.io/serviceaccount/token
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.newByteChannel(Files.java:407)
    at java.nio.file.Files.readAllBytes(Files.java:3152)
    at com.github.pires.hazelcast.HazelcastDiscoveryController.getServiceAccountToken(HazelcastDiscoveryController.java:73)

That secret does not exist. I am not using kube-up or salt or whatever mechanisms you rely on to automate these things. I would like to replicate this in my Vagrant + CoreOS + Kubernetes binaries solution, and for that I need instructions on how this works.

@liggitt
Member

liggitt commented Jun 2, 2015

what does the following show?

kubectl get serviceaccounts --namespace=yournamespace
kubectl get secrets --namespace=yournamespace

Service accounts and tokens are provisioned by controllers running in the kube-controller-manager. You can start them by passing --service_account_private_key_file="<private rsa key file>" to that command.

@pires
Contributor

pires commented Jun 2, 2015

$ kubectl get namespaces
NAME      LABELS    STATUS
default   <none>    Active
$ kubectl get serviceaccounts --namespace=default
Error: no resource "serviceaccounts" has been defined
$ kubectl get secrets --namespace=default
NAME      DATA

Do I need to replicate the steps from https://github.com/GoogleCloudPlatform/kubernetes/blob/master/hack/local-up-cluster.sh#L149 down, namely:

  • Generate SSL key
  • Start kube-apiserver with
    • --admission_control=NamespaceLifecycle,NamespaceAutoProvision,Li...
    • --service_account_key_file=GENERATED_SSL_FILE
  • Start kube-controller-manager with
    • --service_account_private_key_file=GENERATED_SSL_FILE

And it's done?

@liggitt
Member

liggitt commented Jun 2, 2015

yes, that would enable service account creation, token generation, and token automounting
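
Once automounting is on, the client side is small. The sketch below is not the Hazelcast example's code (that is Java); it is a Python illustration of the same idea. It reads the token from the path shown in the stack trace above and turns it into a bearer `Authorization` header, returning `None` instead of raising when the file is absent. The function names are hypothetical.

```python
from pathlib import Path

# Path taken from the NoSuchFileException earlier in this thread.
TOKEN_PATH = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")


def read_service_account_token(path=TOKEN_PATH):
    """Return the mounted token, or None when the file is absent or empty.

    A missing file usually means the service account controllers or the
    admission plugin are not enabled on the cluster.
    """
    try:
        token = Path(path).read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return None
    return token or None


def auth_headers(token):
    """Build request headers; an empty dict means no credentials are sent."""
    return {"Authorization": f"Bearer {token}"} if token else {}
```

Treating the missing file as a value rather than an exception lets a client log a clear hint about cluster setup instead of failing with a bare stack trace.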

@pires
Contributor

pires commented Jun 2, 2015

@liggitt you are the man. Going to make a PR for the new version of the hazelcast example.

It probably won't make sense to you, but it works.

2015-06-02 20:56:40.626  INFO 12 --- [           main] com.github.pires.hazelcast.Application   : Starting Application v0.3.1 on hazelcast-ir7ca with PID 12 (/bootstrapper.jar started by root in /)
2015-06-02 20:56:40.689  INFO 12 --- [           main] s.c.a.AnnotationConfigApplicationContext : Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@5424f110: startup date [Tue Jun 02 20:56:40 GMT 2015]; root of context hierarchy
2015-06-02 20:56:41.965  INFO 12 --- [           main] o.s.j.e.a.AnnotationMBeanExporter        : Registering beans for JMX exposure on startup
2015-06-02 20:56:41.989  INFO 12 --- [           main] c.g.p.h.HazelcastDiscoveryController     : Asking k8s registry at https://kubernetes.default.cluster.local..
2015-06-02 20:56:53.215  INFO 12 --- [           main] c.g.p.h.HazelcastDiscoveryController     : Found 2 pods running Hazelcast.
2015-06-02 20:56:53.350  INFO 12 --- [           main] c.h.instance.DefaultAddressPicker        : [LOCAL] [someGroup] [3.4.2] Interfaces is disabled, trying to pick one address from TCP-IP config addresses: [10.244.78.3, 10.244.30.2]
2015-06-02 20:56:53.350  INFO 12 --- [           main] c.h.instance.DefaultAddressPicker        : [LOCAL] [someGroup] [3.4.2] Prefer IPv4 stack is true.
2015-06-02 20:56:53.369  INFO 12 --- [           main] c.h.instance.DefaultAddressPicker        : [LOCAL] [someGroup] [3.4.2] Picked Address[10.244.30.2]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
2015-06-02 20:56:53.773  INFO 12 --- [           main] com.hazelcast.spi.OperationService       : [10.244.30.2]:5701 [someGroup] [3.4.2] Backpressure is disabled
2015-06-02 20:56:53.778  INFO 12 --- [           main] c.h.spi.impl.BasicOperationScheduler     : [10.244.30.2]:5701 [someGroup] [3.4.2] Starting with 2 generic operation threads and 2 partition operation threads.
2015-06-02 20:56:53.974  INFO 12 --- [           main] com.hazelcast.system                     : [10.244.30.2]:5701 [someGroup] [3.4.2] Hazelcast 3.4.2 (20150326 - f6349a4) starting at Address[10.244.30.2]:5701
2015-06-02 20:56:53.975  INFO 12 --- [           main] com.hazelcast.system                     : [10.244.30.2]:5701 [someGroup] [3.4.2] Copyright (C) 2008-2014 Hazelcast.com
2015-06-02 20:56:53.976  INFO 12 --- [           main] com.hazelcast.instance.Node              : [10.244.30.2]:5701 [someGroup] [3.4.2] Creating TcpIpJoiner
2015-06-02 20:56:53.977  INFO 12 --- [           main] com.hazelcast.core.LifecycleService      : [10.244.30.2]:5701 [someGroup] [3.4.2] Address[10.244.30.2]:5701 is STARTING
2015-06-02 20:56:54.102  INFO 12 --- [cached.thread-2] com.hazelcast.nio.tcp.SocketConnector    : [10.244.30.2]:5701 [someGroup] [3.4.2] Connecting to /10.244.78.3:5701, timeout: 0, bind-any: true
2015-06-02 20:56:54.169  INFO 12 --- [cached.thread-2] c.h.nio.tcp.TcpIpConnectionManager       : [10.244.30.2]:5701 [someGroup] [3.4.2] Established socket connection between /10.244.30.2:53817 and 10.244.78.3/10.244.78.3:5701
2015-06-02 20:57:01.133  INFO 12 --- [ration.thread-0] com.hazelcast.cluster.ClusterService     : [10.244.30.2]:5701 [someGroup] [3.4.2]

Members [2] {
    Member [10.244.78.3]:5701
    Member [10.244.30.2]:5701 this
}

2015-06-02 20:57:03.156  INFO 12 --- [           main] com.hazelcast.core.LifecycleService      : [10.244.30.2]:5701 [someGroup] [3.4.2] Address[10.244.30.2]:5701 is STARTED
2015-06-02 20:57:03.156  INFO 12 --- [           main] com.github.pires.hazelcast.Application   : Started Application in 22.92 seconds (JVM running for 23.765)

@pires
Contributor

pires commented Jun 2, 2015

I will also need to update the CoreOS docs.

@thockin
Member

thockin commented Jun 2, 2015

@kelseyhightower @pires Please take a look at this ASAP. I don't really want to go to 1.0 with CoreOS support broken.

@pires
Contributor

pires commented Jun 3, 2015

@thockin I am updating it as we speak.

@pires
Contributor

pires commented Jun 3, 2015

@thockin @erictune @AntonioMeireles I have questions.

Moved the questions below to #9178.

1. Care if I remove single-node instructions? I don't think this makes sense anymore, unless someone is going to use hyperkube.

2. master.yaml needs instructions to generate service-account private key file, so it will depend on openssl. Any objections?

3. We should add DNS integration as many examples depend on it. Objections?

4. I'm moving aws folder from getting-started-guides folder, to getting-started-guides/coreos, in order to match how we did azure.
Speaking of which @errordeveloper @squillace @chanezon @crossorigin do you agree with questions 1, 2 and 3? Also, can we sync effort to get it homogeneous on all different CoreOS approaches?

@AntonioMeireles
Contributor

no objections on my part.

@thockin
Member

thockin commented Jun 17, 2015

Status on this?

@pires
Contributor

pires commented Jun 18, 2015

I believe I will have the PR ready later today.

@pires
Contributor

pires commented Jun 19, 2015

Due to increased workload, I have to postpone this to next week.

@yifan-gu
Contributor

cc @yifan-gu

@thockin
Member

thockin commented Jun 24, 2015

Are there still action items open on this?

@pires
Contributor

pires commented Jun 24, 2015

Yes, we just need to implement a way in master.yaml to generate the SSL key, and then we're done.

@liggitt
Member

liggitt commented Jun 24, 2015

See also #10264 in progress to include the root CA. You'll likely want to provide a CA for inclusion as well

@pires
Contributor

pires commented Jun 24, 2015

I am not aware of this root CA and tbh I've had enough breaking changes in the last couple weeks. It's amazing to see Kubernetes getting there, but troublesome to keep up in the CoreOS world.

As for me, my kubernetes-vagrant-coreos-cluster works right now by just providing the SSL key file, generated with openssl genrsa -out kube-serviceaccount.key 2048 2>/dev/null, to my master VM and running:

kube-apiserver \
  --service_account_key_file=/tmp/kube-serviceaccount.key \
  --service_account_lookup=false \
  --admission_control=NamespaceLifecycle,NamespaceAutoProvision,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota \
  --...
kube-controller-manager \
  --service_account_private_key_file=/tmp/kube-serviceaccount.key \
  --...

If this works @liggitt @thockin I'm happy to push the PR tomorrow morning (GMT). Otherwise, and mostly due to the lack of energy to do it, I can't deliver anytime soon.
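
To make the relationship between the two commands explicit, here is a small sketch that derives both flag lists from a single key path. It only uses the flags quoted above; the helper name and its check for the `ServiceAccount` admission plugin are illustrative assumptions, not part of any Kubernetes tooling.

```python
def service_account_flags(key_file, admission_plugins):
    """Return (apiserver_flags, controller_manager_flags) for one shared key file.

    The controller manager signs tokens with the private key, and the apiserver
    verifies them with the same file, so both lists must point at one path.
    """
    if "ServiceAccount" not in admission_plugins:
        # Without this plugin, pods are not given the token mount.
        raise ValueError("the ServiceAccount admission plugin must be enabled")
    apiserver = [
        f"--service_account_key_file={key_file}",
        "--service_account_lookup=false",
        "--admission_control=" + ",".join(admission_plugins),
    ]
    controller_manager = [f"--service_account_private_key_file={key_file}"]
    return apiserver, controller_manager
```

Called with `/tmp/kube-serviceaccount.key` and the plugin list from the command above, it yields the same service account flags shown there; the remaining `--...` flags are left out because the thread elides them.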

@liggitt
Member

liggitt commented Jun 24, 2015

There are basically three levels of security talking to the API:

  • http, read only. This is what is going away
  • https, authenticated via token, clients don't verify the server's certificate. This is what you get by providing the service account key
  • https with server cert verification. This is enabled by providing the root CA bundle for clients inside pods to use to verify the server's cert

Enabling service account tokens is still worthwhile even if the CA part doesn't happen yet
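
From a client's point of view, the difference between the second and third levels can be sketched as a choice of TLS context. This is an illustration only: the `ca.crt` file name and location are an assumption about where a CA bundle would be mounted, and `make_tls_context` is a hypothetical helper.

```python
import os
import ssl

# Assumption: once a root CA is provided, it is mounted next to the token.
CA_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def make_tls_context(ca_path=CA_PATH):
    """Return (context, verified) for HTTPS calls to the apiserver.

    With a CA bundle present, the server certificate is checked against that
    bundle only (level three). Without one, verification is switched off, which
    still encrypts traffic but does not prove who the server is (level two).
    """
    if ca_path and os.path.isfile(ca_path):
        return ssl.create_default_context(cafile=ca_path), True
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx, False
```

Returning the `verified` flag lets the caller log which level it ended up on, so a cluster without a CA bundle is visible in the logs rather than silently less secure.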

@pires
Contributor

pires commented Jun 24, 2015

@liggitt yes, service-accounts are mandatory. I will push the PR tomorrow. Thanks for clarifying.

@thockin
Member

thockin commented Jul 1, 2015

Status on this? I want to get it off the books...

@pires
Contributor

pires commented Jul 1, 2015

@thockin for real? #10344

pires added a commit to pires/kubernetes that referenced this issue Jul 1, 2015
Added instrumentation and configuration for service account tokens.
Fixes kubernetes#9075
@thockin
Member

thockin commented Jul 1, 2015

Excellent! Is this issue resolved now?

@pires
Contributor

pires commented Jul 1, 2015

@thockin the PR was up for a week but it got lost in the pipe. 0.19.3 with service-account support is done for CoreOS (tested on Vagrant, AWS and GCE).

@pires
Contributor

pires commented Jul 1, 2015

I will wait for v1 to come out before updating again. I want to add root CA and whatever may be needed then.
