Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSL certificates for etcd cluster. #35516

Merged
merged 1 commit into from
Nov 10, 2016

Conversation

jszczepkowski
Copy link
Contributor

@jszczepkowski jszczepkowski commented Oct 25, 2016

Added generation of SSL certificates for etcd cluster's internal communication.
Turned on on GCE (gci, trusty and debain).


This change is Reviewable

@jszczepkowski jszczepkowski added this to the v1.5 milestone Oct 25, 2016
@jszczepkowski jszczepkowski added release-note-none Denotes a PR that doesn't merit a release note. area/HA labels Oct 25, 2016
@k8s-github-robot k8s-github-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 25, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins Kubemark GCE e2e failed for commit 28571bf. Full PR test history.

The magic incantation to run this job again is @k8s-bot kubemark e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCI GCE e2e failed for commit 28571bf. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCE e2e failed for commit 28571bf. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCE etcd3 e2e failed for commit 28571bf. Full PR test history.

The magic incantation to run this job again is @k8s-bot gce etcd3 e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

"usages": [
"signing",
"key encipherment",
"server auth",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think you should ever add client or server auth to a ca cert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done (removed)

./cfssl print-defaults csr > ca-csr.json
./cfssl gencert -initca ca-csr.json | ./cfssljson -bare ca -

if [[ ! -z "${ca_key}" && ! -z "${ca_cert}" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about wrapping the if around the whole first block, so that we don't generate the ca if it isn't needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -375,6 +375,13 @@ current-context: service-account-context
EOF
}

function create-master-etcd-auth {
local -r auth_dir="/etc/srv/kubernetes"
echo "${ETCD_CA_CERT}" | base64 --decode > "${auth_dir}/etcd-ca.crt"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't appear to be optional, so anyone (like gke) who calls this script w/o setting the new fields will fail here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -609,7 +616,7 @@ function prepare-etcd-manifest {
local etcd_cluster=""
local cluster_state="new"
for host in $(echo "${INITIAL_ETCD_CLUSTER:-${host_name}}" | tr "," "\n"); do
etcd_host="etcd-${host}=http://${host}:$3"
etcd_host="etcd-${host}=https://${host}:$3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like it sets https for every setup, regardless of HA or non-HA. should we leave explicit non-HA using localhost http, so that we don't have to have the apiserver connect insecurely to etcd?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use https in every configuration: user should always have a possibility to make his cluster HA in future. In case of cluster with one master, there will be only one etcd server, so etcd server will not communicate with its peers (so, there is no different between http and https).

@jszczepkowski
Copy link
Contributor Author

Comments applied, PTAL

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 29, 2016
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 1, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins GCE Node e2e failed for commit 7b093b58259d5624f3b24a56735080bd2002012c. Full PR test history.

The magic incantation to run this job again is @k8s-bot node e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@jszczepkowski
Copy link
Contributor Author

@k8s-bot node e2e test this
issue: #35983


mkdir -p "${KUBE_TEMP}/cfssl"
pushd "${KUBE_TEMP}/cfssl"
curl -s -L -o cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work on a mac (and from what I remember common.sh executes on the host system where you are running kube-up.sh).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (moved under gce directory).

{% if pillar.get('etcd_over_ssl', '').lower() == 'true' -%}
{% set etcd_protocol = 'https' -%}
{% set etcd_creds = '--peer-trusted-ca-file /etc/srv/kubernetes/etcd-ca.crt --peer-cert-file /etc/srv/kubernetes/etcd-peer.crt --peer-key-file /etc/srv/kubernetes/etcd-peer.key -peer-client-cert-auth' -%}
{% else -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The else seems unnecessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@@ -626,6 +635,8 @@ function prepare-etcd-manifest {
sed -i -e "s@{{ *etcd_cluster *}}@$etcd_cluster@g" "${temp_file}"
sed -i -e "s@{{ *storage_backend *}}@${STORAGE_BACKEND:-}@g" "${temp_file}"
sed -i -e "s@{{ *cluster_state *}}@$cluster_state@g" "${temp_file}"
sed -i -e "s@{{ *etcd_protocol *}}@https@g" "${temp_file}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that with GCI we are always using https and it's not possible to use http. Is that what we want?
Do we have any performance results for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want always to secure peer communication inside etcd cluster by SSL, to prevent malicious users from changing etcd state. It shouldn't have any performance impact in case of a single node clusters, as SSL is used only for communication between peers (so, it will only affect HA deployments).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So apiserver(s) will still talk to etcd using pure http, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we are not changing it.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 2, 2016
@jszczepkowski jszczepkowski removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 2, 2016
@jszczepkowski jszczepkowski force-pushed the ha-etcd-certs branch 2 times, most recently from fcb894b to 1a713a3 Compare November 3, 2016 08:55
@jszczepkowski
Copy link
Contributor Author

Added support for Darwin kernel.

@roberthbailey
PTAL

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 6, 2016
@jszczepkowski
Copy link
Contributor Author

Comments applied, PTAL

@saad-ali
Copy link
Member

saad-ali commented Nov 8, 2016

Applying do-not-merge until the post-code freeze merge exception for this feature for 1.5 has been approved.

https://groups.google.com/forum/#!topic/kubernetes-milestone-burndown/DTZu98YNuUE

@saad-ali saad-ali added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Nov 8, 2016
@@ -458,6 +458,7 @@ num_nodes: $(echo "${NUM_NODES:-}" | sed -e "s/'/''/g")
e2e_storage_test_environment: '$(echo "$E2E_STORAGE_TEST_ENVIRONMENT" | sed -e "s/'/''/g")'
kube_uid: '$(echo "${KUBE_UID}" | sed -e "s/'/''/g")'
initial_etcd_cluster: '$(echo "${INITIAL_ETCD_CLUSTER:-}" | sed -e "s/'/''/g")'
etcd_over_ssl: '$(if [[ -n "${ETCD_CA_KEY:-}" && -n "${ETCD_CA_CERT:-}" && -n "${ETCD_PEER_KEY:-}" && -n "${ETCD_PEER_CERT:-}" ]]; then echo "true"; else echo "false"; fi)'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an if/else block below the EOF rather than inline (like the storage backend, admission control, etc below).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jszczepkowski
Copy link
Contributor Author

I've applied the comments, run manual tests for debian OS and did some fixes for debian.

@roberthbailey roberthbailey added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 9, 2016
@roberthbailey
Copy link
Contributor

lgtm

@saad-ali
Copy link
Member

saad-ali commented Nov 9, 2016

Applying do-not-merge until the post-code freeze merge exception for this feature for 1.5 has been approved.

https://groups.google.com/forum/#!topic/kubernetes-milestone-burndown/DTZu98YNuUE

The request for exception to 1.5 release code freeze for the "Simplify HA setup for master on GCE" feature has been approved. Adding 1.5 milestone and removing do-not-merge label.

@saad-ali saad-ali removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Nov 9, 2016
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 10, 2016
@jszczepkowski jszczepkowski removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 10, 2016
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2016
@jszczepkowski jszczepkowski added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins unit/integration failed for commit 777cd95eb21b844891866e15e6b9b389841088ba. Full PR test history.

The magic incantation to run this job again is @k8s-bot unit test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@jszczepkowski
Copy link
Contributor Author

@k8s-bot unit test this
issue #36473

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 10, 2016
Added generation of SSL certificates for etcd cluster internal
communication. Turned on on gci & trusty.
@jszczepkowski jszczepkowski removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 10, 2016
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2016
@jszczepkowski jszczepkowski added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit a787044 into kubernetes:master Nov 10, 2016
@jszczepkowski
Copy link
Contributor Author

Part of kubernetes/enhancements#48

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/HA lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants