Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow setting certificates validity period during installation #3581

Merged
merged 8 commits into from Mar 29, 2017

Conversation

php-coder
Copy link
Contributor

This PR is adding support for setting certificates validity period during installation. It's done by passing --expire-days and --signer-expire-days options to oc adm that were adding in v1.5 (see openshift/origin#11814)

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1275176
Trello: https://trello.com/c/MV4uHYdW/367-leverage-the-new-expire-days-in-the-ansible-playbooks

PTAL @abutcher @tbielawa
CC @mfojtik

@php-coder
Copy link
Contributor Author

php-coder commented Mar 7, 2017

Current state/open questions:

  • It isn't tested because at this moment installer fails with error Detected OpenShift version 1.3.1 does not match requested openshift_release 1.5 when I'm trying to use v1.5 I didn't find the repo for origin 1.5
  • It's not clear do I need some special code for a containerized installation
  • I don't know whether code for supporting external etcd should be added or not
  • I'm not sure that new parameters will affect procedure of updating of the existing certificates
  • It isn't documented

@abutcher
Copy link
Member

abutcher commented Mar 7, 2017

It isn't tested because at this moment installer fails with error Detected OpenShift version 1.3.1 does not match requested openshift_release 1.5 when I'm trying to use v1.5 I didn't find the repo for origin 1.5

1.5 packages haven't been created yet afaik. I can get you configuration for testing 3.5 internally.

It's not clear do I need some special code for a containerized installation

This can be accomplished using a host level variable containerized=true per host in the inventory or containerized=true can be added under [OSEv3:vars] to globally set containerized true for all services.

For example,

...

[masters]
master1.abutcher.com containerized=true

[nodes]
master1.abutcher.com openshift_schedulable=true containerized=true

I don't know whether code for supporting external etcd should be added or not

We have an internal variable for configuring external etcd CA validity etcd_ca_default_days but no configurable for etcd peer and serving certificates. If we are allowing OpenShift certificate validity to be configured then I think we should also add variables for external etcd cert validity but that can be done separately.

I'm not sure that new parameters will affect procedure of updating of the existing certificates

This process uses the same code updated here so I expect that no changes will be required.

It isn't documented

Adding these variables to https://github.com/openshift/openshift-ansible/blob/master/inventory/byo/hosts.ose.example with a small explanation will be a good start.

Copy link
Contributor

@tbielawa tbielawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious about default setting and the value for the default expiry days.

@@ -73,6 +73,9 @@
--hostnames="{{ docker_registry_service_ip.results.clusterip }},docker-registry.default.svc.cluster.local,{{ docker_registry_route_hostname }}"
--cert={{ openshift.common.config_base }}/master/registry.crt
--key={{ openshift.common.config_base }}/master/registry.key
{% if openshift.common.version_gte_3_5_or_1_5 | bool %}
--expire-days={{ openshift_registry_cert_expire_days | default(730) }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we issue 1 year certs now, you're suggesting a 2 year default expiry here. Do we have it defined somewhere how long we want certs to be valid for?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below, in some vars file I'm seeing defaults for expire days being set.

roles/openshift_ca/vars/main.yml
+openshift_ca_cert_expire_days: 1825
+openshift_master_cert_expire_days: 730

roles/openshift_master_certificates/vars/main.yml
+openshift_master_cert_expire_days: 730

roles/openshift_node_certificates/vars/main.yml
+openshift_node_cert_expire_days: 730

But here you are using default(730) in case openshift_registry_cert_expire_days is not defined. Should that not also be added to a roles vars//defaults/ file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we issue 1 year certs now, you're suggesting a 2 year default expiry here.

I'm using oadm ca default values here. How it's possible that ansible installer uses 1 year certificates and in the same time uses oadm with its default values of 2 (and 5) years? It only possible if a) we have a modified oadm for some reason b) we're generating certs manually somewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But here you are using default(730) in case openshift_registry_cert_expire_days is not defined. Should that not also be added to a roles vars//defaults/ file?

I'm setting default values for roles but here it's impossible because it's not a role but just a playbook, so I'm using explicit default() here. Do you know how it can be improved?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea how to improve that. Thanks for explaining!

@@ -88,7 +88,7 @@
# This should NOT replace the CA due to --overwrite=false when a CA already exists.
- name: Create the master certificates if they do not already exist
command: >
{{ hostvars[openshift_ca_host].openshift.common.client_binary }} adm create-master-certs
{{ hostvars[openshift_ca_host].openshift.common.client_binary }} adm ca create-master-certs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, did this not work in the past without the ca sub-sub command there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a warning about using deprecated command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@php-coder
Copy link
Contributor Author

Ok, I've updated the code and tested it against OSE v3.5 on RHEL. It works fine on 1 node with embedded etcd. I was checking an expiration dates of all *.crt files inside /etc/origin.

To @abutcher @tbielawa:

  • Are you OK with these changes? If yes, then I'll add a bit documentation before merging.
  • Do we have certificates in some non-standard places that I'm not aware of?
  • Do we have QEs who can test this along with corner cases (like using containerized environment)?
  • I also tested redeploy-certificates.yml, it updates all the certificates except ca.crt. Is it a feature?

@php-coder
Copy link
Contributor Author

I don't know whether code for supporting external etcd should be added or not

We have an internal variable for configuring external etcd CA validity etcd_ca_default_days but no configurable for etcd peer and serving certificates. If we are allowing OpenShift certificate validity to be configured then I think we should also add variables for external etcd cert validity but that can be done separately.

@mfojtik Is it still in the scope of my task? Technically it's not because these certificates are generated by openssl command and it doesn't use oadm at all. Practically it could be in scope, in terms of a general task for setting the certificates validity.

@abutcher
Copy link
Member

abutcher commented Mar 8, 2017

Are you OK with these changes? If yes, then I'll add a bit documentation before merging.

These changes LGTM.

Do we have certificates in some non-standard places that I'm not aware of?

All are covered here afaict.

Do we have QEs who can test this along with corner cases (like using containerized environment)?

QE will pick up when we move our cluster lifecycle card to complete.

I also tested redeploy-certificates.yml, it updates all the certificates except ca.crt. Is it a feature?

There is a separate playbook for rolling CA redeployment playbooks/byo/openshift-cluster/redeploy-openshift-ca.yml.

@tbielawa
Copy link
Contributor

tbielawa commented Mar 8, 2017

@php-coder said

Ok, I've updated the code and tested it against OSE v3.5 on RHEL. It works fine on 1 node with embedded etcd. I was checking an expiration dates of all *.crt files inside /etc/origin.

To @abutcher @tbielawa:

Are you OK with these changes? If yes, then I'll add a bit documentation before merging.
Do we have certificates in some non-standard places that I'm not aware of?
Do we have QEs who can test this along with corner cases (like using containerized environment)?
I also tested redeploy-certificates.yml, it updates all the certificates except ca.crt. Is it a feature?

If you want a thorough test of the new depoyment I suggest you use the cert expiry checker we have now

Using your existing inventory file you can run:

$ ansible-playbook -v -i <INVENTORY_FILE> ./playbooks/certificate_expiry/easy-mode.yaml
$ xdg-open /tmp/cert-expiry-report.html
$ xdg-open /tmp/cert-expiry-report.json

@php-coder
Copy link
Contributor Author

@abutcher @tbielawa It's ready to merge. PTAL.

@abutcher
Copy link
Member

abutcher commented Mar 9, 2017

[merge]

@openshift-bot
Copy link

[test]ing while waiting on the merge queue

@abutcher
Copy link
Member

abutcher commented Mar 9, 2017

aos-ci-test

@openshift-bot
Copy link

@php-coder
Copy link
Contributor Author

php-coder commented Mar 13, 2017

The first one is very strange because it could not compile a test:

23:21:08 ++ Building release v1.5.0-alpha.3+4cbbecd-244
23:21:16 rm -rf _output
23:21:16 hack/build-cross.sh
23:21:17 ++ Building go targets for linux/amd64: images/pod cmd/dockerregistry cmd/gitserver pkg/sdn/plugin/sdn-cni-plugin vendor/github.com/containernetworking/cni/plugins/ipam/host-local vendor/github.com/containernetworking/cni/plugins/main/loopback examples/hello-openshift examples/deployment
23:22:31 ++ Building go targets for linux/amd64: cmd/openshift cmd/oc
23:25:46 ++ Building go targets for linux/amd64: test/extended/extended.test
23:27:26 # github.com/openshift/origin/test/extended/registry
23:27:26 test/extended/registry/registry.go:52: cannot use "github.com/openshift/origin/pkg/image/api".ResourceImageStreams (type "github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/api".ResourceName) as type string in argument to "github.com/openshift/origin/pkg/image/api".Resource

The second one is a test-flake openshift/origin#12797

@abutcher could you re-run merge process, please?

@sdodson
Copy link
Member

sdodson commented Mar 13, 2017

flake openshift/origin#12797
[merge]

@sdodson
Copy link
Member

sdodson commented Mar 13, 2017

aos-ci-test is failing on 3.4 because --expire-days hasn't been backported there. Though, why is it only failing for containerized installs?

@php-coder
Copy link
Contributor Author

aos-ci-test is failing on 3.4 because --expire-days hasn't been backported there. Though, why is it only failing for containerized installs?

I suspect that openshift.common.version_gte_3_5_or_1_5 is evaluating to true for some strange reason. Is it possible?

@sdodson
Copy link
Member

sdodson commented Mar 13, 2017

oh, that's probably right.

@openshift-bot
Copy link

openshift-bot commented Mar 13, 2017

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_openshift_ansible_extended_conformance_install/84/) (Base Commit: 403b5c5)

@php-coder
Copy link
Contributor Author

How we'll merge it? How I can help?

@php-coder
Copy link
Contributor Author

Ping.

@sdodson
Copy link
Member

sdodson commented Mar 20, 2017

aos-ci-test

@openshift-bot
Copy link

@php-coder
Copy link
Contributor Author

Containerized installation fails because openshift.common.version_gte_3_5_or_1_5 is evaluating to true. I feel that it's a bug in condition evaluation. I don't see another way of not using new option on old versions.

@abutcher
Copy link
Member

@php-coder I'm looking into how we can address this. openshift.common.version_gte_3_5_or_1_5 is defaulting to true because we have nothing to base the version on (as written) but we do know the image tag that we intend to use prior to creating certs.

@abutcher
Copy link
Member

@php-coder Once #3769 merges the version comparisons can use the new filters https://gist.github.com/abutcher/c60a27f6fa9abf4ae365fc24738349ee.

@php-coder
Copy link
Contributor Author

@abutcher Thank you! I'm watching and waiting :)

@php-coder
Copy link
Contributor Author

@abutcher Updated and ready to be tested/merged.

@openshift-bot
Copy link

Evaluated for openshift ansible test up to 638e419

@openshift-bot
Copy link

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_openshift_ansible_extended_conformance_install/84/) (Base Commit: 403b5c5)

@abutcher
Copy link
Member

aos-ci-test

@openshift-bot
Copy link

success: aos-ci-jenkins/OS_unit_tests for 638e419 (logs)

@openshift-bot
Copy link

success: "aos-ci-jenkins/OS_3.4_NOT_containerized, aos-ci-jenkins/OS_3.4_NOT_containerized_e2e_tests" for 638e419 (logs)

@openshift-bot
Copy link

success: "aos-ci-jenkins/OS_3.4_containerized, aos-ci-jenkins/OS_3.4_containerized_e2e_tests" for 638e419 (logs)

@openshift-bot
Copy link

success: "aos-ci-jenkins/OS_3.5_NOT_containerized, aos-ci-jenkins/OS_3.5_NOT_containerized_e2e_tests" for 638e419 (logs)

@openshift-bot
Copy link

success: "aos-ci-jenkins/OS_3.5_containerized, aos-ci-jenkins/OS_3.5_containerized_e2e_tests" for 638e419 (logs)

@abutcher
Copy link
Member

[merge]

@openshift-bot
Copy link

Evaluated for openshift ansible merge up to 638e419

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants