missing /etc/etcd/ca/openssl.cnf on extra etcd nodes #9018

Open
srgvg opened this Issue Jun 28, 2018 · 8 comments

srgvg commented Jun 28, 2018

Description

I'm trying a 3 node install:

[cluster_hosts:children]
OSEv3
[OSEv3:children]
masters
nodes
etcd
[masters]
oso[1:3] openshift_disable_check=memory_availability
[etcd]
oso[1:3]
[nodes]
oso[1:3]

and consistently get an error when deploying the etcd certificates.

As far as I can tell this looks like a bug, but I can't imagine it would have gone undetected until now.
Perhaps I'm using an unsupported configuration, or am I missing something?

Version
  • Your ansible version per ansible --version
ansible 2.5.5
  config file = /home/serge/src/openshift/casl-ansible/ansible.cfg
  configured module search path = [u'/home/serge/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/dist-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.15 (default, May  1 2018, 05:55:50) [GCC 7.3.0]

Working with the https://github.com/casl/ansible repo 'v3.9.1-1-g9ca566d'

# From 'openshift-ansible'
- src: https://github.com/openshift/openshift-ansible
  version: release-3.9
Steps To Reproduce
  1. ansible-playbook -i inventory2/c1-do/ galaxy/openshift-ansible/playbooks/openshift-etcd/config.yml
Expected Results

No error.

Observed Results
TASK [etcd : Create the server csr] *********************************************************************************************************************************************************************************
Thursday 28 June 2018  15:16:16 +0200 (0:00:00.940)       0:01:01.306 *********
changed: [oso1 -> {{ inventory_hostname }}]
fatal: [oso2 -> {{ inventory_hostname }}]: FAILED! => {"changed": true, "cmd": ["openssl", "req", "-new", "-keyout", "server.key", "-config", "/etc/etcd/ca/openssl.cnf", "-out", "server.csr", "-reqexts", "etcd_v3_req", "-batch", "-nodes", "-subj", "/CN=oso2"], "delta": "0:00:00.007129", "end": "2018-06-28 13:16:17.019700", "msg": "non-zero return code", "rc": 1, "start": "2018-06-28 13:16:17.012571", "stderr": "error on line -1 of /etc/etcd/ca/openssl.cnf\n140377274333072:error:02001002:system library:fopen:No such file or directory:bss_file.c:175:fopen('/etc/etcd/ca/openssl.cnf','rb')\n140377274333072:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:182:\n140377274333072:error:0E078072:configuration file routines:DEF_LOAD:no such file:conf_def.c:195:", "stderr_lines": ["error on line -1 of /etc/etcd/ca/openssl.cnf", "140377274333072:error:02001002:system library:fopen:No such file or directory:bss_file.c:175:fopen('/etc/etcd/ca/openssl.cnf','rb')", "140377274333072:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:182:", "140377274333072:error:0E078072:configuration file routines:DEF_LOAD:no such file:conf_def.c:195:"], "stdout": "", "stdout_lines": []}
fatal: [oso3 -> {{ inventory_hostname }}]: FAILED! => {"changed": true, "cmd": ["openssl", "req", "-new", "-keyout", "server.key", "-config", "/etc/etcd/ca/openssl.cnf", "-out", "server.csr", "-reqexts", "etcd_v3_req", "-batch", "-nodes", "-subj", "/CN=oso3"], "delta": "0:00:00.015550", "end": "2018-06-28 13:16:17.053198", "msg": "non-zero return code", "rc": 1, "start": "2018-06-28 13:16:17.037648", "stderr": "error on line -1 of /etc/etcd/ca/openssl.cnf\n139793236567952:error:02001002:system library:fopen:No such file or directory:bss_file.c:175:fopen('/etc/etcd/ca/openssl.cnf','rb')\n139793236567952:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:182:\n139793236567952:error:0E078072:configuration file routines:DEF_LOAD:no such file:conf_def.c:195:", "stderr_lines": ["error on line -1 of /etc/etcd/ca/openssl.cnf", "139793236567952:error:02001002:system library:fopen:No such file or directory:bss_file.c:175:fopen('/etc/etcd/ca/openssl.cnf','rb')", "139793236567952:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:182:", "139793236567952:error:0E078072:configuration file routines:DEF_LOAD:no such file:conf_def.c:195:"], "stdout": "", "stdout_lines": []}

Indeed, the tasks that create that openssl.cnf file ran earlier only on the first node:

TASK [etcd : template] **********************************************************************************************************************************************************************************************
Thursday 28 June 2018  15:16:00 +0200 (0:00:00.768)       0:00:45.234 ********* 
ok: [oso1.do.ginsys.net -> {{ inventory_hostname }}]

TASK [etcd : assemble] **********************************************************************************************************************************************************************************************
Thursday 28 June 2018  15:16:01 +0200 (0:00:01.360)       0:00:46.594 ********* 
ok: [oso1.do.ginsys.net -> {{ inventory_hostname }}]
Additional Information
  • CentOS Linux release 7.5.1804 (Core) (Digitalocean default image)
  • inventory vars
---
ansible_user: root
ansible_become: false

openshift_deployment_type: origin
openshift_master_cluster_method: native
openshift_release: v3.9

# The web console port setting (openshift_master_console_port) must match the API server port (openshift_master_api_port).
openshift_master_api_port: 8443
openshift_master_console_port: '{{ openshift_master_api_port }}'
# docker_storage_block_device: "/dev/vdb"

# Subscription Management Details
rhsm_register: false
openshift_hosted_manage_registry: false
dns_domain: "openshift.example"
env_id: c1
openshift_public_hostname: "{{ env_id }}.{{ dns_domain }}"
openshift_master_cluster_hostname:  "{{ openshift_public_hostname }}"
openshift_master_cluster_public_hostname: "{{ openshift_public_hostname }}"
openshift_master_default_subdomain: "{{ openshift_public_hostname }}"

# HTPASSWD Identity Provider
openshift_master_identity_providers:
 - 'name': 'htpasswd_auth'
   'login': 'true'
   'challenge': 'true'
   'kind': 'HTPasswdPasswordIdentityProvider'
   'filename': '/etc/origin/master/htpasswd'
#this will create an admin/admin user
openshift_master_htpasswd_users:
  admin: xxx
openshift_hosted_router_selector: 'region=infra'
openshift_hosted_manage_router: true
osm_default_node_selector: 'zone=primary'
openshift_docker_options: "--log-driver=json-file --log-opt max-size=50m --log-opt max-file=100"
os_sdn_network_plugin_name: 'redhat/openshift-ovs-multitenant'
os_firewall_use_firewalld: true
osm_cluster_network_cidr: 10.1.0.0/16
openshift_hosted_prometheus_deploy: false
openshift_cfme_install_app: false
openshift_node_kubelet_args:
  pod-manifest-path:
  - "{{ static_pod_manifest_path }}"

Contributor

vrutkovs commented Jun 29, 2018

Does this happen on a clean install, or was a single etcd node deployed first and then reconfigured to use 3 nodes?

Author

srgvg commented Jul 2, 2018

It's a clean install, starting from the inventory I mentioned.
I tried it twice, each time from a clean snapshotted base.

michakrause commented Aug 2, 2018

I can confirm this problem with an openshift 3.10 installation.

Author

srgvg commented Aug 2, 2018

In the meantime I discovered the cause of this, though I'm not sure at what level it's a bug.

I had my inventory configured with group_vars/all.yaml containing:

---                                         
ansible_host: '{{ inventory_hostname }}'

Somehow this causes Ansible to run delegate_to: tasks against inventory_hostname instead of the delegate_to: target host, which effectively disables every delegate_to: setting.

I first removed this ansible_host setting from group_vars/all.yaml. I'm not even sure why I had put it there in the first place, as it only configures what would happen by default anyway.

The next problem I encountered is in the casl-ansible/galaxy/infra-ansible/roles/update-host/tasks/wait-for-host.yml role task (yes, I use the casl scripts), which needs at least one of ansible_ssh_host or ansible_host to be set.

I then tried setting ansible_ssh_host instead in group_vars/all.yaml:

---                                         
ansible_ssh_host: '{{ inventory_hostname }}'

This setting doesn't seem to trigger what looks like an Ansible bug related to delegate_to, but I couldn't find anything on ansible_ssh_host vs ansible_host and how either interacts with delegate_to.
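
For anyone who wants to check the delegation behaviour in isolation, a minimal test playbook could look like the sketch below. It is not part of openshift-ansible or casl-ansible: the file name repro.yml and the task names are made up for illustration, and it assumes the inventory defines an etcd group as in the original report.

```yaml
# repro.yml (hypothetical file name): checks where a delegated task really runs.
- hosts: etcd
  gather_facts: false
  tasks:
    - name: Run hostname, delegated to the first etcd host
      command: hostname
      delegate_to: "{{ groups['etcd'][0] }}"
      register: delegated
      changed_when: false

    - name: Report where the delegated command ran
      debug:
        msg: >-
          {{ inventory_hostname }} delegated to {{ groups['etcd'][0] }},
          command ran on {{ delegated.stdout }}
```

Running it once with ansible_host: '{{ inventory_hostname }}' in group_vars and once without should make the difference visible. If delegation is honoured, every host should report the machine name of the first etcd host; if each host reports its own machine name, the delegation is being overridden as described above. The exact strings depend on how the machines' hostnames are configured.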

michakrause commented Aug 3, 2018

Thank you @srgvg,
I was also using ansible_host. For me ansible_ssh_host did not work either; I had to remove it completely.
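
For setups where a role requires ansible_host to be defined (such as the casl wait-for-host task mentioned earlier), one thing that might be worth trying is a literal per-host value instead of a templated one in group_vars. The fragment below is only a sketch in Ansible's YAML inventory format; the addresses are placeholders, and whether this avoids the delegate_to problem is an assumption that has not been verified in this thread.

```yaml
# Hypothetical YAML inventory fragment; the addresses are placeholders.
all:
  children:
    etcd:
      hosts:
        oso1:
          ansible_host: oso1.example.com   # literal value, no Jinja2 template
        oso2:
          ansible_host: oso2.example.com
        oso3:
          ansible_host: oso3.example.com
```

The idea is that a literal value leaves nothing to be re-templated when a task is delegated, whereas '{{ inventory_hostname }}' has to be resolved for some host at run time.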

Author

srgvg commented Aug 3, 2018

In what seems to be a related problem, a later playbook that deploys certificates again shows strange delegate_to: behaviour.

I added a debug statement at casl-ansible/galaxy/openshift-ansible/roles/openshift_master_certificates/tasks/main.yml:89, yielding:

TASK [openshift_master_certificates : debug] ************************************************************************************************
task path: /home/serge/src/openshift/casl-ansible/galaxy/openshift-ansible/roles/openshift_master_certificates/tasks/main.yml:89
Friday 03 August 2018  18:47:31 +0200 (0:00:01.734)       0:01:06.557 ********* 
ok: [oso1.do.ginsys.net] => {
    "openshift_ca_host": "oso1.do.ginsys.net"
}
ok: [oso2.do.ginsys.net] => {
    "openshift_ca_host": "oso1.do.ginsys.net"
}
ok: [oso3.do.ginsys.net] => {
    "openshift_ca_host": "oso1.do.ginsys.net"
}

and then the next task fails with, strangely, a 'None' delegate:

TASK [openshift_master_certificates : copy] *************************************************************************************************
task path: casl-ansible/galaxy/openshift-ansible/roles/openshift_master_certificates/tasks/main.yml:90
Friday 03 August 2018  18:47:31 +0200 (0:00:00.253)       0:01:06.810 ********* 
skipping: [oso1.do.ginsys.net] => (item=admin.crt)  => {"changed": false, "item": "admin.crt", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=admin.key)  => {"changed": false, "item": "admin.key", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=admin.kubeconfig)  => {"changed": false, "item": "admin.kubeconfig", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=master.kubelet-client.crt)  => {"changed": false, "item": "master.kubelet-client.crt", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=master.kubelet-client.key)  => {"changed": false, "item": "master.kubelet-client.key", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=master.proxy-client.crt)  => {"changed": false, "item": "master.proxy-client.crt", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=master.proxy-client.key)  => {"changed": false, "item": "master.proxy-client.key", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=service-signer.crt)  => {"changed": false, "item": "service-signer.crt", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=service-signer.key)  => {"changed": false, "item": "service-signer.key", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=ca.crt)  => {"changed": false, "item": "ca.crt", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=ca.key)  => {"changed": false, "item": "ca.key", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=ca-bundle.crt)  => {"changed": false, "item": "ca-bundle.crt", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=client-ca-bundle.crt)  => {"changed": false, "item": "client-ca-bundle.crt", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=serviceaccounts.private.key)  => {"changed": false, "item": "serviceaccounts.private.key", "skip_reason": "Conditional result was False"}
skipping: [oso1.do.ginsys.net] => (item=serviceaccounts.public.key)  => {"changed": false, "item": "serviceaccounts.public.key", "skip_reason": "Conditional result was False"}
failed: [oso2.do.ginsys.net -> None] (item=admin.crt) => {"changed": false, "item": "admin.crt", "msg": "Source /etc/origin/master/admin.crt not found"}
failed: [oso3.do.ginsys.net -> None] (item=admin.crt) => {"changed": false, "item": "admin.crt", "msg": "Source /etc/origin/master/admin.crt not found"}
failed: [oso2.do.ginsys.net -> None] (item=admin.key) => {"changed": false, "item": "admin.key", "msg": "Source /etc/origin/master/admin.key not found"}
failed: [oso3.do.ginsys.net -> None] (item=admin.key) => {"changed": false, "item": "admin.key", "msg": "Source /etc/origin/master/admin.key not found"}
failed: [oso2.do.ginsys.net -> None] (item=admin.kubeconfig) => {"changed": false, "item": "admin.kubeconfig", "msg": "Source /etc/origin/master/admin.kubeconfig not found"}
failed: [oso3.do.ginsys.net -> None] (item=admin.kubeconfig) => {"changed": false, "item": "admin.kubeconfig", "msg": "Source /etc/origin/master/admin.kubeconfig not found"}
failed: [oso2.do.ginsys.net -> None] (item=master.kubelet-client.crt) => {"changed": false, "item": "master.kubelet-client.crt", "msg": "Source /etc/origin/master/master.kubelet-client.crt not found"}
failed: [oso3.do.ginsys.net -> None] (item=master.kubelet-client.crt) => {"changed": false, "item": "master.kubelet-client.crt", "msg": "Source /etc/origin/master/master.kubelet-client.crt not found"}
failed: [oso2.do.ginsys.net -> None] (item=master.kubelet-client.key) => {"changed": false, "item": "master.kubelet-client.key", "msg": "Source /etc/origin/master/master.kubelet-client.key not found"}
failed: [oso3.do.ginsys.net -> None] (item=master.kubelet-client.key) => {"changed": false, "item": "master.kubelet-client.key", "msg": "Source /etc/origin/master/master.kubelet-client.key not found"}
failed: [oso2.do.ginsys.net -> None] (item=master.proxy-client.crt) => {"changed": false, "item": "master.proxy-client.crt", "msg": "Source /etc/origin/master/master.proxy-client.crt not found"}
failed: [oso3.do.ginsys.net -> None] (item=master.proxy-client.crt) => {"changed": false, "item": "master.proxy-client.crt", "msg": "Source /etc/origin/master/master.proxy-client.crt not found"}
failed: [oso2.do.ginsys.net -> None] (item=master.proxy-client.key) => {"changed": false, "item": "master.proxy-client.key", "msg": "Source /etc/origin/master/master.proxy-client.key not found"}
failed: [oso3.do.ginsys.net -> None] (item=master.proxy-client.key) => {"changed": false, "item": "master.proxy-client.key", "msg": "Source /etc/origin/master/master.proxy-client.key not found"}
failed: [oso2.do.ginsys.net -> None] (item=service-signer.crt) => {"changed": false, "item": "service-signer.crt", "msg": "Source /etc/origin/master/service-signer.crt not found"}
failed: [oso3.do.ginsys.net -> None] (item=service-signer.crt) => {"changed": false, "item": "service-signer.crt", "msg": "Source /etc/origin/master/service-signer.crt not found"}
failed: [oso2.do.ginsys.net -> None] (item=service-signer.key) => {"changed": false, "item": "service-signer.key", "msg": "Source /etc/origin/master/service-signer.key not found"}
failed: [oso3.do.ginsys.net -> None] (item=service-signer.key) => {"changed": false, "item": "service-signer.key", "msg": "Source /etc/origin/master/service-signer.key not found"}
failed: [oso2.do.ginsys.net -> None] (item=ca.crt) => {"changed": false, "item": "ca.crt", "msg": "Source /etc/origin/master/ca.crt not found"}
failed: [oso3.do.ginsys.net -> None] (item=ca.crt) => {"changed": false, "item": "ca.crt", "msg": "Source /etc/origin/master/ca.crt not found"}
failed: [oso2.do.ginsys.net -> None] (item=ca.key) => {"changed": false, "item": "ca.key", "msg": "Source /etc/origin/master/ca.key not found"}
failed: [oso3.do.ginsys.net -> None] (item=ca.key) => {"changed": false, "item": "ca.key", "msg": "Source /etc/origin/master/ca.key not found"}
failed: [oso2.do.ginsys.net -> None] (item=ca-bundle.crt) => {"changed": false, "item": "ca-bundle.crt", "msg": "Source /etc/origin/master/ca-bundle.crt not found"}
failed: [oso3.do.ginsys.net -> None] (item=ca-bundle.crt) => {"changed": false, "item": "ca-bundle.crt", "msg": "Source /etc/origin/master/ca-bundle.crt not found"}
failed: [oso3.do.ginsys.net -> None] (item=client-ca-bundle.crt) => {"changed": false, "item": "client-ca-bundle.crt", "msg": "Source /etc/origin/master/client-ca-bundle.crt not found"}
failed: [oso2.do.ginsys.net -> None] (item=client-ca-bundle.crt) => {"changed": false, "item": "client-ca-bundle.crt", "msg": "Source /etc/origin/master/client-ca-bundle.crt not found"}
failed: [oso3.do.ginsys.net -> None] (item=serviceaccounts.private.key) => {"changed": false, "item": "serviceaccounts.private.key", "msg": "Source /etc/origin/master/serviceaccounts.private.key not found"}
failed: [oso2.do.ginsys.net -> None] (item=serviceaccounts.private.key) => {"changed": false, "item": "serviceaccounts.private.key", "msg": "Source /etc/origin/master/serviceaccounts.private.key not found"}
failed: [oso3.do.ginsys.net -> None] (item=serviceaccounts.public.key) => {"changed": false, "item": "serviceaccounts.public.key", "msg": "Source /etc/origin/master/serviceaccounts.public.key not found"}
failed: [oso2.do.ginsys.net -> None] (item=serviceaccounts.public.key) => {"changed": false, "item": "serviceaccounts.public.key", "msg": "Source /etc/origin/master/serviceaccounts.public.key not found"}

This may be a different problem, but it looks like a related Ansible bug?

Edit: this last task is in casl-ansible/galaxy/openshift-ansible/roles/openshift_master_certificates/tasks/main.yml and uses delegate_to: "{{ openshift_ca_host }}"
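
A slightly extended debug task, placed just before the failing copy task, might help narrow this down further. It is a sketch only: the task itself is hypothetical and not part of the role, while openshift_ca_host is the variable the role already uses for delegate_to.

```yaml
# Hypothetical debug task; place it before the failing copy task.
- name: Show the variables involved in delegation
  debug:
    msg:
      current_host: "{{ inventory_hostname }}"
      ca_host: "{{ openshift_ca_host }}"
      # ansible_host as seen for the CA host; 'unset' if it is not defined
      ca_host_ansible_host: "{{ hostvars[openshift_ca_host].ansible_host | default('unset') }}"
```

Comparing ca_host with ca_host_ansible_host on each node shows whether the connection address for the CA host resolves to the CA host itself or to something else.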

Author

srgvg commented Aug 3, 2018

The previous issue seems to be a known one, see ansible/ansible#27351

robfrut135 commented Aug 24, 2018

I can also confirm this problem with an OpenShift 3.9 Enterprise installation.

ansible 2.4.6.0
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/usr/share/ansible/openshift-ansible/library']
ansible python module location = /home/ocpadmin/src/dxc-cp-delivery/release-ocp-3.9-pl-latest/dxc-ocp-base/ave/lib/python2.7/site-packages/ansible
executable location = /home/ocpadmin/src/dxc-cp-delivery/release-ocp-3.9-pl-latest/dxc-ocp-base/ave/bin/ansible
python version = 2.7.5 (default, May 31 2018, 09:41:32) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]


TASK [etcd : Create the server csr] **************************************************************************************************************************************
Friday 24 August 2018 22:43:04 +0200 (0:00:00.519) 0:02:29.363 *********
fatal: [xxxx-xxx-xxx-base-master-2 -> None]: FAILED! => {
"changed": true,
"cmd": [
"openssl",
"req",
"-new",
"-keyout",
"server.key",
"-config",
"/etc/etcd/ca/openssl.cnf",
"-out",
"server.csr",
"-reqexts",
"etcd_v3_req",
"-batch",
"-nodes",
"-subj",
"/CN=xxxx-xxx-xxx-base-master"
],
"delta": "0:00:00.006168",
"end": "2018-08-24 16:43:04.724811",
"failed": true,
"rc": 1,
"start": "2018-08-24 16:43:04.718643"
}

STDERR:

error on line -1 of /etc/etcd/ca/openssl.cnf
139767387371408:error:02001002:system library:fopen:No such file or directory:bss_file.c:175:fopen('/etc/etcd/ca/openssl.cnf','rb')
139767387371408:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:182:
139767387371408:error:0E078072:configuration file routines:DEF_LOAD:no such file:conf_def.c:195:

MSG:

non-zero return code

fatal: [xxxx-xxx-xxx-base-master-1 -> None]: FAILED! => {
"changed": true,
"cmd": [
"openssl",
"req",
"-new",
"-keyout",
"server.key",
"-config",
"/etc/etcd/ca/openssl.cnf",
"-out",
"server.csr",
"-reqexts",
"etcd_v3_req",
"-batch",
"-nodes",
"-subj",
"/CN=xxxx-xxx-xxx-base-master"
],
"delta": "0:00:00.007149",
"end": "2018-08-24 16:43:04.762910",
"failed": true,
"rc": 1,
"start": "2018-08-24 16:43:04.755761"
}

STDERR:

error on line -1 of /etc/etcd/ca/openssl.cnf
140500248819600:error:02001002:system library:fopen:No such file or directory:bss_file.c:175:fopen('/etc/etcd/ca/openssl.cnf','rb')
140500248819600:error:2006D080:BIO routines:BIO_new_file:no such file:bss_file.c:182:
140500248819600:error:0E078072:configuration file routines:DEF_LOAD:no such file:conf_def.c:195:

MSG:

non-zero return code

changed: [xxxx-xxx-xxx-base-master-0 -> None]

NO MORE HOSTS LEFT *******************************************************************************************************************************************************

PLAY RECAP ***************************************************************************************************************************************************************
localhost : ok=11 changed=0 unreachable=0 failed=0
xxxx-xxx-xxx-base-infra-0 : ok=26 changed=1 unreachable=0 failed=0
xxxx-xxx-xxx-base-infra-1 : ok=26 changed=1 unreachable=0 failed=0
xxxx-xxx-xxx-base-master-0 : ok=69 changed=4 unreachable=0 failed=0
xxxx-xxx-xxx-base-master-1 : ok=37 changed=2 unreachable=0 failed=1
xxxx-xxx-xxx-base-master-2 : ok=37 changed=2 unreachable=0 failed=1
xxxx-xxx-xxx-base-node-0 : ok=26 changed=1 unreachable=0 failed=0
xxxx-xxx-xxx-base-node-1 : ok=26 changed=1 unreachable=0 failed=0
xxxx-xxx-xxx-base-node-2 : ok=26 changed=1 unreachable=0 failed=0
xxxx-xxx-xxx-base-node-3 : ok=26 changed=1 unreachable=0 failed=0
xxxx-xxx-xxx-base-node-4 : ok=26 changed=1 unreachable=0 failed=0

