fix: Improve output of openshift-installer progress and check (IBM#61)
The openshift-installer can take up to 60 min to complete. Until now you
had to check manually whether the install process started successfully,
or simply wait 60 min for the result.
This patch watches for the first 'node-bootstrapper' certificate signing
request (CSR). Such a request is usually raised within 6 min. If none
appears in that time, this is treated as an error and the Ansible
playbook exits.

The openshift-installer process is not aborted! It is still running.
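
The polling idea described in this message can be sketched outside Ansible as a plain shell loop. This is only an illustration, not code from the patch: `retry_until` and `csr_seen` are hypothetical helper names, and the loop only approximates how Ansible's `until`/`retries`/`delay` keywords behave.

```shell
# Hypothetical helper (not part of the patch): run a check command until it
# succeeds, at most $1 times, sleeping $2 seconds between attempts.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0            # check succeeded, stop polling
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"      # wait before the next attempt
    fi
    i=$((i + 1))
  done
  return 1                # budget exhausted, report failure to the caller
}

# The check the patch polls on: succeeds once a node-bootstrapper CSR is listed.
csr_seen() {
  oc get csr 2>/dev/null | grep -q ":node-bootstrapper"
}

# Usage with the values from the patch (12 tries, 30 s apart, about 6 min):
#   retry_until 12 30 csr_seen || echo "bootstrap did not start in time" >&2
```

As in the patch, giving up here only ends the watcher. An installer started separately in the background would keep running.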

Signed-off-by: Klaus Smolin <smolin@de.ibm.com>

Signed-off-by: Jacob Emery <jacob.emery@ibm.com>
Co-authored-by: Jacob Emery <jacob.emery@ibm.com>
Signed-off-by: Amadeus Podvratnik <pod@de.ibm.com>
2 people authored and AmadeusPodvratnik committed Jan 19, 2023
1 parent 005b8a8 commit 57fba69
Showing 1 changed file with 38 additions and 10 deletions.
48 changes: 38 additions & 10 deletions roles/wait_for_bootstrap/tasks/main.yaml
@@ -1,30 +1,58 @@
 ---

-- name: Watch wait-for bootstrap-complete process.
+- name: Start openshift-installer with 'wait-for bootstrap-complete' (async task)
   tags: wait_for_bootstrap
-  shell: openshift-install wait-for bootstrap-complete --dir=/root/ocpinst
+  ansible.builtin.command: openshift-install wait-for bootstrap-complete --dir=/root/ocpinst
   async: 3600
   poll: 0
   register: watch_bootstrap

-- name: Retry wait-for bootstrap-complete job ID check until it's finished. This may take some time... To watch progress, SSH to bastion, switch to root, from there, SSH to core@bootstrap-ip and run 'journalctl -b -f -u release-image.service -u bootkube.service'
+- name: Wait for first node-bootstrapper request, should be started within 6 min (retry every 30s)...To watch progress, SSH to root@bastion, \
+        SSH to core@bootstrap-ip and run 'journalctl -b -f -u release-image.service -u bootkube.service'
   tags: wait_for_bootstrap
-  async_status:
+  ansible.builtin.shell: |
+    set -o pipefail
+    oc get csr | grep ":node-bootstrapper"
+  register: csr_check
+  until: (":node-bootstrapper" in csr_check.stdout)
+  retries: 12
+  delay: 30
+
+- name: Print first node-bootstrapper requests
+  tags: wait_for_bootstrap
+  ansible.builtin.debug:
+    var: csr_check.stdout_lines
+
+- name: Wait for control node CSRs (retry every 30s)
+  tags: wait_for_bootstrap
+  ansible.builtin.shell: |
+    set -o pipefail
+    oc get csr | awk '{print $4}' | grep "^system:node:{{ item|lower }}"
+  register: cmd_output
+  until: ("system:node:" in cmd_output.stdout)
+  loop: "{{ env.cluster.nodes.control.hostname }}"
+  retries: 20
+  delay: 30
+
+- name: Retry wait-for bootstrap-complete job ID check until it's finished. This may take some time... To watch progress, \
+        SSH to bastion, switch to root, from there, SSH to core@bootstrap-ip and run 'journalctl -b -f -u release-image.service -u bootkube.service'
+  tags: wait_for_bootstrap
+  ansible.builtin.async_status:
     jid: "{{ watch_bootstrap.ansible_job_id }}"
   register: bootstrapping
   until: bootstrapping.finished
-  retries: 120
+  retries: 100
   delay: 30

-- name: Make sure kubeconfig works properly. This
+- name: Make sure kubeconfig works properly
   tags: wait_for_bootstrap
-  shell: oc whoami
+  ansible.builtin.command: oc whoami
   register: oc_whoami
   until: oc_whoami.stdout == "system:admin"
-  retries: 120
+  retries: 60
   delay: 10

-- name: Print output of oc whoami, should be "system:admin" if previous task worked.
+- name: Print output of oc whoami, should be "system:admin" if previous task worked
   tags: wait_for_bootstrap
-  debug:
+  ansible.builtin.debug:
     var: oc_whoami.stdout
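
The `retries` and `delay` values in the tasks above set a rough upper bound on how long each step keeps polling. The sketch below only does that arithmetic. `budget_seconds` is a hypothetical helper, and the figures ignore the run time of each command, so the real waits can be somewhat longer.

```shell
# Approximate polling budget of an until-loop: retries x delay, in seconds.
# Hypothetical helper for illustration; not part of the patch.
budget_seconds() {
  echo $(( $1 * $2 ))
}

echo "first node-bootstrapper CSR:  $(budget_seconds 12 30) s"
echo "control node CSRs (per item): $(budget_seconds 20 30) s"
echo "bootstrap-complete job check: $(budget_seconds 100 30) s"
echo "oc whoami:                    $(budget_seconds 60 10) s"
```

The first figure, 360 s, is the "6 min" named in the commit message. The control node figure assumes the retry budget applies to each loop item separately.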
