
ovirt.ovirt.infra role restarts target_host, causing play to fail as it'll be UNREACHABLE #715

Open
Gauravtalreja1 opened this issue Aug 10, 2023 · 0 comments
Labels
bug Something isn't working

Comments


SUMMARY
COMPONENT NAME
STEPS TO REPRODUCE
---
- hosts: "{{ target_host }}"
  become: yes
  tasks:
    - name: Include ovirt.ovirt.infra role
      block:
        - ansible.builtin.include_role:
            name: ovirt.ovirt.infra
          ignore_errors: True
          vars:
            engine_url: "https://{{ target_host }}/ovirt-engine/api"
            engine_user: "admin@internal"
            engine_password: "{{ default_password }}"
            data_center_name: "test_datacenter"
            data_center_local: true
            compatibility_version: 4.4
            clusters:
              - name: "test_cluster"
                cpu_type: "{{ cpu_family | default('Secure AMD EPYC') }}"
                profile: production
            storages:
              test-storage:
                master: true
                state: present
                localfs:
                  path: "/home/storage/test-storage"
            hosts:
              - name: "test-host"
                address: "{{ target_host }}"
                cluster: "test_cluster"
                password: "{{ default_password }}"
      rescue:
        - name: Restart vdsmd/libvirtd services if host isn't reachable
          ansible.builtin.service:
            name: "{{ item }}"
            state: restarted
          loop: ['vdsmd', 'libvirtd']
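If disabling the post-install reboot is acceptable, the reproducer's hosts entry could carry the flag directly. This is only a sketch: `reboot_after_installation` is an option of the underlying `ovirt.ovirt.ovirt_host` module, and it assumes the infra role forwards per-host keys to that module:

```yaml
            hosts:
              - name: "test-host"
                address: "{{ target_host }}"
                cluster: "test_cluster"
                password: "{{ default_password }}"
                # Assumption: forwarded to ovirt.ovirt.ovirt_host, whose
                # reboot_after_installation defaults to true; setting it to
                # false keeps the controller's SSH connection to the host alive.
                reboot_after_installation: false
```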

EXPECTED RESULTS

The ovirt.ovirt.infra role should pass, and the host should be present in the UP state and reachable.

ACTUAL RESULTS

The ovirt.ovirt.infra role fails with the error below, even though the host is in the UP state after the reboot.

TASK [ovirt.ovirt.hosts : Add hosts] *******************************************
changed: [target_host.example.com] => (item=test_host)

TASK [ovirt.ovirt.hosts : Wait for hosts to be added] **************************
FAILED - RETRYING: Wait for hosts to be added (105 retries left).
FAILED - RETRYING: Wait for hosts to be added (104 retries left).
[... identical retry lines trimmed ...]
FAILED - RETRYING: Wait for hosts to be added (89 retries left).
failed: [target_host.example.com] (item=test_host) => {"ansible_loop_var": "item", "item": {"ansible_job_id": "258538605917.98997", "ansible_loop_var": "item", "changed": true, "failed": false, "finished": 0, "item": {"address": "target_host.example.com", "cluster": "test_cluster", "name": "test_host", "password": "secret"}, "results_file": "/root/.ansible_async/258538605917.98997", "started": 1}, "msg": "Failed to connect to the host via ssh: ssh: connect to host target_host.example.com port 22: Connection refused", "unreachable": true}
fatal: [target_host.example.com]: UNREACHABLE! => {"changed": false, "msg": "All items completed", "results": [{"ansible_loop_var": "item", "item": {"ansible_job_id": "258538605917.98997", "ansible_loop_var": "item", "changed": true, "failed": false, "finished": 0, "item": {"address": "target_host.example.com", "cluster": "test_cluster", "name": "test_host", "password": "secret"}, "results_file": "/root/.ansible_async/258538605917.98997", "started": 1}, "msg": "Failed to connect to the host via ssh: ssh: connect to host target_host.example.com port 22: Connection refused", "unreachable": true}]}

PLAY RECAP *********************************************************************
target_host.example.com : ok=38   changed=20   unreachable=1    failed=0    skipped=32   rescued=0    ignored=0

WORKAROUND/SUGGESTION

Set reboot_after_installation: "false" in the host entries; it defaults to true, which reboots the host after it is added.
Additionally, the ansible.builtin.wait_for_connection module could be used instead of ansible.builtin.async_status in the "Wait for hosts to be added" task, so the role waits for the host to become reachable again.
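A rough sketch of the wait_for_connection suggestion, as it might look inside the role's host-addition flow (the task name and the delay/timeout values are illustrative, not the role's actual task):

```yaml
    - name: Wait for hosts to be reachable again after the post-install reboot
      ansible.builtin.wait_for_connection:
        delay: 10        # arbitrary: give the reboot time to actually start
        timeout: 600     # arbitrary upper bound; tune per environment
```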

@Gauravtalreja1 Gauravtalreja1 added the bug Something isn't working label Aug 10, 2023