config: ensure rgw section has the correct name
The ceph.conf.j2 template always assumes that the hostname used to register the
radosgw in the servicemap is equivalent to `{{ ansible_hostname }}`,
which returns the shortname form.

We need to detect which form of the hostname was used in the case of an already
deployed cluster and update the ceph.conf accordingly.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1580408

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
guits authored and leseb committed Aug 13, 2018
1 parent db29b5b commit f422efb
Showing 2 changed files with 32 additions and 6 deletions.
13 changes: 7 additions & 6 deletions roles/ceph-config/templates/ceph.conf.j2
@@ -155,9 +155,10 @@ filestore xattr use omap = true

{% if inventory_hostname in groups.get(rgw_group_name, []) %}
{% for host in groups[rgw_group_name] %}
-[client.rgw.{{ hostvars[host]['ansible_hostname'] }}]
-host = {{ hostvars[host]['ansible_hostname'] }}
-keyring = /var/lib/ceph/radosgw/{{ cluster }}-rgw.{{ hostvars[host]['ansible_hostname'] }}/keyring
+{# {{ hostvars[host]['rgw_hostname'] }} for backward compatibility, fqdn issues. See bz1580408 #}
+[client.rgw.{{ hostvars[host]['rgw_hostname'] }}]
+host = {{ hostvars[host]['rgw_hostname'] }}
+keyring = /var/lib/ceph/radosgw/{{ cluster }}-rgw.{{ hostvars[host]['rgw_hostname'] }}/keyring
log file = /var/log/ceph/{{ cluster }}-rgw-{{ hostvars[host]['ansible_hostname'] }}.log
{% if hostvars[host]['radosgw_address_block'] is defined and hostvars[host]['radosgw_address_block'] != 'subnet' %}
{% if ip_version == 'ipv4' %}
@@ -204,9 +205,9 @@ rgw frontends = {{ radosgw_frontend_type }} {{ 'port' if radosgw_frontend_type =
{% if inventory_hostname in groups.get(nfs_group_name, []) and inventory_hostname not in groups.get(rgw_group_name, []) %}
{% for host in groups[nfs_group_name] %}
{% if nfs_obj_gw %}
-[client.rgw.{{ hostvars[host]['ansible_hostname'] }}]
-host = {{ hostvars[host]['ansible_hostname'] }}
-keyring = /var/lib/ceph/radosgw/{{ cluster }}-rgw.{{ hostvars[host]['ansible_hostname'] }}/keyring
+[client.rgw.{{ hostvars[host]['rgw_hostname'] }}]
+host = {{ hostvars[host]['rgw_hostname'] }}
+keyring = /var/lib/ceph/radosgw/{{ cluster }}-rgw.{{ hostvars[host]['rgw_hostname'] }}/keyring
log file = /var/log/ceph/{{ cluster }}-rgw-{{ hostvars[host]['ansible_hostname'] }}.log
{% endif %}
{% endfor %}
25 changes: 25 additions & 0 deletions roles/ceph-defaults/tasks/facts.yml
@@ -217,3 +217,28 @@
when:
- containerized_deployment
- ceph_docker_image | search("rhceph")

- block:
    - name: get current cluster status (if already running)
      command: "{{ docker_exec_cmd }} ceph --cluster {{ cluster }} -s -f json"
      register: ceph_current_status

    - name: set_fact ceph_current_status (convert to json)
      set_fact:
        ceph_current_status: "{{ ceph_current_status.stdout | from_json }}"

    - name: set_fact rgw_hostname
      set_fact:
        rgw_hostname: "{% for key in ceph_current_status['servicemap']['services']['rgw']['daemons'].keys() %}{% if key == ansible_fqdn %}{{ key }}{% endif %}{% endfor %}"
  when:
    - ceph_current_fsid.get('rc', 1) == 0
    - inventory_hostname in groups.get(rgw_group_name, [])
    # no servicemap before luminous
    - ceph_release_num[ceph_release] >= ceph_release_num['luminous']
    - ansible_hostname != ansible_fqdn


hwoarang Aug 13, 2018

@guits I think the `ansible_hostname != ansible_fqdn` condition is not correct. `ansible_hostname` normally does not contain the domain name, so this conditional may evaluate to true for no reason. Maybe `ansible_hostname != inventory_hostname` is the correct one?

This caused a regression in OpenStack-Ansible when deploying Ceph.

leseb Aug 13, 2018

Member

@hwoarang I'm not sure I understand the concern or the error/regression you hit. Could you be more explicit, maybe by sharing a log? We have this condition because if `ansible_hostname == ansible_fqdn`, then no FQDN is configured on the machine.

`ansible_hostname` returns the output of `hostname -s`, whereas `ansible_fqdn` returns the output of `hostname -f`, which can fall back to the short name if no FQDN is configured.
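
The check described in this comment can be sketched in Python. This is a minimal illustration, not ceph-ansible code: the function names are hypothetical, and the way the two names are obtained here (node name up to the first dot, and `socket.getfqdn()`) is an assumption rather than something stated in this thread.

```python
import platform
import socket


def local_hostnames():
    """Return (short_name, fqdn) for the local machine.

    Assumption: the short name is the node name up to the first dot, and the
    FQDN is whatever the resolver reports. The FQDN may equal the short name
    when no domain is configured.
    """
    short_name = platform.node().split(".")[0]
    fqdn = socket.getfqdn()
    return short_name, fqdn


def fqdn_is_configured(short_name, fqdn):
    """Mirror the `ansible_hostname != ansible_fqdn` condition from the diff."""
    return short_name != fqdn
```

With `short_name="node1"` and `fqdn="node1.example.com"` the condition holds and the servicemap lookup would run; with both equal to `"node1"` it is skipped.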


- name: set_fact rgw_hostname
  set_fact:
    rgw_hostname: "{{ ansible_hostname }}"
  when:
    - rgw_hostname is undefined
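
Taken together, the two `set_fact rgw_hostname` tasks in this diff select the name used for the `[client.rgw.<name>]` section. The Python model below is a sketch of that selection only. It is not part of the commit: `pick_rgw_hostname` and its parameters are hypothetical names, and the remaining `when` conditions (fsid check, group membership, release number) are assumed to hold.

```python
def pick_rgw_hostname(daemon_names, ansible_hostname, ansible_fqdn):
    """Model which name the two set_fact tasks would leave in rgw_hostname.

    daemon_names stands in for the keys of
    ceph_current_status['servicemap']['services']['rgw']['daemons'].
    """
    rgw_hostname = None  # None stands in for "undefined"

    # First task: only runs when the short name and the FQDN differ.
    if ansible_hostname != ansible_fqdn:
        # Models the Jinja loop, which renders the matching key or nothing.
        rgw_hostname = "".join(k for k in daemon_names if k == ansible_fqdn)

    # Second task: fall back to the short name when nothing was set.
    if rgw_hostname is None:
        rgw_hostname = ansible_hostname

    return rgw_hostname
```

In this model, a daemon registered as `rgw0.example.com` on a host whose FQDN is `rgw0.example.com` yields the FQDN, and a host without an FQDN yields the short name. A host that has an FQDN but whose daemon was registered under the short name yields an empty string in the model; whether the real play behaves the same way depends on how Ansible treats an empty template result, which is not verified here.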

3 comments on commit f422efb

@hwoarang

@leseb Sure, here is an example log:

http://logs.openstack.org/46/589146/2/check/openstack-ansible-deploy-ceph-ubuntu-xenial/3444486/job-output.txt.gz#_2018-08-13_12_01_32_881202

fatal: [aio1_ceph-rgw_container-fc588f0a]: FAILED! => {"changed": false, "cmd": "ceph --cluster ceph -s -f json", "msg": "[Errno 2] No such file or directory"

The problem is that `ansible_hostname` returns `aio1_ceph-rgw_container-fc588f0a` but `ansible_fqdn` returns `aio1_ceph-rgw_container-fc588f0a.openstack.local`, so the conditional evaluates to true and the task is executed. However, at this point the ceph-common role, which installs the required ceph packages, has not run yet, so the `ceph` binary does not exist.

In our playbooks we first execute the ceph-defaults role and then ceph-common.

https://github.com/openstack/openstack-ansible/blob/master/playbooks/ceph-rgw-install.yml#L45-L48

However, as stated above, the ceph-defaults role apparently requires packages which are only installed by the ceph-common role.

Would you suggest rearranging the roles so that ceph-common runs before ceph-defaults to work around this problem?
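
The `[Errno 2] No such file or directory` failure in the log is what a missing executable produces. A defensive variant of the status query could look like the Python sketch below, which assumes that skipping the query is acceptable when the `ceph` binary is absent. The function name and the `binary` parameter are hypothetical, and this is not the fix adopted by either project.

```python
import json
import shutil
import subprocess


def ceph_status(cluster="ceph", binary="ceph"):
    """Return the parsed `ceph -s -f json` output, or None if unavailable.

    Returns None when the binary is not on PATH (the situation in the log
    above) or when the command exits with a non-zero status.
    """
    if shutil.which(binary) is None:
        return None  # binary not installed yet, so skip the query

    result = subprocess.run(
        [binary, "--cluster", cluster, "-s", "-f", "json"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None  # cluster not reachable or not running

    return json.loads(result.stdout)
```

A caller would treat `None` as "no running cluster detected" and fall back to the short hostname.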

@leseb leseb commented on f422efb Aug 14, 2018

@hwoarang Thanks for the clarification! lgtm

@leseb leseb commented on f422efb Aug 14, 2018

Oh wait, there is still something strange: this line, f422efb#diff-2857b72f37476e1b5d16e7e81783ecbaR234 (line 234 of the new facts.yml, the `ceph_current_fsid.get('rc', 1) == 0` condition), should normally prevent the block from being executed.
