Refact playbook #1727
Conversation
force-pushed from 789b338 to 2167272
force-pushed from 5657d68 to 7b5c0db
# These checks are used to avoid running handlers at initial deployment.
- name: check for a ceph socket
  shell: |
    {{ docker_exec_cmd }} bash -c 'stat {{ rbd_client_admin_socket_path }}/*.asok > /dev/null 2>&1'
Did you mean to have {{ docker_exec_cmd }} here? It's undefined unless containerized_deployment is True.
@andrewschoen yes, docker_exec_cmd has a default value set to '' and is then set here, so we can use this check for both ceph-common and ceph-docker-common. At least, it was already used that way before this PR; I just moved all this from ceph-mon to ceph-defaults.
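For illustration, here is a minimal standalone sketch (names hypothetical, not taken from the role) of how an empty docker_exec_cmd lets the very same command line serve both deployment modes:

```shell
#!/bin/sh
# Sketch only: docker_exec_cmd defaults to '' for bare-metal deployments.
# For containerized ones it would instead be set to something like
# "docker exec ceph-mon-$HOSTNAME" (hypothetical container name).
docker_exec_cmd=""

# With an empty prefix, the command runs directly on the host; with a
# "docker exec ..." prefix it would run inside the container instead.
out=$(${docker_exec_cmd} echo "socket check ran")
echo "$out"
```

This is why the check works unchanged for both ceph-common and ceph-docker-common: only the value of the prefix variable differs.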
force-pushed from a66af6f to 656c427
@@ -9,15 +9,15 @@ SOCKET=/var/run/ceph/${CLUSTER}-mon.${MONITOR_NAME}.asok

check_quorum() {
  while [ $RETRIES -ne 0 ]; do
-    MEMBERS=$(ceph --cluster ${CLUSTER} -s --format json | sed -r 's/.*"quorum_names":(\[[^]]+\]).*/\1/')
+    MEMBERS=$({{ docker_exec_cmd }} ceph --cluster ${CLUSTER} -s --format json | sed -r 's/.*"quorum_names":(\[[^]]+\]).*/\1/')
Unrelated to the PR, but I wonder if CLUSTER="{{ cluster }}" is really necessary; we could just have --cluster {{ cluster }}.
I agree
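As a side note, the sed expression in the diff above can be exercised on its own. This sketch feeds it a hand-written JSON sample (an assumption; real `ceph -s --format json` output has more fields and may order them differently) to show what it extracts:

```shell
#!/bin/sh
# Hand-crafted sample loosely resembling `ceph -s --format json` output.
json='{"fsid":"abc","quorum_names":["mon0","mon1","mon2"],"health":"HEALTH_OK"}'

# Same sed expression as in the handler template: keep only the
# quorum_names array, discarding everything around it.
members=$(echo "$json" | sed -r 's/.*"quorum_names":(\[[^]]+\]).*/\1/')
echo "$members"
# → ["mon0","mon1","mon2"]
```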
wait_for_socket_in_docker() {
  docker exec $1 timeout 10 bash -c "while [ ! -e /var/run/ceph/*.asok ]; do sleep 1 ; done"
  # TODO(guits): What if socket never shows up?
timeout will exit non-zero (124 for GNU timeout), but we'd better have a message for this; this code would be better:

if ! docker exec $1 timeout 10 bash -c "while [ ! -e /var/run/ceph/*.asok ]; do sleep 1 ; done"; then
  log "Timed out while trying to look for a Ceph OSD socket."
  log "Abort mission!"
  exit 1
fi
will definitely try this
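The suggested guard relies on timeout's exit status. This small standalone sketch (the socket path is hypothetical and assumed not to exist) demonstrates that behavior without Docker:

```shell
#!/bin/sh
# GNU timeout kills the command and exits with status 124 when the deadline
# passes, so the `if !` guard fires when the socket never appears.
result=""
if ! timeout 2 sh -c 'while [ ! -e /tmp/no-such-socket.asok ]; do sleep 1; done'; then
    result="timed out waiting for socket"
fi
echo "$result"
```

If the socket did appear, the inner loop would exit 0 before the deadline and the guard body would be skipped.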
local count
count=10
while [ $count -ne 0 ]; do
  id=$(docker ps -q -f "name=$1")
indent?
fixed
# We need to wait because it may take some time for the socket to actually exist
COUNT=10
# Wait and ensure the socket exists after restarting the daemon
{% if not containerized_deployment -%}
I'm not a big fan of reversed conditions, so perhaps we should replace this with:

{% if containerized_deployment -%}
...
{% else %}
...
{% endif %}
ok i switched
{% endif %}
SOCKET=/var/run/ceph/test-osd.${osd_id}.asok
while [ $COUNT -ne 0 ]; do
# $docker_exec bash -c "test -S $SOCKET" && check_pgs && continue 2
leftover?
yes
force-pushed from 5d3e099 to 1f3d713
force-pushed from f1228a2 to 0b28ca1
Merge `ceph-docker-common` and `ceph-common` defaults vars in the `ceph-defaults` role. Remove redundant variable declarations in the `ceph-mon` and `ceph-osd` roles.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

Add a new role, `ceph-defaults`. This role aims to handle all default vars for `ceph-docker-common` and `ceph-common` and to set basic facts (e.g. `fsid`).
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
I think this is looking good now. I do agree that this needs some documentation though. Before we merge should we create an issue or something so people are aware of the changes being made here? There will be implications for ansible-galaxy users at the least.
roles:
  - ceph-defaults
  - ceph.ceph-common
Should this be just ceph-common?
I modified it
This will give us more flexibility and avoid a lot of useless `when` conditions skipping all tasks from a non-desired role.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

Move `fsid`, `monitor_name`, `docker_exec_cmd` and `ceph_release` set_fact to the `ceph-defaults` role. This allows reusing these facts without having to play `ceph-common` or `ceph-docker-common`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

Until now, there were no handlers for containerized deployments.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

In containerized deployment, if you try to update your `ceph.conf` file, it won't actually be updated on your nodes because it is overwritten by the copy of the file present in your fetch directory.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

Signed-off-by: Sébastien Han <seb@redhat.com>

Signed-off-by: Sébastien Han <seb@redhat.com>

A massive refactor of ceph-ansible merged about 18 hours ago in ceph/ceph-ansible#1727, which breaks our ceph gate. A patch already exists [1] which will adjust our ceph deployment according to the upstream changes; however, there is some work to be done upstream [2] before our fix can merge. Until then, to keep the OSA master gate unblocked, we should pin the ceph-ansible roles to immediately prior to the refactor commits.
[1] https://review.openstack.org/#/c/490476/
[2] ceph/ceph-ansible#1740
Change-Id: Ie96984d207163404bb7f98cc10aa02a3035a362c
@@ -93,10 +88,24 @@
  # Hard code this so we will skip the entire file instead of individual tasks (Default isn't Consistent)
  static: False

- include: facts.yml
- include: ./checks/check_socket.yml
- name: get ceph version
This fails on nodes that have roles like "rbd_mirror", since the "ceph" command hasn't been installed yet.
@dillaman this task was already executed in ceph-common before this refactor. Do you mean you didn't hit this issue before this commit?
Hmm -- looks like it never would have passed since PR #737 (which was merged long before support for rbd-mirror). Apparently the installation of rbd-mirror never worked.
I might have missed something, but at the task you highlighted here you should have the ceph command available; the tasks that install the ceph packages are executed earlier in that same role. I'll investigate this.
If you have a VM/docker container that is just running the rbd-mirror role, nothing is installed at this point (at least on CentOS). There are some hooks to install ceph-base on CentOS if the client role is assigned to the host so I'd imagine that should be expanded to cover any host where Ceph ansible is running.
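One possible mitigation (a sketch only, not the fix adopted in this PR) is to guard the version probe on the CLI actually being present, so hosts carrying only the rbd-mirror role don't fail at this task:

```shell
#!/bin/sh
# Hypothetical guard: only probe the version when the ceph CLI exists.
# On a host where nothing is installed yet, fall back to a sentinel value
# instead of letting the command fail.
if command -v ceph >/dev/null 2>&1; then
    ceph_version=$(ceph --version)
else
    ceph_version="not installed"
fi
echo "$ceph_version"
```

The equivalent in the playbook itself would be a pre-check task whose result conditions the version task, but the idea is the same: don't assume the binary exists on every host in the inventory.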
ceph-ansible has undergone significant refactoring in its v3 development, adding new required roles such as ceph-defaults and ceph-config, which are used to provide vars and configuration to the ceph service roles. These roles must be executed before the service roles to avoid missing vars [1][2]. During the v3 refactoring, ceph-common was removed as a galaxy-style role dependency in the service roles' meta files [3]; this means we will need to explicitly execute ceph-common from now on as well.

This change adds the defaults and config roles and executes them. It also does some minor cleanup, such as alphabetizing the OpenStack roles list, and adds an upgrade playbook, a reno, and docs to assist in cleaning up the older galaxy-named ceph common roles, which are no longer galaxy-namespaced in our cloning configuration.

[1] http://jenkins-logs.objects-us-dfw-1.cloud.lstn.net/osa-ci/490192/1/21/logs/console.log
[2] ceph/ceph-ansible#1737
[3] ceph/ceph-ansible#1727
Change-Id: Ia8c0cb0a23f331fce7914afbfc05ef54ee3ffb0e
Same change cherry-picked from commit a53f1ae (Conflicts: ansible-role-requirements.yml). Change-Id: Ia8c0cb0a23f331fce7914afbfc05ef54ee3ffb0e
Refact all playbook
Fixes: #1681, #1698, #1701, #1670