
Refact playbook #1727

Merged (9 commits into master, Aug 2, 2017)
Conversation

@guits (Collaborator) commented Jul 28, 2017

Refactor all playbooks.

Fixes: #1681, #1698, #1701, #1670

@guits guits force-pushed the refact branch 17 times, most recently from 789b338 to 2167272 Compare July 31, 2017 13:59
@guits guits force-pushed the refact branch 4 times, most recently from 5657d68 to 7b5c0db Compare July 31, 2017 14:15
# These checks are used to avoid running handlers at initial deployment.
- name: check for a ceph socket
shell: |
{{ docker_exec_cmd }} bash -c 'stat {{ rbd_client_admin_socket_path }}/*.asok > /dev/null 2>&1'
@andrewschoen (Contributor)

Did you mean to have {{ docker_exec_cmd }} here? It's undefined unless containerized_deployment is true.

@guits (Collaborator, author)

@andrewschoen yes, docker_exec_cmd has a default value of '' and is then set here, so we can use this check for both ceph-common and ceph-docker-common. At least, it was already used that way before this PR; I just moved all of this from ceph-mon to ceph-defaults.
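A minimal sketch of that convention: with docker_exec_cmd defaulting to an empty string, the same socket check runs natively on bare metal, while a containerized host would set it to a `docker exec` prefix. The directory below is a temporary stand-in for rbd_client_admin_socket_path, and the container name in the comment is hypothetical.

```shell
# docker_exec_cmd="" (the default) runs the check directly on the host;
# a containerized deployment would set it to e.g. "docker exec <container>".
socket_dir=$(mktemp -d)   # stand-in for rbd_client_admin_socket_path
docker_exec_cmd=""

check_socket() {
    # The *.asok glob expands inside the inner bash, so the check works
    # unchanged whether or not a docker exec prefix is present.
    $docker_exec_cmd bash -c "stat $socket_dir/*.asok > /dev/null 2>&1"
}

check_socket && before=found || before=missing   # no socket yet
touch "$socket_dir/ceph-mon.a.asok"
check_socket && after=found || after=missing     # socket present now
echo "before=$before after=$after"               # before=missing after=found
rm -rf "$socket_dir"
```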

@guits guits force-pushed the refact branch 2 times, most recently from a66af6f to 656c427 Compare August 1, 2017 04:20
@@ -9,15 +9,15 @@ SOCKET=/var/run/ceph/${CLUSTER}-mon.${MONITOR_NAME}.asok

check_quorum() {
while [ $RETRIES -ne 0 ]; do
- MEMBERS=$(ceph --cluster ${CLUSTER} -s --format json | sed -r 's/.*"quorum_names":(\[[^]]+\]).*/\1/')
+ MEMBERS=$({{ docker_exec_cmd }} ceph --cluster ${CLUSTER} -s --format json | sed -r 's/.*"quorum_names":(\[[^]]+\]).*/\1/')
Member

Unrelated to the PR, but I wonder if CLUSTER="{{ cluster }}" is really necessary; we could just use --cluster {{ cluster }}.

@guits (Collaborator, author)

I agree
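For reference, the sed expression in that line can be exercised against canned JSON without a live cluster. The JSON fragment below is a made-up minimal sample, just enough to show what the expression extracts.

```shell
# Made-up minimal `ceph -s --format json` fragment for illustration only.
json='{"health":"HEALTH_OK","quorum_names":["a","b","c"],"fsid":"0000"}'

# Same sed expression as the script: capture the JSON array that follows
# "quorum_names": and replace the whole line with it.
members=$(printf '%s' "$json" | sed -r 's/.*"quorum_names":(\[[^]]+\]).*/\1/')
echo "$members"   # ["a","b","c"]
```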


wait_for_socket_in_docker() {
docker exec $1 timeout 10 bash -c "while [ ! -e /var/run/ceph/*.asok ]; do sleep 1 ; done"
#TODO(guits): What if socket never shows up?
Member

timeout will return a non-zero status, but we'd better log a message for this; this code would be better:

  if ! docker exec $1 timeout 10 bash -c "while [ ! -e /var/run/ceph/*.asok ]; do sleep 1 ; done"; then
    log "Timed out while trying to look for a Ceph OSD socket."
    log "Abort mission!"
    exit 1
  fi

@guits (Collaborator, author)

will definitely try this
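The suggested pattern can be exercised standalone by swapping the `docker exec` for a plain file check; the path, timeout values, and helper names below are arbitrary stand-ins.

```shell
# Wait up to $2 seconds for path $1 to appear; log and fail on timeout
# (same shape as the suggested fix, minus docker).
log() { echo "$@"; }

wait_for_path() {
    if ! timeout "$2" bash -c "while [ ! -e $1 ]; do sleep 1; done"; then
        log "Timed out while waiting for $1."
        return 1
    fi
}

f="$(mktemp -d)/ready.asok"
( sleep 1; touch "$f" ) &                          # the "socket" shows up after ~1s
wait_for_path "$f" 5 && ok=yes || ok=no            # succeeds within the 5s budget
wait_for_path "/nonexistent/never.asok" 1 && slow=yes || slow=no   # times out
echo "ok=$ok slow=$slow"                           # ok=yes slow=no
```

The key point of the suggestion is that `timeout` kills the busy-wait and returns non-zero, which the `if !` turns into an explicit log line instead of a silent failure.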

local count
count=10
while [ $count -ne 0 ]; do
id=$(docker ps -q -f "name=$1")
Member

indent?

@guits (Collaborator, author)

fixed
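The loop under review is a bounded poll; here is a self-contained sketch with a stub in place of `docker ps -q -f "name=$1"`. The stub and its threshold are invented for illustration.

```shell
# Bounded polling: retry up to `count` times until the lookup returns an id.
# fake_lookup stands in for `docker ps -q -f "name=$1"`; it pretends the
# container appears once the counter has dropped to 7 (arbitrary threshold).
fake_lookup() {
    if [ "$1" -le 7 ]; then echo "abc123"; fi
}

count=10
id=""
while [ "$count" -ne 0 ]; do
    id=$(fake_lookup "$count")
    [ -n "$id" ] && break        # found it: stop polling
    count=$((count - 1))         # otherwise burn one retry
done
echo "id=$id count=$count"       # id=abc123 count=7
```

In the real script each empty iteration would also `sleep` before retrying, and count reaching 0 would be the error path.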

# We need to wait because it may take some time for the socket to actually exist
COUNT=10
# Wait and ensure the socket exists after restarting the daemon
{% if not containerized_deployment -%}
Member

I'm not a big fan of reversed conditions, so perhaps we should replace this with:

{% if containerized_deployment -%}
...
{% else %}
...
{% endif %}

@guits (Collaborator, author)

OK, I switched.

{% endif %}
SOCKET=/var/run/ceph/test-osd.${osd_id}.asok
while [ $COUNT -ne 0 ]; do
# $docker_exec bash -c "test -S $SOCKET" && check_pgs && continue 2
Member

leftover?

@guits (Collaborator, author)

yes

@guits guits force-pushed the refact branch 7 times, most recently from f1228a2 to 0b28ca1 Compare August 2, 2017 11:51
Merge `ceph-docker-common` and `ceph-common` default vars into the
`ceph-defaults` role.
Remove redundant variable declarations in the `ceph-mon` and `ceph-osd` roles.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Add a new role, `ceph-defaults`.
This role aims to hold all default vars for `ceph-docker-common` and
`ceph-common` and to set basic facts (e.g. `fsid`).

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
@andrewschoen (Contributor) left a comment

I think this is looking good now. I do agree that it needs some documentation, though. Before we merge, should we create an issue or something so people are aware of the changes being made here? There will be implications for ansible-galaxy users at the least.


roles:
- ceph-defaults
- ceph.ceph-common
Contributor

Should this be just ceph-common?

@guits (Collaborator, author)

I modified it

guits and others added 6 commits August 2, 2017 17:12
This will give us more flexibility and avoid a lot of useless `when` conditions
used to skip all tasks from an undesired role.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Move the `fsid`, `monitor_name`, `docker_exec_cmd` and `ceph_release` set_fact
tasks to the `ceph-defaults` role.
This allows these facts to be reused without having to play `ceph-common`
or `ceph-docker-common`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Until now, there were no handlers for containerized deployments.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
In a containerized deployment, if you try to update your `ceph.conf` file,
it won't actually be updated on your nodes, because it is overwritten by
the copy of the file present in your fetch directory.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
@guits guits changed the title DNM: Refact Refact playbook Aug 2, 2017
@leseb leseb merged commit 6e37915 into master Aug 2, 2017
@leseb leseb deleted the refact branch August 2, 2017 20:01
openstack-gerrit pushed a commit to openstack/openstack-ansible that referenced this pull request Aug 3, 2017
A massive refactor of ceph-ansible merged about 18 hours ago in
ceph/ceph-ansible#1727, which breaks our
ceph gate. A patch already exists[1] which will adjust our ceph
deployment according to the upstream changes, however there is some
work to be done upstream[2] before our fix can merge. Until then,
to keep the OSA master gate unblocked, we should pin the ceph-ansible
roles immediately prior to the refactor commits.

[1] https://review.openstack.org/#/c/490476/
[2] ceph/ceph-ansible#1740

Change-Id: Ie96984d207163404bb7f98cc10aa02a3035a362c
@@ -93,10 +88,24 @@
# Hard code this so we will skip the entire file instead of individual tasks (Default isn't Consistent)
static: False

- include: facts.yml
- include: ./checks/check_socket.yml
- name: get ceph version

@dillaman

This fails on nodes that have roles like "rbd_mirror" since the "ceph" command hasn't been installed yet

@guits (Collaborator, author)

@dillaman this task was already executed in ceph-common before this refactor. Do you mean you didn't hit this issue before this commit?

@dillaman

Hmm, looks like it never would have passed since PR #737 (which was merged long before support for rbd-mirror). Apparently the installation of rbd-mirror never worked.

@guits (Collaborator, author)

I might have missed something, but at the task you highlighted here you should have the ceph command available; the tasks that install the ceph packages are executed earlier in that same role. I'll investigate this.

@dillaman

If you have a VM/docker container that is just running the rbd-mirror role, nothing is installed at this point (at least on CentOS). There are some hooks to install ceph-base on CentOS if the client role is assigned to the host, so I'd imagine that should be expanded to cover any host where ceph-ansible is running.

openstack-gerrit pushed a commit to openstack/openstack-ansible that referenced this pull request Sep 19, 2017
ceph-ansible has undergone significant refactoring in their v3
development, adding new required roles such as ceph-defaults and
ceph-config, which are used to provide vars and configuration to
the ceph service roles. These roles must be executed before the
service roles to avoid missing vars[1][2].

During the v3 refactoring, ceph-common was removed as a galaxy-style
role dependency in the service roles meta files[3]. This means we
will need to explicitly execute ceph-common from now on also.

This change adds the defaults and config roles and executes them.
Also some minor cleanup such as alphabetizing the OpenStack roles
list is done.

Also added is an upgrade playbook, reno, and docs to assist in
cleaning up the older galaxy-named ceph common roles which are
no longer galaxy namespaced in our cloning configuration.

[1] http://jenkins-logs.objects-us-dfw-1.cloud.lstn.net/osa-ci/490192/1/21/logs/console.log
[2] ceph/ceph-ansible#1737
[3] ceph/ceph-ansible#1727

Change-Id: Ia8c0cb0a23f331fce7914afbfc05ef54ee3ffb0e
openstack-gerrit pushed a commit to openstack/openstack-ansible that referenced this pull request Oct 25, 2017
Conflicts:
  ansible-role-requirements.yml

Change-Id: Ia8c0cb0a23f331fce7914afbfc05ef54ee3ffb0e
(cherry picked from commit a53f1ae)