mgr/ansible: Ansible orchestrator module #24445

jmolmo · 2018-10-05T10:38:15Z

A Ceph Manager Orchestrator that uses an external REST API service to execute Ansible playbooks.
Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>

A first running version of this orchestrator manager module.

There is still a lot to do, but this allows us to start getting feedback from the community.

Only manual tests were run:

enable/disable module and check logs
test command line (get inventory)

Details

[root@ceph build]# ./bin/ceph mgr module ls
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2018-10-04 13:09:12.430 7fa64be26700 -1 WARNING: all dangerous and experimental features are enabled.
2018-10-04 13:09:12.499 7fa64be26700 -1 WARNING: all dangerous and experimental features are enabled.
{
    "enabled_modules": [
        "balancer",
        "dashboard",
        "devicehealth",
        "iostat",
        "prometheus",
        "restful",
        "status"
    ],
    "disabled_modules": [
        {
            "name": "ansible_orchestrator",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "diskprediction",
            ...
------------------------------------------------------------------------------------------------------------
[root@ceph build]# ./bin/ceph mgr module enable ansible_orchestrator
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2018-10-04 13:10:34.121 7f91d0c71700 -1 WARNING: all dangerous and experimental features are enabled.
2018-10-04 13:10:34.237 7f91d0c71700 -1 WARNING: all dangerous and experimental features are enabled.

------------------------------------------------------------------------------------------------------------
Ceph Mgr logs:
2018-10-04 13:10:34.638 7ff6ecc646c0  1 mgr[py] Loading python module 'ansible_orchestrator'
2018-10-04 13:10:34.663 7ff6ecc646c0  4 mgr[py] load_subclass_of: found class: 'ansible_orchestrator.Module'
2018-10-04 13:10:34.663 7ff6ecc646c0  4 mgr[py] Standby mode not provided by module 'ansible_orchestrator'
2018-10-04 13:10:35.608 7ff6c869b700  4 mgr[py] Starting ansible_orchestrator
2018-10-04 13:10:35.608 7ff6c869b700  1 mgr load Constructed class from module: ansible_orchestrator
2018-10-04 13:10:35.608 7ff6c869b700  4 mgr start_one Starting thread for ansible_orchestrator
2018-10-04 13:10:35.609 7ff6c6697700  4 mgr entry Entering thread for ansible_orchestrator
2018-10-04 13:10:35.609 7ff6c6697700  4 mgr[ansible_orchestrator] Starting Ansible Orchestrator module ...
2018-10-04 13:10:35.609 7ff6c6697700  4 mgr[ansible_orchestrator] No pending operations
2018-10-04 13:10:45.609 7ff6c6697700  4 mgr[ansible_orchestrator] No pending operations
....

------------------------------------------------------------------------------------------------------------

[root@ceph build]# ./bin/ceph  inventory
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2018-10-04 13:11:36.733 7f7247aeb700 -1 WARNING: all dangerous and experimental features are enabled.
2018-10-04 13:11:36.794 7f7247aeb700 -1 WARNING: all dangerous and experimental features are enabled.
Textual result of the playbook execution
<<<Textual result of the playbook execution>>>

doc/mgr/ansible_orchestrator.rst

src/pybind/mgr/ansible_orchestrator/ansible_runner_svc.py

jcsp · 2018-10-11T13:24:22Z

With this model of each operation being an individual playbook execution, I'm having trouble seeing how it will re-use ceph-ansible itself. In ceph-ansible users edit their hosts file to define services + optionally devices for OSDs, and then run site.yaml -- i.e. there is one big ansible execution that updates everything in the system to the desired state, rather than lots of little playbooks being run. @leseb can jump in if I'm reading the docs wrong on this.

One of the purposes of the wait() function is to enable this kind of use case: individual operations can update the hosts file and return completions, but the actual playbook execution wouldn't happen until someone called wait on those completions.

So I guess the key question is: is this module intending to re-use ceph-ansible, or is it intended to write some other set of playbooks? And if it will re-use ceph-ansible, then how will that work?

jmolmo · 2018-10-11T18:25:51Z

Thank you for your comments John, your help with the "orientation" and "concepts" will be very much appreciated. :-)

With this model of each operation being an individual playbook execution, I'm having trouble seeing how it will re-use ceph-ansible itself. In ceph-ansible users edit their hosts file to define services + optionally devices for OSDs, and then run site.yaml -- i.e. there is one big ansible execution that updates everything in the system to the desired state, rather than lots of little playbooks being run. @leseb can jump in if I'm reading the docs wrong on this.

I think that the model follows what the Ceph Mgr Orchestrator dictates.
The Orchestrator will have a set of operations, reachable through the Orchestrator API. These operations will be executed as Ansible playbook executions (using the Ansible Runner service as the provider of these operations).

The Ansible Runner service is just a back end that eases the execution of playbooks over a set of hosts previously provisioned in the same service. The user first has to provision the hosts and host groups. Once they are provisioned in the service, the user can execute playbooks over these hosts or host groups.
So the "magnitude" of the operation depends on the tasks implemented in the playbook(s) called by the Orchestrator API endpoint.

Example:
If the Orchestrator has a "get_inventory" method, then we will use an "inventory" playbook over the hosts provisioned.
If the Orchestrator has a "build site" method, then we will use a "site.yaml" playbook over the hosts provisioned.

Therefore, it is really each of the Orchestrator API endpoints that defines the "magnitude" of each task (it will even be possible to implement one Orchestrator API endpoint using several different playbook executions).

One of the purposes of the wait() function is to enable this kind of use case: individual operations can update the hosts file and return completions, but the actual playbook execution wouldn't happen until someone called wait on those completions.

Maybe I do not understand the documentation well, and I have implemented this in the wrong way. What we have in the documentation is:
"All methods that read or modify the state of the system can potentially be long running. To handle that, all such methods return a completion object (a ReadCompletion or a WriteCompletion). Orchestrator modules must implement the wait method: this takes a list of completions, and is responsible for checking if they’re finished, and advancing the underlying operations as needed."

So this is what I understood (and I have driven my implementation in this direction):

Orchestrator methods launch operations and return completions objects where we can check the operation status. (so playbook execution starts here)
The wait method checks for finished operations (and cleans them up) and advances operations. In our case the completion objects basically represent playbook executions, so once the playbook is launched the "wait" method can do nothing except check whether the execution has finished. The completion object is responsible for checking the operation status and updating this information (so the "wait" method is only able to clean up finished operations).

About "individual operations can update the hosts file"...

I think we are not aligned on this. Let me explain how the Ansible Runner Service works.

In the Ansible Runner Service it is the User who provides the "inventory" of hosts to the Service. Although it is possible to manage the "inventory", our assumption is that, for the moment, only the User can say which hosts are in the cluster and what the function of each one is.

So... although we can add/remove hosts(groups) from the Orchestrator, i think that it will be difficult to know what we have to do ... Can you explain with more detail your idea/assumption?

So I guess the key question is: is this module intending to re-use ceph-ansible, or is it intended to write some other set of playbooks?

The module is intended to execute Ansible playbooks. Most of the functionality we need is in the ceph-ansible playbooks, so we will try to reuse it.
For example...

The "get_inventory" method needs a completely new playbook to obtain the information (this is not available in ceph-ansible; in general, "discovery" playbooks are not present).
The "create_osd" method will use the playbooks available in the ceph-ansible repo.

And if it will re-use ceph-ansible, then how will that work?

An Orchestrator method will be called and will create a new completion object that is responsible for the execution of one or more playbooks. This completion object will be returned to the caller. The caller will use the "status" and "result" attributes of the completion object to get the information required and to know whether the operation has been executed successfully.

I expect a high degree of ceph-ansible functionality reuse, because I think that most of the operations needed in the Orchestrator are already covered by the current ceph-ansible playbooks.

I think that the big challenge here is to create an Orchestrator API that provides a very easy way to manage all Ceph cluster operations.

jcsp · 2018-10-12T12:36:20Z

If the Orchestrator has a "build site" method, then we will use a "site.yaml" playbook over the hosts provisioned.

If you do a bunch of operations, and then call wait(), then the wait() method is essentially your "build site" method. If I can use a Star Trek analogy... think of all the normal operations (like creating an OSD) as Captain Picard giving his orders, and then the wait() method as him saying "Make it so!".

I am not trying to say that every module (or even this module) has to work that way, just that the interface is designed so that it's a possibility (i.e. having a wait() implementation that essentially updates the Ansible inventory/config, and runs site.yaml).

Orchestrator methods launch operations and return completions objects where we can check the operation status. (so playbook execution starts here)

The "playbook execution starts here" part is not a requirement of the interface. This is a key point. Nothing requires or promises that operations will begin at the point the completion object is constructed -- they don't have to advance at all until someone calls wait().
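The deferred model described above can be sketched roughly as follows. This is only an illustration, not the orchestrator interface itself: `DeferredCompletion`, `SketchOrchestrator`, and `run_site_playbook` are hypothetical names, and the playbook run is replaced by a counter so the example needs no Ansible installation.

```python
class DeferredCompletion:
    """Hypothetical completion: records intent, does no work when created."""

    def __init__(self, description):
        self.description = description
        self.is_complete = False


class SketchOrchestrator:
    """Hypothetical module: operations only queue changes; wait() applies them."""

    def __init__(self):
        self.inventory = {}  # host -> devices, stands in for the hosts file
        self.runs = 0        # how many times the pretend playbook was run

    def create_osd(self, host, device):
        # Only the desired state is updated here; nothing is executed yet.
        self.inventory.setdefault(host, []).append(device)
        return DeferredCompletion("create osd on %s:%s" % (host, device))

    def run_site_playbook(self):
        # Placeholder for a single large playbook execution.
        self.runs += 1

    def wait(self, completions):
        # One execution covers every pending completion passed in.
        pending = [c for c in completions if not c.is_complete]
        if pending:
            self.run_site_playbook()
            for completion in pending:
                completion.is_complete = True
        return all(c.is_complete for c in completions)


orch = SketchOrchestrator()
c1 = orch.create_osd("osd0", "sdc")
c2 = orch.create_osd("osd1", "sdb")
runs_before_wait = orch.runs      # still zero: the orders were only recorded
all_done = orch.wait([c1, c2])    # "Make it so": one run for both orders
print(runs_before_wait, all_done, orch.runs)
```

In this sketch two operations result in a single execution, which is the property that would allow a module to batch changes into one site-wide run if it chose to.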

So... although we can add/remove hosts(groups) from the Orchestrator, i think that it will be difficult to know what we have to do ... Can you explain with more detail your idea/assumption?

I'm looking at the current workflow in ceph-ansible, where the way to create an OSD is to put a host in the [osd] section with a devices= line (I hope we have all read https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/installation_guide_for_red_hat_enterprise_linux/deploying-red-hat-ceph-storage)

jmolmo · 2018-10-15T10:54:38Z

If I can use a Star Trek analogy... think of all the normal operations (like creating an OSD) as Captain Picard giving his orders, and then the wait() method as him saying "Make it so!".

Thanks for the analogy! Now I understand the aim of the design better (the "wait" method name did not help much, although the documentation clearly defines what you point out). OK, I will modify my implementation so that the wait method is the engine that makes the operations in completion objects progress.

I'm looking at the current workflow in ceph-ansible, where the way to create an OSD is to put a host in the [osd] section with a devices= line (I hope we have all read https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/installation_guide_for_red_hat_enterprise_linux/deploying-red-hat-ceph-storage)

Using the Ansible Runner Service, the User does not directly have an inventory file, although the User can manage the groups and the hosts in each group using the REST service.
The User can execute any of the loaded playbooks over any of the hosts using the right parameters.

The basic flow of operations in Ansible Runner Service is:

The User must add the cluster hosts to the Ansible Runner Service. The User should install the public key of the Ansible Runner Service on each of these hosts in order to allow passwordless SSH access.
The User executes a playbook over one of the hosts or host groups, providing the right parameters to the playbook.

Taking this into account, we can use the orchestrator to provide the User with a set of higher-level, easier operations, for example OSD management such as creating and replacing OSDs.

Maybe these differences from the "Ceph Ansible way of working" can clarify the behavior of the Ansible Orchestrator:

The Ansible Runner Service has an internal list of nodes and groups of nodes, while in Ceph Ansible the inventory of nodes is kept in a file.
In Ceph Ansible you can have a cluster playbook where you define the composition and features of your cluster. With the Ansible Runner Service you don't have this kind of file; what you have is the possibility of executing any of the role playbooks over the provisioned set of hosts.
(This does not imply that we can implement this feature.)
I think that the real power of the Orchestrator is more like a day 2 or 3 tool, it is not intended to install the whole cluster (although it can do so); it is aimed at easing any kind of management operation over an installed cluster.

jmolmo · 2018-10-15T11:03:08Z

jenkins retest this please

jcsp · 2018-10-15T12:14:11Z

I don't think any limitations of ansible-runner-service are important -- it's brand new unreleased code, so it can be changed however is necessary. If it needs an extension to its API to define inventories in the ceph-ansible style, then that shouldn't be hard.

I think that the real power of the Orchestrator is more like a day 2 or 3 tool, it is not intended to install the whole cluster

The orchestrator interface absolutely is intended for installation of all the Ceph services apart from the initial mon and mgr services. Mons and managers require little or no configuration or decision making (it's easy to set them up with a simple CLI tool), whereas OSDs require a guided process to select how devices should be used (a GUI is strongly preferred), so it makes sense to ensure that the OSD installation part of the process happens in the Ceph dashboard.

There is no meaningful separation between "day 1" and "day 2" when it comes to Ceph OSDs, because part of the ongoing lifecycle of a Ceph cluster is adding new OSDs (as the cluster grows, as drives fail).

jmolmo · 2018-10-17T20:36:47Z

Manual test

test lab used:

Three vagrant vm machines (mon0, mgr0, osd0) with:
ceph version 14.0.0-4023-gd03a830 (d03a830) nautilus (dev)
A container with the latest version of the Ansible Runner REST Service

Operations

Disable modules

[root@mon0 ~]# ceph mgr module disable ansible_orchestrator
[root@mon0 ~]# ceph mgr module disable orchestrator_cli

Enable modules

[root@mon0 ~]# ceph mgr module enable ansible_orchestrator
[root@mon0 ~]# ceph mgr module enable orchestrator_cli

Set ansible_orchestrator as backend of orchestrator-cli

[root@mon0 ~]# ceph orchestrator set backend ansible_orchestrator
[root@mon0 ~]#  ceph orchestrator status
Backend: ansible_orchestrator
Available: True

Get cluster nodes and free devices

[root@mon0 ~]# ceph orchestrator device ls
192.168.121.245:
192.168.121.61:
192.168.121.254:
  sdc (hdd, 53687091200b)

Checking the devices availability in osd0 (192.168.121.254):

[vagrant@osd0 ~]$ lsblk
NAME                                                                                                               MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                                                                                                  8:0    0   50G  0 disk 
├─ceph--filestore--d251cce4--e04f--4ea2--ba9e--b1afacc3797e-osd--data--c1ee62a2--f7da--49c7--bc99--a5b249402adc    253:4    0   47G  0 lvm  /var/lib/ceph/osd/ceph-0
└─ceph--filestore--d251cce4--e04f--4ea2--ba9e--b1afacc3797e-osd--journal--af4f2e06--d670--49bc--8d11--803916bbc858 253:5    0    2G  0 lvm  
sdb                                                                                                                  8:16   0   50G  0 disk 
├─ceph--filestore--c36f9d1a--c0e8--4e03--b27f--86049dc895c4-osd--data--8b2643e6--e8f9--4783--96eb--0ec241b5df70    253:2    0   47G  0 lvm  /var/lib/ceph/osd/ceph-1
└─ceph--filestore--c36f9d1a--c0e8--4e03--b27f--86049dc895c4-osd--journal--c90b5bd0--33fe--4b88--ac2a--5788be9de990 253:3    0    2G  0 lvm  
sdc                                                                                                                  8:32   0   50G  0 disk 
vda                                                                                                                252:0    0   41G  0 disk 
├─vda1                                                                                                             252:1    0    1M  0 part 
├─vda2                                                                                                             252:2    0    1G  0 part /boot
└─vda3                                                                                                             252:3    0   39G  0 part 
  ├─VolGroup00-LogVol00                                                                                            253:0    0 37.5G  0 lvm  /
  └─VolGroup00-LogVol01                                                                                            253:1    0  1.5G  0 lvm  [SWAP]

Manager logs during the operation

Note: Once the orchestrator modules are enabled, the log level on the manager node needs to be raised in order to get this kind of log output.

[root@mgr0 ansible_orchestrator]# sudo ceph daemon mgr.mgr0 config set debug_mgr 20/5


[root@mgr0 ceph]# tail -f ceph-mgr.mgr0.log  | grep ansible_orchestrator

2018-10-17 17:45:19.695 7f525093b700 10 ceph_config_get orchestrator found: ansible_orchestrator
2018-10-17 17:45:19.695 7f525093b700 20 mgr dispatch_remote Calling ansible_orchestrator.get_inventory...
2018-10-17 17:45:19.704 7f525093b700  0 mgr[ansible_orchestrator] http POST https://192.168.121.1:5001/api/v1/playbooks/probe-disks.yml [{}] <--> (202 - ACCEPTED)
2018-10-17 17:45:19.704 7f525093b700  4 mgr[ansible_orchestrator] Playbook execution launched succesfuly
2018-10-17 17:45:19.704 7f525093b700 10 ceph_config_get orchestrator found: ansible_orchestrator
2018-10-17 17:45:19.704 7f525093b700 20 mgr dispatch_remote Calling ansible_orchestrator.wait...
2018-10-17 17:45:19.715 7f525093b700  4 mgr[ansible_orchestrator] http GET https://192.168.121.1:5001/api/v1/playbooks/6a388e60-d234-11e8-a922-2016b900e38f <--> (200 - {
2018-10-17 17:45:19.715 7f525093b700  4 mgr[ansible_orchestrator] Requested playbook execution status is: 2
2018-10-17 17:45:19.715 7f525093b700  4 mgr[ansible_orchestrator] playbook <probe-disks.yml> status:2
2018-10-17 17:45:19.715 7f525093b700  4 mgr[ansible_orchestrator] Operations pending: 1
2018-10-17 17:45:24.720 7f525093b700 10 ceph_config_get orchestrator found: ansible_orchestrator
2018-10-17 17:45:24.720 7f525093b700 20 mgr dispatch_remote Calling ansible_orchestrator.wait...
2018-10-17 17:45:24.735 7f525093b700  4 mgr[ansible_orchestrator] http GET https://192.168.121.1:5001/api/v1/playbooks/6a388e60-d234-11e8-a922-2016b900e38f <--> (200 - {
2018-10-17 17:45:24.735 7f525093b700  4 mgr[ansible_orchestrator] Requested playbook execution status is: 2
2018-10-17 17:45:24.735 7f525093b700  4 mgr[ansible_orchestrator] playbook <probe-disks.yml> status:2
2018-10-17 17:45:24.735 7f525093b700  4 mgr[ansible_orchestrator] Operations pending: 1
2018-10-17 17:45:29.741 7f525093b700 10 ceph_config_get orchestrator found: ansible_orchestrator
2018-10-17 17:45:29.741 7f525093b700 20 mgr dispatch_remote Calling ansible_orchestrator.wait...
2018-10-17 17:45:29.748 7f525093b700  4 mgr[ansible_orchestrator] http GET https://192.168.121.1:5001/api/v1/playbooks/6a388e60-d234-11e8-a922-2016b900e38f <--> (200 - {
2018-10-17 17:45:29.748 7f525093b700  4 mgr[ansible_orchestrator] Requested playbook execution status is: 2
2018-10-17 17:45:29.748 7f525093b700  4 mgr[ansible_orchestrator] playbook <probe-disks.yml> status:2
2018-10-17 17:45:29.748 7f525093b700  4 mgr[ansible_orchestrator] Operations pending: 1
2018-10-17 17:45:34.754 7f525093b700 10 ceph_config_get orchestrator found: ansible_orchestrator
2018-10-17 17:45:34.754 7f525093b700 20 mgr dispatch_remote Calling ansible_orchestrator.wait...
2018-10-17 17:45:34.767 7f525093b700  4 mgr[ansible_orchestrator] http GET https://192.168.121.1:5001/api/v1/playbooks/6a388e60-d234-11e8-a922-2016b900e38f <--> (200 - {
2018-10-17 17:45:34.767 7f525093b700  4 mgr[ansible_orchestrator] Requested playbook execution status is: 0
2018-10-17 17:45:34.767 7f525093b700  4 mgr[ansible_orchestrator] playbook <probe-disks.yml> status:0
2018-10-17 17:45:34.793 7f525093b700  4 mgr[ansible_orchestrator] http GET https://192.168.121.1:5001/api/v1/jobs/6a388e60-d234-11e8-a922-2016b900e38f/events <--> (200 - {
2018-10-17 17:45:34.794 7f525093b700  4 mgr[ansible_orchestrator] Requested playbook result is: {"37-63977577-38d7-4a3a-ad59-b451ae59a56b": {"host": "192.168.121.254", "task": "RESULTS", "event": "runner_on_ok"}}
2018-10-17 17:45:34.800 7f525093b700  4 mgr[ansible_orchestrator] http GET https://192.168.121.1:5001/api/v1/jobs/6a388e60-d234-11e8-a922-2016b900e38f/events/37-63977577-38d7-4a3a-ad59-b451ae59a56b <--> (200 - {
2018-10-17 17:45:34.800 7f525093b700  4 mgr[ansible_orchestrator] Operations pending: 0

sebastian-philipp · 2018-10-19T10:23:32Z

src/pybind/mgr/ansible_orchestrator/ansible_runner_svc.py

+
+            response = r
+
+        except Exception as ex:


Catching Exception is a bit broad. Do you want to catch more than requests.exceptions.RequestException here?

No special action will be taken in case of any kind of error here, so in my opinion differentiating the errors does not add much value.
In any case, following your advice, I changed the log method to "exception" in order to have more information about the context of the error.

The problem with just logging exceptions is that the ceph-mgr log is not visible to users unless they go and poke around on the node where it's running. To actually make an error visible to users, it's better to let the exception surface.

In other words, if there's no special action to take in the case of the exception, then don't catch it. In the case of login(), it makes sense to have the caller catch exceptions, rather than to catch them inside and then have the caller check is_operable -- that way the caller can see the actual exception, and surface it to the user (e.g. via a health check) if they choose to.
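The suggestion above can be illustrated with a small sketch. `RunnerClient` and `serve` are hypothetical stand-ins, not the module's real classes, and the built-in `ConnectionError` is used in place of the `requests` exception so the example runs without a network or third-party packages.

```python
class RunnerClient:
    """Hypothetical stand-in for the REST client; no network access is made."""

    def __init__(self, available):
        self.available = available

    def login(self):
        # No try/except here: a failure propagates to the caller unchanged.
        if not self.available:
            raise ConnectionError("Connection refused")
        return "token"


def serve(client):
    """Hypothetical caller: decides how the failure is reported."""
    try:
        client.login()
    except ConnectionError as ex:
        # The caller sees the real exception and can surface it,
        # for example as a health warning visible to users.
        return {"available": False, "detail": str(ex)}
    return {"available": True, "detail": ""}


print(serve(RunnerClient(available=False)))
print(serve(RunnerClient(available=True)))
```

Because `login()` does not swallow the error, the caller has the original message available and does not need a separate flag such as `is_operable` to find out that something went wrong.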

I have checked the behavior of the module with the login implementation with and without error management.
In this case (login), as John points out, it seems sensible to move the error management to the caller
(not much difference, but the error message and stack trace are clearer because the explanation appears first).

For the rest of the HTTP methods, I think it is better to leave the error management in the method. If I remove it, a generic error handler will probably have to be implemented at the caller level to avoid repeating the error management in each HTTP call.

Details:

I stopped the Ansible Runner Service to check the error:

When I try to log in again (error management in the login method):

2018-10-24 14:38:40.301 7fa08c337700 0 mgr[ansible_orchestrator] Ansible runner service - Unexpected error
Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/ansible_orchestrator/ansible_runner_svc.py", line 171, in login
    verify = self.certificate)
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 68, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 464, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 415, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))
2018-10-24 14:38:40.301 7fa08c337700 0 mgr[ansible_orchestrator] Ansible Runner Service not available. Check external server status or connection options. If configuration options changed try to disable/enable the module.

When I try to log in again (login method without error management; it is implemented in the caller; this is the current version):

2018-10-24 15:05:31.908 7f1056d41700 0 mgr[ansible_orchestrator] Ansible Runner Service not available. Check external server status or connection options. If configuration options changed try to disable/enable the module.
Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/ansible_orchestrator/module.py", line 250, in serve
    logger = self.log)
  File "/usr/lib64/ceph/mgr/ansible_orchestrator/ansible_runner_svc.py", line 159, in __init__
    self.login()
  File "/usr/lib64/ceph/mgr/ansible_orchestrator/ansible_runner_svc.py", line 171, in login
    verify = self.certificate)
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 68, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 464, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 415, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))

sebastian-philipp · 2018-10-19T10:24:43Z

src/pybind/mgr/ansible_orchestrator/ansible_runner_svc.py

+            response = r
+
+        except Exception as ex:
+            self.log.error("Ansible runner service - Unexpected error: %s", ex)


If you call log.exception, it will also print the stack trace.

self.log.exception("Ansible runner service")
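The difference can be seen with the standard library alone. The sketch below assumes the module's log object behaves like a standard Python `logging.Logger`; the logger name `sketch` and the in-memory stream are only there to make the output easy to inspect.

```python
import io
import logging

# Capture log output in memory so both calls can be compared.
stream = io.StringIO()
log = logging.getLogger("sketch")
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))

try:
    raise ValueError("boom")
except ValueError as ex:
    # error() records only the formatted message.
    log.error("with error(): %s", ex)
    error_output = stream.getvalue()
    # exception() logs at ERROR level and appends the current traceback.
    log.exception("with exception()")

full_output = stream.getvalue()
print(full_output)
```

Note that `exception()` is meant to be called from inside an `except` block, since it relies on the exception currently being handled.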

sebastian-philipp · 2018-10-19T10:27:00Z

src/pybind/mgr/ansible_orchestrator/ansible_runner_svc.py

+        @returns: A requests object
+        """
+        # TODO
+        pass


Suggested change

pass

raise NotImplementedError("TODO")

?

sebastian-philipp · 2018-10-19T10:28:58Z

src/pybind/mgr/ansible_orchestrator/module.py

+    inventory_nodes = []
+
+    # Loop over the result events and request the event data
+    for event_key, data in inventory_events.iteritems():


iteritems is not supported in Python 3

Removed! Thanks!
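For reference, a loop of this shape can be written with `dict.items()`, which exists in both Python 2 and Python 3 (in Python 2 it builds a list rather than an iterator, which rarely matters for small dictionaries). The event data below is made up for illustration and only mimics the shape seen in the manager log.

```python
# Illustrative data only; keys and values are not taken from a real run.
inventory_events = {
    "event-1": {"host": "192.168.121.254", "task": "RESULTS"},
}

hosts = []
# items() replaces the Python 2 only iteritems().
for event_key, data in inventory_events.items():
    hosts.append((event_key, data["host"]))

print(hosts)
```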

sebastian-philipp · 2018-10-19T10:29:24Z

src/pybind/mgr/ansible_orchestrator/module.py

+            event_data = json.loads(event_response.text)["data"]["event_data"]
+
+            free_disks = event_data["res"]["free_disks"]
+            for item, data in free_disks.iteritems():


iteritems see above

sebastian-philipp · 2018-10-19T10:29:54Z

src/pybind/mgr/ansible_orchestrator/module.py

+                if item not in [host.name for host in inventory_nodes]:
+
+                    devs = []
+                    for dev_key, dev_data in data.iteritems():


iteritems see above

leseb · 2018-10-19T13:47:11Z

src/pybind/mgr/ansible_orchestrator/module.py

+# Auxiliary functions
+#==============================================================================
+
+def process_inventary_json(inventory_events, ar_client, playbook_uuid):


Suggested change

def process_inventary_json(inventory_events, ar_client, playbook_uuid):

def process_inventory_json(inventory_events, ar_client, playbook_uuid):

Fixed! Thanks!

leseb · 2018-10-19T13:50:33Z

src/pybind/mgr/ansible_orchestrator/module.py

+                                                 params = "{}")
+
+        # Assing the process_output function
+        ansible_operation.process_output = process_inventary_json


Suggested change

ansible_operation.process_output = process_inventary_json

ansible_operation.process_output = process_inventory_json

leseb · 2018-10-19T13:51:27Z

src/pybind/mgr/ansible_orchestrator/module.py

+
+
+# List of playbooks names used
+GET_INVENTORY_PLAYBOOK = "probe-disks.yml"


can you share this file as an example?

The playbook I'm using at the moment only provides "free disks". In any case it was good enough to be used as a base for implementing the "get_inventory" method. I'm modifying it in order to get a list of all the devices (if you know of another playbook with this function, it will be welcome).

[jolmomar@localhost tmp]$ cat probe-disks.yml
---
#
# Playbook to scan a set of hosts and return a dict indexed by host containing
# a list of disks that are unused. Each disk is represented by a dict with the
# following fields;
#
# size_txt (str) e.g 10.0GB
# size_bytes (int) e.g. 21474836480
# sectorsize (int) e.g. 512
# sectors (int) e.g 41943040
#
# example output;
# ok: [con-1 -> 127.0.0.1] => {
#     "free_disks": {
#         "con-1": {
#             "vdd": {
#                 "rotational": true,
#                 "sectors": 41943040,
#                 "sectorsize": 512,
#                 "size_bytes": 21474836480,
#                 "size_txt": "20.00 GB"
#             }
#         },

- name: probe hosts for free disks
  hosts:
    - osds
    - mgrs
    - mons
  vars:
    free_disks: |
      {%- set disk_table = dict() %}
      {%- for host in play_hosts %}
      {%- set _x = disk_table.__setitem__(host, {}) %}
      {%- set _devdata = dict() %}
      {%- for disk in hostvars[host].host_disk %}
      {%- set _meta = hostvars[host]['ansible_devices'][disk] %}
      {%- set _x = _devdata.__setitem__(disk, dict(size_txt=_meta['size'], rotational=_meta['rotational']|bool, sectors=_meta['sectors']|int, sectorsize=_meta['sectorsize']|int, size_bytes=_meta['sectors']|int * _meta['sectorsize']|int)) %}
      {%- endfor %}
      {%- set _x = disk_table.__setitem__(host, _devdata) %}
      {%- endfor %}
      {{ disk_table }}
  gather_facts: true

  tasks:
    - name: setup
      set_fact:
        host_disk: []

    - name: Get a list of block devices (excludes loop and child devices)
      command: lsblk -n --o NAME --nodeps --exclude 7
      register: lsblk_out

    - name: check if disk {{ item }} is free
      command: pvcreate --test /dev/{{ item }}
      ignore_errors: true
      register: pv_status
      with_items: "{{lsblk_out.stdout_lines}}"

    - name: Update hosts freedisk list
      set_fact:
        host_disk: "{{host_disk + [item.item]}}"
      ignore_errors: true
      when: item.rc == 0
      with_items: "{{ pv_status.results}}"

    - name: RESULTS
      debug:
        var: free_disks
      delegate_to: 127.0.0.1
      run_once: True

Changed the name of the playbook to "host-disks.yml"

leseb · 2018-10-19T13:54:29Z

src/pybind/mgr/ansible_orchestrator/module.py

+
+
+# List of playbooks names used
+GET_INVENTORY_PLAYBOOK = "probe-disks.yml"


Also the name is weird, you call it GET_INVENTORY_PLAYBOOK but is this the playbook or the inventory? It's confusing. Based on https://github.com/ceph/ceph/pull/24445/files#diff-5940840b32ed5f2781084ee6e9a7d408R270 we would think it's the playbook but https://github.com/ceph/ceph/pull/24445/files#diff-5940840b32ed5f2781084ee6e9a7d408R261 indicates an inventory...

This constant holds the name of the playbook used to retrieve the list of storage devices present on the hosts targeted by the playbook run.

As you pointed out, the name of the playbook is weird...
I will change it. I think that "get_storage_devices.yml" is more understandable.

In the end, changed to "host-disks.yml".

leseb · 2018-10-19T13:57:57Z

src/pybind/mgr/ansible_orchestrator/module.py

+
+        # Create a new read completion object for execute the playbook
+        ansible_operation = AnsibleReadOperation(client = self.ar_client,
+                                                 playbook = GET_INVENTORY_PLAYBOOK,


playbook is confusing if we are actually calling an inventory

I would rather leave the name of the constant unchanged, but I will add the following comment above its definition:
Name of the playbook used in the "get_inventory" method. This playbook is expected to provide a list of storage devices in the host where the playbook is executed.

jcsp · 2018-10-22T11:44:13Z

doc/mgr/ansible_orchestrator.rst

+.. _ansible-orchestrator-module:
+
+====================
+Ansible Orchestrator


I'd suggest sticking with plain ansible as the name. If we find it becomes necessary to highlight/identify which modules are orchestrator modules, we should do that programmatically rather than with long names.

(I mean the actual python module name)

Done!
Now it follows the same pattern as the other orchestrator modules, and it is more elegant in commands. Thanks!

jcsp · 2018-10-22T11:45:34Z

src/pybind/mgr/ansible_orchestrator/module.py

+    """
+
+    OPTIONS = [
+        {'name': 'server_addr'},


I'd suggest merging the addr+port settings into a single URL setting.

So that the server's URL is set atomically (i.e. port and hostname together) -- if changing the server's address and port, you don't want to go through an intermediate stage where it's trying to talk to the wrong port on the right hostname, or vice versa.

Ok... but in the orchestrator these values do not take effect when you change them, only when you disable/enable the module (unless we change the implementation to allow "hot" changes of config variables).

Yes, I'm assuming that you would at some point want to improve the connection/authentication stuff so that an authentication error or a config change didn't require a ceph-mgr restart.

A ceph mgr restart is not needed at all to refresh/change configuration values.

The sequence is:

Disable module

Change configuration values as required

Enable module

When the ansible module is enabled, it reads all the configuration values, and once they have been read and validated the module can start to use them. So you can regard reading the configuration values as a "transactional" operation.

Disabling or enabling a module restarts ceph-mgr.

8-| ... I didn't realize that... in fact the service is not restarted, the pid of the binary stays the same ... but internally, as you say, a restart is executed ....
It does not seem very healthy that changing a setting in one module, and making it effective, forces the restart of the other modules ... now I understand your comment:

you would at some point want to improve the connection/authentication stuff so that an authentication error or a config change didn't require a ceph-mgr restart

But this is a problem that affects all the modules...
So I think what we need is basically a method that refreshes the configuration and can be called from the CLI.

I propose:

To add to the MgrModule base class a "refresh_config" method to be overridden by modules:
In this method the module must read all the settings and apply the changes detected.

To implement this method in the Ansible Orchestrator.

To add to the Orchestrator_CLI a new command that calls "refresh_config" in the backend orchestrator.

If you agree... I can do this... I think that this is really 'adding value' ... more than joining or not joining two different settings.

Firstly, I still think that you should store your URL as a URL. Your web browser doesn't have different text boxes for the hostname vs. port, and neither should your settings. Trust me on this. The way to specify the destination for an HTTP connection is to have a URL setting, this isn't controversial.

While some modules would benefit from a notification on changes (and that could be implemented pretty easily from PyModuleRegistry::handle_config calling through to ActivePyModules::notify_all), you don't need it here. Because you're a client rather than a server, you can just look at the configured URL each time you make a request. If it's different from your established client session, just throw away your session and open a new one.

Done. server + port is now one setting: server_url
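
The approach suggested above (look at the configured URL on every request and reopen the session when it differs) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the module's code: `RunnerClient`, `FakeSession`, the `settings` dict and the `/api/v1/playbooks` path are all hypothetical, and the fake session stands in for a real HTTP session so the example runs without a network.

```python
class FakeSession(object):
    """Stand-in for a real HTTP session bound to one base URL (hypothetical)."""

    def __init__(self, base_url):
        self.base_url = base_url

    def get(self, path):
        return "GET {0}{1}".format(self.base_url, path)


class RunnerClient(object):
    """Re-reads the configured URL on each request; reopens the session on change."""

    def __init__(self, get_url, open_session):
        self._get_url = get_url            # callable returning the current setting
        self._open_session = open_session  # callable building a session for a URL
        self._session = None
        self._session_url = None

    def _current_session(self):
        url = self._get_url()
        if self._session is None or url != self._session_url:
            # The setting changed (or this is the first request): drop the old
            # session and open a new one against the new URL.
            self._session = self._open_session(url)
            self._session_url = url
        return self._session

    def request(self, path):
        return self._current_session().get(path)


# Hypothetical settings store; in a mgr module this would be the module config.
settings = {"server_url": "https://ars.example.com:5001"}
client = RunnerClient(lambda: settings["server_url"], FakeSession)

first = client.request("/api/v1/playbooks")
settings["server_url"] = "https://ars2.example.com:5002"
second = client.request("/api/v1/playbooks")
print(first)
print(second)
```

Because the URL is a single value, the client never sees a half-updated host/port pair, and no module restart or change notification is needed for the new value to take effect.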

jcsp · 2018-10-22T11:47:04Z

src/pybind/mgr/ansible_orchestrator/module.py

+        {'name': 'server_port'},
+        {'name': 'username'},
+        {'name': 'password'},
+        {'name': 'certificate'}  # Ansible runner https server certificate file


Is this a filename? Manager modules should not depend on files on local filesystem, rather the certificate should be stored like the username/password (see how this is done for server side certs in dashboard)

Ok. I will check the dashboard implementation and I will follow your advice.
But this is not part of this PR, so I prefer to implement it later. I wouldn't like to pile features on features in an endless PR.

jcsp · 2018-10-22T11:48:24Z

src/pybind/mgr/ansible_orchestrator/module.py

+    OPTIONS = [
+        {'name': 'server_addr'},
+        {'name': 'server_port'},
+        {'name': 'username'},


Does ansible-runner-service actually have/need multi-tenancy (multiple user accounts)? It feels like an unnecessary complexity when the service is just for a single cluster.

This is how Ansible Runner Service works. Even in a single cluster, multiple users with different privileges usually exist.

Some of the following may need input from @pcuzner on the intended security model:

I don't think that ARS has users with different privileges. From reading https://github.com/pcuzner/ansible-runner-service/blob/master/runner_service/controllers/login.py#L46 it seems like they just have a password stored in plain text in their config file, and once you're logged in it doesn't matter what user you are.

Even if ARS did expand its user account concept beyond a dict in the config file, all the ansible playbooks are being run as root out on the cluster nodes, so would it be any meaningful security isolation?

BTW, I also notice that ansible_runner_service has a default crypto secret of "secret" and nothing in the installation instructions about how to set it to something unique per-installation, so hopefully there is a plan for resolving that.

I also don't see any mechanism for revoking JWT tokens, so it seems like even if a user account was removed from the ARS configuration file, login sessions would continue until expiry (default 24 hours).

The username/password handling seems quite superficial, so I'm left wondering why we don't just use the client TLS certificates for authentication. The user/pass stuff seems like it's liable to give a false sense of security -- if the certs are handled properly, they should be enough security.

There aren't users with different privileges in Ansible Runner Service... but users will probably want a certain level of security (what I mean is that not all the users who can access/use the servers should be able to do the same things).

Ansible Runner Service is quite new (like me), so we need a little time for it to be completely functional. :-)
For the moment what we have is the "user login" and the use of tokens... and as you said ... we have several ways to improve this point.

sebastian-philipp · 2018-10-26T13:10:55Z

src/pybind/mgr/ansible/ansible_runner_svc.py

+        # Once authenticated this token will be used in all the requests
+        self.token = ""
+
+        self.server_url = "https://{0}:{1}".format(self.server, self.port)


This may not be IPv6 safe. For simplicity, I'd suggest not assembling URLs by hand.

Changed: Thx!
The server and port settings are now a single setting: server_url

yup
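
To illustrate the IPv6 concern raised above: formatting a bare IPv6 address into "https://{0}:{1}" produces a URL whose host and port can no longer be told apart, while a URL stored whole (with the address in brackets) parses cleanly. The sketch below uses only the Python 3 standard library; the address and port are example values, not taken from the PR.

```python
from urllib.parse import urlsplit

host, port = "2001:db8::1", 5001  # example values (documentation address range)

# Hand-assembled URL, as in the reviewed line: the colons of the IPv6
# address collide with the host/port separator.
naive_url = "https://{0}:{1}".format(host, port)
naive_host = urlsplit(naive_url).hostname

# A single URL setting, written by the operator with brackets around the
# IPv6 literal, keeps host and port unambiguous.
server_url = "https://[2001:db8::1]:5001"
parsed = urlsplit(server_url)

print(naive_url)                     # https://2001:db8::1:5001
print(naive_host)                    # only the first group of the address
print(parsed.hostname, parsed.port)  # full address and the intended port
```

With one `server_url` setting the module does not need to know whether the host is a name, an IPv4 address or an IPv6 literal; it passes the URL through unchanged.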

sebastian-philipp · 2018-11-05T14:29:01Z

src/pybind/mgr/ansible/ansible_runner_svc.py

+
+        # Used to verify or not https server identity
+        if not certificate:
+            self.certificate = False


Can we set the default to True? Disabling HTTPS validation is questionable. This is related to https://github.com/requests/requests/blob/master/requests/api.py#L41-L43

There is a config setting ("certificate") to specify the path to the CA bundle to use for verification.
If it is not provided, it is assumed that we cannot verify the Ansible server identity (this would be like a "dev" mode).

BTW, the name of the config setting is probably not the best one...

In any case, there is a change to be implemented in the Ansible Runner Service for using client TLS certificates:
pcuzner/ansible-runner-service#74
This will probably imply changes in the login method of the orchestrator.

There is a config setting ("certificate") to specify the path to the CA bundle to use for verification. If it is not provided, it is assumed that we cannot verify the Ansible server identity (this would be like a "dev" mode).

Instead of assuming by default that we cannot verify the identity, would it be possible to let the user explicitly disable verification?

Ok ... safe by default... I got it. Sure!! I will change it asap. Thx!
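
A "safe by default" resolution of the verification setting could look like the sketch below. It is an assumption about how the change might be written, not the code that was merged: `resolve_verify` and the `verify_server` option are hypothetical names. The returned value is meant for the `verify` argument of the `requests` library, which accepts either a boolean or a path to a CA bundle.

```python
def resolve_verify(ca_bundle=None, verify_server=True):
    """Return the value to pass as `verify=` to requests.

    - verification stays on unless the user explicitly turns it off
    - a CA bundle path, when given, is used instead of the system store
    """
    if not verify_server:
        # Only reachable when the user explicitly opted out ("dev" mode).
        return False
    if ca_bundle:
        return ca_bundle
    return True


# Nothing configured: verify against the default CA store.
default_value = resolve_verify()
# CA bundle configured (hypothetical path): verify against that bundle.
bundle_value = resolve_verify(ca_bundle="/etc/ceph/ars-ca.pem")
# Explicit opt-out by the user.
disabled_value = resolve_verify(verify_server=False)
print(default_value, bundle_value, disabled_value)
```

The difference from the reviewed code is that a missing certificate no longer silently disables verification; turning it off takes a deliberate, separate setting.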

jmolmo · 2018-11-05T19:17:27Z

jenkins retest this please

jmolmo · 2018-11-06T15:47:05Z

jenkins retest this please

sebastian-philipp · 2018-11-09T14:14:39Z

@jmolmo any progress with your virtualenv here? Maybe @noahdesu has a clue?

dotnwat · 2018-11-09T16:01:24Z

@jmolmo what seems to be the issue? i struggled to get the tox tests for the insights plugin to work, but i think you should be able to effectively copy that over for the ansible case.

dotnwat · 2018-11-09T19:51:30Z

i've gotta say that failure is a bit baffling to me. the error seems to be complaining about a bad symbol in the path of insights plugin test, but I cannot even see the word insights appear in your patch!

dotnwat · 2018-11-09T19:54:22Z

jenkins retest this please

jmolmo · 2018-11-12T15:59:26Z

i've gotta say that failure is a bit baffling to me. the error seems to be complaining about a bad symbol in the path of insights plugin test, but I cannot even see the word insights appear in your patch!

Thanks for your help @noahdesu. It seems that the problem lies in some kind of weird dependency between the virtualenvs of insights and ansible... I'll continue investigating ....

dotnwat · 2018-11-13T01:10:26Z

@jmolmo i don't think my fix will work. but kefu has some really helpful tips in that PR that might help fix this issue! #25065

jmolmo · 2018-11-15T17:52:39Z

@noahdesu , @tchaikov Thank you very much for your help with this!! Finally it seems solved!
As Kefu pointed out in #25065, the problem was the reuse of tox settings across the different virtual environments (same tox workdir).
To avoid problems in the future, it would probably also be good to change the tox workdir in "insights" and in the "dashboard". What do you think?

tchaikov · 2018-11-16T02:02:36Z

To avoid problems in the future, it would probably also be good to change the tox workdir in "insights" and in the "dashboard". What do you think?

yeah, that'd be simpler and better than what we have now.

leseb

Minor nits, thanks @jmolmo

src/pybind/mgr/ansible/module.py

src/pybind/mgr/ansible/run-tox.sh

doc/mgr/ansible.rst

jmolmo · 2018-11-22T18:22:02Z

jenkins retest this please

leseb

Thanks! Good initial iteration :)

sebastian-philipp · 2018-11-23T08:35:08Z

Jenkins says:

The following tests FAILED:
	  8 - run-tox-mgr-ansible (Failed)

jmolmo · 2018-11-23T11:37:51Z

jenkins retest this please

sebastian-philipp · 2018-11-23T12:30:39Z

src/pybind/mgr/ansible/tox.ini

+envlist = py27,py3
+skipsdist = true
+toxworkdir = {env:CEPH_BUILD_DIR}/ansible
+minversion = 2.8.1


I will change it to see what happens ... no clue why it was working before (o_o)

jmolmo · 2018-11-26T07:43:37Z

jenkins retest this please

sebastian-philipp · 2018-11-26T10:45:31Z

The arm jenkins failed:

No uninstalled build requires
New python executable in /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/install-deps-python2.7_tmp/bin/python
Installing Setuptools..............................................................................................................................................................................................................................done.
Installing Pip.....................................................................................................................................................................................................................................................................................................................................done.
Downloading/unpacking virtualenv
  Running setup.py egg_info for package virtualenv
    /usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'python_requires'
      warnings.warn(msg)
    error in virtualenv setup command: 'extras_require' must be a dictionary whose values are strings or lists of strings containing valid project/version requirement specifiers.
    Complete output from command python setup.py egg_info:
    /usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'python_requires'

  warnings.warn(msg)

error in virtualenv setup command: 'extras_require' must be a dictionary whose values are strings or lists of strings containing valid project/version requirement specifiers.

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/install-deps-python2.7_tmp/build/virtualenv
Storing complete log in /home/jenkins-build/.pip/pip.log
./install-deps.sh: line 383: /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/install-deps-python2.7_tmp/bin/virtualenv: No such file or directory
./install-deps.sh: line 386: /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/install-deps-python2.7/bin/activate: No such file or directory
Requirement already satisfied (use --upgrade to upgrade): setuptools<36,>=0.8 in /usr/lib/python2.7/site-packages
Requirement already satisfied (use --upgrade to upgrade): pip>=7.0 in /usr/lib/python2.7/site-packages
Collecting wheel>=0.24
  Downloading https://files.pythonhosted.org/packages/ff/47/1dfa4795e24fd6f93d5d58602dd716c3f101cfd5a77cd9acbe519b44a0a9/wheel-0.32.3-py2.py3-none-any.whl
Installing collected packages: wheel
Exception:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python2.7/site-packages/pip/commands/install.py", line 326, in run
    strip_file_prefix=options.strip_file_prefix,
  File "/usr/lib/python2.7/site-packages/pip/req/req_set.py", line 742, in install
    **kwargs
  File "/usr/lib/python2.7/site-packages/pip/req/req_install.py", line 834, in install
    strip_file_prefix=strip_file_prefix
  File "/usr/lib/python2.7/site-packages/pip/req/req_install.py", line 1037, in move_wheel_files
    strip_file_prefix=strip_file_prefix,
  File "/usr/lib/python2.7/site-packages/pip/wheel.py", line 346, in move_wheel_files
    clobber(source, lib_dir, True)
  File "/usr/lib/python2.7/site-packages/pip/wheel.py", line 317, in clobber
    ensure_dir(destdir)
  File "/usr/lib/python2.7/site-packages/pip/utils/__init__.py", line 83, in ensure_dir
    os.makedirs(path)
  File "/usr/lib64/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/usr/lib/python2.7/site-packages/wheel'

jmolmo · 2018-11-26T10:54:21Z

jenkins retest this please

sebastian-philipp · 2018-11-26T12:00:09Z

Test project /home/jenkins-build/build/workspace/ceph-pull-requests/build
        Start   1: run-rbd-unit-tests.sh
        Start   2: run-cli-tests
        Start   3: test_objectstore_memstore.sh
        Start   4: smoke.sh
        Start   5: unittest_bufferlist.sh
        Start   6: run-tox-mgr-dashboard
        Start   7: run-tox-mgr-insights
        Start   8: run-tox-mgr-ansible
  1/163 Test   #8: run-tox-mgr-ansible .....................***Failed    0.51 sec
Traceback (most recent call last):
  File "/usr/bin/tox", line 9, in <module>
    load_entry_point('tox==1.4.2', 'console_scripts', 'tox')()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 25, in main
    retcode = Session(config).runcommand()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 273, in runcommand
    return self.subcommand_test()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 353, in subcommand_test
    sdist_path = self.sdist()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 339, in sdist
    sdist_path = self._makesdist()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 291, in _makesdist
    raise tox.exception.MissingFile(setup)
tox.MissingFile: MissingFile: /home/jenkins-build/build/workspace/ceph-pull-requests/src/pybind/mgr/ansible/setup.py

dotnwat · 2018-11-26T20:41:47Z

@sebastian-philipp @jmolmo the failure on ARM was fixed in 9538675#diff-47a21b3706c13e08943e223c12323aa1

Looks like you can resolve that by rebasing onto master.

tchaikov · 2018-11-27T01:43:26Z

retest this please.

sebastian-philipp · 2018-11-27T10:28:25Z

@sebastian-philipp @jmolmo the failure on ARM was fixed in 9538675#diff-47a21b3706c13e08943e223c12323aa1

Looks like you can resolve that by rebasing onto master.

One last thing. Arm64 fails with

  3/160 Test   #8: run-tox-mgr-ansible .....................***Failed    0.82 sec
ERROR: tox version is 1.4.2, required is at least 2.3.1

sebastian-philipp · 2018-11-28T08:28:08Z

Argh. Something went horribly wrong with your git branch.

jmolmo · 2018-11-28T17:10:13Z

After rebasing onto master, I checked that the build environment and the test execution are OK locally:

[jolmomar@juanmipc build]$ sudo make mgr-ansible-test-venv
-- NSS_LIBRARIES: /usr/lib64/libssl3.so;/usr/lib64/libsmime3.so;/usr/lib64/libnss3.so;/usr/lib64/libnssutil3.so
-- NSS_INCLUDE_DIRS: /usr/include/nss3
...
Installing collected packages: virtualenv, toml, pluggy, py, filelock, tox
Successfully installed filelock-3.0.10 pluggy-0.8.0 py-1.7.0 toml-0.10.0 tox-3.5.3 virtualenv-16.1.0
Built target mgr-ansible-test-venv

[jolmomar@juanmipc build]$ sudo ctest -R run-tox-mgr-ansible -V
UpdateCTestConfiguration  from :/home/jolmomar/Code/ceph/build/DartConfiguration.tcl
...
8: ========================== 10 passed in 0.07 seconds ===========================
8: ___________________________________ summary ____________________________________
8:   py27: commands succeeded
8:   congratulations :)
1/1 Test #8: run-tox-mgr-ansible ..............   Passed    1.03 sec

The following tests passed:
	run-tox-mgr-ansible

100% tests passed, 0 tests failed out of 1

Total Test time (real) =   1.04 sec
[jolmomar@juanmipc build]$


but unfortunately I have another, different failure. @tchaikov, @noahdesu can you give me advice or any clue?

4/160 Test   #8: run-tox-mgr-ansible .....................***Failed    1.12 sec
Traceback (most recent call last):
  File "/usr/bin/tox", line 9, in <module>
    load_entry_point('tox==1.4.2', 'console_scripts', 'tox')()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 25, in main
    retcode = Session(config).runcommand()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 273, in runcommand
    return self.subcommand_test()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 353, in subcommand_test
    sdist_path = self.sdist()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 339, in sdist
    sdist_path = self._makesdist()
  File "/usr/lib/python2.7/site-packages/tox/_cmdline.py", line 291, in _makesdist
    raise tox.exception.MissingFile(setup)
tox.MissingFile: MissingFile: /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/pybind/mgr/ansible/setup.py

dotnwat · 2018-11-28T21:38:32Z

@jmolmo that error is caused by an older version of tox (1.4.2) being used. that older version is installed on the system, but it looks like you commented out the virtualenv path, which should contain 2.9.1 (at least that's what it looks like from the log).

diff --git a/src/pybind/mgr/ansible/run-tox.sh b/src/pybind/mgr/ansible/run-tox.sh
index d14065e197..951ea23150 100644
--- a/src/pybind/mgr/ansible/run-tox.sh
+++ b/src/pybind/mgr/ansible/run-tox.sh
@@ -17,7 +17,7 @@ fi
 unset PYTHONPATH
 export CEPH_BUILD_DIR=$CEPH_BUILD_DIR
 
-# source ${MGR_ANSIBLE_VIRTUALENV}/bin/activate
+source ${MGR_ANSIBLE_VIRTUALENV}/bin/activate
 
 if [ "$WITH_PYTHON2" = "ON" ]; then
   ENV_LIST+="py27"

A Ceph Manager Orchestrator that uses a external REST API service to execute Ansible playbooks.
get_inventory implementation
Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>

Document how to use CLI through Orchestrator CLI
Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>

jmolmo · 2018-11-29T14:03:54Z

@noahdesu, @sebastian-philipp thanks for your help with this. When I finished yesterday with this last error I was completely frustrated. I didn't remember that I had commented out the "venv" activate command!!! It seems that everything is working OK now. I have squashed all the commits into a single one. I hope no new issues arise!!!

sebastian-philipp · 2018-11-29T16:50:10Z

jenkins retest this please

sebastian-philipp · 2018-11-30T08:41:35Z

jenkins retest this please

sebastian-philipp · 2018-11-30T10:56:43Z

I'm going to run Jenkins again for the last time, to be really sure we're not running into this tox version dependency issue again.

sebastian-philipp · 2018-11-30T10:56:46Z

jenkins retest this please

sebastian-philipp · 2018-11-30T13:20:40Z

jenkins retest this please

batrick added needs-review mgr labels Oct 5, 2018

jcsp reviewed Oct 11, 2018

smithfarm added the feature label Oct 15, 2018

sebastian-philipp mentioned this pull request Oct 19, 2018

mgr/orchestrator: use result property in Completion classes #24672

Merged

sebastian-philipp reviewed Oct 19, 2018

leseb reviewed Oct 19, 2018

jcsp reviewed Oct 22, 2018

tserong mentioned this pull request Oct 24, 2018

mgr/deepsea: DeepSea orchestrator module #24610

Merged

sebastian-philipp previously requested changes Oct 26, 2018

sebastian-philipp added the orchestrator label Oct 31, 2018

sebastian-philipp reviewed Nov 5, 2018

dotnwat mentioned this pull request Nov 12, 2018

test: deactivate venv after tox runs #25065

Closed

sebastian-philipp approved these changes Nov 22, 2018

leseb requested changes Nov 22, 2018

leseb approved these changes Nov 23, 2018

sebastian-philipp suggested changes Nov 23, 2018

sebastian-philipp approved these changes Nov 30, 2018

sebastian-philipp merged commit ce28976 into ceph:master Dec 3, 2018

	def process_inventary_json(inventory_events, ar_client, playbook_uuid):
	def process_inventory_json(inventory_events, ar_client, playbook_uuid):

	ansible_operation.process_output = process_inventary_json
	ansible_operation.process_output = process_inventory_json



		# List of playbooks names used
		GET_INVENTORY_PLAYBOOK = "probe-disks.yml"
