Python3 seems to break TASK [ceph-mon : create monitor initial keyring] #3565

Closed
pcfe opened this issue Feb 2, 2019 · 4 comments

@pcfe
Contributor

commented Feb 2, 2019

Bug Report

What happened:

Using stable-3.2 to control Fedora ARM 29 nodes with Python3 as the interpreter on those nodes, the firewall gets set up as expected, but the run fails on TASK [ceph-mon : create monitor initial keyring].

To be able to run a copy of site.yml.sample, I have to use the default Python2 on those Fedora ARM 29 nodes and thus cannot configure the firewall (it is not ceph-ansible's problem that F29 offers no python2-firewall).

details with Python3

While ansible_python_interpreter=/usr/bin/python3 allows me to configure the firewall (configure_firewall: True), the run fails on TASK [ceph-mon : create monitor initial keyring]:

TASK [ceph-mon : create monitor initial keyring] ****************************************************************************************
Saturday 02 February 2019  13:22:05 +0100 (0:00:00.578)       0:03:51.103 *****
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: TypeError: rstrip arg must be None or str
fatal: [odroid-hc2-00]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_wt9j1z5d/ansible_module_ceph_key.py\", line 697, in <module>\n    main()\n  File \"/tmp/ansible_wt9j1z5d/ansible_module_ceph_key.py\", line 693, in main\n    run_module()\n  File \"/tmp/ansible_wt9j1z5d/ansible_module_ceph_key.py\", line 681, in run_module\n    stdout=out.rstrip(b\"\\r\\n\"),\nTypeError: rstrip arg must be None or str\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: TypeError: rstrip arg must be None or str
fatal: [odroid-hc2-02]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_fvc_9har/ansible_module_ceph_key.py\", line 697, in <module>\n    main()\n  File \"/tmp/ansible_fvc_9har/ansible_module_ceph_key.py\", line 693, in main\n    run_module()\n  File \"/tmp/ansible_fvc_9har/ansible_module_ceph_key.py\", line 681, in run_module\n    stdout=out.rstrip(b\"\\r\\n\"),\nTypeError: rstrip arg must be None or str\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: TypeError: rstrip arg must be None or str
fatal: [odroid-hc2-01]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_77ptji0m/ansible_module_ceph_key.py\", line 697, in <module>\n    main()\n  File \"/tmp/ansible_77ptji0m/ansible_module_ceph_key.py\", line 693, in main\n    run_module()\n  File \"/tmp/ansible_77ptji0m/ansible_module_ceph_key.py\", line 681, in run_module\n    stdout=out.rstrip(b\"\\r\\n\"),\nTypeError: rstrip arg must be None or str\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}

note on Python2

Without overriding ansible_python_interpreter, I must set configure_firewall: False, as there is no python2-firewall.noarch for Fedora 29. A copy of site.yml.sample runs through just fine with Python2 and I get a working cluster. Obviously I then need to deal with the firewall myself.

[root@odroid-hc2-00 ~]# ceph -s
  cluster:
    id:     d4fe8da4-bad1-4564-bfaa-358e1ab8e02c
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum odroid-hc2-00,odroid-hc2-01,odroid-hc2-02
    mgr: odroid-hc2-00(active), standbys: odroid-hc2-02, odroid-hc2-01
    osd: 5 osds: 5 up, 5 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   5.01GiB used, 8.18TiB / 8.19TiB avail
    pgs:     
 

I verified with ansible -m setup odroid-hc2-00|less that Python 2 gets used in that case. 2.7.15 to be precise.

What you expected to happen:

Being able to have ceph-ansible set up the firewall on Fedora 29 nodes. Ideally by being able to use ansible_python_interpreter=/usr/bin/python3 (allowing the ansible firewall module to be used).

How to reproduce it (minimal and precise):

  1. Have a RHEL 7 x86_64 machine to run ceph-ansible. Be it ceph-ansible-3.2.4-1.el7cp.noarch or branch stable-3.2 from origin git@github.com:ceph/ceph-ansible.git; I can reproduce the problem with both. (While I could have run ceph-ansible from one of the Fedora ARM 29 nodes, using a RHSM-registered RHEL7 VM simply made it easy for me to yum install ceph-ansible)
  2. Have 5 OSD hosts, one disk each, running Fedora ARM 29 (mine are ODROID-HC2, sadly no RHEL7 for that platform)
  3. cp site.yml.sample site.yml
  4. ansible-playbook site.yml

Share your group_vars files, inventory

This is my play cluster while learning Ceph, so there are ceph_conf_overrides, silly small journal sizes etc, don't mind those.

[ansible@ceph-ansible-rhel7 ceph-ansible]$ pwd
/usr/share/ceph-ansible
[ansible@ceph-ansible-rhel7 ceph-ansible]$ rpm -qf /usr/share/ceph-ansible
ceph-ansible-3.2.4-1.el7cp.noarch

/etc/ansible/hosts is as follows. Obviously I toggle the ansible_python_interpreter=… line on or off while reproducing for this bug report. And yes, I just noticed I set ansible_user needlessly twice ;-)

[ceph-arm-nodes]
odroid-hc2-[00:04]

[ceph-arm-nodes:vars]
ansible_user=ansible
#ansible_python_interpreter=/usr/bin/python3

[ceph-housenet]
ceph-ansible-rhel7
odroid-hc2-[00:04]

[ceph-housenet:vars]
ansible_user=ansible

[mons]
odroid-hc2-[00:02]

# MGRs are typically collocated with MONs
[mgrs]
odroid-hc2-[00:02]

[osds]
odroid-hc2-[00:04]

[clients]
ceph-ansible-rhel7
odroid-hc2-00
[ansible@ceph-ansible-rhel7 group_vars]$ diff all.yml all.yml.sample 
45c45
< cluster: ceph
---
> #cluster: ceph
63d62
< #configure_firewall: False
110d108
< ntp_daemon_type: chronyd
139c137
< ceph_origin: distro
---
> ceph_origin: repository
197d194
< ceph_repository_type: cdn
301d297
< rbd_cache_writethrough_until_flush: "false"
305d300
< rbd_client_directories: false # as per  CEPH125-RHCS3.0-en-1-20180517 pages 45 and 60
350,351d344
< monitor_interface: eth0
< 
374d366
< journal_size: 1024 # As per CEPH125-RHCS3.0-en-1-20180517 page 45
377,378c369
< public_network: 192.168.50.0/24 # HouseNet
< cluster_network: "{{ public_network | regex_replace(' ', '') }}"
---
> #cluster_network: "{{ public_network | regex_replace(' ', '') }}"
528,537d518
< # Overrides from  CEPH125-RHCS3.0-en-1-20180517
< ceph_conf_overrides:
<   global:
<     mon_osd_allow_primary_affinity: 1
<     mon_clock_drift_allowed: 0.5
<     mon_pg_warn_min_per_osd: 0
<     mon_allow_pool_delete: true
<   client:
<     rbd_default_features: 1
< 
585a567,570
> 
> # this is only here for usage with the switch-from-non-containerized-to-containerized-ceph-daemons.yml playbook
> # do not ever change this here
> #switch_to_container: false
[ansible@ceph-ansible-rhel7 ceph-ansible]$ diff /usr/share/ceph-ansible/group_vars/osds.yml.sample /usr/share/ceph-ansible/group_vars/osds.yml
22a23
> copy_admin_key: true
46a48,49
> devices:
>   - /dev/sda
61a65
> dmcrypt: True
89a94
> osd_scenario: non-collocated # collocated was as per CEPH125-RHCS3.0-en-1-20180517 page 36, this is for my fiddlings
131,133c136,137
< # - The devices in 'dedicated_devices' will get one partition for RocksDB DB, called 'block.db'
< #  and one for RocksDB WAL, called 'block.wal'. To use a single partition for RocksDB and WAL together
< #  set bluestore_wal_devices to [].
---
> # - The devices in 'dedicated_devices' will get 1 partition for RocksDB DB, called 'block.db'
> #  and one for RocksDB WAL, called 'block.wal'
147a152,153
> dedicated_devices:
>   - /dev/mmcblk0
156,157d161
< #
< # Set bluestore_wal_devices: [] to use the same partition for RocksDB and WAL.
[ansible@ceph-ansible-rhel7 ceph-ansible]$ diff /usr/share/ceph-ansible/group_vars/clients.yml.sample /usr/share/ceph-ansible/group_vars/clients.yml
18a19
> copy_admin_key: true

Environment details

Environment of RHEL7 x86_64 VM running ceph-ansible:

  • OS (e.g. from /etc/os-release): Red Hat Enterprise Linux Server release 7.6 (Maipo)
  • Kernel (e.g. uname -a): Linux ceph-ansible-rhel7.internal.pcfe.net 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Docker version if applicable (e.g. docker version): n/a
  • Ansible version (e.g. ansible-playbook --version): ansible-playbook 2.6.12
    config file = /usr/share/ceph-ansible/ansible.cfg
    configured module search path = [u'/home/ansible/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
    ansible python module location = /usr/lib/python2.7/site-packages/ansible
    executable location = /usr/bin/ansible-playbook
    python version = 2.7.5 (default, Sep 12 2018, 05:31:16) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
  • ceph-ansible version (e.g. git head or tag or stable branch): ceph-ansible-3.2.4-1.el7cp.noarch and stable-3.2 from git; the problem reproduces with both
  • Ceph version (e.g. ceph -v): ceph version 12.2.8-52.el7cp (3af3ca15b68572a357593c261f95038d02f46201) luminous (stable)

Environment of Fedora ARM 29 OSD nodes:

  • OS (e.g. from /etc/os-release): Fedora release 29 (Twenty Nine)
  • Kernel (e.g. uname -a): Linux odroid-hc2-00.fritz.box 4.20.3-200.fc29.armv7hl #1 SMP Thu Jan 17 17:09:08 UTC 2019 armv7l armv7l armv7l GNU/Linux
  • Docker version if applicable (e.g. docker version): n/a
  • Ansible version (e.g. ansible-playbook --version): ansible-playbook 2.7.5
    -m setup run on the RHEL7 b
     "ansible_python": {
            "executable": "/usr/bin/python", 
            "has_sslcontext": true, 
            "type": "CPython", 
            "version": {
                "major": 2, 
                "micro": 15, 
                "minor": 7, 
                "releaselevel": "final", 
                "serial": 0
            }, 
            "version_info": [
                2, 
                7, 
                15, 
                "final", 
                0
            ]
        }, 
  • ceph-ansible version (e.g. git head or tag or stable branch):
  • Ceph version (e.g. ceph -v): ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)

Additional info

I do not expect this to get fixed in stable-3.2; after all, the firewall config functionality in ceph-ansible is quite recent. But it would be nice if it were fixed in the next release.
@dsavineau

Collaborator

commented Feb 15, 2019

This is coming from the ceph_key module and python3. The issue was fixed in f5c2ca3.
However the fix is only available on master.
@guits @leseb do we support python3 with ceph-ansible 3.2?
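
For context, the traceback points at `out.rstrip(b"\r\n")` in ceph_key. Under Python 3 the command output there appears to be a text string, and `str.rstrip` refuses a bytes argument. The snippet below is only a minimal sketch of that failure mode and of one way to tolerate both types. The helper names `strip_trailing_newlines` and `reproduce_type_error` are hypothetical, and this is not necessarily what f5c2ca3 does.

```python
def strip_trailing_newlines(out):
    """Strip trailing CR/LF from command output, whether it is bytes or text.

    Hypothetical helper for illustration; not taken from the ceph_key module.
    """
    if isinstance(out, bytes):
        # bytes output must be stripped with a bytes argument
        return out.rstrip(b"\r\n")
    # text output must be stripped with a text argument
    return out.rstrip("\r\n")


def reproduce_type_error():
    """Return the TypeError message raised when a text string is stripped
    with a bytes argument, or None if no error is raised (as on Python 2)."""
    try:
        "key data\n".rstrip(b"\r\n")
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce_type_error())
    print(repr(strip_trailing_newlines("key data\r\n")))
    print(repr(strip_trailing_newlines(b"key data\r\n")))
```

On Python 2 the original call works because `str` and `bytes` are the same type there, which would match the report that the playbook runs cleanly with the default interpreter.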

@leseb

Contributor

commented Feb 18, 2019

@dsavineau we don't have any official statement but I guess we should. This commit is part of a huge change, so we can't easily backport the whole PR but backporting this commit only sounds good. @dsavineau do you mind creating the backport for this? Thanks!

@dsavineau

Collaborator

commented Feb 18, 2019

@leseb sure I'll try to send the PR today

@pcfe

Contributor Author

commented Feb 20, 2019

Thank you, much appreciated.
Closing as it was merged into stable-3.2
