New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ceph-disk: rhel 7 and up can use partprobe #3838
Conversation
Previous rhedaht based versions needed to use partx when creating new partitions. This is no longer an issue with rhel 7 and up. this breaks ceph-disk for rhel 7 due to the fact partx no longer supports the parameters used. I have added a check to make all versions of redhat from 7 and up to use partprobe instead.
SUCCESS: the output of run-make-check.sh on centos-centos7 for 188de86 is http://paste2.org/EKNFtmdx |
@itamarla could you create a bug for this at http://tracker.ceph.com/projects/ceph/issues/new and open a new pull request against hammer instead of master ? It looks like a bug fix we need for hammer. |
can you please run bash -x test/ceph-disk-root.sh from sources after compiling on rhel7 ? That will demonstrate your patch works in a repeatable way. Last time I checked centos 7 with partx worked. Maybe the scope of your patch should be reduced to just rhel and not centos ? |
Created bug in tracker: Bug http://tracker.ceph.com/issues/10987 |
@itamarla thanks ! could you please amend the commit message to include "Fixes: #10987" ? It will help redmine at http://tracker.ceph.com/issues/10987 find the commit that fixes the problem. |
On centos 7 it fails with ValueError: invalid literal for float(): 7.0.1406. You can reproduce the error with the following: bash-4.2$ sudo test/ceph-disk.sh test_activate_dev + source test/test_btrfs_common.sh + PS4='${FUNCNAME[0]}: $LINENO: ' : 24: export PATH=:/sbin:/bin:/usr/sbin:/usr/bin : 24: PATH=:/sbin:/bin:/usr/sbin:/usr/bin : 25: DIR=test-ceph-disk : 26: OSD_DATA=test-ceph-disk/osd : 27: MON_ID=a : 28: MONA=127.0.0.1:7451 : 29: TEST_POOL=rbd :: 30: uuidgen : 30: FSID=6a9d709c-38fe-4b18-91fa-89c2f22258c5 : 31: export CEPH_CONF=test-ceph-disk/ceph.conf : 31: CEPH_CONF=test-ceph-disk/ceph.conf : 32: export 'CEPH_ARGS=--fsid 6a9d709c-38fe-4b18-91fa-89c2f22258c5' : 32: CEPH_ARGS='--fsid 6a9d709c-38fe-4b18-91fa-89c2f22258c5' : 33: CEPH_ARGS+=' --chdir=' : 34: CEPH_ARGS+=' --run-dir=test-ceph-disk' : 35: CEPH_ARGS+=' --osd-failsafe-full-ratio=.99' : 36: CEPH_ARGS+=' --mon-host=127.0.0.1:7451' : 37: CEPH_ARGS+=' --log-file=test-ceph-disk/$name.log' : 38: CEPH_ARGS+=' --pid-file=test-ceph-disk/$name.pidfile' : 39: CEPH_ARGS+=' --osd-pool-default-erasure-code-directory=.libs' : 40: CEPH_ARGS+=' --auth-supported=none' : 41: CEPH_ARGS+=' --osd-journal-size=100' : 42: CEPH_DISK_ARGS= : 43: CEPH_DISK_ARGS+=' --statedir=test-ceph-disk' : 44: CEPH_DISK_ARGS+=' --sysconfdir=test-ceph-disk' : 45: CEPH_DISK_ARGS+=' --prepend-to-path=' : 46: CEPH_DISK_ARGS+=' --verbose' : 47: TIMEOUT=360 :: 49: which cat : 49: cat=/bin/cat :: 50: which timeout : 50: timeout=/bin/timeout :: 51: which diff : 51: diff=/bin/diff :: 52: which mkdir : 52: mkdir=/bin/mkdir :: 53: which rm : 53: rm=/bin/rm : 481: run test_activate_dev run: 465: local default_actions run: 466: default_actions+='test_path ' run: 467: default_actions+='test_no_path ' run: 468: default_actions+='test_find_cluster_by_uuid ' run: 469: default_actions+='test_prepend_to_path ' run: 470: default_actions+='test_activate_dir_magic ' run: 471: default_actions+='test_activate_dir ' run: 472: default_actions+='test_keyring_path ' run: 473: local actions=test_activate_dev run: 474: for action in '$actions' run: 475: setup setup: 56: teardown teardown: 64: kill_daemons kkill_daemons: 92: find test-ceph-disk kkill_daemons: 92: grep pidfile find: 'test-ceph-disk': No such file or directory tteardown: 65: stat -f -c %T . teardown: 65: '[' ext2/ext3 == btrfs ']' teardown: 69: rm -fr test-ceph-disk setup: 57: mkdir test-ceph-disk setup: 58: mkdir test-ceph-disk/osd setup: 60: touch test-ceph-disk/ceph.conf run: 476: test_activate_dev ttest_activate_dev: 342: id -u test_activate_dev: 342: test 0 '!=' 0 ttest_activate_dev: 347: create_dev vdf.disk ccreate_dev: 300: local name=vdf.disk ccreate_dev: 302: dd if=/dev/zero of=vdf.disk bs=1024k count=200 200+0 records in 200+0 records out 209715200 bytes (210 MB) copied, 0.200794 s, 1.0 GB/s ccreate_dev: 303: losetup --find vdf.disk cccreate_dev: 304: losetup --associated vdf.disk cccreate_dev: 304: cut -f1 -d: ccreate_dev: 304: local dev=/dev/loop2 ccreate_dev: 305: ceph-disk zap /dev/loop2 ccreate_dev: 306: echo /dev/loop2 test_activate_dev: 347: local disk=/dev/loop2 ttest_activate_dev: 348: create_dev vdg.disk ccreate_dev: 300: local name=vdg.disk ccreate_dev: 302: dd if=/dev/zero of=vdg.disk bs=1024k count=200 200+0 records in 200+0 records out 209715200 bytes (210 MB) copied, 0.218903 s, 958 MB/s ccreate_dev: 303: losetup --find vdg.disk cccreate_dev: 304: losetup --associated vdg.disk cccreate_dev: 304: cut -f1 -d: ccreate_dev: 304: local dev=/dev/loop3 ccreate_dev: 305: ceph-disk zap /dev/loop3 ccreate_dev: 306: echo /dev/loop3 test_activate_dev: 348: local journal=/dev/loop3 ttest_activate_dev: 349: create_dev vdh.disk ccreate_dev: 300: local name=vdh.disk ccreate_dev: 302: dd if=/dev/zero of=vdh.disk bs=1024k count=200 200+0 records in 200+0 records out 209715200 bytes (210 MB) copied, 0.213236 s, 983 MB/s ccreate_dev: 303: losetup --find vdh.disk cccreate_dev: 304: losetup --associated vdh.disk cccreate_dev: 304: cut -f1 -d: ccreate_dev: 304: local dev=/dev/loop4 ccreate_dev: 305: ceph-disk zap /dev/loop4 ccreate_dev: 306: echo /dev/loop4 test_activate_dev: 349: local newdisk=/dev/loop4 test_activate_dev: 351: activate_dev_body /dev/loop2 /dev/loop3 /dev/loop4 activate_dev_body: 321: local disk=/dev/loop2 activate_dev_body: 322: local journal=/dev/loop3 activate_dev_body: 323: local newdisk=/dev/loop4 activate_dev_body: 325: setup setup: 56: teardown teardown: 64: kill_daemons kkill_daemons: 92: find test-ceph-disk kkill_daemons: 92: grep pidfile tteardown: 65: stat -f -c %T . teardown: 65: '[' ext2/ext3 == btrfs ']' teardown: 69: rm -fr test-ceph-disk setup: 57: mkdir test-ceph-disk setup: 58: mkdir test-ceph-disk/osd setup: 60: touch test-ceph-disk/ceph.conf activate_dev_body: 326: run_mon run_mon: 73: local mon_dir=test-ceph-disk/a run_mon: 76: ./ceph-mon --id a --mkfs --mon-data=test-ceph-disk/a --mon-initial-members=a ./ceph-mon: mon.noname-a 127.0.0.1:7451/0 is local, renaming to mon.a ./ceph-mon: set fsid to 6a9d709c-38fe-4b18-91fa-89c2f22258c5 ./ceph-mon: created monfs at test-ceph-disk/a for mon.a run_mon: 83: ./ceph-mon --id a --mon-data=test-ceph-disk/a --mon-osd-full-ratio=.99 --mon-data-avail-crit=1 --mon-cluster-log-file=test-ceph-disk/a/log --public-addr 127.0.0.1:7451 activate_dev_body: 327: test_activate /dev/loop2 /dev/loop2p1 /dev/loop3 test_activate: 202: local to_prepare=/dev/loop2 test_activate: 203: local to_activate=/dev/loop2p1 test_activate: 204: local journal=/dev/loop3 test_activate: 206: /bin/mkdir -p test-ceph-disk/osd test_activate: 208: ./ceph-disk --statedir=test-ceph-disk --sysconfdir=test-ceph-disk --prepend-to-path= --verbose prepare /dev/loop2 /dev/loop3 INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=fsid INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_type INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=osd_journal_size INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_cryptsetup_parameters INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_key_size INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_type INFO:ceph-disk:Running command: /sbin/parted --machine -- /dev/loop3 print WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data DEBUG:ceph-disk:Creating journal partition num 1 size 100 on /dev/loop3 INFO:ceph-disk:Running command: /sbin/sgdisk --new=1:0:+100M --change-name=1:ceph journal --partition-guid=1:c0128d4c-2580-44c2-b8be-7d76b227ec80 --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/loop3 Information: Moved requested sector from 34 to 2048 in order to align on 2048-sector boundaries. Warning: The kernel is still using the old partition table. The new table will be used at the next reboot. The operation has completed successfully. Traceback (most recent call last): File "./ceph-disk", line 2995, in main() File "./ceph-disk", line 2973, in main args.func(args) File "./ceph-disk", line 1649, in main_prepare luks=luks File "./ceph-disk", line 1286, in prepare_journal return prepare_journal_dev(data, journal, journal_size, journal_uuid, journal_dm_keypath, cryptsetup_parameters, luks) File "./ceph-disk", line 1193, in prepare_journal_dev update_partition('-a', journal, 'prepared') File "./ceph-disk", line 1029, in update_partition if float(rel) < 7.0 and platform_distro().startswith(('centos', 'red', 'scientific')): ValueError: invalid literal for float(): 7.0.1406 test_activate: 209: return 1 activate_dev_body: 327: return 1 test_activate_dev: 352: status=1 test_activate_dev: 354: destroy_dev vdf.disk /dev/loop2 destroy_dev: 310: local name=vdf.disk destroy_dev: 311: local dev=/dev/loop2 destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop2p1 umount: /dev/loop2p1: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop2p2 umount: /dev/loop2p2: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop2p3 umount: /dev/loop2p3: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop2p4 umount: /dev/loop2p4: mountpoint not found destroy_dev: 314: true destroy_dev: 316: losetup --detach /dev/loop2 destroy_dev: 317: rm vdf.disk test_activate_dev: 355: destroy_dev vdg.disk /dev/loop3 destroy_dev: 310: local name=vdg.disk destroy_dev: 311: local dev=/dev/loop3 destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop3p1 umount: /dev/loop3p1: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop3p2 umount: /dev/loop3p2: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop3p3 umount: /dev/loop3p3: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop3p4 umount: /dev/loop3p4: mountpoint not found destroy_dev: 314: true destroy_dev: 316: losetup --detach /dev/loop3 destroy_dev: 317: rm vdg.disk test_activate_dev: 356: destroy_dev vdh.disk /dev/loop4 destroy_dev: 310: local name=vdh.disk destroy_dev: 311: local dev=/dev/loop4 destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop4p1 umount: /dev/loop4p1: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop4p2 umount: /dev/loop4p2: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop4p3 umount: /dev/loop4p3: mountpoint not found destroy_dev: 314: true destroy_dev: 313: for partition in 1 2 3 4 destroy_dev: 314: umount /dev/loop4p4 umount: /dev/loop4p4: mountpoint not found destroy_dev: 314: true destroy_dev: 316: losetup --detach /dev/loop4 destroy_dev: 317: rm vdh.disk test_activate_dev: 358: return 1 run: 476: return 1 |
|
||
# This was fixed in rhel 7.x, checking for it. | ||
rel = platform_information()[1] | ||
if float(rel) < 7.0 and platform_distro().startswith(('centos', 'red', 'scientific')): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't need a call to float()
here as it can potentially break (as @dachary points in the comments).
A simple `startswith('7') would be enough.
Also, calling the version rel
is misleading. I would suggest to keep the first if
condition and then add the version
comparison afterwards to make it explicit that this is only intended for RHEL:
# On RHEL and CentOS distros, calling partprobe forces a reboot of the
# server. Since we are not resizing partitons so we rely on calling
# partx
if platform_distro().startswith('red'):
# This was fixed in rhel 7.x, checking for it.
_, version, _ = platform_information()
if not version.startswith('7'):
That was my original intention but then I thought it would break again once rhel 8 is out. |
@itamarla when RHEL8 comes out and only if it displays a similar issue we can address it then. Note though that a) RHEL7 doesn't need this fix and b) we have no idea what RHEL8 will need. |
@itamarla it looks like @alfredodeza tested with rhel7 and it works. It would greatly help if you can list the steps to reproduce the problem you have. |
when you create an osd using |
@itamarla can you show the error output ? That will clarify things. The tests are done on real hardware. |
SUCCESS: the output of run-make-check.sh on centos-centos7 for 9c76e42 is http://paste2.org/B8gAHLMX |
this code as is will not work. My example was not meant to be a full replacement. The |
It will, the next block of code runs partprobe for all
|
# This was fixed in rhel 7.x, checking for it. | ||
_, version, _ = platform_information() | ||
if not version.startswith('7'): | ||
if platform_distro().startswith(('centos', 'red', 'scientific')): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will never match because of 1026. Can't start with centos if it starts with red.
Temporarily closing pending more information / discussions at http://tracker.ceph.com/issues/10987#note-6. It can be re-opened later, when the bug is reproduced. |
Although partprobe could be used on rhel7, partx is also valid and tests at http://tracker.ceph.com/issues/10987#note-15 show it works. Whatever issue http://tracker.ceph.com/issues/10987 is about does not seem related to the use of partx or partprobe. |
Previous rhedaht based versions needed to use partx when creating new partitions.
This is no longer an issue with rhel 7 and up. this breaks ceph-disk for
rhel 7 due to the fact partx no longer supports the parameters used. I have added
a check to make all versions of redhat from 7 and up to use partprobe instead.
Signed-off-by: Itamar Landsman itamar@itamar.org