Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ceph-disk: rhel 7 and up can use partprobe #3838

Closed
wants to merge 2 commits into from
Closed

ceph-disk: rhel 7 and up can use partprobe #3838

wants to merge 2 commits into from

Conversation

itamarla
Copy link

@itamarla itamarla commented Mar 2, 2015

Previous rhedaht based versions needed to use partx when creating new partitions.
This is no longer an issue with rhel 7 and up. this breaks ceph-disk for
rhel 7 due to the fact partx no longer supports the parameters used. I have added
a check to make all versions of redhat from 7 and up to use partprobe instead.

Signed-off-by: Itamar Landsman itamar@itamar.org

Previous rhedaht based versions needed to use partx when creating new partitions.
This is no longer an issue with rhel 7 and up. this breaks ceph-disk for
rhel 7 due to the fact partx no longer supports the parameters used. I have added
a check to make all versions of redhat from 7 and up to use partprobe instead.
@loic-bot
Copy link

loic-bot commented Mar 2, 2015

SUCCESS: the output of run-make-check.sh on centos-centos7 for 188de86 is http://paste2.org/EKNFtmdx

:octocat: Sent from GH.

@itamarla itamarla changed the title rhel 7 and up can use partprobe ceph-disk: rhel 7 and up can use partprobe Mar 2, 2015
@ghost ghost added cleanup core labels Mar 2, 2015
@ghost ghost self-assigned this Mar 2, 2015
@ghost ghost added bug-fix and removed cleanup labels Mar 2, 2015
@ghost
Copy link

ghost commented Mar 2, 2015

@itamarla could you create a bug for this at http://tracker.ceph.com/projects/ceph/issues/new and open a new pull request against hammer instead of master ? It looks like a bug fix we need for hammer.

@ghost
Copy link

ghost commented Mar 2, 2015

can you please run bash -x test/ceph-disk-root.sh from sources after compiling on rhel7 ? That will demonstrate your patch works in a repeatable way. Last time I checked centos 7 with partx worked. Maybe the scope of your patch should be reduced to just rhel and not centos ?

@itamarla
Copy link
Author

itamarla commented Mar 2, 2015

Created bug in tracker: Bug http://tracker.ceph.com/issues/10987
This fix can be applied to firefly and above.

@ghost
Copy link

ghost commented Mar 2, 2015

@itamarla thanks ! could you please amend the commit message to include "Fixes: #10987" ? It will help redmine at http://tracker.ceph.com/issues/10987 find the commit that fixes the problem.

@ghost
Copy link

ghost commented Mar 2, 2015

On centos 7 it fails with

ValueError: invalid literal for float(): 7.0.1406
. You can reproduce the error with the following:

bash-4.2$ sudo test/ceph-disk.sh test_activate_dev
+ source test/test_btrfs_common.sh
+ PS4='${FUNCNAME[0]}: $LINENO: '
: 24: export PATH=:/sbin:/bin:/usr/sbin:/usr/bin
: 24: PATH=:/sbin:/bin:/usr/sbin:/usr/bin
: 25: DIR=test-ceph-disk
: 26: OSD_DATA=test-ceph-disk/osd
: 27: MON_ID=a
: 28: MONA=127.0.0.1:7451
: 29: TEST_POOL=rbd
:: 30: uuidgen
: 30: FSID=6a9d709c-38fe-4b18-91fa-89c2f22258c5
: 31: export CEPH_CONF=test-ceph-disk/ceph.conf
: 31: CEPH_CONF=test-ceph-disk/ceph.conf
: 32: export 'CEPH_ARGS=--fsid 6a9d709c-38fe-4b18-91fa-89c2f22258c5'
: 32: CEPH_ARGS='--fsid 6a9d709c-38fe-4b18-91fa-89c2f22258c5'
: 33: CEPH_ARGS+=' --chdir='
: 34: CEPH_ARGS+=' --run-dir=test-ceph-disk'
: 35: CEPH_ARGS+=' --osd-failsafe-full-ratio=.99'
: 36: CEPH_ARGS+=' --mon-host=127.0.0.1:7451'
: 37: CEPH_ARGS+=' --log-file=test-ceph-disk/$name.log'
: 38: CEPH_ARGS+=' --pid-file=test-ceph-disk/$name.pidfile'
: 39: CEPH_ARGS+=' --osd-pool-default-erasure-code-directory=.libs'
: 40: CEPH_ARGS+=' --auth-supported=none'
: 41: CEPH_ARGS+=' --osd-journal-size=100'
: 42: CEPH_DISK_ARGS=
: 43: CEPH_DISK_ARGS+=' --statedir=test-ceph-disk'
: 44: CEPH_DISK_ARGS+=' --sysconfdir=test-ceph-disk'
: 45: CEPH_DISK_ARGS+=' --prepend-to-path='
: 46: CEPH_DISK_ARGS+=' --verbose'
: 47: TIMEOUT=360
:: 49: which cat
: 49: cat=/bin/cat
:: 50: which timeout
: 50: timeout=/bin/timeout
:: 51: which diff
: 51: diff=/bin/diff
:: 52: which mkdir
: 52: mkdir=/bin/mkdir
:: 53: which rm
: 53: rm=/bin/rm
: 481: run test_activate_dev
run: 465: local default_actions
run: 466: default_actions+='test_path '
run: 467: default_actions+='test_no_path '
run: 468: default_actions+='test_find_cluster_by_uuid '
run: 469: default_actions+='test_prepend_to_path '
run: 470: default_actions+='test_activate_dir_magic '
run: 471: default_actions+='test_activate_dir '
run: 472: default_actions+='test_keyring_path '
run: 473: local actions=test_activate_dev
run: 474: for action in '$actions'
run: 475: setup
setup: 56: teardown
teardown: 64: kill_daemons
kkill_daemons: 92: find test-ceph-disk
kkill_daemons: 92: grep pidfile
find: 'test-ceph-disk': No such file or directory
tteardown: 65: stat -f -c %T .
teardown: 65: '[' ext2/ext3 == btrfs ']'
teardown: 69: rm -fr test-ceph-disk
setup: 57: mkdir test-ceph-disk
setup: 58: mkdir test-ceph-disk/osd
setup: 60: touch test-ceph-disk/ceph.conf
run: 476: test_activate_dev
ttest_activate_dev: 342: id -u
test_activate_dev: 342: test 0 '!=' 0
ttest_activate_dev: 347: create_dev vdf.disk
ccreate_dev: 300: local name=vdf.disk
ccreate_dev: 302: dd if=/dev/zero of=vdf.disk bs=1024k count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 0.200794 s, 1.0 GB/s
ccreate_dev: 303: losetup --find vdf.disk
cccreate_dev: 304: losetup --associated vdf.disk
cccreate_dev: 304: cut -f1 -d:
ccreate_dev: 304: local dev=/dev/loop2
ccreate_dev: 305: ceph-disk zap /dev/loop2
ccreate_dev: 306: echo /dev/loop2
test_activate_dev: 347: local disk=/dev/loop2
ttest_activate_dev: 348: create_dev vdg.disk
ccreate_dev: 300: local name=vdg.disk
ccreate_dev: 302: dd if=/dev/zero of=vdg.disk bs=1024k count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 0.218903 s, 958 MB/s
ccreate_dev: 303: losetup --find vdg.disk
cccreate_dev: 304: losetup --associated vdg.disk
cccreate_dev: 304: cut -f1 -d:
ccreate_dev: 304: local dev=/dev/loop3
ccreate_dev: 305: ceph-disk zap /dev/loop3
ccreate_dev: 306: echo /dev/loop3
test_activate_dev: 348: local journal=/dev/loop3
ttest_activate_dev: 349: create_dev vdh.disk
ccreate_dev: 300: local name=vdh.disk
ccreate_dev: 302: dd if=/dev/zero of=vdh.disk bs=1024k count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 0.213236 s, 983 MB/s
ccreate_dev: 303: losetup --find vdh.disk
cccreate_dev: 304: losetup --associated vdh.disk
cccreate_dev: 304: cut -f1 -d:
ccreate_dev: 304: local dev=/dev/loop4
ccreate_dev: 305: ceph-disk zap /dev/loop4
ccreate_dev: 306: echo /dev/loop4
test_activate_dev: 349: local newdisk=/dev/loop4
test_activate_dev: 351: activate_dev_body /dev/loop2 /dev/loop3 /dev/loop4
activate_dev_body: 321: local disk=/dev/loop2
activate_dev_body: 322: local journal=/dev/loop3
activate_dev_body: 323: local newdisk=/dev/loop4
activate_dev_body: 325: setup
setup: 56: teardown
teardown: 64: kill_daemons
kkill_daemons: 92: find test-ceph-disk
kkill_daemons: 92: grep pidfile
tteardown: 65: stat -f -c %T .
teardown: 65: '[' ext2/ext3 == btrfs ']'
teardown: 69: rm -fr test-ceph-disk
setup: 57: mkdir test-ceph-disk
setup: 58: mkdir test-ceph-disk/osd
setup: 60: touch test-ceph-disk/ceph.conf
activate_dev_body: 326: run_mon
run_mon: 73: local mon_dir=test-ceph-disk/a
run_mon: 76: ./ceph-mon --id a --mkfs --mon-data=test-ceph-disk/a --mon-initial-members=a
./ceph-mon: mon.noname-a 127.0.0.1:7451/0 is local, renaming to mon.a
./ceph-mon: set fsid to 6a9d709c-38fe-4b18-91fa-89c2f22258c5
./ceph-mon: created monfs at test-ceph-disk/a for mon.a
run_mon: 83: ./ceph-mon --id a --mon-data=test-ceph-disk/a --mon-osd-full-ratio=.99 --mon-data-avail-crit=1 --mon-cluster-log-file=test-ceph-disk/a/log --public-addr 127.0.0.1:7451
activate_dev_body: 327: test_activate /dev/loop2 /dev/loop2p1 /dev/loop3
test_activate: 202: local to_prepare=/dev/loop2
test_activate: 203: local to_activate=/dev/loop2p1
test_activate: 204: local journal=/dev/loop3
test_activate: 206: /bin/mkdir -p test-ceph-disk/osd
test_activate: 208: ./ceph-disk --statedir=test-ceph-disk --sysconfdir=test-ceph-disk --prepend-to-path= --verbose prepare /dev/loop2 /dev/loop3
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=fsid
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_type
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
INFO:ceph-disk:Running command: ceph-osd --cluster=ceph --show-config-value=osd_journal_size
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_cryptsetup_parameters
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_key_size
INFO:ceph-disk:Running command: ceph-conf --cluster=ceph --name=osd. --lookup osd_dmcrypt_type
INFO:ceph-disk:Running command: /sbin/parted --machine -- /dev/loop3 print
WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data
DEBUG:ceph-disk:Creating journal partition num 1 size 100 on /dev/loop3
INFO:ceph-disk:Running command: /sbin/sgdisk --new=1:0:+100M --change-name=1:ceph journal --partition-guid=1:c0128d4c-2580-44c2-b8be-7d76b227ec80 --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/loop3
Information: Moved requested sector from 34 to 2048 in
order to align on 2048-sector boundaries.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
Traceback (most recent call last):
  File "./ceph-disk", line 2995, in 
    main()
  File "./ceph-disk", line 2973, in main
    args.func(args)
  File "./ceph-disk", line 1649, in main_prepare
    luks=luks
  File "./ceph-disk", line 1286, in prepare_journal
    return prepare_journal_dev(data, journal, journal_size, journal_uuid, journal_dm_keypath, cryptsetup_parameters, luks)
  File "./ceph-disk", line 1193, in prepare_journal_dev
    update_partition('-a', journal, 'prepared')
  File "./ceph-disk", line 1029, in update_partition
    if float(rel) < 7.0 and platform_distro().startswith(('centos', 'red', 'scientific')):
ValueError: invalid literal for float(): 7.0.1406
test_activate: 209: return 1
activate_dev_body: 327: return 1
test_activate_dev: 352: status=1
test_activate_dev: 354: destroy_dev vdf.disk /dev/loop2
destroy_dev: 310: local name=vdf.disk
destroy_dev: 311: local dev=/dev/loop2
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop2p1
umount: /dev/loop2p1: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop2p2
umount: /dev/loop2p2: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop2p3
umount: /dev/loop2p3: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop2p4
umount: /dev/loop2p4: mountpoint not found
destroy_dev: 314: true
destroy_dev: 316: losetup --detach /dev/loop2
destroy_dev: 317: rm vdf.disk
test_activate_dev: 355: destroy_dev vdg.disk /dev/loop3
destroy_dev: 310: local name=vdg.disk
destroy_dev: 311: local dev=/dev/loop3
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop3p1
umount: /dev/loop3p1: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop3p2
umount: /dev/loop3p2: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop3p3
umount: /dev/loop3p3: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop3p4
umount: /dev/loop3p4: mountpoint not found
destroy_dev: 314: true
destroy_dev: 316: losetup --detach /dev/loop3
destroy_dev: 317: rm vdg.disk
test_activate_dev: 356: destroy_dev vdh.disk /dev/loop4
destroy_dev: 310: local name=vdh.disk
destroy_dev: 311: local dev=/dev/loop4
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop4p1
umount: /dev/loop4p1: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop4p2
umount: /dev/loop4p2: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop4p3
umount: /dev/loop4p3: mountpoint not found
destroy_dev: 314: true
destroy_dev: 313: for partition in 1 2 3 4
destroy_dev: 314: umount /dev/loop4p4
umount: /dev/loop4p4: mountpoint not found
destroy_dev: 314: true
destroy_dev: 316: losetup --detach /dev/loop4
destroy_dev: 317: rm vdh.disk
test_activate_dev: 358: return 1
run: 476: return 1


# This was fixed in rhel 7.x, checking for it.
rel = platform_information()[1]
if float(rel) < 7.0 and platform_distro().startswith(('centos', 'red', 'scientific')):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't need a call to float() here as it can potentially break (as @dachary points in the comments).

A simple `startswith('7') would be enough.

Also, calling the version rel is misleading. I would suggest to keep the first if condition and then add the version
comparison afterwards to make it explicit that this is only intended for RHEL:

# On RHEL and CentOS distros, calling partprobe forces a reboot of the
# server. Since we are not resizing partitons so we rely on calling
# partx
if platform_distro().startswith('red'):
    # This was fixed in rhel 7.x, checking for it.
    _, version, _ = platform_information()
    if not version.startswith('7'):

@itamarla
Copy link
Author

itamarla commented Mar 2, 2015

That was my original intention but then I thought it would break again once rhel 8 is out.

@alfredodeza
Copy link
Contributor

@itamarla when RHEL8 comes out and only if it displays a similar issue we can address it then.

Note though that a) RHEL7 doesn't need this fix and b) we have no idea what RHEL8 will need.

@ghost
Copy link

ghost commented Mar 2, 2015

@itamarla it looks like @alfredodeza tested with rhel7 and it works. It would greatly help if you can list the steps to reproduce the problem you have.

@itamarla
Copy link
Author

itamarla commented Mar 2, 2015

when you create an osd using ceph-disk prepare /dev/sdb /dev/sdc it fails as it tries to run partx on the newly created devices. I'm not sure your tests actualy try this out on real hardware..

@ghost
Copy link

ghost commented Mar 2, 2015

@itamarla can you show the error output ? That will clarify things. The tests are done on real hardware.

@loic-bot
Copy link

loic-bot commented Mar 2, 2015

SUCCESS: the output of run-make-check.sh on centos-centos7 for 9c76e42 is http://paste2.org/B8gAHLMX

:octocat: Sent from GH.

@alfredodeza
Copy link
Contributor

this code as is will not work.

My example was not meant to be a full replacement.

The if condition there catches all RHEL distros, but we still need to run partx on everything else that is not RHEL 7. The current version of this code will not run anything on RHEL7

@itamarla
Copy link
Author

itamarla commented Mar 3, 2015

It will, the next block of code runs partprobe for all
On Mar 3, 2015 12:07 AM, "Alfredo Deza" notifications@github.com wrote:

this code as is will not work.

My example was not meant to be a full replacement.

The if condition there catches all RHEL distros, but we still need to run
partx on everything else that is not RHEL 7. The current version of this
code will not run anything on RHEL7


Reply to this email directly or view it on GitHub
#3838 (comment).

# This was fixed in rhel 7.x, checking for it.
_, version, _ = platform_information()
if not version.startswith('7'):
if platform_distro().startswith(('centos', 'red', 'scientific')):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will never match because of 1026. Can't start with centos if it starts with red.

@ghost
Copy link

ghost commented Mar 3, 2015

Temporarily closing pending more information / discussions at http://tracker.ceph.com/issues/10987#note-6. It can be re-opened later, when the bug is reproduced.

@ghost ghost closed this Mar 3, 2015
@ghost ghost reopened this Mar 3, 2015
@ghost
Copy link

ghost commented Mar 8, 2015

Although partprobe could be used on rhel7, partx is also valid and tests at http://tracker.ceph.com/issues/10987#note-15 show it works. Whatever issue http://tracker.ceph.com/issues/10987 is about does not seem related to the use of partx or partprobe.

@ghost ghost closed this Mar 8, 2015
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants