need workaround for ceph bug, for OSD nodes and ceph-volume batch mode #4855
Comments
Did you zap/purge/destroy (whatever you use) the devices before the initial deployment?
Can you share the logs? On my side I can rerun the ceph-volume batch command without any issue, either manually or via a new ceph-ansible execution:

# ceph-volume lvm --report --bluestore --yes batch /dev/sdb /dev/sdc
--> All devices are already used by ceph. No OSDs will be created.
That's not true. If you don't have the same device configuration on all OSD nodes, then instead of using group_vars you can use host_vars.
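For illustration, a per-host override could live under host_vars instead of group_vars. This is only a minimal sketch; the hostname and device paths below are placeholders, not taken from this issue:

```yaml
# host_vars/osd-node-1.yml  (hypothetical hostname)
# Device list for this node only, overriding the group-wide "devices" setting
devices:
  - /dev/sdb
  - /dev/sdc
```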
You don't need to do that. Again, share the logs, configuration, or anything that could help. BTW, before using stuff like
For your ceph-volume test above, was one of those devices an SSD and the other one an HDD? On the brighter side, it looks like I can share the ceph-volume log, after I filtered out the keys.
Okay. As it happens, I had to recreate my test cluster from scratch, so here is explicit demo output.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Still an issue.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This is still an issue. I just ran into this on a RHOP 16 cluster.
This has been fixed by e4c24f3.
You mean just take all the OSD nodes out of the host group and only run the ansible script against the new node being added?
ansible 2.8
ceph nautilus
This is similar to issue #4748
In THAT issue, someone commented along the lines of,
"well yes its expected to fail, if you initially create a bunch of OSDs, then call ceph-volume with a different set of devices".
Except that I'm NOT using a different set of devices. It's the exact same set that was first set up on the machine.
So apparently, there's some kind of bug in ceph, where it happily sets up a set of mixed SSD/HDD devices the first time in batch mode... but then --report is broken ever after?
Unless I'm misunderstanding, and the previous bug report is saying that ALL machines in the cluster must have the EXACT SAME device paths as each other.
Which would be insane, given that Linux can reassign device paths, and /dev/sda can become /dev/sdl after a reboot.
Right now, the only workaround I seem to have is:
Add a machine to the [osds] ansible host group one time, get it configured... then take it OUT of the group.
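As a rough sketch of that workaround (hostnames below are placeholders, not from this issue), the [osds] group would only ever contain the node being configured right now, with already-deployed nodes left out:

```ini
# inventory/hosts — while configuring osd-node-3; nodes deployed in earlier
# runs are commented out so ceph-ansible never re-runs ceph-volume batch on them
[osds]
osd-node-3
# osd-node-1   (configured in an earlier run)
# osd-node-2   (configured in an earlier run)
```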
Would be nice to have a better alternative.
(Or is that kind of expected ceph-ansible behaviour? To create a dynamic inventory file and only put in it what needs to be (re)configured this time?)
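An alternative to maintaining a hand-built inventory is Ansible's built-in host limiting, which scopes a run to the hosts you name without editing the groups. This is only a sketch of that idea; the playbook name, inventory path, and hostname are assumptions, and it does not by itself address the underlying ceph-volume report behaviour:

```sh
# Run the playbook only against the node being (re)configured
# (inventory path, site.yml, and osd-node-3 are placeholders)
ansible-playbook -i inventory/hosts site.yml --limit osd-node-3
```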