New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nautilus: mgr/fs/volumes misc fixes #36167
Conversation
|
jenkins test dashboard backend |
@ajarr There are a couple more that you might consider including here: |
|
Thanks, @smithfarm. The fixes for the following two issues are in different parts of the code base, so I preferred to handle them as separate PRs. Even though the conflicts are minor I've asked the PR author to backport it. The PR is here #36180 |
jenkins test dashboard backend |
@kotreshhr can you quickly take a look at this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
New round of testing after fixing the |
6e9f469
to
5aba130
Compare
src/pybind/mgr/volumes/fs/volume.py
Outdated
@@ -98,9 +98,24 @@ def delete_fs_volume(self, volname, confirm): | |||
"that is what you want, re-issue the command followed by " \ | |||
"--yes-i-really-mean-it.".format(volname) | |||
|
|||
ret, out, err = self.mgr.check_mon_command({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kotreshhr tests are failing now with,
AttributeError: 'Module' object has no attribute 'check_mon_command'
check_mon_command
is not in nautilus. Is it OK just to use mon_command
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, mon_command can be used. The only difference between them is that 'check_mon_command' raises if ret !=0 . So we can add that validation here in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed that'd raise an unhandled exception. If ret !=0 I just return the output of mon_command().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
I now see that all the jobs in the above run fail at test_volume_create() when trying to remove the volume created by the test. I see this in the mgr log, /a/rraja-2020-07-22_15:24:27-fs-ajarr-nautilus-testing-2020-07-22-distro-basic-smithi/5249500/remote/smithi169/log/ceph-mgr.x.log.gz
And in the mon log, /a/rraja-2020-07-22_15:24:27-fs-ajarr-nautilus-testing-2020-07-22-distro-basic-smithi/5249500/remote/smithi169/log/ceph-mon.b.log.gz
|
@@ -98,12 +98,14 @@ def delete_fs_volume(self, volname, confirm): | |||
"that is what you want, re-issue the command followed by " \ | |||
"--yes-i-really-mean-it.".format(volname) | |||
|
|||
ret, out, err = self.mgr.check_mon_command({ | |||
ret, out, err = self.mgr.mon_command({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from the test failure looks like the error could be due to this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested it locally. The config command is not taking 'mon', it is expecting specific mon, like mon.0.
It works with 'mon.*' though, so please change it and it should work.
0174573
to
6a7efb1
Compare
I see this error now here,
And this earlier in teuthology.log,
|
This is because 'ceph fs status' cmd's json support patch is missing. I think this needs following two patches.
Not sure about this. |
I verified that both tests pass locally after pulling in 4da6381 and 138117f
|
Yes, we can workaround by using the 'ceph osd poo ls' with 'detail' option to validate whether any pool has volume name as metadata. I have sent you the patch. Please apply it and run the test. Hopefully, there are no further surprises this time :) |
While volume deletion, the associated pools are not always removed. The pools are removed only if the volume is created using mgr plugin and not if created with custom osd pools. This is because mgr plugin generates pool names with specific pattern. Both create and delete volume relies on it. This patch fixes the issue by identifying the pools of the volume without relying on the pattern. Fixes: https://tracker.ceph.com/issues/45910 Signed-off-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit d07ea8d) Conflicts: src/pybind/mgr/volumes/fs/operations/volume.py: - In nautilus, fs volume create doesn't have placement arg src/pybind/mgr/volumes/fs/volume.py: - In nautilus, VolumeClient code not moved to mgr_util.py
This provides a generic framework for modifying Ceph configuration changes in tests through the monitors rather than the asok interface or local ceph.conf changes. Any changes are reverted during test teardown. A future patch will convert existing tests manipulating the local ceph.conf or admin socket. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 8729281)
Volume deletion wasn't validating mon_allow_pool_delete config before destroying volume metadata. Hence when mon_allow_pool_delete is set to false, it was deleting metadata but failed to delete pool resulting in inconsistent state. This patch validates the config before going ahead with deletion. Fixes: https://tracker.ceph.com/issues/45662 Signed-off-by: Kotresh HR <khiremat@redhat.com> (cherry picked from commit e770bb9)
Loop logic would bail out if it first sees any file system that does not match the volume it's looking for. Fixes: https://tracker.ceph.com/issues/46277 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit be74a81)
... for python versions earlier than 3.5. Signed-off-by: Ramana Raja <rraja@redhat.com>
... instead of 'check_mon_command' which is not in nautilus, and not compatible with python2 and early versions of py3. Signed-off-by: Ramana Raja <rraja@redhat.com>
…status' Signed-off-by: Kotresh HR <khiremat@redhat.com>
6a7efb1
to
3116a25
Compare
@kotreshhr do we still need 138117f ? |
Nope, we don't need for this now. |
failures were unrelated. |
Looks like #33325 is a follow-on fix for 8729281 based on #33325 (comment). Was it backported to nautilus with 9f323e0? |
see #36377 |
@neha-ojha @batrick @yuriw thanks! missed that it was modifying ceph_test_case and needed rados suite testing. |
Fixes: https://tracker.ceph.com/issues/46235
Fixes: https://tracker.ceph.com/issues/46478
Fixes: https://tracker.ceph.com/issues/46466