cephadm/ceph-volume: do not use lvm binary in containers #43536

Merged
sebastian-philipp merged 2 commits into ceph:master from lvm-wrapper on Nov 10, 2021

Conversation

guits
Contributor

@guits guits commented Oct 14, 2021

ceph-volume should run pv/vg/lv commands in the host namespace rather than
running them inside the container in order to avoid lvm metadata corruption.
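
As a rough illustration of the idea, the sketch below builds an `nsenter` prefix so that an LVM command runs in the host's namespaces rather than the container's. It assumes the host root filesystem is bind-mounted at `/rootfs` inside the container, and it uses only the `--root`, `--mount` and `--ipc` options that are fully visible in the test logs in this thread. The function name `host_namespace_cmd` is hypothetical and is not taken from the PR.

```python
import shlex
from typing import List


def host_namespace_cmd(cmd: List[str], rootfs: str = "/rootfs") -> List[str]:
    """Return cmd prefixed with nsenter so it runs in the host's namespaces.

    Assumption: the host root is bind-mounted at `rootfs`, so the namespace
    files of the host's PID 1 are reachable under `rootfs`/proc/1/ns.
    """
    ns_dir = f"{rootfs}/proc/1/ns"
    prefix = [
        "nsenter",
        f"--root={rootfs}",        # use the host root as the root directory
        f"--mount={ns_dir}/mnt",   # enter the host mount namespace
        f"--ipc={ns_dir}/ipc",     # enter the host IPC namespace
    ]
    return prefix + list(cmd)


if __name__ == "__main__":
    # Only prints the command line; nothing is executed here.
    print(shlex.join(host_namespace_cmd(["lvs", "--noheadings"])))
```

Building the argument list separately from executing it keeps the wrapper easy to unit-test without root privileges or a real container.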

Fixes: https://tracker.ceph.com/issues/51592
Fixes: https://tracker.ceph.com/issues/52926

@sebastian-philipp sebastian-philipp added the wip-swagner-testing My Teuthology tests label Oct 19, 2021
@sebastian-philipp
Contributor

sebastian-philipp commented Oct 20, 2021

2021-10-20T09:54:32.749 INFO:teuthology.orchestra.run.smithi081.stderr:Error EINVAL: Traceback (most recent call last):
2021-10-20T09:54:32.752 INFO:teuthology.orchestra.run.smithi081.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 1590, in _handle_command
2021-10-20T09:54:32.752 INFO:teuthology.orchestra.run.smithi081.stderr:    return self.handle_command(inbuf, cmd)
2021-10-20T09:54:32.752 INFO:teuthology.orchestra.run.smithi081.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 167, in handle_command
2021-10-20T09:54:32.753 INFO:teuthology.orchestra.run.smithi081.stderr:    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
2021-10-20T09:54:32.753 INFO:teuthology.orchestra.run.smithi081.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 416, in call
2021-10-20T09:54:32.753 INFO:teuthology.orchestra.run.smithi081.stderr:    return self.func(mgr, **kwargs)
2021-10-20T09:54:32.753 INFO:teuthology.orchestra.run.smithi081.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in <lambda>
2021-10-20T09:54:32.754 INFO:teuthology.orchestra.run.smithi081.stderr:    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # noqa: E731
2021-10-20T09:54:32.754 INFO:teuthology.orchestra.run.smithi081.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
2021-10-20T09:54:32.754 INFO:teuthology.orchestra.run.smithi081.stderr:    return func(*args, **kwargs)
2021-10-20T09:54:32.754 INFO:teuthology.orchestra.run.smithi081.stderr:  File "/usr/share/ceph/mgr/orchestrator/module.py", line 824, in _daemon_add_osd
2021-10-20T09:54:32.754 INFO:teuthology.orchestra.run.smithi081.stderr:    raise_if_exception(completion)
2021-10-20T09:54:32.755 INFO:teuthology.orchestra.run.smithi081.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 224, in raise_if_exception
2021-10-20T09:54:32.755 INFO:teuthology.orchestra.run.smithi081.stderr:    raise e
2021-10-20T09:54:32.755 INFO:teuthology.orchestra.run.smithi081.stderr:RuntimeError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/3feb7906-318b-11ec-8c28-001a4aab830c/mon.a/config
2021-10-20T09:54:32.755 INFO:teuthology.orchestra.run.smithi081.stderr:Non-zero exit code 1 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged >
2021-10-20T09:54:32.756 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr --> passed data devices: 1 physical, 0 LVM
2021-10-20T09:54:32.756 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr --> relative data size: 1.0
2021-10-20T09:54:32.756 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
2021-10-20T09:54:32.756 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.ke>
2021-10-20T09:54:32.756 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr Running command: nsenter --root=/rootfs --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/>
2021-10-20T09:54:32.757 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr  stderr: Failed to find sysfs mount point
2021-10-20T09:54:32.757 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr  stderr: dev/block/259:0/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.759 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/8:0/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.759 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/253:0/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.759 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/259:1/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.759 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/8:1/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.760 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/253:1/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.760 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/259:2/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.760 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/253:2/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.765 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/259:3/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.765 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/253:3/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.765 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr   dev/block/259:4/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.766 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr  stderr: dev/block/253:4/holders/: opendir failed: Not a directory
2021-10-20T09:54:32.766 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr  stdout: Physical volume "/dev/nvme4n1" successfully created.
2021-10-20T09:54:32.766 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr  stderr: Cannot archive volume group metadata for ceph-07ca3403-3ba9-4a6e-a971-ae4fba37cf74 to read-only filesystem.
2021-10-20T09:54:32.766 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr --> Was unable to complete a new OSD, will rollback changes
2021-10-20T09:54:32.767 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.ke>
2021-10-20T09:54:32.767 INFO:teuthology.orchestra.run.smithi081.stderr:/bin/podman: stderr  stderr: purged osd.0

https://pulpito.ceph.com/swagner-2021-10-20_09:33:48-orch:cephadm-wip-swagner-testing-2021-10-19-1336-distro-basic-smithi/6451804

@sebastian-philipp sebastian-philipp removed the wip-swagner-testing My Teuthology tests label Oct 20, 2021
@guits
Contributor Author

guits commented Nov 1, 2021

jenkins test ceph-volume tox

@guits
Contributor Author

guits commented Nov 1, 2021

jenkins test ceph-volume all

@guits
Contributor Author

guits commented Nov 1, 2021

jenkins test ceph-volume tox

@guits
Contributor Author

guits commented Nov 1, 2021

jenkins test ceph-volume all

@guits
Contributor Author

guits commented Nov 1, 2021

jenkins test ceph-volume all

@guits
Contributor Author

guits commented Nov 1, 2021

jenkins test ceph-volume tox

@sebastian-philipp sebastian-philipp added the wip-swagner-testing My Teuthology tests label Nov 1, 2021
@sebastian-philipp
Contributor

sebastian-philipp commented Nov 3, 2021

RuntimeError: cephadm exited with an error code: 1, stderr:Inferring config /var/lib/ceph/872584d6-3c00-11ec-8c28-001a4aab830c/mon.a/config
Non-zero exit code 1 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged >
/bin/podman: stderr --> passed data devices: 1 physical, 0 LVM
/bin/podman: stderr --> relative data size: 1.0
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.ke>
/bin/podman: stderr Running command: nsenter --root=/rootfs --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/>
/bin/podman: stderr  stderr: Failed to find sysfs mount point
/bin/podman: stderr  stderr: dev/block/259:0/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/8:0/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/253:0/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/259:1/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/8:1/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/253:1/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/259:2/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/253:2/holders/: opendir failed: Not a directory
/bin/podman: stderr  stderr: dev/block/259:3/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/253:3/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/259:4/holders/: opendir failed: Not a directory
/bin/podman: stderr   dev/block/253:4/holders/: opendir failed: Not a directory
/bin/podman: stderr  stdout: Physical volume "/dev/nvme4n1" successfully created.
/bin/podman: stderr  stderr: Cannot archive volume group metadata for ceph-7bfd24e0-ec83-492d-977e-044d0d79f958 to read-only filesystem.
/bin/podman: stderr --> Was unable to complete a new OSD, will rollback changes
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.ke>
/bin/podman: stderr  stderr: purged osd.0
/bin/podman: stderr -->  RuntimeError: command returned non-zero exit status: 5

https://pulpito.ceph.com/swagner-2021-11-02_13:16:45-orch:cephadm-wip-swagner-testing-2021-11-02-1007-distro-basic-smithi/6477778

@guits
Contributor Author

guits commented Nov 4, 2021

jenkins test ceph-volume tox

@guits
Contributor Author

guits commented Nov 4, 2021

jenkins test ceph-volume all

ceph-volume should run pv/vg/lv commands in the host namespace rather than
running them inside the container in order to avoid lvm metadata corruption.

Fixes: https://tracker.ceph.com/issues/52926

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>

See ceph-volume tracker for details [1]

[1] https://tracker.ceph.com/issues/52926

Fixes: https://tracker.ceph.com/issues/51592

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
@guits
Contributor Author

guits commented Nov 4, 2021

jenkins test ceph-volume all

@guits
Contributor Author

guits commented Nov 4, 2021

jenkins test ceph-volume tox

@guits
Contributor Author

guits commented Nov 4, 2021

@sebastian-philipp could you retest it? Thanks!

@guits
Contributor Author

guits commented Nov 5, 2021

jenkins test make check

@sebastian-philipp sebastian-philipp merged commit dca8827 into ceph:master Nov 10, 2021
@guits guits deleted the lvm-wrapper branch November 15, 2021 08:35
guits added a commit to guits/ceph that referenced this pull request Jan 10, 2022
The recent changes from PR ceph#43536 introduced a regression that prevents
ceph-volume from running in a containerized context on Ubuntu 18.04.

The path of the `lvs` binary differs between CentOS 8 and Ubuntu 18.04
(`/usr/sbin/lvs` and `/sbin/lvs` respectively). As a result, ceph-volume running
in a CentOS 8 container finds the `lvs` binary at `/usr/sbin/lvs` and tries to
run that path with `nsenter` on the host, which is running Ubuntu 18.04 and keeps the binary at `/sbin/lvs`.

Fixes: https://tracker.ceph.com/issues/53812

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
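
One way to avoid this kind of mismatch is to resolve the binary against the host filesystem instead of the container's. The sketch below is a hypothetical illustration, not the code from the fix: the function name `find_host_executable`, the `/rootfs` mount point, and the list of search directories are all assumptions made for the example.

```python
import os
from typing import Optional, Sequence

# Assumed search order; a real implementation might read the host's PATH.
HOST_SEARCH_DIRS = ("/usr/local/sbin", "/usr/sbin", "/sbin", "/usr/bin", "/bin")


def find_host_executable(
    name: str,
    rootfs: str = "/rootfs",
    search_dirs: Sequence[str] = HOST_SEARCH_DIRS,
) -> Optional[str]:
    """Return the host-side absolute path of `name`, or None if not found.

    The lookup is done under `rootfs` (the host root as seen from the
    container), but the returned path is relative to the host root, which is
    what a command executed through nsenter would need.
    """
    for directory in search_dirs:
        candidate = os.path.join(rootfs, directory.lstrip("/"), name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.path.join(directory, name)
    return None
```

With a CentOS 8 container on an Ubuntu 18.04 host, a lookup of this kind would be expected to return `/sbin/lvs` rather than the container's `/usr/sbin/lvs`. Note that absolute symlinks under `rootfs` resolve against the container's root, so a production version would need to handle them explicitly.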
guits added a commit to guits/ceph that referenced this pull request Jan 17, 2022
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
guits added a commit to guits/ceph that referenced this pull request Jan 17, 2022
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
sebastian-philipp added a commit that referenced this pull request Jan 18, 2022
ceph-volume: fix regression introduced via #43536

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Teoman Onay <tonay@redhat.com>
guits added a commit to guits/ceph that referenced this pull request Jan 18, 2022
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c93ffd)
guits added a commit to guits/ceph that referenced this pull request Jan 19, 2022
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c93ffd)
sebastian-philipp added a commit that referenced this pull request Jan 19, 2022
pacific: ceph-volume: fix regression introduced via #43536

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
guits added a commit that referenced this pull request Jan 25, 2022
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c93ffd)
guits added a commit that referenced this pull request Jan 25, 2022
octopus: ceph-volume: fix regression introduced via #43536
guits added a commit to guits/ceph that referenced this pull request Jan 25, 2022
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c93ffd)
NitzanMordhai pushed a commit to NitzanMordhai/ceph that referenced this pull request Nov 19, 2023
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c93ffd)