Compaction #14

Closed
wants to merge 3,185 commits into from

luosizhe-HW

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

batrick and others added 30 commits February 4, 2020 17:52
The master version of this PR deals with the unified asok/tell
interface. Nautilus- are separated.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 9dc07d8)
Rationale can be found in [1]. Point is that EC pools incur a
significant performance penalty when dealing with small files and xattr
updates. This is because _every_ inode has a corresponding data pool
object with backtrace information stored in its xattr.

[1] doc/cephfs/createfs.rst

Fixes: https://tracker.ceph.com/issues/42450
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 3e0aee5)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit bf0cf8e)

Conflicts:
	qa/tasks/cephfs/filesystem.py
In the future, we should add the EC data pool as a supplementary data
pool but that requires a mount to setup which is awkward in the code
here. When cephfs-shell is more widely available, this will be easier.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 6e448f9)
Connection pointer is not helpful.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 9b71bbe)

Conflicts:
	src/mgr/DaemonServer.cc: actually do print the Connection*, Connection& cannot be dumped in Nautilus.
If the mgr is waiting on daemon metadata from the mons, it has no
DaemonState associated with the daemon yet. If we try to process this
MgrOpen, the metadata sent by the daemon (like its config) will not be
recorded.

Fixes: https://tracker.ceph.com/issues/43037
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 16a1deb)
Added 'telemetry show-device' command to print a preview of telemetry device report.
Added a message at the bottom of 'telemetry show' about 'telemetry show-device' new command.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit cae87cc)

Conflicts:
- path: src/pybind/mgr/telemetry/module.py
  comment: the nautilus version of json.dumps() doesn't have the sort_keys arg.
smartctl JSON output contains the device's serial number in two
different keys ('serial_number' & 'output'). Serial is now obfuscated in
both.

Fixes: https://tracker.ceph.com/issues/43939
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit be1257f)
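
A minimal sketch of the idea in the commit message above: masking the serial number both in its own key and wherever it is echoed in the raw output. The flat report layout, the `anonymize_smartctl_report` name, and the `"deleted"` placeholder are assumptions for illustration; the real telemetry module may structure this differently.

```python
import copy


def anonymize_smartctl_report(report, placeholder="deleted"):
    """Return a copy of report with the serial number masked everywhere.

    Assumes a hypothetical flat layout: 'serial_number' is a string and
    'output' is either a string or a list of strings.
    """
    cleaned = copy.deepcopy(report)
    serial = cleaned.get("serial_number")
    if not serial:
        # Nothing to mask.
        return cleaned
    cleaned["serial_number"] = placeholder
    output = cleaned.get("output")
    if isinstance(output, list):
        cleaned["output"] = [line.replace(serial, placeholder) for line in output]
    elif isinstance(output, str):
        cleaned["output"] = output.replace(serial, placeholder)
    return cleaned
```

Masking only the `serial_number` key would still leak the serial through the raw text, which is the gap the commit describes.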
get_metadata() returns 'None' when requesting a missing service, hence
trying to access its content fails. Added a check for osd and mgr
get_metadata() calls.

Fixes: https://tracker.ceph.com/issues/43642
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 9e7a0cb)
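
The None check described above can be sketched roughly as follows. The `gather_metadata` helper and the callable passed in as `get_metadata` are hypothetical stand-ins, not the module's actual code; the point is only that a missing service is skipped before its content is read.

```python
def gather_metadata(get_metadata, service_type, service_ids):
    """Collect metadata dicts, skipping services that have none.

    get_metadata is any callable taking (service_type, service_id) and
    returning a dict, or None when the service is missing.
    """
    found = {}
    for service_id in service_ids:
        metadata = get_metadata(service_type, service_id)
        if metadata is None:
            # Missing service: nothing to read, so skip instead of failing.
            continue
        found[service_id] = metadata
    return found
```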
nautilus: mgr/dashboard: Using wrong identifiers in RGW user/bucket datatables

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
nautilus: mgr/dashboard: iSCSI targets not available if any gateway is down (and more...)

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
nautilus: mgr/dashboard: add debug mode, and accept expected exception when SSL handshaking

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
nautilus: mgr/dashboard: Dashboard can't handle self-signed cert on Grafana API

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
nautilus: mgr/dashboard: check embedded Grafana dashboard references

Reviewed-by: Laura Paduano <lpaduano@suse.com>
nautilus: mgr/dashboard: check if user has config-opt permissions

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
nautilus: mgr/dashboard: disable 'Add Capability' button in rgw user edit

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Nothing inherits from PQ.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 761cc0e)

Conflicts:
	src/mds/PurgeQueue.h
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 096a5ca)
This makes the corresponding test not racy.

Fixes: https://tracker.ceph.com/issues/16881
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

Conflicts:
	src/mds/PurgeQueue.cc
	src/mds/PurgeQueue.h
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 98e3b7e)
Note: removed mgr blacklist test which applies to Octopus.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The squelched error prevented us from knowing connection cleanup doesn't
work on py3.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b45c08b)
Otherwise this raises an exception.

Fixes: https://tracker.ceph.com/issues/43113
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 03f8080)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 6714364)
nautilus: mgr/MgrClient: fix open condition

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
nautilus: selinux: Allow ceph to read udev db

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Boris Ranto <branto@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
nautilus: kv: fix shutdown vs async compaction

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
nautilus: osd: Diagnostic logging for upmap cleaning

Reviewed-by: David Zafman <dzafman@redhat.com>
jan--f and others added 27 commits February 14, 2020 18:00
nautilus: ceph-volume: don't remove vg twice when zapping filestore
nautilus: mgr/DaemonServer: fix 'osd ok-to-stop' for EC pools

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
nautilus: qa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc

Reviewed-by: Kefu Chai <kchai@redhat.com>
nautilus: common/bl: fix the dangling last_p issue.

Reviewed-by: Kefu Chai <kchai@redhat.com>
nautilus: ceph-monstore-tool: correct the key for storing mgr_command_descs

Reviewed-by: Kefu Chai <kchai@redhat.com>
Add the min_sample lower-bound argument too

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7be5c13)
Conflicts: had to be backported to enable backporting of
ceph#32903
Backport tracker: https://tracker.ceph.com/issues/43873
… hours

Telemetry module fetches device metrics which were scraped in the last
"telemetry interval"*2 (=48 hours by default) by calling
_get_device_metrics() with min_sample. _get_device_metrics() fetches the
metrics from omap and breaks on the first one that is older than
min_sample. But because it fetched in ascending order (from oldest to
newest) it was breaking on the first one it received, if it was older
than the interval above. We need to pass min_sample to get_omap_vals()
so it will start fetching from that value.

Fixes: https://tracker.ceph.com/issues/43837
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 5f7e4a9)
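
The ordering problem described above can be illustrated with a small sketch. `SAMPLES`, `fetch_ascending`, and the timestamp key format are assumptions standing in for the omap contents and `get_omap_vals()`; this is not the devicehealth module's code.

```python
# Hypothetical stand-in for omap entries: sortable timestamp keys,
# iterated in ascending order (oldest first), as the commit describes.
SAMPLES = {
    "20200101-000000": {"temp": 30},
    "20200102-000000": {"temp": 31},
    "20200110-000000": {"temp": 33},
    "20200111-000000": {"temp": 34},
}


def fetch_ascending(start_after=""):
    """Yield (key, value) pairs in ascending key order, after start_after."""
    for key in sorted(SAMPLES):
        if key > start_after:
            yield key, SAMPLES[key]


def recent_metrics_buggy(min_sample):
    # Breaks on the first entry older than min_sample. Since the scan
    # starts at the oldest entry, it stops at once if any old sample exists.
    result = {}
    for key, value in fetch_ascending():
        if key < min_sample:
            break
        result[key] = value
    return result


def recent_metrics_fixed(min_sample):
    # Start the scan at min_sample so only recent entries are visited.
    return dict(fetch_ascending(start_after=min_sample))
```

With a cutoff of `"20200109-000000"`, the first function returns nothing because the oldest key fails the check immediately, while the second returns the two newest samples.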
Upgrade to 2.8.1 and stable-4.0 respectively

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
nautilus: qa/ceph-ansible: ansible-version and ceph_ansible

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
This was done for octopus in 8283ea9,
but not for nautilus

Signed-off-by: Neha Ojha <nojha@redhat.com>
nautilus: qa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering
Caused by backport commit cb48be5 which
did not account for the explicit drop of the message reference, only in
Nautilus-.

Fixes: https://tracker.ceph.com/issues/44245
Fixes: cb48be5
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
* refs/pull/33498/head:
	mgr: drop reference to msg on return

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Saw a deadlock when deleting a lot of subvolumes -- purge threads were
stuck acquiring the global lock for volume access. This can happen
when there is a concurrent remove (which renames and signals the
purge threads) and a purge thread is just about to scan the trash
directory for entries.

For the fix, purge threads fetch entries by accessing the volume
in lockless mode. This is safe from a functionality point of view, as
the rename and directory scan are correctly handled by the filesystem.
In the worst case the purge thread picks up the trash entry on the next
scan, never leaving a stale trash entry.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 808a1ce)
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 5ec09a2)
Fixes: http://tracker.ceph.com/issues/44282
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 92b2008)
* refs/pull/33526/head:
	test: verify purge queue w/ large number of subvolumes
	test: pass timeout argument to mount::wait_for_dir_empty()
	mgr/volumes: access volume in lockless mode when fetching async job

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
nautilus: mgr/devicehealth: fix telemetry stops sending device reports after 48 hours

Reviewed-by: Sage Weil <sage@redhat.com>
If the async threads hit a temporary exception the job is
never unregistered and therefore gets skipped by the async
threads on subsequent scans.

Patrick hit this in nautilus when one of the purge threads
hit an exception when trying to log a message. The trash
entry was never picked up again by the purge threads.

Fixes: http://tracker.ceph.com/issues/44315
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 46476ef)
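
One way to picture the fix described above is a tracker that always unregisters a job once a worker is done with it, whether or not the work raised. The `AsyncJobs` class and its method names are hypothetical; the actual mgr/volumes code is organized differently.

```python
class AsyncJobs:
    """Toy tracker for jobs currently claimed by worker threads (hypothetical)."""

    def __init__(self):
        self.in_progress = set()

    def next_job(self, candidates):
        # Skip anything a worker has already claimed.
        for job in candidates:
            if job not in self.in_progress:
                self.in_progress.add(job)
                return job
        return None

    def run(self, job, work):
        try:
            work(job)
            return True
        except Exception as exc:
            print("job %s failed: %s" % (job, exc))
            return False
        finally:
            # Always unregister, so a failed job is eligible on the next scan.
            self.in_progress.discard(job)
```

Without the `finally` clause, a job whose work raised would stay in `in_progress` and be skipped by every later scan, which matches the stale trash entry described in the commit message.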
* refs/pull/33569/head:
	mgr/volumes: unregister job upon async threads exception

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
@it-is-a-robot

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

The following commits have not yet signed CLA.

3f7f0c7 | mds: skip tell command scrub on multimds

The master version of this PR deals with the unified asok/tell
interface. Nautilus- are separated.

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
95baab0 | qa: add tests for CephFS admin commands

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 9dc07d8)
1ee9f2c | mon/MDSMonitor: warn when creating fs with default EC data pool

Rationale can be found in [1]. Point is that EC pools incur a
significant performance penalty when dealing with small files and xattr
updates. This is because every inode has a corresponding data pool
object with backtrace information stored in its xattr.

[1] doc/cephfs/createfs.rst

Fixes: https://tracker.ceph.com/issues/42450
Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 3e0aee5)
7a41fab | qa: add tests for adding EC data pools

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit bf0cf8e)

Conflicts:
qa/tasks/cephfs/filesystem.py
83e9e9c | qa: force creation of fs with EC default data pool

In the future, we should add the EC data pool as a supplementary data
pool but that requires a mount to setup which is awkward in the code
here. When cephfs-shell is more widely available, this will be easier.

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 6e448f9)
372d184 | mgr: improve debug message information

Connection pointer is not helpful.

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 9b71bbe)

Conflicts:
src/mgr/DaemonServer.cc: actually do print the Connection*, Connection& cannot be dumped in Nautilus.
cb48be5 | mgr: drop session with Ceph daemon when not ready

If the mgr is waiting on daemon metadata from the mons, it has no
DaemonState associated with the daemon yet. If we try to process this
MgrOpen, the metadata sent by the daemon (like its config) will not be
recorded.

Fixes: https://tracker.ceph.com/issues/43037
Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 16a1deb)
c4dc4f0 | mgr/telemetry: added 'telemetry show-device' command

Added 'telemetry show-device' command to print a preview of telemetry device report.
Added a message at the bottom of 'telemetry show' about 'telemetry show-device' new command.

Signed-off-by: Yaarit Hatuka yaarit@redhat.com
(cherry picked from commit cae87cc)

Conflicts:

  • path: src/pybind/mgr/telemetry/module.py
    comment: the nautilus version of json.dumps() doesn't have the sort_keys arg.
3c7d09e | mgr/telemetry: anonymizing smartctl report itself

smartctl JSON output contains the device's serial number in two
different keys ('serial_number' & 'output'). Serial is now obfuscated in
both.

Fixes: https://tracker.ceph.com/issues/43939
Signed-off-by: Yaarit Hatuka yaarit@redhat.com
(cherry picked from commit be1257f)
6599f5a | mgr/telemetry: check get_metadata return val

get_metadata() returns 'None' when requesting a missing service, hence
trying to access its content fails. Added a check for osd and mgr
get_metadata() calls.

Fixes: https://tracker.ceph.com/issues/43642
Signed-off-by: Yaarit Hatuka yaarit@redhat.com
(cherry picked from commit 9e7a0cb)
1bbdf03 | Merge pull request ceph#32888 from shyukri/wip-42033-nautilus

nautilus: mgr/dashboard: Using wrong identifiers in RGW user/bucket datatables

Reviewed-by: Laura Paduano lpaduano@suse.com
Reviewed-by: Volker Theile vtheile@suse.com
5362d0b | Merge pull request ceph#32304 from ricardoasmarques/wip-43333-nautilus

nautilus: mgr/dashboard: iSCSI targets not available if any gateway is down (and more...)

Reviewed-by: Jason Dillaman dillaman@redhat.com
Reviewed-by: Laura Paduano lpaduano@suse.com
Reviewed-by: Tiago Melo tmelo@suse.com
Reviewed-by: Volker Theile vtheile@suse.com
08239d5 | Merge pull request ceph#31190 from rhcs-dashboard/wip-42294-nautilus

nautilus: mgr/dashboard: add debug mode, and accept expected exception when SSL handshaking

Reviewed-by: Kefu Chai kchai@redhat.com
Reviewed-by: Laura Paduano lpaduano@suse.com
Reviewed-by: Stephan Müller smueller@suse.com
Reviewed-by: Tatjana Dehler tdehler@suse.com
539cc8b | Merge pull request ceph#31792 from rhcs-dashboard/wip-42936-nautilus

nautilus: mgr/dashboard: Dashboard can't handle self-signed cert on Grafana API

Reviewed-by: Ernesto Puerta epuertat@redhat.com
Reviewed-by: Laura Paduano lpaduano@suse.com
Reviewed-by: Volker Theile vtheile@suse.com
a38345e | Merge pull request ceph#31808 from bk201/wip-42956-nautilus

nautilus: mgr/dashboard: check embedded Grafana dashboard references

Reviewed-by: Laura Paduano lpaduano@suse.com
f94715a | Merge pull request ceph#32827 from rhcs-dashboard/wip-43811-nautilus

nautilus: mgr/dashboard: check if user has config-opt permissions

Reviewed-by: Ernesto Puerta epuertat@redhat.com
Reviewed-by: Laura Paduano lpaduano@suse.com
3fd5cbb | Merge pull request ceph#32930 from rhcs-dashboard/wip-43845-nautilus

nautilus: mgr/dashboard: disable 'Add Capability' button in rgw user edit

Reviewed-by: Laura Paduano lpaduano@suse.com
c47ea1e | mds: mark purge queue protected members private

Nothing inherits from PQ.

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 761cc0e)

Conflicts:
src/mds/PurgeQueue.h
3cf7870 | qa: use correct variable for exception debug

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 096a5ca)
f00276b | mds: track high water mark for purges

This makes the corresponding test not racy.

Fixes: https://tracker.ceph.com/issues/16881
Signed-off-by: Patrick Donnelly pdonnell@redhat.com

Conflicts:
src/mds/PurgeQueue.cc
src/mds/PurgeQueue.h
5e1d712 | qa: test mgr cephfs mount blacklist

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 98e3b7e)
Note: removed mgr blacklist test which applies to Octopus.
9e11deb | qa: improve variable name

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
e48394d | pybind/mgr/volumes: print errors in cleanup timer

The squelched error prevented us from knowing connection cleanup doesn't
work on py3.

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit b45c08b)
1157035 | pybind/mgr/volumes: use py3 items iterator

Otherwise this raises an exception.

Fixes: https://tracker.ceph.com/issues/43113
Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 03f8080)
c03baa3 | qa: test volumes plugin mount cleanup

Signed-off-by: Patrick Donnelly pdonnell@redhat.com
(cherry picked from commit 6714364)
49587d8 | Merge pull request ceph#32769 from liewegas/fix-42566-nautilus

nautilus: mgr/MgrClient: fix open condition

Reviewed-by: Josh Durgin jdurgin@redhat.com
f5d69c4 | Merge pull request ceph#32259 from smithfarm/wip-43243-nautilus

nautilus: selinux: Allow ceph to read udev db

Reviewed-by: Kefu Chai kchai@redhat.com
Reviewed-by: Boris Ranto branto@redhat.com
Reviewed-by: Neha Ojha nojha@redhat.com
5b624a4 | Merge pull request ceph#32715 from smithfarm/wip-43620-nautilus

nautilus: kv: fix shutdown vs async compaction

Reviewed-by: Neha Ojha nojha@redhat.com
Reviewed-by: Sage Weil sage@redhat.com
250a778 | Merge pull request ceph#32716 from smithfarm/wip-43650-nautilus

nautilus: osd: Diagnostic logging for upmap cleaning

Reviewed-by: David Zafman dzafman@redhat.com
fe2035f | Merge branch 'nautilus' into wip-42120-nautilus
443a6cf | Merge pull request ceph#32773 from dzafman/wip-43246-nautilus

nautilus: osd: Use physical ratio for nearfull (doesn't include backfill reserve)

Reviewed-by: Neha Ojha nojha@redhat.com
5030d73 | Merge pull request ceph#32843 from smithfarm/wip-43099-nautilus

nautilus: osd/OSD: enhance osd numa affinity compatibility

Reviewed-by: Kefu Chai kchai@redhat.com
Reviewed-by: Sage Weil sage@redhat.com
48ed41f | Merge pull request ceph#32847 from smithfarm/wip-43319-nautilus

nautilus: osd/PeeringState.cc: skip peer_purged when discovering all missing

Reviewed-by: Sage Weil sage@redhat.com
Reviewed-by: Neha Ojha nojha@redhat.com
78240e3 | Merge pull request ceph#32848 from smithfarm/wip-43346-nautilus

nautilus: qa/suites/rados/thrash: force normal pg log length with cache tiering

Reviewed-by: Sage Weil sage@redhat.com
5b95c39 | Merge pull request ceph#32857 from smithfarm/wip-43471-nautilus

nautilus: osd/PeeringState.cc: don't let num_objects become negative

Reviewed-by: Neha Ojha nojha@redhat.com
e28dea6 | Merge pull request ceph#30689 from smithfarm/wip-42120-nautilus

nautilus: core: osd/OSDMap: health alert for non-power-of-two pg_num

Reviewed-by: Neha Ojha nojha@redhat.com
259dd74 | ceph-volume: util: look for executable in $PATH

Fixes: https://tracker.ceph.com/issues/36728

Fallback to predefined paths for backward compatibility.
Alter test involved for partial match in warning

Signed-off-by: Shyukri Shyukriev shshyukriev@suse.com
(cherry picked from commit a857708)
2c6ca6c | lvm/activate.py: clarify error message: fsid refers to osd_fsid

Users complained[1] the error message isn't clear, and they thought
it referred to the cluster fsid instead of the osd_fsid.
Made it clearer.

[1] rook/rook#4547

Fixes: https://tracker.ceph.com/issues/43442
Signed-off-by: Yaniv Kaul ykaul@redhat.com
(cherry picked from commit ff3ba92)
2dab9b2 | ceph-volume: import mock.mock instead of unittest.mock (py2)

Fixes: bb4de1a
Fixes: https://tracker.ceph.com/issues/42970

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit c1bd09f)

Conflicts:
src/ceph-volume/ceph_volume/tests/api/test_lvm.py
a098289 | ceph-volume: fix the type mismatch, convert the tries and interval to int from string.
Fixes: https://tracker.ceph.com/issues/43186

Signed-off-by: dongdong tao dongdong.tao@canonical.com
(cherry picked from commit 81ff4be)
3437446 | ceph-volume: use correct extents when using db-devices and >1 osds_per_device

Actual data size depending on osds_per_device needs to be calculated here. Otherwise, if osds_per_device is greater than 1, ceph-volume will allocate 100% of the device to the first osd and then fail to create the LV for the second because the volume group is already full.

Fixes: https://tracker.ceph.com/issues/39442
Signed-off-by: Fabian Niepelt f.niepelt@mittwald.de
(cherry picked from commit ecde6cd)

Conflicts:
src/ceph-volume/ceph_volume/devices/lvm/strategies/bluestore.py

I've removed data_uuid since it's not in nautilus already
eb0c52a | ceph-volume: Dereference symlink in lvm list

This allows for a symlink to be passed to

ceph-volume lvm list <path>

which makes it possible to use /dev/disk/by-path/* devices, for
instance.

Fixes: https://tracker.ceph.com/issues/43497
Signed-off-by: Benoît Knecht bknecht@protonmail.ch
(cherry picked from commit 09fa3df)
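
As a rough illustration of dereferencing a symlink before looking a device up, the sketch below uses `os.path.realpath` from the Python standard library. The `resolve_device_path` helper and the temporary files standing in for `/dev/disk/by-path/*` entries are assumptions; the actual ceph-volume change may resolve paths differently.

```python
import os
import tempfile


def resolve_device_path(path):
    """Return the canonical path, following any symlinks."""
    return os.path.realpath(path)


def demo():
    # Build a throwaway "device" file plus a symlink pointing at it.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "sdb")
        link = os.path.join(tmp, "by-path-link")
        open(target, "w").close()
        os.symlink(target, link)
        # The link and the target should resolve to the same canonical path.
        return resolve_device_path(link) == os.path.realpath(target)
```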
24a84c5 | mon/MgrMonitor.cc: warn about missing mgr in a cluster with osds

A lot of our functionality depends on the mgr now. If there is a cluster
set up with osds but no mgr, issue a warning.

Fixes: https://tracker.ceph.com/issues/38942
Signed-off-by: Neha Ojha nojha@redhat.com
(cherry picked from commit c20e51c)
7b595aa | OSD: Allow 64-char hostname to be added as the "host" in CRUSH

On a Linux system it is possible to set a 64-character hostname when
HOST_NAME_MAX is set to 64. This means that when we call gethostname
we should expect HOST_NAME_MAX characters + 1 for the null
character terminating the hostname string, as described here:
http://man7.org/linux/man-pages/man2/sethostname.2.html

With the current code, on a host with a 64-character hostname the osd
updates the crush map with host=unknown_host during start.

Signed-off-by: Michal Skalski mskalski@juniper.net
(cherry picked from commit 5201048)
0918862 | rgw: maybe coredump when reload operator happened

Fixes: https://tracker.ceph.com/issues/42119

Signed-off-by: Richard Bai(白学余) baixueyu@inspur.com
(cherry picked from commit 78125a8)
d1d725f | rgw: fix one part of the bulk delete(RGWDeleteMultiObj_ObjStore_S3)fails but no error messages
Fixes: None
Signed-off-by: Snow Si silonghu@inspur.com

(cherry picked from commit ff2f4af)
682f231 | mon: Don't put session during feature change

The session gets put as result of the set_session call in the next
block.

Fixes: https://tracker.ceph.com/issues/38345

Signed-off-by: Brad Hubbard bhubbard@redhat.com
(cherry picked from commit 8f23f2c)
4b60c0e | mon/ConfigMonitor: only propose if leader

Introduced by 08fcf01, which activated
this (broken) code path.

Fixes: https://tracker.ceph.com/issues/43892
Signed-off-by: Sage Weil sage@redhat.com
(cherry picked from commit ab64a22)
b074312 | mgr/prometheus: report per-pool pg states

This commit adds per-pool pg states metrics
with unique 'pool_id' label.

Signed-off-by: Aleksei Zakharov zakharov.a.g@yandex.ru
(cherry picked from commit 8fb16e4)
133dd7e | mgr/prometheus: pg count by pool

If we have all other stats by pool, it's better to have total
count by pool too. We always can sum() all of total, but it's
hard to count by-pool total.

Signed-off-by: Aleksei Zakharov zakharov.a.g@yandex.ru
(cherry picked from commit 41e7d20)
1bd1929 | mgr/prometheus: pg counters per pool descriptions

Update pg metrics descriptions to show that we have per
pool stats now.

Signed-off-by: Aleksei Zakharov zakharov.a.g@yandex.ru
(cherry picked from commit 96bd77d)
191ef97 | monitoring/grafana,prometheus: add per-pool pg states support

Signed-off-by: Aleksei Zakharov zaharov@selectel.ru
(cherry picked from commit 4eb58f7)
654071d | mgr/grafana: sum pg states for cluster

Also, revert table formatting.

Signed-off-by: Aleksei Zakharov zaharov@selectel.ru
(cherry picked from commit a37cf38)
e9ff205 | mgr/telemetry: split entity_name only once (handle ids with dots)

If an entity name (id.type) has more than one dot (i.e. 'type' has
dots), split only on the first one

Fixes: https://tracker.ceph.com/issues/43313
Signed-off-by: Dan Mick dan.mick@redhat.com
(cherry picked from commit 626c7df)
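
Splitting only on the first dot corresponds to passing a `maxsplit` of 1 to Python's `str.split`. The helper name and the sample entity name below are illustrative assumptions, not taken from the telemetry module.

```python
def split_entity_name(entity_name):
    """Split on the first dot only; later dots stay in the second part."""
    first, rest = entity_name.split(".", 1)
    return first, rest


def split_entity_name_naive(entity_name):
    # Splits on every dot, so a dotted name yields more than two parts.
    return entity_name.split(".")
```

For a name such as `mds.host.example.com`, the naive split produces four parts, while the single split keeps `host.example.com` together.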
0253205 | mgr/pg_autoscaler: calculate pool_pg_target using pool size

Using raw_used_rate to calculate the pool_pg_target results in too
many PGs for erasure coded pools (e.g. EC 4+2 has raw_used_rate=1.5
but size is 6, so there will be 4x too many PGs). Calculate using
p['size'] instead.

Signed-off-by: Dan van der Ster daniel.vanderster@cern.ch
Fixes: https://tracker.ceph.com/issues/43546
(cherry picked from commit b766d5f)
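
The 4x figure in the message above can be checked with a small worked example. The formula below is a simplified assumption (a capacity share times a PG budget, divided by either raw_used_rate or size), and the budget of 10000 PGs and the 0.5 capacity share are made-up inputs; only the EC 4+2 values come from the commit message.

```python
def pool_pg_target(capacity_ratio, root_pg_target, divisor):
    """Simplified PG target: share of the PG budget divided by a per-pool factor."""
    return capacity_ratio * root_pg_target / divisor


# EC 4+2 pool: k=4 data chunks, m=2 coding chunks.
k, m = 4, 2
raw_used_rate = (k + m) / k   # 1.5, raw bytes stored per logical byte
size = k + m                  # 6, shards (PG instances) per PG

# Hypothetical inputs: the pool holds half the capacity, budget is 10000 PGs.
before = pool_pg_target(0.5, 10000, raw_used_rate)
after = pool_pg_target(0.5, 10000, size)
overshoot = before / after    # how many times too many PGs the old formula gave
```

Because each PG of this pool occupies 6 OSD slots rather than 1.5, dividing by raw_used_rate overshoots by size / raw_used_rate = 6 / 1.5 = 4.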
0e0b519 | rgw: update the hash source for multipart entries during resharding
Fixes: https://tracker.ceph.com/issues/43583

Signed-off-by: dongdong tao dongdong.tao@canonical.com
(cherry picked from commit fb6f78a)

Conflicts:
src/rgw/rgw_reshard.cc

nautilus: test: Fix wait_for_state() to wait for a PG to get into a state

Reviewed-by: Neha Ojha nojha@redhat.com
66c57a0 | Merge pull request ceph#32845 from smithfarm/wip-43245-nautilus

nautilus: os/bluestore/BlueStore.cc: set priorities for compression stats

Reviewed-by: Igor Fedotov ifedotov@suse.com
84ab662 | Merge pull request ceph#32858 from smithfarm/wip-43473-nautilus

nautilus: common: fix deadlocky inflight op visiting in OpTracker.

Reviewed-by: Kefu Chai kchai@redhat.com
02cb84c | Merge pull request ceph#32901 from smithfarm/wip-43631-nautilus

nautilus: common/util: use ifstream to read from /proc files

Reviewed-by: Kefu Chai kchai@redhat.com
af24a4a | Merge pull request ceph#32997 from neha-ojha/wip-32939-nautilus

nautilus: mon/MgrMonitor.cc: add always_on_modules to the output of "ceph mgr module ls"

Reviewed-by: David Zafman dzafman@redhat.com
b57c249 | Merge pull request ceph#32846 from smithfarm/wip-43256-nautilus

nautilus: common/config: update values when they are removed via mon

Reviewed-by: Sage Weil sage@redhat.com
6036662 | Merge pull request ceph#32998 from neha-ojha/wip-min-alloc-nautilus

nautilus: common/options: bluestore 4k min_alloc_size for SSD

Reviewed-by: Sage Weil sage@redhat.com
Reviewed-by: Igor Fedotov ifedotov@suse.com
4cdbe54 | Merge pull request ceph#32556 from shyukri/wip-43022-nautilus

nautilus: ceph-volume: minor clean-up of "simple scan" subcommand help
ac76464 | Merge pull request ceph#32558 from shyukri/wip-43275-nautilus

nautilus: ceph-volume/test: patch VolumeGroups
1964650 | ceph-volume: refactor get_devices, don't use os.path.realpath

Fixes: https://tracker.ceph.com/issues/42777

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit b35e8c4)
15f88a7 | ceph-volume: refactor tests for refactored get_devices

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 4749f4c)

Conflicts:
src/ceph-volume/ceph_volume/tests/conftest.py
resolved by importing PropertyMock
b1045cf | ceph-volume/batch: fail on filtered devices when non-interactive

When batch is called non-interactively and a user explicitly specifies,
say a db-device, this will be filtered when unavailable. This can cause
the resulting OSD to be very different from the user's intention
(standalone vs external db when the db-device was filtered). If devices
get filtered in non-interactive mode, ceph-volume should fail.

Fixes: https://tracker.ceph.com/issues/43105

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 2e98505)
7d55a95 | Merge pull request ceph#31616 from jan--f/wip-42800-nautilus

nautilus: ceph-volume: assume msgrV1 for all branches containing mimic
4744db7 | Merge pull request ceph#32860 from shyukri/wip-43281-nautilus

nautilus: ceph-volume: util: look for executable in $PATH
83bf978 | Merge pull request ceph#32864 from shyukri/wip-43462-nautilus

nautilus: ceph-volume/lvm/activate.py: clarify error message: fsid refers to osd_fsid
885fbbd | Merge pull request ceph#32874 from shyukri/wip-43321-nautilus

nautilus: ceph-volume: use correct extents if using db-devices and >1 osds_per_device
d067bf3 | Merge pull request ceph#32873 from shyukri/wip-43201-nautilus

nautilus: ceph-volume: fix the integer overflow
13f8029 | Merge pull request ceph#32877 from shyukri/wip-43570-nautilus

nautilus: ceph-volume: Dereference symlink in lvm list
34dfcbb | Merge pull request ceph#32870 from shyukri/wip-43117-nautilus

nautilus: ceph-volume: import mock.mock instead of unittest.mock (py2)
fe05674 | Merge pull request ceph#33200 from jan--f/wip-42898-nautilus

nautilus: ceph-volume: make get_devices fs location independent
1b0aa07 | Merge pull request ceph#33202 from jan--f/wip-43853-nautilus

nautilus: ceph-volume/batch: fail on filtered devices when non-interactive
670351a | ceph-volume: api/lvm create or reuse a vg

This changes create_lv so one can pass the desired device and either a
VG with a name starting with ceph is re-used or a new one is created.
This commit also adds two new lvm primitives, making use of lvm's select
feature. The goal is to eventually avoid keeping a full list of lv's (or
vg's) around and query the lvm system as needed.

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit bb4de1a)
9089bf3 | ceph-volume: make lvm report fields into constants

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 01a603f)
3abb9b2 | doc: update ceph-volume lvm prepare

Add option to pass raw physical devices everywhere, restructure a little
(bluestore section before filestore) and reword a few things.

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit f2018d7)
3f8a4c0 | Merge pull request ceph#32868 from shyukri/wip-42945-nautilus

nautilus: ceph-volume: allow raw block devices everywhere
44b7312 | api/lvm: add deactivate method to Volume class

Simply calls lvchange -an to deactivate a logical volume.

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 8087600)
5a8d72b | api/lvm: add get_lv_by_osd_id method

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 2558b55)
3a5ea62 | util/system: add unmount_tmpfs helper

This unmounts a path if and only if it's a tmpfs mount.

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 705ed11)
644f316 | lvm: add deactivate subcommand

This new subcommand unmounts an OSD's tmpfs mount and closes crypt
devices if there are any.

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 9797f6b)
ae181f7 | Merge pull request ceph#30843 from smithfarm/wip-41853-nautilus

nautilus: mds: reject sessionless messages

Reviewed-by: Ramana Raja rraja@redhat.com
0e34ed8 | Merge pull request ceph#32600 from batrick/i43506

nautilus: MDSMonitor: warn if a new file system is being created with an EC default data pool

Reviewed-by: Ramana Raja rraja@redhat.com
9b2c28a | Merge pull request ceph#32602 from batrick/i43558

nautilus: mds: reject forward scrubs when cluster has multiple active MDS (more than one rank)

Reviewed-by: Ramana Raja rraja@redhat.com
859e889 | Merge pull request ceph#32909 from smithfarm/wip-43343-nautilus

nautilus: mds: fix revoking caps after stale->resume circle

Reviewed-by: Ramana Raja rraja@redhat.com
99802a3 | Merge pull request ceph#32913 from smithfarm/wip-43573-nautilus

nautilus: cephfs-journal-tool: fix crash and usage

Reviewed-by: Ramana Raja rraja@redhat.com
ae50da1 | Merge pull request ceph#32914 from smithfarm/wip-43624-nautilus

nautilus: mds: note client features when rejecting client

Reviewed-by: Ramana Raja rraja@redhat.com
b20a52a | Merge pull request ceph#32915 from smithfarm/wip-43628-nautilus

nautilus: client: disallow changing fuse_default_permissions option at runtime

Reviewed-by: Ramana Raja rraja@redhat.com
Reviewed-by: Jeff Layton jlayton@redhat.com
3e8c4dc | Merge pull request ceph#33209 from jan--f/wip-deactivate-nautilus

nautilus: ceph-volume: lvm deactivate command
04391ea | lvm/deactivate: add unit tests, remove --all

Remove the --all flag until it's actually implemented.

Fixes: https://tracker.ceph.com/issues/43330

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit c13901f)

Conflicts:
src/ceph-volume/ceph_volume/devices/lvm/deactivate.py
3b0f093 | Merge pull request ceph#32912 from smithfarm/wip-43509-nautilus

nautilus: mon: print FSMap regardless of file system count

Reviewed-by: Ramana Raja rraja@redhat.com
eac1044 | ceph-volume: add methods to pass filters to pvs, vgs and lvs commands

Filters can be passed to these commands by using option '-S'.

Signed-off-by: Rishabh Dave ridave@redhat.com
(cherry picked from commit a4f2fce)
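
As a rough illustration of passing a filter through '-S' (the short form of LVM's `--select`), the sketch below only builds the argument list. The function name `build_lvs_command`, the default field list, and the dict-based filter format are assumptions for illustration, not the methods added by this commit.

```python
def build_lvs_command(filters=None, fields="lv_name,vg_name,lv_tags"):
    """Build an `lvs` argument list, optionally narrowed with -S.

    `filters` is a dict such as {"vg_name": "vg0"}; entries are joined
    with commas, which LVM selection treats as a logical AND.
    (Hypothetical helper for illustration only.)
    """
    cmd = ["lvs", "--noheadings", "-o", fields]
    if filters:
        selection = ",".join(
            "{}={}".format(key, value) for key, value in sorted(filters.items())
        )
        cmd.extend(["-S", selection])
    return cmd
```

Letting LVM do the filtering avoids listing every volume and filtering the result in Python afterwards.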
3ea3a9c | Merge pull request ceph#33217 from jan--f/wip-32242-notrack-nautilus

nautilus: ceph-volume: add methods to pass filters to pvs, vgs and lvs commands
fdb60ea | Merge pull request ceph#32863 from shyukri/wip-43341-nautilus

nautilus: ceph-volume: lvm/deactivate: add unit tests, remove --all
1224727 | qa/tests: added client-upgrade-nautilus suite to be used on octopus release

Signed-off-by: Yuri Weinstein yweinste@redhat.com
1a4a731 | Merge pull request ceph#33220 from yuriw/wip-yuriw-clients-upgrades-nautilus

qa/tests: added client-upgrade-nautilus suite to be used on octopus …

Reviewed-by: Neha Ojha nojha@redhat.com
3f8ed66 | util/disk: extend Size class

The Size class can now parse strings and has support for arithmetic
operations and comparisons with numbers.

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit dd89f46)
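
To make the idea concrete, here is a minimal sketch of a size type that parses strings and supports arithmetic and comparisons with plain numbers. The class name `ByteSize`, the accepted suffixes, and the use of binary (1024-based) units are assumptions for illustration; the actual `Size` class in util/disk may differ in all of these.

```python
import functools
import re

# Assumed binary multipliers; the real class may use other units.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3, "tb": 1024 ** 4}


@functools.total_ordering
class ByteSize(object):
    """Toy size type that stores a byte count."""

    def __init__(self, num_bytes):
        self.bytes = int(num_bytes)

    @classmethod
    def parse(cls, text):
        """Parse strings such as '10', '512M' or '1.5GB' into a ByteSize."""
        match = re.match(r"^\s*(\d+(?:\.\d+)?)\s*([kmgt]?b?)\s*$", text.lower())
        if match is None:
            raise ValueError("cannot parse size: {!r}".format(text))
        number, unit = match.groups()
        unit = unit if unit.endswith("b") else unit + "b"
        return cls(float(number) * _UNITS[unit])

    @staticmethod
    def _as_bytes(other):
        # Accept either another ByteSize or a plain number of bytes.
        return other.bytes if isinstance(other, ByteSize) else other

    def __add__(self, other):
        return ByteSize(self.bytes + self._as_bytes(other))

    def __sub__(self, other):
        return ByteSize(self.bytes - self._as_bytes(other))

    def __floordiv__(self, divisor):
        return ByteSize(self.bytes // divisor)

    def __eq__(self, other):
        return self.bytes == self._as_bytes(other)

    def __lt__(self, other):
        return self.bytes < self._as_bytes(other)
```

Keeping bytes as the single internal unit means comparisons and arithmetic never need unit conversion.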
31fb57f | lvm: add sizing arguments to prepare and create.

This adds options to size to-be-created LVs in the prepare and create
subcommands. Sizing can be done explicitly by passing a size or
implicitly by specifying the number of slots per [data|journal|wal|db]
device. The former will try to create a LV of the specified size and use
that to create OSDs if it succeeds. The latter will carve up the device
size into $n slots and use one of those slots for the to-be-created OSD.
If partitions or LVs are passed these options are ignored.
This also creates the foundation to move to byte-based sizing, by moving
VolumeGroup lvm querying and size calculation to bytes as the base unit.

Fixes: https://tracker.ceph.com/issues/43299

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 8b8913a)
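
The two sizing modes described above can be sketched with plain byte arithmetic. The function name `resolve_lv_size` and the precedence (an explicit size wins over slots, and the whole device is used when neither is given) are assumptions made for this illustration, not behaviour confirmed by the commit.

```python
def resolve_lv_size(device_bytes, size=None, slots=None):
    """Return the byte size of the LV to create on a device.

    - explicit `size`: use it, but refuse sizes larger than the device
    - `slots`: carve the device into `slots` equal parts and use one
    - neither: fall back to the whole device (assumed default)
    """
    if size is not None:
        if size > device_bytes:
            raise ValueError("requested size exceeds device size")
        return size
    if slots is not None:
        if slots < 1:
            raise ValueError("slots must be at least 1")
        return device_bytes // slots
    return device_bytes
```

For example, a 100 GiB device with `slots=4` yields a 25 GiB LV per OSD under these assumptions.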
50685cf | lvm/batch: adjust devices for byte based size calculation

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 9e61b80)
3dc93ff | tests: fix tests after batch sizing was fixed

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 4051129)
bf67433 | ceph-volume: remove redefinition of [LV,PV,VG]_FIELDS

This was introduced in ceph#32242
erroneously.

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 6f107f7)
00317e8 | ceph-volume: batch bluestore fix create_lvs call

Fixes: https://tracker.ceph.com/issues/43844

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit df18497)
a4df318 | Merge pull request ceph#33116 from batrick/i43137

nautilus: pybind/mgr/volumes: idle connection drop is not working

Reviewed-by: Ramana Raja rraja@redhat.com
Reviewed-by: Sage Weil sage@redhat.com
622d923 | pybind/cephfs: add method that stats symlinks without following

Add a new Python binding equivalent to lstat so that information about
the symlink itself can also be obtained, along with other types of files.

Signed-off-by: Rishabh Dave ridave@redhat.com
(cherry picked from commit 8cce7da)
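
The difference between following and not following a symlink can be shown with the standard library's `os.stat` and `os.lstat`, used here as a local analogue. This sketch does not call the cephfs binding; the helper names are hypothetical and the demo assumes a platform where creating symlinks is permitted.

```python
import os
import stat
import tempfile


def link_flags(path):
    """Return (is_link_via_stat, is_link_via_lstat) for `path`."""
    followed = os.stat(path)        # follows the symlink to its target
    not_followed = os.lstat(path)   # describes the symlink itself
    return stat.S_ISLNK(followed.st_mode), stat.S_ISLNK(not_followed.st_mode)


def demo_symlink_flags():
    """Create a file and a symlink to it in a temp dir and inspect both."""
    directory = tempfile.mkdtemp()
    target = os.path.join(directory, "target")
    link = os.path.join(directory, "link")
    open(target, "w").close()
    os.symlink(target, link)
    return link_flags(link), link_flags(target)
```

Only the lstat-style call reports the link as a link; the stat-style call reports whatever the link points to.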
97ea800 | test: use distinct subvolume/group/snapshot names

Fixes: http://tracker.ceph.com/issues/42646
Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit b9cff8a)
9243d59 | qa/tasks: Fix raises that doesn't re-raise

  • Fixed raises that don't re-raise
  • Dropped some commands with --force remove commands, as they are unnecessary.

Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit 992d8b6)
556c7ee | qa/tasks: Fix the commands success

  • Raised RuntimeException when commands that were expected to fail succeed.
  • Dropped some commands with --force remove commands, as they are unnecessary.

Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit c8f6a25)
a6489c7 | qa/tasks: remove subvolume, subvolumegroup and their snapshots with --force

  • tests 'fs subvolume rm --force'
  • tests 'fs subvolume snapshot rm --force'
  • tests 'fs subvolumegroup rm --force'
  • tests 'fs subvolumegroup snapshot rm --force'

Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit 42c135d)
6e01b15 | qa/tasks: tests for 'fs volume create' and 'fs volume ls'

Fixes: https://tracker.ceph.com/issues/42872
Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit 5e998bd)
8dd64e8 | qa/tasks: Fix the volume ls in test_volume_rm

Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit 6913524)
d2586a6 | qa/tasks: Nothing to clean up if the volume was not created

There are only 2 cases which need cleanup:

  1. The volume is successfully created
  2. The volume is successfully created but create_mds fails

In either case, we could do a 'volume rm'.

Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit 67e43f4)
e81eb4a | mgr/volumes: cleanup leftovers from earlier purge job implementation

... which was not fully implemented anyway, so just remove the
boilerplates.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 45c22bd)
2da46f6 | mgr/volumes: refactor dir handle cleanup

introduce with statement in rmtree. This change
simplifies the code's handling of directory cleanup.

Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit 9e27cd1)
a9819fe | mgr/volumes: cleanup libcephfs handles when stopping

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 2eb0c50)
b1f85d8 | mgr/volumes: guard volume delete by waiting for pending ops

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 968f675)
6e6b55d | mgr/volumes: remove unused variable

Signed-off-by: Joshua Schmid jschmid@suse.de
(cherry picked from commit 2f705ee)
f2304fd | mgr/volumes: move up 'confirm' validation

Instead of checking if the --yes-i-really-mean-it
flag was set after removing the MDS daemon, we
need to check it before starting any removal operation.

Fixes: https://tracker.ceph.com/issues/42931
Signed-off-by: Joshua Schmid jschmid@suse.de
(cherry picked from commit c1db7f8)
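
The ordering problem can be sketched as follows: validate the confirmation flag first, and only then run anything destructive. The function name `delete_volume`, the `removal_steps` callables, and the `(retcode, out, err)` return shape are assumptions for illustration, not the module's actual interface.

```python
import errno

CONFIRM_FLAG = "--yes-i-really-mean-it"


def delete_volume(volname, confirm, removal_steps):
    """Run `removal_steps` for `volname` only after the flag is validated.

    `removal_steps` is a list of callables (for example: remove MDS,
    remove file system, remove pools); none of them run without the flag.
    """
    if confirm != CONFIRM_FLAG:
        message = "pass {} to really remove '{}'".format(CONFIRM_FLAG, volname)
        return -errno.EPERM, "", message
    for step in removal_steps:
        step(volname)
    return 0, "", ""
```

With the check at the top, a missing flag can no longer leave a volume half removed.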
b701323 | mgr/volumes: cleanup on fs create error

  • clean up on fs create error
  • drop unnecessary check in create_pool

Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit 171c375)
1cfa39e | mgr/volumes: drop obsolete comment in _cmd_fs_volume_create

This is fixed already.
Now the pool names are:
cephfs..meta
cephfs..data

Signed-off-by: Jos Collin jcollin@redhat.com
(cherry picked from commit ffda5f6)
082baca | mgr/volumes: add fs_util helper module

helpers for various filesystem querying routines, utils
for creating/removing filesystem, pool and MDSs.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 6682c77)
4c17f91 | mgr/volumes: introduce volume specification module

unlike existing subvolume specification, this is just a
minimal set of globally available configurations. bulk
of other configurations will be moved to the respective
entity modules (subsequent commits).

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 0039b5d)
c8a5ac0 | mgr/volumes: lock module to serialize volume operations

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit bc89d05)
7c0adc0 | mgr/volumes: implement filesystem volume module

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit f9ae6e3)
f6e36a8 | mgr/volumes: template for implementing groups and subvolumes

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 74f349f)
cb592a6 | mgr/volumes: snapshot util module

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit b30f0cb)
0afdfbf | mgr/volumes: implement trash as a subvolume group

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 0e3c48e)
be07465 | mgr/volumes: implement subvolume group based on group template

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 3eccd61)
4aefbb3 | mgr/volumes: implement subvolume based on subvolume template

subvolume base class implements common routines/helpers and
initializes a metadata manager. later, when v2 subvolume version
is implemented, the metadata manager would be used to persist
subvolume metadata in ceph filesystem. this would allow flexible
metadata management when complex subvolume features are added.

typically, a subvolume would be implemented by subclassing the
subvolume base class and the subvolume template -- instantiating
this would be called a "subvolume object".

with this commit, current subvolume topology is maintained. but
we introduce the concept of subvolume versions. a loader stub
loads available "versions" of subvolumes. right now, the only
available version is v1. since backward compatibility needs to
be maintained for existing subvolumes, the loader API allows
version discovery w/ auto upgradation to the most recent version.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 97170d7)
ca4cc71 | mgr/volumes: provide subvolume create/remove/open APIs

create_subvolume() creates a subvolume with the max version known
to the plugin. open_subvolume() performs version discovery by
using loader stub and returns a subvolume object.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 8a64914)
c21fa0c | mgr/volumes: tie everything together to implement versioned subvolumes

apart from the new way of provisioning subvolumes, this makes heavy
use of context manager for volumes, groups and subvolumes.

this change classifies volumes, groups and subvolumes to be treated
as filesystem dentries and inodes. a "volume" can be thought of as a
dentry with "groups" as its entries (inodes). likewise, a "group"
is a dentry again with "subvolumes" as entries (inodes). this is
built into the access mechanism as follows:

  with open_volume(...) as fs_handle:
      with open_group(fs_handle, ...) as group:
          with open_subvolume(group, ...) as subvolume:
              # call subvolume object API
              path = subvolume.getpath()

this way, a lot of redundant checks, such as verifying that a volume or
group exists before accessing a subvolume, are built right into the
access mechanism, with the added bonus of simple error handling.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 9b87bd7)
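
A self-contained toy version of the nested access pattern is sketched below, using `contextlib.contextmanager` and an in-memory dict in place of a real file system. The names `open_volume`, `open_group` and `open_subvolume` follow the commit message, but their bodies, the `VolumeError` class, and the `VOLUMES` layout are stand-ins invented for illustration.

```python
from contextlib import contextmanager

# Toy in-memory layout standing in for a real file system (assumption).
VOLUMES = {"vol_a": {"group_1": {"sv_1": "/volumes/group_1/sv_1"}}}


class VolumeError(Exception):
    """Raised when a volume, group or subvolume does not exist."""


@contextmanager
def open_volume(name):
    if name not in VOLUMES:
        raise VolumeError("volume '{}' does not exist".format(name))
    yield VOLUMES[name]


@contextmanager
def open_group(volume, name):
    if name not in volume:
        raise VolumeError("group '{}' does not exist".format(name))
    yield volume[name]


@contextmanager
def open_subvolume(group, name):
    if name not in group:
        raise VolumeError("subvolume '{}' does not exist".format(name))
    yield group[name]


def subvolume_path(volname, groupname, subvolname):
    """Return (0, path) on success or (1, error message) on failure."""
    try:
        with open_volume(volname) as fs_handle:
            with open_group(fs_handle, groupname) as group:
                with open_subvolume(group, subvolname) as path:
                    return 0, path
    except VolumeError as error:
        return 1, str(error)
```

Each level validates its own existence on entry, so the caller needs only one `except` clause instead of a separate check per level.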
c469aef | test: auto-upgrade subvolume test

Fixes: https://tracker.ceph.com/issues/43349
Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 03ee966)
9a86695 | mgr/volumes: remove stale subvolume module

this was lying around post versioning changes.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 5595476)
340c2d8 | mgr/volumes: fail removing subvolume with snapshots

Fixes: http://tracker.ceph.com/issues/43645
Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit c158a13)
5bba598 | mgr/volumes: add operation state machine table

... and fetch creation state from state machine table.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 46f29bf)
69c6da6 | mgr/volumes: module to track pending clone operations

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 5318779)
f544694 | mgr/volumes: get/set property for subvolume mode/uid/gid

This will be required when creating a clone as the clone would
inherit source subvolumes creation mode and uid/gid.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit f02b1e7)
140a5c0 | mgr/volumes: interface for creating a cloned subvolume

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 7089808)
628496a | mgr/volumes: handle transient subvolume states

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 461909b)
067b94e | mgr/volumes: add protect/unprotect and snap clone interface

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 8d68f1a)
c181702 | mgr/volumes: interface for fetching cloned subvolume status

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit fa3c56f)
478a38b | mgr/volumes: add clone specific commands

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit b2145b7)
13ce831 | mgr/volumes: fetch oldest clone entry

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 7ad14cf)
b05daa5 | mgr/volumes: purge thread uses new async interface

This also makes _cancel_jobs() thread safe, which was not the
case earlier (with _cancel_purge_job()) -- this also makes the
code simpler by sharing the lock between two condition variables.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit f16cc1e)
e994382 | mgr/volumes: asynchronous cloner module

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 4f09568)
41ca4b7 | mgr/volumes: allow force removal of incomplete failed clones

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 451be11)
6265db4 | test: add subvolume clone tests

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit b5970ff)
96bcad5 | mgr/volumes: type convert uid and gid to int

This fix is only needed in nautilus, and the issue
was observed during upstream teuthology testing.

File "/usr/share/ceph/mgr/volumes/fs/async_cloner.py", line 114, in cptree
copy_file(fs_handle, d_full_src, d_full_dst, mo, st.st_uid, st.st_gid)
File "/usr/share/ceph/mgr/volumes/fs/fs_util.py", line 97, in copy_file
fs.chown(dst, uid, gid)
File "cephfs.pyx", line 855, in cephfs.LibCephFS.chown
TypeError: uid must be an int

The issue wasn't observed in master/octopus teuthology
testing.

Signed-off-by: Ramana Raja rraja@redhat.com
3b514df | mgr/volumes: fix py2 compat issue

Fix the following issue seen while upstream teuthology testing,
File "/usr/share/ceph/mgr/volumes/fs/operations/versions/subvolume_base.py", line 98, in load_config
self.metadata_mgr = MetadataManager(self.fs, self.legacy_config_path, 0o640)
File "/usr/share/ceph/mgr/volumes/fs/operations/versions/subvolume_base.py", line 73, in legacy_config_path
meta_config = "{0}.meta".format(m.digest().hex())
AttributeError: 'str' object has no attribute 'hex'

This issue is not observed in master/octopus, as it only supports
py3.

Signed-off-by: Ramana Raja rraja@redhat.com
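
The traceback above comes from calling `.hex()` on the result of `digest()`, which is a plain `str` on Python 2. One portable spelling is `hexdigest()`, which hashlib provides on both Python 2 and Python 3. The sketch below illustrates that idea only: the function name `legacy_meta_name` and the choice of SHA-256 are assumptions, since the commit message shows neither the hash algorithm nor the actual fix.

```python
import hashlib


def legacy_meta_name(path):
    """Build a '<hex digest>.meta' name in a py2/py3 compatible way."""
    m = hashlib.sha256()            # algorithm chosen for illustration only
    m.update(path.encode("utf-8"))
    # hexdigest() returns the hex string directly on both Python versions,
    # unlike digest().hex(), which only works on Python 3.
    return "{0}.meta".format(m.hexdigest())
```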
9fe99d3 | Merge pull request ceph#33231 from jan--f/wip-43849-nautilus

nautilus: ceph-volume: add sizing arguments to prepare
2d56465 | Merge pull request ceph#33232 from jan--f/wip-43871-nautilus

nautilus: ceph-volume: batch bluestore fix create_lvs call
136a0fe | ceph-volume: filter based on tags for api.lvm.get_* methods

get_pvs, get_vgs and get_lvs must accept tags and filter volumes based
on tags.

Signed-off-by: Rishabh Dave ridave@redhat.com
(cherry picked from commit fb13909)
65f830e | ceph-volume: add helper methods to get only first LVM devs

These convenience methods shorten the following phrase to
"lv = get_first_lv()" -

lvs = get_lvs()
if len(lvs) >= 1:
    lv = lvs[0]

These methods do the same thing as the above phrase internally. Rewrite
listing.py to use these new helper methods.

Signed-off-by: Rishabh Dave ridave@redhat.com
(cherry picked from commit 17957d9)
17cba1f | ceph-volume: add new method in api/lvm.py

The method determines whether given LV is managed by Ceph or not.

Signed-off-by: Rishabh Dave ridave@redhat.com
(cherry picked from commit 876244b)
66bbe16 | ceph-volume: refactor devices/lvm/listing.py

Get rid of duplicate and redundant code and use get_lvs, get_vgs and
get_pvs to simplify the module as much as possible.

Signed-off-by: Rishabh Dave ridave@redhat.com
(cherry picked from commit d02bd7d)
27bd05e | ceph-volume: update tests since listing.py got heavily modified

Signed-off-by: Rishabh Dave ridave@redhat.com
(cherry picked from commit d1ae6d1)
6cb2aad | ceph-volume: delete test_lvs_list_is_created_just_once

listing.py doesn't call api.Volumes anymore. Therefore, this test is
redundant.

Signed-off-by: Rishabh Dave ridave@redhat.com
(cherry picked from commit 665ed24)
30dc935 | ceph-volume: fix lvm list

17957d9 introduced a regression in lvm list.

When passing a vg/lv path for generating a single report, it fails
because the filter used in the lvs command isn't right. It uses the lv
name instead of the vg name because os.path.basename(device) is used
while it should be os.path.dirname(device)

Fixes: https://tracker.ceph.com/issues/43969

Signed-off-by: Guillaume Abrioux gabrioux@redhat.com
(cherry picked from commit 0179fed)
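
The mix-up is easy to see on a vg/lv path: `os.path.dirname` yields the volume group and `os.path.basename` yields the logical volume. The helper name `split_vg_lv` is hypothetical and exists only to illustrate the two calls named in the commit message.

```python
import os.path


def split_vg_lv(device):
    """Split a 'vg/lv' path into (vg_name, lv_name)."""
    vg_name = os.path.dirname(device)    # part before the last slash
    lv_name = os.path.basename(device)   # part after the last slash
    return vg_name, lv_name
```

Filtering `lvs` by VG name with the result of `basename` would therefore search for a volume group named after the LV.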
6b8b86a | ceph-volume: add get_device_lvs to easily retrieve all lvs per device

Also drop the sep argument from get_lvs and siblings, unused.
Introduce LV_CMD_OPTIONS to unify options to lvs.

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit ffe5b57)
cad7704 | ceph-volume: fix various lvm list issues

A single report on a non-lvm device now works.
Format was cleaned up; lvm journal, wal and db are reported only once.

Fixes: https://tracker.ceph.com/issues/44009

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 000bf2f)
fd3afb2 | Merge pull request ceph#33238 from jan--f/wip-31700-notracker-nautilus

nautilus: ceph-volume: refactor listing.py + fixes
a095f58 | ceph-volume: fix has_bluestore_label() function

When using vg/lv, this function throws an error like the following:

 stderr: unable to read label for test_group/data-lv2: (2) No such file or directory
 stderr: 2020-02-04T21:03:32.153+0000 7fe091af4200 -1 bluestore(test_group/data-lv2) _read_bdev_label failed to open test_group/data-lv2: (2) No such file or directory

using self.abspath fixes this error.

Fixes: https://tracker.ceph.com/issues/43970

Signed-off-by: Guillaume Abrioux gabrioux@redhat.com
(cherry picked from commit 148069a)
ffbf252 | ceph-volume: remove stderr in has_bluestore_label()

We don't want to generate this log when a call to
has_bluestore_label() fails.

Signed-off-by: Guillaume Abrioux gabrioux@redhat.com
(cherry picked from commit 7f8371c)
d16e2bc | ceph-volume: add available property in target specific flavors

This adds two properties available_[lvm,raw] to device (and thus inventory).
The goal is to have different notions of availability based on the
intended use case. For example, finding LVM structures makes a drive
unavailable for the raw mode, but the drive might still be available for the lvm mode.

Fixes: https://tracker.ceph.com/issues/43400
Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 233ccff)
dff4c69 | ceph-volume: skip osd creation when already done

When rerunning ceph-volume lvm create on a device already prepared and
activated, ceph-volume should skip the creation.

This is a regression introduced by bb4de1a

Fixes: https://tracker.ceph.com/issues/43981

Signed-off-by: Guillaume Abrioux gabrioux@redhat.com
(cherry picked from commit 634a709)
1d48c4d | ceph-volume: add unit test test_safe_prepare_osd_already_created

This commit adds a new unit test
test_safe_prepare_osd_already_created() to test that RuntimeError is
raised when is_ceph_device() returns True.

Signed-off-by: Guillaume Abrioux gabrioux@redhat.com
(cherry picked from commit ccf92d7)
768a47e | mgr/DaemonServer: fix 'osd ok-to-stop' for EC pools

We need to pay attention to account for CRUSH_ITEM_NONE entries in the
EC PG acting set.

Fixes: https://tracker.ceph.com/issues/43151
Signed-off-by: Sage Weil sage@redhat.com
(cherry picked from commit 66690ea)

Conflicts:
qa/standalone/misc/ok-to-stop.sh

  • nautilus "ceph osd pool create" CLI command takes a pg_num argument
38fc067 | qa/standalone/ceph-helpers: add wait_for_peered

Signed-off-by: Sage Weil sage@redhat.com
(cherry picked from commit 78ec6ae)
b9e8232 | qa/standalone/misc/ok-to-stop: improve test

Make sure PGs peer (simply flushing state to mon isn't enough).

Fixes: https://tracker.ceph.com/issues/43721
Signed-off-by: Sage Weil sage@redhat.com
(cherry picked from commit 76ea774)
91d0d48 | Merge pull request ceph#33122 from ajarr/wip-ajarr-mgr-volumes-nautilus

mgr/volumes: misc fix and feature enhancements

Reviewed-by: Venky Shankar vshankar@redhat.com
0c500d1 | Merge pull request ceph#32910 from smithfarm/wip-43503-nautilus

nautilus: mount.ceph: give a hint message when no mds is up or cluster is laggy

Reviewed-by: Jeff Layton jlayton@redhat.com
Reviewed-by: Ilya Dryomov idryomov@redhat.com
b03adf9 | Merge pull request ceph#32807 from smithfarm/wip-43770-nautilus

nautilus: mount.ceph: remove arbitrary limit on size of name= option

Reviewed-by: Ilya Dryomov idryomov@redhat.com
Reviewed-by: Jeff Layton jlayton@redhat.com
Reviewed-by: Ramana Raja rraja@redhat.com
ce39bb7 | Merge pull request ceph#32916 from smithfarm/wip-43729-nautilus

nautilus: cephfs: client: Add is_dir() check before changing directory

Reviewed-by: Patrick Donnelly pdonnell@redhat.com
Reviewed-by: Jeff Layton jlayton@redhat.com
8664ee6 | Merge pull request ceph#31905 from batrick/i43046

nautilus: mgr: "mds metadata" to setup new DaemonState races with fsmap

Reviewed-by: Ramana Raja rraja@redhat.com
63e487a | Merge pull request ceph#32756 from batrick/i43347

nautilus: mds: fix assert(omap_num_objs <= MAX_OBJECTS) of OpenFileTable

Reviewed-by: Ramana Raja rraja@redhat.com
79346f0 | Merge pull request ceph#32917 from smithfarm/wip-43733-nautilus

nautilus: cephfs: qa: ignore slow ops for ffsb workunit

Reviewed-by: Patrick Donnelly pdonnell@redhat.com
e95f718 | Merge pull request ceph#32918 from smithfarm/wip-43777-nautilus

nautilus: cephfs: qa: save MDS epoch barrier

Reviewed-by: Patrick Donnelly pdonnell@redhat.com
0728c08 | Merge pull request ceph#32921 from smithfarm/wip-43784-nautilus

nautilus: mds/OpenFileTable: match MAX_ITEMS_PER_OBJ to osd_deep_scrub_large_omap_object_key_threshold

Reviewed-by: Patrick Donnelly pdonnell@redhat.com
987767e | Merge pull request ceph#33115 from batrick/i43790

nautilus: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)

Reviewed-by: Ramana Raja rraja@redhat.com
6bc1218 | Merge pull request ceph#33183 from smithfarm/wip-43846-nautilus

nautilus: rgw: update the hash source for multipart entries during resharding

Reviewed-by: J. Eric Ivancich ivancich@redhat.com
32dcee3 | Merge pull request ceph#32919 from smithfarm/wip-43780-nautilus

nautilus: cephfs: qa: ignore trimmed cache items for dead cache drop

Reviewed-by: Patrick Donnelly pdonnell@redhat.com
4ea0c07 | Merge pull request ceph#33242 from jan--f/wip-44047-nautilus

nautilus: ceph-volume: skip osd creation when already done
7df3e63 | ceph-volume: fix is_ceph_device for lvm batch

This is a regression introduced by 634a709

The lvm batch command fails to prepare the OSDs on the created LV.
When using lvm batch, the LV/VG are created prior to the OSD prepare.
During that creation, multiple tags are set with null value.

$ lvs -o lv_tags --noheadings
ceph.cluster_fsid=null,ceph.osd_fsid=null,ceph.osd_id=null,ceph.type=null

Since we call is_ceph_device, which returns True if the ceph.osd_id LVM
tag exists but doesn't test its value, we raise an exception.

When the tag value is set to 'null' then we can consider that the device
isn't part of the ceph cluster (because not yet prepared).

Closes: https://tracker.ceph.com/issues/44069

Signed-off-by: Dimitri Savineau dsavinea@redhat.com
(cherry picked from commit a825823)
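
The check described above can be sketched on the raw tag string printed by `lvs -o lv_tags`. The function names `parse_lv_tags` and `looks_like_ceph_device` are hypothetical, and the real `is_ceph_device` operates on ceph-volume's own objects rather than on a string; the point here is only that a tag whose value is the literal 'null' should not count.

```python
def parse_lv_tags(lv_tags):
    """Turn 'k1=v1,k2=v2' into a dict, ignoring malformed items."""
    tags = {}
    for item in lv_tags.strip().split(","):
        if "=" in item:
            key, value = item.split("=", 1)
            tags[key] = value
    return tags


def looks_like_ceph_device(lv_tags):
    """True only if ceph.osd_id is present and is not the literal 'null'."""
    osd_id = parse_lv_tags(lv_tags).get("ceph.osd_id")
    return osd_id is not None and osd_id != "null"
```

An LV freshly created by lvm batch, with every tag set to null, is then treated as not yet prepared.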
6acc68f | ceph-volume: add is_ceph_device unit tests

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 60d8063)
17a6e4a | ceph-volume: use get_device_vgs in has_common_vg

Fixes: https://tracker.ceph.com/issues/44099

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 2c5a8c3)
f9a31b4 | Merge pull request ceph#33239 from jan--f/wip-43984-nautilus

nautilus: ceph-volume: fix has_bluestore_label() function
39e15a7 | Merge pull request ceph#33240 from jan--f/wip-44035-nautilus

nautilus: ceph-volume: finer grained availability notion in inventory.
47d3e52 | Merge pull request ceph#33253 from jan--f/wip-44109-nautilus

nautilus: ceph-volume: fix is_ceph_device for lvm batch
da9b42e | Merge pull request ceph#33254 from jan--f/wip-44112-nautilus

nautilus: ceph-volume: use get_device_vgs in has_common_vg
c506fbe | qa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc

Fixes: https://tracker.ceph.com/issues/43889
Signed-off-by: Sage Weil sage@redhat.com
(cherry picked from commit 9f2a854)
6dc0091 | qa/suites/rados/multimon/tasks/mon_clock_with_skews: whitelist MOST_DOWN

The skewed clock makes some mons miss elections.

Signed-off-by: Sage Weil sage@redhat.com
(cherry picked from commit 08b6a2b)
bda388c | common/bl: fix the dangling last_p issue.

Fixes: https://tracker.ceph.com/issues/43646
Signed-off-by: Radoslaw Zarzynski rzarzyns@redhat.com
(cherry picked from commit 8198332)

Conflicts:
src/test/bufferlist.cc
12310dd | ceph-monstore-tool: rename mon-ids in initial monmap

when ceph-mon starts, it checks to see if it's listed in the monmap, if
not it complains

no public_addr or public_network specified, and mon.a not present in
monmap or ceph.conf.

then bails out. normally, the monitor will try to rename its name in
monmap when performing "mkfs", but in our case, we are merely using the
"mkfs" monmap for passing the monmap built by ceph-monstore-tools, and
we don't actually go through the "mkfs" process. so, ceph-mon won't
rename when booting up.

in this change, user is allowed to specify the mon-ids in command line
when rebuilding mondb, the default mon-ids would be a,b,c,... if not
specified.

Signed-off-by: Kefu Chai kchai@redhat.com
(cherry picked from commit 4b3df5a)
a1b24fa | ceph-monstore-tool: correct the key for storing mgr_command_descs

Fixes: https://tracker.ceph.com/issues/43582
Signed-off-by: Kefu Chai kchai@redhat.com
(cherry picked from commit a5bfeca)
0658689 | doc: update mondb recovery script

to note that we also need to add mgr's key to monitor's keyring

Signed-off-by: Kefu Chai kchai@redhat.com
(cherry picked from commit 75f4765)
11be0f1 | Merge pull request ceph#32856 from zhengchengyao/nautilus_no_mon_update

nautilus: mon/ConfigMonitor: fix handling of NO_MON_UPDATE settings

Reviewed-by: Nathan Cutler ncutler@suse.com
Reviewed-by: Sage Weil sage@redhat.com
4e9023a | Merge pull request ceph#32905 from smithfarm/wip-43731-nautilus

nautilus: crush/CrushWrapper: behave with empty weight vector

Reviewed-by: Kefu Chai kchai@redhat.com
c7657a7 | Merge pull request ceph#32931 from smithfarm/wip-43819-nautilus

nautilus: mgr/pg_autoscaler: default to pg_num[_min] = 32

Reviewed-by: Neha Ojha nojha@redhat.com
Reviewed-by: Laura Paduano lpaduano@suse.com
f203d8b | Merge pull request ceph#33007 from smithfarm/wip-43928-nautilus

nautilus: mon: elector: return after triggering a new election

Reviewed-by: Josh Durgin jdurgin@redhat.com
ec1f782 | Merge pull request ceph#32948 from yaarith/wip-telemetry-serial-nautilus-yh

nautilus: mgr/telemetry: fix device serial number anonymization

Reviewed-by: Kefu Chai kchai@redhat.com
Reviewed-by: Neha Ojha nojha@redhat.com
0f85792 | Merge pull request ceph#33082 from k0ste/wip-43974-nautilus

nautilus: mgr/telemetry: anonymizing smartctl report itself

Reviewed-by: Sage Weil sage@redhat.com
b24c490 | Merge pull request ceph#33142 from shyukri/wip-44000-nautilus

nautilus: mon/MgrMonitor.cc: warn about missing mgr in a cluster with osds

Reviewed-by: Kefu Chai kchai@redhat.com
61a8864 | Merge pull request ceph#33147 from shyukri/wip-43989-nautilus

nautilus: osd: Allow 64-char hostname to be added as the "host" in CRUSH

Reviewed-by: Kefu Chai kchai@redhat.com
Reviewed-by: Neha Ojha nojha@redhat.com
066b4f0 | Merge pull request ceph#33155 from shyukri/wip-43916-nautilus

nautilus: mon/ConfigMonitor: only propose if leader

Reviewed-by: Kefu Chai kchai@redhat.com
3611588 | Merge pull request ceph#33157 from shyukri/wip-43924-nautilus

nautilus: mgr/prometheus: report per-pool pg states

Reviewed-by: Jan Fajerski jfajerski@suse.com
c4deea5 | Merge pull request ceph#33168 from k0ste/wip-44057-nautilus

nautilus: mgr/telemetry: split entity_name only once (handle ids with dots)

Reviewed-by: Sage Weil sage@redhat.com
11256ac | Merge pull request ceph#33170 from k0ste/wip-43727-nautilus

nautilus: mgr/pg_autoscaler: calculate pool_pg_target using pool size

Reviewed-by: Kefu Chai kchai@redhat.com
7a46e53 | Merge pull request ceph#32908 from smithfarm/wip-43821-nautilus

nautilus: mon/Session: only index osd ids >= 0
01099f6 | Merge pull request ceph#33095 from k0ste/wip-43979-nautilus

nautilus: mgr/telemetry: check get_metadata return val

Reviewed-by: David Zafman dzafman@redhat.com
081f5ce | Merge pull request ceph#33152 from shyukri/wip-43879-nautilus

nautilus: mon: Don't put session during feature change

Reviewed-by: David Zafman dzafman@redhat.com
5d41825 | Merge pull request ceph#33008 from smithfarm/wip-43922-nautilus

nautilus: rgw_file: avoid string::front() on empty path

Reviewed-by: Casey Bodley cbodley@redhat.com
9a360a9 | Merge pull request ceph#33149 from shyukri/wip-43874-nautilus

nautilus: rgw: maybe coredump when reload operator happened

Reviewed-by: Casey Bodley cbodley@redhat.com
10a8372 | Merge pull request ceph#33151 from shyukri/wip-43877-nautilus

nautilus: rgw: fix one part of the bulk delete(RGWDeleteMultiObj_ObjStore_S3) fails but no error messages

Reviewed-by: Casey Bodley cbodley@redhat.com
4e6231f | ceph-volume: avoid calling zap_lv with a LV-less VG

Fixes: https://tracker.ceph.com/issues/44125

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit ad0dea5)
7f6ebdc | ceph-volume: batch bluestore fix create_lvs call

Fixes: https://tracker.ceph.com/issues/43844

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit df18497)
dc39f21 | Merge pull request ceph#33297 from jan--f/wip-44135-nautilus

nautilus: ceph-volume: avoid calling zap_lv with a LV-less VG
d430ece | Merge pull request ceph#33301 from jan--f/wip-43871-nautilus-failed-cp

nautilus: ceph-volume: batch bluestore fix create_lvs call
2386632 | ceph-volume: pass journal_size as Size not string

Fixes: https://tracker.ceph.com/issues/44148

Signed-off-by: Jan Fajerski jfajerski@suse.com
(cherry picked from commit 49f6e6d)
104f6ca | ceph-volume: don't remove vg twice when zapping filestore

Signed-off-by: Jan Fajerski jfajerski@suse.com

Fixes: https://tracker.ceph.com/issues/44149
(cherry picked from commit bccdf6e)
5410fcd | Merge pull request ceph#33334 from jan--f/wip-44152-nautilus

nautilus: ceph-volume: pass journal_size as Size not string
21a2166 | Merge pull request ceph#33337 from jan--f/wip-44153-nautilus

nautilus: ceph-volume: don't remove vg twice when zapping filestore
59eedd8 | Merge pull request ceph#32844 from smithfarm/wip-43239-nautilus

nautilus: mgr/DaemonServer: fix 'osd ok-to-stop' for EC pools

Reviewed-by: Sage Weil sage@redhat.com
Reviewed-by: David Zafman dzafman@redhat.com
1597e2a | Merge pull request ceph#33276 from smithfarm/wip-44082-nautilus

nautilus: qa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc

Reviewed-by: Kefu Chai kchai@redhat.com
cc0b4d4 | Merge pull request ceph#33277 from smithfarm/wip-43722-nautilus

nautilus: common/bl: fix the dangling last_p issue.

Reviewed-by: Kefu Chai kchai@redhat.com
4d5b840 | Merge pull request ceph#33278 from smithfarm/wip-44085-nautilus

nautilus: ceph-monstore-tool: correct the key for storing mgr_command_descs

Reviewed-by: Kefu Chai kchai@redhat.com
70fa3cf | mgr/devicehealth: factor _get_device_metrics out of show_device_metrics

Add the min_sample lower-bound argument too

Signed-off-by: Sage Weil sage@redhat.com
(cherry picked from commit 7be5c13)
Conflicts: had to be backported to enable backporting of
ceph#32903
Backport tracker: https://tracker.ceph.com/issues/43873
21d78c0 | mgr/devicehealth: fix telemetry stops sending device reports after 48 hours

Telemetry module fetches device metrics which were scraped in the last
"telemetry interval"*2 (=48 hours by default) by calling
_get_device_metrics() with min_sample. _get_device_metrics() fetches the
metrics from omap and breaks on the first one that is older than
min_sample. But because it fetched in ascending order (from oldest to
newest) it was breaking on the first one it received, if it was older
than the interval above. We need to pass min_sample to get_omap_vals()
so it will start fetching from that value.

Fixes: https://tracker.ceph.com/issues/43837
Signed-off-by: Yaarit Hatuka yaarit@redhat.com
(cherry picked from commit 5f7e4a9)
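
The ordering problem described in the commit message above can be sketched as follows. This is a minimal illustration, not the module's code: the `SAMPLES` dict, the `fetch_ascending()` helper and its inclusive `start` parameter are hypothetical stand-ins for the omap and for `get_omap_vals()`, and the timestamp key format is assumed.

```python
# Hypothetical stand-in for an omap whose keys are sortable timestamp strings.
SAMPLES = {
    "20200101-000000": {"temp": 30},
    "20200102-000000": {"temp": 31},
    "20200103-000000": {"temp": 32},
    "20200104-000000": {"temp": 33},
}


def fetch_ascending(omap, start=""):
    """Yield (key, value) pairs in ascending key order, beginning at `start`."""
    for key in sorted(omap):
        if key >= start:
            yield key, omap[key]


def get_metrics_buggy(omap, min_sample):
    """Iterate from the oldest key and stop at the first one older than min_sample."""
    result = {}
    for key, value in fetch_ascending(omap):
        if key < min_sample:
            break  # the oldest key is seen first, so this triggers immediately
        result[key] = value
    return result


def get_metrics_fixed(omap, min_sample):
    """Begin the fetch at min_sample so older keys are never visited."""
    return dict(fetch_ascending(omap, start=min_sample))
```

With `min_sample` set to `"20200103-000000"`, the first variant returns an empty dict as soon as any older sample exists, while the second returns the two newest samples.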
ea8e0b0 | nautilus: qa/ceph-ansible: ansible-version and ceph_ansible

Upgrade to 2.8.1 and stable-4.0 respectively

Signed-off-by: Brad Hubbard bhubbard@redhat.com
392f471 | Merge pull request ceph#33378 from badone/wip-badone-testing

nautilus: qa/ceph-ansible: ansible-version and ceph_ansible

Reviewed-by: Yuri Weinstein yweinste@redhat.com
0079d1c | qa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering

This was done for octopus in 8283ea9,
but not for nautilus

Signed-off-by: Neha Ojha nojha@redhat.com
a1c7b3f | Merge pull request ceph#33470 from neha-ojha/wip-mgsr2-order-nautilus

nautilus: qa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering
24c0e72 | mgr: drop reference to msg on return

Caused by backport commit cb48be5 which
did not account for the explicit drop of the message reference, only in
Nautilus-.

Fixes: https://tracker.ceph.com/issues/44245
Fixes: cb48be5
Signed-off-by: Patrick Donnelly pdonnell@redhat.com
97ce2bd | Merge PR ceph#33498 into nautilus

  • refs/pull/33498/head:
    mgr: drop reference to msg on return

Reviewed-by: Venky Shankar vshankar@redhat.com
Reviewed-by: Sage Weil sage@redhat.com
9c455de | mgr/volumes: access volume in lockless mode when fetching async job

Saw a deadlock when deleting a lot of subvolumes -- purge threads were
stuck acquiring the global lock for volume access. This can happen
when there is a concurrent remove (which renames and signals the
purge threads) and a purge thread is just about to scan the trash
directory for entries.

For the fix, purge threads fetch entries by accessing the volume
in lockless mode. This is safe from a functionality point of view, as
the rename and directory scan are correctly handled by the filesystem.
In the worst case the purge thread picks up the trash entry on the next
scan, never leaving a stale trash entry.

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 808a1ce)
6440d38 | test: pass timeout argument to mount::wait_for_dir_empty()

Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 5ec09a2)
7abad7d | test: verify purge queue w/ large number of subvolumes

Fixes: http://tracker.ceph.com/issues/44282
Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 92b2008)
2b1d255 | Merge PR ceph#33526 into nautilus

  • refs/pull/33526/head:
    test: verify purge queue w/ large number of subvolumes
    test: pass timeout argument to mount::wait_for_dir_empty()
    mgr/volumes: access volume in lockless mode when fetching async job

Reviewed-by: Patrick Donnelly pdonnell@redhat.com
d7e0d07 | Merge pull request ceph#33346 from yaarith/backport-nautilus-pr-32903

nautilus: mgr/devicehealth: fix telemetry stops sending device reports after 48 hours

Reviewed-by: Sage Weil sage@redhat.com
f071465 | mgr/volumes: unregister job upon async threads exception

If the async threads hit a temporary exception the job is
never unregistered and therefore gets skipped by the async
threads on subsequent scans.

Patrick hit this in nautilus when one of the purge threads
hit an exception when trying to log a message. The trash
entry was never picked up again by the purge threads.

Fixes: http://tracker.ceph.com/issues/44315
Signed-off-by: Venky Shankar vshankar@redhat.com
(cherry picked from commit 46476ef)
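
The failure mode in the commit message above (a job that stays registered after an exception and is then skipped on later scans) can be sketched like this. All names here (`JobTracker`, `run_leaky`, `run_safe`, `failing_work`, `scan_after_failure`) are hypothetical and only illustrate the general try/finally pattern; they are not taken from the mgr/volumes code.

```python
class JobTracker:
    """Hypothetical registry of jobs that some worker is currently handling."""

    def __init__(self):
        self.in_progress = set()

    def pick(self, candidates):
        """Register and return the first candidate nobody is working on."""
        for name in candidates:
            if name not in self.in_progress:
                self.in_progress.add(name)
                return name
        return None

    def unregister(self, name):
        self.in_progress.discard(name)


def run_leaky(tracker, name, work):
    work(name)
    tracker.unregister(name)  # never reached if work() raises


def run_safe(tracker, name, work):
    try:
        work(name)
    finally:
        tracker.unregister(name)  # runs on success and on exception


def failing_work(name):
    raise RuntimeError("temporary failure while purging " + name)


def scan_after_failure(runner):
    """Fail once on 'trash-entry', then report what the next scan picks."""
    tracker = JobTracker()
    job = tracker.pick(["trash-entry"])
    try:
        runner(tracker, job, failing_work)
    except RuntimeError:
        pass  # the worker thread survives the temporary exception
    return tracker.pick(["trash-entry"])
```

With `run_leaky`, the next scan finds nothing to pick because the entry is still marked as in progress; with `run_safe`, the same entry becomes available again.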
2885574 | Merge PR ceph#33569 into nautilus

  • refs/pull/33569/head:
    mgr/volumes: unregister job upon async threads exception

Reviewed-by: Ramana Raja rraja@redhat.com
Reviewed-by: Patrick Donnelly pdonnell@redhat.com
2d095e9 | 14.2.8

📝 Please access here to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment: /check-cla to verify. Thanks.


  • If you've already signed a CLA, it's possible you're using a different email address for your gitee account. Check your existing CLA data and verify the email.
  • If you signed the CLA as an employee or a member of an organization, please contact your corporation or organization to verify you have been activated to start contributing.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please feel free to file an issue.
