Compaction #14

The master version of this PR deals with the unified asok/tell interface. Nautilus- are separated. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 9dc07d8)

Rationale can be found in [1]. Point is that EC pools incur a significant performance penalty when dealing with small files and xattr updates. This is because _every_ inode has a corresponding data pool object with backtrace information stored in its xattr. [1] doc/cephfs/createfs.rst Fixes: https://tracker.ceph.com/issues/42450 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 3e0aee5)

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit bf0cf8e) Conflicts: qa/tasks/cephfs/filesystem.py

In the future, we should add the EC data pool as a supplementary data pool but that requires a mount to setup which is awkward in the code here. When cephfs-shell is more widely available, this will be easier. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 6e448f9)

Connection pointer is not helpful. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 9b71bbe) Conflicts: src/mgr/DaemonServer.cc: actually do print the Connection*, Connection& cannot be dumped in Nautilus.

If the mgr is waiting on daemon metadata from the mons, it has no DaemonState associated with the daemon yet. If we try to process this MgrOpen, the metadata sent by the daemon (like its config) will not be recorded. Fixes: https://tracker.ceph.com/issues/43037 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 16a1deb)

Added a 'telemetry show-device' command to print a preview of the telemetry device report. Added a message at the bottom of 'telemetry show' about the new 'telemetry show-device' command. Signed-off-by: Yaarit Hatuka <yaarit@redhat.com> (cherry picked from commit cae87cc) Conflicts: - path: src/pybind/mgr/telemetry/module.py comment: nautilus version of json.dumps() doesn't have sort_keys arg.

smartctl JSON output contains the device's serial number in two different keys ('serial_number' & 'output'). Serial is now obfuscated in both. Fixes: https://tracker.ceph.com/issues/43939 Signed-off-by: Yaarit Hatuka <yaarit@redhat.com> (cherry picked from commit be1257f)
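A minimal sketch of scrubbing the serial from both keys. It assumes 'output' is a list of text lines; the function name and the placeholder string are hypothetical and are not taken from the telemetry module.

```python
def obfuscate_serial(report, placeholder="deleted"):
    """Return a copy of `report` with the serial replaced in both keys."""
    cleaned = dict(report)
    serial = report.get("serial_number")
    if not serial:
        # Nothing to scrub.
        return cleaned
    cleaned["serial_number"] = placeholder
    output = report.get("output")
    if isinstance(output, list):
        # Assumption: 'output' holds raw smartctl text lines.
        cleaned["output"] = [line.replace(serial, placeholder) for line in output]
    return cleaned
```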

get_metadata() returns 'None' when requesting a missing service, hence trying to access its content fails. Added a check for osd and mgr get_metadata() calls. Fixes: https://tracker.ceph.com/issues/43642 Signed-off-by: Yaarit Hatuka <yaarit@redhat.com> (cherry picked from commit 9e7a0cb)
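The kind of guard described here might look like the sketch below. The `get_metadata` function is a local stand-in that mimics the behavior in the message (returning `None` for a missing service); it is not the mgr module API, and the `hostname` field is an assumption for illustration.

```python
def get_metadata(svc_type, svc_id, _store={("osd", "0"): {"hostname": "node1"}}):
    """Stand-in lookup: returns None when the service is unknown."""
    return _store.get((svc_type, svc_id))


def hostname_of(svc_type, svc_id):
    """Return the service's hostname, or None if no metadata is available."""
    metadata = get_metadata(svc_type, svc_id)
    if metadata is None:
        # Missing service: skip it instead of subscripting None.
        return None
    return metadata.get("hostname")
```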

nautilus: mgr/dashboard: Using wrong identifiers in RGW user/bucket datatables Reviewed-by: Laura Paduano <lpaduano@suse.com> Reviewed-by: Volker Theile <vtheile@suse.com>

nautilus: mgr/dashboard: iSCSI targets not available if any gateway is down (and more...) Reviewed-by: Jason Dillaman <dillaman@redhat.com> Reviewed-by: Laura Paduano <lpaduano@suse.com> Reviewed-by: Tiago Melo <tmelo@suse.com> Reviewed-by: Volker Theile <vtheile@suse.com>

nautilus: mgr/dashboard: add debug mode, and accept expected exception when SSL handshaking Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Laura Paduano <lpaduano@suse.com> Reviewed-by: Stephan Müller <smueller@suse.com> Reviewed-by: Tatjana Dehler <tdehler@suse.com>

nautilus: mgr/dashboard: Dashboard can't handle self-signed cert on Grafana API Reviewed-by: Ernesto Puerta <epuertat@redhat.com> Reviewed-by: Laura Paduano <lpaduano@suse.com> Reviewed-by: Volker Theile <vtheile@suse.com>

nautilus: mgr/dashboard: check embedded Grafana dashboard references Reviewed-by: Laura Paduano <lpaduano@suse.com>

nautilus: mgr/dashboard: check if user has config-opt permissions Reviewed-by: Ernesto Puerta <epuertat@redhat.com> Reviewed-by: Laura Paduano <lpaduano@suse.com>

nautilus: mgr/dashboard: disable 'Add Capability' button in rgw user edit Reviewed-by: Laura Paduano <lpaduano@suse.com>

Nothing inherits from PQ. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 761cc0e) Conflicts: src/mds/PurgeQueue.h

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 096a5ca)

This makes the corresponding test not racy. Fixes: https://tracker.ceph.com/issues/16881 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> Conflicts: src/mds/PurgeQueue.cc src/mds/PurgeQueue.h

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 98e3b7e) Note: removed mgr blacklist test which applies to Octopus.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

The squelched error prevented us from knowing connection cleanup doesn't work on py3. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit b45c08b)

Otherwise this raises an exception. Fixes: https://tracker.ceph.com/issues/43113 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 03f8080)

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 6714364)

nautilus: mgr/MgrClient: fix open condition Reviewed-by: Josh Durgin <jdurgin@redhat.com>

nautilus: selinux: Allow ceph to read udev db Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Boris Ranto <branto@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>

nautilus: kv: fix shutdown vs async compaction Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>

nautilus: osd: Diagnostic logging for upmap cleaning Reviewed-by: David Zafman <dzafman@redhat.com>

nautilus: osd: Use physical ratio for nearfull (doesn't include backfill reserve) Reviewed-by: Neha Ojha <nojha@redhat.com>

nautilus: osd/OSD: enhance osd numa affinity compatibility Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>

nautilus: osd/PeeringState.cc: skip peer_purged when discovering all missing Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>

nautilus: qa/suites/rados/thrash: force normal pg log length with cache tiering Reviewed-by: Sage Weil <sage@redhat.com>

nautilus: osd/PeeringState.cc: don't let num_objects become negative Reviewed-by: Neha Ojha <nojha@redhat.com>

nautilus: core: osd/OSDMap: health alert for non-power-of-two pg_num Reviewed-by: Neha Ojha <nojha@redhat.com>

Fixes: https://tracker.ceph.com/issues/36728 Fallback to predefined paths for backward compatibility. Alter test involved for partial match in warning Signed-off-by: Shyukri Shyukriev <shshyukriev@suse.com> (cherry picked from commit a857708)

Users complained[1] the error message isn't clear, and they thought it referred to the cluster fsid instead of the osd_fsid. Made it clearer. [1] rook/rook#4547 Fixes: https://tracker.ceph.com/issues/43442 Signed-off-by: Yaniv Kaul <ykaul@redhat.com> (cherry picked from commit ff3ba92)

Fixes: bb4de1a Fixes: https://tracker.ceph.com/issues/42970 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit c1bd09f) Conflicts: src/ceph-volume/ceph_volume/tests/api/test_lvm.py

…int from string. Fixes: https://tracker.ceph.com/issues/43186 Signed-off-by: dongdong tao <dongdong.tao@canonical.com> (cherry picked from commit 81ff4be)

…r_device Actual data size depending on osds_per_device needs to be calculated here. Otherwise, if osds_per_device is greater than 1, ceph-volume will allocate 100% of the device to the first osd and then fail to create the LV for the second because the volume group is already full. Fixes: https://tracker.ceph.com/issues/39442 Signed-off-by: Fabian Niepelt <f.niepelt@mittwald.de> (cherry picked from commit ecde6cd) Conflicts: src/ceph-volume/ceph_volume/devices/lvm/strategies/bluestore.py I've removed `data_uuid` since it's not in nautilus already
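The arithmetic behind this fix can be sketched as follows. This is an illustration in bytes with a hypothetical helper name; the real ceph-volume code may work in LVM extents rather than bytes.

```python
def data_lv_sizes(vg_free_bytes, osds_per_device):
    """Split the free space of a volume group evenly between the OSDs on it."""
    if osds_per_device < 1:
        raise ValueError("osds_per_device must be at least 1")
    # Each OSD gets an equal share, so the last LV still fits in the VG.
    per_osd = vg_free_bytes // osds_per_device
    return [per_osd] * osds_per_device
```

With `osds_per_device=2`, each data LV gets half of the volume group instead of the first one taking all of it.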

This allows for a symlink to be passed to

```
ceph-volume lvm list <path>
```

which makes it possible to use `/dev/disk/by-path/*` devices, for instance. Fixes: https://tracker.ceph.com/issues/43497 Signed-off-by: Benoît Knecht <bknecht@protonmail.ch> (cherry picked from commit 09fa3df)

A lot of our functionality depends on the mgr now. If there is a cluster set up with osds but no mgr, issue a warning. Fixes: https://tracker.ceph.com/issues/38942 Signed-off-by: Neha Ojha <nojha@redhat.com> (cherry picked from commit c20e51c)

On Linux it is possible to set a 64-character hostname when HOST_NAME_MAX is set to 64. This means that when we call gethostname we should expect HOST_NAME_MAX characters plus 1 for the null character terminating the hostname string, as described here: http://man7.org/linux/man-pages/man2/sethostname.2.html With the current code, on a host with a 64-character hostname, the osd updates the crush map with host=unknown_host during start. Signed-off-by: Michal Skalski <mskalski@juniper.net> (cherry picked from commit 5201048)

Fixes: https://tracker.ceph.com/issues/42119 Signed-off-by: Richard Bai(白学余) <baixueyu@inspur.com> (cherry picked from commit 78125a8)

…ils but no error messages Fixes: None Signed-off-by: Snow Si <silonghu@inspur.com> (cherry picked from commit ff2f4af)

The session gets put as result of the set_session call in the next block. Fixes: https://tracker.ceph.com/issues/38345 Signed-off-by: Brad Hubbard <bhubbard@redhat.com> (cherry picked from commit 8f23f2c)

Introduced by 08fcf01, which activated this (broken) code path. Fixes: https://tracker.ceph.com/issues/43892 Signed-off-by: Sage Weil <sage@redhat.com> (cherry picked from commit ab64a22)

This commit adds per-pool pg states metrics with unique 'pool_id' label. Signed-off-by: Aleksei Zakharov <zakharov.a.g@yandex.ru> (cherry picked from commit 8fb16e4)

If we have all other stats by pool, it's better to have the total count by pool too. We can always sum() the per-pool totals, but it's hard to derive a per-pool total from the overall count. Signed-off-by: Aleksei Zakharov <zakharov.a.g@yandex.ru> (cherry picked from commit 41e7d20)

Update pg metrics descriptions to show that we have per pool stats now. Signed-off-by: Aleksei Zakharov <zakharov.a.g@yandex.ru> (cherry picked from commit 96bd77d)

Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru> (cherry picked from commit 4eb58f7)

Also, revert table formatting. Signed-off-by: Aleksei Zakharov <zaharov@selectel.ru> (cherry picked from commit a37cf38)

If an entity name (id.type) has more than one dot (i.e. 'type' has dots), split only on the first one Fixes: https://tracker.ceph.com/issues/43313 Signed-off-by: Dan Mick <dan.mick@redhat.com> (cherry picked from commit 626c7df)
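Splitting only on the first dot can be sketched like this. The helper name is hypothetical, and the example string is made up for illustration.

```python
def split_on_first_dot(name):
    """Split `name` at the first dot only; later dots stay in the second part."""
    head, sep, tail = name.partition(".")
    if not sep:
        raise ValueError("expected at least one dot in {!r}".format(name))
    return head, tail
```

For example, `split_on_first_dot("client.rgw.gateway.a")` keeps `rgw.gateway.a` together as the second part, where a plain `name.split(".")` would yield four pieces.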

Using raw_used_rate to calculate the pool_pg_target results in too many PGs for erasure coded pools (e.g. EC 4+2 has raw_used_rate=1.5 but size is 6, so there will be 4x too many PGs). Calculate using p['size'] instead. Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch> Fixes: https://tracker.ceph.com/issues/43546 (cherry picked from commit b766d5f)
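The 4x figure follows directly from the EC profile. The sketch below only reproduces that arithmetic, assuming raw_used_rate is (k+m)/k and size is k+m as in the 4+2 example; it is not the autoscaler's code.

```python
def pg_overshoot_factor(k, m):
    """How many times too large a PG target is when divided by
    raw_used_rate instead of the pool size, for an EC k+m pool."""
    size = k + m                 # number of shards per PG
    raw_used_rate = size / k     # raw bytes stored per logical byte
    return size / raw_used_rate


factor = pg_overshoot_factor(4, 2)  # 6 / 1.5
```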

Fixes: https://tracker.ceph.com/issues/43583 Signed-off-by: dongdong tao <dongdong.tao@canonical.com> (cherry picked from commit fb6f78a) Conflicts: src/rgw/rgw_reshard.cc - nautilus does not have 7e613fd - in nautilus, RGWMPObj is defined in rgw_rados.h

nautilus: test: Fix wait_for_state() to wait for a PG to get into a state Reviewed-by: Neha Ojha <nojha@redhat.com>

nautilus: os/bluestore/BlueStore.cc: set priorities for compression stats Reviewed-by: Igor Fedotov <ifedotov@suse.com>

nautilus: common: fix deadlocky inflight op visiting in OpTracker. Reviewed-by: Kefu Chai <kchai@redhat.com>

nautilus: common/util: use ifstream to read from /proc files Reviewed-by: Kefu Chai <kchai@redhat.com>

nautilus: mon/MgrMonitor.cc: add always_on_modules to the output of "ceph mgr module ls" Reviewed-by: David Zafman <dzafman@redhat.com>

nautilus: common/config: update values when they are removed via mon Reviewed-by: Sage Weil <sage@redhat.com>

nautilus: common/options: bluestore 4k min_alloc_size for SSD Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Igor Fedotov <ifedotov@suse.com>

nautilus: ceph-volume: minor clean-up of "simple scan" subcommand help

nautilus: ceph-volume/test: patch VolumeGroups

Fixes: https://tracker.ceph.com/issues/42777 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit b35e8c4)

Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 4749f4c) Conflicts: src/ceph-volume/ceph_volume/tests/conftest.py resolved by importing PropertyMock

When batch is called non-interactively and a user explicitly specifies, say, a db-device, this will be filtered when unavailable. This can cause the resulting OSD to be very different from the user's intention (standalone vs external db when the db-device was filtered). If devices get filtered in non-interactive mode, ceph-volume should fail. Fixes: https://tracker.ceph.com/issues/43105 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 2e98505)
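A rough sketch of the intended behavior, under the assumption that filtering simply means "requested but not available". The function and argument names are hypothetical and do not mirror the ceph-volume batch code.

```python
def check_filtered_devices(requested, available, interactive):
    """Return the devices that were filtered out; refuse to continue
    silently when running non-interactively."""
    filtered = [dev for dev in requested if dev not in available]
    if filtered and not interactive:
        # The user asked for these explicitly, so dropping them would
        # produce a different OSD layout than intended.
        raise RuntimeError(
            "{} requested device(s) were filtered: {}".format(
                len(filtered), ", ".join(filtered)))
    return filtered
```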

nautilus: ceph-volume: assume msgrV1 for all branches containing mimic

nautilus: ceph-volume: util: look for executable in $PATH

nautilus: ceph-volume/lvm/activate.py: clarify error message: fsid refers to osd_fsid

nautilus: ceph-volume: use correct extents if using db-devices and >1 osds_per_device

nautilus: ceph-volume: fix the integer overflow

nautilus: ceph-volume: Dereference symlink in lvm list

nautilus: ceph-volume: import mock.mock instead of unittest.mock (py2)

nautilus: ceph-volume: make get_devices fs location independent

nautilus: ceph-volume/batch: fail on filtered devices when non-interactive

This changes create_lv so one can pass the desired device and either a VG with a name starting with ceph is re-used or a new one is created. This commit also adds two new lvm primitives, making use of lvm's select feature. The goal is to eventually avoid keeping a full list of lv's (or vg's) around and query the lvm system as needed. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit bb4de1a)

Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 01a603f)

Add option to pass raw physical devices everywhere, restructure a little (bluestore section before filestore) and reword a few things. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit f2018d7)

nautilus: ceph-volume: allow raw block devices everywhere

Simply calls lvchange -an to deactivate a logical volume. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 8087600)

Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 2558b55)

This unmounts a path if and only if it's a tmpfs mount. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 705ed11)

This new subcommand unmounts an OSD's tmpfs mount and closes crypt devices if there are any. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 9797f6b)

nautilus: mds: reject sessionless messages Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: MDSMonitor: warn if a new file system is being created with an EC default data pool Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: mds: reject forward scrubs when cluster has multiple active MDS (more than one rank) Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: mds: fix revoking caps after stale->resume circle Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: cephfs-journal-tool: fix crash and usage Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: mds: note client features when rejecting client Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: client: disallow changing fuse_default_permissions option at runtime Reviewed-by: Ramana Raja <rraja@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com>

nautilus: ceph-volume: lvm deactivate command

Remove the --all flag until it's actually implemented. Fixes: https://tracker.ceph.com/issues/43330 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit c13901f) Conflicts: src/ceph-volume/ceph_volume/devices/lvm/deactivate.py

nautilus: mon: print FSMap regardless of file system count Reviewed-by: Ramana Raja <rraja@redhat.com>

Filters can be passed to these commands by using option '-S'. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit a4f2fce)

nautilus: ceph-volume: add methods to pass filters to pvs, vgs and lvs commands

nautilus: ceph-volume: lvm/deactivate: add unit tests, remove --all

…elease Signed-off-by: Yuri Weinstein <yweinste@redhat.com>

…autilus qa/tests: added client-upgrade-nautilus suite to be used on octopus … Reviewed-by: Neha Ojha <nojha@redhat.com>

The Size class can now parse strings and has support for arithmetic operations and comparisons with numbers. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit dd89f46)

This adds options to size to-be-created LVs in the prepare and create subcommands. Sizing can be done explicitly by passing a size or implicitly by specifying the number of slots per [data|journal|wal|db] device. The former will try to create an LV of the specified size and use that to create OSDs if it succeeds. The latter will carve up the device size into $n slots and use one of those slots for the to-be-created OSD. If partitions or LVs are passed, these options are ignored. This also creates the foundation to move to byte-based sizing, by moving VolumeGroup lvm querying and size calculation to bytes as the base unit. Fixes: https://tracker.ceph.com/issues/43299 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 8b8913a)

Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 9e61b80)

Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 4051129)

This was introduced in ceph#32242 erroneously. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 6f107f7)

Fixes: https://tracker.ceph.com/issues/43844 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit df18497)

nautilus: pybind/mgr/volumes: idle connection drop is not working Reviewed-by: Ramana Raja <rraja@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>

Add a new Python binding equivalent to lstat so that information about the symlink itself can also be obtained, along with other types of files. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 8cce7da)

Fixes: http://tracker.ceph.com/issues/42646 Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit b9cff8a)

* Fixed raises that don't re-raise * Dropped some commands with --force remove commands, as it is unnecessary. Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 992d8b6)

* Raised RuntimeException when commands that were expected to fail succeed. * Dropped some commands with --force remove commands, as it is unnecessary. Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit c8f6a25)

…-force * tests 'fs subvolume rm --force' * tests 'fs subvolume snapshot rm --force' * tests 'fs subvolumegroup rm --force' * tests 'fs subvolumegroup snapshot rm --force' Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 42c135d)

Fixes: https://tracker.ceph.com/issues/42872 Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 5e998bd)

Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 6913524)

There are only 2 cases which need cleanup: 1. The volume is successfully created 2. The volume is successfully created but create_mds fails In either case, we could do a 'volume rm'. Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 67e43f4)

... which was not fully implemented anyway, so just remove the boilerplates. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 45c22bd)

introduce with statement in rmtree. This change simplifies the code's handling of directory cleanup. Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 9e27cd1)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 2eb0c50)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 968f675)

Signed-off-by: Joshua Schmid <jschmid@suse.de> (cherry picked from commit 2f705ee)

Instead of checking if the --yes-i-really-mean-it flag was set _after_ removing the MDS daemon, we need to check it before starting any removal operation. Fixes: https://tracker.ceph.com/issues/42931 Signed-off-by: Joshua Schmid <jschmid@suse.de> (cherry picked from commit c1db7f8)

* clean up on fs create error * drop unnecessary check in create_pool Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 171c375)

This is fixed already. Now the pool names are: cephfs.<volume name>.meta cephfs.<volume name>.data Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit ffda5f6)

helpers for various filesystem querying routines, utils for creating/removing filesystem, pool and MDSs. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 6682c77)

unlike existing subvolume specification, this is just a minimal set of globally available configurations. bulk of other configurations will be moved to the respective entity modules (subsequent commits). Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 0039b5d)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit bc89d05)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit f9ae6e3)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 74f349f)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit b30f0cb)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 0e3c48e)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 3eccd61)

subvolume base class implements common routines/helpers and initializes a metadata manager. later, when v2 subvolume version is implemented, the metadata manager would be used to persist subvolume metadata in ceph filesystem. this would allow flexible metadata management when complex subvolume features are added. typically, a subvolume would be implemented by subclassing the subvolume base class and the subvolume template -- instantiating this would be called a "subvolume object". with this commit, current subvolume topology is maintained. but we introduce the concept of subvolume versions. a loader stub loads available "versions" of subvolumes. right now, the only available version is v1. since backward compatibility needs to be maintained for existing subvolumes, the loader API allows version discovery w/ auto upgradation to the most recent version. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 97170d7)

create_subvolume() creates a subvolume with the max version known to the plugin. open_subvolume() performs version discovery by using the loader stub and returns a subvolume object. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 8a64914)

apart from the new way of provisioning subvolumes, this makes heavy use of context managers for volumes, groups and subvolumes. this change classifies volumes, groups and subvolumes to be treated as filesystem dentries and inodes. a "volume" can be thought of as a dentry with "groups" as its entries (inodes). likewise, a "group" is a dentry again with "subvolumes" as entries (inodes). this is built into the access mechanism as follows:

    with open_volume(...) as fs_handle:
        with open_group(fs_handle, ...) as group:
            with open_subvolume(group, ...) as subvolume:
                # call subvolume object API
                path = subvolume.getpath()

this way, a lot of redundant checks, such as verifying that a volume or group exists before accessing a subvolume, are built right into the access mechanism, plus an added bonus of simple error handling. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 9b87bd7)
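The nested access pattern can be tried out with a self-contained toy. Everything below is a stand-in: the in-memory `VOLUMES` dict, the `NotFound` exception and the function bodies are assumptions for illustration, and only the names `open_volume`, `open_group` and `open_subvolume` come from the message.

```python
from contextlib import contextmanager


class NotFound(Exception):
    """Raised when a volume, group or subvolume does not exist."""


# Toy hierarchy: volume -> group -> subvolume -> path.
VOLUMES = {"vol1": {"group1": {"sv1": "/volumes/group1/sv1"}}}


def _lookup(container, name, kind):
    if name not in container:
        raise NotFound("{} '{}' does not exist".format(kind, name))
    return container[name]


@contextmanager
def open_volume(volname):
    yield _lookup(VOLUMES, volname, "volume")


@contextmanager
def open_group(fs_handle, groupname):
    yield _lookup(fs_handle, groupname, "group")


@contextmanager
def open_subvolume(group, subvolname):
    yield _lookup(group, subvolname, "subvolume")


def subvolume_path(volname, groupname, subvolname):
    # Each level is validated on entry, so no separate existence checks
    # are needed before touching the subvolume.
    with open_volume(volname) as fs_handle:
        with open_group(fs_handle, groupname) as group:
            with open_subvolume(group, subvolname) as path:
                return path
```

A missing volume or group surfaces as a single `NotFound` at the level where the lookup fails, which is the simplified error handling the message refers to.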

Fixes: https://tracker.ceph.com/issues/43349 Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 03ee966)

this was lying around post versioning changes. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 5595476)

Fixes: http://tracker.ceph.com/issues/43645 Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit c158a13)

... and fetch creation state from state machine table. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 46f29bf)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 5318779)

This will be required when creating a clone, as the clone would inherit the source subvolume's creation mode and uid/gid. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit f02b1e7)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 7089808)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 461909b)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 8d68f1a)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit fa3c56f)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit b2145b7)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 7ad14cf)

This also makes `_cancel_jobs()` thread safe, which was not the case earlier (with `_cancel_purge_job()`) -- this also makes the code simpler by sharing the lock between two condition variables. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit f16cc1e)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 4f09568)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 451be11)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit b5970ff)

This fix is only needed in nautilus, and the issue was observed during upstream teuthology testing.

    File "/usr/share/ceph/mgr/volumes/fs/async_cloner.py", line 114, in cptree
      copy_file(fs_handle, d_full_src, d_full_dst, mo, st.st_uid, st.st_gid)
    File "/usr/share/ceph/mgr/volumes/fs/fs_util.py", line 97, in copy_file
      fs.chown(dst, uid, gid)
    File "cephfs.pyx", line 855, in cephfs.LibCephFS.chown
    TypeError: uid must be an int

The issue wasn't observed in master/octopus teuthology testing. Signed-off-by: Ramana Raja <rraja@redhat.com>

Fix the following issue seen during upstream teuthology testing:

    File "/usr/share/ceph/mgr/volumes/fs/operations/versions/subvolume_base.py", line 98, in load_config
      self.metadata_mgr = MetadataManager(self.fs, self.legacy_config_path, 0o640)
    File "/usr/share/ceph/mgr/volumes/fs/operations/versions/subvolume_base.py", line 73, in legacy_config_path
      meta_config = "{0}.meta".format(m.digest().hex())
    AttributeError: 'str' object has no attribute 'hex'

This issue is not observed in master/octopus, as it only supports py3. Signed-off-by: Ramana Raja <rraja@redhat.com>
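On py2, `digest()` returns a `str`, which has no `hex()` method. One portable alternative is `hexdigest()`, which hashlib provides on both py2 and py3. The sketch below shows that approach; the md5 choice and the helper name are assumptions, and this is not necessarily the change the commit made.

```python
import hashlib


def legacy_config_name(path):
    """Build a '<hex digest>.meta' file name in a py2/py3-portable way."""
    m = hashlib.md5()  # assumption: the hash algorithm is illustrative only
    m.update(path.encode("utf-8"))
    # hexdigest() returns the hex string directly on both py2 and py3.
    return "{0}.meta".format(m.hexdigest())
```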

nautilus: ceph-volume: add sizing arguments to prepare

nautilus: ceph-volume: batch bluestore fix create_lvs call

get_pvs, get_vgs and get_lvs must accept tags and filter volumes based on tags. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit fb13909)

These convenience methods shorten the following phrase to "lv = get_first_lv()":

    lvs = get_lvs()
    if len(lvs) >= 1:
        lv = lvs[0]

These methods do the same thing as the above phrase internally. Rewrite listing.py to use these new helper methods. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 17957d9)
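A toy version of such a helper is sketched below. Here `get_lvs` is a stand-in returning a fixed list, and returning `None` for an empty result is an assumption; the real helpers query LVM and may behave differently.

```python
def get_lvs(_lvs=("osd-data-1", "osd-data-2")):
    """Stand-in for the LVM query; returns a list of LV names."""
    return list(_lvs)


def get_first_lv(lvs=None):
    """Return the first LV, or None when there are none."""
    if lvs is None:
        lvs = get_lvs()
    return lvs[0] if len(lvs) >= 1 else None
```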

The method determines whether given LV is managed by Ceph or not. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 876244b)

Get rid of duplicate and redundant code and use get_lvs, get_vgs and get_pvs to simplify the module as much as possible. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit d02bd7d)

Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit d1ae6d1)

listing.py doesn't call api.Volumes anymore. Therefore, this test is redundant. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 665ed24)

17957d9 introduced a regression in `lvm list`. When passing a vg/lv path for generating a single report, it fails because the filter used in the `lvs` command isn't right. It uses the lv name instead of the vg name because `os.path.basename(device)` is used while it should be `os.path.dirname(device)` Fixes: https://tracker.ceph.com/issues/43969 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 0179fed)
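The difference between the two calls is easy to see on a vg/lv path. The helper name below is hypothetical; `os.path.dirname` and `os.path.basename` are the standard library functions named in the message.

```python
import os.path


def split_vg_lv(device):
    """Split a 'vg/lv' path into (vg_name, lv_name)."""
    vg_name = os.path.dirname(device)   # part before the last slash
    lv_name = os.path.basename(device)  # part after the last slash
    return vg_name, lv_name
```

For `test_group/data-lv2`, `dirname` yields `test_group` (the VG) while `basename` yields `data-lv2` (the LV), so filtering by VG name with `basename` picks the wrong value.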

Also drop the sep argument from get_lvs and siblings, unused. Introduce LV_CMD_OPTIONS to unify options to lvs. Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit ffe5b57)

A single report on a non-lvm device now works. Format was cleaned up, report lvm journal,wal, db only once. Fixes: https://tracker.ceph.com/issues/44009 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 000bf2f)

nautilus: ceph-volume: refactor listing.py + fixes

When using vg/lv, this function throws an error like the following:

```
stderr: unable to read label for test_group/data-lv2: (2) No such file or directory
stderr: 2020-02-04T21:03:32.153+0000 7fe091af4200 -1 bluestore(test_group/data-lv2) _read_bdev_label failed to open test_group/data-lv2: (2) No such file or directory
```

Using `self.abspath` fixes this error. Fixes: https://tracker.ceph.com/issues/43970 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 148069a)

We don't want to generate this log when a call to `has_bluestore_label()` fails. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 7f8371c)

This adds two properties available_[lvm,raw] to device (and thus inventory). The goal is to have different notions of availability based on the intended use case. For example, finding LVM structures makes a drive unavailable for the raw mode, but the drive might still be available for the lvm mode. Fixes: https://tracker.ceph.com/issues/43400 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 233ccff)

When rerunning ceph-volume lvm create on a device already prepared and activated, ceph-volume should skip the creation. This is a regression introduced by bb4de1a Fixes: https://tracker.ceph.com/issues/43981 Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 634a709)

This commit adds a new unit test `test_safe_prepare_osd_already_created()` to verify that `RuntimeError` is raised when `is_ceph_device()` returns `True`. Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit ccf92d7)

We need to pay attention to account for CRUSH_ITEM_NONE entries in the EC PG acting set. Fixes: https://tracker.ceph.com/issues/43151 Signed-off-by: Sage Weil <sage@redhat.com> (cherry picked from commit 66690ea) Conflicts: qa/standalone/misc/ok-to-stop.sh - nautilus "ceph osd pool create" CLI command takes a pg_num argument

Signed-off-by: Sage Weil <sage@redhat.com> (cherry picked from commit 78ec6ae)

Make sure PGs peer (simply flushing state to mon isn't enough). Fixes: https://tracker.ceph.com/issues/43721 Signed-off-by: Sage Weil <sage@redhat.com> (cherry picked from commit 76ea774)

mgr/volumes: misc fix and feature enhancements Reviewed-by: Venky Shankar <vshankar@redhat.com>

nautilus: mount.ceph: give a hint message when no mds is up or cluster is laggy Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

nautilus: mount.ceph: remove arbitrary limit on size of name= option Reviewed-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: cephfs: client: Add is_dir() check before changing directory Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com>

nautilus: mgr: "mds metadata" to setup new DaemonState races with fsmap Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: mds: fix assert(omap_num_objs <= MAX_OBJECTS) of OpenFileTable Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: cephfs: qa: ignore slow ops for ffsb workunit Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

nautilus: cephfs: qa: save MDS epoch barrier Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

nautilus: mds/OpenFileTable: match MAX_ITEMS_PER_OBJ to osd_deep_scrub_large_omap_object_key_threshold Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

nautilus: RuntimeError: Files in flight high water is unexpectedly low (0 / 6) Reviewed-by: Ramana Raja <rraja@redhat.com>

nautilus: rgw: update the hash source for multipart entries during resharding Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

nautilus: cephfs: qa: ignore trimmed cache items for dead cache drop Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

nautilus: ceph-volume: skip osd creation when already done

This is a regression introduced by 634a709. The lvm batch command fails to prepare the OSDs on the created LV. When using lvm batch, the LV/VG are created prior to the OSD prepare. During that creation, multiple tags are set with a null value:

    $ lvs -o lv_tags --noheadings
    ceph.cluster_fsid=null,ceph.osd_fsid=null,ceph.osd_id=null,ceph.type=null

is_ceph_device returns True if the ceph.osd_id LVM tag exists, without testing its value, so we raise an exception. When the tag value is set to 'null', we can consider that the device isn't part of the ceph cluster (because it is not yet prepared). Closes: https://tracker.ceph.com/issues/44069 Signed-off-by: Dimitri Savineau <dsavinea@redhat.com> (cherry picked from commit a825823)
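The tag check described above can be sketched like this. It is a simplified illustration: the real ceph-volume function works on its own LV objects, while the dict-based signature and the `parse_lv_tags` helper here are hypothetical.

```python
def parse_lv_tags(raw):
    """Turn 'k1=v1,k2=v2' output from lvs into a dict."""
    return dict(item.split("=", 1) for item in raw.strip().split(",") if item)


def is_ceph_device(lv_tags):
    """Hypothetical check: an LV counts as a Ceph device only when
    ceph.osd_id is present AND holds a real value, not the 'null' placeholder."""
    osd_id = lv_tags.get("ceph.osd_id")
    return osd_id is not None and osd_id != "null"


fresh = parse_lv_tags(
    "ceph.cluster_fsid=null,ceph.osd_fsid=null,ceph.osd_id=null,ceph.type=null"
)
prepared = parse_lv_tags("ceph.osd_id=3,ceph.type=block")
print(is_ceph_device(fresh), is_ceph_device(prepared))
```

With the value test in place, an LV freshly created by lvm batch is no longer mistaken for an existing OSD.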

Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 60d8063)

Fixes: https://tracker.ceph.com/issues/44099 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 2c5a8c3)

nautilus: ceph-volume: fix has_bluestore_label() function

nautilus: ceph-volume: finer grained availability notion in inventory.

nautilus: ceph-volume: fix is_ceph_device for lvm batch

nautilus: ceph-volume: use get_device_vgs in has_common_vg

Fixes: https://tracker.ceph.com/issues/43889 Signed-off-by: Sage Weil <sage@redhat.com> (cherry picked from commit 9f2a854)

The skewed clock makes some mons miss elections. Signed-off-by: Sage Weil <sage@redhat.com> (cherry picked from commit 08b6a2b)

Fixes: https://tracker.ceph.com/issues/43646 Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com> (cherry picked from commit 8198332) Conflicts: src/test/bufferlist.cc

When ceph-mon starts, it checks whether it is listed in the monmap. If not, it complains
```
no public_addr or public_network specified, and mon.a not present in monmap or ceph.conf.
```
and then bails out. Normally, the monitor renames itself in the monmap when performing "mkfs", but in our case we are merely using the "mkfs" monmap to pass in the monmap built by ceph-monstore-tool, and we don't actually go through the "mkfs" process, so ceph-mon won't rename itself when booting up. With this change, the user can specify the mon-ids on the command line when rebuilding the mondb; if not specified, the default mon-ids are a,b,c,... Signed-off-by: Kefu Chai <kchai@redhat.com> (cherry picked from commit 4b3df5a)

Fixes: https://tracker.ceph.com/issues/43582 Signed-off-by: Kefu Chai <kchai@redhat.com> (cherry picked from commit a5bfeca)

to note that we also need to add mgr's key to monitor's keyring Signed-off-by: Kefu Chai <kchai@redhat.com> (cherry picked from commit 75f4765)

nautilus: mon/ConfigMonitor: fix handling of NO_MON_UPDATE settings Reviewed-by: Nathan Cutler <ncutler@suse.com> Reviewed-by: Sage Weil <sage@redhat.com>

nautilus: crush/CrushWrapper: behave with empty weight vector Reviewed-by: Kefu Chai <kchai@redhat.com>

nautilus: mgr/pg_autoscaler: default to pg_num[_min] = 32 Reviewed-by: Neha Ojha <nojha@redhat.com> Reviewed-by: Laura Paduano <lpaduano@suse.com>

nautilus: mon: elector: return after triggering a new election Reviewed-by: Josh Durgin <jdurgin@redhat.com>

…lus-yh nautilus: mgr/telemetry: fix device serial number anonymization Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>

nautilus: mgr/telemetry: anonymizing smartctl report itself Reviewed-by: Sage Weil <sage@redhat.com>

nautilus: mon/MgrMonitor.cc: warn about missing mgr in a cluster with osds Reviewed-by: Kefu Chai <kchai@redhat.com>

nautilus: osd: Allow 64-char hostname to be added as the "host" in CRUSH Reviewed-by: Kefu Chai <kchai@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>

nautilus: mon/ConfigMonitor: only propose if leader Reviewed-by: Kefu Chai <kchai@redhat.com>

nautilus: mgr/prometheus: report per-pool pg states Reviewed-by: Jan Fajerski <jfajerski@suse.com>

nautilus: mgr/telemetry: split entity_name only once (handle ids with dots) Reviewed-by: Sage Weil <sage@redhat.com>

nautilus: mgr/pg_autoscaler: calculate pool_pg_target using pool size Reviewed-by: Kefu Chai <kchai@redhat.com>

nautilus: mon/Session: only index osd ids >= 0

nautilus: mgr/telemetry: check get_metadata return val Reviewed-by: David Zafman <dzafman@redhat.com>

nautilus: mon: Don't put session during feature change Reviewed-by: David Zafman <dzafman@redhat.com>

nautilus: rgw_file: avoid string::front() on empty path Reviewed-by: Casey Bodley <cbodley@redhat.com>

nautilus: rgw: maybe coredump when reload operator happened Reviewed-by: Casey Bodley <cbodley@redhat.com>

nautilus: rgw: fix one part of the bulk delete(RGWDeleteMultiObj_ObjStore_S3) fails but no error messages Reviewed-by: Casey Bodley <cbodley@redhat.com>

Fixes: https://tracker.ceph.com/issues/44125 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit ad0dea5)

Fixes: https://tracker.ceph.com/issues/43844 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit df18497)

nautilus: ceph-volume: avoid calling zap_lv with a LV-less VG

nautilus: ceph-volume: batch bluestore fix create_lvs call

Fixes: https://tracker.ceph.com/issues/44148 Signed-off-by: Jan Fajerski <jfajerski@suse.com> (cherry picked from commit 49f6e6d)

Signed-off-by: Jan Fajerski <jfajerski@suse.com> Fixes: https://tracker.ceph.com/issues/44149 (cherry picked from commit bccdf6e)

nautilus: ceph-volume: pass journal_size as Size not string

nautilus: ceph-volume: don't remove vg twice when zapping filestore

nautilus: mgr/DaemonServer: fix 'osd ok-to-stop' for EC pools Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: David Zafman <dzafman@redhat.com>

nautilus: qa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc Reviewed-by: Kefu Chai <kchai@redhat.com>

nautilus: common/bl: fix the dangling last_p issue. Reviewed-by: Kefu Chai <kchai@redhat.com>

nautilus: ceph-monstore-tool: correct the key for storing mgr_command_descs Reviewed-by: Kefu Chai <kchai@redhat.com>

Add the min_sample lower-bound argument too Signed-off-by: Sage Weil <sage@redhat.com> (cherry picked from commit 7be5c13) Conflicts: had to be backported to enable backporting of ceph#32903 Backport tracker: https://tracker.ceph.com/issues/43873

… hours The telemetry module fetches device metrics that were scraped in the last "telemetry interval"*2 (=48 hours by default) by calling _get_device_metrics() with min_sample. _get_device_metrics() fetches the metrics from omap and breaks on the first one that is older than min_sample. But because the metrics are fetched in ascending order (from oldest to newest), the loop broke on the very first entry it received whenever that entry was older than the interval above, and so returned nothing. We need to pass min_sample to get_omap_vals() so that it starts fetching from that value. Fixes: https://tracker.ceph.com/issues/43837 Signed-off-by: Yaarit Hatuka <yaarit@redhat.com> (cherry picked from commit 5f7e4a9)
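The difference between the two fetch strategies can be reproduced with a plain sorted list. This is only a sketch: the functions `fetch_buggy` and `fetch_fixed`, the timestamp key format, and the use of `bisect` in place of an omap lower-bound lookup are all assumptions for illustration.

```python
from bisect import bisect_left


def fetch_buggy(samples, min_sample):
    """Ascending scan that breaks on the first key older than min_sample."""
    out = []
    for key in samples:          # samples are sorted oldest -> newest
        if key < min_sample:
            break                # hits the oldest entry and stops at once
        out.append(key)
    return out


def fetch_fixed(samples, min_sample):
    """Start the scan at min_sample, as a lower-bound lookup would."""
    return samples[bisect_left(samples, min_sample):]


samples = ["20200201-000000", "20200210-000000", "20200212-000000"]
min_sample = "20200209-000000"
print(fetch_buggy(samples, min_sample))
print(fetch_fixed(samples, min_sample))
```

As soon as one stored sample is older than the cutoff, the first variant returns an empty result, which matches the symptom of reports stopping after 48 hours.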

Upgrade to 2.8.1 and stable-4.0 respectively Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

nautilus: qa/ceph-ansible: ansible-version and ceph_ansible Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

This was done for octopus in 8283ea9, but not for nautilus Signed-off-by: Neha Ojha <nojha@redhat.com>

nautilus: qa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering

Caused by backport commit cb48be5 which did not account for the explicit drop of the message reference, only in Nautilus-. Fixes: https://tracker.ceph.com/issues/44245 Fixes: cb48be5 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

* refs/pull/33498/head: mgr: drop reference to msg on return Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>

Saw a deadlock when deleting a large number of subvolumes: purge threads were stuck acquiring the global lock for volume access. This can happen when there is a concurrent remove (which renames and signals the purge threads) and a purge thread is just about to scan the trash directory for entries. For the fix, purge threads fetch entries by accessing the volume in lockless mode. This is safe from a functionality point of view, as the rename and the directory scan are correctly handled by the filesystem. In the worst case, the purge thread picks up the trash entry on the next scan, so a stale trash entry is never left behind. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 808a1ce)

Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 5ec09a2)

Fixes: http://tracker.ceph.com/issues/44282 Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 92b2008)

* refs/pull/33526/head: test: verify purge queue w/ large number of subvolumes test: pass timeout argument to mount::wait_for_dir_empty() mgr/volumes: access volume in lockless mode when fetching async job Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

nautilus: mgr/devicehealth: fix telemetry stops sending device reports after 48 hours Reviewed-by: Sage Weil <sage@redhat.com>

If the async threads hit a temporary exception, the job is never unregistered and is therefore skipped by the async threads on subsequent scans. Patrick hit this in nautilus when one of the purge threads hit an exception while trying to log a message. The trash entry was never picked up again by the purge threads. Fixes: http://tracker.ceph.com/issues/44315 Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 46476ef)
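The unregister-on-exception pattern can be sketched as below. The `JobTracker` class and its method names are hypothetical and do not mirror the mgr/volumes code; the point is only that the unregister step sits in a `finally` clause.

```python
class JobTracker:
    """Hypothetical registry of jobs currently owned by a worker thread."""

    def __init__(self):
        self.active = set()

    def run(self, name, work):
        self.active.add(name)          # register: other scans skip this job
        try:
            work()
        except Exception as exc:
            # A transient failure must not leave the job registered forever.
            print(f"job {name} failed: {exc}")
        finally:
            self.active.discard(name)  # always unregister so a later scan retries


def flaky():
    raise RuntimeError("temporary logging failure")


tracker = JobTracker()
tracker.run("trash/subvol-1", flaky)
print(tracker.active)
```

Because the job is no longer marked as active after the failure, the next scan is free to pick the same trash entry up again.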

* refs/pull/33569/head: mgr/volumes: unregister job upon async threads exception Reviewed-by: Ramana Raja <rraja@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Commits on Mar 2, 2020

14.2.8

Jenkins Build Slave User committed Mar 2, 2020

2d095e9

Commits on Oct 31, 2021

revised BlueStore.cc for compaction

SizheLuo committed Oct 31, 2021

71cc8a4

Commits on Nov 1, 2021

compaction for ceph

luosizhe-HW committed Nov 1, 2021

3517896

Commits on Nov 8, 2021

use aio_read in compaction

luosizhe-HW committed Nov 8, 2021

63de917

Commits on Dec 20, 2021

cleancode for data compaction

luosizhe-HW committed Dec 20, 2021

e8e22f2
