doc: Update CephFS disaster recovery documentation

Better documentation about spawning multiple workers to speed up the recovery proces. Signed-off-by: Wido den Hollander <wido@42on.com>
ceph · Dec 7, 2016 · 428c458 · 428c458
1 parent 9b965f9
commit 428c458
Show file tree

Hide file tree

Showing 2 changed files with 26 additions and 12 deletions.
diff --git a/doc/cephfs/disaster-recovery.rst b/doc/cephfs/disaster-recovery.rst
@@ -140,25 +140,37 @@ it into the metadata pool.
     cephfs-data-scan scan_extents <data pool>
     cephfs-data-scan scan_inodes <data pool>
 
-This command may take a very long time if there are many
-files or very large files in the data pool.  To accelerate
-the process, run multiple instances of the tool.  Decide on
-a number of workers, and pass each worker a number within
-the range 0-(N_workers - 1), like so:
+This command may take a *very long* time if there are many
+files or very large files in the data pool.
+
+To accelerate the process, run multiple instances of the tool.
+
+Decide on a number of workers, and pass each worker a number within
+the range 0-(worker_m - 1).
+
+The example below shows how to run 4 workers simultaneously:
 
 ::
 
     # Worker 0
-    cephfs-data-scan scan_extents --worker_n 0 --worker_m 1 <data pool>
+    cephfs-data-scan scan_extents --worker_n 0 --worker_m 4 <data pool>
     # Worker 1
-    cephfs-data-scan scan_extents --worker_n 1 --worker_m 1<data pool> 1 1
+    cephfs-data-scan scan_extents --worker_n 1 --worker_m 4 <data pool>
+    # Worker 2
+    cephfs-data-scan scan_extents --worker_n 2 --worker_m 4 <data pool>
+    # Worker 3
+    cephfs-data-scan scan_extents --worker_n 3 --worker_m 4 <data pool>
 
     # Worker 0
-    cephfs-data-scan scan_inodes --worker_n 0 --worker_m 1 <data pool>
+    cephfs-data-scan scan_inodes --worker_n 0 --worker_m 4 <data pool>
     # Worker 1
-    cephfs-data-scan scan_inodes --worker_n 1 --worker_m 1 <data pool>
+    cephfs-data-scan scan_inodes --worker_n 1 --worker_m 4 <data pool>
+    # Worker 2
+    cephfs-data-scan scan_inodes --worker_n 2 --worker_m 4 <data pool>
+    # Worker 3
+    cephfs-data-scan scan_inodes --worker_n 3 --worker_m 4 <data pool>
 
-It is important to ensure that all workers have completed the
+It is **important** to ensure that all workers have completed the
 scan_extents phase before any workers enter the scan_inodes phase.
 
 Finding files affected by lost data PGs

diff --git a/src/tools/cephfs/DataScan.cc b/src/tools/cephfs/DataScan.cc
@@ -32,14 +32,16 @@ void DataScan::usage()
 {
   std::cout << "Usage: \n"
     << "  cephfs-data-scan init [--force-init]\n"
-    << "  cephfs-data-scan scan_extents [--force-pool] <data pool name>\n"
-    << "  cephfs-data-scan scan_inodes [--force-pool] [--force-corrupt] <data pool name>\n"
+    << "  cephfs-data-scan scan_extents [--force-pool] [--worker_n N --worker_m M] <data pool name>" << dendl;
+    << "  cephfs-data-scan scan_inodes [--force-pool] [--force-corrupt] [--worker_n N --worker_m M] <data pool name>" << dendl;
     << "  cephfs-data-scan pg_files <path> <pg id> [<pg id>...]\n"
     << "  cephfs-data-scan scan_links\n"
     << "\n"
     << "    --force-corrupt: overrite apparently corrupt structures\n"
     << "    --force-init: write root inodes even if they exist\n"
     << "    --force-pool: use data pool even if it is not in FSMap\n"
+    << "    --worker_m: Maximum number of workers" << dendl;
+    << "    --worker_n: Worker number, range 0-(worker_m-1)" << dendl;
     << "\n"
     << "  cephfs-data-scan scan_frags [--force-corrupt]\n"
     << std::endl;