Example to show btrfs-cleaner progress
Ever deleted too many subvolumes at the same time, resulting in the
"btrfs-cleaner" kernel thread going wild, using cpu or writing to disk?
Here's a little script to monitor the progress of the cleaner. It reports
the number of orphaned subvolumes that are left to clean up, and if
cleaning any of them takes more than a moment, progress is reported at
2-minute intervals.
Example output:
100 orphans left to clean
dropping root 1294230 for at least 0 sec drop_progress (637928 EXTENT_DATA 0)
dropping root 1294230 finished after at least 109 sec
99 orphans left to clean
dropping root 1094252 for at least 0 sec drop_progress (5504 DIR_ITEM 1048466060)
dropping root 1094252 for at least 120 sec drop_progress (1058244 INODE_REF 1056848)
dropping root 1094252 finished after at least 121 sec
98 orphans left to clean
dropping root 1299468 for at least 0 sec drop_progress (14216 DIR_INDEX 33)
dropping root 1299468 finished after at least 17 sec
97 orphans left to clean
dropping root 1294116 for at least 0 sec drop_progress (4297 INODE_ITEM 0)
dropping root 1294116 finished after at least 6 sec
96 orphans left to clean
dropping root 1094148 for at least 0 sec drop_progress (3193 INODE_REF 1558)
dropping root 1094148 finished after at least 7 sec
95 orphans left to clean
dropping root 1294233 for at least 0 sec drop_progress (29155 INODE_REF 28406)
dropping root 1294233 for at least 120 sec drop_progress (1718475 INODE_ITEM 0)
dropping root 1294233 for at least 240 sec drop_progress (2930889 DIR_INDEX 17)
dropping root 1294233 for at least 360 sec drop_progress (3739430 INODE_ITEM 0)
dropping root 1294233 for at least 480 sec drop_progress (5077225 INODE_ITEM 0)
dropping root 1294233 for at least 600 sec drop_progress (5762256 EXTENT_DATA 0)
dropping root 1294233 for at least 720 sec drop_progress (6754272 INODE_REF 6754207)
dropping root 1294233 for at least 840 sec drop_progress (7279795 INODE_ITEM 0)
dropping root 1294233 for at least 960 sec drop_progress (7969363 DIR_ITEM 985984353)
dropping root 1294233 for at least 1080 sec drop_progress (8304717 DIR_INDEX 25)
dropping root 1294233 for at least 1200 sec drop_progress (8668644 EXTENT_DATA 0)
dropping root 1294233 finished after at least 1292 sec
94 orphans left to clean
dropping root 1094253 for at least 0 sec drop_progress (15681 DIR_ITEM 1073933304)
dropping root 1094253 for at least 120 sec drop_progress (937036 INODE_REF 936022)
[...]
73 orphans left to clean
dropping root 1094244 for at least 0 sec drop_progress (183679 INODE_ITEM 0)
dropping root 1094244 finished after at least 6 sec
72 orphans left to clean
69 orphans left to clean
dropping root 1094183 for at least 0 sec drop_progress (112400 DIR_ITEM 4071209755)
dropping root 1094183 finished after at least 6 sec
68 orphans left to clean
66 orphans left to clean
dropping root 1094184 for at least 0 sec drop_progress (265876 DIR_ITEM 2364958367)
dropping root 1094184 finished after at least 7 sec
65 orphans left to clean
dropping root 1299429 for at least 0 sec drop_progress (69781 INODE_ITEM 0)
dropping root 1299429 finished after at least 6 sec
64 orphans left to clean
63 orphans left to clean
62 orphans left to clean
[...]
Technically, this works by querying the list of orphan subvolume IDs,
and then looking at the root items, which are still present in tree 1.
The root item which has a non-zero drop_progress is the one that is
being cleaned right now.
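
Since the full script below wraps this check in a polling loop, here is a
minimal sketch of just the detection step, using the same python-btrfs calls
the script itself uses (fs.orphan_subvol_ids(), fs.subvolumes() and the
drop_progress key of the root item):

    import btrfs
    import sys

    fs = btrfs.FileSystem(sys.argv[1])
    zero_key = btrfs.ctree.Key(0, 0, 0)
    for subvol_id in fs.orphan_subvol_ids():
        # The root item may already be gone from tree 1 if cleaning finished.
        subvols = list(fs.subvolumes(min_id=subvol_id, max_id=subvol_id))
        if len(subvols) == 0:
            continue
        if subvols[0].drop_progress != zero_key:
            print("cleaner is busy with root {} drop_progress {}".format(
                subvol_id, subvols[0].drop_progress))

The full script: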
#!/usr/bin/python3
#
# Monitor progress of the btrfs cleaner kernel thread while it removes
# orphaned (deleted) subvolumes.

import btrfs
import sys
import time

if len(sys.argv) < 2:
    print("Usage: {} <mountpoint>".format(sys.argv[0]))
    sys.exit(1)

# Report progress on a long-running cleanup every 2 minutes.
report_interval = 120
zero_key = btrfs.ctree.Key(0, 0, 0)
prev_amount_orphans_left = -1

fs = btrfs.FileSystem(sys.argv[1])
while True:
    current_id = None
    # Subvolumes that have been deleted but not cleaned up yet.
    orphans = fs.orphan_subvol_ids()
    amount_orphans_left = len(orphans)
    if prev_amount_orphans_left != amount_orphans_left:
        print("{} orphans left to clean".format(amount_orphans_left))
        prev_amount_orphans_left = amount_orphans_left
    # The orphan whose root item in tree 1 has a non-zero drop_progress is
    # the one the cleaner is currently working on.
    for subvol_id in orphans:
        subvolumes = list(fs.subvolumes(min_id=subvol_id, max_id=subvol_id))
        if len(subvolumes) == 0:
            continue
        subvol = subvolumes[0]
        if subvol.drop_progress != zero_key:
            current_id, since = subvol_id, int(time.time())
            break
    if current_id is not None:
        report_after = 0
        # Keep watching this subvolume until its root item disappears.
        while True:
            subvolumes = list(fs.subvolumes(min_id=current_id, max_id=current_id))
            duration = int(time.time()) - since
            if len(subvolumes) == 0:
                if report_after > 0:
                    print("dropping root {} finished after at least {} sec".format(
                        current_id, duration))
                break
            if duration >= report_after:
                subvol = subvolumes[0]
                print("dropping root {} for at least {} sec drop_progress {}".format(
                    current_id, duration, subvol.drop_progress))
                report_after += report_interval
            time.sleep(1)
    else:
        time.sleep(1)
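
To try it out, save the script and run it with the mountpoint of the
filesystem as argument; the tree search ioctl that python-btrfs uses
typically requires root. The filename and mountpoint below are just
examples:

    sudo ./show_cleaner_progress.py /mnt/data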