Estimating progress in bees #72

Zygo · 2018-09-25T15:49:49Z

It would be nice if bees could estimate progress, i.e. how much of the filesystem has been scanned or needs to be scanned before bees can return to an idle state.

For subvol scans (root 5 and all roots 256 and higher), in .beeshome/beescrawl.dat we can see the position of the extent iterators:

root 5 objectid 18446744073709551611 offset 193891 min_transid 1345204 max_transid 1345216 started 1537886291 start_ts 2018-09-25-10-38-11
root 258 objectid 6385071 offset 540699 min_transid 1345204 max_transid 1345216 started 1537886291 start_ts 2018-09-25-10-38-11
root 259 objectid 0 offset 0 min_transid 1345204 max_transid 1345216 started 1537886291 start_ts 2018-09-25-10-38-11
root 289 objectid 92116 offset 257 min_transid 1345204 max_transid 1345216 started 1537886291 start_ts 2018-09-25-10-38-11
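
To work with these records programmatically, a minimal sketch of a parser might look like the following. It assumes each line is a flat sequence of whitespace-separated `key value` pairs, as in the sample above; the function name `parse_crawl_line` is hypothetical and not part of bees.

```python
def parse_crawl_line(line):
    """Parse one 'key value key value ...' line into a dict.

    Assumption: every field in the line is a key followed by exactly one
    value, as in the sample lines quoted in this issue.
    """
    tokens = line.split()
    if len(tokens) % 2 != 0:
        raise ValueError(f"odd number of tokens: {line!r}")
    record = {}
    for key, value in zip(tokens[0::2], tokens[1::2]):
        # start_ts is a timestamp string; the other sample fields are integers.
        record[key] = value if key == "start_ts" else int(value)
    return record


sample = (
    "root 258 objectid 6385071 offset 540699 min_transid 1345204 "
    "max_transid 1345216 started 1537886291 start_ts 2018-09-25-10-38-11"
)
rec = parse_crawl_line(sample)
print(rec["root"], rec["objectid"], rec["min_transid"], rec["max_transid"])
```

Python integers are arbitrary precision, so the very large objectid in the root 5 line parses without overflow.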

The 'objectid' field is the inode number within the subvol. If you know the largest inode number in a subvol (let's call it max_objectid), then the estimated percentage progress is:

progress_percent = 100 * objectid / max_objectid
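
As a rough sketch, the formula can be wrapped in a small function. The clamp to 100 is an added assumption, not something bees does: it is there because the stored objectid can be far larger than any real inode number, as in the root 5 sample line. The `max_objectid` values below are made up for illustration.

```python
def progress_percent(objectid, max_objectid):
    """Estimate scan progress for one subvol from its crawl position.

    objectid     -- current inode number from beescrawl.dat
    max_objectid -- largest inode number in the subvol (obtained elsewhere)
    """
    if max_objectid <= 0:
        # Nothing to measure against; report no progress (an assumption).
        return 0.0
    # Clamp, since the iterator position may sit past the last real inode.
    return min(100.0, 100.0 * objectid / max_objectid)


# Hypothetical max_objectid values, chosen only for illustration.
print(progress_percent(6385071, 12770142))              # 50.0
print(progress_percent(18446744073709551611, 1000000))  # 100.0
```

The result is only an estimate: it treats inode numbers as if data were spread evenly across them.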

The scan mode (-m) option determines how subvols are scanned:

In mode 0, all subvols are scanned in lock-step, i.e. they all progress at the same rate, and they all restart at the same time.
In mode 1, all subvols are scanned in parallel with no synchronization. Each subvol scan restarts immediately. A small subvol will be scanned many times while a large subvol is scanned once.
In mode 2, all subvols are scanned in order of start_ts, with root ID to break ties. When a subvol is completed it will not be scanned again until all other subvols have been scanned.
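
The mode 2 ordering described above can be illustrated with a short sketch. This is not the bees implementation; it only shows the selection rule (earliest start time first, root ID as tie-breaker). It assumes the integer `started` field is the numeric form of `start_ts`, and `next_subvol_mode2` is a hypothetical name.

```python
def next_subvol_mode2(records):
    """Pick the subvol to scan next under the mode 2 ordering.

    records -- list of dicts with at least 'root' and 'started' keys.
    Assumption: 'started' carries the same instant as 'start_ts'.
    """
    if not records:
        return None
    # Earliest start time wins; the lower root ID breaks ties.
    return min(records, key=lambda r: (r["started"], r["root"]))


records = [
    {"root": 289, "started": 1537886291},
    {"root": 258, "started": 1537886291},
    {"root": 5, "started": 1537886400},
]
print(next_subvol_mode2(records)["root"])
```

Here roots 258 and 289 share the earliest start time, so the tie-break selects root 258 even though root 5 has the lowest ID.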

At the end of each subvol scan (100% completion), the max_transid field is copied to the min_transid field and the scan starts over. If no subvol has new data, scanning stops until 10 transids have passed.

When a new subvol is detected, the lowest value for any subvol's min_transid field is copied to the min_transid field of the new subvol, since any extent older than the lowest min_transid has already been scanned.

If the min_transid field of all subvols is non-zero then at least one scan has been completed for the entire filesystem.
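
The transid bookkeeping in the last three paragraphs can be sketched as follows. This is an illustration under stated assumptions, not bees code: the function names are hypothetical, records are plain dicts keyed like the beescrawl.dat fields, and resetting the position to zero is my reading of "the scan starts over".

```python
def finish_scan(record):
    """End of a subvol scan: the next scan begins where this range ended."""
    record["min_transid"] = record["max_transid"]
    # Assumption: starting over means the position returns to the beginning.
    record["objectid"] = 0
    record["offset"] = 0


def new_subvol_record(root, records):
    """Create a record for a newly detected subvol.

    It inherits the lowest min_transid of the existing subvols, since
    anything older than that has already been scanned.
    """
    lowest = min((r["min_transid"] for r in records), default=0)
    return {"root": root, "objectid": 0, "offset": 0, "min_transid": lowest}


def first_pass_complete(records):
    """True if every subvol has finished at least one scan."""
    return bool(records) and all(r["min_transid"] != 0 for r in records)


records = [
    {"root": 5, "objectid": 193891, "offset": 0,
     "min_transid": 0, "max_transid": 1345216},
    {"root": 258, "objectid": 6385071, "offset": 540699,
     "min_transid": 1345204, "max_transid": 1345216},
]
print(first_pass_complete(records))  # False: root 5 has min_transid 0
finish_scan(records[0])
print(first_pass_complete(records))  # True
print(new_subvol_record(300, records)["min_transid"])  # 1345204
```

The sample min_transid of 0 for root 5 is invented to show the incomplete case; the values quoted in this issue are all non-zero.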

For root 2 (extent) or root 7 (csum) scans, the objectid is a data block bytenr. In these cases the entire filesystem is scanned in a single pass. However, data block bytenrs are not contiguous, so producing an accurate estimate requires some extra work (scans of device tree 4) to determine which parts of the bytenr space are occupied by data.

Zygo added the enhancement label Sep 25, 2018
