Make progress indicator more predictable #286

mvglasow · 2022-08-17T17:59:36Z

Running duperemove gives a percentage indicator, which, however, presents two surprises:

- The percentage indicator is limited to the indexing phase. When it reaches 100%, it doesn’t mean everything is deduplicated – we’re just moving on to the next phase. (Office Space, anyone?)
- Percentages for the indexing phase seem to be calculated exclusively on file count, not on block count. This may lead to surprises if all the big files tend to be underneath one particular directory – around 50%, progress will either seem to speed up or slow down dramatically.

For first-time users, this can be a bit misleading. If progress reaches 10% after a day, one might expect the whole process to take 10 days, only to find 19 days later that indexing is still in progress, and 2 days later that indexing is just one out of two or three phases.

The holy grail of UX for progress would be a near-steady rate, though I understand that may be difficult, depending on circumstances.

Suggestions:

- Calculate the total number of files as well as the total number of blocks – should be fairly easy to do, just add up file sizes in blocks.
- For index progress, calculate the average of file and block progress. For example, after processing 80% of files but only 20% of blocks, progress should be 50% (currently 80%).
- To account for the total number of phases:
  - Easy: display something like phase 1/3, 80%
  - Advanced: guesstimate the percentage of total time each phase will take, and scale accordingly. For example, indexing might go from 0% to 40%, loading indexes from 40% to 60%, and actual deduplication from 60% to 100%.
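
To make the first two suggestions concrete, here is a rough sketch in Python (not duperemove's actual code, which is C). The function names and the 128 KiB block size are assumptions chosen for illustration only.

```python
# Hypothetical helpers illustrating the suggestion; not part of duperemove.

ASSUMED_BLOCK_SIZE = 128 * 1024  # assumption: block size in bytes


def total_blocks(file_sizes, block_size=ASSUMED_BLOCK_SIZE):
    """Add up file sizes in blocks, rounding each file up to whole blocks."""
    return sum(-(-size // block_size) for size in file_sizes)


def blended_progress(files_done, files_total, blocks_done, blocks_total):
    """Average of file-count progress and block-count progress, in [0, 1]."""
    # Treat an empty total as already complete to avoid dividing by zero.
    file_frac = files_done / files_total if files_total else 1.0
    block_frac = blocks_done / blocks_total if blocks_total else 1.0
    return (file_frac + block_frac) / 2


if __name__ == "__main__":
    # Example from the suggestion: 80% of files but only 20% of blocks done.
    print(f"{blended_progress(80, 100, 20, 100):.0%}")  # prints 50%
```

Averaging the two fractions is only one possible weighting. It dampens the jump that occurs when a directory full of large files is reached, but it does not remove it.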

For the last point, the advanced solution is probably suitable only if the duration of phases relative to each other is somewhat predictable from the moment indexing starts (else, extrapolations based on progress will become unreliable again). The easy solution is probably preferable if the duration of the phases is highly variable and cannot be predicted from the start.
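
Both variants for handling phases could look roughly like the sketch below. The phase names and the 40/20/40 split are taken from the example figures above and are guesses, not measurements; the function names are hypothetical.

```python
# Hypothetical phase table: (name, assumed share of total run time).
# The shares mirror the 0-40%, 40-60%, 60-100% example and are guesses.
PHASES = [
    ("indexing", 0.40),
    ("loading indexes", 0.20),
    ("deduplication", 0.40),
]


def easy_label(phase_index, phase_frac, phases=PHASES):
    """Easy variant: show the phase counter next to the per-phase percentage."""
    return f"phase {phase_index + 1}/{len(phases)}, {phase_frac:.0%}"


def overall_progress(phase_index, phase_frac, phases=PHASES):
    """Advanced variant: scale per-phase progress into one overall fraction."""
    # Shares of all phases that have already finished.
    completed = sum(share for _, share in phases[:phase_index])
    # Plus the finished part of the current phase's share.
    return completed + phases[phase_index][1] * phase_frac


if __name__ == "__main__":
    print(easy_label(0, 0.8))                # phase 1/3, 80%
    print(f"{overall_progress(1, 0.5):.0%}")  # halfway through loading: 50%
```

If the real phase durations differ a lot from the assumed shares, the overall figure drifts in the same way the current indicator does, which is the argument for the easy variant in that case.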


JackSlateur mentioned this issue Nov 24, 2023

Heavy utilization on sda1 while running duperemove option=partial on sda2(,3,4,5,6) #306

Open
