Closed
Description
I think I have a reproducer script for a hanging duperemove. I initially wrote it to measure scalability bottlenecks in duperemove, but it looks like it gets duperemove stuck instead:
#!/usr/bin/env bash
rm -fv /tmp/h1K.db /tmp/h1M.db
# create a directory suitable for deduping:
# it contains 1M files of size 1024 bytes.
if [[ ! -d dd ]]; then
    echo "Creating directory structure, will take a minute"
    mkdir dd
    for d in $(seq 1 1000); do
        mkdir -v dd/$d
        for f in $(seq 1 1000); do
            printf "%*s" 1024 "$f" > dd/$d/$f
        done
    done
    sync
fi
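Since each file's content depends only on the inner-loop index $f, the tree above contains 1000 duplicate groups of 1000 identical 1024-byte files each. A quick way to confirm that structure without waiting on 1M files is to build a scaled-down copy (a sketch using 10x10 counts instead of 1000x1000; the temporary directory name is arbitrary):

```shell
#!/usr/bin/env bash
# Scaled-down copy of the reproducer's layout: 10 dirs x 10 files.
dir=$(mktemp -d)
for d in $(seq 1 10); do
    mkdir "$dir/$d"
    for f in $(seq 1 10); do
        # Same content generator as the reproducer: $f padded to 1024 bytes.
        printf "%*s" 1024 "$f" > "$dir/$d/$f"
    done
done
# Every file named "$f" has identical content regardless of "$d",
# so the number of distinct contents equals the inner-loop count.
distinct=$(find "$dir" -type f -exec md5sum {} + | awk '{print $1}' | sort -u | wc -l)
echo "distinct contents: $distinct"   # prints "distinct contents: 10"
rm -rf "$dir"
```

So duperemove should see a modest number of large, clean duplicate groups, which is why the runtime below looks pathological rather than merely slow.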
echo "duperemove defaults, batch of size 1M"
time { ./duperemove -q --batchsize=1000000 -rd --hashfile=/tmp/h1M.db dd/ >/dev/null 2>&1; }
echo "duperemove defaults, batch of size 1024"
time { ./duperemove -q -rd --hashfile=/tmp/h1K.db dd/ >/dev/null 2>&1; }

$ time ./bench.bash
duperemove defaults, batch of size 1M
^C^X
real 164m12,365s
user 154m14,324s
sys 1m42,050s
Note: there was no progress after more than two hours. I think it should finish in minutes (or tens of minutes at worst). I ran it on a compressed btrfs filesystem.
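One way to tell whether the stuck process is blocked in the kernel or spinning in userspace is to read its state and wait channel from /proc. A minimal sketch, using a `sleep` process as a stand-in for the hung duperemove (the real check would substitute duperemove's PID):

```shell
#!/usr/bin/env bash
# Stand-in for the hung process; for the real case use: pid=$(pidof duperemove)
sleep 30 &
pid=$!
sleep 0.2   # give it a moment to block in the kernel

# Field 3 of /proc/<pid>/stat is the process state:
#   S = sleeping (blocked), R = running (spinning), D = uninterruptible I/O wait.
state=$(awk '{print $3}' /proc/$pid/stat)
# /proc/<pid>/wchan names the kernel function the process is blocked in ("0" if running).
wchan=$(cat /proc/$pid/wchan)
echo "pid=$pid state=$state wchan=$wchan"

kill $pid 2>/dev/null
```

If the state is R with no forward progress, a userspace backtrace (e.g. `gdb -p <pid> -batch -ex 'thread apply all bt'`) would show where duperemove is looping; attaching that output to this issue would likely help narrow it down.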