Home
This tool is still under development; perhaps have a look at the TODO.
However, apart from some quirks and presentation bugs, it works as shown below.
This tool was created after observing that some faulty hardware introduced data errors into the image taken by ddrescue. I don't think ddrescue was the culprit; rather, the hardware problems prevented all data from being transferred correctly to the other side over sshfs.
- Important: By default, this tool skips blocks of sectors that are smaller than 64 KB. If you want to check smaller (successfully read) blocks too, use option -s 4K or similar. -s0 disables skipping entirely; note that any value below the block size of your drive probably has the same effect.
- Option -m 1M is the default. This means that one checksum is generated for each 1 MiB block of data seen. This is usually a good value. If you have too little bandwidth, you can lower it to reduce the amount of data to re-read, but this increases the size of the check data you have to transfer to the broken system. With 1 MiB you get around 6 MiB of verify checksums per 100 GiB of image. Note that it does not make much sense to reduce -m below the block size of your device.
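The figure of about 6 MiB of check data per 100 GiB of image at 1 MiB blocks works out to roughly 61 bytes per checksum entry (6 MiB divided by 102400 blocks). The sketch below uses that number to estimate the check-file size for other block sizes. Both the function name and the assumption that the per-entry size stays constant are illustrative; neither comes from ddrescue-verify itself.

```shell
# estimate_check_bytes IMAGE_BYTES BLOCK_BYTES
# Rough size of the check file, in bytes.
estimate_check_bytes() {
  image_bytes=$1
  block_bytes=$2
  # Assumption: ~61 bytes per entry, derived from "6 MiB per 100 GiB at 1 MiB".
  bytes_per_entry=61
  # Round up so that a partial last block still counts as one entry.
  blocks=$(( (image_bytes + block_bytes - 1) / block_bytes ))
  echo $(( blocks * bytes_per_entry ))
}

# 100 GiB image, 1 MiB blocks
estimate_check_bytes $((100 * 1024 * 1024 * 1024)) $((1024 * 1024))
```

Under the same assumption, a 64 KiB block size on a 100 GiB image would give about 95 MiB of check data, which illustrates the trade-off between re-read size and check-file size.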
This example assumes:
- You are on your own workstation, possibly behind a firewall.
- The broken machine is booted into some rescue system and accessible as root@broken.example.com from your workstation.
- You have some user@stable.example.com account on a server near broken.example.com and can access this account from your workstation.
- broken.example.com and stable.example.com can talk to each other directly (read: use ssh to log in to each other), probably on a faster network connection than your workstation has.
- If your workstation takes the role of stable.example.com, then ssh tunnels are probably your friend. This is beyond the scope of this document.
Transfer the data from broken to stable:
ssh root@broken.example.com 'cd /; umount /mnt; mountpoint /mnt ||
{ sshfs -C -o cache=no user@stable.example.com:/data/`hostname -f` /mnt && sleep 1 && cd /mnt &&
yes Q | ddrescue /dev/sda sda.img sda.log; }'
Repeat this each time it breaks, until the full image has been taken. This can take hours, because the data is transferred over the network, the machine may lock up (due to faulty hardware), and so on.
Note that this is nothing special; it is the normal approach you would take with ddrescue. So nothing new here: ddrescue-verify was designed not to change anything with respect to ddrescue, so you can use ddrescue-verify even with old images, provided you still have the ddrescue logfile around.
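Since the transfer may have to be restarted several times, a small retry wrapper can save some typing. This is only a sketch: the retry function is a generic helper written for this page, not part of ddrescue or ddrescue-verify, and it assumes that a lock-up or dropped connection shows up as a nonzero exit status of the command.

```shell
# retry MAX DELAY CMD...
# Run CMD until it exits with status 0, at most MAX times,
# sleeping DELAY seconds between attempts.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For example, prefixing the ssh command above with `retry 20 60` would try it up to 20 times with a minute between attempts. A zero exit status only says that the command ran through, so sda.log remains the place to see which areas were actually rescued.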
On the stable system, create the hashes, so that transferring the checks goes more quickly:
ssh user@stable.example.com 'cd /data/broken.example.com &&
ddrescue-verify sda.img sda.log > sda.check'
This just needs the "original" ddrescue logfile and the original image taken by ddrescue. There is no need to remember any options: ddrescue-verify automatically detects the mode of operation, because it sees a ddrescue log and not its own output.
scp user@stable.example.com:/data/broken.example.com/sda.check root@broken.example.com:.
ssh root@broken.example.com 'apt-get install build-essential git;
git clone https://github.com/hilbix/ddrescue-verify;
cd ddrescue-verify && git pull && git submodule update --init && make &&
./ddrescue-verify -dui /dev/sda ~/sda.check' > sda.verify
sda.check contains checksums only for those parts that were listed as a success in sda.log. So no defective parts of the drive are touched.
If this hangs (because the machine is broken), you need to restart the process.
Please note that restarting is not yet implemented (sorry).
There is an option -c 0xXXXXXXX to restart the process from where it hung, but this is not really what you want. So this option needs improvement and will be changed in the future.
If you are puzzled about where to find the value 0xXXXXXXX: it is taken from tail sda.verify. But this is not very satisfying today, as this value shows the last difference, not the last working position. So you lose some effort.
Note that I will not improve this until I need to use ddrescue-verify myself again. If you like, you can update it and send me a pull request. But please drop your copyright on these changes, else I cannot merge them back.
With the list of changes in sda.verify you can update the parts of the image which are different:
scp sda.verify user@stable.example.com:/data/broken.example.com/sda.verify
scp sda.verify root@broken.example.com:.
ssh root@broken.example.com 'cd /; umount /mnt; mountpoint /mnt ||
{ sshfs -C -o cache=no user@stable.example.com:/data/`hostname -f` /mnt && sleep 1 && cd /mnt &&
yes Q | ddrescue /dev/sda sda.img sda.verify; }'
This pulls the changes which are listed in sda.verify and updates the image. Note that afterwards sda.verify is no longer of interest, as some information from sda.log may be lost in it. However, sda.log is still around, so you will continue to use that as the proper source!
You can repeat this step until it is complete, as usual. As sda.verify is based on sda.log, no part of the drive is accessed which is known to be defective.
Now jump to the second step above:
- Create a new verification file sda.check
- Check for differences again, in case the broken system was still lying to you.
- Update the image
Do this until you are satisfied. This is probably when no more changes are detected by ddrescue-verify.