md/raid1: don't abort a resync on the first badblock.

If a resync of a RAID1 array with 2 devices finds a known bad block on
one device, it will neither read from nor write to that device for
this block offset.
So there will be one read target (the other device) and zero write
targets.
This condition causes md/raid1 to abort the resync, assuming that it
has finished. Without known bad blocks this would be true.

When there are no write targets because of the presence of bad blocks,
we should skip over only the area covered by the bad block.
RAID10 already gets this right; raid1 doesn't.  Or didn't.

As this can cause a 'sync' to abort early and appear to have succeeded,
it could lead to some data corruption, so it is suitable for -stable.

Cc: stable@vger.kernel.org
Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
commit b7219ccb33aa0df9949a60c68b5e9f712615e56f
parent 90cf195
NeilBrown authored July 31, 2012

Showing 1 changed file with 4 additions and 1 deletion.

--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2502,7 +2502,10 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp
 		/* There is nowhere to write, so all non-sync
 		 * drives must be failed - so we are finished
 		 */
-		sector_t rv = max_sector - sector_nr;
+		sector_t rv;
+		if (min_bad > 0)
+			max_sector = sector_nr + min_bad;
+		rv = max_sector - sector_nr;
 		*skipped = 1;
 		put_buf(r1_bio);
 		return rv;
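
The effect of the change on the returned skip length can be sketched in user space. This is a minimal model, not the kernel code: `skip_len_old` and `skip_len_new` are hypothetical helper names, `sector_t` is stood in for by `uint64_t`, and `min_bad` is assumed to be the number of sectors, starting at `sector_nr`, that are covered by the known bad block (zero when no bad block is involved).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: user-space stand-in for the kernel's sector_t. */
typedef uint64_t sector_t;

/*
 * Behaviour before the patch: with no write targets, everything from
 * sector_nr up to max_sector is reported as skipped, which ends the resync.
 */
sector_t skip_len_old(sector_t sector_nr, sector_t max_sector)
{
	assert(sector_nr <= max_sector);
	return max_sector - sector_nr;
}

/*
 * Behaviour after the patch: if the lack of write targets is caused by a
 * bad block (min_bad > 0, an assumed meaning), only the bad range is
 * skipped. Otherwise the result matches the old behaviour.
 */
sector_t skip_len_new(sector_t sector_nr, sector_t max_sector, int min_bad)
{
	assert(sector_nr <= max_sector);
	if (min_bad > 0)
		max_sector = sector_nr + (sector_t)min_bad;
	return max_sector - sector_nr;
}
```

For example, with `sector_nr = 1000`, `max_sector = 5000` and `min_bad = 8`, the old calculation skips 4000 sectors (the rest of the array), while the patched calculation skips only 8 and lets the resync continue from sector 1008.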
