zpool scrub / status fails to address [read?] errors to disks #159

Open
jessegit opened this issue Jan 3, 2013 · 3 comments


jessegit commented Jan 3, 2013

I have a backup/archive pool (a single raidz1 vdev with 5 disks) in which one disk is failing. Since the pool is mostly write-only and the data can be collected again if a pool rebuild fails, I've been running zpool scrub to repair data errors on the failing disk, as some protection against losing one of the other disks.

After new writes to the pool, scrub always finds some read or checksum errors.

The pool used to give errors under Solaris 11:

NAME        STATE     READ WRITE CKSUM
    c0t4d0  ONLINE      12     0    14  458K repaired

All the scrubs on the pool under Solaris 11 returned read errors, with occasional checksum errors, on that disk.

Previously, on smartos, I've gotten:

NAME        STATE     READ WRITE CKSUM
    c0t4d0  ONLINE       0     0     2
and
    c0t4d0  ONLINE       0     0     6

with the total amount of repaired data shown above the vdev list, like:

  scan: scrub repaired 15.3M in 9h54m with 0 errors on Thu Jan  3 12:22:57 2013

The latest scrub I ran was on smartos 20121018T224723Z, but it reported (in addition to the scan: line above):

NAME        STATE     READ WRITE CKSUM
    c0t4d0  ONLINE       0     0     0

which makes me suspect that scrub is not counting read errors properly, or that zpool status is not reporting them properly.

It would also be nice to report the amount of repaired data per disk in addition to the total, as Solaris 11 does.


psy0rz commented Sep 26, 2013

Could this problem be related to this one? https://gist.github.com/psy0rz/6714836

Note that this is on the latest SmartOS release.


alcir commented Sep 26, 2013

Or to this: #259

jessegit (author) commented

I experienced the same thing as https://gist.github.com/psy0rz/6714836 (on another box, unrelated to the problem described above), and it was fixed by destroying and recreating the dump zvol, as described in #259. My guess: the on-disk format changed between OS versions, so the checksums are wrong.

This bug, however, stems from a real hardware problem that is not reflected in the zpool status output. The total number of bytes repaired is shown, but none of the disks in the pool is assigned any read or checksum errors. (It's also possible that checksum errors sometimes are not assigned to any disk, but at least that works some of the time, as seen above.)
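
For anyone who wants to check their own pools for the same mismatch, here is a rough Python sketch that flags `zpool status` text where the scan line reports repaired data but every device counter is zero. The sample text is pieced together from the output quoted in this issue, not a full capture. The function names are made up for illustration, and the regexes assume the older `scrub repaired 15.3M in ...` line format and the state names listed in the code.

```python
import re

# Sample assembled from the lines quoted in this issue (assumption: real
# `zpool status` output has more lines, such as the pool name and config header).
SAMPLE = """\
  scan: scrub repaired 15.3M in 9h54m with 0 errors on Thu Jan  3 12:22:57 2013

NAME        STATE     READ WRITE CKSUM
    c0t4d0  ONLINE       0     0     0
"""

# Assumed scan-line format: "scan: scrub repaired <amount> in ..."
SCAN_RE = re.compile(r"scan:\s+scrub repaired\s+(\S+)\s+in")
# Device line: name, state, then the READ, WRITE and CKSUM counters.
DEV_RE = re.compile(
    r"^\s*(\S+)\s+(?:ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)"
)


def repaired_amount(text):
    """Return the repaired amount from the scan line (e.g. '15.3M'), or None."""
    match = SCAN_RE.search(text)
    return match.group(1) if match else None


def device_errors(text):
    """Map each device name to its (READ, WRITE, CKSUM) counters as strings."""
    devices = {}
    for line in text.splitlines():
        match = DEV_RE.match(line)
        if match:
            devices[match.group(1)] = (match.group(2), match.group(3), match.group(4))
    return devices


def is_nonzero(amount):
    """True if an amount such as '15.3M', '0' or '0B' is greater than zero."""
    try:
        return float(amount.rstrip("BKMGTPE")) != 0
    except ValueError:
        return True  # unknown format: err on the side of reporting it


def unattributed_repair(text):
    """True if data was repaired but no device shows any error count."""
    amount = repaired_amount(text)
    if amount is None or not is_nonzero(amount):
        return False
    devices = device_errors(text)
    return bool(devices) and all(
        count == "0" for counts in devices.values() for count in counts
    )


if __name__ == "__main__":
    print(unattributed_repair(SAMPLE))
```

Run against the sample it prints `True`, which is the situation described above: 15.3M repaired, yet no disk carries a read or checksum count. To use it on a live system, feed it the captured text of `zpool status` for the pool instead of `SAMPLE`.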
