
Data corruption in cluster environment with shared storage on ZOL 0.7.0-rc5 and above #6603

Closed
arturpzol opened this issue Sep 5, 2017 · 10 comments

Comments

@arturpzol

System information

Type Version/Name
Distribution Name Debian Jessie
Distribution Version 8
Linux Kernel 4.4.45, 3.10
Architecture x86_64
ZFS Version 0.7.0-rc5 and above
SPL Version 0.7.1-1

Describe the problem you're observing

I experienced data corruption in a cluster environment (Corosync, Pacemaker) with shared storage after forcing a power-off of one of the cluster nodes (tested on KVM, VMware, and real hardware).

I have one pool:

zpool status
  pool: Pool-0
 state: ONLINE
  scan: none requested
config:

        NAME                                          STATE     READ WRITE CKSUM
        Pool-0                                        ONLINE       0     0     0
          mirror-0                                    ONLINE       0     0     0
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi3-0-4  ONLINE       0     0     0
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi3-0-3  ONLINE       0     0     0

with one zvol (primarycache=metadata, sync=always, logbias=throughput) which is shared with a client host.
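
For reference, the zvol was created roughly like this (the volume name, size, and block size below are placeholders, not the exact values from my setup):

    zfs create -V 100G -o volblocksize=128k \
        -o primarycache=metadata -o sync=always -o logbias=throughput \
        Pool-0/vol0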

After forcing a power-off of one of the cluster nodes, the second node takes over the resource and data corruption on the zvol can be observed.

I tested all 0.7.0 RC versions and it seems that a change in 0.7.0-rc5 affected synchronization. After reverting commit 1b7c1e5 the corruption did not occur anymore.

Additionally, I tried different volblocksize values for the zvol and it seems that only volumes with 64k and 128k block sizes have broken synchronization.
If I add a separate ZIL (SLOG) device to the pool, the corruption also does not happen.
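
For completeness, the log device was added along these lines (the device path is only a placeholder):

    zpool add Pool-0 log /dev/disk/by-id/<slog-device>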

I also reported this bug on #3577, but after deeper analysis I think it is a different bug.

@bunder2015
Contributor

bunder2015 commented Sep 5, 2017

Can you pull git head and try again? This might be something fixed by f763c3d.

@behlendorf
Contributor

@arturpzol can you describe the corruption you're able to reproduce?

This could potentially be related to f763c3d, which was addressed. Or, based on the commit you identified, it could be a bug introduced by converting some of the ZIL I/O to be asynchronous. Can you try increasing the zil_slog_bulk module option to a large value, say 1G (zil_slog_bulk=1073741824), and rerunning the test? This will effectively force the first 1G of ZIL writes per txg to be synchronous.
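
For example, something like the following should do it, assuming the parameter is writable at runtime on your build:

    # set for the running module
    echo 1073741824 > /sys/module/zfs/parameters/zil_slog_bulk

    # or persist it across module reloads
    echo "options zfs zil_slog_bulk=1073741824" >> /etc/modprobe.d/zfs.conf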

If you're not already aware of it, I'd also suggest enabling the new multihost feature when running in a failover environment.

     multihost=on|off
             Controls whether a pool activity check should be performed during
             zpool import.  When a pool is determined to be active it cannot
             be imported, even with the -f option.  This property is intended
             to be used in failover configurations where multiple hosts have
             access to a pool on shared storage.  When this property is on,
             periodic writes to storage occur to show the pool is in use.  See
             zfs_multihost_interval in the zfs-module-parameters(5) man page.
             In order to enable this property each host must set a unique
             hostid.  See genhostid(1), zgenhostid(8), and spl-module-parameters(5)
             for additional details.  The default value is off.
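
A rough sketch of enabling it on your pool (each node needs a distinct hostid first):

    # on each node, generate a unique /etc/hostid if one isn't already set
    zgenhostid

    # then turn on the activity check for the pool
    zpool set multihost=on Pool-0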

@behlendorf behlendorf added this to the 0.8.0 milestone Sep 5, 2017
@arturpzol
Author

I tried the git head source, but the corruption still occurred.

I also tried zfs-0.7.1 patched with only f763c3d, but again the corruption occurred.

With zil_slog_bulk=1073741824 set and multihost=on, unfortunately the result is the same.

Environment description:

Two physical or virtual machines share two disks. The zpool is created from the shared disks and uses a single mirror vdev, with two zvols (one with volblocksize=8k, the second with volblocksize=128k).
The cluster is set up using Corosync and Pacemaker. A client machine connects to the storage over iSCSI; SCST 3.0 is used to configure one iSCSI target with two LUNs, each LUN backed by one of the zvols on the pool. The pool and all zvols have the sync property set to always.
The client machine runs Windows and executes the bst5 test against the connected iSCSI storage. SCST is configured to use block IO, with write-through enabled for each LUN.
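
The property settings can be double-checked with something like the following (the zvol names here are placeholders for the two volumes in my setup):

    zfs get sync,logbias,primarycache,volblocksize Pool-0/vol8k Pool-0/vol128k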

Test description:

The test uses bst5 to write data from the Windows OS to the iSCSI-connected storage. During the write I force power off the node which currently has the pool imported. The second node takes over the cluster resource by importing the pool and configuring SCST to share the zvols as iSCSI LUNs. Bst5 is able to continue writing to the LUNs without breaking the test. When the sequential write finishes, I wait until the bst5 tool reads the data back with compare. Bst5 then reports a data mismatch error when it reads back the LUN backed by the zvol with volblocksize=128k. For the LUN backed by the zvol with volblocksize=8k no corruption is reported.

I also eliminated the cluster environment by using a single node which is force rebooted and then automatically imports the pool and configures SCST to share the zvols as iSCSI LUNs, but again the corruption occurred.
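
In the single-node variant the forced reboot was a hard reset; as a sketch, one way to trigger such a reset from the node itself (not necessarily what I used) is via sysrq:

    echo 1 > /proc/sys/kernel/sysrq
    echo b > /proc/sysrq-trigger   # reboot immediately, without syncing or unmounting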

Is it safe to revert 1b7c1e5 and use it with ZOL 0.7.1?

@arturpzol
Author

I set the write tool to save data with a block size of 128k and above on the zvol with volblocksize=128k, so the second condition in the source code:

module/zfs/zvol.c:

if (zilog->zl_logbias == ZFS_LOGBIAS_THROUGHPUT)
        write_state = WR_INDIRECT;
else if (!spa_has_slogs(zilog->zl_spa) &&
    size >= blocksize && blocksize > zvol_immediate_write_sz)
        write_state = WR_INDIRECT;
else if (sync)
        write_state = WR_COPIED;
else
        write_state = WR_NEED_COPY;

is true in my test, so write_state is set to WR_INDIRECT, and I assume that this causes the corruption.


Debug showed:

If volblocksize=128k and writing with block size 128k:

size: 131072 , blocksize: 131072 , zvol_immediate_write_sz: 32768

If volblocksize=128k and writing with block size 256k:

size: 262144 , blocksize: 131072 , zvol_immediate_write_sz: 32768

If volblocksize=128k and writing with block size 4M:

size: 262144 , blocksize: 131072 , zvol_immediate_write_sz: 32768

so each time the second condition is true, and the corruption occurred.


If volblocksize=128k and writing with block size 64k:

size: 65536 , blocksize: 131072 , zvol_immediate_write_sz: 32768

so the third condition is true, and the corruption did not occur.

Should if (sync) be the first condition, or is the second condition broken?

@behlendorf
Contributor

@arturpzol thanks for the additional information. I'm working on reproducing this issue with a simpler test case to debug it further.

@arturpzol
Author

arturpzol commented Sep 7, 2017

After performing the test scenario that caused the corruption with the changes below:

--- zfs/zvol.c  (revision 47737)
+++ zfs/zvol.c  (working copy)
@@ -684,13 +684,13 @@
        if (zil_replaying(zilog, tx))
                return;
 
-       if (zilog->zl_logbias == ZFS_LOGBIAS_THROUGHPUT)
+       if (sync)
+               write_state = WR_COPIED;
+       else if (zilog->zl_logbias == ZFS_LOGBIAS_THROUGHPUT)
                write_state = WR_INDIRECT;
        else if (!spa_has_slogs(zilog->zl_spa) &&
            size >= blocksize && blocksize > zvol_immediate_write_sz)
                write_state = WR_INDIRECT;
-       else if (sync)
-               write_state = WR_COPIED;
        else
                write_state = WR_NEED_COPY;

the issue did not occur anymore for the different volblocksize and write block size combinations.

@behlendorf
Contributor

@arturpzol that's one way to side-step the issue for the moment; it effectively disables WR_INDIRECT log records when sync=always. That will hurt performance, but it seems to avoid the problem until we have a root cause for the issue with indirect log records.

behlendorf added a commit to behlendorf/zfs that referenced this issue Sep 8, 2017
The portion of the zvol_replay_write() handler responsible for
replaying indirect log records for some reason never existed.
As a result indirect log records were not being correctly replayed.

This went largely unnoticed since the majority of zvol log records
were of the type WR_COPIED or WR_NEED_COPY prior to OpenZFS 7578.

This patch updates zvol_replay_write() to correctly handle these
log records and adds a new test case which verifies volume replay
to prevent any regression.  The existing test case which verified
replay on filesystem was renamed slog_replay_fs.ksh for clarity.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#6603
@behlendorf
Contributor

@arturpzol I've opened #6615, which fixes the root cause of this issue and adds a test case to prevent any regression. I'd appreciate it if you could also verify the fix in your environment. Thanks!
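
If it helps, one way to test it is to fetch the PR branch directly and rebuild; the steps below are just the usual build sequence, so adjust the configure options (e.g. SPL paths) for your environment:

    git clone https://github.com/zfsonlinux/zfs.git && cd zfs
    git fetch origin pull/6615/head:pr-6615 && git checkout pr-6615
    sh autogen.sh && ./configure && make -s -j$(nproc)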

@arturpzol
Author

@behlendorf it seems that the fix works. I performed all tests with different volblocksize and write block size combinations and the corruption did not occur. Thanks for the fix.

behlendorf added a commit to behlendorf/zfs that referenced this issue Sep 8, 2017
@behlendorf
Contributor

@arturpzol thanks for reporting this and verifying the fix. We'll get this into the zfs-0.7-release branch for the next point release.

@behlendorf behlendorf removed this from the 0.8.0 milestone Sep 8, 2017
tonyhutter pushed a commit that referenced this issue Sep 13, 2017
Fabian-Gruenbichler pushed a commit to Fabian-Gruenbichler/zfs that referenced this issue Sep 29, 2017