Commit f3c517d

ahrens authored and behlendorf committed
Illumos 5820 - verify failed in zio_done(): BP_EQUAL(bp, io_bp_orig)
5820 verify failed in zio_done(): BP_EQUAL(bp, io_bp_orig)

Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Approved by: Garrett D'Amore <garrett@damore.org>

References:
  https://www.illumos.org/issues/5820
  illumos/illumos-gate@34e8acef00

Ported-by: DHE <git@dehacked.net>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3364

1 parent 36c6ffb commit f3c517d

1 file changed, +24 -11 lines changed

module/zfs/dmu.c

Lines changed: 24 additions & 11 deletions
@@ -1656,19 +1656,32 @@ dmu_sync(zio_t *pio, uint64_t txg, dmu_sync_cb_t *done, zgd_t *zgd)
 	ASSERT(dr->dr_next == NULL || dr->dr_next->dr_txg < txg);
 
 	/*
-	 * Assume the on-disk data is X, the current syncing data is Y,
-	 * and the current in-memory data is Z (currently in dmu_sync).
-	 * X and Z are identical but Y is has been modified. Normally,
-	 * when X and Z are the same we will perform a nopwrite but if Y
-	 * is different we must disable nopwrite since the resulting write
-	 * of Y to disk can free the block containing X. If we allowed a
-	 * nopwrite to occur the block pointing to Z would reference a freed
-	 * block. Since this is a rare case we simplify this by disabling
-	 * nopwrite if the current dmu_sync-ing dbuf has been modified in
-	 * a previous transaction.
+	 * Assume the on-disk data is X, the current syncing data (in
+	 * txg - 1) is Y, and the current in-memory data is Z (currently
+	 * in dmu_sync).
+	 *
+	 * We usually want to perform a nopwrite if X and Z are the
+	 * same. However, if Y is different (i.e. the BP is going to
+	 * change before this write takes effect), then a nopwrite will
+	 * be incorrect - we would override with X, which could have
+	 * been freed when Y was written.
+	 *
+	 * (Note that this is not a concern when we are nop-writing from
+	 * syncing context, because X and Y must be identical, because
+	 * all previous txgs have been synced.)
+	 *
+	 * Therefore, we disable nopwrite if the current BP could change
+	 * before this TXG. There are two ways it could change: by
+	 * being dirty (dr_next is non-NULL), or by being freed
+	 * (dnode_block_freed()). This behavior is verified by
+	 * zio_done(), which VERIFYs that the override BP is identical
+	 * to the on-disk BP.
 	 */
-	if (dr->dr_next)
+	DB_DNODE_ENTER(db);
+	dn = DB_DNODE(db);
+	if (dr->dr_next != NULL || dnode_block_freed(dn, db->db_blkid))
 		zp.zp_nopwrite = B_FALSE;
+	DB_DNODE_EXIT(db);
 
 	ASSERT(dr->dr_txg == txg);
 	if (dr->dt.dl.dr_override_state == DR_IN_DMU_SYNC ||
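
The rule stated in the new comment can be modeled outside the kernel as a small predicate. The following is a minimal sketch, not the ZFS implementation: `struct toy_dirty_record` and `toy_nopwrite_allowed()` are hypothetical names made up for illustration, and the `block_freed` flag stands in for the result of `dnode_block_freed()`. It only shows that a nopwrite is permitted when neither of the two conditions named in the comment holds.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a dbuf dirty record. Only the two fields
 * the check needs are modeled; this is not the real structure.
 */
struct toy_dirty_record {
	unsigned long long txg;		/* txg this record belongs to */
	struct toy_dirty_record *next;	/* older dirty record, or NULL */
};

/*
 * Decide whether a nopwrite may be attempted for the record being
 * synced. The on-disk BP could still change before this txg in two
 * ways: the block is dirty in an earlier, unsynced txg (next != NULL),
 * or the block has a pending free (block_freed, an assumed input that
 * models dnode_block_freed()).
 */
static bool
toy_nopwrite_allowed(const struct toy_dirty_record *dr, bool block_freed)
{
	if (dr->next != NULL)
		return (false);
	if (block_freed)
		return (false);
	return (true);
}
```

With a record that has no older dirty record and no pending free, `toy_nopwrite_allowed()` returns true; either condition alone makes it return false, which mirrors the `zp.zp_nopwrite = B_FALSE` assignment in the hunk above.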
