Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle damaged blk_birth in dsl_deadlist_insert() #4089

Closed
wants to merge 1 commit into from

Conversation

behlendorf
Copy link
Contributor

If a bit were cleared in bp->blk_birth such that the txg birth
was now lower than any other txg_birth in the deadlist, then there
will be no entry before this in the tree.

This should be impossible but regardless error handling code has
been added for this case. By default this is left as a fatal case
and the blk_birth is logged. However, setting zfs_recover=1 will
cause the bp to be placed at the start of the deadlist even though
it contains an invalid blk_birth.

Signed-off-by: Brian Behlendorf behlendorf1@llnl.gov

If a bit were cleared in `bp->blk_birth` such that the txg birth
was now lower than any other txg_birth in the deadlist, then there
will be no entry before this in the tree.

This should be impossible but regardless error handling code has
been added for this case.  By default this is left as a fatal case
and the blk_birth is logged.  However, setting `zfs_recover=1` will
cause the bp to be placed at the start of the deadlist even though
it contains an invalid blk_birth.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
@tuxoko
Copy link
Contributor

tuxoko commented Dec 11, 2015

@behlendorf
What would happen a bp belongs to other dle be put to the front?

@behlendorf
Copy link
Contributor Author

Yes, I should have mentioned the consequences. Assuming it's just the birth txg which has been damaged then this bp may end up being considered to be part of more snapshots than it really is. In which case it won't be freed at the right time and that space may be leaked. Not good, so we don't want it to be the default. But better than a panic which many people will treat as a total pool loss.

@tuxoko
Copy link
Contributor

tuxoko commented Dec 11, 2015

Yeah, I think you're right. And at least it wouldn't cause the block to be removed earlier than it should.
I also think we should get more info out of the bp, like type, level. In case this is caused be some software bug.

@behlendorf
Copy link
Contributor Author

I modeled the error message after those found in zfs_blkptr_verify() for consistency. But I completely agree that we should log more information while we can. I'd suggest adding a helper function so we could easily call this throughout the code. Probably a wrapper around zfs_panic_recover, maybe called zfs_blkptr_debug which takes a blkptr_t and a variable args error message.

ryao pushed a commit to ryao/zfs that referenced this pull request Jan 4, 2016
If a bit were cleared in `bp->blk_birth` such that the txg birth
was now lower than any other txg_birth in the deadlist, then there
will be no entry before this in the tree.

This should be impossible but regardless error handling code has
been added for this case.  By default this is left as a fatal case
and the blk_birth is logged.  However, setting `zfs_recover=1` will
cause the bp to be placed at the start of the deadlist even though
it contains an invalid blk_birth.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes openzfs#4086
Closes openzfs#4089
goulvenriou pushed a commit to Alyseo/zfs that referenced this pull request Jan 17, 2016
If a bit were cleared in `bp->blk_birth` such that the txg birth
was now lower than any other txg_birth in the deadlist, then there
will be no entry before this in the tree.

This should be impossible but regardless error handling code has
been added for this case.  By default this is left as a fatal case
and the blk_birth is logged.  However, setting `zfs_recover=1` will
cause the bp to be placed at the start of the deadlist even though
it contains an invalid blk_birth.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes openzfs#4086
Closes openzfs#4089
goulvenriou pushed a commit to Alyseo/zfs that referenced this pull request Feb 4, 2016
If a bit were cleared in `bp->blk_birth` such that the txg birth
was now lower than any other txg_birth in the deadlist, then there
will be no entry before this in the tree.

This should be impossible but regardless error handling code has
been added for this case.  By default this is left as a fatal case
and the blk_birth is logged.  However, setting `zfs_recover=1` will
cause the bp to be placed at the start of the deadlist even though
it contains an invalid blk_birth.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Closes openzfs#4086
Closes openzfs#4089
@behlendorf behlendorf deleted the issue-4086 branch April 19, 2021 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants