Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reinstate raw receive check when truncating #8857

Merged
merged 1 commit into from Jun 6, 2019

Conversation

tcaputi
Copy link
Contributor

@tcaputi tcaputi commented Jun 5, 2019

This patch re-adds a check that was removed in 369aa50. The check
confirms that a raw receive is not occuring before truncating an
object's dn_maxblkid. At the time, it was believed that all cases
that would hit this code path would be handled in other places,
but that was not the case.

Signed-off-by: Tom Caputi tcaputi@datto.com

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@ahrens
Copy link
Member

ahrens commented Jun 6, 2019

At the time, it was believed that all cases
that would hit this code path would be handled in other places,
but that was not the case.

Could you describe in the comment how we get into the case where this is not handled?

@codecov
Copy link

codecov bot commented Jun 6, 2019

Codecov Report

Merging #8857 into master will increase coverage by 0.03%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8857      +/-   ##
==========================================
+ Coverage   78.64%   78.68%   +0.03%     
==========================================
  Files         382      382              
  Lines      117771   117771              
==========================================
+ Hits        92624    92663      +39     
+ Misses      25147    25108      -39
Flag Coverage Δ
#kernel 79.24% <100%> (-0.08%) ⬇️
#user 67.53% <100%> (+0.55%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8e91c5b...6db0b05. Read the comment docs.

@tcaputi
Copy link
Contributor Author

tcaputi commented Jun 6, 2019

@ahrens I still don't have a reproducer, but my current theory is that this can happen if you write some data to the last block of a file and then later replace that data with zeros with compression on. I believe that this will cause the block to be replaced with a hole, even though its not technically a truncation. The receive side will still just get an aggregated DRR_FREE record that includes the hole, so without this check it will attempt to do a real truncation.

@kpande What test case? I'm sorry I don't remember.

@ahrens
Copy link
Member

ahrens commented Jun 6, 2019

@tcaputi That makes sense. The way you've described it sounds straightforward to reproduce. Could you verify and then add a comment explaining this? E.g. something like Usually, the last FREE record will be at the maxblkid, because the source system sets the maxblkid when truncating. However, if the last block was freed by overwriting with zeros and being compressed away to a hole, the source system will generate a FREE record while leaving maxblkid after the end of the FREE record. In this case we need to leave the maxblkid as indicated in the OBJECT record, so that it matches the source system, so that the cryptographic hashes will match.

@tcaputi
Copy link
Contributor Author

tcaputi commented Jun 6, 2019

@ahrens I will add the comment. I verified that this problem can occur, but i needed to use pv to slow down the zfs receive rate enough to ensure that a txg sync happened between the DRR_OBJECT record and the DRR_FREE record. I'm not sure how to add that to a ZTS test case.

@tcaputi
Copy link
Contributor Author

tcaputi commented Jun 6, 2019

For documentation, this is the script I used to reproduce the issue:

zpool create -f pool sdb
echo password | zfs create -o compression=on -o encryption=on -o keyformat=passphrase pool/test

yes | dd of=/pool/test/yes.txt bs=128k count=4 iflag=fullblock
zpool sync pool
zdb -e -ddddddd pool/test 2
cat /dev/zero | dd of=/pool/test/yes.txt bs=128k seek=3 count=1 iflag=fullblock
zpool sync pool
zdb -e -ddddddd pool/test 2

zfs snap pool/test@1
zfs send -w pool/test@1 | pv -L 400 | zfs recv pool/recv
echo password | zfs mount -l pool/recv

@rjhind
Copy link

rjhind commented Jun 6, 2019

but i needed to use pv to slow down the zfs receive rate enough to ensure that a txg sync happened between the DRR_OBJECT record and the DRR_FREE record

I'm generally using pv between send/receive as well when I've been hitting this.

This patch re-adds a check that was removed in 369aa50. The check
confirms that a raw receive is not occuring before truncating an
object's dn_maxblkid. At the time, it was believed that all cases
that would hit this code path would be handled in other places,
but that was not the case.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jun 6, 2019
@behlendorf behlendorf added this to PRs to apply in 0.8-release Jun 6, 2019
@behlendorf behlendorf merged commit fd7a657 into openzfs:master Jun 6, 2019
behlendorf pushed a commit that referenced this pull request Jun 7, 2019
This patch re-adds a check that was removed in 369aa50. The check
confirms that a raw receive is not occuring before truncating an
object's dn_maxblkid. At the time, it was believed that all cases
that would hit this code path would be handled in other places,
but that was not the case.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8852 
Closes #8857
@behlendorf behlendorf added this to Applied in 0.8.1 Jun 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
No open projects
0.8.1
  
Applied
Development

Successfully merging this pull request may close these issues.

None yet

5 participants