@Fabian-Gruenbichler
Contributor

this combines changes from

e6d3a84

OpenZFS 6393 - zfs receive a full send as a clone

and

50c957f

Implement large_dnode pool feature

to hopefully allow sending regular streams from 0.7.x to 0.6.5.x based
systems. the problematic records of the following kind now no longer lead
to an infinite loop, but instead allow the receive to complete:

drr_type = FREEOBJECTS firstobj = 64 numobjs = 36028797018963904 err = 0

see issues #5699 (older incompatibility between FreeNAS and <= 0.6.5.11)
and #6507 (recent incompatibility between 0.7.x and <= 0.6.5.11)
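A quick arithmetic check on the record quoted above, as a minimal sketch: `firstobj + numobjs` works out to exactly 2^55, so the record asks the receiver to free every object number from 64 up to (but not including) 2^55. The helper name `freeobjects_range_end` and the two constants are made up for illustration and are not part of ZFS.

```c
#include <stdint.h>

/* Values copied from the FREEOBJECTS record quoted above. */
static const uint64_t rec_firstobj = 64ULL;
static const uint64_t rec_numobjs = 36028797018963904ULL;

/*
 * Hypothetical helper: exclusive end of the object range
 * [firstobj, firstobj + numobjs) described by a FREEOBJECTS record.
 */
static uint64_t
freeobjects_range_end(uint64_t firstobj, uint64_t numobjs)
{
	return (firstobj + numobjs);
}
```

Calling `freeobjects_range_end(rec_firstobj, rec_numobjs)` yields `1ULL << 55`, which is why stepping through the range one object number at a time is not practical.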

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

Motivation and Context

while I get that ZoL does not guarantee API/stream format compatibility at the moment, I still think this breakage was very unfortunate - I am sure there are a lot of setups out there with mixed versions.

this also affected Illumos: https://www.mail-archive.com/discuss@lists.illumos.org/msg02735.html

one other possible way to fix this would be to add a switch to zfs send / a module parameter, which allows reverting to the old behaviour of not generating FREEOBJECTS records with huge numobjs, but the code has undergone a lot of changes since 0.6.5.11 and I am not sure how feasible this would be? the quick fix suggested in the gist above does not seem to be enough (it allows an initial full stream to be received, but the hang still occurs on incremental send/receive).

How Has This Been Tested?

as this is still an RFC/WIP, I (only) did some quick tests by sending from 0.7.1 to 0.6.5.11 with this patch applied. both full and incremental streams work and the resulting data is bit-identical to the source. punching holes and then sending those changes as incremental stream works as expected as well.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

Requires-spl: refs/pull/647/head
@Fabian-Gruenbichler force-pushed the pull/0.6.5.x-receive-compat branch from 57df39f to 397f816 on September 5, 2017 12:18
@behlendorf
Contributor

@Fabian-Gruenbichler thanks for opening this. We definitely never intended to break compatibility in this way. I completely agree that, as long as you haven't enabled certain incompatible features, a system running 0.6.5 should be able to receive the stream. It sounds like there are two incompatibilities here.

  • The changes introduced by OpenZFS 6393 regarding free objects. While we could fix this on the receive side, I think it would be better to address it on the send side. We've already pulled the upstream workaround for this into 0.7, but it looks like we failed to expose it as a module option. Can you open a PR for this? The following patch should do the job, but we'll also need to add a section to man/man5/zfs-module-parameters.5.
diff --git a/module/zfs/dmu_send.c b/module/zfs/dmu_send.c
index 882926b..5eb5228 100644
--- a/module/zfs/dmu_send.c
+++ b/module/zfs/dmu_send.c
@@ -4010,6 +4010,10 @@ dmu_objset_is_receiving(objset_t *os)
 }
 
 #if defined(_KERNEL)
+module_param(zfs_send_set_freerecords_bit, int, 0644);
+MODULE_PARM_DESC(zfs_send_set_freerecords_bit,
+       "Disable setting of DRR_FLAG_FREERECORD");
+
 module_param(zfs_send_corrupt_data, int, 0644);
 MODULE_PARM_DESC(zfs_send_corrupt_data, "Allow sending corrupt data");
 #endif
  • As for the second issue I'm not sure I understand. As long as you haven't enabled the large dnode feature you should be able to send your filesystem to a system running 0.6.5. But it sounds like that's not the case?

Also if you get a chance can you please refresh this PR as-is to trigger a fresh automated testing run.

@loli10K
Contributor

loli10K commented Sep 8, 2017

> I think it would be better to address it on the send side

Agreed. Unfortunately, OpenZFS 6536 ("zfs send: want a way to disable setting of DRR_FLAG_FREERECORDS") doesn't seem to work quite the way it was intended; this was discussed here: https://www.mail-archive.com/discuss@lists.illumos.org/msg02734.html

The following patch was proposed:

diff --git a/usr/src/uts/common/fs/zfs/dmu_send.c b/usr/src/uts/common/fs/zfs/dmu_send.c
index 01a4514..db60198 100644
--- a/usr/src/uts/common/fs/zfs/dmu_send.c
+++ b/usr/src/uts/common/fs/zfs/dmu_send.c
@@ -190,6 +190,15 @@ dump_free(dmu_sendarg_t *dsp, uint64_t object, uint64_t offset,
 	    (object == dsp->dsa_last_data_object &&
 	    offset > dsp->dsa_last_data_offset));

+	/*
+	 * If we are doing a non-incremental send, then there can't
+	 * be any data in the dataset we're receiving into.  Unless we're receiving
+	 * a full send as a clone, a free record would simply be a no-op.  
+	 * If we disable the tunable for this, save space by not sending it to
+	 * begin with.
+	 */
+	if (!zfs_send_set_freerecords_bit && !dsp->dsa_incremental)
+		return (0);
+
 	if (length != -1ULL && offset + length < offset)
 		length = -1ULL;

@@ -388,6 +397,10 @@ dump_freeobjects(dmu_sendarg_t *dsp, uint64_t firstobj, uint64_t numobjs)
 {
 	struct drr_freeobjects *drrfo = &(dsp->dsa_drr->drr_u.drr_freeobjects);

+	/* See comment in dump_free(). */
+	if (!zfs_send_set_freerecords_bit && !dsp->dsa_incremental)
+		return (0);
+
 	/*
 	 * If there is a pending op, but it's not PENDING_FREEOBJECTS,
 	 * push it out, since free block aggregation can only be done for
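
As a rough illustration of the check added in both hunks above, here is a stand-alone model of the condition. The function name `toy_skip_free_record` is made up for this sketch; only the boolean expression is taken from the proposed patch. It shows that free records are dropped only when the tunable is off and the send is non-incremental, so incremental streams still carry them.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the condition in the proposed patch:
 * skip emitting a free record only if the tunable is disabled
 * AND the send is a full (non-incremental) one.
 */
static bool
toy_skip_free_record(int set_freerecords_bit, bool incremental)
{
	return (!set_freerecords_bit && !incremental);
}
```

With the tunable set to 0, `toy_skip_free_record(0, false)` is true but `toy_skip_free_record(0, true)` is false, which would be consistent with the observation in the PR description that a full stream can be received while incremental send/receive still hangs.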

@Fabian-Gruenbichler
Contributor Author

Fabian-Gruenbichler commented Sep 8, 2017 via email

@loli10K
Contributor

loli10K commented Sep 8, 2017

> like I wrote under 'Motivation and Context', I did in fact port that proposed fix to ZoL

@Fabian-Gruenbichler I'm sorry, but it wasn't really clear to me from your comment under "Motivation and Context" that you had personally ported and tested the patch on ZoL:

> one other possible way to fix this would be to add a switch to zfs send / a module parameter, which allows reverting to the old behaviour of not generating FREEOBJECTS records with huge numobjs, but the code has undergone a lot of changes since 0.6.5.11 and I am not sure how feasible this would be? the quick fix suggested in the gist above does not seem to be enough (it allows an initial full stream to be received, but the hang still occurs on incremental send/receive).

Anyway, thank you for working on this, this is clearly something useful.

@Fabian-Gruenbichler
Contributor Author

sorry for not formulating that one more clearly. also the description there is a bit outdated - AFAICT it is not the huge numobjs which is the problem, but the fact that a FREEOBJECTS record is generated for objects which do not exist on the receiving side (hence the proposed fix for the receiving side in this PR ;)).

@Fabian-Gruenbichler
Contributor Author

it is a combination of the above - the huge numobjs combined with actually iterating over the whole range instead of bailing out on the first object which does not exist. I am not yet sure what is happening: 0.6.5 also generates a stream containing a record with a huge numobjs, but does not attempt to restore that record when receiving (whereas it does for a stream generated with 0.7.x).
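
To make the "bail out instead of iterating" idea concrete, here is a toy model. It is not the ZFS receive code: `toy_objset_t`, `toy_object_next` and `toy_free_objects` are hypothetical names, and the object set is just a sorted array. The loop stops as soon as the iterator reports that no further object exists, so the amount of work depends on the number of allocated objects rather than on numobjs.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy object set: a sorted array of allocated object numbers. */
typedef struct {
	const uint64_t *objs;
	size_t count;
} toy_objset_t;

/*
 * Hypothetical iterator: find the first allocated object >= *objp.
 * Returns 0 and updates *objp on success; returns ESRCH and leaves
 * *objp unchanged when no such object exists.
 */
static int
toy_object_next(const toy_objset_t *os, uint64_t *objp)
{
	for (size_t i = 0; i < os->count; i++) {
		if (os->objs[i] >= *objp) {
			*objp = os->objs[i];
			return (0);
		}
	}
	return (ESRCH);
}

/*
 * "Free" all allocated objects in [firstobj, firstobj + numobjs) and
 * return how many were visited. The loop ends as soon as the iterator
 * fails or leaves the range, instead of stepping through every number.
 */
static uint64_t
toy_free_objects(const toy_objset_t *os, uint64_t firstobj, uint64_t numobjs)
{
	uint64_t end = firstobj + numobjs;
	uint64_t obj = firstobj;
	uint64_t freed = 0;

	if (end < firstobj)		/* clamp if the addition wrapped */
		end = UINT64_MAX;

	while (toy_object_next(os, &obj) == 0 && obj < end) {
		freed++;	/* a real receiver would free the object here */
		obj++;		/* resume the search after this object */
	}
	return (freed);
}
```

With the record from the description (firstobj = 64, numobjs = 36028797018963904) and a toy set containing objects 1, 5, 64, 70 and 1000, the loop visits three objects and then stops.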

dumps of an incremental stream, generated with zstreamdump:

incremental-dump-0.7.1.txt
incremental-dump-0.6.5.txt

@Fabian-Gruenbichler
Contributor Author

closing this in favor of #6616 for now
