[RFC/WIP] backport freeobjects receive handling from 0.7.x to 0.6.5.x #6602
Conversation
Force-pushed ab2e23b to 57df39f
this combines changes from e6d3a84 (OpenZFS 6393 - zfs receive a full send as a clone) and 50c957f (Implement large_dnode pool feature) to hopefully allow sending regular streams from 0.7.x to 0.6.5.x based systems.

the problematic records of the following kind now no longer lead to an infinite loop, but instead allow the receive to complete:

drr_type = FREEOBJECTS firstobj = 64 numobjs = 36028797018963904 err = 0

see issues openzfs#5699 (older incompatibility between FreeNAS and <= 0.6.5.11) and openzfs#6507 (recent incompatibility between 0.7.x and <= 0.6.5.11)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Requires-spl: refs/pull/647/head
Force-pushed 57df39f to 397f816
@Fabian-Gruenbichler thanks for opening this. We definitely never intended to break compatibility in this way. I completely agree that, as long as you haven't enabled certain incompatible features, a system running 0.6.5 should be able to receive the stream. It sounds like there are two incompatibilities here.
```diff
diff --git a/module/zfs/dmu_send.c b/module/zfs/dmu_send.c
index 882926b..5eb5228 100644
--- a/module/zfs/dmu_send.c
+++ b/module/zfs/dmu_send.c
@@ -4010,6 +4010,10 @@ dmu_objset_is_receiving(objset_t *os)
 }

 #if defined(_KERNEL)
+module_param(zfs_send_set_freerecords_bit, int, 0644);
+MODULE_PARM_DESC(zfs_send_set_freerecords_bit,
+	"Disable setting of DRR_FLAG_FREERECORD");
+
 module_param(zfs_send_corrupt_data, int, 0644);
 MODULE_PARM_DESC(zfs_send_corrupt_data, "Allow sending corrupt data");
 #endif
```
Also, if you get a chance, can you please refresh this PR as-is to trigger a fresh automated testing run.
> I think it would be better to address it on the send side

Agreed. Unfortunately _OpenZFS 6536 zfs send: want a way to disable setting of DRR_FLAG_FREERECORDS_ doesn't seem to quite work the way it was intended to; this was discussed here: https://www.mail-archive.com/discuss@lists.illumos.org/msg02734.html

The following patch was proposed:

```diff
diff --git a/usr/src/uts/common/fs/zfs/dmu_send.c b/usr/src/uts/common/fs/zfs/dmu_send.c
index 01a4514..db60198 100644
--- a/usr/src/uts/common/fs/zfs/dmu_send.c
+++ b/usr/src/uts/common/fs/zfs/dmu_send.c
@@ -190,6 +190,15 @@ dump_free(dmu_sendarg_t *dsp, uint64_t object, uint64_t offset,
 	    (object == dsp->dsa_last_data_object &&
 	    offset > dsp->dsa_last_data_offset));

+	/*
+	 * If we are doing a non-incremental send, then there can't
+	 * be any data in the dataset we're receiving into. Unless we're
+	 * receiving a full send as a clone, a free record would simply be
+	 * a no-op. If we disable the tunable for this, save space by not
+	 * sending it to begin with.
+	 */
+	if (!zfs_send_set_freerecords_bit && !dsp->dsa_incremental)
+		return (0);
+
 	if (length != -1ULL && offset + length < offset)
 		length = -1ULL;
@@ -388,6 +397,10 @@ dump_freeobjects(dmu_sendarg_t *dsp, uint64_t firstobj, uint64_t numobjs)
 {
 	struct drr_freeobjects *drrfo = &(dsp->dsa_drr->drr_u.drr_freeobjects);

+	/* See comment in dump_free(). */
+	if (!zfs_send_set_freerecords_bit && !dsp->dsa_incremental)
+		return (0);
+
 	/*
 	 * If there is a pending op, but it's not PENDING_FREEOBJECTS,
 	 * push it out, since free block aggregation can only be done for
```
like I wrote under 'Motivation and Context', I did in fact port that proposed fix to ZoL (the dsa_incremental member is gone nowadays, but easily replaced, and the bit needs exposing as a module parameter like @behlendorf suggested).

unfortunately, testing showed that it will only fix the full stream case. incremental streams will still cause the same hang on the receiving side, as they contain the same kind of FREEOBJECTS record.

I'll open another PR to investigate this issue on the sending side, starting with the module parameter and gist above; maybe we can find a solution based on that.
@Fabian-Gruenbichler I'm sorry, but it wasn't really clear to me from your comment under "Motivation and Context" that you had personally ported and tested the patch on ZoL.
Anyway, thank you for working on this - this is clearly something useful.
sorry for not formulating that more clearly. also, the description there is a bit outdated - AFAICT it is not the huge numobjs which is the problem, but the fact that a FREEOBJECTS record is generated for objects which do not exist on the receiving side (hence the proposed fix for the receiving side in this PR ;)).
it is a combination of the above: the huge numobjs combined with actually iterating over it instead of bailing out on the first object which does not exist. I am not yet sure what is happening, but 0.6.5 generates a stream with a record with a huge numobjs as well, yet does not attempt to restore that record when receiving (whereas it does for a stream generated with 0.7.x). dumps of an incremental stream, generated with zstreamdump:
closing this in favor of #6616 for now |
this combines changes from e6d3a84 and 50c957f to hopefully allow sending regular streams from 0.7.x to 0.6.5.x based systems.

the problematic records of the following kind now no longer lead to an infinite loop, but instead allow the receive to complete:

drr_type = FREEOBJECTS firstobj = 64 numobjs = 36028797018963904 err = 0

see issues #5699 (older incompatibility between FreeNAS and <= 0.6.5.11) and #6507 (recent incompatibility between 0.7.x and <= 0.6.5.11)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Motivation and Context
while I get that ZoL does not guarantee API/stream format compatibility at the moment, I still think this breakage was very unfortunate - I am sure there are a lot of setups out there with mixed versions.
this also affected Illumos: https://www.mail-archive.com/discuss@lists.illumos.org/msg02735.html
one other possible way to fix this would be to add a switch to zfs send / a module parameter which allows reverting to the old behaviour of not generating FREEOBJECTS records with huge numobjs, but the code has undergone a lot of changes since 0.6.5.11 and I am not sure how feasible this would be. the quick fix suggested in the gist above does not seem to be enough (it allows an initial full stream to be received, but the hang still occurs on incremental send/receive).
How Has This Been Tested?
as this is still an RFC/WIP, I (only) did some quick tests by sending from 0.7.1 to 0.6.5.11 with this patch applied. both full and incremental streams work, and the resulting data is bit-identical to the source. punching holes and then sending those changes as an incremental stream works as expected as well.