
Send/Recv backward compatibility is broken with ZFS v0.7.x #6507

Closed
ssergiienko opened this issue Aug 14, 2017 · 7 comments

Comments

@ssergiienko

System information

Type Version/Name
Distribution Name CentOS / ArchLinux
Distribution Version 7.3 / latest
Linux Kernel 3.10.0-514.6.1.el7.x86_64 / 4.12.6-1-ARCH
Architecture x64
ZFS Version 0.6.5.9-1.el7_3.centos.x86_64 / zfs-dkms-git 0.7.0_r16_g6a8ee4f71-1
SPL Version 0.6.5.9-1.el7_3.centos.x86_64 / spl-dkms-git 0.7.0_r8_g9243b0f-1

Describe the problem you're observing

The problem is that a ZFS stream generated by ZFS v0.7 can't be successfully received on v0.6.5.x. Note that both pools have exactly the same features enabled, and no special flags (such as stream compression or deduplication) were used during zfs send.

Features on ZFS v0.7 pool:

rpool  multihost                      off                            default
rpool  feature@async_destroy          enabled                        local
rpool  feature@empty_bpobj            active                         local
rpool  feature@lz4_compress           active                         local
rpool  feature@multi_vdev_crash_dump  disabled                       local
rpool  feature@spacemap_histogram     active                         local
rpool  feature@enabled_txg            disabled                       local
rpool  feature@hole_birth             disabled                       local
rpool  feature@extensible_dataset     disabled                       local
rpool  feature@embedded_data          disabled                       local
rpool  feature@bookmarks              disabled                       local
rpool  feature@filesystem_limits      disabled                       local
rpool  feature@large_blocks           disabled                       local
rpool  feature@large_dnode            disabled                       local
rpool  feature@sha512                 disabled                       local
rpool  feature@skein                  disabled                       local
rpool  feature@edonr                  disabled                       local
rpool  feature@userobj_accounting     disabled                       local

and its version:
[ 4.050576] ZFS: Loaded module v0.7.0-1, ZFS pool version 5000, ZFS filesystem version 5

My case:
I use my ArchLinux (ZFS ~v0.7) server as a backup server for a CentOS (ZFS 0.6.5.9) server. A full-replication ZFS stream generated by the CentOS server with zfs send -R is successfully received on ArchLinux, but when I tried to restore the CentOS server from such a backup with zfs send -R archlinux_pool/backup | ssh centos zfs receive centos_pool, it doesn't work: the zfs recv process hangs, consuming 100% CPU time (sy) while doing nothing. The system otherwise remains responsive.

Upgrading CentOS server to v0.7.1 before receiving this old stream fixes the problem.

I've also tried booting into two rescue systems (Debian with ZFS 0.6.5.x and FreeBSD 10.x), importing this CentOS pool, and then receiving the stream; the whole server hangs completely in both cases.

This is reproducible 100% of the time.

If this kind of backward compatibility is not guaranteed (really?), then zfs recv should immediately print an exact error instead of hanging the server. There should also be some kind of versioning, and probably a flag like zfs send --maximize-compatibility or --generate-for=0.6.5, or some other workaround, because upgrading ZFS on the receiving side is often not an option at all.

Describe how to reproduce the problem

  1. Create a CentOS server with ZFS v0.6.5.x
  2. Create a dataset on it; even an empty one is OK
  3. Create a backup server with ZFS v0.7.x
  4. zfs send -R from the first server to the second (succeeds)
  5. zfs recv from the second back to the first (the zfs process hangs, or even the whole server hangs)

Include any warning/errors/backtraces from the system logs

Unfortunately, no errors are seen in dmesg or on stderr/stdout.

@gmelikov
Member

Install the latest version: 0.6.5.11 or 0.7. 0.6.5.9 may have problems with recv from newer OpenZFS versions.

Closed as a duplicate of #5699.

@gmelikov
Member

Regarding compatibility: we don't break the on-disk data format, but the API and ABI will only be stabilized in the 1.0 release.

@ssergiienko
Author

ssergiienko commented Aug 14, 2017

OK, so it's just a problem specific to <= v0.6.5.9.

we don't break disk data format

Understood, but just to be clear: is the send/recv stream format fixed and/or versioned? @gmelikov

Thanks!

@DeHackEd
Contributor

There is some minimal version compatibility information in the stream, but it is only relevant when extra parameters such as zfs send -L ... are used. Otherwise, the stream format hasn't changed.

@ssergiienko
Author

@gmelikov Just tried with the Debian ZFS version you recommended and hit the same bug. The system halted when recv started.

[Thu Aug 17 23:44:29 2017] spl: loading out-of-tree module taints kernel.
[Thu Aug 17 23:44:29 2017] SPL: Loaded module v0.6.5.11-1~bpo9+1
[Thu Aug 17 23:44:29 2017] znvpair: module license 'CDDL' taints kernel.
[Thu Aug 17 23:44:29 2017] Disabling lock debugging due to kernel taint
[Thu Aug 17 23:44:29 2017] ZFS: Loaded module v0.6.5.11-1~bpo9+1, ZFS pool version 5000, ZFS filesystem version 5
[Thu Aug 17 23:44:29 2017] SPL: using hostid 0x00000000
root@rescue ~ # zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
rpool                892M  2.97G   136K  none
rpool/ROOT           890M  2.97G   136K  none
rpool/ROOT/default   890M  2.97G   890M  none

Sending side:

....
total estimated size is 1,48G
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
receiving full stream of rpool/backup@current into rpool@current
00:45:50   1,93M   rpool/backup/ROOT/default@bsnap
00:45:51   1,93M   rpool/backup/ROOT/default@bsnap
00:45:52   1,93M   rpool/backup/ROOT/default@bsnap
00:45:53   1,93M   rpool/backup/ROOT/default@bsnap
^C

Is the issue really fixed in v0.6.5.11, or was that just a guess? :)

@gmelikov
Member

IIRC there were some problems we fixed in 0.6.5.11, but it looks like that's not your case. Unfortunately, the best option is to update to 0.7 on both sides; this release includes many useful updates, and this behavior may occur on other OpenZFS platforms too. Fortunately, this is the only minor regression we have.

Be ready for API/ABI incompatibility problems before version 1.0.

Fabian-Gruenbichler added a commit to Fabian-Gruenbichler/zfs that referenced this issue Sep 5, 2017
this combines changes from

e6d3a84

    OpenZFS 6393 - zfs receive a full send as a clone

and

50c957f

    Implement large_dnode pool feature

to hopefully allow sending regular streams from 0.7.x to 0.6.5.x based
systems. The problematic records of the following kind now no longer lead
to an infinite loop, but instead allow the receive to complete:

drr_type = FREEOBJECTS firstobj = 64 numobjs = 36028797018963904 err = 0

see issues openzfs#5699 (older incompatibility between FreeNAS and <= 0.6.5.11)
and openzfs#6507 (recent incompatibility between 0.7.x and <= 0.6.5.11)
@Fabian-Gruenbichler
Contributor

see #6616 for some analysis and maybe further progress/workarounds

behlendorf pushed a commit that referenced this issue Oct 10, 2017
All objects after the last written or freed object are not supposed to
exist after receiving the stream.  Free them accordingly, as if a
freeobjects record for them had been included in the stream.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #5699
Closes #6507
Closes #6616
behlendorf pushed a commit that referenced this issue Oct 10, 2017
When sending an incremental stream based on a snapshot, the receiving
side must have the same base snapshot.  Thus we do not need to send
FREEOBJECTS records for any objects past the maximum one which exists
locally.

This allows us to send incremental streams (again) to older ZFS
implementations (e.g. ZoL < 0.7) which actually try to free all objects
in a FREEOBJECTS record, instead of bailing out early.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #5699
Closes #6507
Closes #6616