After connection problems: cannot receive incremental stream: destination ... has been modified #203

Open
pschwarzpp opened this issue Aug 14, 2023 · 12 comments


pschwarzpp commented Aug 14, 2023

We have been using zrep for multiple years to replicate over SSH, but in the last few months we have suffered from less-than-optimal connectivity, which is causing pain with zrep.

The error message is inaccurate, resulting from zrep being confused by its own error handling. When we encounter the issue, manual steps are needed to resolve it.

Order of events

Connection problem

We encounter a connection problem during zrep synconly; this log is from the source host, where we run the command:

packet_write_wait: Connection to x.x.x.x port 22: Broken pipe
packet_write_wait: Connection to UNKNOWN port 65535: Broken pipe
Error: Problem doing sync for esrepo/snap_esids@zrep_000bc6. Renamed to esrepo/snap_esids@zrep_000bc6_unsent

Next run fails

This happens with all subsequent runs:

cannot receive incremental stream: destination esrepo-targethost/snap_esids has been modified since most recent snapshot
warning: cannot send 'esrepo/snap_esids@zrep_000bc6_unsent': signal received
Error: Problem doing sync for esrepo/snap_esids@zrep_000bc7. Renamed to esrepo/snap_esids@zrep_000bc7_unsent

We usually do zrep snaponly followed by zrep synconly to ensure consistency with the systems running on top of the replicated dataset.
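
Roughly, each cycle looks like the following sketch (the quiesce/resume hooks are placeholders for our own consistency handling, not real commands, and the dataset name is just the one from this report):

app_quiesce                        # placeholder: flush/pause whatever runs on top of the dataset
zrep snaponly esrepo/snap_esids    # snapshot while the application state is consistent
app_resume                         # placeholder: let the application continue
zrep synconly esrepo/snap_esids    # replicate separately, so a slow link does not block the application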

Failing run with ZREP_VERBOSE

# ZREP_VERBOSE=true zrep synconly esrepo/snap_esids
sending esrepo/snap_esids@zrep_000bfb to targethost.domain.com:esrepo-targethost/snap_esids
send from @zrep_000bc5 to esrepo/snap_esids@zrep_000bc6_unsent estimated size is 2.90M
send from @zrep_000bc6_unsent to esrepo/snap_esids@zrep_000bc7_unsent estimated size is 6.42M
# more in between
send from @zrep_000bf8_unsent to esrepo/snap_esids@zrep_000bf9_unsent estimated size is 11.3M
send from @zrep_000bf9_unsent to esrepo/snap_esids@zrep_000bfa_unsent_unsent_unsent estimated size is 27.4M
# multiple _unsent suffixes because we did some failing synconly calls without snaponly in between
send from @zrep_000bfa_unsent_unsent_unsent to esrepo/snap_esids@zrep_000bfb estimated size is 13.6M
total estimated size is 2.57G
TIME        SENT   SNAPSHOT
cannot receive incremental stream: destination esrepo-targethost/snap_esids has been modified
since most recent snapshot
warning: cannot send 'esrepo/snap_esids@zrep_000bc6_unsent': signal received
Error: Problem doing sync for esrepo/snap_esids@zrep_000bfb. Renamed to esrepo/snap_esids@zrep_000bfb_unsent

Further info / workaround

Latest snapshot on target

$ zfs list -t snap -r esrepo-targethost/snap_esids | tail -1
esrepo-targethost/snap_esids@zrep_000bc6     0B      -     12.3G  -

But we see above that zrep thinks it needs to send from @zrep_000bc5 instead of @zrep_000bc6. @zrep_000bc6 has been renamed to @zrep_000bc6_unsent on the source.

Comparing @zrep_000bc6

Source

# zfs get guid esrepo/snap_esids@zrep_000bc6_unsent
NAME                                  PROPERTY  VALUE  SOURCE
esrepo/snap_esids@zrep_000bc6_unsent  guid      12096180512352983159  -

Target

$ zfs get guid esrepo-targethost/snap_esids@zrep_000bc6
NAME                                        PROPERTY  VALUE  SOURCE
esrepo-targethost/snap_esids@zrep_000bc6  guid      12096180512352983159  -

Fixing replication

Rename

Since @zrep_000bc6 has actually been successfully transferred (same guid), we rename it on the source so it has the same name as on the target:

# zfs rename esrepo/snap_esids@zrep_000bc6_unsent esrepo/snap_esids@zrep_000bc6
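
For a dataset that has accumulated several _unsent snapshots, a generalized version of this manual fix could look like the sketch below (untested; host and dataset names are the ones from this example, and it assumes the same SSH access zrep itself uses):

SRC_DS=esrepo/snap_esids
DST_DS=esrepo-targethost/snap_esids
DST_HOST=targethost.domain.com

# guid of the newest snapshot that actually exists on the target
LAST_DST=$(ssh "$DST_HOST" zfs list -H -t snapshot -o name -s creation -d 1 "$DST_DS" | tail -1)
DST_GUID=$(ssh "$DST_HOST" zfs get -H -o value guid "$LAST_DST")

# find the source snapshot with the same guid and strip any trailing _unsent suffixes
for snap in $(zfs list -H -t snapshot -o name -s creation -d 1 "$SRC_DS"); do
    if [ "$(zfs get -H -o value guid "$snap")" = "$DST_GUID" ]; then
        fixed=$(echo "$snap" | sed 's/\(_unsent\)*$//')
        [ "$fixed" != "$snap" ] && zfs rename "$snap" "$fixed"
    fi
done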

Rerun

... and run it again, successfully

# ZREP_VERBOSE=true zrep synconly esrepo/snap_esids
sending esrepo/snap_esids@zrep_000bfb_unsent to targethost.domain.com:esrepo-targethost/snap_esids
send from @zrep_000bc5 to esrepo/snap_esids@zrep_000bc6 estimated size is 2.90M
send from @zrep_000bc6 to esrepo/snap_esids@zrep_000bc7_unsent estimated size is 6.42M
send from @zrep_000bc7_unsent to esrepo/snap_esids@zrep_000bc8_unsent estimated size is 7.53M
send from @zrep_000bc8_unsent to esrepo/snap_esids@zrep_000bc9_unsent estimated size is 2.73M
send from @zrep_000bc9_unsent to esrepo/snap_esids@zrep_000bca_unsent estimated size is 4.49M
send from @zrep_000bca_unsent to esrepo/snap_esids@zrep_000bcb_unsent estimated size is 6.07M
# more in between
send from @zrep_000bf4_unsent to esrepo/snap_esids@zrep_000bf5_unsent estimated size is 6.74M
send from @zrep_000bf5_unsent to esrepo/snap_esids@zrep_000bf6_unsent estimated size is 6.10M
send from @zrep_000bf6_unsent to esrepo/snap_esids@zrep_000bf7_unsent estimated size is 11.9M
send from @zrep_000bf7_unsent to esrepo/snap_esids@zrep_000bf8_unsent estimated size is 10.6M
send from @zrep_000bf8_unsent to esrepo/snap_esids@zrep_000bf9_unsent estimated size is 11.3M
send from @zrep_000bf9_unsent to esrepo/snap_esids@zrep_000bfa_unsent_unsent_unsent estimated size is 27.4M
send from @zrep_000bfa_unsent_unsent_unsent to esrepo/snap_esids@zrep_000bfb_unsent estimated size is 13.6M
total estimated size is 2.57G
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
07:41:56   2.48M   esrepo/snap_esids@zrep_000bc7_unsent
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
07:42:00   5.89M   esrepo/snap_esids@zrep_000bcb_unsent
# cut out some progress
07:43:31   1013M   esrepo/snap_esids@zrep_000bf4_unsent
07:43:32   1.04G   esrepo/snap_esids@zrep_000bf4_unsent
07:43:33   1.09G   esrepo/snap_esids@zrep_000bf4_unsent
TIME        SENT   SNAPSHOT
07:43:35   3.16M   esrepo/snap_esids@zrep_000bf5_unsent
07:43:36   3.16M   esrepo/snap_esids@zrep_000bf5_unsent
07:43:37   3.16M   esrepo/snap_esids@zrep_000bf5_unsent
07:43:38   3.16M   esrepo/snap_esids@zrep_000bf5_unsent
07:43:39   3.16M   esrepo/snap_esids@zrep_000bf5_unsent
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
07:43:42   6.27M   esrepo/snap_esids@zrep_000bf8_unsent
TIME        SENT   SNAPSHOT
TIME        SENT   SNAPSHOT
07:43:44   21.8M   esrepo/snap_esids@zrep_000bfa_unsent_unsent_unsent
TIME        SENT   SNAPSHOT
07:43:45   3.00M   esrepo/snap_esids@zrep_000bfb_unsent
Also running expire on targethost.domain.com:esrepo-targethost/snap_esids now...
Expiring zrep snaps on esrepo-targethost/snap_esids

Version used

We are running zrep 1.9.0 on the source and 1.8.0 on the target. We are going to update (and align versions) soon, but from the changelog it's not obvious whether there were any related changes.

ppbrown (Member) commented Aug 15, 2023 via email

pschwarzpp (Author) commented Aug 16, 2023

I reproduced the issue with freshly installed VMs (Ubuntu Server 22.04) and the latest version of zrep:

root@zrep-source:~# zfs --version
zfs-2.1.5-1ubuntu6~22.04.1
zfs-kmod-2.1.5-1ubuntu6~22.04.1

The problem only occurs when the connection is dropped after the actual data has been copied by zfs; if it drops mid-stream, the next run won't face the problem. I assume this explains why, in our production setup, we almost only see the problem for a dataset with low activity and basically never for a much busier one.

To reproduce, I killed the SSH session on the target host with kill $(ps -ef | grep "sshd: root@notty" | grep -v grep | gawk '{print $2}').

Another transcript with a few annotations

Successful run before reproducing

root@zrep-source:~# zrep snaponly rpool/ds1
rpool/ds1@zrep_000014
root@zrep-source:~# zrep synconly rpool/ds1
sending rpool/ds1@zrep_000014 to zrep-target:rpool/ds1
send from @zrep_000012 to rpool/ds1@zrep_000013_unsent estimated size is 100M
send from @zrep_000013_unsent to rpool/ds1@zrep_000014 estimated size is 624B
total estimated size is 100M
TIME        SENT   SNAPSHOT rpool/ds1@zrep_000013_unsent
08:02:49   3.13M   rpool/ds1@zrep_000013_unsent
[...]
08:03:02   97.5M   rpool/ds1@zrep_000013_unsent
Also running expire on zrep-target:rpool/ds1 now...
Expiring zrep snaps on rpool/ds1

Write a bit of data

root@zrep-source:~# dd bs=1M if=/dev/urandom of=/rpool/ds1/50M.bin count=50 conv=fsync
50+0 records in
50+0 records out
52428800 bytes (52 MB, 50 MiB) copied, 12.9225 s, 4.1 MB/s

Next run, triggering the problem

Note that I kill SSH on the target just as the process stalls briefly at the end. You know you killed it at the right time if rpool/ds1@zrep_000015 becomes visible on the target a few seconds later (zfs list -t snap -r rpool/ds1).

root@zrep-source:~# zrep snaponly rpool/ds1
rpool/ds1@zrep_000015
root@zrep-source:~# zrep synconly rpool/ds1
sending rpool/ds1@zrep_000015 to zrep-target:rpool/ds1
send from @zrep_000014 to rpool/ds1@zrep_000015 estimated size is 50.1M
total estimated size is 50.1M
TIME        SENT   SNAPSHOT rpool/ds1@zrep_000015
08:04:33   6.40M   rpool/ds1@zrep_000015
08:04:34   16.0M   rpool/ds1@zrep_000015
08:04:35   25.6M   rpool/ds1@zrep_000015
08:04:36   35.1M   rpool/ds1@zrep_000015
08:04:37   44.6M   rpool/ds1@zrep_000015
Connection to zrep-target closed by remote host.
Error: Problem doing sync for rpool/ds1@zrep_000015. Renamed to rpool/ds1@zrep_000015_unsent

Subsequent runs fail

root@zrep-source:~# zrep synconly rpool/ds1
sending rpool/ds1@zrep_000015_unsent to zrep-target:rpool/ds1
send from @zrep_000014 to rpool/ds1@zrep_000015_unsent estimated size is 50.1M
total estimated size is 50.1M
cannot receive incremental stream: destination rpool/ds1 has been modified
since most recent snapshot
Error: Problem doing sync for rpool/ds1@zrep_000015_unsent. Renamed to rpool/ds1@zrep_000015_unsent_unsent
root@zrep-source:~# zrep snaponly rpool/ds1
rpool/ds1@zrep_000016
root@zrep-source:~# zrep synconly rpool/ds1
sending rpool/ds1@zrep_000016 to zrep-target:rpool/ds1
send from @zrep_000014 to rpool/ds1@zrep_000015_unsent_unsent estimated size is 50.1M
send from @zrep_000015_unsent_unsent to rpool/ds1@zrep_000016 estimated size is 624B
total estimated size is 50.1M
cannot receive incremental stream: destination rpool/ds1 has been modified
since most recent snapshot
Error: Problem doing sync for rpool/ds1@zrep_000016. Renamed to rpool/ds1@zrep_000016_unsent

Fix by renaming the latest actually transferred snapshot

root@zrep-source:~# zfs rename rpool/ds1@zrep_000015_unsent_unsent rpool/ds1@zrep_000015

Run again successfully


root@zrep-source:~# zrep synconly rpool/ds1
sending rpool/ds1@zrep_000016_unsent to zrep-target:rpool/ds1
send from @zrep_000014 to rpool/ds1@zrep_000015 estimated size is 50.1M
send from @zrep_000015 to rpool/ds1@zrep_000016_unsent estimated size is 624B
total estimated size is 50.1M
TIME        SENT   SNAPSHOT rpool/ds1@zrep_000015
08:05:46   11.5M   rpool/ds1@zrep_000015
08:05:47   21.2M   rpool/ds1@zrep_000015
08:05:48   30.7M   rpool/ds1@zrep_000015
08:05:49   40.2M   rpool/ds1@zrep_000015
08:05:50   49.7M   rpool/ds1@zrep_000015
Also running expire on zrep-target:rpool/ds1 now...
Expiring zrep snaps on rpool/ds1

ppbrown (Member) commented Aug 16, 2023 via email

pschwarzpp (Author) commented Aug 17, 2023

I see two main ways in which this could be resolved... but I am not an expert on zfs, so...

Potential solutions

a) Improve detection of snapshots on the target

To me it seems that, in the output above, zrep still transfers too much after the snapshot has been renamed on the source:

  • initially the transfer of rpool/ds1@zrep_000014 --> rpool/ds1@zrep_000015 failed (or rather: seemed to fail but actually completed), causing it to be renamed to rpool/ds1@zrep_000015_unsent
  • we then manually rename it back to rpool/ds1@zrep_000015 on the source
  • the next run seems to again transfer the delta from rpool/ds1@zrep_000014 --> rpool/ds1@zrep_000015

If zrep detected that rpool/ds1@zrep_000015_unsent on the source and rpool/ds1@zrep_000015 on the target are identical (they are, as we can tell from zfs get guid, see the first post), it would never try to transfer that delta again and we would probably not see the error at all. It would also be more efficient in this situation.
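
Not zrep's actual logic, just a sketch of what such a guid-aware base selection could look like, using the names from the reproduction above:

# guid of the newest snapshot present on the target
DST_GUID=$(ssh zrep-target zfs list -H -t snapshot -o guid -s creation -d 1 rpool/ds1 | tail -1)
# source snapshot with the same guid, regardless of any _unsent renaming
BASE=$(zfs list -H -t snapshot -o name,guid -d 1 rpool/ds1 | awk -v g="$DST_GUID" '$2 == g {print $1}')
# send only the deltas the target is really missing, up to the newest source snapshot
NEWEST=$(zfs list -H -t snapshot -o name -s creation -d 1 rpool/ds1 | tail -1)
zfs send -I "$BASE" "$NEWEST" | ssh zrep-target zfs recv rpool/ds1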

b) Don't rename to _unsent

Maybe we can get around the problem by not renaming upon error. I see that there is ZREP_RENAME_UNSENT, but since it does not just skip the rename but also destroys the (in our case successfully transferred) snapshot on the source (https://github.com/bolthole/zrep/blob/v2.0.2/zrep#L1866), I believe this option would make it even worse for us (we would then need to delete the snapshot on the target).

Preference

To me, a) seems much better; b) is more of a workaround. But I am not aware of how the current logic decides which deltas need to be transferred.

ppbrown (Member) commented Aug 17, 2023 via email

pschwarzpp (Author) commented Aug 17, 2023

I disagree with calling my suggestion above a workaround. Connection issues do happen, and a replication tool intended to be used across a network should recover from them. I would consider zrep not recognizing a snapshot it has transferred to be a bug.

We create read-only clones (zfs clone -o readonly=on) of the replicated snapshots for verification purposes (clones are mentioned in http://www.bolthole.com/solaris/zrep/zrep.documentation.html#tuning). Does this make "force" unfeasible for us?
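
For reference, the clones on the target look roughly like this (the clone name here is made up for illustration):

$ zfs clone -o readonly=on esrepo-targethost/snap_esids@zrep_000bc6 esrepo-targethost/verify_esids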

In the simple example above where the issue was reproduced, nothing was done on the target except killing SSH to simulate connection loss. Our mechanism that creates and cleans up the clones has never caused a "destination has been modified" error over the years (in fact, we never saw that error at all). The problem we are seeing now is purely due to the connection issues paired with zrep's error handling.

pschwarzpp (Author) commented Aug 17, 2023

Maybe a third strategy ("do nothing"), next to "rename to _unsent" and "delete if the transfer seems to have failed", would be a very easy fix.

If zrep simply keeps the snapshot but does not rename it, the system would be in the same state as it is after we manually rename the problematic snapshot on the source. Or am I missing something?

... maybe -r would behave that way?
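
Not zrep's actual code, just a sketch of the three on-error strategies being discussed here (the variable name is made up):

# $SNAP is the snapshot whose send just failed
case "$ON_SYNC_ERROR" in
    rename)  zfs rename "$SNAP" "${SNAP}_unsent" ;;  # current default behaviour
    destroy) zfs destroy "$SNAP" ;;                  # roughly what disabling ZREP_RENAME_UNSENT does
    keep)    : ;;                                    # proposed "do nothing": keep the snapshot as-is
esac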

ppbrown (Member) commented Aug 17, 2023 via email

pschwarzpp (Author) commented:

I have now tried ZREP_RESUME=1, which also does not rename the snapshot when a transfer fails. Using this option, I was unable to reproduce the problem (without it, I can reproduce it reliably).

I noticed that this is not exposed as a normal CLI argument for synconly (or am I making a mistake?), so I wonder if there is a reason not to use it.
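
If there are no known risks, the only change to our regular job would be setting the variable before the sync step, roughly like this (dataset name taken from the test VMs, not our real setup):

export ZREP_RESUME=1
zrep snaponly rpool/ds1 && zrep synconly rpool/ds1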

Transcript with ZREP_RESUME=1

Simulate connection error

root@zrep-source:~# dd bs=1M if=/dev/urandom of=/rpool/ds1/50M.bin count=50 conv=fsync && zrep snaponly rpool/ds1
50+0 records in
50+0 records out
52428800 bytes (52 MB, 50 MiB) copied, 12.2732 s, 4.3 MB/s
rpool/ds1@zrep_000026
root@zrep-source:~# ZREP_RESUME=1 zrep synconly rpool/ds1
sending rpool/ds1@zrep_000026 to zrep-target:rpool/ds1
send from @zrep_000025 to rpool/ds1@zrep_000026 estimated size is 50.1M
total estimated size is 50.1M
TIME        SENT   SNAPSHOT rpool/ds1@zrep_000026
07:42:06   11.7M   rpool/ds1@zrep_000026
07:42:07   21.2M   rpool/ds1@zrep_000026
07:42:08   30.7M   rpool/ds1@zrep_000026
07:42:09   40.2M   rpool/ds1@zrep_000026
07:42:10   49.9M   rpool/ds1@zrep_000026
Connection to zrep-target closed by remote host.
Error: Problem doing sync for rpool/ds1@zrep_000026.

State after connection error

A few seconds after the failed synconly, the snapshot rpool/ds1@zrep_000026 does show up on the target. Only when this happens is the error triggered (or rather, would be triggered):

iso@zrep-target:~$ zfs list -t snap
NAME                    USED  AVAIL     REFER  MOUNTPOINT
rpool/ds1@zrep_000021  50.0M      -     1.12G  -
rpool/ds1@zrep_000022    11K      -     1.12G  -
rpool/ds1@zrep_000023    11K      -     2.10G  -
rpool/ds1@zrep_000024  50.0M      -     2.10G  -
rpool/ds1@zrep_000025  50.0M      -     2.10G  -
rpool/ds1@zrep_000026     0B      -     2.10G  -

Run after connection error

Note that in this situation, too, zrep transfers the delta from @zrep_000025 to @zrep_000026 even though the snapshot already exists on the target (coming back to "a)" in #203 (comment)).

But since we at least have not renamed it on the source after it was copied to the target, this re-transmission works.

root@zrep-source:~# dd bs=1M if=/dev/urandom of=/rpool/ds1/50M.bin count=50 conv=fsync && zrep snaponly rpool/ds1
50+0 records in
50+0 records out
52428800 bytes (52 MB, 50 MiB) copied, 12.505 s, 4.2 MB/s
rpool/ds1@zrep_000027
root@zrep-source:~#
root@zrep-source:~# ZREP_RESUME=1 zrep synconly rpool/ds1
sending rpool/ds1@zrep_000027 to zrep-target:rpool/ds1
send from @zrep_000025 to rpool/ds1@zrep_000026 estimated size is 50.1M
send from @zrep_000026 to rpool/ds1@zrep_000027 estimated size is 50.1M
total estimated size is 100M
TIME        SENT   SNAPSHOT rpool/ds1@zrep_000026
07:43:19   11.5M   rpool/ds1@zrep_000026
07:43:20   21.1M   rpool/ds1@zrep_000026
07:43:21   30.7M   rpool/ds1@zrep_000026
07:43:22   40.2M   rpool/ds1@zrep_000026
07:43:23   49.7M   rpool/ds1@zrep_000026
TIME        SENT   SNAPSHOT rpool/ds1@zrep_000027
07:43:24   6.89M   rpool/ds1@zrep_000027
07:43:25   17.2M   rpool/ds1@zrep_000027
07:43:26   26.7M   rpool/ds1@zrep_000027
07:43:27   36.2M   rpool/ds1@zrep_000027
07:43:28   45.7M   rpool/ds1@zrep_000027
Also running expire on zrep-target:rpool/ds1 now...
Expiring zrep snaps on rpool/ds1
root@zrep-source:~#

Other

> your most recent debug output was described as replicating the problem, so I only looked deeply at that.
> But now you say that the "destination modified" problem is not part of your real problem.

No, sorry if my last post caused confusion. The "destination modified" error after connection loss is the problem affecting us.

ppbrown (Member) commented Aug 18, 2023 via email

pschwarzpp (Author) commented:

Could you please give insight on ZREP_RESUME=1? Are there additional risks / known reasons not to use it?

> I'm presuming this is because of your readonly clone making.
> Even if it theoretically shouldn't... I think it is pretty clear that it is.

It is unrelated. Again: on the VMs where I reproduced the issue, I am not doing any cloning.

ppbrown (Member) commented Aug 18, 2023 via email
