Failed init with resume fails and partial send is destroyed #149

Open
darkpixel opened this issue Jan 29, 2020 · 2 comments

@darkpixel
darkpixel commented Jan 29, 2020

root@usrbgofnas01:~# zrep -t zrep-remote init tank/pdata uswuxsdrtr01--redacted-- tank/backups/usrbgof/pdata
Setting zrep properties on tank/pdata
Creating snapshot tank/pdata@zrep-remote_000000
Sending initial replication stream to uswuxsdrtr01--redacted--:tank/backups/usrbgof/pdata
 169GiB 90:26:04 [ 705KiB/s] [ <=> ]
packet_write_wait: Connection to my.ip.add.res port 221: Broken pipe
aaron  ~  255  ssh root@usrbgofrtr01.--redacted-- -p 221

root@usrbgofnas01:~# cat /usr/local/etc/zrep.env
export SSH="ssh -p 225"
export ZREP_SEND_FLAGS="--raw"
export ZREP_RESUME=yes
export ZREP_R=-R
export ZREP_INC_FLAG=-i
export ZREP_OUTFILTER="pv -eIrabL 800K"
root@usrbgofnas01:~# source /usr/local/etc/zrep.env
root@usrbgofnas01:~# zrep -t zrep-remote init tank/pdata uswuxsdrtr01--redacted-- tank/backups/usrbgof/pdata
tank/pdata is at least partially configured by zrep
Partially complete init detected. Attempting to resume send
cannot receive incremental stream: incompatible embedded data stream feature with encrypted receive.
3.05MiB [ 395KiB/s] [ 395KiB/s] 
Error: resume send of zrep init tank/pdata failed
root@usrbgofnas01:~# 

I wasn't able to see the command it ran, but my guess, based on the error message, is that the --raw flag was missing from the zfs send command.
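A minimal sketch of what the guess above would mean in practice: carry the configured ZREP_SEND_FLAGS (here --raw) into the resume path as well as the initial send. build_resume_cmd and the token value are illustrative, not actual zrep code, and whether the real zfs send resume needs the flag repeated depends on the ZFS version; the command is only echoed, not run.

```shell
#!/bin/sh
# Hypothetical sketch of the reporter's guess: the resume path should
# pass ZREP_SEND_FLAGS (here --raw) just like the initial send does.
ZREP_SEND_FLAGS="--raw"

build_resume_cmd() {
    token="$1"
    # Without the raw flag, an encrypted receive can reject the stream
    # with the "incompatible embedded data stream feature with
    # encrypted receive" error shown above.
    echo "zfs send $ZREP_SEND_FLAGS -t $token"
}

build_resume_cmd "1-placeholder-token"
```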

@ppbrown ppbrown self-assigned this Mar 14, 2020
@ppbrown
Member

ppbrown commented Mar 14, 2020

Huh. Only just saw this one.

Erm...
Isn't the resume thing supposed to set all the flags automatically, below the zrep level?
Is this actually a ZFS bug?

@darkpixel
Author

ZFS doesn't transfer the children "in bulk". It transfers them individually, and the resume flag is set for each individual transfer. So if you have:

tank
tank/virt
tank/virt/vm-100-disk-0
tank/virt/vm-100-disk-1
tank/virt/vm-101-disk-0

And it sends vm-100-disk-0 and then bombs out on vm-100-disk-1, the next time zrep runs against tank/virt it will bomb out, because it's not checking whether the previous transfers completed successfully and it has no idea about the 'remote state'.
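Under the per-child behavior described above, each child dataset carries its own interrupted-receive state on the receiving side. One way to see which children actually need resuming (the dataset path is from the example above) is to query the receive_resume_token property recursively:

```shell
# '-' means no interrupted receive for that child; anything else is a
# resume token that can be fed back to 'zfs send -t'.
zfs get -H -r -o name,value receive_resume_token tank/virt
```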

Syncoid has a slightly different process. It appears to:

  • zfs snapshot -r tank/virt@whatever
  • Loop through the children and verify what the remote actually has
  • Loop through the children and either bring them up to the latest snapshot or resume the interrupted transfer
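A minimal sketch of that per-child loop, with the remote receive_resume_token query stubbed out so the control flow stands on its own. Dataset names are from the example above; plan_child and remote_token are illustrative, the snapshot names are placeholders, and the send commands are only echoed, not run.

```shell
#!/bin/sh
# remote_token stands in for something like:
#   ssh root@target zfs get -H -o value receive_resume_token "$1"
# Here it is stubbed: pretend only vm-100-disk-1 was interrupted.
remote_token() {
    case "$1" in
        tank/virt/vm-100-disk-1) echo "1-placeholder-token" ;;
        *) echo "-" ;;
    esac
}

# Decide, per child, whether to resume or send the next incremental.
plan_child() {
    token=$(remote_token "$1")
    if [ "$token" != "-" ]; then
        # Resume the interrupted transfer from the saved token
        echo "zfs send -t $token"
    else
        # Bring this child up to the latest snapshot
        echo "zfs send --raw -i @prev $1@latest"
    fi
}

for ds in tank/virt tank/virt/vm-100-disk-0 tank/virt/vm-100-disk-1; do
    plan_child "$ds"
done
```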

But this bug was more about zrep not sending the resume flag during a resume. I'm not sure if it's fixed in the latest version, but it appears that when it detects an interrupted transfer, it tries to resume it but fails to pass the resume flag. Then, when it bombs out (since the resume flag wasn't passed), the remote deletes the entire dataset.
