Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NBD+TLS support for Remote backup #66

Closed
abbbi opened this issue Jul 24, 2022 · 5 comments
Closed

Add NBD+TLS support for Remote backup #66

abbbi opened this issue Jul 24, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@abbbi
Copy link
Owner

abbbi commented Jul 24, 2022

Add TLS support for remote backup. It seems not all distributions ship an up to date domainbackup.rng, so its
not possible to start an NBD pull service with TLS support. As distributions adopt, add option --with-tls, so remote
backup via NBDS is possible.

https://fossies.org/linux/libvirt/src/conf/schemas/domainbackup.rng on debian11:

   60               <element name="server">
   61                 <optional>
   62                   <attribute name="tls">
   63                     <ref name="virYesNo"/>
   64                   </attribute>

on rhel 8.5 tls option is missing, so created backup-begin xml fails to validate:

                    </optional>
                    <!-- add tls? -->
                  </group>

Backup XML example:

<domainbackup mode="pull">
  <server tls="yes" name="192.168.20.2" port="10809"/>
  <disks>
    <disk name="vda">
      <scratch file="/var/tmp/backup.9Y9VD.sda"/>
    </disk>
  </disks>
</domainbackup>

this however requires some setup on the host too:

sudo virsh backup-begin --backupxml bck.xml restoredvm
error: internal error: unable to execute QEMU command 'object-add': Unable to access credentials /etc/pki/qemu/ca-cert.pem: No such file or directory

see: https://wiki.libvirt.org/page/TLSCreateCACert

@abbbi abbbi added the enhancement New feature or request label Jul 24, 2022
@abbbi
Copy link
Owner Author

abbbi commented Jul 26, 2022

@abbbi
Copy link
Owner Author

abbbi commented Jul 26, 2022

for TLS based auth with certificates:

  1. create certs:
    create-cert.txt
  2. disable verify (for now):
sudo grep backup.*verify /etc/libvirt/qemu.conf 
backup_tls_x509_verify = 0
  1. restart vm, start backup job, now reachable via nbds (non verified)
nbdinfo  nbds://127.0.0.1/sda
protocol: newstyle-fixed with TLS

for complete verification, setup certs in:

https://libguestfs.org/nbdkit-tls.1.html#Issuing-and-checking-client-certificates

@abbbi abbbi changed the title Add TLS support for Remote backup Add NBD+TLS support for Remote backup Jul 26, 2022
abbbi added a commit that referenced this issue Jul 26, 2022
@abbbi
Copy link
Owner Author

abbbi commented Jul 26, 2022

seems a bit wonky with older linbd versions, disconnecting fails:

    return libnbdmod.shutdown(self._o, flags)
nbd.Error: nbd_shutdown: gnutls_record_recv: The TLS connection was non-properly terminated.

https://libguestfs.org/nbdkit-tls.1.html#Issuing-and-checking-client-certificates

19:22 <abi> https://paste.debian.net/1248426/ (libnbd version: 1.6.1) known?
19:23 <abi> happens with nbdcopy, too :/
19:43 <rwmjones> abi: I think we've seen this, but I can't recall if it's fixed .. do you have an easy reproducer?
19:45 <rwmjones> I don't see any error with this command:
19:45 <rwmjones> $ ./nbdkit --tls=require --tls-certificates=tests/pki sparse-random 1G --run 'nbdcopy $uri null:'
19:45 <rwmjones> it runs shutdown
19:46 <rwmjones> so maybe it's been fixed upstream
19:48 <abi> rwmjones: reading from qemu here, so probably an issue on qemu side?
19:50 <rwmjones> let's see if I can make it happen with qemu-nbd
19:50 <abi> rwmjones: my reproducer is debian bullseyse + backup-begin with tls enabled, no verify
19:52 <rwmjones> actually I can reproduce it
19:52 <abi> occasionally i also receive this one:
19:52 -!- Irssi: Pasting 3 lines to #guestfs. Press Ctrl-K if you wish to do this or Ctrl-C to cancel.
19:52 <abi>     data = nbdCon.pread(block.length, block.offset)
19:52 <abi>   File "/usr/lib/python3/dist-packages/nbd.py", line 1448, in pread
19:52 <abi>     return libnbdmod.pread(self._o, count, offset, flags)
19:52 <abi> nbd.Error: nbd_pread: nothing to poll for in state REPLY.START: Invalid argument (EINVAL)
19:52 <rwmjones> so that's a different problem and also shouldn't happen
19:53 <rwmjones> sticking with the first one:
19:53 <rwmjones> $ qemu-nbd -t -f raw fedora-36.img --object tls-creds-x509,id=tls0,endpoint=server,dir=$HOME/d/libnbd/tests/pki --tls-creds tls0 
19:53 <rwmjones> $ nbdcopy nbds://localhost null:
19:53 <rwmjones> nbdcopy: nbds://localhost: nbd_connect_uri: gnutls_record_recv: The TLS connection was non-properly terminated.
19:53 <abi> :-)
19:53 <rwmjones> this is with upstream nbdcopy
19:53 <rwmjones> I wonder why we don't see this in interop testing ...
19:52 <rwmjones> so that's a different problem and also shouldn't happen
19:53 <rwmjones> sticking with the first one:
19:53 <rwmjones> $ qemu-nbd -t -f raw fedora-36.img --object tls-creds-x509,id=tls0,endpoint=server,dir=$HOME/d/libnbd/tests/pki --tls-creds tls0 
19:53 <rwmjones> $ nbdcopy nbds://localhost null:
19:53 <rwmjones> nbdcopy: nbds://localhost: nbd_connect_uri: gnutls_record_recv: The TLS connection was non-properly terminated.
19:53 <abi> :-)
19:53 <rwmjones> this is with upstream nbdcopy
19:53 <rwmjones> I wonder why we don't see this in interop testing ...
19:54 <rwmjones> oh right, because:
19:54 <rwmjones> https://gitlab.com/nbdkit/libnbd/-/blob/master/interop/interop.c#L219
19:54 <rwmjones> about the second issue, would be really good to have a self-container reproducer that doesn't involve doing backups etc
19:55 <abi> so no shutdown call if tls if used is OK?
19:55 <eblake> I've asked on the qemu list before; danpb insists that gnutls_bye is a waste if we know there is no more useful traffic to send
19:56 <eblake> so maybe we need to teach nbdcopy to ignore qemu's poor disconnect behavior if we successfully got NBD_CMD_DISC
19:56 <eblake> the REAL reason you should ALWAYS use gnutls_bye is if you are worried about a MitM attacker terminating the connection early
19:56 <eblake> if we don't see NBD_CMD_DISC, we WANT to diagnose that as a connection issue that may mean malicious MitM
19:57 <rwmjones> hmm interesting
19:57 <rwmjones> danpb: ^
19:57 <eblake> but if we did see intent to end the connection, an unclean shutdown doesn't buy us any security
19:57 <danpb> gnutls_bye isn't required if there is an explicit application protocol "shutdown" exchange
19:57 <eblake> or rather, not detecting an unclean shutdown doesn't buy security, and it does buy speed (less time spent with the final few TLS handshakes
19:58 <danpb> NBD has such a command so doesn't need something like gnutls_bye, but    HTTPS would want a tls shutdown
19:58 <rwmjones> we would need to make libnbd deal with both behaviours anyway, since qemu as it is is widely used
20:00 <abi> rwmjones: the other one will probably happen in combination with nbdinfo --map / qemu-io reading at different offsets
20:00 <danpb> dealing with  gnutls_bye is difficult in a non-blocking setup, as you'd typically run it at the place you do a close()  call, but there's typically no way to use of poll() for 
              non-blocking I/O at that point in the program
20:01 <danpb> so its not at all pleasant to try to get gnutls_bye into qemu in all the places that use tls

@abbbi
Copy link
Owner Author

abbbi commented Jul 27, 2022

currently reading data vai NBDS and python bindings sometimes fails with:

python3 test.py
65536
983040
20447232
5177344
Traceback (most recent call last):
  File "/home/abi/test.py", line 10, in <module>
    data = h.pread(b['length'],b['offset'])
  File "/usr/lib/python3/dist-packages/nbd.py", line 1448, in pread
    return libnbdmod.pread(self._o, count, offset, flags)
nbd.Error: nbd_pread: nothing to poll for in state REPLY.START: Invalid argument (EINVAL)
11:12 <rwmjones> abi: well there you go, I can reproduce it
11:13 <danpb> rwmjones:  fyi, with gnutls certtool i create certs with this config  https://paste.centos.org/view/2f0f6aa8
11:13 <danpb> rwmjones: so it works with all variants of localhost
11:14 <rwmjones> in this case I had to set cn to the name of the server since I'm connecting remotely
11:14 <rwmjones> (for the first time in like ever probably)
11:15 <abi> rwmjones: nice :)

@abbbi
Copy link
Owner Author

abbbi commented Jul 27, 2022

Fixed upstream: libguestfs/libnbd@39afa21
so for NBD+TLS it requires at least libnbd/w python bindings >= 1.12.6.

@abbbi abbbi closed this as completed Jul 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant