Libvirt dhcp lease fix #613

NotBrianZach · 2017-02-27T22:46:32Z

Allows a backend to mutate the global state of a deployment once before multiple machines are stopped
(doing this per machine could otherwise cause multithreading problems) by defining a method with the fixed name "_globalPreDestroyHook" in that backend. In this particular case it is useful with libvirt,
to restart the virtual network your VMs were running on before destroying them, since the only way to quickly reassign an old hostname to a new IP in DHCP was to mutate the state of the network with virsh:

                              ["virsh", "-c", "qemu:///system",
                               "net-update", net[0], "add",
                               "ip-dhcp-host",
                               "<host mac='{0}' name='{1}' ip='{2}' />".format(
                                 net[1], self.name, ip),
                               "--live"
                               ]),

example of problem this solves here

danbst · 2017-03-05T14:28:16Z

I accidentally stopped the network in virsh, then ran deploy, and got:

...
backup> starting...
live..> starting...
backup> error: Failed to create domain from /tmp/nixops-tmp8OYpHW/backup-domain.xml
backup> error: Requested operation is not valid: network 'default' is not active
backup>
live..> error: Failed to create domain from /tmp/nixops-tmp8OYpHW/live-domain.xml
live..> error: Requested operation is not valid: network 'default' is not active
live..>
error: Multiple exceptions: command ‘['virsh', '-c', 'qemu:///system', 'create', '/tmp/nixops-tmp8OYpHW/backup-domain.xml']’ failed on machine ‘backup’ (exit code 1), command ‘['virsh', '-c', 'qemu:///system', 'create', '/tmp/nixops-tmp8OYpHW/live-domain.xml']’ failed on machine ‘live’ (exit code 1)

Then I enabled the network again and ran destroy:

$ ./nixops/scripts/nixops destroy
live..> running globalPreStopHook
error: failed to get domain 'nixops-ad4ce430-01a2-11e7-a68b-0a2c52343f13-live'
error: Domain not found: no domain with matching name 'nixops-ad4ce430-01a2-11e7-a68b-0a2c52343f13-live'
error: Command '['virsh', '-c', 'qemu:///system', 'dumpxml', u'nixops-ad4ce430-01a2-11e7-a68b-0a2c52343f13-live']' returned non-zero exit status 1

update: this may be unrelated to this PR

NotBrianZach · 2017-03-05T16:21:09Z

I looked at it for a couple of seconds and can't see how it would be related without further info.

For the first phase, I don't believe this PR makes any assumption about the network name.

For the second phase,

        globalPreDestroyHook = getattr(next(self.active.itervalues(), None), "_globalPreDestroyHook", None)
        if callable(globalPreDestroyHook):
            globalPreDestroyHook()

If self.active is empty, next(..., None) returns None, getattr on None then falls back to its default of None, and None is not callable, so the hook shouldn't be triggered, I don't think.
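
The lookup can be checked in isolation with a minimal sketch. It uses Python 3's `.values()` in place of `itervalues()`, and `PlainMachine`, `HookedMachine` and `run_global_pre_destroy_hook` are hypothetical stand-ins for illustration, not nixops classes:

```python
class PlainMachine:
    """Stand-in for a backend that does not define the hook."""


class HookedMachine:
    """Stand-in for a backend (e.g. libvirt) that defines the hook."""

    def __init__(self, calls):
        self.calls = calls

    def _globalPreDestroyHook(self):
        self.calls.append("hook ran")


def run_global_pre_destroy_hook(active):
    """Call the hook once, via the first active machine, if it exists.

    Returns True if a hook was called, False otherwise.
    """
    first = next(iter(active.values()), None)  # None when `active` is empty
    hook = getattr(first, "_globalPreDestroyHook", None)
    if callable(hook):
        hook()
        return True
    return False


calls = []
print(run_global_pre_destroy_hook({}))                            # no machines
print(run_global_pre_destroy_hook({"backup": PlainMachine()}))    # no hook defined
print(run_global_pre_destroy_hook({"live": HookedMachine(calls)}))
```

Note that only the first active machine is inspected, so in a deployment that mixes backends the hook runs only if that first machine happens to define it.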

Certainly, mutating state outside of nixops with virsh could get you in trouble fairly quickly, and nixops still seems pretty "under development" for this backend anyway ;P

danbst · 2017-03-05T16:23:46Z

I ended up wrapping the globalPreDestroyHook call in a try ... except to work around this.
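
A rough sketch of that kind of workaround, not the exact change: the helper name `run_hook_safely`, the choice of exceptions to catch, and the log wording are all assumptions. The idea is that a failing virsh call inside the hook is logged and destroy carries on:

```python
import subprocess


def run_hook_safely(hook, log):
    """Run `hook`, logging virsh/process failures instead of aborting destroy.

    Returns True if the hook completed, False if it failed.
    """
    try:
        hook()
    except (subprocess.CalledProcessError, OSError) as exc:
        log("globalPreDestroyHook failed, continuing with destroy: {0}".format(exc))
        return False
    return True


def failing_hook():
    # Simulates the 'virsh dumpxml' failure for a domain that no longer exists.
    raise subprocess.CalledProcessError(1, ["virsh", "-c", "qemu:///system", "dumpxml", "missing"])


messages = []
run_hook_safely(failing_hook, messages.append)
print(messages)
```

Catching only process-related exceptions keeps programming errors in the hook visible rather than silently swallowed.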

danbst · 2017-03-05T16:24:28Z

nixops/backends/libvirtd.py

+          self._logged_exec(["virsh", "-c", "qemu:///system", "net-start", net])
+
+    def _globalPreDestroyHook(self):
+        self.log("running globalPreStopHook")


The log message should say "running globalPreDestroyHook".

danbst · 2017-03-06T05:27:12Z

Although you do a network destroy, it doesn't seem to fix the problem for me. I have:

  ...
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
      <host mac='52:54:00:6d:87:28' name='live' ip='192.168.122.52'/>
      <host mac='52:54:00:5c:2f:8c' name='backup' ip='192.168.122.169'/>
    </dhcp>
  </ip>

and running nixops destroy doesn't remove those custom <host> lines (which were added by NixOps before #586, by the way, not manually). Is this the kind of problem you wanted to address, or is this intended in #586?

NotBrianZach · 2017-03-06T06:01:23Z

If the virsh command were this:

                              ["virsh", "-c", "qemu:///system",
                               "net-update", net[0], "add",
                               "ip-dhcp-host",
                               "<host mac='{0}' name='{1}' ip='{2}' />".format(
                                 net[1], self.name, ip),
                               "--live", "--config"
                               ]),

then the changes would persist after the network is destroyed.

With just --live, the changes do not persist after the network has been destroyed:

                              ["virsh", "-c", "qemu:///system",
                               "net-update", net[0], "add",
                               "ip-dhcp-host",
                               "<host mac='{0}' name='{1}' ip='{2}' />".format(
                                 net[1], self.name, ip),
                               "--live"
                               ]),

This pull request uses the second method. However, if you have added lines via the first method (or a similar technique that is not a live-only update), they will persist until you delete them with virsh net-update.
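
To make the difference concrete, here is a small sketch that only builds the argument lists and does not run virsh. The helper names are made up for illustration; the `add` and `delete` commands and the `--live` / `--config` flags are standard `virsh net-update` options. The MAC, name and IP are taken from the network XML quoted above:

```python
def dhcp_host_xml(mac, name, ip):
    """XML fragment describing one static DHCP host entry."""
    return "<host mac='{0}' name='{1}' ip='{2}' />".format(mac, name, ip)


def net_update_argv(network, command, mac, name, ip, live=True, config=False):
    """Build a 'virsh net-update' argv for the ip-dhcp-host section.

    live=True   -> change the running network only (lost on net-destroy)
    config=True -> change the persistent definition (survives net-destroy)
    """
    argv = ["virsh", "-c", "qemu:///system",
            "net-update", network, command,
            "ip-dhcp-host", dhcp_host_xml(mac, name, ip)]
    if live:
        argv.append("--live")
    if config:
        argv.append("--config")
    return argv


# Live-only add, as in this pull request.
live_only = net_update_argv("default", "add",
                            "52:54:00:6d:87:28", "live", "192.168.122.52")

# Removing an entry that was persisted earlier with --config.
cleanup = net_update_argv("default", "delete",
                          "52:54:00:6d:87:28", "live", "192.168.122.52",
                          live=False, config=True)
print(live_only)
print(cleanup)
```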

danbst · 2017-03-06T11:11:26Z

So, this doesn't work with multiple deployments? If I destroy one deployment, network access to VMs in another deployment will be lost.

NotBrianZach · 2017-03-06T17:43:57Z

Hmm, you might be right; I might have to think about this some more. However, as a temporary workaround you could use a different network name for each deployment. After a cursory glance, this might be possible through deployment.libvirtd.primary_net = "new_network_name" in infrastructure-libvirtd.nix

NotBrianZach · 2017-03-06T18:27:06Z

I did a little testing. If you use the same physical specification (infrastructure-libvirt.nix), it won't be able to create two deployments (since they'll have the same hostname). It also did lose connection (I only verified that it lost the SSH connection) after destroying a deployment on the same network.

I haven't tested the network workaround.

I'll probably still use this personally, but I think it would be unexpected behavior in the broader tool.

I do think this feature could be provided in a reliable and compatible way by automatically generating network names, or by some similar scheme.
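
One possible shape for such a scheme, as a rough sketch only: derive the network name from the deployment UUID so that two deployments never share a virtual network. The `nixops-` prefix, the 8-character truncation and the function name are assumptions for illustration; nothing in this PR implements it, and creating the network itself is not shown.

```python
import uuid


def network_name_for(deployment_uuid, prefix="nixops-", length=8):
    """Derive a per-deployment libvirt network name from the deployment UUID.

    The UUID is parsed first so that malformed input raises ValueError
    instead of silently producing an odd network name.
    """
    hex_digits = uuid.UUID(deployment_uuid).hex
    return prefix + hex_digits[:length]


# UUID taken from the domain name in the destroy log earlier in this thread.
print(network_name_for("ad4ce430-01a2-11e7-a68b-0a2c52343f13"))
```

Truncating the UUID keeps the name short at the cost of a small collision risk; a longer `length` trades readability for uniqueness.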

danbst · 2017-03-07T09:26:22Z

Strangely enough, if I define

    networking.bridges.br1.interfaces = [];
    networking.interfaces.br1.ip4 = [ { address = "192.168.5.1"; prefixLength = 24; } ];

in the guest's configuration, it publishes the hostname to libvirt's dnsmasq. If no bridges are present, DHCP leases are not updated (the hostname nixos is assigned to all machines in the network).

If you could verify that with this client config you don't need to patch the XML, I can dig deeper into the issue.

I tested this with nixops destroy && nixops deploy, and monitored DHCP leases with

$ journalctl SYSLOG_FACILITY=3 -pdebug -f | grep dnsmasq

I used nixos-17.03.

NotBrianZach · 2017-03-07T18:33:14Z

I added these to the logical specification (network.nix), and commented out the live patching of the network XML.

    networking.bridges.br1.interfaces = [];
    networking.interfaces.br1.ip4 = [ { address = "192.168.122.68"; prefixLength = 24; } ];

    networking.bridges.br1.interfaces = [];
    networking.interfaces.br1.ip4 = [ { address = "192.168.122.57"; prefixLength = 24; } ];

Following that, the new leases did have the appropriate names (instead of nixos), as you say:

sudo journalctl SYSLOG_FACILITY=3 -pdebug -f | grep dnsmasq
Mar 07 12:21:49 zachMothership dnsmasq-dhcp[9009]: DHCPREQUEST(virbr0) 192.168.122.189 52:54:00:a7:a7:b9
Mar 07 12:21:49 zachMothership dnsmasq-dhcp[9009]: DHCPACK(virbr0) 192.168.122.189 52:54:00:a7:a7:b9 postgresqlServer
Mar 07 12:21:50 zachMothership dnsmasq-dhcp[9009]: DHCPREQUEST(virbr0) 192.168.122.7 52:54:00:9a:65:02
Mar 07 12:21:50 zachMothership dnsmasq-dhcp[9009]: DHCPACK(virbr0) 192.168.122.7 52:54:00:9a:65:02 apiServer

virsh -c qemu:///system net-dhcp-leases default

Expiry Time          MAC address        Protocol  IP address                Hostname        Client ID or DUID
-------------------------------------------------------------------------------------------------------------------
2017-03-07 12:34:44  52:54:00:09:06:8d  ipv4      192.168.122.227/24        -               -
2017-03-07 12:49:44  52:54:00:0c:2a:c4  ipv4      192.168.122.220/24        nixos           -
2017-03-07 12:34:44  52:54:00:0f:1a:af  ipv4      192.168.122.188/24        -               -
2017-03-07 13:04:21  52:54:00:15:bd:16  ipv4      192.168.122.138/24        nixos           -
2017-03-07 13:19:00  52:54:00:19:95:72  ipv4      192.168.122.52/24         apiServer       -
2017-03-07 13:11:41  52:54:00:27:b0:97  ipv4      192.168.122.105/24        apiServer       -
2017-03-07 13:11:40  52:54:00:33:53:16  ipv4      192.168.122.121/24        nixos           -
2017-03-07 12:45:09  52:54:00:52:1c:73  ipv4      192.168.122.117/24        postgresqlServer -
2017-03-07 12:43:41  52:54:00:6f:b2:9f  ipv4      192.168.122.64/24         -               -
2017-03-07 12:49:34  52:54:00:7e:f8:0a  ipv4      192.168.122.34/24         nixos           -
2017-03-07 13:21:50  52:54:00:9a:65:02  ipv4      192.168.122.7/24          apiServer       -
2017-03-07 13:04:21  52:54:00:a5:04:ba  ipv4      192.168.122.5/24          postgresqlServer -
2017-03-07 13:21:49  52:54:00:a7:a7:b9  ipv4      192.168.122.189/24        nixos           -
2017-03-07 12:59:36  52:54:00:b9:6e:47  ipv4      192.168.122.33/24         apiServer       -
2017-03-07 13:18:58  52:54:00:bb:55:69  ipv4      192.168.122.124/24        nixos           -
2017-03-07 12:57:36  52:54:00:e1:3a:89  ipv4      192.168.122.99/24         nixos           -
2017-03-07 12:46:24  52:54:00:e8:b6:c7  ipv4      192.168.122.29/24         apiServer       -
2017-03-07 12:43:52  52:54:00:f0:2e:5e  ipv4      192.168.122.90/24         -               -

However, the deploy hangs when starting nscd.service.
Also, I am on NixOS version 16.09.663.3dc0897 (Flounder).
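
The lease table above lists the same hostname against several addresses. A small sketch for spotting that, assuming the column layout printed by `virsh net-dhcp-leases` above (seven whitespace-separated fields per data row); the function name and the sample rows, copied from the table, are for illustration only:

```python
from collections import defaultdict

SAMPLE = """\
Expiry Time          MAC address        Protocol  IP address                Hostname        Client ID or DUID
-------------------------------------------------------------------------------------------------------------------
2017-03-07 12:34:44  52:54:00:09:06:8d  ipv4      192.168.122.227/24        -               -
2017-03-07 13:19:00  52:54:00:19:95:72  ipv4      192.168.122.52/24         apiServer       -
2017-03-07 12:45:09  52:54:00:52:1c:73  ipv4      192.168.122.117/24        postgresqlServer -
2017-03-07 13:21:50  52:54:00:9a:65:02  ipv4      192.168.122.7/24          apiServer       -
"""


def leases_by_hostname(table_text):
    """Group leased IP addresses by hostname ('-' means no hostname)."""
    groups = defaultdict(list)
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows: date, time, MAC, protocol, IP/prefix, hostname, client ID.
        # The header and separator lines do not match this shape and are skipped.
        if len(fields) != 7 or fields[3] not in ("ipv4", "ipv6"):
            continue
        ip = fields[4].split("/")[0]
        groups[fields[5]].append(ip)
    return dict(groups)


for hostname, ips in sorted(leases_by_hostname(SAMPLE).items()):
    print(hostname, ips)
```

Any hostname mapped to more than one address is a candidate for a stale lease left over from an earlier deploy.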

danbst · 2017-03-07T19:22:30Z

networking.interfaces.br1.ip4 = [ { address = "192.168.122.68"; prefixLength = 24; } ];

Maybe I was a bit unclear here, but "192.168.5.1" was an arbitrary IP chosen so that it does not clash with libvirtd's 192.168.122.0/24. I don't have problems with nscd on 16.09.

I'm stuck again, though. DHCP leases expire after an hour and are not renewed until the machine reboots, so this solution is still bad. Perhaps some DHCP refresh daemon on the client could solve this problem? I don't know.

NotBrianZach · 2017-03-07T20:12:47Z

Hmm, whoops. Yeah, I'm not exactly a networking war veteran myself. I was trying to go for a fix internal to nixops/libvirt, though for my personal needs it's not currently super high on the list of priorities. I do still think the XML hack could work if you forced each deployment to have a different network name. I'm not sure how bad a tradeoff that is in terms of configurability for more complex setups.

Nadrieril and others added 11 commits January 16, 2017 12:21

libvirt: Get mac address from libvirt

8f1929d

libvirt: Add different options for VM IP detection

031ca1b

forced addition of ip addresses to host name mapping w/ virsh

13a5be6

fix problem/wait with dhcp leases when destroy/deploy-ing quickly

705d679

add loggging

0d8c924

remove cruft

79156f2

globalPreDestroyHook works

65a670f

remove sshprivatekeylogstatement

22eb5b3

remove custom release.nix

234a0c6

removed another line from release.nix

7f7e6fc

small rename

1bda49e

NotBrianZach mentioned this pull request Feb 27, 2017

libvirt: Add different options for VM IP detection #586

Closed

small rename

e92e5d0

danbst reviewed Mar 5, 2017

NotBrianZach added 3 commits March 5, 2017 13:03

fix logging PreStopHook->PreDestroyHook

1e4949b

added exception handling to globalPreDestroyHook

e172d24

Merge branch 'libvirtDHCPLeaseFix' into libvirtDHCPLeaseFix2

adee6d4

NotBrianZach closed this Mar 6, 2017

NotBrianZach mentioned this pull request Mar 7, 2017

Libvirt DHCP lease expiration #617

Open
