Support user session libvirtd (qemu:///session) #272

Closed
mdekstrand opened this Issue Nov 8, 2014 · 33 comments


Right now, vagrant-libvirt seems to basically require the system libvirt instance (qemu:///system). It would be nice to be able to use my user session libvirt (qemu:///session) instead of the systemwide one (fewer password prompts, for one thing).

There are two ways this problem manifests at present:

  • No obvious setting, besides uri, to specify the session instance.
  • When specifying uri = "qemu:///session", vagrant up fails. The current failure is basically #247, although I already have version 0.23 installed (which should contain the fix?).
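For reference, the uri setting mentioned above lives in the provider block of the Vagrantfile. This is only a minimal sketch of the configuration being attempted, assuming the usual vagrant-libvirt provider block; as described, vagrant up currently fails with it.

```ruby
# Vagrantfile (sketch): point the libvirt provider at the per-user session daemon.
Vagrant.configure("2") do |config|
  config.vm.provider :libvirt do |libvirt|
    # qemu:///session talks to the unprivileged, per-user libvirtd
    # instead of the system-wide qemu:///system instance.
    libvirt.uri = "qemu:///session"
  end
end
```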
Collaborator

sciurus commented Nov 10, 2014

I don't think we can support this, since IIRC the user session doesn't have permission to perform some operations like creating networks.

bersace commented Jan 8, 2015

With qemu-bridge-helper, it is possible to plug to an existing bridge. This is how GNOME Boxes is supposed to work.

Contributor

purpleidea commented Jan 28, 2015

So vagrant+vagrant-libvirt look like they're finally getting into Fedora 22. w00t.
https://bugzilla.redhat.com/show_bug.cgi?id=1168333

One issue is that it was decided that the polkit file will not be included, meaning many password prompts by default. As a result, it makes a lot of sense to have vagrant-libvirt work with the qemu:///session instead of the system one.

As @sciurus commented, we have the network issue, but perhaps this could use the solution proposed by @bersace.

Any patches or recommendations welcome. If someone can help on the network side, I can probably patch the ruby side to support the qemu:///session aspect.

Cheers!

Contributor

purpleidea commented Mar 24, 2015

Ping :)

Bounty for this issue now contains some beverages...

Contributor

purpleidea commented May 8, 2015

Ping pong, ding dong. Bounty now also includes a cool t-shirt and/or a hat or something.

Contributor

purpleidea commented Jun 17, 2015

FYI: this issue getting fixed would directly help: https://bugzilla.redhat.com/show_bug.cgi?id=1187019

@zeenix Can gnome-boxes users use non user-mode networks? Is it something you support?

@purpleidea I found this. http://xkahn.zoned.net/blog/2013/11/26/networking-and-gnome-boxes/

In vagrant-libvirt we probably need to use virsh to setup the network on the system libvirt, and then use the bridge network interface from the session defined domain (ie: vm).

Contributor

purpleidea commented Jun 29, 2015

@stefwalter Setting up a single network for machine discovery that is shared with every vagrant environment would be doable. However, each user's vagrant environment would need its own custom static network, and this would need some sort of user mode network, which I'm not quite sure how to build at the moment.

Maybe @strzibny is looking into this?

@purpleidea I think the point is that you can escalate privileges to setup the networks (since that happens less often, and prompting is acceptable). With qemu-bridge-helper, use of these networks then works from the qemu session instance just fine.

Contributor

purpleidea commented Jun 29, 2015

On Mon, Jun 29, 2015 at 11:58 AM, Stef Walter notifications@github.com
wrote:

@purpleidea I think the point is that you
can escalate privileges to setup the networks (since that happens less
often, and prompting is acceptable). With qemu-bridge-helper, use of these
networks then works from the qemu session instance just fine.

Every 'vagrant destroy' will cause that network to be deleted, and a
'vagrant up' will re-create it. So I would say that prompting would still
happen far more than I would be okay with. This would still be worse UX
than what OSX or polkit users in the vagrant group experience.

Contributor

strzibny commented Jun 30, 2015

Unfortunately I am not looking into this now; there are other things that need to be fixed first regarding the Vagrant experience on Fedora. And frankly, when I use vagrant-libvirt, I don't want any password prompts. Copying one policy file is acceptable for me now, and I started to work on DevAssistant's Vagrant assistant[0] that should help to set up Vagrant the way the user needs (currently just the basics, including this very policy file).

[0] https://github.com/phracek/dap-vagrant

There is another good use case for this feature: sharing code from /home/user/devel over 9p while SELinux is enforcing is difficult when the virt process is confined. It is possible to relabel /home/user/devel/my_project with the svirt_t label and then share it, but getting it right can be tricky. Also, users additionally have to worry about facls and whatnot to allow the qemu user to read (and potentially write) their home folder.

If we were able to set the uri to qemu:///session, the process would run as the developer's own user with the unconfined_u context, and would be allowed to access all the files with 9p as expected with no SELinux fidgeting.

Contributor

purpleidea commented Aug 5, 2015

@rbarlow Nice. I think the biggest blocker for getting this feature, and then all the others solved is figuring out how to solve the network problem mentioned above. I don't know who can weigh in with a solution... Maybe you can ping someone?

I should also mention that there's an additional permissions related difficulty due to running the guest as the qemu user: if the vagrant user creates any files inside the 9p share, it immediately will not own the created files because on the host they will become owned by the qemu process, which has a different UID than the vagrant user. The vagrant user will see the files owned potentially by users that don't exist inside the guest (this is what happened in my case) and will not be able to do expected things like writing or potentially reading (depending on the umask).

I intend to look into using the uid= mount option on the 9p share to see if I can work around that issue, but I don't see a way in the vagrant-libvirt documentation to pass that particular mount option to the guest. I was thinking about trying to use provisioning to remount the share with the uid= option, which might get me past that particular issue. Perhaps I will file an issue about this uid= option separately if I can't find an easy workaround.

Of course, if I were able to run the guest as my user I wouldn't have to worry about that issue either.

Contributor

purpleidea commented Nov 5, 2015

Not sure if this: https://bugzilla.redhat.com/show_bug.cgi?id=1278317 could help in any way. If someone knows or can contact the creator of that bug, maybe they can help.

Cheers

scfc commented Nov 9, 2015

If I set uri = 'qemu:///session' in Vagrantfile on Fedora 23, vagrant up fails with:

==> default: Creating image (snapshot of base box volume).
/usr/share/gems/gems/fog-libvirt-0.0.2/lib/fog/libvirt/requests/compute/create_volume.rb:6:in `create_volume_xml': Call to virNetworkCreateXML failed: internal error: Child process (/bin/qemu-img create -f qcow2 -b /var/lib/libvirt/images/trusty-cloud_vagrant_box_image_0.img -o backing_fmt=qcow2,compat=0.10 /var/lib/libvirt/images/libvirt-test_default.img 41943040K) unexpected exit status 1: qemu-img: /var/lib/libvirt/images/libvirt-test_default.img: Could not create file: Permission denied (Libvirt::Error)
[stack trace]

I'm very interested in getting this issue resolved like @rbarlow for sharing directories between guest VM and executing user.

We use 'qemu:///session' in the Cockpit project testing. We bring up many thousands of qemu session VMs a day. But unfortunately the network needs to be preconfigured as root. Once that's done, and the bridge is in /etc/qemu/bridge.conf everything works.

But I think that vagrant likes to touch up the network on each 'vagrant up', so that's the obstacle here.

@purpleidea the qemu multicast support isn't going to be a general purpose replacement

re: gnome boxes originally made do with qemu's usermode/slirp networking. Recently boxes will prefer to use qemu:///system's 'default' virtual network via the setuid qemu-bridge-helper (the libvirt XML just manually references virbr0). This works when everything is set up correctly (which includes the out-of-the-box fedora workstation config), but if anything goes wrong it requires admin access to fix.

Contributor

purpleidea commented Nov 19, 2015

@crobinso I think you're the right expert to recommend what vagrant-libvirt should do to support the user session stuff. It would be great to avoid the 'root hole' and other issues mentioned in this thread. We're willing to purchase beverages in return.

Cheers

@purpleidea There isn't any simple fix... if there was, gnome-boxes (and virt-manager) would have used it long ago.

You can do what gnome-boxes does and implicitly depend on the qemu:///system 'default' network being available and providing virbr0. This works out of the box on fedora because we ship a /etc/qemu/bridge.conf whitelisting virbr0, and boxes has all the dependencies set up correctly. But like I said above, if something falls over or even something simple like the network isn't started or has been deleted, you need root access to fix it. But good error messages could point the user at the fix.
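The "good error messages" idea could look roughly like the following preflight check. It is a sketch, not anything vagrant-libvirt ships: `bridge_problem` and its `sysfs_root` parameter are hypothetical names, and it assumes a Linux host where bridge devices expose a `bridge` directory under /sys/class/net.

```ruby
# Returns nil when the bridge looks usable, otherwise a message telling the
# user what to fix. sysfs_root is a parameter only so the check can be
# exercised against a fake directory tree (hypothetical helper, not plugin code).
def bridge_problem(bridge, sysfs_root: "/sys/class/net")
  iface = File.join(sysfs_root, bridge)

  # No such interface at all: most likely the system network is not started.
  unless File.directory?(iface)
    return "Bridge #{bridge} does not exist. Start the system 'default' " \
           "network as root, e.g.: sudo virsh -c qemu:///system net-start default"
  end

  # The interface exists but the kernel does not treat it as a bridge.
  unless File.directory?(File.join(iface, "bridge"))
    return "#{bridge} exists but is not a bridge device."
  end

  nil
end
```

Calling `bridge_problem("virbr0")` before defining the domain would let the tool stop early with a hint instead of failing deep inside libvirt.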

That said I'm pretty ignorant of vagrant's needs, and if vagrant needs to mess with the networking (like @stefwalter says), that might not work.

(IMO the ideal fix would be that NetworkManager provides an API or UI to say 'share network connection with VMs', and it would basically do what libvirt's default network already does, and provide an API for the authorized non-root user to get back an fd for a tap device... or something. That's what I suggested to the desktop guys before boxes development even started. But even if someone started implementing that today it wouldn't be usable for a year at least, considering it would need libvirt plumbing too)

Contributor

purpleidea commented Nov 19, 2015

@crobinso Thanks for your comments.

Well, RE: the API you mentioned: if that's what's required, and assuming it's necessary and/or useful for other aspects of the machine (such as boxes), do you think you could open a sort of "design" bug in libvirt with a description of the specifics needed? That way I can at least point at something and say "I want this" and attempt to bribe someone to patch it :)

Thanks again

I'd need to dig deeper into the network stuff first before filing a coherent NetworkManager RFE... If I get around to it, I'll mention it here

Contributor

purpleidea commented Nov 19, 2015

On Thu, Nov 19, 2015 at 3:20 PM, Cole Robinson notifications@github.com
wrote:

I'd need to dig deeper into the network stuff first before filing a
coherent NetworkManager RFE... If I get around to it, I'll mention it here

Much appreciated, thanks!

Contributor

purpleidea commented Jan 12, 2016

It seems @crobinso wrote up a nice article about us :)

http://blog.wikichoon.com/2016/01/qemusystem-vs-qemusession.html

Bang on accurate about describing the problem... Maybe the qemu-bridge-helper can be pre-configured during the fedora vagrant RPM install, and then we use the session interface, but benefit from the network? This might require some patching in vagrant or perhaps only in vagrant-libvirt.

In Cockpit integration tests we have a major libvirt setup that uses qemu:///session (it even runs from within a docker container ... yes starting VMs inside a container, go figure).

We use a bridged network using qemu-bridge-helper ... and Cole is right that this is the only part that needs privileges.

The way that we handle privileges with the network is to check if it's configured correctly. If not, then we tell the user to invoke a command with sudo once, in order to bring the network into shape ... after which we run routinely unprivileged as the current user.
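A "check if it's configured correctly" step might be sketched like this. It is not the Cockpit code: `bridge_allowed?` is a hypothetical helper, and the rule semantics (deny by default, a matching deny overrides any allow, `all` as a wildcard) reflect my understanding of how qemu-bridge-helper reads /etc/qemu/bridge.conf. `include` directives are deliberately not followed here.

```ruby
# Decide whether the text of a qemu-bridge-helper ACL file permits a bridge.
# Assumed semantics: nothing is allowed by default, and a matching
# 'deny' rule wins over any 'allow' rule. 'include' lines are ignored.
def bridge_allowed?(acl_text, bridge)
  allowed = false
  denied = false

  acl_text.each_line do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?("#")  # blank line or comment

    action, target = line.split(/\s+/, 2)
    next unless target == "all" || target == bridge

    allowed = true if action == "allow"
    denied = true if action == "deny"
  end

  allowed && !denied
end
```

A caller could read the file (when it is readable by the current user) and, if `bridge_allowed?(text, "virbr0")` is false, print the one-time sudo command instead of attempting to start the VM.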

@purpleidea the missing piece for vagrant AFAICT is that it needs to actually alter the network settings of virbr0 on 'vagrant up', which requires talking to qemu:///system

Can someone here explain how the libvirt virtual network XML is edited when launching a vagrant instance?

Owner

infernix commented Apr 25, 2016

There are actually more issues with this than just network:

  • We currently are hardcoding owner and group to 0 in volume_snapshot.xml.erb which creates issues like libvirtd[5635]: cannot chown /home/infernix/qemusession/vagrantbugs_default.img to (0, 0): Operation not permitted whenever we create a snapshot from a base box image under qemu:///session
  • qemu-bridge-helper isn't always setuid by default (does not appear to be on debian sid qemu 2.5), nor is CAP_NET_ADMIN always set (again debian specific)
  • Even with qemu.conf ACLs set up as allow all and bridge_helper defined in libvirt/qemu.conf, when vagrant creates a network while talking to user-privileged libvirtd, virNetDevBridgeCreate is called which does not appear to be using qemu-bridge-helper at all.

So I haven't really found a way to make this work at all, and I suspect that the way we create network bridges today doesn't leverage qemu-bridge-helper code paths.
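For the first point, a uid-aware variant could be sketched as below. This is not the actual contents of volume_snapshot.xml.erb: the template, `PERMISSIONS_TEMPLATE`, and `volume_permissions_xml` are hypothetical stand-ins, and detecting a session connection by the URI path ending in /session is a simplifying assumption.

```ruby
require "erb"

# Hypothetical stand-in for the <permissions> part of a volume template.
PERMISSIONS_TEMPLATE = ERB.new(<<~XML)
  <permissions>
    <owner><%= owner %></owner>
    <group><%= group %></group>
  </permissions>
XML

# Use the calling user's ids for session connections, root (0, 0) otherwise.
# uid/gid are parameters so the choice can be checked without changing users.
def volume_permissions_xml(uri, uid: Process.uid, gid: Process.gid)
  # Assumption: a session URI has a path ending in /session,
  # optionally followed by query parameters.
  session = uri.split("?").first.end_with?("/session")
  owner = session ? uid : 0
  group = session ? gid : 0
  PERMISSIONS_TEMPLATE.result_with_hash(owner: owner, group: group)
end
```

With something along these lines, a snapshot created under qemu:///session would be owned by the invoking user, so libvirtd would not need to chown to (0, 0).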

What XML were you using? I suspect you were doing interface type='network', but to use qemu-bridge-helper you need to do interface type='bridge' with

Owner

infernix commented Apr 25, 2016

We create two types of network, either public or private, but I'm focusing just on the public one here.

The XML generated in the libvirt user session is:

<network ipv6='yes'>
  <name>vagrant-libvirt</name>
  <uuid>2f61b482-a917-49fa-bd4f-32c4bfd20ccc</uuid>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:ce:b1:d3'/>
  <ip address='192.168.121.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.121.1' end='192.168.121.254'/>
    </dhcp>
  </ip>
</network>

This does not even activate with virsh -c qemu:///session:

virsh -c qemu:///session net-start vagrant-libvirt
error: Failed to start network vagrant-libvirt
error: error creating bridge interface virbr0: Operation not permitted

Not even sure where to go from here, since I've configured bridge_helper in /etc/libvirt/qemu.conf and in ~/.config/libvirt/qemu.conf, plus the ACL in /etc/qemu/bridge.conf is allow all. Running libvirtd -v doesn't show any sign of it trying to invoke bridge helper code.

You need to set up your bridge (let's say virbr0) independent of qemu:///session. Like via system libvirtd/qemu:///system. Then with qemu:///session give your VM device XML like:

<interface type='bridge'>
  <source bridge='virbr0'/>
</interface>

And that's it. You don't invoke net-* commands on qemu:///session
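On the plugin side, emitting that device XML could be as small as the following sketch. `INTERFACE_TEMPLATE` and `bridge_interface_xml` are hypothetical names, not existing vagrant-libvirt code; the only assumption is that the bridge was already created outside qemu:///session.

```ruby
require "erb"

# Template for a domain <interface> that attaches to a pre-existing bridge.
INTERFACE_TEMPLATE = ERB.new(<<~XML)
  <interface type='bridge'>
    <source bridge='<%= ERB::Util.html_escape(bridge) %>'/>
  </interface>
XML

# Render the interface element for the given bridge name (e.g. "virbr0").
# The name is escaped so it cannot break out of the attribute.
def bridge_interface_xml(bridge)
  INTERFACE_TEMPLATE.result_with_hash(bridge: bridge)
end
```

`bridge_interface_xml("virbr0")` yields the snippet shown above, which would go inside the domain's devices section; no network would be defined or started on the session connection.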

Owner

infernix commented Apr 25, 2016

Does that not defeat the purpose of this issue though? The goal here is to be able to use qemu:///session and not qemu:///system?

I mean if that's the conclusion then I think this is simply not achievable since we would have to talk to qemu:///system AND qemu:///session on pretty much every invocation of a vagrant up, especially so when there are multiple networks (whether public or private).

Or in other words, if we absolutely cannot avoid talking to qemu:///system, why should we try to support qemu:///session then?

infernix added the wontfix label Apr 25, 2016

Owner

infernix commented Apr 25, 2016

So after some more deliberation on IRC, the conclusion is that we cannot presently support even the most basic features:

  • storage pools need to be uid aware; fixable though, but a TODO
  • non-existing or inactive public networks are created as XML (e.g. virsh net-create) and then started (e.g. virsh net-start) before VM gets created

Doing a net-start seems impossible today without qemu:///system. Therefore I'm going to close this with a wontfix because if we cannot even create a network from a qemu:///session, there is no way we can fully support it. Implementing a partial qemu:///session with kludges and workarounds to talk to qemu:///system seems to me a wasted effort; only 100% qemu:///session support is worth the rewrite.

If anyone can come up with a true solution to only talk to qemu:///session whilst supporting all vagrant-libvirts network features, please reopen or submit a PR, though I suspect it will require features in libvirt that aren't available today.

infernix closed this Apr 25, 2016

nehaljwani commented Aug 31, 2016 edited

FWIW, to run the plugin completely in user space (and throwing away some features), I have documented the steps at: https://gist.githubusercontent.com/nehaljwani/e5cb24eb7d6c508a8db5bbc357fcfcbc/raw
