docs/dev/libvirt: update libvirt/firewalld setup instructions #3677

Merged
153 changes: 101 additions & 52 deletions docs/dev/libvirt/README.md
@@ -7,6 +7,7 @@ Launching clusters via libvirt is especially useful for operator development.
[how to create an install-config.yaml file](../../user/overview.md#multiple-invocations) and [the libvirt platform customization](customization.md) documents.

## One-time setup

It's expected that you will create and destroy clusters often in the course of development. These steps only need to be run once.

Before you begin, install the [build dependencies](../dependencies.md).
@@ -16,13 +17,14 @@ Before you begin, install the [build dependencies](../dependencies.md).
Make sure you have KVM enabled by checking for the device:

```console
$ ls -l /dev/kvm
$ ls -l /dev/kvm
crw-rw-rw-+ 1 root kvm 10, 232 Oct 31 09:22 /dev/kvm
```

If it is missing, try some of the ideas [here][kvm-install].
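
As a rough sketch, the check above can be wrapped in a small shell function that also confirms the current user can open the device. The function name `check_kvm` and its messages are made up for illustration; only the `/dev/kvm` path comes from this document.

```sh
# check_kvm [DEVICE]: succeed only if DEVICE (default /dev/kvm) exists as a
# character device and the current user can read and write it.
check_kvm() {
    dev="${1:-/dev/kvm}"
    if [ ! -c "$dev" ]; then
        echo "missing: $dev is not a character device" >&2
        return 1
    fi
    if [ ! -r "$dev" ] || [ ! -w "$dev" ]; then
        echo "no access: $dev is not readable and writable by $(id -un)" >&2
        return 2
    fi
    echo "ok: $dev"
}
```

Calling `check_kvm` with no argument tests `/dev/kvm`; a return code of 2 suggests a permissions problem rather than missing KVM support.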

### Install and Enable Libvirt

On CentOS 7, first enable the
[kvm-common](http://mirror.centos.org/centos/7/virt/x86_64/kvm-common/)
repository to ensure you get a new enough version of qemu-kvm.
@@ -44,6 +46,7 @@ sudo systemctl enable --now libvirtd
In this example, we'll set the base domain to `tt.testing` and the cluster name to `test1`.

### Clone the project

```sh
git clone https://github.com/openshift/installer.git
cd installer
@@ -88,69 +91,94 @@ components drive deployment of worker machines. The libvirt cluster-api
provider will run inside the local cluster, and will need to connect back to
the libvirt instance on the host machine to deploy workers.

In order for this to work, you'll need to enable TCP connections for libvirt.
In order for this to work, you'll need to enable unauthenticated TCP
connections for libvirt.

#### Configure libvirtd.conf
To do this, first modify your `/etc/libvirt/libvirtd.conf` and set the
following:
```
listen_tls = 0
listen_tcp = 1
auth_tcp="none"
tcp_port = "16509"
**NOTE:** The following configuration disables all encryption and authentication
options in libvirtd and causes it to listen on all network interfaces and IP
addresses. **A connection to this privileged libvirtd gives the client
privileges equivalent to those of a root shell.** This configuration has a
security impact on a par with running a telnet server with no root password set.
It is critical to follow the steps below to **configure the firewall to prevent
access to libvirt from other hosts on the LAN/WAN**.

#### For systemd activated libvirt

This applies only if the libvirt daemon is configured to use socket activation.
This is currently the case on Fedora 31 (and later) and Arch Linux.

First, you need to start the libvirtd TCP socket, which is managed by systemd:

```sh
sudo systemctl start libvirtd-tcp.socket
```

On Fedora 31, you also need to enable and start the libvirtd TCP
socket, which is managed by systemd:
To make this change persistent across reboots you can optionally enable it:

```sh
sudo systemctl enable libvirtd-tcp.socket
sudo systemctl start libvirtd-tcp.socket
```

after which you need to restart libvirtd.
Contributor

I believe this still applies?

Then to enable TCP access to libvirtd, modify `/etc/libvirt/libvirtd.conf` and
set the following:

**NOTE:** that the above configuration disables all encryption and
authentication options in libvirtd and causes it to listen on all
network interfaces and IP addresses. **A connection to this privileged
libvirtd gives the client privileges equivalent to those of a root
shell.** This configuration has a security impact on a par with
running a telnet server with no root password set. It is critical
to follow the steps below to **configure the firewall to prevent
access to libvirt from other hosts on the LAN/WAN**.
```
auth_tcp = "none"
```

Then restart libvirt:

#### Configure qemu.conf
```sh
sudo systemctl restart libvirtd
```

On Debian/Ubuntu you might need to configure the security driver for qemu.
The installer uses the Terraform libvirt provider, which has a known issue that might cause
unexpected `Could not open '/var/lib/libvirt/images/<FILE_NAME>': Permission denied`
errors. Double check that `security_driver = "none"` line is present in
`/etc/libvirt/qemu.conf` and not commented out.
#### For permanently running libvirt daemon

#### Configure the service runner to pass `--listen` to libvirtd
This applies only if the libvirt daemon is started through `libvirtd.service`
alone, without systemd socket activation (through `libvirtd.socket` and similar
systemd units).

**NOTE:** if the installation of libvirt included support for socket
activation via the `libvirtd-tcp.socket` systemd unit, the `--listen`
argument should not be added and thus this step can be skipped.

If socket activation is not available, you'll have to pass an additional
command-line argument to libvirtd. On Red Hat based distros, modify
`/etc/sysconfig/libvirtd` and set:
For RHEL/CentOS, make sure that the following is set in
`/etc/sysconfig/libvirtd`:

```
LIBVIRTD_ARGS="--listen"
```

On Debian based distros, modify `/etc/default/libvirtd` and set:
For Debian based distros, make sure that the following is set in
`/etc/default/libvirtd`:

```
libvirtd_opts="--listen"
```

Next, restart libvirt: `systemctl restart libvirtd`
Contributor

In this #### Configure the service runner to pass --listen to libvirtd section, I think the previous wording was better.

Then to enable TCP access to libvirtd, modify `/etc/libvirt/libvirtd.conf` and
set the following:

Contributor

Why is this coming so late? I think this is always needed, and must be done in all cases, even in the socket activated case. From looking at this diff, it seems it's only in the "#### Configure the service runner to pass --listen to libvirtd" section

Contributor

I think I would keep most of the previous order and not move most of the text. Splitting the libvirt-tcp.socket discussion from its paragraph could be useful indeed, and being more explicit about what enable does is good too.

Contributor

@cfergeau cfergeau Jun 9, 2020

Ah ok, most of the libvirtd.conf changes are not needed with TCP socket activation. `auth_tcp = none` is needed:

$ LC_ALL=C virsh -c qemu+tcp://localhost/system
error: failed to connect to the hypervisor
error: l'authentification a échoué : échec de l'authentification

```
listen_tls = 0
listen_tcp = 1
auth_tcp = "none"
tcp_port = "16509"
```
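
One way to confirm these settings are present and not commented out is sketched below. The helper names `conf_has` and `check_libvirtd_conf` are hypothetical, and the check assumes the plain `key = value` layout shown above; it covers the non-socket-activated case, where all four settings apply.

```sh
# conf_has FILE KEY VALUE: succeed if FILE has an uncommented line assigning
# VALUE to KEY, allowing spaces around "=" and optional double quotes.
conf_has() {
    grep -Eq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*\"?$3\"?[[:space:]]*$" "$1"
}

# check_libvirtd_conf [FILE]: print each of the four settings that is missing
# and return non-zero if any is missing.
check_libvirtd_conf() {
    conf="${1:-/etc/libvirt/libvirtd.conf}"
    missing=0
    for pair in listen_tls=0 listen_tcp=1 auth_tcp=none tcp_port=16509; do
        key="${pair%%=*}"
        value="${pair#*=}"
        if ! conf_has "$conf" "$key" "$value"; then
            echo "not set: $key = $value"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_libvirtd_conf` without arguments inspects `/etc/libvirt/libvirtd.conf`; no output means all four settings were found.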
Contributor

I'm still a bit dubious with the splitting of the libvirtd.conf instructions, even if they are ignored, it's not harmful to set them in socket-activation setups

Member Author

It's not harmful but it's also unnecessary and might trigger questions. I think we need to have two clearly separated cases to avoid confusion and keeping each case to the point helps with that.


Then restart libvirt:

```sh
sudo systemctl restart libvirtd
```
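
After the restart, a quick connection test can show whether unauthenticated TCP access works. The sketch below assumes `virsh` is installed on the host; the function names are illustrative only, and the URI form matches the `qemu+tcp://.../system` URIs used elsewhere in this document.

```sh
# libvirt_tcp_uri [HOST]: print the qemu+tcp URI of the system daemon on HOST
# (default: localhost).
libvirt_tcp_uri() {
    printf 'qemu+tcp://%s/system\n' "${1:-localhost}"
}

# check_libvirt_tcp [HOST]: try to connect over TCP and report the result.
check_libvirt_tcp() {
    uri="$(libvirt_tcp_uri "${1:-}")"
    if virsh -c "$uri" version >/dev/null 2>&1; then
        echo "connected: $uri"
    else
        echo "failed: $uri" >&2
        return 1
    fi
}
```

An authentication error at this point usually means `auth_tcp = "none"` has not taken effect yet.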

#### Configure qemu.conf

On Debian/Ubuntu you might need to configure the security driver for qemu.
The installer uses the Terraform libvirt provider, which has a known issue that might cause
unexpected `Could not open '/var/lib/libvirt/images/<FILE_NAME>': Permission denied`
errors. Double check that `security_driver = "none"` line is present in
`/etc/libvirt/qemu.conf` and not commented out.

#### Firewall

Finally, if you have a firewall, you may have to allow connections to the
libvirt daemon from the IP range used by your cluster nodes.

@@ -166,34 +194,43 @@ iptables -I INPUT -p tcp -s 192.168.126.0/24 -d 192.168.122.1 --dport 16509 -j A

#### Firewalld

If using `firewalld`, the specifics will depend on how your distribution setup the
various zones.
If using `firewalld`, the specifics will depend on how your distribution has set
up the various zones. The following instructions should work as is for Fedora,
CentOS, RHEL and Arch Linux.
Contributor

CentOS8/RHEL8 ? Or even EL7 distros?

Member Author

I think the "libvirt version less than 5.1" covers EL7. Maybe the phrasing here is misleading?

Contributor

Ah yes, you differentiate between the misc distro versions through libvirt version, which makes sense.


On Fedora Workstation, as we don't want to expose the libvirt port externally,
we'll need to actively block it. We then use the preexisting `dmz` zone for the
traffic between VMs.
First, as we don't want to expose the libvirt port externally, we will need to
actively block it:

```sh
sudo firewall-cmd --add-rich-rule 'rule service name="libvirt" reject'
sudo firewall-cmd --zone=dmz --change-interface=virbr0
sudo firewall-cmd --zone=dmz --change-interface=tt0
sudo firewall-cmd --zone=dmz --add-service=libvirt
```

On RHEL8, the bridges used by the VMs are already isolated in their own zones,
so we only need to allow traffic on the libvirt port:
For systems with libvirt version 5.1.0 and later, libvirt will set new bridged
network interfaces in the `libvirt` zone. We thus need to allow `libvirt`
traffic from the VMs to reach the host:

```sh
sudo firewall-cmd --zone=libvirt --add-service=libvirt
```

For systems with an older libvirt, we will move the new bridge interface to a
dedicated network zone and enable incoming libvirt, DNS & DHCP traffic:

```sh
sudo firewall-cmd --zone=dmz --change-interface=tt0
sudo firewall-cmd --zone=dmz --add-service=libvirt
sudo firewall-cmd --zone=dmz --add-service=dns
sudo firewall-cmd --zone=dmz --add-service=dhcp
Contributor

Why the change from the previous rules?

sudo firewall-cmd --zone=dmz --change-interface=virbr0
sudo firewall-cmd --zone=dmz --change-interface=tt0
sudo firewall-cmd --zone=dmz --add-service=libvirt

I assume the rules in this PR are enough to allow connections from the cluster VMs to qemu+tcp://192.168.122.1/system?

Member Author

I have not re-checked this but by default the dmz zone does not allow DNS/DHCP access and the nodes in the cluster need to get DHCP & DNS from the host dnsmasq daemon started by libvirt.

Member Author

The virbr0 change is never required as far as I understand as nodes traffic will get routed through the tt0 bridge.

Contributor

> I have not re-checked this but by default the dmz zone does not allow DNS/DHCP access and the nodes in the cluster need to get DHCP & DNS from the host dnsmasq daemon started by libvirt.

Yep, makes sense, dunno why I did not need that when I tried this :)

> The virbr0 change is never required as far as I understand as nodes traffic will get routed through the tt0 bridge.

I ran some tests, and seems to be fine indeed. Not quite sure why the VMs on the tt0 bridge can reach virbr0 (we need to be able to reach qemu+tcp://192.168.122.1/system from the guest OS).

```

NOTE: When the firewall rules are no longer needed, `sudo firewall-cmd --reload`
will remove the changes made as they were not permanently added. For persistence,
add `--permanent` to the `firewall-cmd` commands and run them a second time.
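
The version split described above can be sketched as a small helper that prints (but does not run) the matching commands. This is an illustration only: `version_ge` and `firewalld_commands` are hypothetical names, the comparison relies on GNU `sort -V`, and the bridge name defaults to the `tt0` used in this document. The rich rule that blocks external access applies in both cases and is not repeated here.

```sh
# version_ge A B: succeed if version A is greater than or equal to version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# firewalld_commands VERSION [BRIDGE]: print the firewall-cmd commands from
# this section that match the given libvirt version.
firewalld_commands() {
    bridge="${2:-tt0}"
    if version_ge "$1" 5.1.0; then
        echo "firewall-cmd --zone=libvirt --add-service=libvirt"
    else
        echo "firewall-cmd --zone=dmz --change-interface=$bridge"
        for svc in libvirt dns dhcp; do
            echo "firewall-cmd --zone=dmz --add-service=$svc"
        done
    fi
}
```

For example, `firewalld_commands "$(virsh --version)"` prints the commands for the installed libvirt; review them before running each one with `sudo`.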

### Set up NetworkManager DNS overlay

This step allows installer and users to resolve cluster-internal hostnames from your host.

1. Tell NetworkManager to use `dnsmasq`:

```sh
@@ -207,8 +244,8 @@ This step allows installer and users to resolve cluster-internal hostnames from
```sh
echo server=/tt.testing/192.168.126.1 | sudo tee /etc/NetworkManager/dnsmasq.d/openshift.conf
```
3. Reload NetworkManager to pick up the `dns` configuration change: `sudo systemctl reload NetworkManager`

3. Reload NetworkManager to pick up the `dns` configuration change: `sudo systemctl reload NetworkManager`
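
If you use a base domain other than `tt.testing`, the dnsmasq directive from step 2 can be generated rather than typed by hand. This is a minimal sketch; the function name `dnsmasq_server_line` is made up for illustration.

```sh
# dnsmasq_server_line DOMAIN DNS_IP: print the dnsmasq directive that forwards
# lookups for DOMAIN to the DNS server at DNS_IP.
dnsmasq_server_line() {
    printf 'server=/%s/%s\n' "$1" "$2"
}
```

For example, `dnsmasq_server_line tt.testing 192.168.126.1 | sudo tee /etc/NetworkManager/dnsmasq.d/openshift.conf` writes the same file as step 2.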

## Build the installer

@@ -286,41 +323,49 @@ kubectl get --all-namespaces pods
## FAQ

### Libvirt vs. AWS

1. There isn't a load balancer on libvirt.

## Troubleshooting

If following the above steps hasn't quite worked, please review this section for well known issues.

### Console doesn't come up

With libvirt there is no wildcard DNS resolution, and the console depends on the route created by the auth operator ([Issue #1007](https://github.com/openshift/installer/issues/1007)).
To make it work, first create the manifests and edit the `domain` in the ingress config before creating the cluster.

- Add another domain entry to the `openshift.conf` file used by dnsmasq.
Here `tt.testing` is the domain chosen when running the installer,
and the IP in the `address` entry belongs to one of the worker nodes.

```console
$ cat /etc/NetworkManager/dnsmasq.d/openshift.conf
server=/tt.testing/192.168.126.1
address=/.apps.tt.testing/192.168.126.51
```

- Make sure you restart the NetworkManager after change in the openshift.conf
- Make sure you restart the NetworkManager after change in `openshift.conf`:

```console
$ sudo systemctl restart NetworkManager
```

- Create the manifests
- Create the manifests:

```console
$ openshift-install --dir $INSTALL_DIR create manifests
```

- Domain entry in cluster-ingress-02-config.yml file should not contain cluster name
- Domain entry in cluster-ingress-02-config.yml file should not contain cluster name:

```console
# Assuming `test1` as cluster name
$ sed -i 's/test1\.//' $INSTALL_DIR/manifests/cluster-ingress-02-config.yml
```

- Start the installer to create the cluster
- Start the installer to create the cluster:

```console
$ openshift-install --dir $INSTALL_DIR create cluster
```
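
Before editing the manifest in place, it can help to preview what removing the cluster name does to the domain. The sketch below is an assumption-laden illustration: `strip_cluster_name` is a hypothetical helper, it expects a cluster name made of letters, digits and hyphens, and the sample domain `apps.test1.tt.testing` is inferred from the example values in this document rather than copied from a real manifest.

```sh
# strip_cluster_name CLUSTER: read text on stdin and remove the first
# "CLUSTER." on each line, matching the trailing dot literally.
strip_cluster_name() {
    sed "s/$1\.//"
}
```

For example, `strip_cluster_name test1 < $INSTALL_DIR/manifests/cluster-ingress-02-config.yml` prints the edited manifest without modifying the file.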
@@ -341,6 +386,7 @@ FATA[0019] failed to run Terraform: exit status 1
it is likely that your install configuration contains three slashes after the protocol (e.g. `qemu+tcp:///...`), when it should only be two.
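
A quick sanity check for the TCP URI might look like the sketch below. The function name `check_libvirt_uri` is hypothetical, and the check is only meant for the remote `qemu+tcp://HOST/system` form used in this document.

```sh
# check_libvirt_uri URI: succeed if URI has exactly two slashes after the
# scheme followed by a non-empty host, e.g. qemu+tcp://192.168.122.1/system.
check_libvirt_uri() {
    case "$1" in
        *:///*) echo "bad: three slashes after the scheme in $1" >&2; return 1 ;;
        *://?*/*) echo "ok: $1" ;;
        *) echo "bad: unrecognized URI $1" >&2; return 1 ;;
    esac
}
```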

### Random domain creation errors due to libvirt race condition

Depending on your libvirt version you might encounter [a race condition][bugzilla_libvirt_race] leading to an error similar to:

```
@@ -349,9 +395,11 @@ Depending on your libvirt version you might encounter [a race condition][bugzill
This is also being [tracked on the libvirt-terraform-provider][tfprovider_libvirt_race] but is likely not fixable on the client side, which is why you should upgrade libvirt to >=4.5 or a patched version, depending on your environment.

### MacOS support currently broken

* Support for libvirt on Mac OS [is currently broken and being worked on][brokenmacosissue201].

### Error with firewall initialization on Arch Linux

If you're on Arch Linux and get an error similar to

```
@@ -368,6 +416,7 @@ error: internal error: Failed to initialize a valid firewall backend
please check out [this thread on superuser][arch_firewall_superuser].

### GitHub Issue Tracker

You might find other reports of your problem in the [Issues tab for this repository][issues_libvirt] where we ask you to provide any additional information.
If your issue has not been reported yet, please open a new one.
