
Missing tcg support #8195

Closed
tomkukral opened this issue Jul 27, 2022 · 46 comments · Fixed by #8433

@tomkukral

tomkukral commented Jul 27, 2022

What happened:
I was trying to start kubevirt on my AWS instance.

What you expected to happen:
I was expecting virt-handler to start on my AWS instance.

How to reproduce it (as minimally and precisely as possible):
Try to start kubevirt on AWS.

Additional context:

The virt-launcher container in the virt-handler pod is failing while detecting emulator capabilities; the TCG libraries are probably missing.

kubectl -n kubevirt logs -f virt-handler-27s6r -c virt-launcher
error: failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: Could not access KVM kernel module: No such file or directory
qemu-kvm: failed to initialize kvm: No such file or directory
qemu-kvm: falling back to tcg
**
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)

Environment:

  • KubeVirt version (use virtctl version): v0.54.0
  • Kubernetes version (use kubectl version): v1.21.7
  • VM or VMI specifications: ami-05e5abbfdd4424640
  • Cloud provider or hardware configuration: AWS EC2 instance
  • OS (e.g. from /etc/os-release):
CentOS Linux release 7.7.1908 (Core)
Derived from Red Hat Enterprise Linux 7.7 (Source)
NAME="CentOS Linux"
VERSION="7.2003.13 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7.2003.13 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
OSTREE_VERSION=7.2003.13
CentOS Linux release 7.7.1908 (Core)
CentOS Linux release 7.7.1908 (Core)
cpe:/o:centos:centos:7

- Kernel (e.g. `uname -a`): `Linux ip-192-168-1-33.eu-central-1.compute.internal 4.18.0-147.5.1.ves4.el7.x86_64 #1 SMP Mon Mar 16 08:47:16 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux`
- Install tools: N/A
- Others: N/A


I'll be grateful for any suggestion.
@vasiliy-ul
Contributor

I think you need to enable emulation in the KubeVirt CR config: https://github.com/kubevirt/kubevirt/blob/main/docs/software-emulation.md. Run:

kubectl -n kubevirt edit kubevirt kubevirt

and try to add

spec:
  configuration:
    developerConfiguration:
      useEmulation: true
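Equivalently, the setting can be applied non-interactively (a sketch; it assumes the default CR name `kubevirt` in the `kubevirt` namespace, as in a standard deployment):

```shell
# Merge-patch the KubeVirt CR to enable software emulation.
kubectl -n kubevirt patch kubevirt kubevirt --type merge \
  -p '{"spec":{"configuration":{"developerConfiguration":{"useEmulation":true}}}}'
```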

@Omar007
Contributor

Omar007 commented Jul 27, 2022

Hitting the same on a bare metal 1.24.2 kubeadm cluster (aside from the KVM initialization error):

error: failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: **
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)

It seems it may need these components even if it's not actively going to use them?

Been a while since I've actively made use of this, but on the same physical systems (just older versions of the kernel, cluster, and KubeVirt) and with the same deployment configuration it worked without issues before.
I think the QEMU update in 0.53.0 is relevant, but I haven't had time to try older KubeVirt versions.

EDIT: Yup it comes online just fine with 0.52.0

@tomkukral
Author

I think you need to enable emulation in the KubeVirt CR config: https://github.com/kubevirt/kubevirt/blob/main/docs/software-emulation.md. Run:

kubectl -n kubevirt edit kubevirt kubevirt

and try to add

spec:
  configuration:
    developerConfiguration:
      useEmulation: true

Yes, emulation is enabled on my cluster

k -n kubevirt get kubevirt kubevirt -o yaml
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  annotations:
    app.kubernetes.io/name: kubevirt
    app.kubernetes.io/version: 20220725-2857
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1alpha3
  creationTimestamp: "2022-07-25T09:55:49Z"
  finalizers:
  - foregroundDeleteKubeVirt
  generation: 7
  name: kubevirt
  namespace: kubevirt
  resourceVersion: "4406"
  selfLink: /apis/kubevirt.io/v1/namespaces/kubevirt/kubevirts/kubevirt
  uid: 5dcfa293-3f57-43d4-98af-5eb1aa8b666d
spec:
  certificateRotateStrategy: {}
  configuration:
    developerConfiguration:
      useEmulation: true

@vasiliy-ul
Contributor

vasiliy-ul commented Jul 28, 2022

Yeah, that now looks more like an issue with qemu 'modularization'. I guess some libraries (i.e. for tcg) are now packaged in a separate RPM which is not pulled as dependency into the virt-launcher container. Therefore the dynamic loading of tcg fails. Just a thought...

Ping @andreabolognani , @rmohr , WDYT?

... on the other hand, emulation mode should be tested in CI. therefore not sure...

@andreabolognani
Contributor

Yeah, that now looks more like an issue with qemu 'modularization'. I guess some libraries (i.e. for tcg) are now packaged in a separate RPM which is not pulled as dependency into the virt-launcher container. Therefore the dynamic loading of tcg fails. Just a thought...

Ping @andreabolognani , @rmohr , WDYT?

... on the other hand, emulation mode should be tested in CI. therefore not sure...

@vasiliy-ul thanks a lot for the heads up!

I don't think the issue is related to QEMU modularization, as TCG support is part of the qemu-kvm-core package which we include in the image.

From inside a virt-launcher pod (HEAD points to f77d505 here):

sh-4.4# ls -l /usr/lib64/qemu-kvm/accel-tcg-x86_64.so
-rwxr-xr-x. 1 root root 24832 Jan  1  1970 /usr/lib64/qemu-kvm/accel-tcg-x86_64.so
sh-4.4# /usr/libexec/qemu-kvm -M q35 -accel tcg
VNC server running on ::1:5900

So TCG support is present and appears to be working.

@tomkukral are you running an upstream build of KubeVirt or a downstream one? If the latter, there might be some downstream packaging decision affecting the behavior.

@tomkukral
Author


@tomkukral are you running an upstream build of KubeVirt or a downstream one? If the latter, there might be some downstream packaging decision affecting the behavior.

I'm running an upstream build and using the image quay.io/kubevirt/virt-launcher:v0.54.0. Do you want me to run some tests on my site?

Everything works on physical HW; it is broken only on AWS.

@andreabolognani
Contributor

I'm running an upstream build and using the image quay.io/kubevirt/virt-launcher:v0.54.0.

Okay, that should rule out downstream-specific issues.

Do you want me to run some tests on my site?

Everything works on physical HW; it is broken only on AWS.

More information would be excellent, thanks!

You could start by verifying that a minimal VM (such as the one defined in examples/vmi-nocloud.yml) triggers the issue.

Then you could collect the debug logs produced by adding

metadata:
  labels:
    debugLogs: "true"

to the VMI definition both on physical hardware and AWS. Information about the specific type of AWS instance could be useful as well.

@Omar007 mentioned that v0.52.0 works on AWS, so if either one of you could provide the logs for both a successful run on v0.52.0 and a failed run on v0.54.0 that would be great.

I'll try to ping a few QEMU developers to see whether the error message rings any bell for them.
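A minimal sketch for collecting those logs (the `vm.kubevirt.io/name` label appears on launcher pods; the VMI name `vmi-nocloud` is from the example mentioned above, and the `default` namespace is an assumption):

```shell
# Find the launcher pod for the test VMI and capture its compute container logs.
pod=$(kubectl -n default get pods -l vm.kubevirt.io/name=vmi-nocloud -o name | head -n1)
kubectl -n default logs "$pod" -c compute > vmi-nocloud-debug.log
```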

@tomkukral
Author

I'm sorry for the late response ... I'll bootstrap KubeVirt this week and send debug logs.

@jkurek1

jkurek1 commented Aug 24, 2022

I have the same error. I tested KubeVirt on old nodes and everything worked; I switched to the new nodes and there are errors. The same Kubernetes cluster, just a different node.

error: 2022-08-24T23:12:59.210342512+02:00 failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: **
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)
2022-08-24T23:12:59.210342512+02:00

@hw-claudio

hw-claudio commented Aug 31, 2022

Hi, is this something you can reproduce with QEMU running outside of KubeVirt?
I could try to reproduce it, but it would be easier for me to do so without KubeVirt in the mix.

Generally if ops is NULL at that point, it would mean that the accelerator has not provided its interface to register to QEMU.

Something is going wrong with the initialization order, or the TCG .so module is not registering/working correctly. There might be a difference between having the .so module and having the code built into the qemu binary.

What is the exact version of QEMU, and can you pinpoint roughly when it started failing?

Btw, I see that you have /usr/lib64/qemu-kvm/accel-tcg-x86_64.so. Have you tried configuring QEMU with TCG built in instead of as a module?

Ciao,

Claudio

@hw-claudio

@philmd

@hw-claudio

I am not even sure it is possible to explicitly build TCG as built-in anymore if --enable-modules is true. IIRC this was finalized in qemu-6.1 (Gerd, Paolo) as RH needed it quickly, but in my view the work on TCG modularization was never concluded; there is still the whole problem of the unclear distinction between tcg_available() and tcg_enabled(), which as far as I know was never brought to a conclusion.

@tomkukral
Author

I have created this vmi:

---
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  labels:
    special: vmi-nocloud
    debugLogs: "true"
  name: vmi-nocloud
spec:
  domain:
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
      - disk:
          bus: virtio
        name: emptydisk
    resources:
      requests:
        memory: 128Mi
  terminationGracePeriodSeconds: 0
  volumes:
  - containerDisk:
      image: registry:5000/kubevirt/cirros-container-disk-demo:devel
    name: containerdisk
  - cloudInitNoCloud:
      userData: |
        #!/bin/sh
        echo 'printed from cloud-init userdata'
    name: cloudinitdisk
  - emptyDisk:
      capacity: 2Gi
    name: emptydisk

but the pod is not starting

ip-192-168-32-141 tom-2022-08-31-140632 ~/debug k -n ves-system describe po virt-launcher-vmi-nocloud-dgff4
Name:           virt-launcher-vmi-nocloud-dgff4
Namespace:      ves-system
Priority:       0
Node:           <none>
Labels:         debugLogs=true
                kubevirt.io=virt-launcher
                kubevirt.io/created-by=a90df346-0466-4412-a06e-d2514cdb373d
                special=vmi-nocloud
                vm.kubevirt.io/name=vmi-nocloud
Annotations:    kubectl.kubernetes.io/default-container: compute
                kubevirt.io/domain: vmi-nocloud
                kubevirt.io/migrationTransportUnix: true
                post.hook.backup.velero.io/command: ["/usr/bin/virt-freezer", "--unfreeze", "--name", "vmi-nocloud", "--namespace", "ves-system"]
                post.hook.backup.velero.io/container: compute
                pre.hook.backup.velero.io/command: ["/usr/bin/virt-freezer", "--freeze", "--name", "vmi-nocloud", "--namespace", "ves-system"]
                pre.hook.backup.velero.io/container: compute
                ves.io/pod-id: bdf6c786-b950-4b76-ac80-6b3675c17eea
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  VirtualMachineInstance/vmi-nocloud
Init Containers:
  container-disk-binary:
    Image:      gcr.io/volterraio/kubevirt-launcher@sha256:c9f1b45c79b4c1fd0e2041d9997a838c85bf9418c06116531e00733aa6c06230
    Port:       <none>
    Host Port:  <none>
    Command:
      /usr/bin/cp
      /usr/bin/container-disk
      /init/usr/bin/container-disk
    Limits:
      cpu:     100m
      memory:  40M
    Requests:
      cpu:        10m
      memory:     1M
    Environment:  <none>
    Mounts:
      /init/usr/bin from virt-bin-share-dir (rw)
  volumecontainerdisk-init:
    Image:      registry:5000/kubevirt/cirros-container-disk-demo:devel
    Port:       <none>
    Host Port:  <none>
    Command:
      /usr/bin/container-disk
    Args:
      --no-op
    Limits:
      cpu:     100m
      memory:  40M
    Requests:
      cpu:                10m
      ephemeral-storage:  50M
      memory:             1M
    Environment:          <none>
    Mounts:
      /usr/bin from virt-bin-share-dir (rw)
      /var/run/kubevirt-ephemeral-disks/container-disk-data/a90df346-0466-4412-a06e-d2514cdb373d from container-disks (rw)
Containers:
  compute:
    Image:      gcr.io/volterraio/kubevirt-launcher@sha256:c9f1b45c79b4c1fd0e2041d9997a838c85bf9418c06116531e00733aa6c06230
    Port:       <none>
    Host Port:  <none>
    Command:
      /usr/bin/virt-launcher-monitor
      --qemu-timeout
      258s
      --name
      vmi-nocloud
      --uid
      a90df346-0466-4412-a06e-d2514cdb373d
      --namespace
      ves-system
      --kubevirt-share-dir
      /var/run/kubevirt
      --ephemeral-disk-dir
      /var/run/kubevirt-ephemeral-disks
      --container-disk-dir
      /var/run/kubevirt/container-disks
      --grace-period-seconds
      15
      --hook-sidecars
      0
      --ovmf-path
      /usr/share/OVMF
      --allow-emulation
    Limits:
      devices.kubevirt.io/tun:  1
    Requests:
      cpu:                      100m
      devices.kubevirt.io/tun:  1
      ephemeral-storage:        50M
      memory:                   348416Ki
    Environment:
      LIBVIRT_DEBUG_LOGS:  1
      POD_NAME:            virt-launcher-vmi-nocloud-dgff4 (v1:metadata.name)
    Mounts:
      /var/run/kubevirt from public (rw)
      /var/run/kubevirt-ephemeral-disks from ephemeral-disks (rw)
      /var/run/kubevirt-private from private (rw)
      /var/run/kubevirt/container-disks from container-disks (rw)
      /var/run/kubevirt/hotplug-disks from hotplug-disks (rw)
      /var/run/kubevirt/sockets from sockets (rw)
      /var/run/libvirt from libvirt-runtime (rw)
      /volterra/secrets/identity from certs-volume (rw)
  volumecontainerdisk:
    Image:      registry:5000/kubevirt/cirros-container-disk-demo:devel
    Port:       <none>
    Host Port:  <none>
    Command:
      /usr/bin/container-disk
    Args:
      --copy-path
      /var/run/kubevirt-ephemeral-disks/container-disk-data/a90df346-0466-4412-a06e-d2514cdb373d/disk_0
    Limits:
      cpu:     100m
      memory:  40M
    Requests:
      cpu:                10m
      ephemeral-storage:  50M
      memory:             1M
    Environment:          <none>
    Mounts:
      /usr/bin from virt-bin-share-dir (rw)
      /var/run/kubevirt-ephemeral-disks/container-disk-data/a90df346-0466-4412-a06e-d2514cdb373d from container-disks (rw)
      /volterra/secrets/identity from certs-volume (rw)
  wingman:
    Image:      gcr.io/volterraio/wingman@sha256:e587af1d1f4394a456361fda3ab0be16671b81d2900fb2db402ae3d12784e164
    Port:       <none>
    Host Port:  <none>
    Command:
      wingmand
      --config
      /volterra/config/wingman.yml
    Limits:
      cpu:     50m
      memory:  100Mi
    Requests:
      cpu:     5m
      memory:  70Mi
    Environment:
      SECURITY_DOC:  CAESzA...REDACTED
      POD_IP:         (v1:status.podIP)
      POD_NAME:      virt-launcher-vmi-nocloud-dgff4 (v1:metadata.name)
    Mounts:
      /volterra/config/wingman.yml from wingman-config (rw,path="wingman.yml")
      /volterra/secrets/identity from certs-volume (rw)
Readiness Gates:
  Type                                   Status
  kubevirt.io/virtual-machine-unpaused   True 
Conditions:
  Type                                   Status
  PodScheduled                           False 
  kubevirt.io/virtual-machine-unpaused   True 
Volumes:
  private:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  public:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  sockets:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  virt-bin-share-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  libvirt-runtime:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  ephemeral-disks:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  container-disks:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  hotplug-disks:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  certs-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  5M
  wingman-config:
    Type:        ConfigMap (a volume populated by a ConfigMap)
    Name:        wingman-config
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubevirt.io/schedulable=true
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  49s (x7 over 5m55s)  default-scheduler  0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector.

The node doesn't have any kubevirt-related labels (the pod requires kubevirt.io/schedulable: "true") because virt-launcher is failing and doesn't set the node labels.
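To check whether the node ever received the label (plain kubectl; the grep pattern matches the node selector from the pod spec above):

```shell
# List nodes with their labels and look for the kubevirt scheduling label.
kubectl get nodes --show-labels | grep kubevirt.io/schedulable \
  || echo "no node carries kubevirt.io/schedulable"
```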

ip-192-168-32-141 tom-2022-08-31-140632 ~/debug k -n kubevirt logs virt-handler-rqfkw -c virt-launcher
error: failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: Could not access KVM kernel module: No such file or directory
qemu-kvm: failed to initialize kvm: No such file or directory
qemu-kvm: falling back to tcg
**
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)

@hw-claudio

Right, now the challenge is how to debug QEMU inside KubeVirt. Is there a recommended way to do that in KubeVirt?
Otherwise your best bet would be to reproduce it without KubeVirt involved.

@tomkukral
Author

tomkukral commented Sep 1, 2022

I'm able to reproduce it by running node-labeller.sh directly in a Docker container:

docker run --rm -ti --entrypoint /bin/sh --privileged -v $(mktemp -d):/var/lib/kubevirt-node-labeller gcr.io/volterraio/kubevirt-launcher@sha256:c9f1b45c79b4c1fd0e2041d9997a838c85bf9418c06116531e00733aa6c06230  -c node-labeller.sh
Authorization not available. Check if polkit service is running or see debug message for more information.
error: failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: Could not access KVM kernel module: No such file or directory
qemu-kvm: failed to initialize kvm: No such file or directory
qemu-kvm: falling back to tcg
**
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)

Btw, the image is the same as quay.io/kubevirt/virt-launcher:v0.54.0; I just needed to re-push it to our registry due to limited network access.

@tomkukral
Author

I can provide access to this lab; just ping me on the Kubernetes Slack (same username as on GitHub).

@tomkukral
Author

It could be AWS-specific, because the same deployment works fine on GCP.

@tomkukral
Author

Comparing kubevirt on AWS and GCP:

  • GCP doesn't have the kvm module loaded; I tried to rmmod it on AWS but it didn't help
  • the labeller has different output
# AWS
ip-192-168-32-141 tom-2022-08-31-140632 ~ docker run --rm -ti --entrypoint /bin/sh --privileged -v $(mktemp -d):/var/lib/kubevirt-node-labeller gcr.io/volterraio/kubevirt-launcher@sha256:c9f1b45c79b4c1fd0e2041d9997a838c85bf9418c06116531e00733aa6c06230  -c node-labeller.sh
Authorization not available. Check if polkit service is running or see debug message for more information.
error: failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: Could not access KVM kernel module: No such file or directory
qemu-kvm: failed to initialize kvm: No such file or directory
qemu-kvm: falling back to tcg
**
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)

# GCP
kubevirt-staging-test01 kubevirt-test01 ~ docker run --rm -ti --entrypoint /bin/sh --privileged -v $(mktemp -d):/var/lib/kubevirt-node-labeller gcr.io/volterraio/kubevirt-launcher@sha256:c9f1b45c79b4c1fd0e2041d9997a838c85bf9418c06116531e00733aa6c06230  -c node-labeller.sh

Authorization not available. Check if polkit service is running or see debug message for more information.
error: failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: Could not access KVM kernel module: No such file or directory
qemu-kvm: failed to initialize kvm: No such file or directory
qemu-kvm: falling back to tcg
**
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)

Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.

so it isn't failing on GCP (though the tcg error is still there). KubeVirt labels are present on the GCP node.

  • systemd is trying to detect kvm on aws but not on gcp
# AWS:
ip-192-168-32-141 tom-2022-08-31-140632 ~ dmesg | grep kvm
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[    0.000000] kvm-clock: cpu 0, msr 353e01001, primary cpu clock
[    0.000000] kvm-clock: using sched offset of 4644534413 cycles
[    0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.000000] kvm-stealtime: cpu 0, msr 42402c080
[    0.001000] kvm-clock: cpu 1, msr 353e01041, secondary cpu clock
[    0.005046] kvm-stealtime: cpu 1, msr 4240ac080
[    0.001000] kvm-clock: cpu 2, msr 353e01081, secondary cpu clock
[    0.006960] kvm-stealtime: cpu 2, msr 42412c080
[    0.001000] kvm-clock: cpu 3, msr 353e010c1, secondary cpu clock
[    0.008313] kvm-stealtime: cpu 3, msr 4241ac080
[    0.113042] clocksource: Switched to clocksource kvm-clock
[    1.020646] systemd[1]: Detected virtualization kvm.
[ 5374.086836] kvm: no hardware support
[ 5462.888528] kvm: no hardware support
[ 5472.877271] kvm: no hardware support
ip-192-168-32-141 tom-2022-08-31-140632 ~ uname -a
Linux ip-192-168-32-141.us-east-2.compute.internal 4.18.0-240.10.1.ves1.el7.x86_64 #1 SMP Tue Mar 30 15:02:49 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
# GCP:
kubevirt-staging-test01 kubevirt-test01 ~ dmesg | grep kvm
kubevirt-staging-test01 kubevirt-test01 ~ uname -a
Linux kubevirt-staging-test01 4.18.0-147.5.1.ves6.el7.x86_64 #1 SMP Mon Aug 31 09:14:43 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • the kernel version is a bit different; I'll try to sync the versions
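For reference, the module checks used in the comparison above can be scripted like this (standard Linux tooling; unloading requires root, and the kvm_intel/kvm_amd variant depends on the CPU vendor):

```shell
#!/bin/sh
# Check whether the kvm module is loaded, and optionally unload it.
if lsmod | grep -q '^kvm '; then
    echo "kvm module loaded"
else
    echo "kvm module not loaded"
fi
# Unload (root only); the vendor-specific module must be removed first:
# rmmod kvm_intel 2>/dev/null || rmmod kvm_amd 2>/dev/null || true
# rmmod kvm 2>/dev/null || true
```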

@vasiliy-ul
Contributor

@tomkukral, could you try to run the same stuff with docker but using another image: registry.opensuse.org/kubevirt/virt-launcher:0.55.0? If the issue is reproducible, we can probably debug it further from there.

@tomkukral
Author

registry.opensuse.org/kubevirt/virt-launcher:0.55.0

It works with this image. Let me try to upgrade KubeVirt to 0.55 and test again.

I have also discovered the GCP test lab was using a much older kernel, so I'm trying to downgrade AWS to the same version.

@hw-claudio

Interesting, this seems to be a problem only with the CentOS-based KubeVirt images? I wonder why upstream KubeVirt does not use upstream QEMU... @fabiand (ciao Fabian)

@tomkukral
Author

tomkukral commented Sep 1, 2022

Comparing the openSUSE build and the upstream one at the same version:

ip-192-168-32-141 tom-2022-08-31-140632 ~ docker run --rm -ti --entrypoint /bin/sh --privileged -v $(mktemp -d):/var/lib/kubevirt-node-labeller registry.opensuse.org/kubevirt/virt-launcher:0.55.0 -c /usr/bin/node-labeller.sh
ip-192-168-32-141 tom-2022-08-31-140632 ~ docker run --rm -ti --entrypoint /bin/sh --privileged -v $(mktemp -d):/var/lib/kubevirt-node-labeller quay.io/kubevirt/virt-launcher:v0.55.0 -c /usr/bin/node-labeller.sh
Unable to find image 'quay.io/kubevirt/virt-launcher:v0.55.0' locally
v0.55.0: Pulling from kubevirt/virt-launcher
ebec1dc3291e: Pull complete 
49701e25b80f: Pull complete 
Digest: sha256:43f223a6bf9c40cc86408d9acb49dd3bd95c87f09a120dab90f367547c31c792
Status: Downloaded newer image for quay.io/kubevirt/virt-launcher:v0.55.0
Authorization not available. Check if polkit service is running or see debug message for more information.
error: failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: Could not access KVM kernel module: No such file or directory
qemu-kvm: failed to initialize kvm: No such file or directory
qemu-kvm: falling back to tcg
**
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)

@vasiliy-ul
Contributor

Yeah, looks like something is different in the way qemu is built. And the issue seems to happen only when kvm probe fails and it falls back to tcg. At least /usr/libexec/qemu-kvm -M q35 -accel tcg appears to work.

@tomkukral
Author

/usr/libexec/qemu-kvm -M q35 -accel tcg

Yes, this works:

ip-192-168-32-141 tom-2022-08-31-140632 ~ docker run --rm -ti --entrypoint /bin/sh --privileged -v $(mktemp -d):/var/lib/kubevirt-node-labeller quay.io/kubevirt/virt-launcher:v0.54.0 
sh-4.4# /usr/libexec/qemu-kvm -M q35 -accel tcg
VNC server running on 127.0.0.1:5900

@vasiliy-ul
Contributor

Sharing some status updates: it seems that the behavior depends on the host system. I can reproduce the error when running the docker command on my laptop with a recent Tumbleweed, but it's not reproducible on CentOS 8 Stream (from make cluster-up) or on some other older distros.

@tomkukral
Author

I can reproduce the error running the upstream Docker image 0.54, but there is no error with the openSUSE container (both running on the same operating system; I used the same VM for both). Maybe it's a combination of the host OS and the container OS.

@poojaghumre

If it helps, I am able to reproduce this on a physical CentOS 7 host with the KubeVirt upstream virt-launcher v0.55.0 image. I also saw somebody else filed another issue (#8362) for Ubuntu 20.04.

Could there be an issue due to an API version mismatch between the qemu-kvm (v6.2) and virsh (QEMU API 8.0) versions?

sh-4.4# /usr/libexec/qemu-kvm --version
QEMU emulator version 6.2.0 (qemu-kvm-6.2.0-5.module_el8.6.0+1087+b42c8331)
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

sh-4.4# virsh version
Authorization not available. Check if polkit service is running or see debug message for more information.
Compiled against library: libvirt 8.0.0
Using library: libvirt 8.0.0
Using API: QEMU 8.0.0
error: failed to get the hypervisor version
error: internal error: Cannot find suitable emulator for x86_64

@vasiliy-ul
Contributor

vasiliy-ul commented Sep 2, 2022

@tomkukral, @poojaghumre, could you check on your side the permissions of the directory (using docker run ... with the upstream image)?

# ls -la /usr/lib64/qemu-kvm

After that run

# chmod +rx /usr/lib64/qemu-kvm
# /usr/bin/node-labeller.sh

@poojaghumre

@vasiliy-ul I confirmed that your suggested fix works just fine. I modified the virt-handler (v0.55.0) daemonset config as below and that worked:

initContainers:
        - args:
          - chmod +rx /usr/lib64/qemu-kvm; node-labeller.sh;
          command:
          - /bin/sh
          - -c
          image: quay.io/kubevirt/virt-launcher:v0.55.0

@poojaghumre

Permissions inside virt-launcher container when using v0.55.0 image as is:

sh-4.4# ls -la /usr/lib64/qemu-kvm
total 296
drw-------.  2 root root  4096 Sep  1 02:21 .

sh-4.4# /usr/bin/node-labeller.sh
Authorization not available. Check if polkit service is running or see debug message for more information.
error: failed to get emulator capabilities
error: internal error: Failed to start QEMU binary /usr/libexec/qemu-kvm for probing: **
ERROR:../accel/accel-softmmu.c:82:accel_init_ops_interfaces: assertion failed: (ops != NULL)

sh-4.4# chmod +rx /usr/lib64/qemu-kvm
sh-4.4# ls -la /usr/lib64/qemu-kvm
total 296
drwxr-xr-x.  2 root root  4096 Sep  1 02:21 .

sh-4.4# ./node-labeller.sh
Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.
sh-4.4# echo $?
0

@vasiliy-ul
Contributor

When querying the host capabilities (inside the node-labeller.sh script), libvirtd starts qemu as the qemu user (107:107). The wrong permissions prevent a non-root user from accessing the TCG accel plugin. There is definitely some room for improvement in the QEMU error reporting, IMHO.

Now the question is why the permissions get broken. The directory /usr/lib64/qemu-kvm comes from the base image. I do not have a reasonable explanation for why it works fine on some hosts while it fails on others. It does not seem to depend on the host OS: I checked two different machines with the same OS installed; one works fine, while the other shows this error because of the permissions.
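The failure mode can be reproduced in miniature with plain shell (a sketch; the directory is created under mktemp and the module file is an empty stand-in, but the mode bits mirror what was found in the image):

```shell
#!/bin/sh
# Sketch of the permissions problem: a module directory without r/x bits
# for group/other cannot be listed or traversed by the non-root qemu user
# (107), so the accel-tcg-*.so plugin can never be loaded.
set -e
umask 022
moddir=$(mktemp -d)/qemu-kvm
mkdir "$moddir"
: > "$moddir/accel-tcg-x86_64.so"   # stand-in for the real module

chmod 600 "$moddir"                 # drw------- , as seen in the broken image
stat -c '%a %n' "$moddir"           # 600: uid 107 can neither list nor traverse

chmod +rx "$moddir"                 # the workaround from this thread
stat -c '%a %n' "$moddir"           # 755: the qemu user can now reach the plugin
```

With the directory at 600, any access by a non-root user fails with EACCES, which matches qemu silently failing to register the TCG accelerator and then tripping the `ops != NULL` assertion.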

@vasiliy-ul
Contributor

For the sake of further investigation, @poojaghumre, @tomkukral, could you also share the output of

 docker info | grep Storage

@tomkukral
Author

I can confirm the chmod helps. Thanks a lot for your help!

docker run --rm -ti --entrypoint /bin/sh --privileged -v $(mktemp -d):/var/lib/kubevirt-node-labeller quay.io/kubevirt/virt-launcher:v0.54.0 
sh-4.4# ls -la /usr/lib64/qemu-kvm
total 296
drw-------.  2 root root  4096 Sep  1 08:33 .
dr-xr-xr-x. 29 root root 16384 Jan  1  1970 ..
-rwxr-xr-x.  1 root root 11792 Jan  1  1970 accel-qtest-x86_64.so
-rwxr-xr-x.  1 root root 24832 Jan  1  1970 accel-tcg-x86_64.so
-rwxr-xr-x.  1 root root  7568 Jan  1  1970 hw-display-virtio-gpu-gl.so
-rwxr-xr-x.  1 root root  7576 Jan  1  1970 hw-display-virtio-gpu-pci-gl.so
-rwxr-xr-x.  1 root root 12688 Jan  1  1970 hw-display-virtio-gpu-pci.so
-rwxr-xr-x.  1 root root 53792 Jan  1  1970 hw-display-virtio-gpu.so
-rwxr-xr-x.  1 root root  7568 Jan  1  1970 hw-display-virtio-vga-gl.so
-rwxr-xr-x.  1 root root 17368 Jan  1  1970 hw-display-virtio-vga.so
-rwxr-xr-x.  1 root root 47688 Jan  1  1970 hw-usb-host.so
-rwxr-xr-x.  1 root root 67584 Jan  1  1970 hw-usb-redirect.so

sh-4.4# chmod +rx /usr/lib64/qemu-kvm

sh-4.4# /usr/bin/node-labeller.sh
Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.
Authorization not available. Check if polkit service is running or see debug message for more information.

Docker is using devicemapper on this site:

ip-192-168-32-141 tom-2022-08-31-140632 ~ docker info
Client:
 Debug Mode: false

Server:
 Containers: 134
  Running: 109
  Paused: 0
  Stopped: 25
 Images: 100
 Server Version: 19.03.12
 Storage Driver: devicemapper
  Pool Name: docker-253:0-6302274-pool
  Pool Blocksize: 65.54kB
  Base Device Size: 10.74GB
  Backing Filesystem: xfs
  Udev Sync Supported: true
  Data file: /dev/loop0
  Metadata file: /dev/loop1
  Data loop file: /var/lib/docker/devicemapper/devicemapper/data
  Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
  Data Space Used: 19.14GB
  Data Space Total: 107.4GB
  Data Space Available: 16.8GB
  Metadata Space Used: 34.79MB
  Metadata Space Total: 2.147GB
  Metadata Space Available: 2.113GB
  Thin Pool Minimum Free Space: 10.74GB
  Deferred Removal Enabled: true
  Deferred Deletion Enabled: true
  Deferred Deleted Device Count: 0
  Library Version: 1.02.170-RHEL7 (2020-03-24)
 Logging Driver: json-file
 Cgroup Driver: systemd
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-240.10.1.ves1.el7.x86_64
 Operating System: CentOS Linux 7.2009.29 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.29GiB
 Name: ip-192-168-32-141.us-east-2.compute.internal
 ID: LRJZ:ZWSF:WJ2Q:5UMP:7OIL:F72F:VHZE:GNZO:PUGB:ATN6:IDUF:6R7Y
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: true

WARNING: bridge-nf-call-ip6tables is disabled
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.
WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
         Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.

It was working previously on GCP, which is using overlay2:

kubevirt-staging-test01 kubevirt-test01 ~ docker info
Client:
 Debug Mode: false

Server:
 Containers: 133
  Running: 107
  Paused: 0
  Stopped: 26
 Images: 94
 Server Version: 19.03.12
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-147.5.1.ves6.el7.x86_64
 Operating System: CentOS Linux 7.2006.3 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.66GiB
 Name: kubevirt-staging-test01
 ID: LOQW:UZR4:BLKJ:RREK:D7SV:45TX:L3QM:TGM2:54CF:I2DY:UTTS:PMZB
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: true

WARNING: bridge-nf-call-ip6tables is disabled

@vasiliy-ul
Copy link
Contributor

Thanks for sharing the info. The assumption is that the behavior varies depending on the storage driver used: the container gets correct permissions with overlay2, while I see wrong perms with btrfs. Now devicemapper is also confirmed to misbehave.

@tomkukral
Copy link
Author

I have already fixed it in my downstream build and I'm ready to provide any information to debug this further.

@vasiliy-ul
Copy link
Contributor

@tomkukral, what is your Kubernetes distro? Especially what container runtime is it based on (I assume docker is just for local testing)?

@tomkukral
Copy link
Author

@tomkukral, what is your Kubernetes distro? Especially what container runtime is it based on (I assume docker is just for local testing)?

Using docker container runtime, custom kubernetes installation.

@poojaghumre
Copy link

Here is the output of docker info storage command on my centos 7 server:

[root@kubevirt-c7 ~]# docker info | grep Storage
 Storage Driver: devicemapper

Can the permission fix be added to the node-labeller.sh script itself to unblock this issue, assuming the permissions are correct in the base image?

@vasiliy-ul
Copy link
Contributor

I am afraid that will be just a workaround if we explicitly adjust the permissions. Other directories might be affected as well. Besides, the virt-launcher base image is used to run other containers (and those do not call node-labeller.sh).

Ping @rmohr, @xpivarc, maybe you have some thoughts on that? In short, the issue is the following:

In the virt-launcher image, there is a directory /usr/lib64/qemu-kvm which contains .so files (i.e. qemu module drivers). For unknown reasons, this directory gets wrong permissions, so it is only accessible by the root user: drw-------. 2 root root. This breaks the capabilities querying in the node-labeller.sh script. According to the observations, the issue happens only when docker is used as the runtime, and it is set up to use either btrfs or devicemapper storage drivers (with overlay2 it works just fine). I would suspect a bug in the code which unpacks the tarball.

The directory /usr/lib64/qemu-kvm appears to be 'not owned' by any package from the rpmtree. I checked what bazeldnf does and was not able to find any issues there, though. It simply creates the full path with the default permissions when handling the files in that directory. The expectation is that by default it should have 0755, like most of the remaining directories.
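The suspected mechanism — a layer tarball that ships files under a path but no explicit directory entry for it, leaving the directory's mode up to whatever default the unpacker picks — can be illustrated with Python's tarfile as a stand-in. This is not the actual bazeldnf or docker code, just a sketch of the bug class; pinning the mode with an explicit directory member is exactly what the pkg_tar workaround proposed later in this thread does:

```python
import io
import tarfile

def build_layer(with_dir_entry: bool) -> bytes:
    """Build a minimal image-layer-like tarball containing one qemu module file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        if with_dir_entry:
            # An explicit directory member records mode 0755 in the archive,
            # so every unpacker materializes the directory the same way.
            d = tarfile.TarInfo("usr/lib64/qemu-kvm")
            d.type = tarfile.DIRTYPE
            d.mode = 0o755
            tar.addfile(d)
        f = tarfile.TarInfo("usr/lib64/qemu-kvm/accel-tcg-x86_64.so")
        f.mode = 0o755
        payload = b"\x7fELF"  # placeholder contents
        f.size = len(payload)
        tar.addfile(f, io.BytesIO(payload))
    return buf.getvalue()

def member_modes(blob: bytes) -> dict:
    """Map each member name to (is_directory, recorded mode)."""
    with tarfile.open(fileobj=io.BytesIO(blob)) as tar:
        return {m.name: (m.isdir(), m.mode) for m in tar.getmembers()}

# Without a directory entry, the archive says nothing about the parent dir:
# the unpacker must invent a mode when it creates the path implicitly.
assert "usr/lib64/qemu-kvm" not in member_modes(build_layer(False))
# With an explicit entry, the mode is pinned to 0755 in the archive itself.
assert member_modes(build_layer(True))["usr/lib64/qemu-kvm"] == (True, 0o755)
```

If the virt-launcher layer falls into the first case, a difference in the "invented" default between storage-driver backends would explain why overlay2 ends up with 0755 while btrfs and devicemapper do not.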

@xpivarc
Copy link
Member

xpivarc commented Sep 6, 2022

Interesting problem and great findings. Can you share the mount output on both the working and non-working setups? Also, please check the underlying filesystem layers if possible.

@vasiliy-ul
Copy link
Contributor

Yeah, I also checked the unpacked filesystem on the host, and it already has wrong perms:

 $ docker run --rm -ti --entrypoint /bin/bash quay.io/kubevirt/virt-launcher:v0.55.0
bash-4.4# mount
/dev/nvme0n1p2 on / type btrfs (rw,relatime,ssd,space_cache,subvolid=35753,subvol=/@/var/lib/docker/btrfs/subvolumes/883a8dcbd06749e0b7a3aed62b3b961d2c589078957a7138e6f2442e3de9c2b5)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup type cgroup2 (ro,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k,inode64)
/dev/nvme0n1p2 on /etc/resolv.conf type btrfs (rw,relatime,ssd,space_cache,subvolid=257,subvol=/@/var)
/dev/nvme0n1p2 on /etc/hostname type btrfs (rw,relatime,ssd,space_cache,subvolid=257,subvol=/@/var)
/dev/nvme0n1p2 on /etc/hosts type btrfs (rw,relatime,ssd,space_cache,subvolid=257,subvol=/@/var)
devpts on /dev/console type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/asound type tmpfs (ro,relatime,inode64)
tmpfs on /proc/acpi type tmpfs (ro,relatime,inode64)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/latency_stats type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/scsi type tmpfs (ro,relatime,inode64)
tmpfs on /sys/firmware type tmpfs (ro,relatime,inode64)
# ls -la /var/lib/docker/btrfs/subvolumes/883a8dcbd06749e0b7a3aed62b3b961d2c589078957a7138e6f2442e3de9c2b5/usr/lib64/qemu-kvm/
total 272
drw------- 1 root root   466 Sep  5 08:12 .
dr-xr-xr-x 1 root root 14218 Jan  1  1970 ..
-rwxr-xr-x 1 root root 11792 Jan  1  1970 accel-qtest-x86_64.so
-rwxr-xr-x 1 root root 24832 Jan  1  1970 accel-tcg-x86_64.so
-rwxr-xr-x 1 root root  7568 Jan  1  1970 hw-display-virtio-gpu-gl.so
-rwxr-xr-x 1 root root  7576 Jan  1  1970 hw-display-virtio-gpu-pci-gl.so
-rwxr-xr-x 1 root root 12688 Jan  1  1970 hw-display-virtio-gpu-pci.so
-rwxr-xr-x 1 root root 53792 Jan  1  1970 hw-display-virtio-gpu.so
-rwxr-xr-x 1 root root  7568 Jan  1  1970 hw-display-virtio-vga-gl.so
-rwxr-xr-x 1 root root 17368 Jan  1  1970 hw-display-virtio-vga.so
-rwxr-xr-x 1 root root 47688 Jan  1  1970 hw-usb-host.so
-rwxr-xr-x 1 root root 67584 Jan  1  1970 hw-usb-redirect.so

Same on the working setup with docker+overlay2:

# docker run --rm -ti --entrypoint /bin/bash quay.io/kubevirt/virt-launcher:v0.55.0
bash-4.4# mount
overlay on / type overlay (rw,relatime,lowerdir=/var/lib/docker/overlay2/l/NGSTBJZW5TKGLHETYCLMTZ47K7:/var/lib/docker/overlay2/l/7CPUWR24G3LAFSLIGXHLPHOK4T:/var/lib/docker/overlay2/l/UW2KWGUELOV2IF2YXD4AS6KFS4,upperdir=/var/lib/docker/overlay2/62915a1dc9bfcabc69ee2e5e04ff32c27cc24f75a8c497998b2b0b32a0379cac/diff,workdir=/var/lib/docker/overlay2/62915a1dc9bfcabc69ee2e5e04ff32c27cc24f75a8c497998b2b0b32a0379cac/work)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup type cgroup2 (ro,nosuid,nodev,noexec,relatime)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k,inode64)
/dev/sda3 on /etc/resolv.conf type ext4 (rw,noatime)
/dev/sda3 on /etc/hostname type ext4 (rw,noatime)
/dev/sda3 on /etc/hosts type ext4 (rw,noatime)
devpts on /dev/console type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/acpi type tmpfs (ro,relatime,inode64)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/latency_stats type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/scsi type tmpfs (ro,relatime,inode64)
tmpfs on /sys/firmware type tmpfs (ro,relatime,inode64)
# ls -la /var/lib/docker/overlay2/l/UW2KWGUELOV2IF2YXD4AS6KFS4/usr/lib64/qemu-kvm/
total 300
drwxr-xr-x  2 root root  4096 Sep  5 00:12 .
dr-xr-xr-x 29 root root 20480 Dec 31  1969 ..
-rwxr-xr-x  1 root root 11792 Dec 31  1969 accel-qtest-x86_64.so
-rwxr-xr-x  1 root root 24832 Dec 31  1969 accel-tcg-x86_64.so
-rwxr-xr-x  1 root root  7568 Dec 31  1969 hw-display-virtio-gpu-gl.so
-rwxr-xr-x  1 root root  7576 Dec 31  1969 hw-display-virtio-gpu-pci-gl.so
-rwxr-xr-x  1 root root 12688 Dec 31  1969 hw-display-virtio-gpu-pci.so
-rwxr-xr-x  1 root root 53792 Dec 31  1969 hw-display-virtio-gpu.so
-rwxr-xr-x  1 root root  7568 Dec 31  1969 hw-display-virtio-vga-gl.so
-rwxr-xr-x  1 root root 17368 Dec 31  1969 hw-display-virtio-vga.so
-rwxr-xr-x  1 root root 47688 Dec 31  1969 hw-usb-host.so
-rwxr-xr-x  1 root root 67584 Dec 31  1969 hw-usb-redirect.so

@vasiliy-ul
Copy link
Contributor

Also, one more thing to note: podman with the btrfs driver sets the permissions correctly.

@xpivarc
Copy link
Member

xpivarc commented Sep 7, 2022

I think this will turn out to be a problem specific to Docker and how it handles the layers. One last thing I would check is the layers of our images (they should be 1:1 with what you see with overlay2). I think the next step would be to check the code in Docker / file a bug there (we could also check one more runtime, e.g. CRI-O).

@vasiliy-ul
Copy link
Contributor

Meanwhile, what do you think about applying a workaround in KubeVirt? Pre-creating the directory with proper permissions seems to solve the issue:

diff --git a/cmd/virt-launcher/BUILD.bazel b/cmd/virt-launcher/BUILD.bazel
index d9cc5f252..4190794fb 100644
--- a/cmd/virt-launcher/BUILD.bazel
+++ b/cmd/virt-launcher/BUILD.bazel
@@ -154,6 +154,15 @@ pkg_tar(
     package_dir = "/etc",
 )
 
+pkg_tar(
+    name = "qemu-kvm-modules-dir-tar",
+    empty_dirs = [
+        "usr/lib64/qemu-kvm",
+    ],
+    mode = "0755",
+    owner = "0.0",
+)
+
 container_image(
     name = "version-container",
     directory = "/",
@@ -169,6 +178,7 @@ container_image(
             ":libvirt-config",
             ":passwd-tar",
             ":nsswitch-tar",
+            ":qemu-kvm-modules-dir-tar",
             "//rpm:launcherbase_x86_64",
         ],
     }),

@tomkukral
Copy link
Author

@vasiliy-ul thank you for fixing it

@vasiliy-ul
Copy link
Contributor

Well, it's more like a workaround than a fix, but hopefully it should handle this specific problem for now. Also, I raised a docker issue for that. Let's see if it gets some feedback there.
