Kernel headers missing #1889

Closed
japhar81 opened this issue Feb 15, 2024 · 1 comment

Comments

@japhar81

Describe the bug
We need to build NVIDIA drivers to enable GPU nodes, but cannot find matching kernel headers. I started by trying to install kernel-devel with rpm-ostree, but the available version doesn't match the running kernel. I then found https://www.redhat.com/en/blog/loading-kernel-modules-to-openshift-nodes-using-coreos-layering and started trying that approach.

oc adm release info registry.ci.openshift.org/origin/release@sha256:4510c50834c07e71521a68c021320b5e44978245c8031a858f74328e54e89e1c --image-for=driver-toolkit returns an image, so I ran it, and it does indeed have kernel headers:

docker run -it quay.io/openshift/okd-content@sha256:2b40bd7b2226a15709fb3610784b2cf7b997a51868fac96a43bf9bf690a071fe /bin/bash
[root@b3130ac3f63e /]# yum list installed | grep kernel
kernel-core.x86_64                            4.18.0-372.57.1.el8_6                  @rhel-8-baseos
kernel-devel.x86_64                           4.18.0-372.57.1.el8_6                  @rhel-8-baseos
kernel-headers.x86_64                         4.18.0-372.57.1.el8_6                  @rhel-8-baseos
kernel-modules.x86_64                         4.18.0-372.57.1.el8_6                  @rhel-8-baseos
kernel-modules-extra.x86_64                   4.18.0-372.57.1.el8_6                  @rhel-8-baseos
kernel-rpm-macros.noarch                      129-1.el8                              @rhel-8-appstream
kernel-rt-core.x86_64                         4.18.0-372.9.1.rt7.166.el8             @rhel-8-nfv
kernel-rt-devel.x86_64                        4.18.0-372.9.1.rt7.166.el8             @rhel-8-nfv
kernel-rt-modules.x86_64                      4.18.0-372.9.1.rt7.166.el8             @rhel-8-nfv
kernel-rt-modules-extra.x86_64                4.18.0-372.9.1.rt7.166.el8             @rhel-8-nfv

[root@b3130ac3f63e /]# uname -r
6.5.11-linuxkit

However, the nodes are not running that kernel: the driver-toolkit image is clearly based on RHEL 8, while the nodes run a Fedora kernel. From a node:

sh-5.2# uname -r
6.5.5-200.fc38.x86_64
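
To make the mismatch explicit, here is a minimal sketch of the comparison, using the version strings from the output above. The helper name kernel_matches is hypothetical and not part of any OKD tooling. It assumes the release reported by uname -r ends in the architecture, as it does on the node.

```shell
# Hypothetical helper: succeeds when a kernel-devel version-release string
# matches a kernel release as reported by `uname -r`.
# Assumption: the `uname -r` value ends in ".<arch>" (e.g. ".x86_64").
kernel_matches() {
  running="$1"                 # e.g. 6.5.5-200.fc38.x86_64 (from the node)
  devel="$2"                   # e.g. 4.18.0-372.57.1.el8_6 (from yum list)
  arch="${running##*.}"        # trailing architecture component
  [ "${running%."$arch"}" = "$devel" ]
}

# Values taken from the node and from the driver-toolkit image above.
if kernel_matches "6.5.5-200.fc38.x86_64" "4.18.0-372.57.1.el8_6"; then
  echo "headers match the running kernel"
else
  echo "headers do not match the running kernel"
fi
```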

Is there any way to actually get 6.5.5-200.fc38 packages? I can't find them on rpmfind or anywhere else, and I don't see any other way to get the right headers to build modules.

Version
4.14.0-0.okd-2023-12-01-225814 registry.ci.openshift.org/origin/release@sha256:4510c50834c07e71521a68c021320b5e44978245c8031a858f74328e54e89e1c
IPI on AWS

How reproducible
100%

@vrutkovs
Member

> So I stumbled on redhat.com/en/blog/loading-kernel-modules-to-openshift-nodes-using-coreos-layering and started trying that

Things are much simpler in the FCOS world: packages don't require a subscription to download, so there is no need to use the driver-toolkit image. A matching kernel can be downloaded from Koji.
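
As a rough illustration of the Koji route, the sketch below builds a download URL for a kernel subpackage from a uname -r value. The helper name koji_rpm_url is hypothetical. The kojipkgs.fedoraproject.org path layout (packages/kernel/VERSION/RELEASE/ARCH/) is an assumption and should be checked against the build page on Koji before relying on it.

```shell
# Hypothetical helper: print the assumed Koji download URL for a kernel
# subpackage (e.g. kernel-devel) matching a given `uname -r` value.
# Assumption: packages live under
#   https://kojipkgs.fedoraproject.org/packages/kernel/<version>/<release>/<arch>/
koji_rpm_url() {
  pkg="$1"                     # e.g. kernel-devel
  krel="$2"                    # e.g. 6.5.5-200.fc38.x86_64
  arch="${krel##*.}"           # x86_64
  vr="${krel%."$arch"}"        # 6.5.5-200.fc38
  version="${vr%%-*}"          # 6.5.5
  release="${vr#*-}"           # 200.fc38
  printf 'https://kojipkgs.fedoraproject.org/packages/kernel/%s/%s/%s/%s-%s.%s.rpm\n' \
    "$version" "$release" "$arch" "$pkg" "$vr" "$arch"
}

# Example with the kernel release reported by the node:
koji_rpm_url kernel-devel 6.5.5-200.fc38.x86_64
```

The printed URL can then be passed to a download tool such as curl, or the same package can be looked up by name in the Koji web interface.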

> Is there any way to actually get 6.5.5-200.fc38 packages?

The machine-os-content image should have the repos enabled:

# podman run --rm -ti `oc adm release info quay.io/openshift/okd:4.14.0-0.okd-2024-01-26-175629 --image-for=machine-os-content`
bash-5.2# rpm -qa kernel
kernel-6.5.5-200.fc38.x86_64
bash-5.2# rpm-ostree install kernel-devel-6.5.5
...
Installing: kernel-devel-6.5.5-200.fc38.x86_64 (updates-archive)

See https://docs.okd.io/latest/post_installation_configuration/coreos-layering.html and https://github.com/coreos/layering-examples/tree/main/replace-kernel
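
Putting the pieces together, a layered image might be generated along these lines. This is only a sketch. The helper name render_containerfile is hypothetical, and using the machine-os-content image as the base for layering is an assumption drawn from the podman session above, so check the linked docs for the base image recommended for your release.

```shell
# Hypothetical helper: print a Containerfile that layers kernel-devel onto
# a given OS image. `ostree container commit` follows the pattern used in
# the coreos/layering-examples repository.
render_containerfile() {
  base_image="$1"              # e.g. the image printed by `oc adm release info ... --image-for=machine-os-content`
  kernel_version="$2"          # e.g. 6.5.5
  cat <<EOF
FROM ${base_image}
RUN rpm-ostree install kernel-devel-${kernel_version} && ostree container commit
EOF
}

# Example usage (not executed here; requires oc and a pull secret):
#   base="$(oc adm release info quay.io/openshift/okd:4.14.0-0.okd-2024-01-26-175629 --image-for=machine-os-content)"
#   render_containerfile "$base" 6.5.5 > Containerfile
#   podman build -t my-layered-os -f Containerfile .
```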

@vrutkovs closed this as not planned on Feb 15, 2024